Spotify says it disabled accounts tied to piracy group Anna’s Archive after the group claimed it had scraped ~86 million audio files and metadata spanning 256 million tracks. The group framed the collection as a “preservation archive.”
What happened and when
Spotify confirmed on December 22 that it disabled user accounts it says were involved in unlawful scraping linked to Anna’s Archive. The group announced the project over December 20–21, claiming it had collected a massive dataset covering most of Spotify’s listening activity and nearly the entire public catalog.
Spotify said it is monitoring for suspicious behavior and has added new safeguards aimed at preventing similar activity.
The scale of the alleged scrape
Anna’s Archive said the project totals nearly 300 terabytes and includes:
- Metadata for an estimated 99.9% of Spotify’s catalog
- Coverage of releases through July 2025
- 186 million unique ISRCs (International Standard Recording Codes), which identify recordings
- Audio allegedly preserved primarily in OGG Vorbis at 160 kbps for popular tracks, with lower-bitrate re-encodes for less popular tracks to save space
The group also claimed the audio represents about 99.6% of Spotify’s listening activity, presenting the effort as an “open” archive designed for long-term preservation.
Key figures at a glance
| Metric (as claimed) | Figure | What it describes |
| --- | --- | --- |
| Audio files scraped | ~86 million | Track audio files obtained |
| Tracks referenced | 256 million | Track-level entries in the dataset |
| Total archive size | ~300 TB | Combined audio + metadata footprint |
| Catalog coverage (metadata) | ~99.9% | Share of Spotify catalog covered by metadata |
| Listening activity represented | ~99.6% | Share of activity the group claims to cover |
| Unique ISRCs | 186 million | Recording identifiers included |
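As a rough sanity check on the claimed figures, the sketch below divides the archive size by the number of audio files and converts the result to playing time at 160 kbps. It uses only the numbers in the table above. The assumptions are mine, not the group's: decimal terabytes, a negligible metadata footprint, and a single 160 kbps bitrate for every file (the group says less popular tracks were re-encoded at lower bitrates, so real average durations would be longer). The function names are illustrative.

```python
# Back-of-envelope check using only the claimed figures from the table.
# Assumptions (not from the source): decimal TB, metadata size ignored,
# and every file treated as if it were encoded at 160 kbps.
ARCHIVE_BYTES = 300e12       # ~300 TB total archive size
AUDIO_FILES = 86e6           # ~86 million audio files
TRACKS_REFERENCED = 256e6    # 256 million track-level entries
BITRATE_BPS = 160_000        # 160 kbps OGG Vorbis


def average_file_bytes(total_bytes: float, file_count: float) -> float:
    """Mean size of one audio file in bytes."""
    return total_bytes / file_count


def seconds_at_bitrate(size_bytes: float, bitrate_bps: float) -> float:
    """Playing time of a file of the given size at a constant bitrate."""
    return size_bytes * 8 / bitrate_bps


avg_bytes = average_file_bytes(ARCHIVE_BYTES, AUDIO_FILES)
avg_seconds = seconds_at_bitrate(avg_bytes, BITRATE_BPS)
audio_share = AUDIO_FILES / TRACKS_REFERENCED

print(f"average file size: {avg_bytes / 1e6:.1f} MB")
print(f"equivalent duration at 160 kbps: {avg_seconds / 60:.1f} min")
print(f"share of referenced tracks with audio: {audio_share:.1%}")
```

Under these assumptions the average file works out to roughly 3.5 MB, or just under three minutes of audio at 160 kbps, which is in the range of a typical song. About a third of the referenced tracks would have audio attached, consistent with the group's claim that metadata coverage is much broader than audio coverage.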
How the group said it collected and packaged the data
Anna’s Archive described a workflow that mixes catalog metadata collection with audio extraction.
- Metadata first: As of December 21, the group said it had released metadata publicly, with audio planned later.
- Staged audio distribution: The group said it would distribute audio in bulk torrents in stages, organized by track popularity.
- Format choices: Popular tracks were reportedly kept in Spotify’s original OGG Vorbis format at 160 kbps, while less popular tracks were re-encoded at lower bitrates to reduce storage.
Those claims—especially around audio access—matter because they imply more than catalog browsing. Metadata on streaming platforms is often partially public by design. Audio extraction, however, typically involves bypassing technical protections.
Spotify’s response and what it says was affected
Spotify said it identified and disabled accounts it characterized as “nefarious” and connected to unlawful scraping.
In a separate statement, Spotify said an internal review found that a third party:
- scraped public metadata, and
- used illicit tactics to circumvent DRM protections to access some audio files
Spotify emphasized that it found no impact on user accounts or personal data, and reiterated its long-standing stance against piracy and in support of rights holders.
Spotify’s main claims
| Topic | Spotify’s stated position |
| --- | --- |
| What happened | “Unlawful scraping” tied to specific accounts |
| What Spotify did | Disabled accounts and added safeguards |
| User data impact | No impact on user accounts or personal data (per Spotify) |
| Audio access | Claims “illicit tactics” were used to bypass DRM for some audio |
| Ongoing action | Monitoring for suspicious behavior |
Why this matters for artists, labels, and streaming platforms
Large-scale scraping disputes are not new in tech, but music adds a layer of complexity because platforms must balance:
- Artist and label rights (copyright, licensing, royalty reporting)
- User experience (easy discovery, consistent playback)
- Platform security (account abuse, automation, circumvention tools)
If a dataset contains a meaningful share of platform audio and identifiers (like ISRCs), it could be repurposed for:
- Unauthorized redistribution of recordings
- Linking recordings across services using identifiers
- Automated mirroring of catalogs
- Downstream piracy that competes with legitimate streams
At the same time, “preservation” arguments often appear in piracy debates, typically framing streaming’s shifting catalogs—where tracks can disappear due to licensing changes—as a reason to store copies outside the platform. Rights holders generally reject that rationale when it involves copying and redistribution without permission.
The preservation claim vs. copyright enforcement
Anna’s Archive positioned the scrape as a preservation project. Spotify characterized it as copyright abuse.
That clash reflects a broader tension in digital media:
- Streaming services offer convenient access but do not guarantee permanence for every release.
- Preservation efforts can be legitimate when done with permission, legal deposit, or licensed archives—but become legally and ethically contested when they involve circumventing protection or mass redistribution.
Spotify’s phrasing—“anti-copyright attacks” and “unlawful scraping”—signals it views the event as both a security incident and a rights enforcement issue.
What happens next
Spotify says it has already implemented additional safeguards and is actively monitoring for similar behavior. If Anna’s Archive proceeds with the staged torrent releases it described, the incident could expand from a scraping dispute into a wider piracy enforcement effort involving:
- additional account takedowns,
- escalated technical blocking and detection, and
- potential legal actions by rights holders or industry groups, depending on distribution scope.
For everyday Spotify users, the key point is Spotify’s statement that personal data was not affected. For artists and labels, the primary concern is whether audio is distributed at scale, and whether identifiers and metadata are used to facilitate copying or redistribution.
Final thoughts
Spotify’s December 22 confirmation shows streaming platforms are treating large-scale scraping as both a security threat and a copyright enforcement issue. The next phase depends on whether the project remains largely a metadata dump or turns into widespread audio distribution—something Spotify says it is actively working to prevent.






