The Unbelievable Scale of AI’s Pirated-Books Problem
6 Articles


The Unbelievable Scale of AI’s Pirated-Books Problem
Meta pirated millions of books to train its AI. Search through them here.
Report: The Unbelievable Scale of AI’s Pirated-Books Problem; Idaho Legislature’s Budget Committee Approves Half of Funding Proposed For Digital Library Grants; & More Headlines
AI: The Unbelievable Scale of AI’s Pirated-Books Problem (The Atlantic). British Library: Groundbreaking British Library Development Confirmed (via BL). Idaho: Idaho Legislature’s Budget Committee Approves Half of Funding Proposed For Digital Library Grants (via Idaho Capital Sun). Internet Archive: Internet Archive Responds to Record Labels: Stop Playing “Hide-The-Ball” (via IA Blog). Misinformation: Disagreement as a Way to Study Misinformation an…
@_alexreisner: Search LibGen, the Pirated-Books Database That Meta Used to Train AI
[Meta knowingly chose to reproduce a database of pirated books called “LibGen” to train Meta’s LLM Llama 3. If this isn’t criminal copyright infringement, what is? Alex Reisner created a search tool for authors to find their books in Meta’s contaminated AI. Why would they do this? In the words of Austin songwriter Guy Forsyth, because “Americans are freedom loving people and nothing says freedom like getting away with it.”] Editor’s note: This …
Search LibGen, the Pirated-Books Database That Meta Used to Train AI - Stephen's Lighthouse
Millions of books and scientific papers are captured in the collection’s current iteration. https://www.theatlantic.com/technology/archive/2025/03/search-libgen-data-set/682094/
Coverage Details
Bias Distribution
- 100% of the sources lean Left