Harvard and Google to release 1 million public-domain books as AI training dataset
- Harvard and Google will release 1 million public-domain books as an AI training dataset.
- The collection will include books from various genres and topics.
- This initiative aims to enhance machine learning models.
- Public access to the dataset is expected to foster innovation.
13 Articles
Harvard opens access to 1 million books for training AI models
Harvard’s Library Innovation Lab lets everyone use 1 million books for AI training under its Institutional Data Initiative (IDI). The educational institution explains it will allow the world to benefit from these collections that it has preserved for years. More importantly, these books will help build the world’s AI future by training AI models with quality information.
Why Harvard encourages AI training
Harvard Law Today reported that the uni…
Harvard and Google to release 1 million public-domain books as AI training dataset
AI training data has a big price tag, one best suited to deep-pocketed tech firms. This is why Harvard University plans to release a dataset of roughly 1 million public-domain books, spanning genres, languages, and authors including Dickens, Dante, and Shakespeare, which are no longer copyright-protected due to their age. The new […] © 2024 TechCrunch. All rights reserved. For personal use only.
Coverage Details
Bias Distribution
- 75% of the sources lean Left