Two professors at SUNY Downstate Health Sciences University, Susana Martinez-Conde and Stephen Macknik, recently filed a class-action lawsuit against Apple Inc. They accuse Apple of using Books3, a pirated book collection that includes their works, to train its Apple Intelligence models without authorization. The case has once again drawn attention to copyright issues in the training of artificial intelligence.


According to the complaint, the professors' books "Champions of Illusion: The Science Behind Mind-Boggling Images and Mystifying Brain Puzzles" and "Sleights of Mind: What the Neuroscience of Magic Reveals About Our Everyday Deceptions" were used to train the Apple Foundation Intelligence Models and the OpenELM language model. The complaint states that Apple copied their works without permission and used them both to test model performance and to filter copyrighted content out of what end users see.

Books3 was a "shadow library" widely used for artificial intelligence training, containing about 196,640 books sourced from the private BitTorrent tracker Bibliotik. When Apple released OpenELM in April 2024, it acknowledged using The Pile dataset, which included content from Books3. Books3 itself was taken down in October 2023 over copyright issues.

The case has attracted attention for two reasons. On one hand, authors are entitled to compensation when their works are reused and reproduced; on the other hand, whether using copyrighted reading material to train AI is legal remains widely disputed. For example, Google often draws on unauthorized content for its AI summaries without necessarily citing the sources, which makes it difficult for creators to secure the rights they are due.

The US court in the Midjourney-related case pointed out that tracing use during the AI training phase, and compensating for it, is difficult. However, in a recent Anthropic case, the judge considered that storing training books in a central database may constitute direct copyright infringement. If Apple is found liable for "willful infringement," it could face statutory damages of up to $150,000 per book.

The two professors are requesting a jury trial, monetary damages, and an injunction barring Apple from using their works in the future. Apple has not yet publicly responded to the substance of the lawsuit. The complaint notes that Apple's market value rose by $200 billion on the day Apple Intelligence was released. However, Apple's market value has more than quadrupled over the past five years, so how much of that gain can be attributed to the release remains to be seen.

Key Points:

💼 Scholars have sued Apple, accusing it of using pirated books without authorization to train AI.  

📚 Books3 is called a "shadow library," containing a large amount of unauthorized book texts.  

⚖️ If found liable for "willful infringement," Apple could face heavy penalties.