Apple, Amazon, and OpenAI are facing a class-action lawsuit brought by multiple content creators. The plaintiffs accuse the companies of bypassing YouTube's anti-scraping protections to download millions of videos without authorization and use them to train AI models.

Data scraping violations have caused public outrage

Several YouTube channel owners allege that Apple and the other companies used a dataset called Panda-70M to locate and extract video clips. The plaintiffs claim their original content was used illegally more than 500 times in this dataset, which they say amounts to intentional circumvention of copyright protection measures.

A research paper from Apple, which disclosed that its video generation model relies on this dataset, has become key evidence in the case. The creators argue that even if the dataset provided only indexes, the actual scraping and training infringed their statutory rights.

Plaintiffs seek damages and an injunction

The plaintiffs have requested a jury trial and are seeking the maximum damages allowed by law. The lawsuit also seeks a permanent injunction prohibiting the companies from continuing to use the allegedly infringing content.

Amazon and OpenAI have not yet officially responded to the class-action lawsuit. Legal experts note that as demand for AI training data grows, disputes over "scraping rights" and "copyright" will become common in the industry.