What happened?
U.S. courts ruled that Meta and Anthropic's use of copyrighted books for training AI falls under 'fair use.' The two companies won in recent lawsuits, but the court emphasized that the ruling is case-specific and does not fully legalize AI training data sources.
Meta was sued by 13 authors for using protected books to train the Llama model. The court found that its use was transformative, and the plaintiffs failed to prove market harm, thus ruling that it did not constitute infringement.
In another case, Anthropic's training of Claude AI was deemed similar to human reading and learning, also falling under fair use. However, the judge warned that if a company downloads pirated books to build a permanent digital library, that could still constitute infringement.
Meta's AI training was found not to infringe copyright.
Generative AI continues to spark copyright disputes. This week, federal judges in California ruled in favor of model training companies in two cases that accused AI models of illegally using copyrighted books for training.
Following AI startup Anthropic's win, tech giant Meta also prevailed in court, with the judge ruling that its AI model training practices comply with the 'fair use' principle in U.S. copyright law.
The outcomes of the two cases are a significant boost for AI model companies. However, the federal judges also made clear that these rulings do not grant tech companies a free pass: the companies won largely because the plaintiffs were not adequately prepared in terms of litigation strategy and evidence.
Meta's training practices comply with the 'fair use' principle.
Last year, Meta was accused of using copyrighted books, without authorization from authors and publishers, to train the generative AI model 'Llama.' The books involved include Sarah Silverman's 'The Bedwetter' and Pulitzer Prize-winning author Junot Díaz's 'The Brief Wondrous Life of Oscar Wao.'
Judge Chhabria of the federal district court in San Francisco issued a 'summary judgment' in the case on June 26, meaning the case does not need to be tried by a jury, and held that Meta's training practices can be considered fair use. He found that Meta's use of the books was transformative: rather than simply reproducing the original content, Meta used the books to develop a language model, which serves a different purpose.
🚀 'Transformative': One of the criteria for determining 'fair use' under U.S. copyright law, used to judge whether a use creates a new purpose, meaning, or function rather than simply reproducing the original work.
Chhabria emphasized that the ruling does not mean that all uses of copyrighted content for training by Meta are legal. It indicates only that these plaintiffs made the wrong arguments and did not sufficiently prove key elements; in particular, evidence of market impact was almost nonexistent.
Among the four main factors in a fair use determination, the one given the most weight is the 'potential harm to the market for the original work.' However, the judge pointed out that the plaintiffs did not provide sufficient evidence that their books suffered a decline in sales or other market harm as a result of being used for AI training.
Anthropic also won: AI learning like human reading.
In another lawsuit, AI company Anthropic also received court support. The case was brought by writers Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of using their books without authorization to train Claude, a ChatGPT competitor.
Judge William Alsup pointed out in his ruling that the training methods employed by Anthropic are highly transformative and equivalent to 'human learning through reading.' Based on this view, the judge ruled that it falls under fair use.
However, Alsup also made clear that a company that downloads a large number of pirated books to establish a 'permanent digital library' would not qualify for fair use protection. While the court is willing to support AI development, it still draws a boundary around the misuse of data sources.
Fair use requires 'case-by-case determination,' not a get-out-of-jail-free card for the tech industry.
Even though Meta and Anthropic won back-to-back, both judges repeatedly emphasized in their rulings that these victories should not be interpreted as meaning that AI model training is protected by fair use in all circumstances.
Judge Chhabria stated directly: 'If AI companies use copyrighted books for training and create tools that generate a large amount of competitive content, causing substantial harm to the original authors' market, then no matter how "transformative" the use, it is difficult to constitute fair use.'
He even stated that markets for content like the news industry may be more susceptible to indirect competition from AI-generated content, indicating that if media companies sue in the future, the court's stance may be stricter.
Multiple copyright lawsuits against AI companies are still ongoing. The New York Times is suing OpenAI and Microsoft over the use of news articles, and Disney and Universal have sued the AI image platform Midjourney, alleging that its model training infringes numerous film and television works.
This article is authorized for reprint from: Web3+
Original title: "Writers sue Meta for infringement, but the judge says OK! Where exactly is the boundary of 'fair use' in training AI?"
Original author: Li Pengrui
'Writers sue Meta AI for infringement but lose! The judge: This is a case, not a get-out-of-jail-free card for the tech industry' was first published in 'Crypto City.'