The Jakarta Post

Meta knew it used pirated books to train AI, authors say

News desk (Reuters)
Washington
Published on Fri, January 10, 2025, 16:13 (UTC+07:00)

Facebook CEO Mark Zuckerberg speaks at the F8 summit in San Francisco, California, on March 25, 2015. Facebook shares plunged on March 19, 2018, as the social media giant was pounded by criticism at home and abroad over revelations that a firm working for Donald Trump's presidential campaign harvested and misused data on 50 million members. (AFP/Josh Edelson)

Meta Platforms used pirated versions of copyrighted books to train its artificial intelligence systems with approval from its CEO Mark Zuckerberg, a group of authors alleged in newly disclosed court papers.

Ta-Nehisi Coates, comedian Sarah Silverman and other authors suing Meta for copyright infringement made the accusations in filings made public on Wednesday in California federal court. They said internal documents produced by Meta during the discovery process showed the company knew the works were pirated.

Spokespeople for Meta did not immediately respond to a request for comment.

The authors sued Meta in 2023, arguing that the tech giant misused their books to train its large language model Llama. 

The case is one of several alleging that copyrighted works by authors, artists and others were used to develop AI products without permission. Defendants have argued that they made fair use of copyrighted material.

The authors asked the court on Wednesday for permission to file an updated complaint. They said new evidence showed Meta used the AI training dataset LibGen, which allegedly includes millions of pirated works, and distributed it through peer-to-peer torrents.

They said internal Meta communications showed Zuckerberg "approved Meta's use of the LibGen dataset notwithstanding concerns within Meta's AI executive team (and others at Meta) that LibGen is 'a dataset we know to be pirated.'"

US District Judge Vince Chhabria last year dismissed claims that text generated by Meta's chatbots infringed the authors' copyrights and that Meta unlawfully stripped their books' copyright management information (CMI). 

The writers argued Wednesday that the evidence bolstered their infringement claims and justified reviving their CMI claim and adding a new computer fraud claim. 

Chhabria said during a hearing on Thursday that he would allow the writers to file an amended complaint but expressed skepticism about the merits of the fraud and CMI claims.
