Sarah Silverman is Suing OpenAI for Stealing her Jokes

US comedian and author Sarah Silverman, along with novelists Christopher Golden and Richard Kadrey, is suing OpenAI over claims of copyright infringement.

In a series of class action lawsuits filed Friday, Silverman, Golden and Kadrey claim OpenAI used their copyright-protected content to train the AI language model GPT, which powers a range of chatbots including ChatGPT.

The trio allege that when prompted, ChatGPT will generate a summary of their work based on their own writing. This, they claim, infringes copyright laws as they did not give their consent to their books being fed to the generative AI chatbot.

“OpenAI made copies of Plaintiffs' books during the training process of the OpenAI Language Models without Plaintiffs' permission,” the lawsuit reads.

“Specifically, OpenAI copied at least Plaintiff Tremblay's book The Cabin at the End of the World; and Plaintiff Awad's books 13 Ways of Looking at a Girl and Bunny.”

Silverman and the authors also claim that OpenAI’s ChatGPT breaches the Digital Millennium Copyright Act (DMCA) by regurgitating their content without the legally required copyright management information found in the original books.

“At no point did ChatGPT reproduce any of the copyright management information Plaintiffs included with their published works,” Silverman et al state in a second suit.

Double whammy

It’s not just OpenAI facing the authors’ legal wrath. In a separate lawsuit against Meta, Silverman, Golden and Kadrey allege their books were accessible in datasets Meta used to train a series of open-source AI models the tech titan introduced in February.

The authors point to a paper Meta released earlier this year, which details the sources the tech titan used to train its AI models, one of which is ThePile.

*Lawsuit against OpenAI detailing how it scraped shadow libraries to train ChatGPT.*

ThePile, the suit notes, was described in an EleutherAI paper as being put together from “a copy of the contents of the Bibliotik private tracker.” Bibliotik and the other “shadow libraries” are “flagrantly illegal,” the suit states.

In both complaints against Meta and OpenAI, the authors say they “did not consent to the use of their copyrighted books as training material” for the companies’ AI systems.

They ask that both companies pay statutory damages and restitution of profits, and have requested a permanent injunction to stop them from continuing their actions.

A threat to copyright

Silverman et al’s lawsuit is just the latest legal action targeting AI companies like OpenAI. At the end of June, a US law firm hit OpenAI with a $3 billion lawsuit for allegedly violating privacy laws by scraping data from the web to train ChatGPT.

Meanwhile, in January, Getty Images sued the AI art company Stability AI for allegedly taking millions of copyright-protected images from the site to train its AI image generator Stable Diffusion.

Experts warn that the method by which AI firms obtain their data may lead to the copyright-protected work of millions of content creators being stolen, raising questions about the future of creative industries and the ability to tell fact from fiction.

How many lawsuits will have piled up in six months or a year?

— みや (@miyamoyame) July 10, 2023

While neither Meta nor OpenAI has revealed exactly which resources it has scraped from the web, both have admitted to using hundreds of thousands of copyrighted books stored on shadow library websites, including those referenced in the authors’ suits.

OpenAI trains its Large Language Models (LLMs) by scraping publicly available text and images from the internet. These resources are not limited to books; they also include blogs, websites and even social media posts shared online.

This method of taking content from the web currently sits in a legal grey area, with lawmakers struggling to decide if AI companies’ scraping of data breaks copyright laws.

Since AI technologies are still in continuous development, it is yet to be seen whether governments will be able to legally prevent companies from taking creators’ work without their consent.

The EU recently passed the world’s first AI Act, which serves to protect against the mass harvesting of people’s private and sensitive data by AI companies.

But it remains unclear if this legislation would enforce any sort of restrictions on the scraping of publicly available online content by these companies.

Ellis Stewart
