Extract from Isha Marathe’s article “In OpenAI Copyright Lawsuits, Discovery Complications Likely to Take Center Stage”
Earlier this month, three authors, including comedian Sarah Silverman, sued OpenAI and Meta over dual claims of copyright infringement, alleging that the companies’ generative AI software scraped the authors’ data without consent. The lawsuit came after Silverman’s legal team asked ChatGPT to summarize excerpts of her book “The Bedwetter,” which the chatbot successfully did. The chatbot was then prompted to reproduce the copyright management information that accompanied the published work, which it failed to do.
Until now, a significant burden in most copyright infringement cases against generative AI tools—largely from artists—has been the task of proving the model, also known as a large language model (LLM), was trained on a specific work.
By ChatGPT’s own admission, that challenge is somewhat overcome in the latest class action against OpenAI.
However, for IP owners looking to navigate the advent of generative AI tools, the development raises the question: Do the datasets that LLMs are trained on qualify for trade secret protections? And if so, what challenges might that bring to the discovery process?