J1 AI

September 21, 2023

A lawsuit was filed on Tuesday, September 19, 2023, in the U.S. District Court for the Southern District of New York on behalf of the Authors Guild and 17 well-known writers.

The plaintiffs include well-known American writers such as John Grisham, George R.R. Martin, Scott Turow, Jodi Picoult, David Baldacci, Michael Connelly and George Saunders.

The authors claim that OpenAI fed their books into its language models, thereby violating their copyrights and committing systematic theft on a large scale.

The complaint is the latest legal challenge for OpenAI over the data it collects and the algorithms behind its AI tool ChatGPT. ChatGPT uses artificial intelligence to answer user questions and can, for example, write texts in sophisticated language. In completing such tasks, it mimics the way a human would respond.

The writer Douglas Preston said he was shocked when he asked ChatGPT to describe minor characters in his books and it spat back detailed information that wasn’t available in reviews or Wikipedia entries for the novels.

That’s when I looked at this and said: My God, ChatGPT has read my books, and how many of my books has it read?

It knew everything, and that’s when I got a very bad feeling.

— Douglas Preston

The results of the authors' queries to ChatGPT suggest that the AI system drew its data not only from reviews, Wikipedia, or other publicly available sources, but also from the books themselves.

Companies like OpenAI rely on large language models (LLMs), which must be fed huge amounts of text and data to create these AI systems. ChatGPT and similar systems built on such LLMs seriously threaten the livelihood of authors.

According to the lawsuit, these language models were trained on the authors' works without their consent. In the authors' opinion, this is a significant breach of intellectual property protection and a violation of copyright law.

As with other artificial intelligence algorithms that use neural networks to generate results, it is impossible to trace which of the training data sources a given result came from. OpenAI would therefore not be able to name precisely the sources of the text passages behind an answer. However, the authors state that such detailed answers can only come from an analysis of the originals.

Result checker

In early 2023, OpenAI offered a tool for checking whether a text was generated by AI such as ChatGPT or written by a human. The tool has since been withdrawn and is no longer available from OpenAI.

Checking for plagiarism will play an even greater role in the future. The current legal dispute will be difficult and costly to resolve. One possible solution would be a tool that can reliably distinguish originals from synthetically generated content such as text, images, and videos.

In the past, I referred to a project by Liam Swayne and showed examples of how entire books can be written using ChatGPT. Some of Liam Swayne's examples showed that ChatGPT is capable of imitating the work of well-known authors such as George R.R. Martin.

In light of the current lawsuit, I have withdrawn these articles from publication.