Generative AI and Copyright: Datasets and Fair Use
The rapid advancement of generative artificial intelligence (AI) technologies, exemplified by platforms like OpenAI’s ChatGPT and Google’s Gemini, has ushered in transformative changes across industries. However, the extensive data required to train the large language models (LLMs) behind these platforms often includes copyrighted materials, giving rise to significant legal challenges and copyright infringement claims by artists, creators, and companies.
Legal Challenges and Emerging Litigation
The use of copyrighted works in datasets for training LLMs has triggered numerous lawsuits. Notably, in 2023, comedian Sarah Silverman and author George R.R. Martin filed suits against OpenAI, alleging unauthorized use of their creative works for AI training. These cases underscore the tension between AI development and copyright protection, a conflict exacerbated by the vast data needs of these technologies.
Reports of AI platforms scraping YouTube videos for data further complicate the legal landscape, since such scraping may breach platform terms of service and infringe creators’ copyrights. These reports highlight the pressing need for clear legal guidelines on the use of digital content for AI training.
Copyright and Fair Use Considerations
In the realm of AI, the distinction between input and output is crucial for copyright considerations. On the input side, the question is whether training a model on copyrighted works infringes those works. On the output side, the question is whether the generated content can itself be protected. Under U.S. law, AI-generated content that lacks human authorship generally does not qualify for copyright protection. However, significant human contributions to AI-generated works can be protected. This blurs the line between technology and traditional content creation, necessitating careful legal interpretation to determine copyright ownership.
The Fair Use Doctrine and AI
The fair use doctrine, codified in Section 107 of the Copyright Act, permits the unlicensed use of copyrighted material under specific circumstances, which are assessed on a case-by-case basis due to the doctrine’s inherent flexibility. The evaluation weighs four factors:
Purpose and Character of the Use: Non-commercial, educational, or transformative uses favor fair use. For AI, where data is transformed into new patterns and predictions, this factor could be pivotal.
Nature of the Copyrighted Work: Uses of factual or published works are more likely to be deemed fair than uses of creative or unpublished works.
Amount and Substantiality: The scale of data used by AI can be both a point of contention and a defense, depending on whether the use is deemed excessive or essential for the technology’s functionality.
Effect on the Market: The impact on the original work’s market is crucial. AI’s use of copyrighted material could be argued to have minimal market impact, especially if the data is publicly available, thus potentially supporting a fair use claim.
Precedential Cases
Cases like Kelly v. Arriba Soft Corporation and Authors Guild v. Google, Inc. provide valuable precedents. In Kelly, the court ruled that a search engine’s transformative use of thumbnail images was fair use. Similarly, in Authors Guild, Google’s book digitization project was deemed a transformative use, suggesting that creating searchable indexes from copyrighted texts could be considered fair use.
These cases demonstrate how transformative use is a key factor in fair use analyses and can be particularly relevant for AI technologies that inherently transform data into new forms of expression.
As AI technologies continue to evolve, the intersection with copyright law remains highly dynamic and contentious. The ongoing litigation, including recent cases against tech giants like Google and Microsoft, underscores the need for legal frameworks that balance innovation with the rights of copyright holders. Fair use will likely continue to play a critical role in shaping the legal landscape for AI development, requiring nuanced and forward-thinking legal interpretations to navigate this complex terrain.
Holon Law Partners has more than 100 years of combined experience guiding clients through complex cases and legal intricacies. Our approach is empathetic, customized, and client-centered, with a focus on you and your unique business needs. To schedule a consultation with us, call our team at (866) 372-0726 or email us at info@holonlaw.com.