Author: Landen Stafford

Fair Use and Artificial Intelligence 2026 Update

Fair Use Week is an annual celebration of the legal doctrine of fair use, which plays an essential role in teaching, education, and scholarship. The fair use doctrine permits the use of copyrighted works without permission in certain circumstances. Whether a use qualifies is determined using a four-factor test that considers the purpose of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the market for the copyrighted work.

Fair Use in Current Artificial Intelligence Lawsuits

A year ago, in recognition of Fair Use Week, Copyright Corner wrote about the intersection of fair use and artificial intelligence (AI). That post explored how AI might interact with a four-factor fair use analysis, summarized guidance from Parts 1 and 2 of the U.S. Copyright Office’s Copyright and Artificial Intelligence report, and provided an overview of some current AI lawsuits involving copyright. At the time, a summary judgment ruling in Thomson Reuters v. Ross Intelligence had recently been released, giving an early look into how courts might rule on fair use in the context of AI. Since then, the U.S. Copyright Office has published Part 3: Generative AI Training of its AI report, and courts have issued further rulings in cases involving copyright and AI. Both developments expand on how the existing legal frameworks in copyright law, and fair use in particular, may be applied to the unprecedented advances in AI technology. To keep up with how courts are applying established principles of the fair use doctrine to AI, let’s explore some of the key takeaways from recent decisions in Bartz v. Anthropic and Kadrey v. Meta.

In Thomson Reuters v. Ross, the court rejected a fair use defense for the use of copyrighted works to train an AI legal search tool. Weighing the first factor, the court held that Ross was using Thomson Reuters’s headnotes as AI data to create a competing legal research tool, which was not a transformative use. Under the fourth factor, the court found that Ross’s legal research tool served as a market substitute and also considered the effect of Ross’s use on a potential market for AI training data. Perhaps unsurprisingly to those familiar with fair use analyses, those same two factors, the first and the fourth, play a large role in the decisions in Bartz and Kadrey, which were released only two days apart in June 2025. In both cases, lawsuits were brought against companies that used copyrighted works to train large language models (LLMs) without authorization from the copyright owners. Both defendants raised fair use defenses, and in both cases, the judges ruled in favor of fair use, at least in part. The analysis in the judges’ opinions, however, offers much more than a simple fair use finding.

In Bartz v. Anthropic, the court denied summary judgment to Anthropic on its use of pirated copies of copyrighted works to assemble a central library, but it also found that training AI on copyrighted works is a fair use if the works are acquired legally. According to the judge, Anthropic’s use of legally acquired works to train AI models was “transformative—spectacularly so,”[1] which supports a fair use finding under the first factor. When evaluating potential market harm under the fourth factor, the judge found that “copies used to train specific LLMs did not and will not displace demand for copies of Authors’ works” because there is no evidence that the AI would produce exact copies of the works, meaning there is no direct substitution.[2] The judge went on to analogize the plaintiffs’ argument for potential market harm: “Authors’ complaint is no different than it would be if they complained that training school children to write well would result in an explosion of competing works.”[3] Ultimately, the decision in Bartz v. Anthropic suggests that companies training AI on copyrighted works may be able to rely on fair use when the works they use are lawfully acquired. On the issue of Anthropic’s use of pirated copies, notable developments came in August and September 2025, when the case was certified as a class action and preliminary approval of a $1.5 billion settlement was announced. Nearly half a million works have claims in this case through the certified class, meaning the potential statutory damages Anthropic could have faced had it lost were astronomical, so the company chose to settle for a smaller amount to avert that risk.
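To put "astronomical" in rough numbers, the short sketch below multiplies the number of works by the per-work statutory damages figures in 17 U.S.C. § 504(c) ($750 minimum, up to $150,000 for willful infringement). The count of 500,000 works is an assumption that rounds up "nearly half a million," and the function name is purely illustrative. Any actual award would have depended on findings at trial, so this is only a back-of-the-envelope illustration.

```python
# Per-work statutory damages figures from 17 U.S.C. § 504(c).
STATUTORY_MIN = 750        # minimum award per infringed work
WILLFUL_MAX = 150_000      # ceiling per work if infringement is willful


def exposure_range(num_works: int) -> tuple[int, int]:
    """Return (lowest, highest) total statutory damages for num_works works."""
    return num_works * STATUTORY_MIN, num_works * WILLFUL_MAX


# Assumption: "nearly half a million" works is rounded to 500,000.
works = 500_000
settlement = 1_500_000_000  # the $1.5 billion settlement figure

low, high = exposure_range(works)
print(f"Statutory range: ${low:,} to ${high:,}")
# Statutory range: $375,000,000 to $75,000,000,000

print(f"Settlement per work: ${settlement / works:,.0f}")
# Settlement per work: $3,000
```

Under these assumptions, the settlement works out to roughly $3,000 per work, above the statutory minimum but far below the willful-infringement ceiling.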

The judge’s opinion in Kadrey v. Meta, despite an overall finding of fair use, proved to be more controversial. As in Bartz, the judge in Kadrey found the use of copyrighted works as training data for AI models to be transformative, noting that the original purpose of the works was to be “read for entertainment or education” and that using the works in various AI functions was a sufficiently different purpose.[4] The fourth factor, however, is where the Kadrey decision differs significantly from the analysis in Bartz. The judge in Kadrey employs a novel “market dilution” theory, arguing that “[n]o other use…has anything near the potential to flood the market with competing works the way that LLM training does.”[5] In the judge’s opinion, the competing works serve as indirect substitutes for the copyrighted works (instead of the direct substitutes that are typically considered under the fourth factor in a fair use analysis). The judge finds that even a transformative use can produce an indirect market substitute under the market dilution theory: “No matter how transformative LLM training may be, it’s hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books.”[6] Kadrey also addresses the analogy from Bartz about teaching children to write, with the judge arguing that “using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.”[7] In short, the opinion in Kadrey emphasizes the unprecedented nature of AI technology and how it may affect a copyright owner’s ability to capitalize on their work. Though the court ultimately grants summary judgment for Meta on its fair use defense,[8] it outlines an analysis that may prove prohibitive for those relying on fair use to train AI models on copyrighted works.

It is important to remember that each fair use case is different and highly fact dependent. Results in cases involving fair use and AI will likely continue to vary with the facts of each case, just as Bartz and Kadrey did, so it is wise to refrain from generalizing from only two decisions. Hopefully, as courts perform more fair use analyses and release more opinions, a consistent thread will emerge, giving a clearer picture of what is and is not fair use in the context of AI.


[1] Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal.), gov.uscourts.cand.434709.231.0.pdf

[2] Id.

[3] Id.

[4] Kadrey v. Meta Platforms, Inc., No. 3:23-cv-03417 (N.D. Cal.), gov.uscourts.cand.415175.598.0.pdf

[5] Id.

[6] Id.

[7] Id.

[8] According to the judge in Kadrey, “the plaintiffs presented no meaningful evidence on market dilution at all,” which was a driving force in the fourth factor favoring the defendant and resulted in a finding of fair use. Had the plaintiffs been able to demonstrate market dilution, the fourth factor might have gone to a jury, or the plaintiffs might have been able “to win on the fair use issue at summary judgment.”


By Landen Stafford (Copyright Services Specialist at Copyright Services, The Ohio State University Libraries)

Artificial Intelligence and the Public Domain

Note: On January 1, 2026, a new batch of works entered the public domain in the United States—those that were published or registered in 1930. These works, if they met all required copyright formalities, received the maximum term of protection of 95 years. They now join countless other works already in the public domain in the United States and are free of copyright. This means they may be freely copied, adapted, distributed, performed, and displayed without permission from a rightsholder.
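The date arithmetic behind Public Domain Day can be sketched in a few lines. The example below assumes a work that received the full 95-year term described above, with protection running through the end of the 95th calendar year after publication. The function name is hypothetical, and this is not a general copyright term calculator, since works from other eras or works that missed required formalities follow different rules.

```python
from datetime import date

TERM_YEARS = 95  # maximum term of protection cited above for these works


def public_domain_date(publication_year: int, term_years: int = TERM_YEARS) -> date:
    """Return the January 1 on which a work with the full term becomes free of copyright.

    Assumes protection lasts through December 31 of the final year of the term.
    """
    last_protected_year = publication_year + term_years
    return date(last_protected_year + 1, 1, 1)


print(public_domain_date(1930))  # 2026-01-01
```

By the same arithmetic, works from 1931 that received the full term would follow on January 1, 2027.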

Another Public Domain Day has come and gone, and the public domain in the United States continues to expand. It is an increasingly valuable and essential part of the copyright lifecycle, making creative works freely available to use and to inspire new works. Given its importance in the world of intellectual property and the creative cycle, one might wonder how the public domain and the works within it interact with the most influential technology of today, artificial intelligence. Generally speaking, the relationship between AI and the public domain is reciprocal: the public domain provides works that serve as training data for AI models, which in turn generate new works that are often in the public domain. Examining the nuances of both sides of this relationship can make it clearer.

Generative AI Outputs and the Public Domain

To receive copyright protection in the United States, a work must contain human authorship, among other requirements. The outputs of generative AI can be protected by copyright only where a human author has determined sufficient expressive elements. If a human adds creative elements to the AI-created output, such as arrangements or modifications, then they may be able to claim copyright in their contributions. Providing prompts alone, however, does not supply the requisite human authorship for copyrightability because the resulting output is considered the expression of the artificial intelligence, not the user. Therefore, strictly AI-generated works (those created solely by a machine without sufficient human intervention) are not eligible for copyright protection and are in the public domain in the United States. In January 2025, the U.S. Copyright Office released Part 2 of its report on Copyright and Artificial Intelligence, which further explores the copyrightability of outputs created by generative AI.

Training AI Using Public Domain Works

The public domain is composed of a large amount of high-quality content, including some of history’s most prominent works of literature, musical compositions, works of the U.S. Federal Government, obscure artwork, and even recent works that have been dedicated to the public domain by the author. This vast source of content is desirable as training data for AI models not only because of the sheer volume available but also because the materials are equally diverse, and both qualities are crucial for high-performing, accurate AI tools. Also advantageous is that public domain works are free of copyright, meaning they can be used to train AI models without negotiating a license, paying royalties, relying on fair use, or risking claims of copyright infringement (see Chat GPT is eating the world for a list of current litigation involving AI companies).

One obstacle to using public domain materials is that many of the works exist only in physical form in libraries and archives, and works must be available in digital form to be useful for AI training. Institutions such as the Harvard Law School Library have recognized this obstacle and responded by compiling the Institutional Book Corpus, a collection of almost one million public domain books that have been digitized and made available for anyone to use as data to train AI models. The corpus, released through the library’s Institutional Data Initiative, contains books in 379 unique languages covering a variety of topics including language and literature, law, philosophy, psychology, religion, science, social science, political science, agriculture, and medicine. By stewarding these public domain books in such a proactive way, institutions like the Harvard Law School Library increase access to high-quality content that can be used for the ethical development of AI technology, which is crucial as demands for transparency around the data on which AI models are trained are louder than ever.

Interested in learning more about the public domain? Visit the Public Domain Day website to read about the Public Domain Project at The Ohio State University and to view additional copyright and public domain resources.

By Landen Stafford (Copyright Services Specialist at Copyright Services, The Ohio State University Libraries)