Fair Use Week is an annual celebration of the legal doctrine of fair use, which plays an essential role in teaching, education, and scholarship. The fair use doctrine allows for the use of copyrighted works in certain circumstances, which is determined using a four-factor test that considers the purpose of the use, the nature of the copyrighted work, the amount and substantiality used, and the effect of the use on the market for the copyrighted work.

Fair Use in Current Artificial Intelligence Lawsuits

A year ago, in recognition of Fair Use Week, Copyright Corner wrote about the intersection of fair use and artificial intelligence (AI). That post explored how AI might interact with a four-factor fair use analysis, summarized guidance from Parts 1 and 2 of the U.S. Copyright Office’s Copyright and Artificial Intelligence report, and provided an overview of some current AI lawsuits involving copyright. At the time, a summary judgment ruling in Thomson Reuters v. Ross Intelligence had recently been released, giving an early look into how courts might rule on fair use in the context of AI. Since then, the U.S. Copyright Office has published Part 3 of its AI report, Generative AI Training, and courts have issued further rulings in cases involving copyright and AI. Both developments expand on how the standing legal frameworks in copyright law, and fair use in particular, may be applied to the unprecedented advances in AI technology. To keep up with how courts are applying established principles of the fair use doctrine to AI, let’s explore some of the key takeaways from recent decisions in Bartz v. Anthropic and Kadrey v. Meta.

In Thomson Reuters v. Ross, the court rejected a fair use defense for the use of copyrighted works to train an AI legal search tool. Weighing the first factor, the court held that Ross was using the headnotes as AI data to create a competing legal research tool, which was not a transformative use. Under the fourth factor, the court found that Ross’s legal research tool served as a market substitute and also considered the effect of Ross’s use on a potential market for AI training data. Perhaps unsurprisingly to those familiar with fair use analyses, those same two factors, the first and the fourth, play a large role in the decisions in Bartz and Kadrey, which were released only two days apart in June 2025. In both cases, lawsuits were brought against companies that used copyrighted works to train large language models without authorization from the copyright owners. Both defendants raised fair use defenses, and in both cases the judges ruled in favor of fair use, at least in part. The analysis the judges provide in their opinions, however, offers much more than a simple fair use finding.

In Bartz v. Anthropic, the court denied summary judgment to Anthropic on its use of pirated copies of copyrighted works to assemble a central library, but it also found that training AI on copyrighted works is a fair use if the works are acquired legally. According to the judge, Anthropic’s use of legally acquired works to train AI models was “transformative—spectacularly so,”[1] which supports a fair use finding under the first factor. When evaluating any potential market harm under the fourth factor, the judge found “copies used to train specific LLMs did not and will not displace demand for copies of Authors’ works” because there is no evidence that the AI would produce exact copies of the works, meaning there is no direct substitution.[2] The judge went on to analogize the plaintiffs’ argument for potential market harm: “Authors’ complaint is no different than it would be if they complained that training school children to write well would result in an explosion of competing works.”[3] Ultimately, the decision in Bartz v. Anthropic suggests that companies training AI on copyrighted works may be able to rely on fair use when the works they are using are lawfully acquired. On the issue of Anthropic’s use of pirated copies, notable developments came in August and September 2025, when the case was certified as a class action and preliminary approval of a $1.5 billion settlement was announced. Nearly half a million works have claims in this case through the certified class, meaning the potential statutory damages Anthropic could have faced had it lost were astronomical, so it decided to settle for a smaller amount to avert that risk.

The judge’s opinion in Kadrey v. Meta, despite an overall finding of fair use, proved to be more controversial. Similar to Bartz, the judge in Kadrey found the use of copyrighted works as training data for AI models to be transformative, noting that the original purpose of the works was to be “read for entertainment or education” and that using the works in various AI functions was a sufficiently different purpose.[4] The fourth factor, however, is where the Kadrey decision differs significantly from the analysis in Bartz. The judge in Kadrey employs a novel “market dilution” theory, arguing that “[n]o other use…has anything near the potential to flood the market with competing works the way that LLM training does.”[5] In the judge’s opinion, the competing works serve as indirect substitutes for the copyrighted works (instead of the typical direct substitutes that are considered under the fourth factor in a fair use analysis). The judge finds that even a transformative use can produce an indirect market substitute under the market dilution theory: “No matter how transformative LLM training may be, it’s hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books.”[6] The same analogy from Bartz about teaching children to write is addressed in Kadrey, with the judge arguing that “using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.”[7] Ultimately, the opinion in Kadrey emphasizes the unprecedented nature of AI technology and how it may affect a copyright owner’s ability to capitalize on their work. Though the court ultimately grants summary judgment for Meta on its fair use defense,[8] it outlines an analysis that may prove prohibitive for those relying on fair use to train AI models on copyrighted works.

It is important to remember that each fair use case is different and highly fact dependent. Results in fair use and AI cases will likely continue to vary with the facts of each case, just as they did in Bartz and Kadrey, so it is wise to refrain from generalizing from only two decisions. Hopefully, as courts perform more fair use analyses and release more opinions, a more consistent thread will emerge, giving a clearer picture of what is and is not fair use in the context of AI.

[1] Bartz v. Anthropic PBC, No. 3:24-cv-05417 (N.D. Cal.), gov.uscourts.cand.434709.231.0.pdf

[2] Id.

[3] Id.

[4] Kadrey v. Meta Platforms, Inc., No. 3:23-cv-03417 (N.D. Cal.), gov.uscourts.cand.415175.598.0.pdf

[5] Id.

[6] Id.

[7] Id.

[8] According to the judge in Kadrey, “the plaintiffs presented no meaningful evidence on market dilution at all,” which was a driving factor in the fourth factor favoring the defendant and resulted in a finding of fair use. Had the plaintiffs been able to demonstrate market dilution, the fourth factor may have gone to a jury or they may have been able “to win on the fair use issue at summary judgment.”

By Landen Stafford (Copyright Services Specialist at Copyright Services, The Ohio State University Libraries)

This week is Fair Use Week, an annual celebration of the legal doctrine of fair use, which plays an essential role in teaching, education, and scholarship. This year, we are looking at the development of fair use in Generative Artificial Intelligence.

What is Fair Use?

The fair use doctrine allows for the use of copyrighted works in certain circumstances, which is determined using a four-factor test that considers the purpose of the use, the nature of the copyrighted work, the amount and substantiality used, and the effect of the use on the market for the copyrighted work. Fair use is purposely vague to avoid unnecessarily limiting the use of copyrighted materials, but this vagueness can also leave it uncertain whether a use is a fair use or an infringement until it is challenged in court. Nowhere are that vagueness and uncertainty more prevalent than in the current climate around fair use and artificial intelligence.

The Role of Fair Use in Generative Artificial Intelligence

As the growing number of lawsuits brought against AI companies indicates (see ChatGPT Is Eating the World), many copyright owners believe that including copyrighted works without permission in datasets used to train AI tools constitutes infringement, as do outputs produced by AI tools that are copies of or significantly similar to the copyrighted works. AI companies rely, in part, on fair use to defend their use of copyrighted works. As is true with any fair use case, courts determine the strength of a fair use argument by balancing the fair use factors to see whether, taken together, they favor the use. Let’s explore how each factor might apply to AI.

Factor One: The Purpose and Character of the Use

When considering the purpose of the use, which is the first fair use factor, courts weigh the potential commerciality of the AI companies’ use against their claim of transformative use. Any commercial benefit that AI companies stand to gain from using the copyrighted works will weigh against a finding of fair use, which has a significant impact on AI tools that require a paid subscription. However, if companies can successfully argue that their use is transformative and adds value that is new and different from the original purpose of the copyrighted work, that will weigh in favor of fair use. The transformative use, according to AI companies, is that copyrighted works are being used as data to help AI models recognize patterns that will in turn help them generate new and unique content. A transformative use argument is also considered with respect to the output generated by the AI tool. If the output is substantially similar to the original copyrighted work and both works share the same or a highly similar purpose, the use may not be considered transformative.[1]

Factor Two: The Nature of the Copyrighted Work

The second fair use factor is the nature of the copyrighted work, which examines characteristics such as whether the work is factual or fictional and whether it is published or unpublished. The use of highly creative works such as novels and song lyrics, which are often used to train AI tools, typically weighs against fair use.

Factor Three: The Amount and Substantiality of the Portion Used

The third factor evaluates the amount and substantiality of the copyrighted work used in relation to the copyrighted work as a whole. Typically, a larger portion of a copyrighted work used, or the use of the heart of a work, weighs against fair use. However, if the use of an entire work is appropriate to accomplish a favored use, such as a use that is transformative, it may not weigh against fair use. AI companies could argue that ingesting anything less than the entirety of copyrighted works would lessen the accuracy of their AI tools and hamper their ability to achieve their transformative use in training the tool.

Factor Four: Market Effect

Under the fourth fair use factor, courts consider whether the use has an effect on the market for the copyrighted work. If the value of a copyrighted work is affected by its use to train AI tools, that would weigh against fair use, as would any situation where the use serves as a market substitute for the original copyrighted work. For example, some copyright owners license their works for monetary gain. If an AI company chooses to avoid a readily available license and uses the copyrighted work without permission, that choice would have a direct negative effect on the value of the work. Additionally, if a generated output is a copy of or substantially similar to the copyrighted work, it could act as a substitute for the copyrighted work, again directly affecting the market.

None of the fair use factors are determinative on their own—a use that is found to be transformative does not guarantee that a court will rule in favor of fair use. There may be other factors that weigh heavily in favor of the copyright owner that will cumulatively force a ruling against fair use. All of that to say, fair use cases greatly depend on the specific facts of each unique case, making it difficult to support any generalizations that you may hear about fair use and AI.

Current AI Lawsuits

As noted above, issues of copyright infringement and fair use are currently being litigated in court. Most recently, the U.S. District Court for the District of Delaware released a new summary judgment ruling in Thomson Reuters v. Ross Intelligence, rejecting a fair use defense for the use of copyrighted works to train an AI legal search tool. In the case, Ross Intelligence trained its legal-research search engine using Bulk Memos, which consisted of compilations of legal questions and answers incorporating Westlaw headnotes (summaries of key points of law and case holdings).[2] In considering the fair use factors, the court held that Ross’s use was not transformative; Ross was using the headnotes as AI data to create a competing legal research tool. Additionally, the court found that Ross’s legal research tool served as a market substitute for Westlaw and also considered the effect of Ross’s use on a potential market for AI training data.

Two other major cases currently making their way through the courts that address fair use in the training of AI tools are The New York Times Company v. Microsoft Corporation, involving the use of New York Times articles to train OpenAI’s large language models, and Authors Guild v. OpenAI, involving the use of works from a class of professional fiction writers to train OpenAI’s large language models.

We have written before about The New York Times v. Microsoft case. In its complaint, The New York Times claims that OpenAI has unlawfully used The Times’s works, including articles, in-depth investigations, opinion pieces, reviews, and how-to guides, to train the large language models that power Copilot (previously Bing Chat) and ChatGPT. The New York Times states these AI tools “can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style.”[3] According to Microsoft and OpenAI, large language models can be trained to recognize patterns in data, but reproduction of entire copyrighted works is not what the models and tools are designed to produce.[4]

OpenAI and Microsoft are also facing a lawsuit by the Authors Guild. In its amended complaint filed on December 4, 2023, the Authors Guild states that ChatGPT produces summaries of copyrighted text used in the training of the tool and the large language model underlying the tool, and that these summaries are themselves derivative works. The Authors Guild also asserts that the plaintiff authors have suffered harm from the use of their copyrighted works, including lost opportunities to license their works and displacement of human-authored books.

Guidance from the United States Copyright Office

In 2023, the United States Copyright Office began examining the copyright law and policy issues raised by generative artificial intelligence in the scope of creating works and using copyrighted works in the training of AI. Their comprehensive initiative included public listening sessions, registration guidance for AI generated works, and publishing a Notice of Inquiry seeking public input on copyright issues raised by artificial intelligence. Their report, Copyright and Artificial Intelligence, analyzes copyright law and policy issues raised by artificial intelligence. The report will be issued in three parts.

Part 1 of the Copyright and Artificial Intelligence report was published on July 31, 2024, and addressed the topic of digital replicas. Part 2 of the report, published in January 2025, focuses on the copyrightability of outputs created using generative AI. The report states that existing principles of copyright law are flexible enough to apply to this new technology, as they have applied to technological innovations in the past. The report also concludes that the outputs of generative AI can be protected by copyright only where a human author has determined sufficient expressive elements. This can include situations where a human-authored work is perceptible in an AI output, or where a human makes creative arrangements or modifications of the output, but not the mere provision of prompts. The report confirms that the use of AI to assist in the process of creation or the inclusion of AI-generated material in a larger human-generated work does not bar copyrightability. It also finds that the case has not been made for changes to existing law to provide additional protection for AI-generated outputs.

Emerging Industry Solutions

As courts continue to work through these copyright issues and the U.S. Copyright Office completes its research and guidance, some have turned to licensing deals to meet AI training needs. Approaches have included opt-in models, such as the one offered by Cambridge University Press, that allow authors to opt in to future licensing agreements with generative AI providers. Some opt-in models also offer payment to the author. The recent deal between Microsoft and HarperCollins, for example, allows authors to opt in to the AI training program with a payment of $5,000 per title, with half of that amount going to the author. AI training datasets may also avoid copyright issues by limiting data to public domain works. In December 2024, for example, Harvard announced the Institutional Data Initiative, backed by Microsoft and OpenAI, which intends to share a dataset that includes 1 million public domain books.

What’s Next?

We await the U.S. Copyright Office’s much-anticipated third report on AI, which is set to explore “the legal implications of training AI models on copyrighted works” and, we hope, provide practical guidance on the subject. Between that report and the many case rulings that may be forthcoming, the aforementioned vagueness and uncertainty may gradually give way to functional clarity on how to approach the intersection of fair use and artificial intelligence.

See the resources listed below for more information on fair use and artificial intelligence:

Congressional Research Service, Generative Artificial Intelligence and Copyright Law (September 29, 2023). Available at: https://crsreports.congress.gov/product/pdf/LSB/LSB10922
United States Copyright Office, Copyright and Artificial Intelligence, https://copyright.gov/ai/
Knibbs, Kate, Every AI Copyright Lawsuit in the US, Visualized, Wired, https://www.wired.com/story/ai-copyright-case-tracker/ (last updated December 19, 2024).

[1] In Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith, the U.S. Supreme Court found that the Andy Warhol Foundation’s use of Goldsmith’s photograph of Prince shared “substantially the same purpose” as the original, and that the “use is of a commercial nature,” affirming the Second Circuit Court of Appeals’ decision that the Foundation’s use did not qualify as fair use.

[2] The court holds that while the judicial opinions from which the headnotes are derived are not copyrightable, the headnotes “can introduce creativity by distilling, synthesizing, or explaining part of an opinion, and thus be copyrightable.” Thomson Reuters Enterprise Centre GmbH and West Publishing Corp. v. Ross Intelligence Inc., Case No. 1:20-cv-613-SB (D. Del. 2025), 7, https://www.ded.uscourts.gov/sites/ded/files/opinions/20-613_5.pdf

[3] The New York Times Company v. Microsoft Corporation, et al., Case No. 1:23-cv-11195, United States District Court, Southern District of New York, https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf (Filed on Dec. 27, 2023).

[4] Allyn, Bobby. “‘The New York Times’ Takes OpenAI to Court. ChatGPT’s Future Could Be on the Line.” NPR, 14 Jan. 2025, www.npr.org/2025/01/14/nx-s1-5258952/new-york-times-openai-microsoft.

By Allison Schultz (Instructional Designer & Library Liaison, Ohio State Online), Landen Stafford (Copyright Services Specialist, Copyright Services), and Maria Scheid (Head, Copyright Services)

Copyright Corner

Tag: Generative Artificial Intelligence

Fair Use and Artificial Intelligence 2026 Update

Fair Use Week 2025: Fair Use and Artificial Intelligence
