Readers expect authenticity. AI delivers efficiency. Can both coexist within journalism? What happens to journalism when the most efficient “writer” is a mimic of the style of every article that came before it? Generative Artificial Intelligence (Gen AI) is a class of advanced AI systems that produce new content, such as written articles, images, or audio, by analyzing prior human work and large datasets available online. Within media platforms, it can draft articles, edit copy, and shape news coverage, capabilities that go well beyond autocorrect and cut into questions of authenticity. Though the initial implementation of these tools cannot be traced to a specific point in time, the resulting shift in journalistic credibility is now about more than technology; it concerns the conditions under which the public understands truth.
A news story is more than text; it is a collection of informed choices: which sources are trusted, which facts are verifiable, and which words accurately represent events. When Gen AI processes huge amounts of data and is prompted to generate articles that are published to inform the public, there is a risk of weakening the link between the trained journalist and the people who trust the media they consume. This raises the question: who owns the output, and who is responsible for its outcomes? A recent wave of copyright and intellectual property litigation debates an unprecedented question: Is it legal for Gen AI to learn from the work of media companies in order to build products that compete with those same companies? Although there are now thousands of cases regarding AI and copyright infringement, three will be discussed in detail because of their potential to set precedent across the IP legal landscape. In The New York Times Co. v. Microsoft Corp., Reddit v. Anthropic, and Kadrey v. Meta Platforms, Inc., news organizations, platforms, and independent authors are fighting to defend the humanity behind the written work that AI platforms are attempting to model.
The New York Times Co. v. Microsoft Corp. (Southern District of New York) is an ongoing lawsuit in which the Plaintiff (the Times) claims that the Defendant (Microsoft) copied millions of Times articles without permission as training material for its AI models (see The New York Times Co. v. Microsoft Corp., No. 23-cv-11195 (SHS), ECF No. 514 (S.D.N.Y. Apr. 4, 2025)). The Times argues that this use of its content is illegal and harms its business (NYT v. Microsoft, No. 23-cv-11195 (S.D.N.Y. 2025)). The decision in this case, among others, will determine whether AI companies can use copyrighted reporting for training and whether doing so allows them to enter the market as direct competitors to the news organizations on whose work they were trained. The case marks one of the first major attempts to apply copyright as a protection mechanism in an environment where AI can replicate the valued skill associated with professional journalism. The claim focuses on the economic function of copyright: protecting investment in original reporting by preventing others from exploiting it for competing commercial gain. While fair use traditionally allows limited use of copyrighted material for scholarship, criticism, or transformative purposes, the Plaintiff argues that this use does not fall within fair use principles (NYT v. Microsoft, No. 23-cv-11195 (S.D.N.Y. 2025)).
The Copyright Act, codified in Title 17 of the United States Code (as amended through December 2024), defines four factors of fair use: (1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for, or value of, the copyrighted work (17 U.S.C. § 107 (2024)). The concern is direct and commercial in nature: if AI systems trained on journalism can offer a competing product, the courts must determine whether copyright still effectively protects journalism. Without legal support, the line between authentic writing and copyright infringement of human work is placed at risk.
Similarly, the main dispute in the ongoing case of Reddit v. Anthropic (San Francisco County Superior Court) raises concerns about how AI companies obtain and use online content for model training when that content is created by users who never consented to its commercial use. This case involves user-generated posts, including deleted material, being treated as training data for Anthropic’s Claude LLMs (see Reddit, Inc. v. Anthropic PBC, No. CGC-25-625892 (Cal. Super. Ct. San Francisco Cty. June 4, 2025)). Reddit alleges that Anthropic trained its models on Reddit data without authorization and continued to collect Reddit content even after claiming to have blocked its systems from accessing the site (Reddit, No. CGC-25-625892). This raises the question of whether the collection of user-generated content for AI training is a form of unlicensed commercial use and whether existing legal frameworks are enough to protect writers against the extraction of their data and ideas. Beyond questions of user consent, it is essential to examine authorship itself and the risks individual authors face in this copyright race.
Kadrey v. Meta Platforms, Inc. (Northern District of California) is an ongoing lawsuit brought by three authors who claim Meta illegally used their books to train its LLaMA (Large Language Model Meta AI) models (see Kadrey v. Meta Platforms, Inc., No. 3:23-cv-03417 (N.D. Cal. Mar. 7, 2025)). In this case, the court concluded that the plaintiffs failed to show that the AI’s outputs reproduced their copyrighted text, a ruling that limits copyright protection to near-exact copying rather than an author’s style and ideas (Kadrey, No. 3:23-cv-03417). Copyright law, then, may not be sufficient to protect the original expression characteristic of journalism. This raises a concern: if AI can consume and mimic tone, structure, and writing methods without legally being considered to have “copied,” the traditional skill set that sets human writers apart becomes harder to defend. AI-generated content may compete with human work, and the lack of legal restrictions will make it hard to uphold the economic and creative value of authorship. This creates a vicious cycle: even if a journalist produces more and better reporting, that work only becomes more training data for AI to imitate; the journalist is effectively stuck training their replacement until they are outcompeted.
The cases discussed demonstrate that as courts begin to address the legal implications of AI, they are also part of a broader conversation about how news is created and trusted. Media expression has always relied on a human connection between the reporter and the reader, but as AI becomes more capable of producing this valuable text, we need to ensure technology does not distance the public from human responsibility. There is value in human expression through the unique capacities for ethics and creativity in written work, and that value must be protected to preserve quality and trust in media.
Citations:
17 U.S.C. § 107 (2024)
Beyond Copyright: Reddit’s Lawsuit Against Anthropic, AI Law & Policy (June 17, 2025), https://www.ailawandpolicy.com/2025/06/beyond-copyright-reddits-lawsuit-against-anthropic/#_ftn1
Kadrey v. Meta Platforms, Inc., No. 3:23-cv-03417, ECF No. 598 (N.D. Cal. Mar. 7, 2025).
The New York Times Co. v. Microsoft Corp., No. 1:23-cv-11195, ECF No. 514 (S.D.N.Y. Apr. 4, 2025).
Reddit, Inc. v. Anthropic PBC, No. CGC-25-625892 (Cal. Super. Ct. S.F. Cnty. June 4, 2025) (complaint).
