The Volokh Conspiracy

Mostly law professors | Sometimes contrarian | Often libertarian | Always independent

Free Speech

Journal of Free Speech Law: "Inevitable Errors: Defamation by Hallucination in AI Reasoning Models," by Lyrissa Lidsky & Andrew Daves

This new article is here. The Introduction:

Artificial general intelligence is "probably the greatest threat to the continued existence of humanity." Or so claims OpenAI's Chief Executive Officer Sam Altman. In a seeming paradox, OpenAI defines its mission as ensuring "that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity."

Whether artificial general intelligence becomes a universal boon or an existential threat—or both—there is general agreement concerning its revolutionary potential. Indeed, Microsoft founder Bill Gates has called it "as fundamental an advance as the personal computer or the internet," and Google CEO Sundar Pichai has predicted that it will "bring about a more profound shift in human life than electricity or Promethean fire."

Thus far, AI systems are not generally smarter than humans. Not yet. Large Language Models (LLMs), however, are advancing at a startling pace. LLMs use artificial intelligence to synthesize massive amounts of textual data and then predict text and generate responses to users in "natural" human language. On a scale measuring the progress of LLMs toward general intelligence, OpenAI's flagship model, GPT-3, scored zero percent in 2020. Just five years later, OpenAI's o3-preview scored between 75% and 88%. Meanwhile, OpenAI competitors such as Anthropic, Google, and DeepSeek are likewise racing to deliver on the promise of "systems that can think and act rationally in ways that mirror human behavior and intelligence."

Even as LLMs make progress toward general intelligence, some AI systems have already exceeded human performance on narrow, clearly scoped tasks. For example, chess engines have been performing at superhuman levels for years, and AI models can now help detect breast cancer far earlier than human experts—and the models continue to improve. Meanwhile, OpenAI's o1 reasoning model has an LSAT score higher than that of the median student admitted to the law schools at Harvard, Yale, and Stanford.

As AI systems begin to mirror human thought, it pays to remember the words of Seneca: "[N]ot all mirrors stick to the truth." LLMs now regularly create outputs that appear to be the product of independent thought, but LLMs are essentially prediction engines that "answer" prompts (or inputs) by calculating which words are most likely to come next and then assembling them into an output. LLMs, as such, do not predict truth but instead predict probabilities. In doing so, they sometimes replicate false information common in their training data.
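
As a rough illustration of this prediction-engine point, here is a toy sketch in Python (an illustration only, not the authors' code and not any actual LLM architecture): it learns only which word most often follows which in a tiny "training" text, then generates by always choosing the statistically most likely continuation. Nothing in the loop consults the truth of the result.

```python
# Toy next-word predictor: counts which word most often follows a given
# word in the training text, then generates by repeatedly picking the
# most probable continuation. It predicts likelihood, not truth.
from collections import Counter, defaultdict

training_text = (
    "the defendant was accused of fraud . "
    "the defendant was cleared of fraud . "
    "the defendant was accused of embezzlement ."
).split()

# Count word -> next-word frequencies from the training data.
next_word_counts = defaultdict(Counter)
for current, nxt in zip(training_text, training_text[1:]):
    next_word_counts[current][nxt] += 1

def generate(prompt_word: str, length: int = 6) -> list[str]:
    """Greedily emit the most probable next word at each step."""
    output = [prompt_word]
    for _ in range(length):
        counts = next_word_counts.get(output[-1])
        if not counts:
            break
        output.append(counts.most_common(1)[0][0])
    return output

print(" ".join(generate("defendant")))
# Prints: "defendant was accused of fraud . the"
# "accused" wins only because it is the more frequent continuation
# in the training text, not because it is accurate.
```

In this toy, "accused of fraud" beats "cleared of fraud" purely on frequency; real LLMs are vastly more sophisticated, but the same dynamic of probable-sounding rather than verified text underlies the hallucinations discussed next.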

They also inevitably produce "plausible yet false outputs," commonly referred to as hallucinations. An LLM may produce fake legal documents, non-existent academic citations, or false biographical data. Although LLM producers and users can employ various tactics to reduce hallucinations, these errors cannot be eliminated. And they are quite prevalent. In fact, data gathered from multiple chatbots suggests that hallucinations occur in 3% to 10% of outputs, leading one legal scholar to dub LLMs "Large Libel Models."

Radio host Mark Walters found out about hallucinations the hard way. Walters is the first person in the United States to sue an LLM producer for defamation. Walters alleged that OpenAI's Large Language Model iteration operating at the time, GPT-3.5, falsely accused him of fraud and embezzlement. The accusation, which was unequivocally false and potentially reputation-damaging, arose in response to a prompt from a journalist named Frederick Riehl. Fortunately, Riehl had more than an inkling that the accusation was false, and he contacted Walters shortly after receiving it. No one else received the hallucination. Walters nonetheless sued OpenAI for its defamatory output.

The rise of defamation-by-hallucination cases presents significant challenges for defamation law. In most states and in most cases, defamation liability hinges on the defendant's scienter, or mental state. As a matter of tort law, plaintiffs generally must establish that the defendant was at least negligent in publishing a defamatory, false, factual statement about them. As a matter of constitutional law, plaintiffs who are public officials or public figures must prove actual malice on the part of the defendant: In other words, such plaintiffs must prove that the defendant knew at the time of publishing that the defamatory statement was false or recklessly disregarded its falsity. To state the obvious, it is not possible to judge whether an LLM "knew" of or was reckless as to the falsity of an output prior to publishing it in response to a prompt. Nor is it possible to determine whether the LLM exercised reasonable care. Only human actors have scienter as to falsity, or even as to the act of publishing.

Generative AI is not the first technology to which defamation law has had to adapt. Laws addressing infliction of reputational harm predate the printing press, and each new technology of communication has forced the common law to reexamine how to protect reputation without unduly deterring valuable speech. In the United States, courts have long adapted tort doctrines to achieve what they perceive to be sound communications policy. This was true even before the Supreme Court intervened to balance state interests in protecting reputation against free speech concerns. Adapting defamation law to generative AI is simply the latest challenge in defamation law's long and ongoing evolution.

Nonetheless, the stakes are high. If defamation law imposes liability on AI companies for every defamatory hallucination, the potential of LLMs as a tool for discovery, data analysis, information gathering, and creativity enhancement may be squandered. In fact, it is easy to imagine a new form of barratry emerging, in which lawyers and others search name after name to generate endless numbers of new clients. The costs of defending against claims and the threat of massive, unpredictable monetary awards may cause developers of Large Language Models to implement excessive filtering and content moderation, shift costs to end users, and reduce accessibility. In addressing defamation by hallucination, therefore, legal decision-makers must balance the desire to impose accountability for reputational harm with the need to foster the development of an important and perhaps revolutionary medium of information gathering and generation.

This article proposes a two-pronged approach to address harms caused by the newest AI reasoning models. First, this article introduces a framework for applying centuries-old tort doctrines to these models. Rooted in tort law rather than the First Amendment, this framework takes as a given that AI "speech" is valuable to recipients and that public access to powerful, new information-gathering tools is essential to sound communications policy. Second, recognizing that tort law alone cannot adequately address the reputational harms the newest models pose, this article argues for statutory reforms focused on reducing such harms and making them easier to prove when they do occur.

Part I examines the application of existing defamation doctrines—both common law and constitutional—to "foundation-model" generative AI speech, highlighting the points at which existing doctrines may be deployed to accommodate communications policy concerns. This Part examines these doctrines through the prism of the trial court's order in Walters v. OpenAI, which addressed (1) whether a hallucination generated by an LLM in response to a prompt is a factual assertion when the prompter has serious reason to doubt the veracity of the output; (2) how to apply defamation's scienter requirements when the speech in question is generated by an LLM; and (3) whether a plaintiff can recover presumed damages when the evidence indicates no damage actually occurred. This Part also explores a fourth doctrine left unexamined in the Walters order: namely, whether an LLM is a publisher or a distributor of the content it produces.

Part II highlights the challenges and opportunities posed by the newest LLM reasoning models, particularly "chain of thought" models. Reasoning models can "understand" concepts, contexts, and ambiguity in ways that their predecessors could not, which makes them more capable than their predecessors of solving "complex reasoning and problem-solving tasks." Chain-of-thought models, in particular, promise to deliver greater accuracy (i.e., more correct results) as their final outputs. Even as these models produce more reliable final outputs, however, new avenues for defamatory outputs arise in their intermediate "chains of thought." Moreover, even those reasoning models that "show their sources" continue to produce hallucinated outputs. Relying on the latest computer-science research on hallucinations, and particularly on a hallucination "taxonomy" developed by Yale researchers, this Part demonstrates that hallucinations are inevitable and, in some instances, surprisingly valuable. This fact has significant implications for those seeking to "regulate" hallucinations by way of defamation law.
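
To illustrate the structural point about intermediate "chains of thought" (a hypothetical sketch only; the class and function names below are invented for illustration and do not correspond to any real reasoning-model API), the following snippet separates a model's step-by-step reasoning text from its final answer. Because the intermediate steps are themselves model-generated text, a false factual claim can surface there even when the final answer is careful.

```python
# Hypothetical sketch of a chain-of-thought response structure.
# Both the intermediate steps and the final answer are generated text;
# if the steps are logged or surfaced, each one is publishable content.
from dataclasses import dataclass

@dataclass
class ReasoningOutput:
    chain_of_thought: list[str]  # intermediate, step-by-step text
    final_answer: str            # the polished output shown to the user

def simulated_reasoning_model(prompt: str) -> ReasoningOutput:
    """Stand-in for a reasoning model; the strings are illustrative only."""
    return ReasoningOutput(
        chain_of_thought=[
            "Step 1: The prompt asks about a lawsuit involving person X.",
            "Step 2: Training data loosely links X to a fraud complaint.",  # a hallucinated link could appear here
            "Step 3: That link is unverified, so the answer should hedge.",
        ],
        final_answer="I could not verify any fraud allegation against X.",
    )

result = simulated_reasoning_model("Summarize the complaint involving X.")
for step in result.chain_of_thought:
    print(step)
print(result.final_answer)
```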

Based on insights from Part II, Part III identifies the chief policy considerations that should underpin the adaptation of defamation law to this new technology. This Part begins to sketch how defamation law can balance the protection of individual reputation with the need to accommodate those hallucinations that are inextricably linked to LLMs' generative capacities. Specifically, this Part contends that legal rules in this area must attempt to incentivize compensation for provable reputational harm, correction of the digital record, model transparency, innovations for safety and accuracy, the exercise of reasonable care at all stages of AI development, and the exercise of reasonable care by AI users.

Part IV proposes a legal framework for addressing defamatory hallucinations. This framework rests on the proposition that the law should treat defamatory hallucinations—both in an LLM's final output and in its chains of thought—as "inevitable errors," in much the same way that the Supreme Court in New York Times v. Sullivan recognized that some journalistic errors must be tolerated in order to produce "uninhibited, robust, and wide-open" public discourse.

Part IV first explores some of the practical and technological limitations of prior scholarly proposals addressing defamation by hallucination in "foundation model" LLMs. Next, it draws lessons from defamation law's adaptation to past technologies, such as the telegraph, to argue that LLMs bear more similarities to information distributors than to publishers, and that AI developers' liability for defamation should be governed accordingly. However, any tort law "privilege" accorded to AI developers must be balanced by the imposition of statutory duties designed to protect victims of defamation by hallucination. Specifically, LLM producers should have a duty to warn (and educate) their users that it is unreasonable to rely on AI outputs without verification. Furthermore, LLM producers should have a duty to preserve search records for a limited period of time in order to enable defamation plaintiffs to prove the existence and scope of their harm. Finally, this Part contends that users who repeat defamatory hallucinations without taking reasonable steps to verify them should often face liability, because such users will often be the cheapest cost avoiders in the realm of reputational harm. We conclude with caveats about the limits of defamation law as a regulatory tool for dealing with hallucination-based harms.