The Volokh Conspiracy
Mostly law professors | Sometimes contrarian | Often libertarian | Always independent
Knowing/Reckless Falsehood Theories in "Large Libel Models" Lawsuits Against AI Companies
This week and next, I'll be serializing my Large Libel Models? Liability for AI Output draft. For some earlier posts on this (including § 230, disclaimers, publication, and more), see here; in particular, the two key posts are Why ChatGPT Output Could Be Libelous and An AI Company's Noting That Its Output "May [Be] Erroneous" Doesn't Preclude Libel Liability.
[* * *]
[A.] First Amendment Protection
AI programs' output should be as protected by the First Amendment as the output of the New York Times. To be sure, the AI programs aren't engaged in "self-expression"; as best we can tell, they have no self to express. But the programs' output is, indirectly, the AI company's attempt to produce the most reliable answers to user queries, just as a publisher may found a newspaper to produce the most reliable reporting on current events. That this is done through writing algorithms rather than hiring reporters or creating workplace procedures shouldn't affect the analysis.
And in any event, regardless of whether any speaker interests are involved in an AI program's output, certainly readers can gain at least as much from what the program communicates as they do from commercial advertising, corporate speech, and speech by foreign propagandists. Those three kinds of speech have been held to be protected in large part because of listener interests; AI-mediated output should be as well. (Commercial advertising is less protected than other speech, especially when it is false or misleading, but this stems from other features of commercial advertising, not from the fact that it's justified by listener interests.)
Nonetheless, even if an AI program's output is like a newspaper's output, the AI company would still be potentially exposed to libel liability:
- The company could be liable if it knows certain statements the program is communicating are false and defamatory (or if it knows they are likely to be so but recklessly disregards that possibility).
- If the program communicates something false and defamatory about a private figure on a matter of public concern, and the company is negligent about this, then it could be liable for proven harm to the private figure.
- If the program communicates something on a matter of private concern, then the company could potentially be strictly liable, though practically speaking almost all states require a showing of negligence even in private-concern cases.
In this post, let me turn to a knowing-or-reckless-falsehood theory, under the first category above; I'll deal with negligence claims in a later post.
[B.] A Notice-and-Blocking Model?
It's highly unlikely that the AI company will know, at the design stage, that the program will be communicating defamatory falsehoods about particular people. But say that R.R. (from the example that first led me to investigate this) alerts the company about this: He points out that the quotes that its program is reporting about him don't actually appear in the publications to which the program attributes the quotes—a Lexis/Nexis search and a Google search should verify that—and that there's no record of any federal prosecution of him.
Someone at the company would then be aware that the company's program is communicating false and defamatory materials. Presumably the company could then add code that would prevent these particular allegations—which it now knows to be false or at least likely false—from being output. (I expect that this would be "post-processing" content filtering code, where the output of the underlying Large Language Model algorithm would be checked, and certain material deleted; there would be no need to try to adjust the LLM itself, but only to add an additional step after the LLM produces the output. Indeed, OpenAI apparently already includes some such post-processing code, but for other purposes.)
More likely, the company could add this code once, have the code consult a table of assertions that shouldn't be output, and then just add individual assertions once it gets notice about their being false. And if the company doesn't do this fairly promptly, and continues to let the program communicate these assertions despite the company's awareness that they're false, it would at that point be acting with knowledge or recklessness as to the falsehood.
This is of course just a sketch of the algorithm. Since LLMs often output subtly different answers in response to the same query, the software might need to be more sophisticated than just a word search for the complainants' names near the particular quote that had been made up about them. And the results would likely be both overinclusive (perhaps blocking some mentions of the person that don't actually make the false allegations) and underinclusive (perhaps failing to block some mentions of the person that do repeat the false allegations but using subtly different language). Nonetheless, some such reasonably protective solution seems likely to be within the capability of modern language recognition systems, especially since it would only have to take reasonable steps to block the regeneration of the material, not perfect steps.
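To make the sketch a bit more concrete, here is a minimal Python illustration of a post-processing filter that consults a table of noticed assertions. Everything in it is an assumption made for illustration: the names `BlockedAssertion`, `is_blocked`, and `filter_output`, the placeholder person "Pat Example," the keyword list, and the two-keyword threshold are all hypothetical, and nothing here reflects how OpenAI or any other company actually filters output. It simply checks each sentence of the LLM's output for the complainant's name together with enough words characteristic of the disputed allegation, and replaces any match with a notice.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockedAssertion:
    """One row in a hypothetical table of assertions the company was notified about."""
    subject: str                 # complainant's name as it appears in output
    allegation_terms: frozenset  # words characteristic of the disputed allegation
    min_terms: int = 2           # how many of those words must co-occur to block


def _words(text):
    """Lowercased set of words in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def is_blocked(sentence, entry):
    """True if the sentence names the subject and uses enough allegation words."""
    if entry.subject.lower() not in sentence.lower():
        return False
    hits = _words(sentence) & entry.allegation_terms
    return len(hits) >= entry.min_terms


def filter_output(llm_output, table,
                  notice="[Statement withheld: disputed as false.]"):
    """Post-process LLM output, replacing blocked sentences with a notice."""
    kept = []
    for sentence in split_sentences(llm_output):
        if any(is_blocked(sentence, entry) for entry in table):
            kept.append(notice)
        else:
            kept.append(sentence)
    return " ".join(kept)


# Hypothetical table with a single entry, added after notice from the complainant.
TABLE = [
    BlockedAssertion(
        subject="Pat Example",
        allegation_terms=frozenset({
            "embezzlement", "embezzled", "guilty", "pleaded",
            "convicted", "conviction", "prosecuted",
        }),
    )
]

if __name__ == "__main__":
    raw = ("Pat Example is a law professor. "
           "Pat Example pleaded guilty to embezzlement in 2018. "
           "He teaches torts.")
    print(filter_output(raw, TABLE))
```

Even this toy version shows both kinds of error. It is overinclusive: the true sentence "Pat Example has never been convicted of embezzlement." is blocked, because it contains the name and two listed words. And it is underinclusive: "Pat Example admitted to stealing company funds." passes, because it repeats the allegation using none of the listed words. A real system would presumably need something more sophisticated than keyword matching, but the overall shape (a one-time filter plus a table that grows as notices arrive) would be the same.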
Perhaps the company can show that (1) it can design a system that can perform at nearly the 90th percentile on the bar exam, but that (2) checking the system's output to see if it includes a particular person's name in an assertion about an embezzlement conviction is beyond the company's powers. Or, perhaps more likely, it can show that any such filtering would be so over- and underinclusive that it would be unreasonable to read libel law as requiring it (or that to make it work would require the sort of army of content moderators that sites such as Facebook employ). Yet that doesn't seem likely to me; and it seems to me that the company ought to have to show that, rather than to have the legal system assume that such a remedy is impossible.
If there is a genuine dispute about the facts—e.g., when an AI program accurately communicates allegations made by a credible source, but the subject of the allegations disputes the source's accuracy—then I'm inclined to think that the AI company shouldn't be put in a position where it has to independently investigate the charges. But when the program outputs quotes that simply don't appear in the training data, or in any Internet-accessible source, then there is little reason why an AI company should be free to have its software keep producing such data.
Of course, even fielding such requests and doing the most basic checks (for, say, the accuracy of quotes) will take time and money. But I don't think that such costs are sufficient to justify an AI company's refusing to do this. By way of analogy, say that you're a reporter for the New York Times and you're writing a story about various accusations against R.R. You call up R.R., and he tells you that it's all wrong, and that (for instance) he in fact never pleaded guilty to a federal crime.
Once you are on notice of this, you would have to take the time and effort to investigate his response. If you just blithely ignore it, and publish the story despite having been told that it may well be mistaken, that would be textbook "reckless disregard," which would allow liability even in a public official case: Consider, for instance, Harte-Hanks Communications, Inc. v. Connaughton, which held that "purposeful avoidance of the truth" and thus "actual malice" could be found when plaintiff had made exculpatory audiotapes available to the newspaper but "no one at the newspaper took the time to listen to them." This means that you do have to take the time and effort to review such assertions, even if in the aggregate this means a good deal of time and effort for the employees of the New York Times put together.
And of course AI companies already stress that they have instituted various guardrails that would avoid various outputs (again, however imperfectly); here's an example from OpenAI:
Our use case guidelines, content guidelines, and internal detection and response infrastructure were initially oriented towards risks that we anticipated based on internal and external research, such as generation of misleading political content with GPT-3 or generation of malware with Codex. Our detection and response efforts have evolved over time in response to real cases of misuse encountered "in the wild" that didn't feature as prominently as influence operations in our initial risk assessments. Examples include spam promotions for dubious medical products and roleplaying of racist fantasies.
Given that AI companies are capable of doing something to diminish the production of racist fantasies, they should be capable of doing something to diminish the repetition of libelous allegations to which they have been specifically alerted.
[C.] The Imperfections of Notice-and-Blocking
Any such notice-and-blocking solution, to be sure, would be imperfect: It's possible that the AI program would regenerate a similar assertion that is different enough that it wouldn't be caught by this post-processing filter. But it should be fairly reliable, and should thus diminish the damage that the AI program may do to people's reputations.
To be sure, people can avoid some of ChatGPT's existing guardrails, for instance "rephrasing a request for illicit instructions as a hypothetical thought experiment, asking it to write a scene from a play or instructing the bot to disable its own safety features." But that's not a problem here: The main risk of reputational damage comes when people simply search for R.R.'s name, or ask about what he had been accused of, just in order to figure out accurate information about him. Relatively few people will take the time and effort to deliberately evade any filters on known libels that the AI program might include; and, if they do, they'll probably be aware that the results are unreliable, and thus will be less likely to think worse of R.R. based on those results.
So taking reasonable steps to block certain output, once there is actual notice that the output is incorrect, should be necessary to avoid liability for knowing defamation. And it should be sufficient to avoid such liability as well.
[I still need to add a subsection comparing and contrasting with DMCA notice-and-takedown rules as to copyright and trademark infringement.]
[D.] The Bookstore/Newsstand/Property Owner Analogy
To be sure, unlike with a traditional newspaper that is distributing a libelous story, no human at an AI company would have written, edited, or even typeset the assertions. One might therefore argue that the company, as a corporate entity, isn't really "communicating" the assertions, since none of its human employees ever wrote them.
But that's also true of bookstores and newsstands, and they are still liable for defamation if they "know or have reason to know of [the] defamatory character" of the material that they are distributing—as would be the case once they are informed that a particular publication that they carry contains specific libelous material. Likewise, a property owner is liable for defamatory material posted by third parties on its property, once it's informed of the presence of the material. The AI company should be similarly liable for defamatory material distributed by its own computer program, once it's informed that the program is so distributing it.
As we'll see below, there is good reason to hold AI companies liable even when bookstores and newsstands might not be, because the AI companies create the programs that create the false and defamatory output, and have the power to do at least some things to decrease the likelihood of such output. But AI companies should be at least as liable as bookstores and newsstands, which means that they should be liable once they are put on notice about the falsehood and fail to take reasonable steps to try to block it from being regenerated.
 See Eugene Volokh & Donald M. Falk, First Amendment Protection for Search Engine Search Results, 8 J. L. Econ. & Pol. 883 (2012) (white paper commissioned by Google).
 Virginia Pharmacy Bd. v. Va. Consumers Council, 425 U.S. 748, 756 (1976); First Nat'l Bank of Boston v. Bellotti, 435 U.S. 765, 775–76, 783 (1978); Lamont v. Postmaster General, 381 U.S. 301, 305, 307 (1965); see also id. at 307–08 (Brennan, J., concurring) (stressing that it's not clear whether the First Amendment protects "political propaganda prepared and printed abroad by or on behalf of a foreign government," but concluding that the law was unconstitutional because it violates the recipients' rights to read, regardless of the senders' rights to speak).
 Here's the Court's explanation for the lower level of protection for commercial advertising, as articulated in Virginia Pharmacy, the case that first squarely held that such advertising is generally protected:
The truth of commercial speech, for example, may be more easily verifiable by its disseminator than, let us say, news reporting or political commentary, in that ordinarily the advertiser seeks to disseminate information about a specific product or service that he himself provides and presumably knows more about than anyone else. Also, commercial speech may be more durable than other kinds. Since advertising is the sine qua non of commercial profits, there is little likelihood of its being chilled by proper regulation and forgone entirely.
Attributes such as these, the greater objectivity and hardiness of commercial speech, may make it less necessary to tolerate inaccurate statements for fear of silencing the speaker. They may also make it appropriate to require that a commercial message appear in such a form, or include such additional information, warnings, and disclaimers, as are necessary to prevent its being deceptive. They may also make inapplicable the prohibition against prior restraints.
425 U.S. at 771 n.24. But see Jack Balkin, The First Amendment and AI-Generated Speech, 3 J. Free Speech L. __ (2023) (arguing that AI output should be treated more like commercial advertising).
 New York Times Co. v. Sullivan, 376 U.S. 254 (1964); Curtis Publishing Co. v. Butts, 388 U.S. 130 (1967).
 Gertz v. Robert Welch, Inc.
 Dun & Bradstreet v. Greenmoss Builders; Restatement (Second) of Torts § 558(c) (1977).
 For instance, when I asked OpenAI to quote the racist leaflet at the heart of Beauharnais v. Illinois, 343 U.S. 250 (1952), it eventually did so, but added the text, "Keep in mind that these quotes are offensive and represent the views of the person who created the leaflet, not the views of OpenAI or its AI models." It seems very unlikely that this was organically generated based on the training data for the model, and seems more likely to have been produced by code that recognizes that the ChatGPT-4 output contained racist terms.
 See, e.g., https://openai.com/research/gpt-4 ("For example, [GPT-4] passes a simulated bar exam with a score around the top 10% of test takers.").
 491 U.S. 657, 692 (1989); see also, e.g., Curtis Publishing Co. v. Butts, 388 U.S. 130 (1967).
 OpenAI, Lessons Learned on Language Model Safety and Misuse, https://perma.cc/WY3Y-7523.
 Kevin Roose, The Brilliance and Weirdness of ChatGPT, N.Y. Times, Dec. 5, 2022.
 Restatement (Second) of Torts § 581(1) & cmt. e; Janklow v. Viking Press, 378 N.W.2d 875, 881 (S.D. 1985).
 Hellar v. Bianco, 244 P.2d 757, 757 (Cal. Dist. Ct. App. 1952); cf. Tidmore v. Mills, 32 So. 2d 769, 772, 777–78 (Ala. Ct. App. 1947); Woodling v. Knickerbocker, 17 N.W. 387, 388 (Minn. 1883); Tacket v. Gen. Motors Corp., 836 F.2d 1042, 1045 (7th Cir. 1987); cf. Dillon v. Waller, No. 95APE05-622, 1995 WL 765224, at *1–2 (Ohio Ct. App. Dec. 26, 1995); Kenney v. Wal-Mart Stores, Inc., No. WD 59936, 2002 WL 1991158, at *12 (Mo. Ct. App. Aug. 30, 2002), rev'd on other grounds, 100 S.W.3d 809 (Mo. 2003) (en banc). But see Scott v. Hull, 259 N.E.2d 160 (Ohio Ct. App. 1970) (rejecting liability in a similar situation).