The Volokh Conspiracy
Mostly law professors | Sometimes contrarian | Often libertarian | Always independent
OpenAI Wins Libel Lawsuit Brought by Gun Rights Activist Over Hallucinated Embezzlement Claims
In yesterday's decision by Judge Tracie Cason (Ga. Super. Ct. Gwinnett County) in Walters v. OpenAI, L.L.C., gun rights activist Mark Walters sued OpenAI after journalist Frederick Riehl ("editor of AmmoLand.com, a news and advocacy site related to Second Amendment rights") received an AI-generated hallucination from ChatGPT that falsely claimed Walters was being sued for embezzlement. The court granted OpenAI summary judgment, concluding that OpenAI should prevail "for three independent reasons":
[1.] In context, a reasonable reader wouldn't have understood the allegations "could be 'reasonably understood as describing actual facts,'" which is one key element of a libel claim. The court didn't conclude that OpenAI and other such companies are categorically immune whenever they include a disclaimer, but stated only that "Disclaimer or cautionary language weighs in the determination of whether this objective, 'reasonable reader' standard is met," and that "Under the circumstances present here, a reasonable reader in Riehl's position could not have concluded that the challenged ChatGPT output communicated 'actual facts'":
{Riehl pasted sections of the Ferguson complaint [a Complaint in a civil case that Riehl was researching] into ChatGPT and asked it to summarize those sections, which it did accurately. Riehl then provided an internet link, or URL, to the complaint to ChatGPT and asked it to summarize the information available at the link. ChatGPT responded that it did "not have access to the internet and cannot read or retrieve any documents." Riehl provided the same URL again. This time, ChatGPT provided a different, inaccurate summary of the Ferguson complaint, saying that it involved allegations of embezzlement by an unidentified SAF Treasurer and Chief Financial Officer. Riehl again provided the URL and asked ChatGPT if it could read it. ChatGPT responded "yes" and again said the complaint involved allegations of embezzlement; this time, it said that the accused embezzler was an individual named Mark Walters, who ChatGPT said was the Treasurer and Chief Financial Officer of the SAF.}
In this specific interaction, ChatGPT warned Riehl that it could not access the internet or access the link to the Ferguson complaint that Riehl provided to it, and that it did not have information about the period of time in which the complaint was filed, which was after its "knowledge cutoff date." Before Riehl provided the link to the complaint, ChatGPT accurately summarized the Ferguson complaint based on text Riehl inputted. After Riehl provided the link, and after ChatGPT initially warned that it could not access the link, ChatGPT provided a completely different and inaccurate summary.
Additionally, ChatGPT users, including Riehl, were repeatedly warned, including in the Terms of Use that govern interactions with ChatGPT, that ChatGPT can and does sometimes provide factually inaccurate information. A reasonable user like Riehl—who was aware from past experience that ChatGPT can and does provide "flat-out fictional responses," and who had received the repeated disclaimers warning that mistaken output was a real possibility—would not have believed the output was stating "actual facts" about Walters without attempting to verify it….
That is especially true here, where Riehl had already received a press release about the Ferguson complaint and had access to a copy of the complaint that allowed him immediately to verify that the output was not true. Riehl admitted that "within about an hour and a half" he had established that "whatever [Riehl] was seeing" in ChatGPT's output "was not true." As Riehl testified, he "understood that the machine completely fantasized this. Crazy." …
Separately, it is undisputed that Riehl did not actually believe that the Ferguson complaint accused Walters of embezzling from the SAF. If the individual who reads a challenged statement does not subjectively believe it to be factual, then the statement is not defamatory as a matter of law.… [Riehl] knew Walters was not, and had never been, the Treasurer or Chief Financial Officer of the SAF, an organization for which Riehl served on the Board of Directors….
[2.a.] The court also concluded that Walters couldn't show even negligence on OpenAI's part, which is required for all libel claims on matters of public concern:
The Court of Appeals has held that, in a defamation case, "[t]he standard of conduct required of a publisher … will be defined by reference to the procedures a reasonable publisher in [its] position would have employed prior to publishing [an item] such as [the] one [at issue. A publisher] will be held to the skill and experience normally exercised by members of [its] profession. Custom in the trade is relevant but not controlling." Walters has identified no evidence of what procedures a reasonable publisher in OpenAI's position would have employed based on the skill and experience normally exercised by members of its profession. Nor has Walters identified any evidence that OpenAI failed to meet this standard.
And OpenAI has offered evidence from its expert, Dr. White, which Walters did not rebut or even address, demonstrating that OpenAI leads the AI industry in attempting to reduce and avoid mistaken output like the challenged output here. Specifically, "OpenAI exercised reasonable care in designing and releasing ChatGPT based on both (1) the industry-leading efforts OpenAI undertook to maximize alignment of ChatGPT's output to the user's intent and therefore reduce the likelihood of hallucination; and (2) providing robust and recurrent warnings to users about the possibility of hallucinations in ChatGPT output. OpenAI has gone to great lengths to reduce hallucination in ChatGPT and the various LLMs that OpenAI has made available to users through ChatGPT. One way OpenAI has worked to maximize alignment of ChatGPT's output to the user's intent is to train its LLMs on enormous amounts of data, and then fine-tune the LLM with human feedback, a process referred to as reinforcement learning from human feedback." OpenAI has also taken extensive steps to warn users that ChatGPT may generate inaccurate outputs at times, which further negates any possibility that Walters could show OpenAI was negligent….
In the face of this undisputed evidence, counsel for Walters asserted at oral argument that OpenAI was negligent because "a prudent man would take care not to unleash a system on the public that makes up random false statements about others…. I don't think this Court can determine as a matter of law that not doing something as simple as just not turning the system on yet was … something that a prudent man would not do." In other words, Walters' counsel argued that because ChatGPT is capable of producing mistaken output, OpenAI was at fault simply by operating ChatGPT at all, without regard either to "the procedures a reasonable publisher in [OpenAI's] position would have employed" or to the "skill and experience normally exercised by members of [its] profession." The Court is not persuaded by Plaintiff's argument.
Walters has not identified any case holding that a publisher is negligent as a matter of defamation law merely because it knows it can make a mistake, and for good reason. Such a rule would impose a standard of strict liability, not negligence, because it would hold OpenAI liable for injury without any "reference to 'a reasonable degree of skill and care' as measured against a certain community." The U.S. Supreme Court and the Georgia Supreme Court have clearly held that a defamation plaintiff must prove that the defendant acted with "at least ordinary negligence," and may not hold a defendant liable "without fault." …
[2.b.] The court also concluded that Walters was a public figure, and therefore had to show not just negligence, but knowing or reckless falsehood on OpenAI's part (so-called "actual malice"):
Walters qualifies as a public figure given his prominence as a radio host and commentator on constitutional rights, and the large audience he has built for his radio program. He admits that his radio program attracts 1.2 million users for each 15-minute segment, and calls himself "the loudest voice in America fighting for gun rights." Like the plaintiff in Williams v. Trust Company of Georgia (Ga. App.), Walters is a public figure because he has "received widespread publicity for his civil rights … activities," has "his own radio program," "took his cause to the people to ask the public's support," and is "outspoken on subjects of public interest." Additionally, Walters qualifies as a public figure because he has "a more realistic opportunity to counteract false statements than private individuals normally enjoy"; he is a radio host with a large audience, and he has actually used his radio platform to address the false ChatGPT statements at issue here…. [And] at a minimum, Walters qualifies as a limited-purpose public figure here because these statements are plainly "germane" to Walters' conceded "involvement" in the "public controvers[ies]" that are related to the ChatGPT output at issue here….
Walters' two arguments that he has shown actual malice fail. First, he argues that OpenAI acted with "actual malice" because OpenAI told users that ChatGPT is a "research tool." But this claim does not in any way relate to whether OpenAI subjectively knew that the challenged ChatGPT output was false at the time it was published, or recklessly disregarded the possibility that it might be false and published it anyway, which is what the "actual malice" standard requires. Walters presents no evidence that anyone at OpenAI had any way of knowing that the output Riehl received would probably be false…. [The] "actual malice" standard requires proof of the defendant's "subjective awareness of probable falsity" ….
Second, Walters appears to argue that OpenAI acted with "actual malice" because it is undisputed that OpenAI was aware that ChatGPT could make mistakes in providing output to users. The mere knowledge that a mistake was possible falls far short of the requisite "clear and convincing evidence" that OpenAI actually "had a subjective awareness of probable falsity" when ChatGPT published the specific challenged output itself….
[3.] And the court concluded that in any event Walters had to lose because (a) he couldn't show actual damages, (b) he couldn't recover presumed damages, because here the evidence rebuts any presumption of damage, given that Riehl was the only person who saw the statement and he didn't believe it, and (c) under Georgia law, "[A]ll libel plaintiffs who intend to seek punitive damages [must] request a correction or retraction before filing their civil action against any person for publishing a false, defamatory statement," and no such request was made here.
An interesting decision, and one that might well be correct (see my Large Libel Models article for the bigger legal picture), but it's tied closely to its facts: In another case, where the user didn't have as many signals that the assertion was false, or where the user more broadly distributed the message (which may have produced more damages), or where the plaintiff wasn't a public figure, or where the plaintiff had indeed alerted the defendant to the hallucination and yet the defendant didn't do anything to try to stop it, the result might well be different. For comparison, check out the Starbuck v. Meta Platforms, Inc. case discussed in this post from three weeks ago.
Note that, as is common in some states' courts, the decision largely adopts a proposed order submitted by the party that prevailed on the motion for summary judgment. The judge has of course approved the order, and agrees with what it says (since she could have easily edited out parts she disagreed with); but the rhetorical framing in such cases is often more the prevailing party's than the judge's.
OpenAI is represented by Stephen T. LaBriola & Ethan M. Knott (Fellows LaBriola LLP); Ted Botrous, Orin Snyder, and Connor S. Sullivan (Gibson, Dunn & Crutcher LLP); and Matthew Macdonald (Wilson Sonsini Goodrich & Rosati, P.C.).
There are at least three potential standards to use for liability. One is the one used for cars, where the law says we can't do without the product and spending another few years improving the design is not an option. Another is to ask what an ordinary human would do. Another is the one used for hazardous substances. If you "lead the industry" in chemicals with a frightening number of multiple bonds between nitrogen atoms, you can still be responsible when the product explodes at the wrong time.
In part 2a, it looks like the AI industry gets to define itself as an essential service for which the AI industry determines the standards of safety. Personally, I think it's obvious that hallucinations put LLMs closer to the explosive end of the liability spectrum.
As this is a trial court decision resting on multiple grounds, the plaintiff can't win only by shifting the standard of liability for hallucinations.
It looks to me more like they're saying AI is just a really complicated Magic 8 Ball, and no sane person places any reliance on anything it outputs, so it's incapable of libel.
Which sounds about right, as unsatisfying as it is. Though I question how much of a defense boilerplate disclaimers intended to be clicked through without reading actually should be.
This was a really poor case anyway, given the repeated efforts necessary to make the defamatory content appear, and that it wasn't seen by a third party. There have been cases that look much better than this one, where an AI is just routinely outputting defamatory statements about somebody to anybody who asks.
It looks to me more like they're saying AI is just a really complicated Magic 8 Ball, and no sane person places any reliance on anything it outputs, so it's incapable of libel.
That notion seems to come up more often lately. Problem is, a person libeled does not suffer reputational damage because of his own interpretation, but instead on the basis of what others who see the defamatory allegations conclude—whether those others have rational basis or not. Where that problem will become particularly acute is in cases involving republications of AI generated libels. At that point all the supposed cues that this is unreliable AI output may not be present.
Well, at the point of republication it's on the actual human in the loop, just exactly as much as it would be if they'd published a defamatory claim based on the output of an actual Magic 8 Ball.
That's what struck me. If a libel has no audience, does it make damages?
Nope. Because an AI enterprise is not a Magic 8 Ball. The former will typically be a publisher. The latter not. To the extent the AI enterprise practices publishing activities, it ought also to be subject to defamation law.
Also, you ignore that potential for viral republication of an initial false and defamatory allegation has always been reason to police libel more vigorously than slander.
So if I hooked up a robot arm to a magic 8 ball, and created a website where you could enter a question, and the arm would shake the ball and then hold it up to a webcam, it would be a "publisher"?
Go ahead, Bellmore. Your business model will not trouble us long.
Or, more concisely, no.
Less concisely, your magic 8 ball setup, operated as a publisher by persons who practiced publishing activities such as curating an audience, would commit defamation if it purported specific, false, defamatory facts about a specific person. But it would no longer be a magic 8 ball.
You cannot have it both ways. You invoke the magic 8 ball because it is universally understood as anodyne and unspecific—which means not defamatory.
Because an AI enterprise is not a Magic 8 Ball. The former will typically be a publisher.
Your ignorance of what AI tools like ChatGPT are (they're not "publishers") rivals your ignorance of how cellular service providers' meta data is created and who owns it.
While I don't think it fully works, that's actually not a terrible analogy as a starting point. Obviously if I ask, "Is John Smith a pedophile?" and get a response of "All signs point to yes" when I shake the M8B, Smith can't sue Mattel.
And the reason is because no reasonable person would believe that M8B results are accurate.
Except that some AIs do a reasonably good job of producing accurate results on some subjects, and not due to random luck either. Of course they're quite unreliable for many other subjects, so one has to be aware of which domains the AI being used is most adept at providing answers for, and which ones it isn't.
So, no...it's really not at all akin to a random answer-producing device.
To be clear, I don't think the AIs are actually magic 8 balls, just that the court is treating them that way.
If I wanted to describe them more accurately, I'd say they're a lossy representation of the underlying training data (where the training data itself isn't guaranteed to be true!) that follows it closely enough to be useful for some purposes, but which isn't guaranteed to actually match the data at any given point. (And for points outside the training set, it's not even that good, of course!)
It's like fitting a curve through a series of points, where even if all the points are on the curve, the curve isn't guaranteed to hit any new points that weren't in the set you fit it to.
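For anyone who wants to see that analogy concretely, here is a minimal sketch (toy numbers only, nothing from the case or from any actual model): a polynomial with enough free parameters passes through every point it was fit to, yet can miss badly at a point outside that set.

```python
# Toy illustration of the curve-fitting analogy above -- hypothetical data only.
import numpy as np

# Five "training" points sampled from the true function y = sin(x)
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.sin(x_train)

# A degree-4 polynomial has enough coefficients to pass through all five points
# exactly -- loosely analogous to a model that reproduces its training data.
fit = np.poly1d(np.polyfit(x_train, y_train, deg=4))

print("max error at the training points:", np.abs(fit(x_train) - y_train).max())  # ~0

# Step just outside the training set and the fitted curve no longer tracks the
# true function -- hitting every known point guarantees nothing about new ones.
x_new = 6.0
print("fit at x=6:", fit(x_new), "true sin(6):", np.sin(x_new))
```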
I get that. I was more responding to Nieporent's "And the reason is because no reasonable person would believe that M8B results are accurate." claim.
Well, not "random," but luck, yes, in fact it is. LLMs generate results through probabilistic models, not through analyzing or knowing anything. There may be domains where the training data makes it more likely that the output will be accurate, but it's still just probabilistic. That's why you can ask an AI the same question multiple times and get different answers.
Well, not "random," but luck, yes, in fact it is.
No, it in fact is not. At least, no more so than it is "lucky" that the law books you learned from in school didn't contain a lot of wildly inaccurate information.
LLMs generate results through probabilistic models, not through analyzing or knowing anything.
Probabilistic modeling is only one aspect of LLMs. But guess what? You also produce results based on similar modeling. Natural languages (like English) are by their nature imprecise, and the way they are used in casual conversation especially so. Add to that the fact that such speech is often at least somewhat ambiguous, and you'll realize that whenever someone says something to you, interpreting what they mean involves a fair amount of probabilistic modeling on your part (based on context and other clues). And if you provide information in response, the accuracy of that information is subject to a fair amount of probability as well (how accurate the sources were that you learned the information from, whether or not you learned from various conflicting sources, how accurate your memory is, etc.)
That's why you can ask an AI the same question multiple times and get different answers.
You can ask a human being the same question multiple times and get different answers as well.
In contrast, if you enter the same google search (regular google, not an AI adjunct) multiple times, you get the same results every time.
(Yes, for the pedants: the results can change over time as the Internet changes, since Google continually adds to its database. But if you do the same search several times in a row, you get the same results.)
In contrast, if you enter the same google search (regular google, not an AI adjunct) multiple times, you get the same results every time.
Congratulations on that essentially true yet utterly irrelevant observation.
So, in conclusion...prompt results produced by advanced LLM-based AI is not analogous in any meaningful way to the pseudorandom output of a Magic 8 Ball.
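To make the "same question, different answers" point concrete, here is a minimal sketch with made-up numbers (not OpenAI's sampling code): a language model picks each next word by sampling from a probability distribution, so repeated runs can differ, while a plain lookup is deterministic and returns the same answer every time.

```python
# Toy contrast between probabilistic sampling and deterministic lookup.
# All probabilities and table entries here are made up for illustration.
import random

# Hypothetical next-word distribution for the prompt "The weather today is ___"
next_word_probs = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}

def sample_next_word(probs):
    """Pick one word according to its probability -- repeated calls can differ."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

def lookup(key, table):
    """Deterministic retrieval -- the same key always returns the same value."""
    return table[key]

if __name__ == "__main__":
    # "Ask" the sampler the same question five times: the answers can vary.
    print([sample_next_word(next_word_probs) for _ in range(5)])

    # Ask a lookup table the same question five times: identical every time.
    index = {"capital of France": "Paris"}
    print([lookup("capital of France", index) for _ in range(5)])
```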
The question is not what ordinary people should believe. It’s what ordinary people do believe. It is by no means clear that ordinary people don’t rely on AI, especially given the tendency of hucksters to tout the benefits in large print and relegate the caveats to fine print that everybody knows only defence lawyers read and only very gullible judges think anybody else does.
If there’s evidence people in fact tended to believe the publication and that belief caused damages, the matter should go to the jury.
The first part, about repeating the question until the wrong answer appeared, reminded me of the dispute over targeted ads on Twitter. Allegedly the people who did the exposé about bad ads appearing near good brands had to work to get the ads to appear where they did.
A lot of work, as it happens, and Twitter was able to confirm that only those people actually saw the combination, and nobody else.
To be fair, in that case an enormous amount of work over a period of weeks was required to get the content, while in this case it probably just took a few minutes of pestering.
You're acting like the case(s) have been decided. The Musk/Xitter allegations against Media Matters are still being litigated (in multiple jurisdictions).
Sure, I'm just describing the situation as I understand it.
...as are we all, Brett, as are we all.
Still, would be interesting to feed Brett's total body of internet comments to an LLM (you used to be pretty prolific on 538, right?), and see what the associated chatbot would return.
Hmmmm...I suppose that might explain Riva...
You'd probably get better results if you screened out my more recent comments; I'm in my late 60's now, and frankly neither the years nor the chemo have been kind to my brain. Ah, well, happens to all of us if something else doesn't get us first.
"[3.] And the court concluded that in any event Walters had to lose because (a) he couldn't show actual damages, (b) he couldn't recover presumed damages, because here the evidence rebuts any presumption of damage, given that Riehl was the only person who saw the statement and he didn't believe it"
Guess Walters is getting an education about the Streisand effect.
So AI makes S---T up, and no one should believe it. That's now res judicata.
It had an audience of the guy pushing it to lie about someone else, whom he showed it to, and both knew it was a hallucination.
Why do AIs who publish have to follow only the standard of care for AIs, not the standard of care for publishers?
In the Boeing 737 Max case, the sensors were controlled by AI. But it was simply assumed that the controlling law was the standard of care for human pilots, not the standard of care for AI pilots. If the standard of care of AI pilots was used, the result would probably be the same as in this case. Since there’s no established standard of care for AI pilots, there isn’t an obligation for them to do anything and anything goes. The same would be true for AI drivers, surgeons, etc.
This is an incredible giveaway to the AI industry. Moreover it’s an encouragement never to use humans for anything. This opinion says that if you use AI to do anything, you are completely free of liability no matter how badly ordinary people get mangled by it, because standards of care for humans have no application to AI and AI has no standard of care.
And as to the consent defense, suppose this judge is sold a sleeper AI robot that ends up assassinating him. Would it be a complete defense that this judge was aware that AI is potentially dangerous and consented to it anyway, and thus he assumed all the risk that that might happen and the AI manufacturer has no liability for his death? This would seem to be the logical implication of this opinion.
And as to reliance, there is considerable evidence that ordinary reasonable people in fact believe things reported by AI. Whether they should or not is completely beside the point. The standard is what ordinary people in fact do, not what some judge thinks they should do. The judge should not have decided this issue on summary judgment. There is contested evidence. He should have allowed the matter to be heard.
Judges are showing a deference to technology that places technologists above the reach of the law in a way that is tantamount to granting a title of nobility. No wonder ordinary people are pissed and want change. I have consistently opposed Trump. But decisions like this that place elites above justice, insulating them no matter how much harm they do to ordinary people, make it very understandable why so many ordinary people have been angry at elites and at the establishment and want them all thrown out the door if not out the window, and have been willing to believe and follow even a fraudster who promises to do this.
Why do AIs who publish have to follow only the standard of care for AIs, not the standard of care for publishers?
Which AIs are you asserting "publish"? They simply respond to prompts for information. If you send a friend/associate an e-mail asking them a question and they respond to you with the requested information, does that response make your friend/associate a "publisher"?
So you’re telling me that if your friend/associate offers himself to the public as an information service, and when I or anyone else ask him for information about you he tells me you are a convicted sex trafficker and child pornographer, and that’s false, he is exempt from libel? He wasn’t “publishing” the information, so no problem?
So you’re telling me that if your friend/associate offers himself to the public as an information service, and when I or anyone else ask him for information about you he tells me you are a convicted sex trafficker and child pornographer, and that’s false, he is exempt from libel?
Are your reading comprehension skills really that piss-poor? Or are you taking a play from Sacrastr0's pop-up book of straw men? I'm not saying anyone is exempt from anything. I simply asked you if that constituted "publishing"...and not surprisingly, you've completely avoided the actual question.
He wasn’t “publishing” the information, so no problem?
Well, if he wasn't "publishing" the information then it means that he's not "publishing"...right?
It does for defamation purposes.
That was my point.
How in the hell does a communication solely between you and your friend constitute "publishing", let alone "defamation"?
"In the Boeing 737 Max case, the sensors were controlled by AI."
I really doubt that, unless you have some ancient weak definition of AI.
As in so many other cases, he's commenting without the benefit of having even the slightest clue what he's talking about.
That’s like arguing that a case involving an antique car with a weak engine isn’t relevant to today’s newer and stronger cars. Our new cars are a totally different thing, so that old stuff has nothing to do with it. Wrong. A machine’s age and power, horse or otherwise, have no legal relevance. Same here.
You’re buying into the marketing ploy that AI is some sort of totally new thing. Legally, it isn’t.
How about citing some actual sources then? The FAA would never have allowed an AI in the 737 Max, nor any other current airliner. It only shows you have no clue what an AI is or does.
That’s like arguing that a case involving an antique car with a weak engine isn’t relevant to today’s newer and stronger cars. Our new cars are a totally different thing, so that old stuff has nothing to do with it. Wrong. A machine’s age and power, horse or otherwise, have no legal relevance. Same here.
No, it's not even remotely like that. While "artificial intelligence" doesn't have a particularly exact and widely accepted definition, it most definitely does not encompass pre-programmed responses to sensor input. What you're claiming is closer to calling an automatic light that turns on/off when it's light/dark outside "AI".
Specifically, in the Boeing 737 Max case, I believe they were just comparing the output of two sensors to see if they agreed, and lighting up a warning light if they didn't. And having the warning light was an upcharge.
If you didn't shell out for the optional warning light, the avionics had no way of telling you the sensor had failed.
Worse still, they found that, due to a software error, the warning light wouldn't even work on the planes that had it. But, no biggy, they figured, they'd just roll that fix into the next regular software update... And, anyway, they weren't including anything in the pilot's manual to explain what to do if it did come on.
Seriously, when I say heads should roll over this one, I'm not sure I mean metaphorically. It was about as bad as it gets.
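For what it's worth, the logic being described above is roughly a two-channel comparison. Here is a simplified sketch by analogy (an invented illustration, not Boeing's actual avionics code; the threshold and structure are assumptions): compare the two angle-of-attack readings, set a disagree flag past a threshold, and note that the flag only helps if an indicator exists to display it.

```python
# Simplified, hypothetical illustration of a two-sensor disagreement check --
# not Boeing's code; the threshold and structure are invented for illustration.

DISAGREE_THRESHOLD_DEG = 5.0  # hypothetical threshold, in degrees

def sensors_disagree(left_aoa_deg, right_aoa_deg, threshold=DISAGREE_THRESHOLD_DEG):
    """True when the two angle-of-attack readings differ by more than the threshold."""
    return abs(left_aoa_deg - right_aoa_deg) > threshold

def annunciate(disagree, indicator_installed):
    """The computed flag only reaches the crew if there is an indicator to show it."""
    if disagree and indicator_installed:
        return "AOA DISAGREE light ON"
    if disagree:
        return "sensors disagree, but nothing in the cockpit displays it"
    return "no alert"

if __name__ == "__main__":
    # One sensor reading plausibly, the other failed high: the check fires,
    # but whether the crew ever sees it depends on the optional indicator.
    print(annunciate(sensors_disagree(4.0, 22.0), indicator_installed=False))
```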