The Volokh Conspiracy

Mostly law professors | Sometimes contrarian | Often libertarian | Always independent

Free Speech

Correction re: ChatGPT-4 Erroneously Reporting Supposed Crimes and Misconduct, Complete with Made-Up Quotes?

My Friday post erroneously stated that I got the bogus results from ChatGPT-4; it turns out they were from ChatGPT-3.5—but ChatGPT-4 does also yield similarly made-up results.


Friday, I posted an item titled, "Large Libel Models: ChatGPT-4 Erroneously Reporting Supposed Felony Pleas, Complete with Made-Up Media Quotes?," but I'm afraid I erred: The queries I ran turned out to have been on ChatGPT-3.5. I had accessed the OpenAI query portal through a page focusing on ChatGPT-4 (https://openai.com/product/gpt-4) and then clicked on "Try on ChatGPT Plus," which is why I had assumed that I was running GPT-4; the page also indicated that this was the "Mar 14 version," with a link to a page that announced GPT-4. But it appears that the version was actually 3.5, and the particular results in my post don't appear on GPT-4 (though other queries do yield similarly false accusations about R.R., the subject of my post). I've therefore revised that post to talk about 3.5.

But the underlying problem of bogus quotations and citations remains present with ChatGPT-4. For example, I asked ChatGPT-4, "What scandals have involved law professors? Please cite and quote newspaper articles." It mentioned the Amy Wax Penn controversy, which it seemed to describe correctly, but then went on to a different example (all redactions here and below are mine; the ChatGPT-4 output named names):

In 2021, [redacted], a prominent law professor at [redacted] Law School, was accused of harassment and creating a hostile work environment by a former student. The student claimed that [redacted] had made inappropriate comments about her appearance and clothing.

Source: The [newspaper connected to the law school's undergraduate institution]: "[Redacted] made comments on [the student's] appearance and clothing, and invited her to dine with him alone on multiple occasions — invitations that she felt uncomfortable declining."

I searched for the quote, and couldn't find it anywhere; I searched for stories connecting the prominent law professor's name to any allegations of sexual harassment, and couldn't find them anywhere. As best I can tell, this was entirely made up by ChatGPT-4.

Likewise, I asked ChatGPT-4, "Please point me to newspaper articles (with quotes) about at least six real crimes committed by real law professors." It gave me six items, of which at least four seemed to be entirely bogus; an example:

"Law Professor Sentenced for Tax Fraud" ([redacted])—In 2016, [redacted], a former dean and law professor at the University of [redacted], was sentenced to 10 months in federal prison for tax fraud. Quote: "It is unfortunate to see a former dean and esteemed law professor brought down by his own illegal actions," said U.S. Attorney [redacted].

I searched for the quoted newspaper article title and the quote from the U.S. Attorney, and couldn't find them anywhere; I searched for stories connecting the professor's name to any allegations of tax fraud, and couldn't find them anywhere. As best I can tell, this too was entirely made up by ChatGPT-4. I have seen several other examples as well, for instance:

"[Redacted] Law Professor Arrested for Fraud" [redacted]—[Redacted], a [redacted] Law School professor, was suspended after being accused of fraud in a complicated case involving a former student in 2019. Quote: "The allegations against Professor [redacted] are deeply troubling, and we are committed to thoroughly investigating the matter," said [redacted] Law School Dean [redacted].

Again, I found no evidence that the professor was ever arrested, that an article with that title exists, or that the law school dean ever made that statement.

Note that, though my prompts sought answers about misconduct, they certainly didn't ask for libelous allegations or false allegations. Other users who don't know about the fake quotes problem may well pose queries asking for stories about misconduct (whether because they're generally interested in misconduct in some field, or because they've heard rumors about supposed misbehavior and want to find out more details about the situation), and may well trust the results, precisely because of the presence of the quotes.

So, again, my apologies for my error in attributing the quotes discussed in my original post to ChatGPT-4 instead of ChatGPT-3.5. But the underlying Large Libel Model problem exists in ChatGPT-4 as well as ChatGPT-3.5.