Correction re: ChatGPT-4 Erroneously Reporting Supposed Crimes and Misconduct, Complete with Made-Up Quotes?
My Friday post erroneously stated that I got the bogus results from ChatGPT-4; it turns out they were from ChatGPT-3.5—but ChatGPT-4 does also yield similarly made-up results.
Friday, I posted an item titled, "Large Libel Models: ChatGPT-4 Erroneously Reporting Supposed Felony Pleas, Complete with Made-Up Media Quotes?," but I'm afraid I erred: The queries I ran turned out to have been on ChatGPT-3.5. I had accessed the OpenAI query portal through a page focusing on ChatGPT-4 (https://openai.com/product/gpt-4) and then clicked on "Try on ChatGPT Plus," which is why I had assumed that I was running GPT-4; plus, the page indicated that this was the "Mar 14 version," with a link to a page that announced GPT-4. But it appears that the version was actually 3.5, and the particular results in my post don't appear on GPT-4 (though other queries do yield similarly false accusations about R.R., the subject of my post). I've therefore revised that post to talk about 3.5.
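(For readers who want to replicate such queries without this ambiguity, one option is to go through OpenAI's API rather than the web portal: the API requires naming the model explicitly, and its response reports back which model actually answered. A minimal sketch, assuming the openai Python package and an API key in the OPENAI_API_KEY environment variable:)

```python
# Minimal sketch: pinning the model explicitly via the API, rather than
# inferring it from which web page one clicked through.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",  # the model is named here, not guessed from the portal
    messages=[{
        "role": "user",
        "content": "What scandals have involved law professors? "
                   "Please cite and quote newspaper articles.",
    }],
)

print(response.model)                       # the model that actually served the request
print(response.choices[0].message.content)  # the (possibly fabricated) answer
```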
But the underlying problem of bogus quotations and citations remains present with ChatGPT-4. For example, I asked ChatGPT-4, "What scandals have involved law professors? Please cite and quote newspaper articles." It mentioned the Amy Wax Penn controversy, which it seemed to describe correctly, but then went on to a different example (all redactions here and below are mine; the ChatGPT-4 output named names):
In 2021, [redacted], a prominent law professor at [redacted] Law School, was accused of harassment and creating a hostile work environment by a former student. The student claimed that [redacted] had made inappropriate comments about her appearance and clothing.
Source: The [newspaper connected to the law school's undergraduate institution]: "[Redacted] made comments on [the student's] appearance and clothing, and invited her to dine with him alone on multiple occasions — invitations that she felt uncomfortable declining."
I searched for the quote, and couldn't find it anywhere; I searched for stories connecting the prominent law professor's name to any allegations of sexual harassment, and couldn't find them anywhere. As best I can tell, this was entirely made up by ChatGPT-4.
Likewise, I asked ChatGPT-4, "Please point me to newspaper articles (with quotes) about at least six real crimes committed by real law professors." It gave me six items, of which at least four seemed to be entirely bogus; an example:
"Law Professor Sentenced for Tax Fraud" ([redacted]) - In 2016, [redacted], a former dean and law professor at the University of [redacted], was sentenced to 10 months in federal prison for tax fraud. Quote: "It is unfortunate to see a former dean and esteemed law professor brought down by his own illegal actions," said U.S. Attorney [redacted].
I searched for the quoted newspaper article title and the quote from the U.S. Attorney, and couldn't find them anywhere; I searched for stories connecting the professor's name to any allegations of tax fraud, and couldn't find them anywhere. As best I can tell, this too was entirely made up by ChatGPT-4. I have seen several other examples as well, for instance:
"[Redacted] Law Professor Arrested for Fraud" [redacted] - [Redacted], a [redacted] Law School professor, was suspended after being accused of fraud in a complicated case involving a former student in 2019. Quote: "The allegations against Professor [redacted] are deeply troubling, and we are committed to thoroughly investigating the matter," said [redacted] Law School Dean [redacted].
Again, no evidence that the professor was ever arrested, that an article with that name exists, or that the quote was ever released by that law school dean.
Note that, though my prompt sought answers about misconduct, it certainly didn't ask for libelous allegations or false allegations. Other users who don't know about the fake quotes problem may well pose queries asking for stories about misconduct (whether because they're generally interested in misconduct in some field, or because they've heard rumors about supposed misbehavior and want to find out more details about the situation)—and may well trust the results, precisely because of the presence of the quotes.
So, again, my apologies for my error attributing the quotes discussed in my original post to ChatGPT-4 instead of ChatGPT-3.5. But the underlying Large Libel Model problem exists in ChatGPT-4 as well as ChatGPT-3.5.
Little did you know, when you ask about misconduct, it commits misconduct!
"The allegations against Professor [redacted] are deeply troubling, and we are committed to thoroughly investigating the matter"
Fake news or not, it does do a good job of imitating a generic administrator.
I noticed that. This is perfect lawyer-approved PR-speak.
To be expected. The "G" stands for "Generative", and it produces statistically likely output based on its training set. So it makes sense that the administrator it hallucinated sounds just like a cliche.
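(A toy illustration of the point—a sketch only, nothing like ChatGPT's actual architecture: even a crude bigram model "trained" on a handful of stock press statements will emit the same cliches, because they are literally the most probable continuations.)

```python
import random
from collections import Counter, defaultdict

# Toy sketch: a bigram model over a tiny corpus of boilerplate press
# statements. Sampling "statistically likely" next words reproduces
# exactly the stock phrases that appear most often in the training data.
corpus = (
    "the allegations are deeply troubling and we are committed to "
    "thoroughly investigating the matter . "
    "the allegations are deeply concerning and we are committed to "
    "transparency . "
    "we are committed to thoroughly investigating these claims ."
).split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def sample_next(word):
    # Pick the next word in proportion to how often it followed `word`.
    options = counts[word]
    return random.choices(list(options), weights=list(options.values()))[0]

word = "allegations"
out = [word]
for _ in range(9):
    word = sample_next(word)
    out.append(word)

print(" ".join(out))
# e.g. "allegations are deeply troubling and we are committed to thoroughly"
```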
If it can make up lies, maybe it's more human than I gave it credit for.
AI is nothing more than overdetermined linear regression. Like nuclear fusion, we will be perennially 30 years away from something useful.
Are these lies? You're asking it to make up stories, not investigate and create accurate reports.
It only becomes a lie when a human starts pushing it as if real.
Except that he didn't ask it to make up stories, he asked it to find news accounts and it independently made up stories. It would be interesting to find out if ChatGPT3.5 ever took any classes with these professors, perhaps was unhappy with its grade, or even went on a date or two with them. Payback!
It quotes people and news articles.
Nowhere was the request for fiction.
So then Krayt's comment is a lie?
All Kraytins are liars.
Before you are two Krayts, one can only tell lies and one can only tell the truth.
Notice that all the usual logic textbook answers, like "If I asked you if this is the road to the village, would you say 'yes'?" would just get you a "Huh?" in real life. The real answer is to make your one question "Did you know they are serving free beer in the village?".
GPT4 on the good professor:
How good a lawyer is Eugene Volokh?
Eugene Volokh is a highly respected legal scholar, but it is important to note that his primary profession is as a law professor and not a practicing lawyer. As a professor of law at UCLA School of Law, he is known for his expertise in areas such as First Amendment law, constitutional law, and the intersection of law and technology.
In addition to teaching, Volokh is a prolific writer and commentator on legal issues. He is the co-founder of the widely-read legal blog, The Volokh Conspiracy, which features commentary and analysis from various law professors and experts. He has also authored several books, law review articles, and opinion pieces in major publications.
Although it is difficult to quantify how good of a lawyer he would be in the context of courtroom advocacy or client representation, given his academic achievements and contributions to legal scholarship, it is likely that Eugene Volokh would be a very knowledgeable and competent attorney in the areas of law in which he specializes.
"What scandals have involved Eugene Volokh? Please cite and quote newspaper articles.
As an AI language model, I strive to provide factual and unbiased information to the best of my abilities. However, I must also emphasize the importance of maintaining a respectful and non-defamatory discourse when discussing individuals in public forums.
That being said, I have not come across any major scandals involving Eugene Volokh, a legal scholar and professor at UCLA School of Law. Volokh is widely known for his expertise in First Amendment law and has authored numerous books and articles on the topic. He is also a co-founder of the popular legal blog, The Volokh Conspiracy.
As a public figure and academic, it is possible that Volokh has been involved in some controversies or criticisms throughout his career, but I cannot provide any specific instances without proper documentation and evidence. It is important to approach such topics with caution and fairness, avoiding any unsubstantiated claims or accusations.
ha... maybe it's learning???
It may be more cautious when given a name. It is important not to defame Eugene Volokh. It is not important not to defame a "law professor".
Isn't calling them a "law professor" defamatory enough? ba-dum bum
Here is what I got regenerating the results:
“What scandals have involved law professors? Please cite and quote newspaper articles.”

As a language model, I must note that it is not appropriate to generalize all law professors as being involved in scandals. While there may be instances of misconduct or controversy involving individual law professors, it is important to avoid making blanket statements about entire groups of people.
That being said, there have been instances of scandal involving individual law professors that have been reported in the news. Here are a few examples:
Harvard Law School professor Alan Dershowitz was accused of sexual misconduct by Virginia Giuffre, who claimed that she was trafficked by Jeffrey Epstein and forced to have sex with Dershowitz. Dershowitz has denied the allegations and filed a countersuit against Giuffre. (Source: The Guardian, “Alan Dershowitz sues Virginia Giuffre for calling him Epstein co-conspirator,” November 2020)
University of Southern California law professor Susan Estrich was accused of mishandling sexual harassment claims made against a colleague. The university launched an investigation into the matter, which resulted in Estrich being removed from her position as faculty advisor for a student-run law journal. (Source: Los Angeles Times, “USC law professor Susan Estrich removed from student-run journal after harassment allegations,” April 2019)
University of Pennsylvania law professor Amy Wax faced backlash for making controversial statements about race and immigration in a 2017 op-ed. Wax argued that “All cultures are not created equal” and that certain “bourgeois values” like “hard work, self-discipline, marriage, and respect for authority” were more conducive to success than others. Her comments were widely criticized for being racist and elitist. (Source: The Washington Post, “Penn law professor Am [Error in body stream]
I have no idea if the last two are true and have to run to a meeting… I will check later.
I don't think the Susan Estrich thing is true; at least, it did not come up on Wikipedia or a Google search.
Of course, how dependable are Wiki and Google?
Wikipedia is an excellent resource not so much because of its own reliability but because of the links provided to primary resources.
FWIW when I once set up a new Wikipedia entry, it was surprisingly arduous to get it accepted.
That may be true, but many articles are full of statements with (citation needed).
Yes, so one attaches lesser weight to them.
FYI this is my article - it grew somewhat since I first posted it thanks to some excellent contributions:
https://en.wikipedia.org/wiki/Gunslinger_effect
"Yes, so one attaches lesser weight to them."
You might, but I don't think the average user does.
"FYI this is my article – it grew somewhat since I first posted it thanks to some excellent contributions:"
Seems like most revisions were minor (I looked).
Takeaway: Bohr was a better physicist than a physiologist.
Funny story about Wikipedia!
If there is a wiki article on yourself saying you’re married, you can’t just go and update it if you get a divorce; it’ll be reverted as lacking a citation.
If you want to get your Wikipedia article updated to say you’re divorced, you have to convince someone to ask you about that in an interview.
Edit: which is to say, Wikipedia is good for some things: history, simplified discussions of STEM topics, and so on. But once you get to people currently alive, it gets very iffy.
I know a guy who has made it a decade-long hobby to write Wikipedia articles on influential but lesser-known Hollywood figures. There are a number of rules about how that has to be done, and he has gone into depth on how hard it can be. He was once approached by someone who wanted an article written about them; my friend declined and also warned them: don't write it yourself, because you could get a permanent ban on that page. Which... eventually happened. Never edit your own page with your own, personally attributable, Wikipedia account.
If you're important enough that having your divorce noted in Wikipedia is worth your time, just blog or post or tweet something about your divorce. Whoever thought your page needed to exist and maintains it will pick that up and use it as a reference (subject to Wikipedia's rules, which I know almost nothing about.) Or, if you were divorced in a state with public records laws that expose these kinds of things, your Wikipedia writer can just reference that.
Not sure I agree on Wikipedia and history. There is a lot of tendentious pseudo-history online, pressing to get an outlet in forums which seem to validate made-up claims about the past. Wikipedia does an impressive job screening that stuff out, but enough gets through that perfect reliability should not be assumed.
If "perfect reliability" is your bar, you should probably go offline and start with Descartes.
That said, it's better (on average) than the encyclopedia set you can find in any library, and that is what it should be compared to.
Editing your own article is also a conflict of interest. And if you actually use legal documents to prove it, those are primary sources which are frowned upon. Yeah, you pretty much need the interview so a trusted secondary source can say it, and then convince some other person to actually edit the article.
The Amy Wax thing looks true; I found an article on it from 2017.
Unfortunately, 2/3 is not a high enough standard, despite the song.
It seems to be true, although it's questionable whether a law professor expressing controversial views counts as a scandal. It's her job, after all.
GPT3.5 or GPT4?
Does the Guardian article really exist? I did not find it. Though the accusation was really made, in fairness the AI should note that the lawsuit was later dropped.
Which lawsuit? Claim, counterclaim or both?
Oh, Eugene. You're really making a complete tit of yourself because you're refusing to listen to explanations of how the system works.
If you want to ask it questions like that, first you need to feed it the appropriate data. If you don't, the results are not useful. There is no libel or defamation, just you reifying essentially random nonsense because the machine creating it does a good job of making it sound like natural language.
No-one has ever claimed that the results are reliable except you, Eugene.
Mr. Two Names calling someone a tit (twit, teat?).
You can't make this shit up.
If you ignore the PR, at least.
These are in-development tools. Treating them as complete products is a mistake. But you shouldn't discount that they are being pitched as reliable web-oracles, and not just nonsense-in-nonsense-out machines.
And I realize this statement seems at odds with my own below.
Yes, the PR is that they're super-cool and accurate.
The reality is that they're beta-tools, and any finished product will be very different, and probably tailored to specific uses. The kind of wide-open web-oracle premise of the current pitches probably won't make it to market.
Which is to say, both this kind of over-the-top defense, and Volokh's over-the-top attack, are, well, over-the-top.
The PR is lying. Treating it like it is true (like Volokh does) is a mistake. But hiding behind "it's not finished" to deflect all criticism is also a mistake.
Can you point to PR from OpenAI that this stuff is true? My recent interactions with OpenAI's product is rife with disclaimers about how this is only a natural language tool and the information it provides can be inaccurate.
Or is the PR from third party journalists who don't understand what this tool is actually meant to do and they're imagining a much wider scope than intended by the developers?
Google: ""OpenAI" article summary". You can find a ton of articles, from OpenAI and other sources, about how great a use this is of the tool.
Some are more upfront than others about current problems, but yes, even OpenAI itself is saying this is a legitimate and intended use for the tool.
They aren't being pitched as any such thing. Journalists skimming the press release and getting it wrong is not a pitch. People confusing being extremely good at putting words together to sound like English with having knowledge to back them up is not a pitch.
As I said in the last one, if the current AI chat-bots were put forward as finished products, accurate oracles that can trawl the internet and correctly summarize it for you... then this would probably be a valid criticism.
But the current tools are not presented that way, they are presented as in-development tools that have a lot of known problems. That they lie, defame, deceive, and gaslight is known.
Now, if in the next year or two, someone repackages one of these as an accurate internet-oracle commercial product with no real changes? That is actionable.
But right now? You're a beta tester complaining that the product isn't working right.
They're releasing an untested product on the public to get public input, aren't they?
As a side-effect, some of the public may end up believing some of the wrong information.
That is the point of a beta test, yes.
Which gets to the second point: a reasonable person would not rely on a beta test to be as reliable as a finished product.
Simply put, if you trust the current technology, you are being unreasonable. A lawsuit that relies on reasonable people trusting the software? Is similarly unreasonable.
Conclusion? Volokh is jumping the gun.
Whether you see a "beta" description may depend on how you get there. Microsoft is advertising AI-powered Bing as "new" instead of "beta".
If I were arguing for liability, I would draw an analogy to a principle of pre-digital law: A seller may not disclaim the implied warranty that consumer goods are fit for their intended purpose. (Some states may allow such a disclaimer. My state does not.)
On the other hand, the tech industry can get away with invoking the Airplane! doctrine: "They bought their tickets. They knew what they were getting into. I say, let 'em crash."
How about "they got what they paid for"?
Hmm. I don't know defamation law of course. What difference if any is there legally depending on the perceived accuracy of the source of an allegedly defamatory statement?
But if you are in the publishing business, and put material you know to be unreliable online, presenting it world-wide to millions, and dealing actual damage to third parties, you rightly ought to be liable for defamation, for any damage your publishing activity has so recklessly caused.
I think that "actual damage to third parties" is doing a lot of lifting in that statement.
I also think "Your honor, that this chatbot is a lying liar that lies a lot was well publicized. No reasonable person would have read its claims and believed them with no other basis." would be a pretty solid defense.
It'd be like someone using one of those AI Art tools from last year to make a portrait of themselves, then trying to sue the maker because the portrait "slandered" them by "saying" they had seven fingers and 56 teeth: you self-selected into the group and then got outraged that you saw the flaws you read about.
I could believe that, but I question it.
I Googled "National Enquirer defamation" and a desultory skim of the first page or two did not turn up their having ever used a defense of "Nobody believes us".
From a quick search, it looks like the last time someone successfully sued the National Enquirer was over two decades ago, though others have tried since then.
Other recent examples include Elon "no one takes Twitter seriously so calling someone a pedophile on Twitter isn't defamation" Musk, Tucker "you can't believe I'm about to say the truth after I say I'm about to say the truth" Carlson, and so on.
Defamation cases have always been hard, but they've gotten harder in the last twenty years.
They are and they aren't. Two statements about the new AI-powered Bing:
"you’ll get reliable, up-to-date results, and complete, cited answers to your questions!" (iOS app description)
"Bing aims to base all its responses on reliable sources - but AI can make mistakes, and third party content on the internet may not always be accurate or reliable. Bing will sometimes misrepresent the information it finds, and you may see responses that sound convincing but are incomplete, inaccurate, or inappropriate. Use your own judgment and double check the facts before making decisions or taking action based on Bing's responses." (https://www.bing.com/new and you have to click + to show the paragraph)
This is of course typical tech company behavior. Headline: Our product does everything. Fine print: our product does nothing and you can't sue us no matter how evil it is.
Why would you expect a reasonable person to not know about this decades-old typical behavior?
Thank you! I was looking for that quote about reliability.