Are AI Program Outputs Reasonably Perceived as Factual? A Response to Eugene

A different view.

I am not one to disagree with Eugene on a matter within his area of expertise, but I confess I have a different reaction to his view, expressed in his draft paper, that AI program outputs "would reasonably appear to state or imply assertions of objective fact." Take OpenAI's ChatGPT. Eugene argues in his draft, at page 8, that OpenAI's business model is premised on ChatGPT outputs being factually correct:

OpenAI has touted ChatGPT as a reliable source of assertions of fact, not just as a source of entertaining nonsense. . . .  The AI companies' current and future business models rest entirely on their programs' credibility for producing reasonably accurate summaries of the facts. When OpenAI helps promote ChatGPT's ability to get high scores on bar exams or the SAT, it's similarly trying to get the public to view ChatGPT's output as reliable. Likewise when its software is incorporated into search engines, or into other applications, presumably precisely because it's seen as pretty reliable. It can't then turn around and, in a libel lawsuit, raise a defense that it's all just Jabberwocky.

Naturally, everyone understands that AI programs aren't perfect. But everyone understands that newspapers aren't perfect, either—yet that can't be enough to give newspapers immunity from defamation liability; likewise for AI programs. And that's especially so when the output is framed in quite definite language, complete with purported quotes from respected publications.

Here's my question: Is the reasonable observer test about business models, or is it about what people familiar with the service would think? Because if the test is about what ordinary observers would think, it seems to me that no one who tries ChatGPT could think its output is reliably accurate.

That's what makes ChatGPT distinctive and interesting, I think. It combines fluent, natural-sounding prose, on one hand, with obvious factual inaccuracies, on the other. It's all style, no substance. The false claims of fact are an essential characteristic of the ChatGPT user experience, it seems to me. If you spend five minutes querying it, there's no way you can miss this.

For example, back in January, I asked ChatGPT to write a bio for me.  This should be easy to do accurately, as there are lots of online bios of me if you just google my name.   ChatGPT's version was well written, but it had lots and lots of details wrong.

To correct the errors in ChatGPT's output: I joined Berkeley in 2019, not 2018; I didn't go to Yale Law School; I didn't clerk for Judge O'Scannlain; I wasn't an appellate lawyer at DOJ; there is no 2019 edition of my Computer Crime Law casebook, and it certainly wouldn't be the 2nd edition, as we're now on the 5th edition already; I'm not a fellow of the American College of Trial Lawyers; and I've never, to my knowledge, been an advisor to the U.S. Sentencing Commission. (Some would say I'm also not an invaluable asset to the law school community, but let's stick to the provable facts here, people.)

My sense is that these kinds of factual errors are ubiquitous when using ChatGPT. It has style, but not substance. ChatGPT is like the student who didn't do the reading but has amazing verbal skills; it creates the superficial impression of competence without the knowledge behind it. Maybe that isn't what OpenAI would want it to be. But I would think that's the conclusion a typical user reaches pretty quickly from querying ChatGPT.