Hallucinations in the District of New Jersey
Is it an abuse of judicial power for a judge to issue an opinion with AI hallucinations?
Eugene blogged about Judge Julien Xavier Neals of the U.S. District Court for the District of New Jersey. Judge Neals issued an opinion containing several errors, including made-up quotes and incorrect case outcomes. One of the parties submitted a letter pointing out these errors. It seems pretty clear that the judge, or at least someone in his chambers, used generative AI to create the opinion. (Judge Neals is sixty years old, so I would suspect a law clerk made this error.) The judge promptly withdrew the opinion.
I suppose this particular case is settled, but I would wager there are more orders on Judge Neals' dockets that have hallucinations. Indeed, I suspect there are many judges throughout the country who have issued opinions with hallucinations. Savvy litigators should start combing through all adverse orders, and try to determine if there are obvious indicia of hallucinations. This will make excellent grounds for reversal on appeal.
But let's take a step back. What do we make of a judge who issues an opinion based on made-up cases? To be sure, judges make mistakes all the time. Moreover, clerks make mistakes that their judges do not catch. When I was clerking, I made a particularly egregious error that my judge did not catch. The non-prevailing party promptly filed a motion for reconsideration. The opinion was withdrawn, and a new opinion was issued. The outcome of the case was not altered by that error, but at least the opinion was corrected.
Still, one might ask how closely Judge Neals, and other judges, review the work of their law clerks. Do the judges actually check the citations to see if they are hallucinated? I would suspect most judges do not check citations. District Court dockets are very busy, and it is not realistic to expect judges to so closely scrutinize their clerks' work. (Later in Justice Blackmun's career, he apparently limited his review of draft opinions to checking citations.)
I think the more useful frame is to ask whether the judge has failed to adequately supervise his law clerks. Judges invariably have to delegate authority to law clerks, even if the judge ultimately signs all of the orders. That delegation must include what is effectively a duty of care. In other words, the judge should tell the clerk how to go about doing the job, and in particular, how not to go about doing the job. In 2025, I think all judges should tell their clerks to either not use AI at all (my advice), or to use AI responsibly and triple-check any cited cases. The failure to give this advice would be an abuse of discretion.
But is it more than just an abuse of discretion? Does an Article III judge abuse his power when he issues an opinion based on hallucinated cases that no one in his chambers bothered to check? Judges have the awesome power to affect a person's life, liberty, or property, merely by signing their names to a piece of paper. It is the order, and not the opinion, that has the legal effect.
I think we would all agree that a judge would abuse his power by deciding a case by flipping a coin or rolling a dice. I suppose using AI is a bit less reckless than a game of chance, but not by much. Relatedly, is it an abuse of power when a judge grants an ex parte TRO without even reading the brief? I think we are starting to see some of the boundaries of the judicial power.
I don't know that Judge Neals will receive a misconduct complaint, as he promptly withdrew his opinion. But an enterprising sleuth could do a close analysis of all opinions from Judge Neals, and judges nationwide, and perhaps find a pattern of misconduct. Would that record show a judge cannot be trusted to exercise the judicial power?
And speaking of the District of New Jersey, can we be certain that Judge Neals voted to appoint Desiree Grace as the United States Attorney for the District of New Jersey? Or maybe it was a chatbot?
I'll leave you with one anecdote I read in a recent article about what AI is doing to students:
My unease about ChatGPT's impact on writing turns out to be not just a Luddite worry of poet-professors. Early research suggests reasons for concern. A recent M.I.T. Media Lab study monitored 54 participants writing essays, with and without A.I., in order to assess what it called "the cognitive cost of using an L.L.M. in the educational context of writing an essay." The authors used EEG testing to measure brain activity and understand "neural activations" that took place while using L.L.M.s. The participants relying on ChatGPT to write demonstrated weaker brain connectivity, poorer memory recall of the essay they had just written, and less ownership over their writing, than the people who did not use L.L.M.s. The study calls this "cognitive debt" and concludes that the "results raise concerns about the long-term educational implications of L.L.M. reliance."
I still refuse to use AI. I may be the last man standing.
I wouldn't be surprised if relying on AI decreases memory and understanding. When I copy code from examples or other programs, unless I have used that same code often enough to already understand it, I always type it in from scratch instead of copying and pasting, because that forces me to understand it.
But all of life is like that. I have very little technical knowledge of how my food was grown or raised, harvested or slaughtered, packaged, cooked, delivered, etc. Yes, I am not as knowledgeable as the ancient Egyptians about the phases of the moon and the seasons. But that frees up my brain for other tasks that the ancient Egyptians couldn't do.
AI now is primitive and barely functional. It is like using black powder instead of modern relatively safe explosives. Care is required. I enjoy playing with AI, or asking it history questions to give me a background with which to start (and verify in the process!). It is not reliable. But that's no reason to not play with it and explore its capabilities and limitations.
Indeed. It sounds a bit like the effect of satnav.
Once upon a time, best beloved, there was no satnav, so you had to work out your route to somewhere unfamiliar by looking at a map, and maybe scribbling down a few notes.
To cut down the number of times you had to stop and check the map or your notes, you'd actually try to remember the route you'd worked out. Like, actually in your head.
I used to be able to do that well enough. But now I've been using satnav for decades, I can't.
Use it or lose it.
That's a better analogy, thanks. Nowadays I end up using map apps, but I step through the route to help memorize it and scribble a few notes on a Post-it.
I couldn't call my brother's phone number from memory to save my life, or my ex-wife's, even before she was an ex. I always meant to memorize the former just in case. What if my phone got stolen?
Wait! I wrote it on a slip in my wallet! Saved by alternative external memory!
Wait! What if I was robbed and lost both the phone and wallet? Better write it down at home.
Wait! What if they took my keys? Well, I can yank the door open.
And I could log into the phone company, they xfer my contacts every time. What if I forgot the password, wait, I never memorized it. Wait! I wrote it down on a slip in my walle...ooooooooh.
It's not reliable and it's sycophantic; your descriptions suggest you are putting more effort into checking on it than you are getting in value.
This seems to be the case everywhere. It tells you what it thinks you want to hear, and praises mundane insights as original so you keep coming back.
When I was in college, a famous physics prof suggested that when we hear or read new things, we see if they have "the ring of truth" before buying into them. We don't have to understand them fully upfront--just see whether they, say, violate laws of thermodynamics or other giveaways. Digging into them afterwards may be fruitful or not; if they lack that ring, I'm not digging.
I was talking with a former patient of mine today. She asked me about CoQ10, something I have seen in passing, but knew precisely zero about. So I looked it up. Lots of references. When the wiki article said things that I already knew (about the chemistry and biochemistry of the stuff) I just glossed over it. When it made statements that might or might not be true, I looked at the references. Some were listed as opinions, which is not exactly what you'd see in a refereed journal. At least they were honest.
Patient said that I was the only person she knew who looked at footnotes.
I review a lot of articles for Radiology (a medical journal). Same thing. Sometimes references are on point, sometimes they refer to things totally unrelated to what they supposedly support. When an author does that, as far as I'm concerned, it's instant rejection. I have zero tolerance for that kind of thing.
When I have doubts about a paper, the first thing I do is check the first few footnotes. Particularly in fields that are prone to pseudoscience (e.g. "studies" of dietary supplements such as CoQ10), I've been surprised by how often the footnotes, upon even casual inspection, turn out to be obvious junk. It seems to suggest that if you are too weak to engage in rigorous reasoning, you almost certainly won't have the strength to do legitimate citations.
Reminds me of gold seller ads. They push bad times and gold going up, except...if they owned a bunch of gold to sell, and they thought it bad times, they'd hold onto it, not try to stampede you into buying theirs.
“Savvy litigators should start combing through all adverse orders, and try to determine if there are obvious indicia of hallucinations. This will make excellent grounds for reversal on appeal.”
Josh thinks reviewing adverse orders for errors is something “savvy” litigators should do, but aren’t currently doing. Lol. Lmao.
Ahem. That's not what he said.
I have been practicing for nearly thirty years, and have done many appeals. I have identified numerous errors in judicial opinions. But it would never have occurred to me, until now, that a judicial opinion might contain a citation to a fictional case. AFAIK, this is unprecedented.
So, yes, this type of error is something that good lawyers should start looking out for.
I know Blackman is not well thought of by many around here. But cheap misstatements of his position speak more against his detractors than him.
If an order or opinion has cases or quotes that I don't already know about from briefing or research, I always look those up to see what the court was relying on or where it got that particular language. If the court is using fake cases and quotes, any competent lawyer would find out quickly. It's not particularly savvy to review a court's opinion before deciding what to do next, see a case you're not familiar with, and look it up. That's just normal lawyer stuff.
The "type of error" is that the court misstated the facts, or the law, or both. What's unprecedented is the source of said error. But you don't need to do anything different to spot the error; all you need to do is what you always do (assuming you're a better lawyer than Blackman): review the opinion to see why the court ruled against you. After all, you need to see if there are grounds for appeal.
To be sure, before now if the court cited a case you wouldn't have assumed the court made it up. But if you didn't recognize it, you would've pulled the case to see why you didn't recognize it and if it really justified the court's ruling. And when you couldn't find it, you'd have raised the issue.
In this particular case, the ending of the letter suggests that the lawyers who wrote it *didn’t* discover the hallucinations on their own, but only thanks to other lawyers in a completely different case that had cited this one as supplemental authority. It’s possible that the firm in this case already knew about the hallucinations before the other firm highlighted them, but the timing of filing the letter suggests that they didn’t.
But then, I have no idea whether the lawyers in this case are good lawyers.
Yet another nail in the coffin where the judiciary is burying respect for the courts.
I recently interviewed someone for an associate position. I had pulled up a filing that they had submitted in a case.* When I looked at it, something seemed off to me.
During the interview, I asked them about the filing. Long story short, they admitted that it was drafted with AI.
Hard pass.
*I always, always look at writing samples for applicants, and the best way to do that is to find a case they were on and pull the docket.
Curious. Was your issue that he did not check the cites? Or that the prose was drafted with the help of AI?
I have tried using AI for the latter, not for briefs, but for articles and newsletters. Meaning, I draft it myself, and then run it through AI for suggested edits. Usually I am not impressed with the result. Occasionally it makes a decent suggestion.
There were some ... red flags ... in terms of the way the argument in what I pulled was drafted.
For example, there was a statement of undisputed facts... in a motion to dismiss. Which was weird (that's a summary judgment thing), though I could imagine seeing it if it had been presented in terms of the complaint's allegations. It wasn't. Also, it didn't actually cite the complaint or include any citations to where these facts were located.
Then there were just oddities in writing. I've seen so many filings in my life. Good. Bad. Terrible. But this was just strange.
Got it. I would never use AI to draft a brief, for precisely the reasons you state. It will not be directed to the type of motion that is being filed.
I think a lot of AI is directed to promotional and advertising things. Which can be fine if you are sending out a firm newsletter. Not for a legal document.
Absolutely the judge should face a misconduct complaint. This is dramatically worse than a lawyer doing it for a litigant, because this is a government official imposing law in the form of binding decisions on people. Now, I don't think the penalty needs to be crazy harsh, but yes, the judge should go through the process.
“I still refuse to use AI. I may be the last man standing.”
When you spend all day talking to yourself in a mirror, that's the kind of grandiose declaration one would expect. But JB says he doesn't read the comments, so he wouldn't see that hardly anybody here, let alone in the wider legal world, uses or endorses the use of AI -- at least without rigorous cite-checking. "Last man standing" indeed.
He doesn’t need to even read comments. He just needs to talk to actual practicing lawyers occasionally.
Judge should sanction himself as I assume he would do if a lawyer on the case had done the same.
No way I can find a cite but I recall a federal judge a few years back fining himself $500 or so because he made some minor transgression.
I've read once or twice of judges fining themselves when their cell phones go off in the courtroom and they have a practice of fining others who allow that to happen. I've never seen it on something substantive.
"rolling a dice."
A reminder that we don't need AI to make errors.
I am 65 years old and I use generative AI. I also use Shepards and Westlaw. I try to read every case I cite. I don't trust AI research or analysis, but it is great for composition.
Whether AI hallucinations in an order rise to the level of judicial misconduct probably depends on the specifics. Judges need to rely on law clerks for their research and writing or else they simply would never get anything done. I don't necessarily expect every judge to anticipate every way a law clerk might misuse a new technology. If it happened once, the opinion was promptly withdrawn, the law clerk was reprimanded in proportion to the severity of the error (terminated if very serious), and the judge instituted new safeguards and training protocols going forward, then I think everyone can probably just move on. On the other hand, if we are talking about pervasive errors due to the judge's negligence, then yes, I think that rises to the level of misconduct.
Grading on a curve, this is one of his better entries, but there are red flags in the comments that suggest the limits to that grading.
As to the last sentence, he's no Tim Allen on this point, I think.
It's "abuse of discretion" but not "abuse of judicial power".
This judge needs to be punished in the same manner as an attorney would be. Otherwise it's OK for everyone.
"Is it an abuse of judicial power for a judge to issue an opinion with AI hallucinations?"
That you even think there is an answer other than "yes" says a lot about the current state of the legal "profession".
Just for the record, "AI hallucination" is a synonym for "lie".
ONE MORE TIME FOR THE LUDDITES (and yes, I am shouting).
I have asked this question multiple times in other AI threads, and everyone seems to want to take the 5A protection. Westlaw (anyone ever hear of that?) is shouting praises of the results its AI engine produces. Of course, it costs money (but surely EV or someone on this site has used it, or knows someone who has), and ChatGPT is the cheapskate version that costs nothing.
So has anyone ever used the Westlaw AI, and if so how does it compare to what seems to be the el cheapo ChatGPT? Asking for a friend.