The Volokh Conspiracy
Mostly law professors | Sometimes contrarian | Often libertarian | Always independent
Federal Judge Requires All Lawyers to File Certificates Related to Use of Generative AI
The certificates must "attest[] either that no portion of the filing was drafted by generative artificial intelligence (such as ChatGPT, Harvey.AI, or Google Bard) or that any language drafted by generative artificial intelligence was checked for accuracy, using print reporters or traditional legal databases, by a human being."
From Judge Brantley Starr (N.D. Tex.), posted today, a standing order on Mandatory Certification Regarding Generative Artificial Intelligence (paragraph breaks added, as is commonly done in quotes on this blog):
All attorneys appearing before the Court must file on the docket a certificate attesting either that no portion of the filing was drafted by generative artificial intelligence (such as ChatGPT, Harvey.AI, or Google Bard) or that any language drafted by generative artificial intelligence was checked for accuracy, using print reporters or traditional legal databases, by a human being.
These platforms are incredibly powerful and have many uses in the law: form divorces, discovery requests, suggested errors in documents, anticipated questions at oral argument. But legal briefing is not one of them. Here's why. These platforms in their current states are prone to hallucinations and bias. On hallucinations, they make stuff up—even quotes and citations.
Another issue is reliability or bias. While attorneys swear an oath to set aside their personal prejudices, biases, and beliefs to faithfully uphold the law and represent their clients, generative artificial intelligence is the product of programming devised by humans who did not have to swear such an oath. As such, these systems hold no allegiance to any client, the rule of law, or the laws and Constitution of the United States (or, as addressed above, the truth). Unbound by any sense of duty, honor, or justice, such programs act according to computer code rather than conviction, based on programming rather than principle. Any party believing a platform has the requisite accuracy and reliability for legal briefing may move for leave and explain why.
Accordingly, the Court will strike any filing from an attorney who fails to file a certificate on the docket attesting that the attorney has read the Court's judge-specific requirements and understands that he or she will be held responsible under Rule 11 for the contents of any filing that he or she signs and submits to the Court, regardless of whether generative artificial intelligence drafted any portion of that filing. A template Certificate Regarding Judge-Specific Requirements is provided here.
Note that federal judges routinely have their own standing orders for lawyers practicing in their courtrooms. These are in addition to the local district rules, and to the normal Federal Rules of Civil and Criminal Procedure.
Starr, nominated at 40 after a career in partisan law with paychecks from taxpayers, squeaked through the Senate and apparently has become quite the pronouncer of law at the age of 43 or 44.
Some people seek judgeships to secure the respect and authority they could not arrange in private practice. In some contexts, this end run involves door-knocking, envelope-licking, back-slapping, check-writing, etc. In others, longstanding membership in the Federalist Society and a record of partisan polemics seem to suffice.
"any language drafted by generative artificial intelligence was checked for accuracy, using print reporters or traditional legal databases, by a human being."
Seems like cite checking and the like would be fairly easy to automate, and wouldn't require the use of AI.
Just because technologies like ChatGPT are prone to errors when writing legal briefs doesn't mean that AIs designed for legal work are infeasible, or even that far off.
But the interests of the legal profession are not aligned with the interests of clients when it comes to efficiency.
12, I’m not understanding your point here. The judge seems to be saying, "[1] Don’t use AI in my courtroom. **OR!!!** [2] Feel free to use it, but give me a heads-up, in writing, that (a) you’ve used AI, AND, that (b) you have then checked everything AI-generated for accuracy.”
How is that not aligned with the interests of clients and efficiency? (Assuming, of course, that fake/invented cites and quotes and cases will hurt your clients’ cases, once they have been discovered.)
The point is that automated cite-checking is a fairly easy thing to develop (I'm kinda surprised no one has done it already--or maybe automated cite checkers exist), but it will face obstacles such as these.
They do exist, but the devil's in the details. Verifying that citations point to cases that actually exist is straightforward enough, but it requires access to multiple commercial databases, since citations to unpublished opinions are often provided in a format proprietary to one of those databases (e.g., "Doe v. Roe, 2014 WL 17839070 (M.D. Vt. 1943)," with "WL NNNNN" referring to an internal Westlaw ID). Westlaw and Lexis both have automated tools that work at least with their own cites -- not clear if they work across platforms. And they're not cheap.
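Just to make that first step concrete, here's a rough sketch (Python, and only a sketch under my assumptions) of the "does this citation even exist" pass. The regexes cover only a couple of common citation formats -- real Bluebook grammar is far messier -- and lookup_citation is a stand-in for whatever database access you actually have (Westlaw, Lexis, CourtListener, etc.), not any real API:

```python
import re

# Rough patterns for two common citation formats; real citation grammar is
# far messier, so these are illustrative only.
REPORTER_CITE = re.compile(
    r"\b\d{1,4}\s+(?:U\.S\.|S\. ?Ct\.|F\.(?: ?Supp\.)?(?: ?[234]d| ?4th)?)\s+\d{1,4}\b"
)
VENDOR_CITE = re.compile(r"\b\d{4}\s+(?:WL|U\.S\. Dist\. LEXIS)\s+\d+\b")


def extract_citations(brief_text: str) -> list[str]:
    """Pull candidate citation strings out of a brief."""
    return REPORTER_CITE.findall(brief_text) + VENDOR_CITE.findall(brief_text)


def find_unverifiable(brief_text: str, lookup_citation) -> list[str]:
    """Return the citations that the lookup service cannot find.

    lookup_citation is a placeholder for whatever database access you
    actually have; it should return the opinion (or its text) if found,
    otherwise None.
    """
    return [cite for cite in extract_citations(brief_text)
            if lookup_citation(cite) is None]
```

Anything this flags still has to go to a human, for the reasons below.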
The trickier bit (and particularly relevant to these recent discussions) is verifying that the language or, in particular, the concepts attributed to an opinion actually appear in the opinion. That's a tough nut to crack, particularly considering the cost of the automated cite checker not catching a material misrepresentation from the brief writer. And if you can't trust it 100% and thus have to double-check anyway, it's not adding much if any real value.
It's not a question of what's the easiest way to do it. It's how to make sure it's done right by someone who can be held accountable.
That's my thought as well. It seems pretty intuitive to me that if I as a lawyer use an AI to write a brief or motion, and the AI that I used makes an incorrect citation or otherwise incorrectly represents the facts or law, it should be treated the same as if I personally wrote the brief or memo and made those mistakes myself. But it appears that some lawyers and plaintiffs who have done just that are trying to pass the buck, and until that's been resolved once and for all (insofar as anything can be), I suspect that a lot of judges are going to say "not getting in the middle of this -- we're just going to make you certify that you've checked it for accuracy, and if it's not, we'll use the certification that you made to discipline you."
If you remember a few months back when Volokh was obsessed with AI for two weeks or so, he was quite adamant that liability would obviously fall on the tool for misrepresentations, and that the person spreading them had no responsibility.
It was... odd.
Which is to say, I agree with you: these are tools, and the person responsible for the tool's product is the person using the tool. Absent exceptional circumstances, assigning responsibility to the person that made the tool is erroneous.
This is stupid and not the way liability works. Tool makers are routinely held liable for their acts. If I manufacture a hammer, and you start hammering with it and the head flies off and hits someone in the face, I am liable as the manufacturer if I designed or manufactured it in such a way as to make that a significant possibility.
That is not remotely what he said.
Apparently generalized detection of fabrication in the outputs of these models actually is a tough thing to do; it's not like the companies working on them lack motive to get them to stop making shit up, but they're not having a lot of success.
It might be easier to accomplish in limited domains, but the payoff is proportionately less, so that's not where they're putting in the work.
But don't "interns" exist for this reason, among others?
Sure, but specific detection of fabricated legal citations is relatively easy. And hopefully LLMs specifically trained to write legal documents are coming; they would be much better for legal briefs than these generic engines, but one of the major impediments to developing them is regulatory risk.
Well, if you're having an LLM generate a legal document, it can just go ahead and generate the certification while it's at it. 😉
Adding to my notes above, detection of incorrect legal citations is (factoring in the caveats I mentioned) relatively easy. But at that point, you still don't know if 1) the cited case is completely fake; 2) the case actually exists, but there's a typo in the citation somewhere; 3) there's a typo in the quote from the case; 4) the brief writer cited one case and accidentally pasted in a quote from another, etc. So someone then has to take that on and sort it out. That's generally straightforward when you can take it back to the human who wrote the brief, who hopefully pulled an actual copy of the case or at least remembers enough about it to help track it down; but when the writer is a sociopathic LLM, you don't have anything you can rely on to help run the issue to ground.
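For what it's worth, the "is this quote really in the case" check can at least be triaged automatically once you have the opinion text in hand. A rough Python sketch follows; the windowed difflib comparison and the 0.9 cutoff are arbitrary choices of mine, not anything a real tool necessarily uses. An exact hit passes, a near miss suggests a typo or edited quote, and a low score flags a possibly fabricated quote -- but a human still has to sort out cases 2 through 4 above:

```python
from difflib import SequenceMatcher


def classify_quote(quote: str, opinion_text: str) -> str:
    """Crude triage of a quoted passage against a retrieved opinion's text."""
    if quote in opinion_text:
        return "verbatim match"

    # Slide a quote-sized window across the opinion and keep the best
    # similarity score. difflib is slow on long opinions, but fine for a demo.
    window = len(quote)
    step = max(1, window // 4)
    best = 0.0
    for start in range(0, max(1, len(opinion_text) - window + 1), step):
        chunk = opinion_text[start:start + window]
        best = max(best, SequenceMatcher(None, quote, chunk).ratio())

    if best >= 0.9:  # arbitrary cutoff: likely a typo or lightly edited quote
        return "near match -- check for typos or altered wording"
    return "no plausible match -- the quote may be fabricated"
```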
Thanks, judge, for uselessly burdening 99.99% of the lawyers who appear before you to possibly deter the one lawyer who might possibly use AI and wind up filing a bad brief. What an abuse of power. Why not follow up with requiring an attestation that someone checked for typos?
What a burden to add a copy-and-paste statement to a filing — using a court-provided template — and sign it.
It's never a burden when someone else has to do it.
If you looked at the model certificate, you would know you only have to file one certificate covering all your pleadings in the case.
IANAL but this doesn't seem like that much of a burden. Can you educate me? I'm assuming that most competent lawyers don't use these technologies right now because of the possibility of errors, so there's no effect on them. I also assume that competent lawyers who do use these technologies check the results, so the only effect on them is the certification. Is that a heavy burden, or am I missing something else?
The burden isn’t in changing how they use (or don’t use) AI tools—it’s in adding this certification every time they file something. Or, more likely, dealing with the consequences of not realizing they were supposed to add it (especially since the order itself doesn’t make it entirely clear whether you’re supposed to file it once per case or once per filing).
1) You don't have to add this certification every time you file something. You have to file it once for the case. (The text of the model certification that's provided makes that clear.)
2) This would not be a big burden even if your interpretation were correct. There are many boilerplate things that get routinely filed, from certificates of service to certifications of word count to certifications that you've used antivirus software. This is e-filing; it takes 10 seconds.
3) What this requirement does is eliminate the possibility of someone using the Gosh-I-Didn't-Realize defense that Schwartz is using in the NY case.
Because not all judges are making this requirement. It's one judge. So if you have multiple cases in the same jurisdiction and file a bunch of stuff and you forget this one judge's rule and he strikes a pleading or some other important filing and then you blow a deadline or something... there could be consequences for the lawyer and the client. But the problem of AI-generated legal filings seems minimal at best (less than 1% of filings? .01%?). Why make rules for the extreme outliers when there are already rules that deal with them if they submit bogus filings? Seems like a fix or solution in search of a problem that doesn't really exist.
As a lawyer you always have to check the individual judge's rules anyway; you don't memorize them and then guess whether you've complied. (Because they change!)
I practice in State court and we don't have 'judge rules.' We have local circuit court rules that apply to all judges within that specific circuit. Also statewide rules that apply to all circuits. And for appeals, specific rules for each appellate division and rules specific for the State Supreme Court.
All that being said, some judges are known for having their own 'rules' as it were. But they are informal in nature and the consequences for not following them minimal at best.
I already have to have a certification that I served the other party and a certification of my word count. Adding a certification that says "I didn't use AI and/or I checked everything AI pulled up" is not hard and becomes part of a template on every case.
It's embarrassing when lawyers have to certify that they haven't done something really stupid.
Embarrassing for whom . . . most lawyers, or the judge who imposed the requirement from the fringe?
Not clear why this would be particularly more embarrassing than, say, filing a certificate of conference verifying you tried in good faith to work out a disputed issue with the other side before taking it to the judge.
Do lawyers really have to swear to set aside their personal prejudices, biases, and beliefs? I would think that a lot of criminal defense lawyers have a bias against police, for example, and that bias might even help them defend their clients better.
His observation that software has no "allegiances" other than to its developers really gets to the heart of the problem of using AI for any serious purpose: it does not have the capacity to be trusted.
Trust is the foundation of civil society. I'm trying to think of any real-world human transaction that doesn't rely on trust, and I'm coming up empty. It may be that AI could never be used for anything other than for our amusement -- or for criminal purposes. I imagine it would be especially good at the latter.
Essentially, trusting the AI is indirectly trusting the developers.
This would be easier to do in other industries, but so many software-based companies have adopted business plans that are tacitly based on betraying their customers that it's hard to do here.
And I think that lack of trust is fully justified. You can see already a tendency for these companies to assert a level of control over the behavior of their product that involves claiming a veto power over perfectly legal uses. That’s NOT a good mindset when you’re considering whether a company is to be trusted.
I mean, I could easily see such a model saying, "I'm sorry, Dave. I can't defend that client."
Trust Brett to find the conspiracy lurking beneath!