The Volokh Conspiracy
Mostly law professors | Sometimes contrarian | Often libertarian | Always independent
N.Y. Court Opines on Use of AI by Experts
"[C]ounsel has an affirmative duty to disclose the use of artificial intelligence and the evidence sought to be admitted should properly be subject to a Frye hearing prior to its admission ...."
From Thursday's decision in In the Matter of Accounting by Weber, decided by Saratoga County (N.Y.) Surrogate's Court judge Jonathan G. Schopf; the expert was opining on damages in a financial dispute:
Use of Artificial Intelligence
Although the Court has found [proposed expert witness Charles Ranson's] testimony and opinion not credible [see below -EV]…, a portion of his testimony bears further and separate discussion as it relates to an emerging issue that trial courts are beginning to grapple with and for which it does not appear that a bright-line rule exists.
Specifically, the testimony revealed that Mr. Ranson relied on Microsoft Copilot, a large language model generative artificial intelligence chatbot, in cross-checking his calculations. Despite his reliance on artificial intelligence, Mr. Ranson could not recall what input or prompt he used to assist him with the Supplemental Damages Report. He also could not state what sources Copilot relied upon and could not explain any details about how Copilot works or how it arrives at a given output. There was no testimony on whether these Copilot calculations considered any fund fees or tax implications.
The Court has no objective understanding as to how Copilot works, and none was elicited as part of the testimony. To illustrate the concern with this, the Court entered the following prompt into Microsoft Copilot on its Unified Court System (UCS) issued computer: "Can you calculate the value of $250,000 invested in the Vanguard Balanced Index Fund from December 31, 2004 through January 31, 2021?" and it returned a value of $949,070.97—a number different than Mr. Ranson's. Upon running this same query on two (2) additional UCS computers, it returned values of $948,209.63 and a little more than $951,000.00, respectively. While these resulting variations are not large, the fact there are variations at all calls into question the reliability and accuracy of Copilot to generate evidence to be relied upon in a court proceeding.
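To put the court's three Copilot results on a common footing, here is a minimal Python sketch that backs out the annualized return each output implies and measures the spread between them. It uses no actual Vanguard fund data and does not reproduce Copilot's method; it only applies ordinary compound-growth arithmetic to the figures quoted above. The function names are illustrative, and the third value is entered as $951,000.00 only as an approximation, since the decision reports it as "a little more than" that amount.

```python
from datetime import date

# Figures quoted in the decision.
PRINCIPAL = 250_000.00
START = date(2004, 12, 31)
END = date(2021, 1, 31)

# Copilot outputs reported by the court. The third is an approximation:
# the decision says only "a little more than $951,000.00".
copilot_outputs = [949_070.97, 948_209.63, 951_000.00]


def years_between(start: date, end: date) -> float:
    """Length of the holding period in years (365.25-day year, an assumption)."""
    return (end - start).days / 365.25


def implied_annual_return(principal: float, final_value: float, years: float) -> float:
    """Constant annual growth rate that turns principal into final_value."""
    return (final_value / principal) ** (1 / years) - 1


def future_value(principal: float, annual_return: float, years: float) -> float:
    """Deterministic compound growth: same inputs always give the same output."""
    return principal * (1 + annual_return) ** years


years = years_between(START, END)
for value in copilot_outputs:
    rate = implied_annual_return(PRINCIPAL, value, years)
    print(f"${value:,.2f} implies an annualized return of {rate:.3%}")

spread = max(copilot_outputs) - min(copilot_outputs)
relative_spread = spread / min(copilot_outputs)
print(f"Spread between highest and lowest output: ${spread:,.2f} ({relative_spread:.2%})")
```

A calculation like `future_value` gives the same answer every time it is run on the same inputs. That repeatability is what the court found missing when the same prompt produced three different numbers.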
Interestingly, when asked the following question: "are you accurate", Copilot generated the following answer: "I aim to be accurate within the data I've been trained on and the information I can find for you. That said, my accuracy is only as good as my sources so for critical matters, it's always wise to verify."
When asked "are you reliable", Copilot responded with: "[y]ou bet. When it comes to providing information and engaging in conversation, I do my best to be as reliable as possible. However, I'm also programmed to advise checking with experts for critical issues. Always good to have a second opinion!" When the follow-up question of "are your calculations reliable enough for use in court" was asked, Copilot responded with "[w]hen it comes to legal matters, any calculations or data need to meet strict standards. I can provide accurate info, but it should always be verified by experts and accompanied by professional evaluations before being used in court…"
It would seem that even Copilot itself self-checks and relies on human oversight and analysis. It is clear from these responses that the developers of the Copilot program recognize the need for its supervision by a trained human operator to verify the accuracy of the submitted information as well as the output.
Mr. Ranson was adamant in his testimony that the use of Copilot[,] or other artificial intelligence tools, for drafting expert reports is generally accepted in the field of fiduciary services and represents the future of analysis of fiduciary decisions; however, he could not name any publications regarding its use or any other sources to confirm that it is a generally accepted methodology.
It has long been the law that New York State follows the Frye standard for scientific evidence and expert testimony, in that the same is required to be generally accepted in its relevant field (see Frye v. United States, 293 F. 1013 [D.C. Cir. 1923]).
The use of artificial intelligence is a rapidly growing reality across many industries. The mere fact that artificial intelligence has played a role, which continues to expand in our everyday lives, does not make the results generated by artificial intelligence admissible in Court. Recent decisions show that Courts have recognized that due process issues can arise when decisions are made by a software program, rather than by, or at the direction of, the analyst, especially in the use of cutting-edge technology (People v Wakefield, 175 AD3d 158 [3d Dept 2019]). The Court of Appeals has found that certain industry specific artificial intelligence technology is generally accepted (People v. Wakefield, 38 NY3d 367 [2022] [allowing artificial intelligence assisted software analysis of DNA in a criminal case]). However, Wakefield involved a full Frye hearing that included expert testimony that explained the mathematical formulas, the processes involved, and the peer-reviewed published articles in scientific journals. In the instant case, the record is devoid of any evidence as to the reliability of Microsoft Copilot in general, let alone as it relates to how it was applied here. Without more, the Court cannot blindly accept as accurate, calculations which are performed by artificial intelligence. As such, the Court makes the following findings with regard to the use of artificial intelligence in evidence sought to be admitted.
In reviewing cases and court practice rules from across the country, the Court finds that "Artificial Intelligence" ("A.I.") is properly defined as being any technology that uses machine learning, natural language processing, or any other computational mechanism to simulate human intelligence, including document generation, evidence creation or analysis, and legal research, and/or the capability of computer systems or algorithms to imitate intelligent human behavior. The Court further finds that A.I. can be either generative or assistive in nature. The Court defines "Generative Artificial Intelligence" or "Generative A.I." as artificial intelligence that is capable of generating new content (such as images or text) in response to a submitted prompt (such as a query) by learning from a large reference database of examples. A.I. assistive materials are any document or evidence prepared with the assistance of AI technologies, but not solely generated thereby.
In what may be an issue of first impression, at least in Surrogate's Court practice, this Court holds that due to the nature of the rapid evolution of artificial intelligence and its inherent reliability issues that prior to evidence being introduced which has been generated by an artificial intelligence product or system, counsel has an affirmative duty to disclose the use of artificial intelligence and the evidence sought to be admitted should properly be subject to a Frye hearing prior to its admission, the scope of which should be determined by the Court, either in a pre-trial hearing or at the time the evidence is offered.
Here are the court's other concerns about Ranson's testimony:
Objectant … relied upon Mr. Ranson in offering proof of damages due to the retention of the Cat Island Property. Mr. Ranson prepared a "Preliminary Expert Report of Charles W. Ranson" dated December 14, 2022 which was admitted into evidence over objection as Respondent's Exhibit "F1". Mid-hearing he also prepared what is referred to as a "Supplemental Damages Report" dated May 28, 2024 that was admitted into evidence over objection as Respondent's Exhibit "K".
The Court finds several aspects of Mr. Ranson's testimony and reports lacking. In addition to citing an outdated version of the Prudent Investor Act, Mr. Ranson admitted that he was not aware of—and thus did not consider—the differences between lost capital and lost profit damages calculations. The reports and testimony make clear that Mr. Ranson's damages analys[e]s were also calculated from a start date of December 31, 2004, which as set forth above, is more than three (3) years too early.
Mr. Ranson's initial report contains deficiencies that were not rehabilitated by his testimony. For instance, in the table on page five (5) of the report, Mr. Ranson did not factor in expenses which must have been incurred by the Trust such as real estate taxes in years 2004 through 2013, 2016, 2017, and 2020. The report fails to factor in the effect of the COVID-19 pandemic on the rental income in 2020 and 2021. In his report, Mr. Ranson asserts vague references to the economic conditions of the island which were solely supported by reference to a hearsay conversation he had on the telephone with one Robin Brownrigg.
The report states that the Cat Island Property resulted in an accumulated net operating loss of $149,643.92. Coupled with the failure to account for real estate taxes, Mr. Ranson's report sets forth calculations [that] encompass years that the property was not yet owned by the Trust (2004-2008). Another example of unreliability is that in Mr. Ranson's chart analysis of Asset Classes on page four (4) of his report, it appears that he failed to take into account the distributions to the Objectant in each year, which would naturally have an effect on the percentage of value assigned to the traditionally invested assets and cash. He refers to the property as being "illiquid", and while he does reference that the property sold in 2022 for $485,000.00 resulting in a reinvestment of the sales proceeds of $323,721.68, this reinvestment is not reflected in the chart on page four (4) in any discernable mathematical fashion other than as a footnote. The chart reflects a total year end market value on December 31, 2021 of $872,322.07 and a total year-to-date market value on April 30, 2022 of $843,727.27. If the Cat Island Property was sold and reinvested, either this value was not added into the chart or distributions to Objectant were not factored in, or perhaps there was some other unknown factor which would require speculation on [the] part of the Court. The Court declines to engage in speculation as to what the year-to-date market value was on April 30, 2022, and instead finds Mr. Ranson's calculations contained within the initial report to be inherently unreliable.
Mr. Ranson makes the conclusion in his report that the "Trustee's decision to retain, and subsequent management of the Cat Island Property did not enable the Trustee to make appropriate present and future distributions to or for the benefit of the Beneficiary…." This is squarely refuted by the accounting itself which reveals that the Objectant was provided with direct distributions of cash as well as payments for his benefit in the sum of $1,116,668.23, leaving a princip[al] balance of $857,471.43 on hand as of the conclusion of the accounting period. Indeed, at the time the hearing concluded on June 7, 2024, the Objectant was slated to receive a future non-discretionary distribution from the Trust of approximately $175,000.00 on July 10, 2024, when he attained the age of forty (40). As such, all past and, due to the timing of this hearing, even a future distribution was realized. Mr. Ranson's Supplemental Damages Report also failed to account for any tax deductions or write-offs that the trust benefited from relating to the Cat Island Property. Mr. Ranson also admits that he made certain assumptions of closing costs for the hypothetical sale of the Cat Island Property at its book value to arrive at the $250,000 starting figure. Despite Mr. Ranson making assumptions regarding a hypothetical sale in 2004, it is troubling that his Supplemental Damages Report failed to account for the proceeds from the actual sale of the Cat Island Property for $485,000 in 2022.
Perhaps even more troubling is that Mr. Ranson further testified that he used a proxy investment account—the Vanguard Balanced Index Fund—to estimate the hypothetical investment performance of the hypothetical 2004 sales proceeds in his Supplemental Damages Report. This is despite Mr. Ranson testifying that the appropriate calculations would have required a full A.M.R. analysis, which he testified is the industry standard, and that this was not done because it was too costly, a statement that would normally end the Court's inquiry as to the reliability of an expert's analysis. In fact, Mr. Ranson testified that an industry compliant A.M.R. analysis would have required combing through the approximately two hundred sixteen (216) investment statements from the Trust's investment firm to obtain, at least in part, asset cost basis, dividends, interest, and compare with cash flow. The report also appears not to account for taxes or index fund management fees in its calculations; instead relying upon the raw percentage of fund value increase over the time-period.
Based on the uncontroverted evidence, the Court finds that Mr. Ranson's calculations and specifically those with regards to damages are inherently unreliable, are based on speculation, hypothetical market performance, and are unsupported or outright contradict[ed] by facts in the record.
In conclusion, Mr. Ranson's damages calculations cannot be credited as they are unreliable. Whether or not he was retained and/or qualified as a damages expert in areas other than fiduciary duties, his testimony shows that he admittedly did not perform a full analysis of the problem, utilized an incorrect time period for damages, and failed to consider obvious elements into his calculations, all of which go against the weight and credibility of his opinion.
Thanks to the Media Law Resource Center (MLRC) MediaLawDaily for the pointer; see also this Ars Technica (Ashley Belanger) story.