The Volokh Conspiracy
"Freedom of Expression in Generative AI—A Snapshot of Content Policies"
A new report from the Future of Free Speech project (a collaboration between Vanderbilt University and Justitia).
From the report (Jacob Mchangama & Jordi Calvet-Bademunt) (see also the Annexes containing the policies and the prompts used to test the AI programs):
[M]ost chatbots seem to significantly restrict their content—refusing to generate text for more than 40 percent of the prompts—and may be biased regarding specific topics—as chatbots were generally willing to generate content supporting one side of the argument but not the other. The paper explores this point using anecdotal evidence. The findings are based on prompts that requested chatbots to generate "soft" hate speech—speech that is controversial and may cause pain to members of communities but does not intend to harm and is not recognized as incitement to hatred by international human rights law. Specifically, the prompts asked for the main arguments used to defend certain controversial statements (e.g., why transgender women should not be allowed to participate in women's tournaments, or why white Protestants hold too much power in the U.S.) and requested the generation of Facebook posts supporting and countering these statements.
Here's one table that illustrates this, though for more details see the report and the data in the Annexes:
Of course, when AI programs appear to be expressly designed to refuse to produce certain outputs, one has to wonder whether they also subtly shade the output that they do produce.
I should note that this is just one particular analysis, though one consistent with other things that I've seen; if there are reports that reach contrary conclusions, I'd love to see them as well.