The Volokh Conspiracy


Journal of Free Speech Law: "The European Liability Regime For Large Language Models"

"Is It a Platform? Is It a Search Engine? It's Chat GPT!," by Prof. Beatriz Botero Arcila, just published in our symposium on Artificial Intelligence and Speech; more articles from the symposium coming in the next few days.


The article is here; the Introduction:

ChatGPT and other AI large language models (LLMs) raise many of the regulatory and ethical challenges familiar to AI and social media scholars: They have been found to confidently invent information and present it as fact. They can be tricked into providing dangerous information even when they have been trained to not answer some of those questions—such as giving advice on how to plan an attack or how to build a Molotov cocktail if asked through hypotheticals. They can output detailed arguments very quickly, which may make the cost of producing disinformation very low (though some have argued that this risk is overblown because that cost is already very low). Their ability to mimic a personalized conversation can be very persuasive, which creates important disinformation and fraud risks. They reproduce various societal biases, because they are trained on data from the internet that embodies such biases, for example on issues related to gender and traditional work roles. They have already started raising data protection and security concerns, as shown by a first leak of user data in late March 2023 and Italy's data protection agency's temporary ban of ChatGPT.

Thus, like other AI systems, LLMs risk sustaining or enhancing discrimination and perpetuating bias, and promoting the growth of corporate surveillance, while being technically and legally opaque. Like social media, LLMs pose risks associated with the production and dissemination of information online that raise the same kind of concern over the quality and content of online conversations and public debate. All these compounded risks threaten to distort political debate, affect democracy, and even endanger public safety. Additionally, OpenAI reported an estimated 100 million active users of ChatGPT in January 2023, which makes the potential for a vast and systemic impact of these risks a considerable one.

LLMs are also expected to have great potential. They will transform a variety of industries, freeing up professionals' time to focus on different substantive matters. They may also improve access to various services by facilitating the production of personalized content, for example for medical patients or students. Consequently, one of the key policy questions LLMs pose is how to regulate them so that some of these risks are mitigated while still encouraging innovation and allowing their benefits to be realized. This Essay examines this question, with a focus on the liability regime for LLMs for speech and informational harms and risks in the European Union.

The EU is undertaking an ambitious regulatory project to pursue a digital transformation "that works for the benefit of people through respecting our values." A central part of this effort is the proposed Artificial Intelligence Act (AI Act), a flagship risk-based regulation of trustworthy AI. The AI Act would be a Europe-wide law designed to address some of the "traditional" and long-identified ethical risks posed by AI systems, such as lack of technical and legal transparency, the potential for bias and discrimination, and danger to privacy. The Act splits AI systems into four different levels of risk, prohibits a limited set of systems that pose an unacceptable level of these risks (such as real-time remote biometric identification systems in publicly accessible spaces), and is mostly concerned with creating obligations for the second tier of risk, "high risk" systems. High risk systems are a limited set of systems that acutely raise these kinds of risks, judging by their intended use as determined by their designer, but to a degree that can be mitigated. The Act then creates a variety of safety requirements for such systems related to data governance, transparency, and their design and operation. It requires, for example, that high-risk systems be supervised by a human when in use. As it turns out, however, the AI Act, which is still being discussed at the time of writing, appears to be rather ill-prepared to address some of the challenges raised by LLMs, like ChatGPT. It is not intended to address the risks of systems when it is the user who determines how the systems are to be used, as is the case with general purpose AI systems. It is also not intended to address content moderation, freedom of expression, or information-related harms and risks.

The European Union, however, recently enacted another regulation that is directly concerned with addressing the risks and challenges associated with content moderation, freedom of expression and the spread of disinformation or other forms of harmful speech online: the Digital Services Act (DSA). The challenge, however, is that the DSA was not meant to cover AI generated content, but rather user generated content. At first sight it thus does not seem to apply to the content generated by LLMs.

This Essay argues, however, that because many of the risks these systems raise are risks to the information ecosystem, in Europe they can and should be addressed, at the outset, with current content moderation law. This Essay proposes an interpretation of the DSA that could apply to these tools when they are released in the market in a way that strongly resembles other intermediaries covered by content moderation laws, such as search engines. (This is without prejudging present and future AI regulations that may be created to deal with other challenges in a more specific way.) In doing so, it follows other scholars who have argued that the regulation of LLMs should focus on concrete risks they entail based on their specific uses. In the US and elsewhere, it may be helpful to use an approach that mixes the traditional safe harbor for internet intermediaries with due-diligence and risk-mitigation obligations, especially for the largest providers. Such an approach has the potential to, on the one hand, continue to support innovation, research, and development, while on the other hand, create incentives for innovation to be done responsibly, and in a way that mitigates potential systemic risks and harms.

This Essay's point of departure is the DSA, Europe's main content moderation law. The DSA is a functional analog to 47 U.S.C. § 230 and it updated the generally applicable ground rules for the regulation of online content. It seeks to balance different European values like protecting freedom of speech and information, maintaining high levels of consumer protection and fostering innovation and economic growth. As I explain in Part II, it does this by providing a somewhat conditional safe harbor from liability for all platforms hosting content, while also imposing on the most systemically relevant actors—like very large social media companies and search engines—certain due diligence and risk-mitigation obligations to attenuate some of the systemic threats that these tools pose. An advantage of this approach is that it simplifies many of the difficulties of imposing liability for AI-related harms. And the DSA is already law; it will take at least a few years until the AI Act enters into full force.

Contrary to what may happen in the United States, where there is at least disagreement on whether tools like ChatGPT are covered by Section 230 or not, scholars in the EU do not think that LLM-chatbots fall naturally under the EU safe harbor for intermediary liability. The main reason is that large language models generate content themselves, and the definition of exempt intermediaries that could best fit ChatGPT, a hosting service, refers to "content provided by users."

If this is adopted as the main interpretation, companies deploying interfaces that provide end-users easy access to LLMs that may generate harmful and illegal disinforming or defamatory content could be held liable for such content. Far from being a panacea, such liability could undermine research, development, and innovation by creating legal uncertainty for businesses developing and adopting various generative AI systems. No one really knows how to train these powerful AI systems so that they will always be reliable, helpful, honest, and harmless. At the same time, there are many actors involved in the chain of events and the training of an algorithm (designers, manufacturers, deployers, users), AI systems are opaque, and AI systems can fail (and harm people) in unpredictable ways. How to allocate fault amongst the different actors is often unclear, and proving that someone breached a duty of care can be very hard for victims. To avoid some of these concerns, regulators both in the EU and the US are adopting different risk regulation mechanisms—for creating ex ante requirements like conducting risk assessments and following technical standards—to deal with AI and social media. I discuss this further in Section III.

In this Essay I start proposing a middle-ground position in which general purpose LLMs like ChatGPT, Bard, and LLaMA should be, and perhaps already are, covered by internet intermediary regulation. At the same time, the companies placing these systems on the market should also be required to comply with due diligence and risk-mitigation obligations to, for example, take measures to curb harmful speech. This would achieve a balance between facilitating the development of new tools and services while ensuring that their creators set in place key guardrails before placing them on the market. This is the DSA approach to social media regulation in Europe. And lawyers and policymakers in the United States may also want to consider this, both because these EU regulations are applicable to US companies operating within the EU, and because the US is also moving in the direction of AI risk regulation and the future of Section 230 is still part of the political agenda.

Specific to the EU context, I offer an alternative interpretation and legal reform proposal for the DSA, hoping that EU courts and scholars will not dismiss, just yet, the question of whether the DSA applies to ChatGPT or other LLMs. I propose a functional and teleological interpretation of the DSA, one in which courts, lawyers, and lawmakers should consider the intention behind the DSA, the way in which new intermediaries are being used, and the function they serve in the information environment. It is an open secret that even if OpenAI and Google continue to label these bots as experiments, most people—myself included—have spent the last few months using ChatGPT to replace or complement their search engines. This is not unreasonable: the landing page of ChatGPT, for example, invites users to ask questions on topics ranging from quantum computing to children's birthday party ideas—normal queries one would use a search engine for. Thus, I suggest that if these tools are being placed on the market where they can be functionally and reasonably considered to be used for search purposes, they should then be bound to the same safe harbor and due-diligence risk mitigation obligations as search engines and other online platforms.

To lay out this argument in more detail, this Essay proceeds as follows: Section I provides background on the EU's content moderation framework, focusing on the DSA and illegal speech liability. Section II explains why, from a policy perspective, it is difficult, and perhaps undesirable, to solely rely on member states' online speech laws and intermediary liability rules when dealing with LLM-generated harms, and why a risk-regulation approach could be desirable. Section III explains how the DSA could be interpreted to apply to LLM-powered general information retrieval tools.