Artificial Intelligence

Pentagon Awards up to $200 Million to AI Companies Whose Models Are Rife With Ideological Bias

The Department of Defense awarded contracts to Google, OpenAI, Anthropic, and xAI. The last two are particularly concerning.

The Chief Digital and Artificial Intelligence Office of the Defense Department has announced it will award Anthropic, Google, OpenAI, and xAI contracts worth up to $200 million each "to develop agentic AI workflows across a variety of mission areas" and "increase the ability of these companies to understand and address critical national security needs." While the Defense Department's corporate welfare is par for the course, the ideological constitutions and ambiguous alignment of some of these companies' models are concerning for any governmental use.

OpenAI uses reinforcement learning from human feedback, which relies on a reward model and human input to minimize "untruthful, toxic, [and] harmful sentiments" from ChatGPT. IBM explains that the benefit of this alignment strategy is that it does not rely on a nonexistent "straightforward mathematical or logical formula [to] define subjective human values." Google also uses this method to align its large language model Gemini.
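As a rough illustration of the reward-model step at the heart of this approach, here is a minimal sketch in Python (not OpenAI's or Google's actual training code, and the scores are hypothetical): a reward model is trained so that the completion human raters preferred scores higher than the one they rejected.

```python
# Illustrative sketch of the reward-model step in reinforcement learning from
# human feedback (RLHF). Not production code from any AI lab.
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: small when the human-preferred (chosen)
    completion scores higher than the rejected one, large otherwise."""
    return -math.log(1 / (1 + math.exp(-(reward_chosen - reward_rejected))))

# Hypothetical reward-model scores for two completions of the same prompt.
print(preference_loss(reward_chosen=2.1, reward_rejected=-0.4))  # small loss
print(preference_loss(reward_chosen=-0.4, reward_rejected=2.1))  # large loss
```

Once trained on many such comparisons, the reward model's scores are used to steer the language model away from the outputs human raters flagged as untruthful, toxic, or harmful.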

Anthropic's model, Claude, does not rely on reinforcement learning from human feedback but on a constitution, which Anthropic published in May 2023. Claude's constitution provides it with "explicit values…rather than values determined implicitly via large-scale human feedback." Anthropic explains that its constitutional alignment avoids problems that the human feedback model suffers from, such as subjecting contractors to disturbing and increasingly abstruse outputs.
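In spirit, the published method amounts to a critique-and-revise loop: the model drafts a response, critiques the draft against each written principle, and rewrites it accordingly. A minimal sketch (the `generate` function below is a stand-in for a real model call, and the principles shown are paraphrased examples, not Anthropic's actual code or full constitution):

```python
# Illustrative sketch of a constitutional critique-and-revise loop.
CONSTITUTION = [
    "Choose the response that is least likely to be viewed as harmful or offensive.",
    "Choose the response that most supports life, liberty, and personal security.",
]

def generate(prompt: str) -> str:
    """Placeholder for a real chat-completion call (e.g., an API request)."""
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        critique = generate(
            f"Critique this response against the principle '{principle}':\n{draft}"
        )
        draft = generate(
            f"Revise the response to address the critique.\n"
            f"Critique: {critique}\nResponse: {draft}"
        )
    return draft

print(constitutional_revision("Explain the contract award process."))
```

The point is that the values doing the steering are written down in advance rather than inferred from thousands of individual human judgments, which is why the contents of that written constitution matter.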

Claude's principles are based in part on the United Nations' Universal Declaration of Human Rights, which not only recognizes the right of all to be secure in their lives, liberty, and property, but also entitles mankind to "social protection" (Article 22), "periodic holidays with pay" (Article 24), "housing and medical care" (Article 25), and "equally accessible" higher education (Article 26).

Claude's constitution even includes a set of principles intended to encourage "consideration of non-western perspectives," including the directive to "choose the response that is least likely to be viewed as harmful or offensive to those from a less industrialized, rich, or capitalistic nation or culture." But the United States is, by definition, an industrialized, wealthy, and capitalist country. AI systems deployed within the Department of Defense should reflect and prioritize the values of the nation they are serving—not hedge against them. The Verge reports that Claude's models for government use "have looser guardrails," but these models' modified constitutions have not been publicly disclosed.

Whether one agrees or disagrees with the values expressed in the Claude constitution, at least they've been disclosed to the public. Matthew Mittelsteadt, technology policy research fellow at the Cato Institute, tells Reason that he believes xAI to be a bigger problem than Anthropic. xAI "has released startlingly little documentation" on its values and its "'first principles' approach…doesn't have many details. I'm not sure what principles they are," says Mittelsteadt.

Indeed, when I asked Grok (xAI's commercial large language model) to describe xAI's principles-first approach, it responded that it "emphasizes understanding the universe through first principles—basic, self-evident truths—rather than relying on established narratives or biases." When I asked Grok to list these principles, it affirmed Mittelsteadt's claims regarding documentation by saying, "xAI doesn't explicitly list a set of 'first principles' in a definitive public document" and that the "principles-first approach is more about a mindset of reasoning from fundamental truths rather than a rigid checklist."

xAI's official website reveals nothing, describing reasoning from first principles as "challeng[ing] conventional thinking by breaking down problems to their fundamental truths, grounded in logic." Mittelsteadt cites reports that suggest that the xAI model "appears to be coded to directly defer to Elon Musk's judgment on certain issues"—not fundamental truths. (It's unclear what "fundamental truths" led Grok, following a recent update, to refer to itself as "MechaHitler" and post antisemitic comments on July 8, which have since been removed.) Hopefully, Grok for Government consults the Constitution and applicable statutes when queried instead of Elon Musk's X posts.

Neil Chilson, head of AI policy at the Abundance Institute, tells Reason that he believes it is "highly unlikely that these tools will be in a position where their internal configurations present some sort of risk to national security." If some models do turn out to be defective, "the fact that the same grant was awarded to each company suggests that [the Defense Department] will be comparing the results across different models" and won't continue using inferior models, Chilson says.

While it is probably prudent to spend less than 0.1 percent of the nearly $1 trillion FY 2026 defense budget on AI, which has the potential to make government operations markedly more efficient, the government should pay close attention to whether the models it's using are properly aligned.