The Center for AI Standards and Innovation (CAISI) at NIST released its evaluation of DeepSeek V4 Pro, finding that the open-weight model's capabilities trail leading U.S. closed models by roughly eight months: it performs at GPT-5 level rather than the Opus 4.6/GPT-5.4 level DeepSeek's own benchmarks claimed. However, the model was more cost-efficient than GPT-5.4 mini on five of seven benchmarks, underscoring China's cost-performance advantage in open-weight models.
Mistral released Mistral Medium 3.5, a 128B dense open-weight model scoring 77.6% on SWE-Bench Verified, now the default in both Vibe (its coding agent platform) and Le Chat. More significantly, Vibe now supports async cloud-based coding sessions that can be spawned from the CLI or chat and run autonomously, putting it in direct competition with Cursor and Anthropic's Claude Code.
The DoD announced agreements with AWS, Google, Microsoft, NVIDIA, OpenAI, SpaceX, Reflection, and later Oracle to deploy AI on IL6/IL7 classified networks for over 1.3 million personnel. Anthropic remains blacklisted after refusing to waive safety guardrails around autonomous weapons and domestic surveillance. The announcement crystallizes the growing rift between safety-first AI labs and U.S. defense procurement.
In a remarkable contradiction, the NSA is actively testing Anthropic's unreleased Mythos model — which has exceptional cyber vulnerability detection capabilities — to probe flaws in Microsoft products and harden government networks, even as the Pentagon calls Anthropic a "supply chain risk." DoD CTO Emil Michael called Mythos a "separate national security moment," treating the model's capabilities as too important to ignore.
The landmark Musk v. OpenAI trial in Oakland saw three days of Musk testimony, in which he claimed he was "duped" into funding what became an $850B company, admitted xAI distills OpenAI's models, and warned AI could "kill us all." The $134B damages claim could force OpenAI to unwind its for-profit restructuring. In a surreal twist, Altman publicly invited Musk to OpenAI's May 5 GPT-5.5 launch event.
Q1 earnings confirmed combined capex from Amazon ($200B), Microsoft ($190B), Alphabet ($190B), and Meta ($125-145B) will exceed $700B this year — up from $410B in 2025. Gary Marcus called it "the greatest capital misallocation in history." Markets remain split: Alphabet and Amazon rallied on strong cloud growth, while Meta slid on spending concerns.
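The capex arithmetic above is easy to check. The sketch below is a back-of-the-envelope calculation using only the figures quoted in this item; the variable and function names are illustrative, and Meta's guidance is treated as a low/high range.

```python
# Capex guidance in billions of USD, as quoted in the item above.
CAPEX_LOW = {"Amazon": 200, "Microsoft": 190, "Alphabet": 190, "Meta": 125}
CAPEX_HIGH = {**CAPEX_LOW, "Meta": 145}  # Meta guided a $125-145B range
PRIOR_YEAR_TOTAL = 410  # combined 2025 capex quoted above, in billions


def total(capex):
    """Sum the per-company capex figures."""
    return sum(capex.values())


def growth_pct(new, old):
    """Year-over-year growth as a percentage."""
    return (new - old) / old * 100


low_total = total(CAPEX_LOW)
high_total = total(CAPEX_HIGH)
low_growth = growth_pct(low_total, PRIOR_YEAR_TOTAL)
high_growth = growth_pct(high_total, PRIOR_YEAR_TOTAL)

print(f"Combined capex: ${low_total}B to ${high_total}B")
print(f"Growth over 2025: {low_growth:.0f}% to {high_growth:.0f}%")
```

On these numbers the combined total lands between $705B and $725B, an increase of roughly 72% to 77% over 2025, so the "exceed $700B" claim holds even at the low end of Meta's range.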
Alphabet booked $28.7B of its record $62.6B quarterly profit from marking up its Anthropic stake — not from selling any product. Amazon added $16.8B from the same source. The trigger was Anthropic's $380B Series G round in February. This accounting phenomenon raises uncomfortable questions about how much of Big Tech's "AI profits" are real operating returns versus circular valuation markups.
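To put the Alphabet figure in proportion, here is a rough calculation using only the two numbers quoted above. It assumes, as a simplification, that the markup flows straight through to reported profit and ignores any tax effect; the names are illustrative.

```python
# Figures in billions of USD, as quoted in the item above.
ALPHABET_QUARTERLY_PROFIT = 62.6
ALPHABET_ANTHROPIC_MARKUP = 28.7

# Share of reported profit that came from revaluing the Anthropic stake.
markup_share_pct = ALPHABET_ANTHROPIC_MARKUP / ALPHABET_QUARTERLY_PROFIT * 100

# Profit with the markup stripped out (simplified: no tax adjustment).
profit_ex_markup = ALPHABET_QUARTERLY_PROFIT - ALPHABET_ANTHROPIC_MARKUP

print(f"Markup share of profit: {markup_share_pct:.1f}%")
print(f"Profit excluding markup: ${profit_ex_markup:.1f}B")
```

Under that simplification, about 45.8% of the record quarter came from the revaluation, leaving roughly $33.9B from everything else.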
Cloudflare and Stripe released an agent provisioning protocol enabling AI agents to autonomously create accounts, buy domains, start paid subscriptions, and deploy applications — no human in the loop. Stripe handles payment tokenization with a $100/month default cap per provider. Vercel, Supabase, PlanetScale, and others are already integrated. This is arguably the most concrete piece of "agentic infrastructure" shipped to date.
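The per-provider spending cap is the main safety rail here, so it is worth seeing how small the underlying check can be. The sketch below is purely hypothetical: it is not the Cloudflare/Stripe protocol or any real API, and `SpendTracker`, `try_charge`, and the provider names are invented for illustration. Only the $100/month default comes from the item above.

```python
from collections import defaultdict

DEFAULT_MONTHLY_CAP_USD = 100.0  # default cap per provider, from the item above


class SpendTracker:
    """Hypothetical per-provider monthly spend ledger (not the real protocol)."""

    def __init__(self, cap_usd=DEFAULT_MONTHLY_CAP_USD):
        self.cap_usd = cap_usd
        # Running totals keyed by (provider, "YYYY-MM").
        self._spent = defaultdict(float)

    def try_charge(self, provider, month, amount_usd):
        """Record the charge and return True only if it stays within the cap."""
        key = (provider, month)
        if amount_usd < 0 or self._spent[key] + amount_usd > self.cap_usd:
            return False  # rejected: negative amount or cap would be exceeded
        self._spent[key] += amount_usd
        return True


tracker = SpendTracker()
first = tracker.try_charge("provider-a", "2026-05", 60.0)       # within cap
second = tracker.try_charge("provider-a", "2026-05", 50.0)      # would total 110
other = tracker.try_charge("provider-b", "2026-05", 50.0)       # separate provider
next_month = tracker.try_charge("provider-a", "2026-06", 50.0)  # new month, new budget
```

The design point is that the cap is scoped per provider and per month, so an agent that exhausts its budget with one provider can still transact with another, and each budget resets with the calendar month.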
Snap cut 16% of its workforce and closed 300 open roles, with CEO Evan Spiegel saying that AI now generates over 65% of new code and lets smaller teams match previous output. The move saves $500M annually and is the most explicit case yet of a major tech company attributing layoffs directly to AI productivity gains rather than to general cost cutting.
Analyst Ming-Chi Kuo reports OpenAI is developing a smartphone with Qualcomm, MediaTek, and Luxshare where AI agents replace traditional apps entirely. The device would maintain continuous real-time context and route tasks through agents instead of app interfaces. Mass production is targeted for 2028, with projected annual shipments of 300-400M units, which would exceed iPhone volumes if achieved. The report remains unconfirmed.