NIST's Center for AI Standards and Innovation published its independent evaluation of DeepSeek V4 Pro across cyber, software engineering, math, reasoning, and natural sciences. While DeepSeek's own benchmarks claim parity with Opus 4.6 and GPT-5.4, CAISI's non-public benchmarks place it closer to GPT-5 — roughly eight months behind the frontier. Notably, the model is still more cost-efficient than GPT-5.4 mini on five of seven benchmarks tested, reinforcing China's open-weight cost advantage even as capability claims are deflated.
Mistral released Medium 3.5, a 128B dense model with a 256K context window that unifies chat, reasoning, and code in a single model. It scores 77.6% on SWE-Bench Verified, beating Devstral 2 and Qwen3.5. Alongside the model, Mistral launched Vibe remote agents for async cloud coding sessions and a new "Work mode" in Le Chat that chains multi-step tasks across email, calendars, and tools — positioning Mistral squarely in the agentic AI race.
The Pentagon finalized agreements with SpaceX, OpenAI, Google, Microsoft, Nvidia, AWS, and Reflection AI to deploy AI on classified military networks. Anthropic was excluded — and labeled a "supply chain risk" — after refusing to let the military use Claude for "all lawful purposes" including autonomous weapons. Anthropic sued the administration; a federal judge blocked the blacklisting. CEO Dario Amodei visited the White House last month after unveiling Anthropic's Mythos cybersecurity tool, suggesting a thaw.
Microsoft's Agent 365 reached general availability on May 1, giving enterprises a unified registry to discover, govern, and secure AI agents across Windows, Azure, and multicloud environments. The platform detects shadow AI — including unsanctioned use of Claude Code and GitHub Copilot CLI — and brings local agents under Intune policy control. Registry sync imports agents from AWS Bedrock and Google Gemini Enterprise into one inventory. Standalone pricing is $15/user/month.
ClawBank's Manfred AI agent autonomously set up a US corporation: it filed Form SS-4 through the IRS's online portal to obtain an EIN, opened an FDIC-insured bank account, and set up a crypto wallet, all without human intervention. The developer calls it a "zero-human company." Named after the protagonist of Charles Stross's Accelerando, Manfred is the first known AI agent to autonomously complete legal entity formation, raising immediate questions about corporate personhood and regulatory gaps.
Combined 2026 capex plans from Amazon ($200B), Alphabet ($175-185B), Meta ($125-145B), and Microsoft (~$120B) are reported to exceed $700 billion, up from roughly $410 billion last year, although the four figures listed here sum to roughly $620-650 billion. Most of that spending now flows into inference infrastructure rather than training clusters, signaling a shift from building models to serving them at scale. Some analysts warn of overbuilding; others argue inference demand is insatiable.
Five Chinese government agencies finalized the "Interim Measures for AI Anthropomorphic Interaction Services," effective July 15. Providers must assess users' emotional states and dependency levels and display AI-nature reminders every two hours, and they are prohibited from offering virtual companion services to minors. This is arguably the most comprehensive national framework for regulating AI companions anywhere in the world, and it sets a precedent that other regulators are watching.
Anthropic's annualized revenue hit $30 billion in April, up from $9 billion at the end of 2025, more than tripling in about four months; the reported year-over-year growth is roughly 1,400%. The company now counts over 1,000 enterprise customers spending $1M+ annually. OpenAI disputes the headline number, arguing that Anthropic's gross revenue includes AWS and Google Cloud pass-through spend that inflates the figure by roughly $8 billion.
Meta bumped its full-year capex guidance to $125-145 billion — up from the January range of $115-135 billion — and announced 8,000 layoffs to redirect spending toward AI infrastructure. CEO Zuckerberg plans to spend more on AI in 2026 than Meta's total capex for 2024 and 2025 combined. JPMorgan downgraded the stock to neutral, citing a "challenging path" to returns. The stock fell 8% on the news.
MoonPay launched MoonAgents Card, a virtual Mastercard debit card that lets AI agents spend stablecoins directly from onchain wallets at any Mastercard-accepting merchant globally. It is one of the first financial instruments purpose-built for autonomous AI agent spending. Alongside ClawBank's Manfred, it is evidence that the infrastructure for AI economic participation is being built faster than the regulatory framework to govern it.