AI News PM — 4/5/2026 — Kilroy's Daily Briefings

Top stories, ranked by relevance.

#1 DeepSeek V4 Launch Imminent — 1T Parameters, Open Weights

China's DeepSeek is days away from releasing V4, a roughly 1-trillion-parameter Mixture-of-Experts model that activates only 37B parameters per token and offers a 1M-token context window. Leaked benchmarks claim 80%+ on SWE-bench Verified. Apache 2.0 weights are planned, and if those numbers hold up under independent testing, V4 would be another direct hit to US frontier model pricing power.

Source: [Dataconomy](https://dataconomy.com/2026/03/16/deepseek-v4-and-tencents-new-hunyuan-model-to-launch-in-april/) | [NxCode](https://www.nxcode.io/resources/news/deepseek-v4-release-specs-benchmarks-2026)
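
For a sense of scale, the reported DeepSeek V4 figures imply that only a small share of the model runs for any given token. The sketch below is plain arithmetic on the two numbers quoted in the story; the function name is illustrative, and both figures are unconfirmed leaks.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of a Mixture-of-Experts model's parameters used per token."""
    return active_params / total_params


# Figures as reported in the story (approximate and unconfirmed).
TOTAL_PARAMS = 1e12    # ~1 trillion total parameters
ACTIVE_PARAMS = 37e9   # 37B active parameters per token

share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share per token: {share:.1%}")  # prints "Active share per token: 3.7%"
```

Per-token compute in a Mixture-of-Experts model tracks the active parameter count rather than the total, which is why a model of this size could plausibly be priced aggressively.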

---

#2 Google's TurboQuant Cuts LLM Memory 6x — No Accuracy Loss

Google Research's TurboQuant compresses the KV cache of large language models to 3 bits per value, delivering a 6x memory reduction and 8x attention speedup on H100 GPUs — with no measurable accuracy degradation. The paper is heading to ICLR 2026 in late April, and developers are already building independent implementations from the math alone.

Source: [Google Research](https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/) | [The Register](https://www.theregister.com/2026/04/01/googles_turboquant_reality/)
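
To see why bits per value matter so much for TurboQuant's target, here is a rough KV cache size estimate. This is a minimal sketch and not TurboQuant itself: the model shape (layers, KV heads, head dimension, context length) is hypothetical, the 16-bit baseline is an assumption, and any per-block quantization metadata is ignored. Under those assumptions the raw bit-width ratio works out to about 5.3x, so the 6x figure in the story presumably rests on a baseline or accounting not detailed here.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bits_per_value: float) -> float:
    """Approximate KV cache size: one key and one value tensor per layer."""
    values = 2 * layers * kv_heads * head_dim * tokens
    return values * bits_per_value / 8


# Hypothetical model shape, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM, TOKENS = 32, 8, 128, 128_000

baseline = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, TOKENS, bits_per_value=16)
compressed = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, TOKENS, bits_per_value=3)
ratio = baseline / compressed

print(f"16-bit cache: {baseline / 2**30:.2f} GiB")
print(f"3-bit cache:  {compressed / 2**30:.2f} GiB")
print(f"Reduction:    {ratio:.2f}x")
```

The cache grows linearly with context length, so the saving matters most for long-context serving.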

---

#3 Nvidia Agent Toolkit Goes Live — Adobe, Salesforce, SAP Among 17 Adopters

Nvidia's open-source Agent Toolkit, unveiled at GTC 2026, is now available on build.nvidia.com and supported across AWS, Azure, GCP, and Oracle Cloud. The platform bundles OpenShell (policy-based safety guardrails), Nemotron reasoning models, and cuOpt optimization. With 17 enterprise software giants already committed, Nvidia is positioning itself as the rails for the agentic era.

Source: [VentureBeat](https://venturebeat.com/technology/nvidia-launches-enterprise-ai-agent-platform-with-adobe-salesforce-sap-among) | [NVIDIA Newsroom](https://nvidianews.nvidia.com/news/ai-agents)

---

#4 AI Made a $45M Crypto Breach Look Easy — Ledger CTO Sounds Alarm

Ledger CTO Charles Guillemet warned today that AI is obliterating the cost asymmetry that kept crypto systems secure. Exploit chains that once took skilled researchers weeks now take hours. Earlier this year, a solo operator used Claude to breach the systems of 20 Mexican government agencies in under 72 hours. Vulnerabilities in autonomous AI trading agents have triggered over $45M in crypto losses so far in 2026.

Source: [CoinDesk](https://www.coindesk.com/tech/2026/04/05/ai-is-making-crypto-s-security-problem-even-worse-ledger-cto-warns) | [KuCoin](https://www.kucoin.com/blog/en-ai-trading-agent-vulnerability-2026-how-a-45m-crypto-security-breach-exposed-protocol-risks)

---

#5 Meta Rolls Out MTIA 400 Chips — 72-Chip Racks, No Nvidia Required

Meta's in-house MTIA 400 inference chip has cleared testing and is rolling out in data center racks of 72 chips each. The MTIA roadmap spans four generations (300–500), with compute FLOPs scaling 25x from the first generation to the last. Meta is methodically cutting its Nvidia dependency for inference, the most cost-sensitive part of running AI at social-media scale.

Source: [CNBC](https://www.cnbc.com/2026/03/11/meta-ai-mtia-chip-data-center.html) | [Tom's Hardware](https://www.tomshardware.com/tech-industry/semiconductors/meta-reveals-four-new-mtia-chips-built-for-ai-inference)

---

#6 Goldman Sachs: AI Demand to Drive Sharp Semiconductor Revenue Surge

Goldman Sachs published analysis today projecting a sharp jump in global semiconductor revenues driven by AI infrastructure buildout. Enterprise data backs it up: generative AI tools are showing a 23–33% productivity uplift across studies and company self-reports. Capital continues to flow toward the picks-and-shovels layer.

Source: [ANI News](https://www.aninews.in/news/business/ai-led-demand-to-drive-sharp-surge-in-semiconductor-revenues-goldman-sachs20260405145721/)

---

#7 Vibe Coding Goes Mainstream — Bloomberg Asks if the Hype Is Warranted

Bloomberg today tackled the "vibe coding" phenomenon: AI-assisted development where engineers describe intent and let models write the implementation. The piece explores whether the productivity gains are real or just the next tech FOMO cycle. Given the 23–33% productivity uplift that studies and company self-reports attribute to generative AI tools (see #6), this one may actually have teeth.

Source: [Bloomberg](https://www.bloomberg.com/news/newsletters/2026-04-05/what-is-vibe-coding-the-ai-trend-fueling-a-new-kind-of-fomo)

---

**Big Picture**

Today's news traces a single through-line: the cost of AI capability is collapsing, and the downstream effects are hitting everywhere at once. DeepSeek V4 threatens frontier pricing. TurboQuant slashes inference overhead. Meta is building its way out of Nvidia dependency for inference. Productivity gains are becoming measurable rather than theoretical. And the same cost collapse that benefits builders is handing attackers a free upgrade: the Ledger story previews a structural security problem that the industry has no clear answer to yet. The "pilot era" for enterprise AI is ending; the scaling-and-hardening era is beginning.
