Disclosure: I previously worked at CoreWeave and Google. AI Briefly contributors may hold positions in companies mentioned. Not investment advice.
The Short Version
- KAIROS is a daemon that wakes every 15 seconds and consolidates memory overnight. Every Claude Code seat becomes a persistent compute process, not a session. Rollout is gated for May 2026, employees first.
- The infrastructure stack was priced for bursty, statistically multiplexable inference. Daemons are neither. Even a mostly-sleeping daemon forces capacity planning from overcommit to reserved guarantee - different business, different margins.
- Cost range per seat: 1.3x at the floor (daemon mostly idle), 20x at the ceiling (high-agency mode running hot during work hours). The base case scenarios between those bounds are what nobody has modeled yet.
- Agentic workloads are CPU-bottlenecked. Georgia Tech and Intel measured it - 90.6% of agent latency sits on CPU-side tool processing (arXiv:2511.00739). Nvidia shipped a dedicated agentic CPU at GTC. Neoclouds built GPU farms.
- Anthropic isn't building a chip. It's co-designing Trainium with Annapurna, bought 400K TPUv7s outright from Broadcom, and keeps Nvidia as the third leg. Better play than a custom ASIC at $19B ARR.
- CoreWeave has $14B in debt tied to the wrong hardware mix.
- The agent runtime layer - the operating system for persistent AI - doesn't exist yet. Hyperscalers will get there. The window for someone else to own it is 12-24 months.
- The request-response inference model that the entire infrastructure stack was sized and priced against is ending.
How to Position
Long: Anthropic (pre-IPO, $19B ARR, daemon-first architecture). Nvidia Vera CPU and AMD EPYC (CPU bottleneck reprices hardware mix). Nebius (closest to owning the agent runtime layer).
Short / caution: GPU-only rental neoclouds without software layers. CoreWeave at current multiples if BMaaS margins stay at 14-16%. Any company whose unit economics are built on current subsidized API rates.