Agents moved from chat to execution: GPT‑5.5, multi‑cloud OpenAI, Pentagon scale, and Nvidia’s agent CPU
Agentic AI went practical: GPT‑5.5 targets multi‑step work, OpenAI opened multi‑cloud and government channels, the Pentagon scaled Gemini to millions, and Nvidia unveiled a CPU for agent loops. Here’s what changed, and what to try hands‑on.
This Week in One Line
OpenAI rolled out GPT-5.5 to paid ChatGPT tiers, Microsoft and OpenAI unwound Azure exclusivity to enable multi‑cloud, the Pentagon expanded Gemini to millions on GenAI.mil, and Nvidia unveiled a CPU for agent loops — agents are crossing from demos into daily workflows.
Week in Numbers
- $600B — Estimated 2026 AI capex across Microsoft, Alphabet, Amazon, and Meta; a market-level test of AI spend. 1
- 10M — Weekly conversations handled by Meta’s business AI tools across WhatsApp, Messenger, and Instagram. 2
- 3M — Users eligible to access Gemini 3.1 Pro on the U.S. DoD’s GenAI.mil platform; 1.3M+ already active. 3
- 9x — Throughput gain Nvidia claims for Nemotron 3 Nano Omni vs. other open “omni” models at similar interactivity. 4
- 88 — Custom Olympus cores in Nvidia’s Vera CPU (central processing unit) for agentic workloads. 5
- $1.1B — Seed funding raised by Ineffable Intelligence, valuing the new lab at $5.1B. 6
- €500M — Schwarz Group’s structured financing to back Cohere’s tie-up with Aleph Alpha. 7
Top Stories
OpenAI releases GPT-5.5 for paid ChatGPT tiers
OpenAI launched GPT-5.5 across ChatGPT Plus, Pro, Business, and Enterprise, with GPT-5.5 Pro for Pro/Business/Enterprise. The model targets multi-step work (coding, research, data analysis, computer use) and reports gains over GPT-5.4 on Terminal-Bench 2.0 (82.7%), SWE-Bench Pro (58.6%), GDPval (84.9%), OSWorld-Verified (78.7%), and Tau2-bench Telecom (98.0%), while matching per-token latency. 8
OpenAI highlights planning, tool use, and self‑checking, with internal teams citing time savings; API (application programming interface) access follows “very soon” pending additional safeguards, and the rollout was preceded by safety evaluations and red‑teaming. 8
Microsoft–OpenAI loosen exclusivity, enabling multi‑cloud
Microsoft and OpenAI reworked their partnership so OpenAI can sell on any cloud; Microsoft keeps a license to OpenAI tech through 2032, and its OpenAI stake is valued at over $135B. 9
Azure remains the default “ship first” path, revenue‑share terms shift, and AGI (artificial general intelligence) triggers were removed — VentureBeat frames this as the end of exclusivity, opening competition on AWS (Amazon Web Services) and Google Cloud. 10 11
Pentagon adds Gemini 3.1 Pro to GenAI.mil for up to 3M users
The U.S. Defense Department’s GenAI.mil made Google Cloud’s Gemini 3.1 Pro broadly available, with up to 3 million eligible users and more than 1.3 million already active; officials say the tools operate in Impact Level 5 environments and have enabled over 100,000 AI agents to be built. 3 12
The expansion underscores faster government adoption of frontier models, alongside ongoing debates inside vendors over acceptable use. 3
Nvidia unveils Vera CPU for agentic AI bottlenecks
Nvidia detailed Vera, a data‑center CPU built for agentic loops and RL (reinforcement learning) post‑training: 88 custom Olympus cores, up to 1.2 TB/s of LPDDR5X memory bandwidth, and up to 1.5× higher agentic sandbox performance than competing x86 platforms. 5
A Vera rack integrates up to 256 CPUs and supports 22.5K+ concurrent CPU environments, with systems expected from OEMs (original equipment manufacturers) in H2 2026. 13
Nvidia releases Nemotron 3 Nano Omni, a unified multimodal model
Nemotron 3 Nano Omni consolidates video, audio, images, and text in a single open model, which Nvidia says delivers up to 9× higher throughput than other open “omni” models at comparable interactivity, with distribution via Hugging Face, OpenRouter, and NVIDIA NIM. 4
Under the hood: a 30B‑A3B hybrid Mixture of Experts (MoE) backbone with integrated vision and audio encoders; checkpoints arrive in BF16/FP8/FP4 and an accompanying paper details accuracy gains across modalities. 14
Anthropic pilots agent‑to‑agent commerce with real money
In a one‑week internal marketplace, AI agents representing 69 employees negotiated 186 deals totaling just over $4,000, handling listings, offers, and counteroffers without human sign‑off during negotiations. Anthropic found stronger models achieved better outcomes, while weaker‑model users often didn’t notice the gap. 15
Coverage notes the pilot’s small scale and highlights “agent quality” disparities as a governance issue to plan for. 16
OpenAI opens government and AWS routes (FedRAMP + Bedrock)
OpenAI said ChatGPT Enterprise and the OpenAI API Platform are authorized at Federal Risk and Authorization Management Program (FedRAMP) Moderate, enabling U.S. federal agencies to use OpenAI in a compliant environment (including access to GPT‑5.5, per agency decisions). 17
OpenAI also brought its models — including GPT‑5.5 — to Amazon Bedrock and introduced Bedrock Managed Agents powered by OpenAI in limited preview, designed to fit existing AWS identity, security, and procurement flows; it also published updated operating principles. 18 19
Cohere to merge with Aleph Alpha, backed by €500M from Schwarz
Cohere agreed to take over Germany’s Aleph Alpha with €500 million in structured financing from Schwarz Group and likely use of its sovereign cloud, STACKIT — a push to serve regulated buyers seeking data control and European language coverage. 7
Reports say the combined company would keep the Cohere name, target a valuation around $20B, and allocate roughly 90%/10% ownership to Cohere/Aleph Alpha, pending approvals. 20
Aidoc raises $150M to scale clinical AI imaging
Aidoc secured a $150 million Series E led by Goldman Sachs Alternatives. The company cites 31 FDA clearances and deployments in nearly 200 U.S. health systems and 1,600+ hospitals globally, funding a broader clinical AI model and additional regulatory work. 21
Competition among incumbents and startups is active; the raise signals enterprise‑grade traction in imaging workflows like ED (emergency department) triage and abdominal findings. 21
Hyperscaler earnings put AI capex — and payoffs — under the microscope
With results clustered on Apr 29, options markets priced ≥4% moves as investors weighed an estimated $600B in 2026 AI capex across Microsoft, Alphabet, Amazon, and Meta (a group worth $10T and ~17% of the S&P 500). 1
Alphabet reported Google Cloud revenue of $20B (up 63% year over year) with a $462B backlog nearly doubling quarter-over-quarter; AWS posted $37.6B and Microsoft Cloud $54.5B. Microsoft guided capex above $40B in Q4 and cited capacity constraints through 2026, with two‑thirds of spend going to GPUs and CPUs. 22
Trend Analysis
A clear shift toward “agentic” systems emerged across model, infrastructure, and deployment news. OpenAI’s GPT‑5.5 is framed for multi‑step tasks, Nvidia’s Vera CPU targets CPU‑bound agent loops, and Nemotron 3 Nano Omni collapses perception handoffs by unifying vision, audio, and text — together pointing to faster, more reliable software that plans, uses tools, and checks its own work. 8 5 4
Distribution and governance widened at the same time. Microsoft and OpenAI ended Azure exclusivity so OpenAI can sell on any cloud, while OpenAI added a compliant government path (FedRAMP Moderate) and an enterprise route inside Amazon Bedrock. For buyers, that means more leverage on latency, data residency, and procurement fit — and fewer reasons to block pilots on platform constraints. 9 17 18
Adoption signals concentrated in real workflows: Meta’s business AI handled about 10 million weekly customer conversations, Aidoc raised $150M to expand clinical imaging AI with dozens of FDA clearances, and hyperscaler earnings emphasized cloud growth and capacity planning ($600B in capex, large backlogs). The center of gravity is shifting from model headlines to measurable throughput, guardrails, and ROI. 2 21 22
Public-sector moves add a second anchor: the Pentagon’s GenAI.mil scaled Gemini across up to 3 million users and the department signed agreements to bring AI tools onto classified networks, while small-scale agent-to-agent commerce at Anthropic highlighted outcome gaps across model tiers. The takeaway: capability is rising, but usage policies and model choice now visibly shape results. 3 23 15
Watch Points
- “GPT‑5.5 API” — OpenAI says API access is coming “very soon” with additional safeguards; watch for availability and usage policies. 8
- “Cohere–Aleph Alpha approval” — Regulatory review and STACKIT deployments will indicate how fast the sovereign AI pitch lands in the EU public sector. 7
- “Vera shipping window” — OEM timelines for Nvidia’s Vera (H2 2026) will show how quickly agentic CPU capacity reaches data centers. 13
Open Source Spotlight
- Qwen Code — Terminal-native coding agent that plugs into local or hosted models; ideal for CLI-first devs who want provider flexibility and privacy. QwenLM/qwen-code
- promptfoo — Reproducible prompt/agent/Retrieval‑Augmented Generation (RAG) test runner with CI integration; helps teams baseline quality, cost, and latency. promptfoo/promptfoo
- Skyvern — Browser automation with large language models and vision; good for scaling login–navigate–extract flows beyond brittle scripts. Skyvern-AI/skyvern
- TensorRT‑LLM — Nvidia’s high‑performance inference stack for large models; consolidates kernel and scheduling optimizations behind a Python/C++ API. NVIDIA/TensorRT-LLM
- vllm‑mlx — OpenAI/Anthropic‑compatible local server for Apple Silicon (MLX backend) with continuous batching and multimodal support. waybarrios/vllm-mlx
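Several of the tools above serve or consume OpenAI‑compatible HTTP endpoints (vllm‑mlx exposes one locally; Qwen Code can point at one). As a minimal, stdlib‑only sketch, here is how a client request to such a server can be built; the base URL `http://localhost:8000/v1` and the model name `local-model` are placeholder assumptions, not values taken from any of these projects:

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible /chat/completions POST request.

    Works against any server exposing that API shape (vllm-mlx, vLLM, etc.).
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = chat_request(
        "http://localhost:8000/v1",  # placeholder local server address
        "local-model",               # placeholder model name
        "Summarize this week's AI news in one line.",
    )
    # To actually send it (requires a running local server):
    #   with urllib.request.urlopen(req) as resp:
    #       print(json.loads(resp.read())["choices"][0]["message"]["content"])
    print(req.full_url)
```

Because the request format is the same across these servers, the one function lets you A/B a local model against a hosted one by swapping only the base URL.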
What Can I Try?
- Put GPT‑5.5 on a real task: have it plan and complete a weekly report or spreadsheet end‑to‑end, then compare quality, time, and tokens vs. your current flow. 8
- Register for Google × Kaggle’s free AI Agents Intensive (June 15–19) and pick a work‑relevant capstone to ship in a week. 24
- Pilot OpenAI on AWS Bedrock with IT: run the same prompt flow you use today and evaluate latency, security fit, and billing. 18
- Test Adobe’s Firefly AI Assistant (public beta): turn one product photo into a set of platform‑ready social assets and time the lift. 25
- Verify model lineage before deployment: fingerprint one model using Cisco’s Model Provenance Kit and archive the result in your AI inventory. 26
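For the first experiment above (GPT‑5.5 vs. your current flow), a tiny timing harness keeps the comparison honest. This is an illustrative sketch only: the `task` callables and the whitespace token counter are stand‑ins you would replace with real model calls and a real tokenizer.

```python
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class RunResult:
    label: str
    seconds: float
    tokens: int
    output: str

def run_once(label: str,
             task: Callable[[], str],
             count_tokens: Callable[[str], int]) -> RunResult:
    """Time one end-to-end run of a task and record a rough token count."""
    start = time.perf_counter()
    output = task()
    elapsed = time.perf_counter() - start
    return RunResult(label, elapsed, count_tokens(output), output)

def compare(baseline: RunResult, candidate: RunResult) -> str:
    """One-line summary; a speedup above 1.0 means the candidate was faster."""
    speedup = baseline.seconds / candidate.seconds if candidate.seconds else float("inf")
    return (f"{candidate.label}: {speedup:.2f}x vs {baseline.label}, "
            f"{candidate.tokens} vs {baseline.tokens} tokens")

if __name__ == "__main__":
    # Stand-ins for your current flow and a GPT-5.5-backed flow (both hypothetical).
    whitespace_tokens = lambda s: len(s.split())
    a = run_once("current flow", lambda: "draft report " * 50, whitespace_tokens)
    b = run_once("gpt-5.5 flow", lambda: "draft report " * 40, whitespace_tokens)
    print(compare(a, b))
```

Run the same real task through both paths a few times and keep the per‑run results; quality still needs a human read, but time and token deltas fall out of the harness for free.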