OpenAI cash, AWS pact, Nvidia’s Vera CPU, and faster ChatGPT anchor an agent-first week
Agent platforms and guardrails shipped, ChatGPT got faster by default, Nvidia targeted agent bottlenecks with a new CPU, and OpenAI locked in massive funding and AWS capacity — a week about taking agents from pilot to production.
This Week in One Line
OpenAI raised $122B and signed a 2 GW AWS Trainium pact; Nvidia unveiled a Vera CPU for agent loops; Google shipped an enterprise agent platform; ChatGPT switched to GPT-5.5 Instant — faster, more governed agents are moving into everyday work.
Week in Numbers
- $122B — OpenAI’s new funding round to “accelerate the next phase of AI,” at an $852B valuation. 1
- 88 cores — Nvidia’s Vera CPU packs 88 Arm-based Olympus cores to speed non‑GPU agent work. 2
- 2 GW — Trainium capacity OpenAI agreed to consume under its expanded AWS partnership. 3
- 6.5× — Download lift Appfigures attributes to image‑model launches versus chatbot upgrades. 4
- 220,000 GPUs — Compute at SpaceX’s Colossus 1 that Anthropic says it can tap, tied to higher Claude limits. 5
- €1B+ — SAP’s four‑year investment as it acquires Prior Labs to bring tabular AI into core products. 6
- 1,000,000 tokens — Context window in Nvidia’s Nemotron 3 Super for long, agentic tasks. 7
Top Stories
OpenAI raises $122B, outlines an agent‑first “superapp” and multi‑cloud stack
OpenAI announced a $122B round at an $852B valuation to “accelerate the next phase of AI,” positioning compute and product integration as core advantages. The company detailed a broad infrastructure strategy (multiple clouds and chip suppliers, plus an in‑house chip co‑designed with Broadcom) and a plan to unify ChatGPT, coding, search, and browsing into a single agent‑first surface. It also reported scale figures: 900M weekly active users, 50M+ subscribers, $2B in monthly revenue, APIs processing 15B tokens per minute, and an ads pilot topping $100M in annual recurring revenue (ARR). For non‑specialists, this signals AI that does multi‑step work, not just chat. 1
OpenAI and Amazon: 2 GW Trainium, Bedrock integration, and a stateful runtime
OpenAI and Amazon agreed to co‑create a “Stateful Runtime Environment” delivered via Amazon Bedrock and made AWS the exclusive third‑party cloud distributor for OpenAI Frontier. The deal includes about 2 gigawatts of AWS Trainium capacity, expands a prior $38B agreement by $100B over eight years, and adds a $50B Amazon investment ($15B upfront plus $35B contingent). For AWS shops, this points toward OpenAI agents aligned with Bedrock permissions and monitoring. 3
Nvidia unveils Vera CPU to clear agent bottlenecks
Vera is a data center CPU built for the “glue” around large language models (LLMs): code runs, tool calls, and orchestration loops. It features 88 Arm‑based Olympus cores, up to 1.2 TB/s memory bandwidth, and up to 1.8 TB/s coherent bandwidth to Rubin GPUs over NVLink‑C2C; Nvidia cites up to 50% faster “sandbox” performance than traditional CPUs. Racks can integrate up to 256 liquid‑cooled Vera CPUs to support 22,500+ concurrent CPU environments, promising steadier latency in agent workflows. 8 2
ChatGPT switches default to GPT‑5.5 Instant
OpenAI made GPT‑5.5 Instant the default ChatGPT model, emphasizing quicker answers with fewer sensitive‑topic mistakes and surfacing “memory sources” so users can view and edit the context powering replies. Reports cite higher scores versus GPT‑5.3 Instant (e.g., AIME 2025: 81.2 vs. 65.4) and latency on par with GPT‑5.4. In the API, GPT‑5.5 appears as “chat‑latest,” while paid users can still select GPT‑5.3 for three months to compare behavior. 9 10
Google launches Gemini Enterprise Agent Platform for workplace agents
Google Cloud introduced the Gemini Enterprise Agent Platform (evolving Vertex AI) to build, scale, govern, and monitor AI agents for employees. It bundles creation tools (Agent Studio, an upgraded Agent Development Kit), long‑running Agent Runtime with Memory Bank, identity/registry/gateway for governance, and quality controls via simulation, evaluation, and observability, with access to 200+ models in Model Garden. The message: standardized tooling so IT can manage while business users build. 11
Gemini File Search adds multimodal retrieval with page‑level citations
Google expanded the Gemini API’s File Search so apps can retrieve across text and images with custom metadata filters and return page‑level citations — a direct boost to Retrieval‑Augmented Generation (RAG) workflows that need verifiable sources. The feature is powered by Gemini Embedding 2, and developers have reported measured gains (e.g., +40% Recall@1 in one use case and 60%→~87% Match@20 in another), with support for mixed inputs per request. For teams, this means answers that point to the exact page and screenshot. 12 13
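The page‑level‑citation pattern itself is independent of any particular API: retrieve passages that carry document and page metadata, then attach that metadata to the answer. A minimal toy sketch (keyword overlap stands in for an embedding model; all names and the sample corpus are illustrative, not the Gemini File Search API):

```python
from dataclasses import dataclass

@dataclass
class Passage:
    doc: str   # source document name
    page: int  # page the passage came from
    text: str

def retrieve(query: str, passages: list[Passage], k: int = 2) -> list[Passage]:
    # Toy retriever: rank passages by overlap between query terms and passage terms.
    terms = set(query.lower().split())
    return sorted(
        passages,
        key=lambda p: len(terms & set(p.text.lower().split())),
        reverse=True,
    )[:k]

def answer_with_citations(query: str, passages: list[Passage]) -> str:
    # Attach the document name and page number of every retrieved passage.
    hits = retrieve(query, passages)
    cites = "; ".join(f"{p.doc}, p. {p.page}" for p in hits)
    return f"{hits[0].text} [{cites}]"

corpus = [
    Passage("handbook.pdf", 3, "Expense reports are due by the fifth business day."),
    Passage("handbook.pdf", 9, "Remote work requires manager approval."),
]
print(answer_with_citations("When are expense reports due?", corpus))
```

The point is the shape of the output, not the retrieval quality: because metadata travels with each passage, every answer can name the exact page a reviewer should check.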
Microsoft open‑sources an Agent Governance Toolkit
Microsoft released an Agent Governance Toolkit with policy enforcement, zero‑trust identity, execution sandboxing, and reliability engineering patterns mapped to the OWASP Agentic Top 10 risks. The active repo (v3.4.0 on May 5) includes documentation and a quick start, plus recent fixes to reduce false‑positive contributor‑risk flags. For teams piloting tool‑calling agents, it’s a packaged baseline for guardrails. 14
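The policy‑gate‑plus‑sandbox idea can be sketched generically: a gate checks every tool call against policy before execution, and only callables in a registry can run at all. This is an illustrative pattern under assumed names, not the toolkit’s actual API:

```python
# Illustrative guardrail pattern (names are hypothetical, not the
# Microsoft toolkit's API): an allowlist decides whether an agent's
# tool call may run; everything else is denied before execution.
ALLOWED_TOOLS = {"search_docs", "read_file"}

def policy_gate(tool: str) -> None:
    # Policy enforcement point: raise before any side effect occurs.
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"policy denied tool call: {tool}")

def run_tool(tool: str, args: dict, registry: dict) -> str:
    policy_gate(tool)                # 1. check policy first
    return registry[tool](**args)    # 2. only registered callables execute

registry = {"search_docs": lambda query: f"results for {query!r}"}
print(run_tool("search_docs", {"query": "agent guardrails"}, registry))
```

A real deployment would add identity checks and process‑level sandboxing, but the ordering is the essential part: the gate sees the call before the tool does.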
Anthropic secures SpaceX Colossus compute, raises usage limits
Anthropic announced access to the full Colossus 1 data center — over 300 MW and more than 220,000 Nvidia GPUs — alongside higher rate limits for Claude Pro/Max and API tiers. The company framed it as part of a multi‑gigawatt capacity build with commitments across Amazon, Google/Broadcom, Azure, and others. More headroom translates to longer coding and analysis sessions without reworking workflows. 5
Nvidia releases Nemotron 3 Super (and Nano Omni) for agentic AI
Nemotron 3 Super is an open‑weight model (120B parameters, 12B active) tuned for multi‑agent planning and tool use, with a 1‑million‑token context window to keep long tasks on track; Nvidia cites up to 5× higher throughput versus the prior Nemotron Super. A companion open multimodal model, Nemotron 3 Nano Omni, integrates vision and audio encoders to reduce hand‑off latency and claims up to 9× higher throughput than other open “omni” models at similar interactivity. Open weights and NIM packaging broaden deployment choices from local to cloud. 7 15
Apple reportedly tests model choice in iOS 27
Apple is said to be adding an “Extensions” setting that lets users pick which third‑party AI model powers features like Siri and Writing Tools on iOS 27 (and reportedly on iPadOS/macOS 27). Early tests include integrations from Google and Anthropic, signaling a user‑centric, multi‑model experience inside native apps. That could let people match specific tasks to model strengths without app‑switching. 16 17
Trend Analysis
Agent software hardened this week: platforms and guardrails shipped alongside verifiable retrieval. Google’s Gemini Enterprise Agent Platform formalizes build‑to‑governance workflows; Microsoft’s Agent Governance Toolkit wraps policy, identity, and sandboxes; Google’s File Search adds page‑level citations, moving RAG (Retrieval‑Augmented Generation) from “answers” to audit‑ready evidence. For teams, that’s a cleaner path from pilot to production with traceable outputs. 11 14 12
Infrastructure also tilted toward agent bottlenecks and long horizons. Nvidia’s Vera CPU targets the CPU‑bound steps that stall agents between LLM calls; OpenAI’s 2 GW AWS deal and Bedrock‑based stateful runtime point to persistent state and cloud alignment; Nvidia’s Nemotron 3 Super extends context to 1,000,000 tokens so agents can keep full task state without re‑prompt churn. The through‑line is shorter waits and fewer fragile hand‑offs. 8 3 7
Capital and capacity concentrated further around enterprise deployment. OpenAI’s $122B raise underscores a push to turn research into durable products; Anthropic’s access to 220,000 GPUs at SpaceX Colossus and broader multi‑GW commitments expand immediate headroom; pairing with AWS Bedrock suggests smoother procurement and governance for agent stacks in existing enterprise clouds. The signal: more AI will arrive inside systems companies already use. 1 5 3
User experience kept pace: ChatGPT moved to GPT‑5.5 Instant with editable “memory sources,” Google’s file‑grounded answers now cite exact pages, and Apple is reportedly testing per‑feature model choice. Together, these shifts imply assistants that are faster, more transparent, and easier to fit to task and tone inside everyday apps. 9 12 16
Watch Points
- “Stateful Runtime Environment” — If you see previews or docs, that’s OpenAI and AWS bringing persistent agent state into Bedrock, a likely on‑ramp for enterprise pilots. 3
- “iOS 27 Extensions” — Apple’s reported per‑feature model picker would normalize multi‑model use inside default apps; watch which providers get listed. 16
- “Dec 2, 2027 (EU high‑risk AI)” — A provisional EU deal delays some high‑risk obligations and adds an intimate‑deepfake ban taking effect Dec 2, 2027; expect compliance playbooks to adjust. 18
Open Source Spotlight
- Agent Governance Toolkit — Policy gates, zero‑trust identity, sandboxing, and reliability patterns for autonomous agents; a practical baseline for governance pilots. microsoft/agent-governance-toolkit
- Solace Agent Mesh — Event‑driven orchestration for multi‑agent systems that must react to real‑world messages and coordinate across business apps; good for integration‑heavy teams. SolaceLabs/solace-agent-mesh
- Ollama — One‑stop local runner for open‑weight models with installers, Docker image, and official Python/JS clients; handy for quick local prototypes. ollama/ollama
- Pi (agent harness) — Unified LLM API, coding‑agent CLI, TUI/web UIs, Slack bot, and vLLM deployment scaffolding; useful to scaffold coding agents across surfaces. earendil-works/pi
- ColorBench — Vision‑language benchmark focused on color perception and robustness; helpful for testing where visual illusions trip models. tianyi-lab/ColorBench
What Can I Try?
- Require cited answers: upload a long PDF and relevant screenshots to Gemini File Search, then ask 5 work questions and verify page‑level citations and visuals. 12
- Add a guardrail: run Microsoft’s Agent Governance Toolkit quick start, create one policy gate, and sandbox a tool call in a test workflow. 14
- Re‑run a task in ChatGPT: use GPT‑5.5 Instant for a recent brief or spreadsheet and audit “memory sources,” correcting anything stale. 9
- Draft an AWS pilot: write a one‑pager on a “Stateful Runtime” use case (inputs, permissions, success metric) and ask your AWS account owner about timing and access. 3
- Replace polling: set up Gemini API event‑driven webhooks for a long‑running job and verify signed deliveries with automatic retries. 19
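Verifying signed webhook deliveries follows a standard HMAC pattern regardless of provider. A minimal sketch, assuming an HMAC‑SHA256 hex signature computed over the raw request body (the header name and secret format vary by provider and are not taken from Gemini documentation):

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw body and compare in constant
    # time to avoid timing side channels.
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

secret = b"whsec_example"  # shared secret from your webhook config (illustrative)
body = b'{"job": "123", "status": "done"}'

# What a sender would attach to the delivery:
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()

assert verify_signature(secret, body, sig)                    # genuine delivery
assert not verify_signature(secret, b'{"tampered": 1}', sig)  # modified body fails
```

Verify against the raw bytes as received, before any JSON parsing or re‑serialization, since whitespace changes alter the digest; retried deliveries should carry a fresh signature over the same body.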