AWS–OpenAI pact, Meta’s 1GW chips, and Claude 4.7 push agents from chat to action
AWS becomes OpenAI’s go‑to distributor, Meta books 1+ gigawatt of custom AI chips, Anthropic upgrades Claude for tougher coding, and Chrome’s AI Mode goes split‑screen. The net: agents inch closer to doing real work, not just chat.
This Week in One Line
OpenAI and Amazon formalized a Bedrock partnership with a multi‑year, $50B plan, Meta locked in 1+ gigawatt of custom AI chips with Broadcom through 2029, Anthropic shipped Claude Opus 4.7 for tougher coding work, and Chrome’s AI Mode went split‑screen — pushing AI from chat toward getting real tasks done.
Week in Numbers
- $50B — Amazon’s planned multi‑year investment tied to its OpenAI partnership, alongside exclusive third‑party cloud distribution of Frontier on Bedrock. 3
- >1 GW — Compute Meta committed to its next‑gen custom accelerators with Broadcom, the first phase of a multi‑gigawatt rollout. 1
- 1M tokens — GPT‑5.4’s native context window for long‑horizon planning and computer use. 1
- 64.3% — Claude Opus 4.7’s score on SWE‑bench Pro (software‑engineering benchmark), leading generally available peers. 2
- $19.50 — Price per 1M image output tokens for Microsoft’s MAI‑Image‑2‑Efficient (about 41% lower than MAI‑Image‑2). 10
- $5B — Accel’s new funds, including $4B for Leaders Fund V, to back ~20 late‑stage AI‑heavy bets (roughly $200M average checks). 4
- $100 — Monthly price of OpenAI’s new ChatGPT Pro tier for heavier Codex users, positioned between Plus ($20) and the $200 Pro plan. 1
Top Stories
OpenAI and AWS announce a sweeping cloud-and-silicon partnership
Amazon becomes the exclusive third‑party cloud distributor for OpenAI Frontier on Amazon Bedrock, and the companies will co‑build a Stateful Runtime Environment to run production agent applications. Amazon plans a multi‑year $50B investment (an initial $15B plus up to $35B subject to conditions), while OpenAI commits to roughly 2 gigawatts of Trainium capacity across Trainium3 and Trainium4, a signal that compute supply and distribution are being locked in for large‑scale deployments. For enterprise teams already on AWS, evaluating OpenAI services could get simpler because it can run through existing procurement and security controls. 3
Meta locks in custom AI silicon with Broadcom through 2029
Meta expanded its Broadcom partnership to co‑develop multiple generations of MTIA (Meta Training and Inference Accelerator) chips, starting with more than one gigawatt of capacity and scaling to multi‑gigawatt levels; Broadcom contributes chip design, high‑speed networking, and advanced packaging. Analysts read this as a move to reduce dependence on scarce, expensive third‑party GPUs and to support real‑time inference across feeds and recommendations. The companies also flagged a 2‑nanometer process for the next MTIA silicon. 1 2
Anthropic ships Claude Opus 4.7, focused on harder software work
Opus 4.7 is tuned for multi‑step coding and hours‑long tasks, with better instruction‑following and self‑checking plus higher‑resolution vision (images up to 2,576 px on the long edge) for UI, slides, and document analysis, all at the same price as 4.6. On headline dev benchmarks, it posts 64.3% on SWE‑bench Pro and 87.6% on SWE‑bench Verified, with partners reporting fewer tool errors and more resilient execution through failures. Enterprises should note a tokenizer change that may increase token counts on the same inputs, which can raise cost per request even at unchanged per‑token prices. 1 2
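One way to size the tokenizer effect before migrating is to count tokens for the same prompts under both models and compare totals. The sketch below is a minimal illustration in Python: the function name and the sample counts are hypothetical, and it assumes you have already collected per‑prompt token counts from each model's own usage reporting.

```python
def token_inflation(old_counts: list[int], new_counts: list[int]) -> float:
    """Fractional change in total tokens for the same set of inputs."""
    if len(old_counts) != len(new_counts):
        raise ValueError("counts must cover the same inputs")
    old_total, new_total = sum(old_counts), sum(new_counts)
    if old_total == 0:
        raise ValueError("baseline token total must be non-zero")
    return new_total / old_total - 1


# Hypothetical token counts for the same three prompts (not measured data).
old = [1200, 850, 4300]   # counted with the model you use today
new = [1320, 935, 4730]   # counted with the candidate model
change = token_inflation(old, new)
print(f"Token change on identical inputs: {change:+.1%}")
```

Because the per‑token price is reported as unchanged, the same fractional change is a reasonable first estimate of the change in spend for that workload.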
OpenAI introduces GPT‑5.4 with native computer use and 1M context
GPT‑5.4 is designed to handle real office work with less back‑and‑forth, combining long‑horizon planning, desktop/browser control via screenshots and input events, and stronger research, spreadsheet, and presentation skills. It reaches state‑of‑the‑art scores on agentic computer‑use benchmarks (e.g., 75.0% on OSWorld‑Verified) and supports up to 1M tokens of context, with faster sessions and fewer tokens reported by early users. For developers, computer use and tool search/parallelization reduce orchestration work for complex tasks. 1
Microsoft rolls out three in‑house models and a cheaper image variant
Microsoft announced MAI‑Transcribe‑1 (speech‑to‑text), MAI‑Voice‑1 (text‑to‑speech), and MAI‑Image‑2 (image generation) via Microsoft Foundry and MAI Playground, positioning them for enterprise staples like transcription, voiceovers, and creative assets. It later added MAI‑Image‑2‑Efficient at $5 per million input tokens and $19.50 per million image output tokens — about 41% cheaper for image outputs — and claimed 22% faster generation with 4x higher throughput per GPU, targeting high‑volume creative workloads. 9 10
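The quoted per‑token prices make a back‑of‑the‑envelope budget straightforward. The sketch below is illustrative only: it assumes billing is linear in input tokens and image output tokens, and the function name and workload figures are made up for the example rather than taken from Microsoft's documentation.

```python
# Prices quoted above for MAI-Image-2-Efficient (USD per 1M tokens).
INPUT_PRICE_PER_M = 5.00
IMAGE_OUTPUT_PRICE_PER_M = 19.50


def estimate_cost(input_tokens: int, image_output_tokens: int) -> float:
    """Estimate spend in USD, assuming simple linear per-token billing."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = image_output_tokens / 1_000_000 * IMAGE_OUTPUT_PRICE_PER_M
    return round(input_cost + output_cost, 2)


# Hypothetical monthly workload: 2M prompt tokens, 40M image output tokens.
monthly = estimate_cost(2_000_000, 40_000_000)
print(f"Estimated monthly cost: ${monthly:,.2f}")
```

In this made‑up workload nearly all of the spend comes from image output tokens, which is why the output price is the number to watch for high‑volume creative use.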
Google DeepMind releases Gemma 4 under Apache 2.0
Gemma 4 arrives as an open‑weight family for reasoning and agent workflows, spanning Effective 2B, Effective 4B, 26B MoE (Mixture of Experts), and 31B Dense, with the larger models supporting up to 256K context. The 31B ranks #3 and the 26B MoE ranks #6 on Arena AI’s open text leaderboard, with deployment paths across on‑device (E2B/E4B), workstations, and the cloud, and broad support for vLLM, llama.cpp, MLX, NVIDIA NIM, and more. The Apache 2.0 license streamlines commercial use. 4
Chrome’s AI Mode goes side‑by‑side and can call local stores
Google updated AI Mode so web pages open side‑by‑side with the AI panel, reducing tab‑hopping and letting users add recent tabs, images, and PDFs into questions. For shopping and travel, Google is also rolling agentic features into AI Mode that can call nearby stores to check stock and send results back, and it added hotel price‑tracking alerts in Search. For day‑to‑day research and planning, this consolidates work into a single view. 1 2
OpenAI unveils GPT‑5.4‑Cyber with gated access for defenders
OpenAI introduced a security‑focused variant with more permissive behavior for legitimate vulnerability research, initially limited to vetted vendors, organizations, and researchers via its Trusted Access for Cyber program. The move mirrors Anthropic’s controlled rollout of Mythos and underscores that sensitive capabilities are shifting to identity‑verified access rather than pure model‑side guardrails. 1 2
Accel raises $5B to fuel late‑stage AI bets
Accel closed $5B across vehicles, including $4B for Leaders Fund V, targeting roughly 20 late‑stage checks averaging about $200M, with focus areas spanning AI software, hardware, robotics, defense tech, and data‑center infrastructure. The round reflects how late‑stage AI financings now resemble infrastructure‑scale capital, concentrating larger checks into fewer companies with clear revenue and operational discipline. 4
Factory reportedly seeks $150M at a $1.5B valuation for autonomous coding
WSJ reports Factory is in talks to raise $150M at a $1.5B valuation for agents that can take coding tasks from brief to implementation with less oversight. The interest from top investors signals sustained appetite for “agentic” developer tools, though buyers will still need side‑by‑side metrics (issues closed, PR rework rates) to validate promised cycle‑time gains. 4
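A side‑by‑side comparison of this kind can be as simple as tallying the two metrics per cohort. The sketch below is one possible framing, not a standard: the `PullRequest` fields, the "agent"/"human" labels, and the definition of rework rate (share of PRs that needed at least one round of requested changes) are all assumptions, and the sample records are invented.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    author: str          # "agent" or "human" (hypothetical cohort labels)
    closed_issue: bool   # did merging this PR close a tracked issue?
    review_rounds: int   # rounds of requested changes before merge


def summarize(prs: list[PullRequest], author: str) -> dict[str, float]:
    """Return PR count, issues closed, and rework rate for one cohort."""
    cohort = [pr for pr in prs if pr.author == author]
    if not cohort:
        return {"prs": 0, "issues_closed": 0, "rework_rate": 0.0}
    reworked = sum(1 for pr in cohort if pr.review_rounds > 0)
    return {
        "prs": len(cohort),
        "issues_closed": sum(pr.closed_issue for pr in cohort),
        "rework_rate": reworked / len(cohort),
    }


# Invented sample data for illustration only.
sample = [
    PullRequest("agent", True, 0),
    PullRequest("agent", True, 2),
    PullRequest("agent", False, 1),
    PullRequest("agent", True, 0),
    PullRequest("human", True, 0),
    PullRequest("human", True, 1),
]
for cohort_name in ("agent", "human"):
    print(cohort_name, summarize(sample, cohort_name))
```

Running the same tally over a pilot period and a baseline period gives buyers a like‑for‑like view before they accept a vendor's cycle‑time claims.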
Trend Analysis
AI shifted from “answer my question” toward “do the work.” GPT‑5.4 bundled native computer use, long context, and planning, while Claude Opus 4.7 focused on multi‑step coding with fewer tool failures. On the consumer side, Chrome’s AI Mode moved to a split‑screen workspace and can even call stores, showing agentic patterns in everyday browsing. For teams, this points toward permissioning, logging, and human‑in‑the‑loop design becoming standard before granting autonomy. 1 1
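The permissioning, logging, and human‑in‑the‑loop pattern can be sketched as a small gate around every agent action. This is a minimal illustration under stated assumptions, not any vendor's API: the allow‑list contents, the `run_action` name, and the `approve` callback are all hypothetical.

```python
import logging
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-gate")

# Hypothetical allow-list: actions the agent may run without sign-off.
AUTO_APPROVED = {"read_page", "search"}


def run_action(name: str,
               action: Callable[[], str],
               approve: Callable[[str], bool]) -> Optional[str]:
    """Log every request; run it only if allow-listed or human-approved."""
    log.info("requested: %s", name)
    if name not in AUTO_APPROVED and not approve(name):
        log.warning("denied: %s", name)
        return None
    result = action()
    log.info("completed: %s", name)
    return result


def deny_all(name: str) -> bool:
    """Stand-in reviewer that rejects anything needing approval."""
    return False


# Allow-listed action runs; the sensitive one is blocked by the reviewer.
print(run_action("read_page", lambda: "page text", deny_all))
print(run_action("submit_payment", lambda: "paid", deny_all))
```

In a real deployment the `approve` callback would route to a person or a policy service, and the log would go to durable storage so actions can be audited later.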
Open options strengthened alongside closed capabilities. Google’s Gemma 4 arrived as a permissively licensed, agent‑ready family with long context and broad tooling, while NVIDIA’s Nemotron 3 Super emphasized open weights plus efficiency for million‑token contexts. Together they suggest more organizations can pilot long‑running agents on accessible hardware without vendor lock‑in. 4 1
Compute and distribution deals set the backdrop. OpenAI’s AWS pact tied exclusive third‑party distribution on Bedrock to a multi‑year $50B plan and roughly 2GW of Trainium capacity, while Meta and Broadcom committed to more than a gigawatt of custom MTIA silicon on the path to multi‑gigawatt scale. Late‑stage funds like Accel’s $5B add fuel for companies that can translate this infrastructure into margin and productivity. 3 1 4
Sensitive capabilities are being gated. OpenAI’s GPT‑5.4‑Cyber and Anthropic’s restricted programs point toward identity‑verified access for dual‑use features, while recent security notes (e.g., Mac client updates) remind teams to build defensive error handling and prompt patching into desktop and API‑based workflows. Expect procurement to include access tiering and verification alongside model selection. 1 4
Watch Points
- “Frontier on Bedrock” — watch for preview dates, regions, and pricing as AWS becomes the exclusive third‑party distributor for OpenAI’s enterprise platform. 3
- “2 nm MTIA” — signs of Meta’s next‑gen custom accelerators moving into production could shift inference cost and latency inside its apps. 2
- “MAI‑Image‑2‑Efficient” — adoption and per‑token pricing pressure in high‑volume creative workloads could ripple to other image providers. 10
Open Source Spotlight
- browser-harness — A minimal Chrome DevTools Protocol harness that lets an agent truly drive the browser and self‑heal by patching missing helper functions mid‑task. Good for builders exploring native computer‑use loops. browser-use/browser-harness
- Fireworks Tech Graph — Turn plain descriptions into publication‑ready system diagrams (SVG/PNG) with UML and AI‑pattern presets; handy for PMs and engineers documenting pipelines. yizhiyanhua-ai/fireworks-tech-graph
- Hermes Agent v0.9.0 — A production‑minded, self‑hosted agent update with fast lanes, iMessage/WeChat adapters, Android support, and a deep security hardening pass. NousResearch/hermes-agent v2026.4.13
- Hermes Web UI — A web dashboard to manage sessions, channels, schedules, skills, and analytics for Hermes Agent in one place. EKKOLearnAI/hermes-web-ui
- LLM Internals — A step‑by‑step learning repo on tokenization, attention, quantization, and deployment basics for engineers moving beyond simple API calls. amitshekhariitbhu/llm-internals
What Can I Try?
- Use Chrome’s AI Mode split‑screen: open AI Mode on desktop, click a result, and compare the page side‑by‑side while adding recent tabs or PDFs to your question. 1
- Put Claude Opus 4.7 on a real task: fix a small repo issue or analyze a complex PDF with images to gauge quality and token counts versus your current model. 1
- Automate a browser chore: clone browser‑harness and have your coding assistant wire up a file upload or form‑filling step end‑to‑end. 5
- Diagram your AI pipeline from text: generate an SVG with Fireworks Tech Graph to standardize how your team documents systems. 15