Agents grew up: Claude 4.7, Google’s AI Mode, and OpenAI–AWS push real work
Agents got practical this week: Claude 4.7 tightened long‑running coding, Google’s AI Mode put answers next to the page, Meta locked in multi‑gigawatt chips, and OpenAI deepened its partnership with AWS — pointing to AI that plans, executes, and saves time.
This Week in One Line
Anthropic shipped Claude Opus 4.7 with stronger coding, Google put AI side‑by‑side with the web in Chrome, Meta locked multi‑gigawatt custom chips with Broadcom, and OpenAI formalized a deep AWS pact — net effect: AI shifts from chat to agents that actually do work.
Week in Numbers
- $100/month — OpenAI’s new ChatGPT tier aimed at heavy Codex usage, slotted between the $20 Plus and $200 Pro plans. 1
- 1+ GW — Meta’s initial custom AI chip capacity commitment with Broadcom, scaling to multi‑gigawatt through 2029. 2
- $50B — Amazon’s planned multi‑year investment in OpenAI, alongside AWS becoming the exclusive third‑party cloud distributor for OpenAI’s Frontier platform. 3
- $5B — Accel’s new late‑stage funds to write ~20 checks averaging ~$200M across AI software, hardware, robotics, and data centers. 4
- 22% faster, $19.50/M output tokens — Microsoft’s MAI‑Image‑2‑Efficient cuts image‑gen cost ~41% vs MAI‑Image‑2 and boosts throughput 4x/GPU. 5
- 13M users / $22M Series A — Student learning app Gizmo’s scale and new funding to expand AI study tools. 6
- $150M at $1.5B — Reported raise in the works for Factory, an autonomous coding agents startup. 7
Top Stories
Anthropic launches Claude Opus 4.7 for harder coding and long‑running tasks
Anthropic released Claude Opus 4.7 with stronger instruction‑following, self‑checks, and higher‑resolution vision, targeting multi‑step software work and hours‑long agent runs at the same price as 4.6 ($5/M input, $25/M output). On developer benchmarks it leads generally available models: 64.3% on SWE‑bench Pro and 87.6% on SWE‑bench Verified, with partners reporting fewer tool errors and better recovery from failures. The update also tightens cybersecurity safeguards and introduces a tokenizer change that can increase token counts on the same inputs, which teams should factor into budgets. For non‑specialists this means more reliable “do the task” behavior, not just better chat. 8 9
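The tokenizer caveat above is easy to budget for with simple arithmetic. A minimal sketch: only the $5/M input and $25/M output prices come from the article; the token counts and the 10% inflation factor are illustrative assumptions.

```python
# Estimate the cost impact of a tokenizer change on a recurring task.
# Prices are the article's stated $5/M input and $25/M output; the
# token counts and the 10% inflation factor are assumptions.

PRICE_IN = 5.00 / 1_000_000    # dollars per input token
PRICE_OUT = 25.00 / 1_000_000  # dollars per output token

def task_cost(tokens_in: int, tokens_out: int) -> float:
    """Dollar cost of one task at the stated per-token prices."""
    return tokens_in * PRICE_IN + tokens_out * PRICE_OUT

# Hypothetical long-running agent task: 80k input, 20k output tokens.
before = task_cost(80_000, 20_000)
# Assume the new tokenizer inflates counts on the same inputs by ~10%.
after = task_cost(int(80_000 * 1.10), int(20_000 * 1.10))

print(f"before: ${before:.2f}, after: ${after:.2f}, "
      f"delta: {100 * (after / before - 1):.0f}%")
```

Running a calculation like this over your highest-volume tasks before switching models turns the tokenizer note from a surprise into a line item.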
Google’s AI Mode adds split‑screen browsing and store‑calling assistance
Google upgraded AI Mode so webpages open side‑by‑side with the AI panel, reducing tab‑hopping and keeping answers grounded in the page you’re viewing. For trip planning and everyday errands, AI Mode can also call nearby stores to check stock, and Search adds individual hotel price tracking with email alerts. For readers, this turns Chrome and Search into a persistent research and shopping workspace, changing how you compare sources, products, and prices. 10 11
Meta and Broadcom extend custom AI chip deal through 2029
Meta expanded its Broadcom partnership to co‑develop several generations of custom AI accelerators, starting with more than 1 gigawatt of compute and scaling to multi‑gigawatt levels. Broadcom contributes chip design, advanced packaging, and high‑speed networking, with new MTIA silicon moving to a 2‑nanometer process. The signal for everyday users: more stable, faster, and potentially cheaper AI‑driven features in Meta apps as in‑house inference scales. 2 12
OpenAI and Amazon formalize strategic partnership across cloud and silicon
OpenAI and AWS announced a stateful runtime for agent applications on Amazon Bedrock and named AWS the exclusive third‑party cloud distributor for OpenAI’s Frontier platform. Amazon plans to invest up to $50 billion in OpenAI and commit substantial Trainium capacity (around 2 gigawatts across Trainium3/4), positioning AWS as a go‑to channel for enterprises already standardized on its stack. For teams, this points toward simpler procurement and governance if you build on AWS — and more capacity headroom for production agents. 3
Microsoft rolls out three in‑house models and a cheaper, faster image tier
Microsoft introduced MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2 via Foundry and MAI Playground, claiming leading transcription accuracy across 25 languages and strong creative performance. It followed with MAI‑Image‑2‑Efficient, priced at $5 per million input tokens and $19.50 per million image output tokens, with 22% faster generation and 4x throughput per GPU. The direction is clear: bring common AI tasks inside Microsoft products at lower cost, easing adoption for Teams, Copilot, and marketing workflows. 13 5
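The stated numbers allow a back‑of‑envelope $/image comparison. In this sketch, the $19.50/M output price and the ~41% cost cut are from the article; the implied baseline price and the tokens‑per‑image figure are assumptions for illustration only.

```python
# Back-of-envelope $/image comparison between the two MAI image tiers.
# The $19.50/M output price and ~41% cut are stated in the article;
# the implied baseline price and tokens-per-image are assumptions.

EFFICIENT_PER_M = 19.50                         # $/M output tokens (stated)
BASELINE_PER_M = EFFICIENT_PER_M / (1 - 0.41)   # implied MAI-Image-2 price

TOKENS_PER_IMAGE = 4_000  # assumed output tokens per generated image

def cost_per_image(price_per_m: float) -> float:
    """Dollar cost of one image at a given $/M-token price."""
    return price_per_m * TOKENS_PER_IMAGE / 1_000_000

print(f"MAI-Image-2 (implied): ${cost_per_image(BASELINE_PER_M):.4f}/image")
print(f"MAI-Image-2-Efficient: ${cost_per_image(EFFICIENT_PER_M):.4f}/image")
```

Swap in your own measured tokens‑per‑image to get a comparison grounded in your actual workload rather than the placeholder figure.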
Adobe unveils Firefly AI Assistant, a cross‑app creative agent
Adobe announced Firefly AI Assistant, a conversational agent that executes multi‑step creative tasks across Photoshop, Premiere, Lightroom, Illustrator, Express, and Frame.io. Users describe the outcome and the assistant orchestrates the steps, integrating session memory and third‑party models like Anthropic’s Claude. For designers and marketers, this condenses multi‑app workflows into a single thread — helpful for recurring asset sets and review cycles. 14
OpenAI debuts GPT‑5.4‑Cyber under “Trusted Access for Cyber”
OpenAI introduced GPT‑5.4‑Cyber, a variant tuned for defensive security work like vulnerability research, with access restricted to vetted vendors, organizations, and researchers. The program expands to thousands of individual defenders and hundreds of teams, adding verification tiers that unlock more powerful capabilities. The move, following Anthropic’s gated Mythos preview, shows sensitive capabilities are being governed by identity and programmatic access rather than open release. 15 16
Trend Analysis
Big vendors shifted from “chat about work” to “do the work” across consumer and enterprise surfaces. Claude Opus 4.7 focused on agentic coding reliability and long‑running tasks; Google’s AI Mode brought split‑screen research and agentic store‑calling to the browser; Adobe’s Firefly AI Assistant collapsed creative handoffs into one conversation. These launches point toward assistants that plan, execute steps, and recover from tool failures — the mechanics you feel as faster turnarounds on decks, code fixes, and campaign assets. 14
Under the hood, who runs the workloads and at what cost kept moving. Meta’s multi‑gigawatt Broadcom deal and OpenAI’s AWS pact (with $50B investment and Trainium capacity) push more AI onto custom silicon and familiar enterprise rails. Microsoft’s in‑house MAI models — including a cheaper, faster image tier — reinforce a cost‑and‑control playbook: migrate common tasks to first‑party models, embed them in flagship apps, and lower COGS for agent features. For buyers, this implies more predictable procurement and potentially lower unit costs inside tools you already pay for. 2 3 5
Access is becoming as strategic as capability. OpenAI’s GPT‑5.4‑Cyber and Anthropic’s Mythos preview are distributed via vetted programs, signaling that dual‑use domains won’t be “model for everyone” but “model for verified users.” That has knock‑on effects for procurement (identity verification, logging), for regulators (uneven access across regions), and for security teams that must design with and against increasingly capable tools. 15 16
Finally, capital is clustering around agentic software and infrastructure at late‑stage scale, with Accel’s $5B funds and reports of Factory’s $1.5B valuation term sheet. The money flow is a signal: investors are betting on “agents + infra” as a category, but adoption will be won on measurable cycle‑time and error‑rate improvements, not demos. Treat agent pilots like process changes — instrument them and expand only on proof. 4 7
Watch Points
- “Stateful runtime on Bedrock” — If you see this, it’s OpenAI and AWS’s managed foundation for running production agent teams with memory on Amazon Bedrock. 3
- “Tokenizer change” — Anthropic’s Opus 4.7 maps some inputs to more tokens; watch your per‑task token budgets if you switch.
- “Split‑screen AI Mode” — Publishers and marketers should monitor engagement metrics as Chrome shows AI and webpages side‑by‑side. 11
Open Source Spotlight
- browser-harness — A minimal Chrome DevTools Protocol harness that lets agents type, click, upload files, and “self‑heal” by writing missing helper functions mid‑task. Great for builders exploring native computer use. browser-use/browser-harness
- fireworks-tech-graph — Turn plain descriptions into crisp SVG/PNG technical diagrams (UML, RAG, multi‑agent flows). Handy for PMs and engineers documenting systems. yizhiyanhua-ai/fireworks-tech-graph
- Hermes Agent + HUD — Self‑hosted, persistent personal/team agent with a hardened core, mobile support, Fast Mode routing, and a live web HUD for observability. For operators who want control and logs. NousResearch/hermes-agent joeynyc/hermes-hudui
- Web3Hermes — Community WebUI fork localizing Hermes Agent for Chinese users with streamlined setup. For region‑friendly deployments. Web3CZ/Web3Hermes
- LLM Internals — A step‑by‑step learning repo on tokenization, attention, quantization, and deployment — for engineers moving beyond APIs. amitshekhariitbhu/llm-internals
What Can I Try?
- Pilot split‑screen research in Chrome AI Mode: Open a long article, click through results to compare sources side‑by‑side, and add recent tabs or PDFs via the plus menu. Note whether it trims tab‑switch time.
- Generate a cross‑app creative set with Firefly AI Assistant: Join the public beta waitlist and brief the assistant to produce a social asset pack, then measure handoff time saved vs. your current flow. 14
- Compare image‑gen costs: Run a small batch on MAI‑Image‑2‑Efficient and your current tool, tracking $/asset and turnaround latency. 5
- Try a technical diagram from text: Use fireworks‑tech‑graph to render your current AI pipeline as SVG/PNG; drop it into your next spec. 17
- If you’re security‑qualified, apply for Trusted Access for Cyber: Test GPT‑5.4‑Cyber on non‑production code triage and document what refusals unblock. 16
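The image‑gen cost comparison suggested above is easier to judge if you instrument it. A minimal tracker, assuming you record each batch run's spend, asset count, and wall‑clock time yourself; all figures below are placeholders:

```python
# Minimal tracker for the image-gen cost comparison suggested above.
# Record each batch's tool, dollar spend, asset count, and latency,
# then compare $/asset and seconds/asset. All figures are placeholders.
from dataclasses import dataclass

@dataclass
class BatchRun:
    tool: str
    dollars: float   # total spend for the batch
    assets: int      # images produced
    seconds: float   # wall-clock time for the batch

    @property
    def cost_per_asset(self) -> float:
        return self.dollars / self.assets

    @property
    def secs_per_asset(self) -> float:
        return self.seconds / self.assets

runs = [
    BatchRun("current-tool", dollars=1.32, assets=10, seconds=240.0),
    BatchRun("mai-image-2-efficient", dollars=0.78, assets=10, seconds=190.0),
]
for r in runs:
    print(f"{r.tool}: ${r.cost_per_asset:.3f}/asset, "
          f"{r.secs_per_asset:.1f}s/asset")
```

Treating each pilot as a measured experiment like this matches the article's advice: expand agent use only on proof of cycle‑time and cost improvements.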