Agents leave the chatbox: security, privacy, and in‑app workflows take center stage
AI left the chat window. OpenAI stood up a $4B deployment unit and a security program, Microsoft’s agentic system found 16 Windows flaws, Meta added encrypted no‑log chats, and Claude moved into SMB tools, plus concrete actions you can try now.
This Week in One Line
OpenAI formed a $4B deployment company and unveiled Daybreak for proactive security, Anthropic shipped Claude workflows inside SMB tools, Meta introduced encrypted Incognito Chat, and Microsoft’s agentic system found 16 Windows bugs — AI is moving from chat to secure, in‑app action.
Week in Numbers
- $4B — Capital behind OpenAI’s new enterprise deployment company to embed engineers with customers. 1
- 16 — New Windows networking/authentication vulnerabilities Microsoft’s agentic security system uncovered. 2
- $2.1B — Funding raised by Alphabet-affiliated Isomorphic Labs to advance AI-driven drug design. 3
- 224B — Reported daily tokens attributed to the open-source Hermes Agent on OpenRouter. 4
- $350,000 — Google Cloud credits offered to eligible startups in the Gemini Startup Forum. 5
- 12,000+ — Financial institutions ChatGPT’s personal finance preview can connect to for U.S. Pro users. 6
- 34–1 — Colorado Senate concurrence vote to replace its AI law before it takes effect (SB 26‑189). 7
Top Stories
OpenAI creates a $4B deployment company to embed AI teams inside enterprises
OpenAI launched a majority-owned services unit with more than $4B and a multi‑year investor partnership led by TPG to embed forward‑deployed engineers at client companies; it is also acquiring Tomoro to add about 150 deployment specialists on day one. The structure targets large, complex workflows, with Axios reporting a $10B pre‑money valuation and investor terms that cap profits while guaranteeing a minimum return. For buyers, this signals a shift from trials to embedded build‑outs that connect back‑office systems to models to move specific metrics. 1 8 9
Microsoft debuts MDASH, an agentic security harness that finds and proves exploitable bugs
Microsoft introduced a multi‑model system that coordinates 100+ specialized agents on top of Large Language Models (LLMs) to prepare, scan, validate, deduplicate, and prove vulnerabilities end‑to‑end. Using it, researchers found 16 new Windows issues (including four Critical remote code execution flaws); the system also scored 88.45% on the CyberGym benchmark and recalled up to 100% of past tcpip.sys cases. A limited preview is underway. The design keeps targeting and proving model‑agnostic, so organizations can swap in newer models without rebuilding plugins and policies. 2
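To make the stage order concrete, here is a minimal Python sketch of a prepare, scan, validate, deduplicate, prove pipeline in which the models are plain callables that can be swapped. It is an illustration under stated assumptions, not Microsoft's implementation: every name in it (`Finding`, `run_pipeline`, the `checker` and `prover` callbacks) is hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Sequence

# Every name below is hypothetical and for illustration; none comes from MDASH.
Model = Callable[[str], str]  # any callable that maps a prompt to text


@dataclass(frozen=True)
class Finding:
    component: str           # the target the finding refers to
    description: str         # what a model reported
    validated: bool = False  # set by the validate stage
    proven: bool = False     # set by the prove stage


def prepare(targets: Iterable[str]) -> list:
    """Normalise the target list: drop duplicates and fix the order."""
    return sorted(set(targets))


def scan(targets: Sequence[str], models: Sequence[Model]) -> list:
    """Ask every model about every target; non-empty answers become candidates."""
    findings = []
    for target in targets:
        for model in models:
            answer = model(f"Describe one likely flaw in {target}").strip()
            if answer:
                findings.append(Finding(target, answer))
    return findings


def validate(findings, checker: Callable[[Finding], bool]) -> list:
    """Keep only candidates that an independent checker confirms."""
    return [replace(f, validated=True) for f in findings if checker(f)]


def deduplicate(findings) -> list:
    """Collapse findings that name the same component and description."""
    seen = set()
    unique = []
    for f in findings:
        key = (f.component, f.description.lower())
        if key not in seen:
            seen.add(key)
            unique.append(f)
    return unique


def prove(findings, prover: Callable[[Finding], bool]) -> list:
    """Keep only findings for which the prover can demonstrate the flaw."""
    return [replace(f, proven=True) for f in findings if prover(f)]


def run_pipeline(targets, models, checker, prover) -> list:
    candidates = scan(prepare(targets), models)
    return prove(deduplicate(validate(candidates, checker)), prover)
```

Because the stages only depend on the `Model` callable signature, swapping in a newer model in this sketch means passing a different list to `run_pipeline`; the checker and prover stay untouched.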
Meta introduces Incognito Chat with encrypted, no‑log AI conversations
Meta rolled out an Incognito Chat mode for Meta AI that keeps conversations out of server logs with end‑to‑end encryption (E2EE) and disappearing messages. Under the hood, Meta’s Private Processing uses trusted execution environments (TEEs), remote attestation, and oblivious transport so large models can process requests without exposing content to Meta or WhatsApp. For users, it makes the assistant usable for conversations where privacy and minimal record‑keeping are essential, with rollouts planned for WhatsApp and the Meta AI app. 10 11 12
OpenAI announces Daybreak for proactive cybersecurity
OpenAI unveiled Daybreak, pairing its Codex Security agent with specialized cyber models (including GPT‑5.5‑Cyber) to detect and validate software vulnerabilities and offer enterprise scans. The program aims to build system “threat models,” analyze real codebases, and support remediation in collaboration with industry and government partners, bringing a productized approach to defensive AI. For teams under compliance pressure, the explicit “Request a vulnerability scan” entry point implies shorter paths to a scoped pilot. 13 14
Anthropic ships Claude for Small Business with 15 ready‑made workflows
Anthropic introduced Claude for Small Business, a toggle inside Claude Cowork that embeds automation where owners already work — Intuit QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, and Microsoft 365 — with 15 workflows and 15 skills. All actions require user approval before sending, posting, or paying, and Anthropic says Team/Enterprise plans don’t train on customer data by default; a free “AI Fluency” course is included. This moves AI from chat into finance and marketing routines such as payroll planning, month‑end close, invoice chasing, and campaign kickoffs. 15 16
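The approval requirement described above is a common human‑in‑the‑loop pattern, and a small Python sketch shows its shape: the side effect is wrapped in a callable that runs only after an approver says yes, and every decision is logged. This is a generic illustration, not Anthropic's implementation; `PendingAction`, `ApprovalGate`, and the `approver` callback are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical names for illustration; this is not Anthropic's API.
@dataclass
class PendingAction:
    kind: str                   # e.g. "send", "post", or "pay"
    summary: str                # what the user is asked to approve
    execute: Callable[[], str]  # the side effect, run only after approval


class ApprovalGate:
    """Runs an action only if the approver accepts it, and logs each decision."""

    def __init__(self, approver: Callable[[PendingAction], bool]):
        self._approver = approver
        self.audit_log = []  # (kind, summary, decision) tuples

    def submit(self, action: PendingAction) -> Optional[str]:
        if self._approver(action):
            result = action.execute()
            self.audit_log.append((action.kind, action.summary, "approved"))
            return result
        # Rejected actions never run; only the decision is recorded.
        self.audit_log.append((action.kind, action.summary, "rejected"))
        return None
```

In a real tool the approver would be a prompt shown to the user; in this sketch it is any function that returns `True` or `False`, which keeps the gate easy to test.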
Codex comes to the ChatGPT mobile app for on‑the‑go code reviews
OpenAI is bringing its coding agent Codex into the ChatGPT mobile app, letting users connect to a Mac host, review outputs, approve changes, and start tasks from iOS and Android; Windows support is expected later. The company frames mobile control as a way to shorten the loop between a failing job and a fix, aligning with enterprise traction signals for agent‑driven code workflows. Treat mobile approvals like code reviews — helpful for response time, but requiring clear guardrails. 17 18
ChatGPT adds a personal finance preview for Pro users
OpenAI launched a personal finance experience for U.S. Pro users on May 15 that connects bank and brokerage accounts to a dashboard and Q&A grounded in a user’s own data. The company cites support for more than 12,000 financial institutions, lets users disable training, and says disconnected data will be deleted within 30 days; it also notes ChatGPT isn’t a substitute for professional advice. For individuals and operations teams, this shows assistants becoming context‑aware once real account data is linked. 6 19
DeepMind previews a “point‑and‑speak” pointer that acts across apps
DeepMind outlined a context‑aware pointer powered by Gemini that lets you point to on‑screen content and speak natural shorthand like “this” or “that” to act across apps, reducing prompt writing and context switching. Principles include capturing the visual/semantic context around the pointer and turning pixels into actionable entities (places, dates, objects); early integrations include Chrome and a forthcoming Googlebook “Magic Pointer.” For non‑specialists, it’s UI‑native assistance that meets you where you work. 20
Alphabet‑affiliated Isomorphic Labs raises $2.1B for AI drug design
Isomorphic Labs secured $2.1B in a round led by Thrive Capital, with participation from Alphabet and sovereign funds, citing plans to scale applied AI for drug discovery; hiring will focus on applied AI, engineering, and go‑to‑market. The raise underscores sustained investor appetite for life‑sciences AI, even as enterprise tooling and security capture headlines elsewhere this week. 3
Open‑source Hermes Agent surges and ships v0.13.0
Nous Research’s Hermes Agent — an MIT‑licensed, self‑improving agent that reflects on tasks and writes reusable skills — shipped v0.13.0 with a Kanban multi‑agent board, /goal command, improved checkpoints, auto‑resume, and Google Chat support. Marktechpost reports Hermes leads OpenRouter daily usage with 224B tokens, highlighting demand for agents that learn from repeated execution. For teams eyeing reliability and control, the release adds heartbeat monitoring and security hardening for practical pilots. 21 4
Trend Analysis
Agentic AI stepped out of chat windows and into the tools people already use. Claude for Small Business lands automation directly in QuickBooks, PayPal, HubSpot, and Canva; Codex on mobile shortens the review loop from a phone; and DeepMind’s pointer reframes interaction as “point and speak,” not “copy and prompt.” Even personal finance in ChatGPT shows assistants becoming context‑aware once data is linked — a pattern pointing to in‑app, on‑screen help rather than separate chat tabs. 15 17 20 6
Security emerged as a proving ground for multi‑agent systems and specialized models. Microsoft’s MDASH coordinated more than 100 agents to find and prove real Windows flaws with measurable precision, while OpenAI’s Daybreak packaged cyber scanning and validation as an enterprise program; at the same time, Google’s threat team reported attackers using AI to discover a new vulnerability, underscoring the need for faster defense. Together these signals put automated triage and validation on security roadmaps. 2 13 22
Privacy and governance also moved forward. Meta’s Incognito Chat relies on trusted execution environments (TEEs) and attested encrypted channels to process requests without server‑side logs; Colorado’s SB 26‑189 shifts compliance from “what AI is used” to whether automation “materially influences” consequential decisions and preserves meaningful human review; and ChatGPT’s finance preview emphasizes opt‑in connections and 30‑day deletion on disconnect. This combination of technical and policy guardrails points toward more controlled deployments. 10 7 6
Open tooling and orchestration kept pace. Hermes Agent’s rapid uptake and reliability‑focused release features suggest teams want agents that learn over time with visible control surfaces, while automation projects like Activepieces and programmatic tools for NotebookLM show builders wiring agents into repeatable, auditable flows outside the chatbox. The throughline: orchestrate actions, capture learnings, and keep humans in the loop. 21 23 24
Watch Points
- “GPT‑5.5‑Cyber” or “Daybreak scans” — If you see this, it’s OpenAI’s push to productize proactive vulnerability discovery and validation for enterprises. 13
- “Private Processing / TEE attestation” — Meta’s encrypted, no‑log Incognito Chat relies on trusted execution environments; watch for rollout notes and any attestation or performance caveats. 10
- “Meaningful human review” — Colorado’s SB 26‑189 reframes compliance around automated influence on employment and other consequential decisions; expect vendor documentation requests to mirror this. 7
Open Source Spotlight
- Hermes Agent — Self‑improving agent with a Kanban task board, /goal lock, and checkpointed state; MIT‑licensed. Good for teams piloting reliable, always‑on assistants. NousResearch/hermes-agent
- Activepieces — Visual automation flows with Model Context Protocol (MCP) integrations so agents can call tools predictably; Zapier‑style for non‑developers. activepieces/activepieces
- Skyvern — Browser automation using LLMs and computer vision to script real web flows beyond brittle selectors; start with the quickstart. Skyvern-AI/skyvern
- agentmemory — Persistent memory for coding agents (Claude Code, Cursor, Gemini CLI) so they remember across sessions; useful for editor‑integrated assistants. rohitg00/agentmemory
- notebooklm-py — Unofficial Python application programming interface (API) and command‑line interface (CLI) to drive Google NotebookLM from scripts or agents. teng-lin/notebooklm-py
What Can I Try?
- Turn on Claude for Small Business: connect QuickBooks/PayPal/HubSpot/Canva and run one workflow (invoice chaser or month‑end close) with approvals. 15
- Request a Daybreak pilot: use the “Request a vulnerability scan” entry to scope one service or repo for automated triage. 14
- Try pointer‑based help in Chrome: follow DeepMind’s post and test “point and ask” on a product page instead of crafting a full prompt. 20
- Test ChatGPT Finance (U.S. Pro): link one low‑risk account, ask three concrete questions (spend change, subscriptions, savings target), then disconnect to confirm 30‑day deletion. 6
- Quickstart Hermes Agent: clone the repo, launch the dashboard, and run a weekly status‑report workflow using the /goal command. 21