OpenAI Agents SDK

Build multi-agent workflows with Python and sandboxed, voice-capable agents

Some setup needed · API

About

Define agents with tools and guardrails, chain them with handoffs, and run long jobs in a sandbox. Python developers use it to build LLM apps, voice agents, and multi-step workflows on OpenAI APIs and other providers. It stands out for provider-agnostic support for 100+ LLMs, with Sandbox Agents and Realtime Agents built into the same SDK.

Editor's Take

We recommend this SDK for Python engineers who need a lightweight, provider-agnostic multi-agent framework with built-in Sandbox and Realtime capabilities. It is worth trying if you plan to orchestrate stateful or voice-enabled agents and want native tracing and human-in-the-loop support.

Key Features

Configure an agent with tools, guardrails, and handoffs → coordinate tasks across multiple specialized agents
Switch between the OpenAI Responses and Chat Completions APIs and 100+ other LLMs → keep the same agent code thanks to the provider-agnostic design
Run a long task inside a controlled container → get persistent filesystem work via Sandbox Agents (introduced in v0.14.0)
Add human approval steps to critical actions → enforce human-in-the-loop guardrails before execution
Install the voice extra and call gpt-realtime-1.5 → build an interruptible Realtime voice agent with tracing for debugging
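
As a rough illustration of the tools and handoffs features listed above, here is a minimal sketch that uses the SDK's `Agent` and `function_tool`. The agent names, instructions, and the in-memory order table are placeholders invented for this example, not part of the SDK.

```python
# Sketch only: agent names, instructions, and the order data are illustrative.


def lookup_order_status(order_id: str) -> str:
    """Return the shipping status for an order ID."""
    fake_orders = {"A100": "shipped", "A101": "processing"}  # stand-in for a real database
    return fake_orders.get(order_id, "unknown order")


def build_triage_agent():
    """Build a triage agent that can hand off to two specialists."""
    # Imported here so the helper above can be exercised without the SDK installed.
    from agents import Agent, function_tool

    orders_agent = Agent(
        name="Orders agent",
        handoff_description="Handles questions about order status.",
        instructions="Answer order questions using the lookup tool.",
        # function_tool wraps the plain function; the SDK builds the tool
        # schema from its signature and docstring.
        tools=[function_tool(lookup_order_status)],
    )
    billing_agent = Agent(
        name="Billing agent",
        handoff_description="Handles refunds and invoices.",
        instructions="Explain billing policy; never issue refunds yourself.",
    )
    # The triage agent owns no tools; it only decides which specialist takes over.
    return Agent(
        name="Triage agent",
        instructions="Route each request to the most suitable specialist.",
        handoffs=[orders_agent, billing_agent],
    )
```

With `OPENAI_API_KEY` set, something like `Runner.run_sync(build_triage_agent(), "Where is order A100?")` should run the workflow, with the reply available on `result.final_output`.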

Use Cases

A Python backend engineer orchestrating a multi-agent customer support workflow with handoffs, guardrails, and session history
A voice UX designer prototyping a call center agent using Realtime Agents and gpt-realtime-1.5 with human review gates
A data engineer running containerized research tasks via Sandbox Agents to fetch, write, and summarize files over long horizons

Try It Like This

1 Multi-agent customer support workflow
Define specialized agents for ticket parsing, knowledge lookup, and response generation → wire tools (search, DB, API calls) into each agent and configure handoffs for escalation → run sessions from a Python backend, persist history, and add human-approval gates for risky actions.
2 Realtime voice call center prototype
Install the voice extra and enable gpt-realtime-1.5 in an agent config → build a Realtime Agent with interruption handling and tracing enabled for debugging → route calls through the agent and insert human review steps before executing billing or sensitive updates.
3 Containerized long-running research job
Create a Sandbox Agent to run a long task inside the provided container and mount a persistent filesystem → let the agent fetch data, write intermediate files, and run iterative analysis over hours → collect summaries and artifacts back into your app once the job completes.
4 Prototype a multi-step data pipeline
Describe pipeline stages (ingest, clean, transform, summarize) and map each stage to a specialized agent with the necessary tools → chain agents with clear handoffs and error-handling guardrails → iterate locally in Python and switch LLM providers for cost or capability testing without changing agent code.
5 Human-in-the-loop approval for critical actions
Mark critical tool calls or state transitions to require human approval in the agent config → surface approval requests to reviewers via your frontend or webhook → only execute guarded actions after an approved human response, keeping audit logs for compliance.
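
The approval flow in the last recipe can be pictured with a framework-independent sketch. This is plain Python, not the SDK's own human-in-the-loop API: `ApprovalGate`, the `reviewer` callback, and the `refund` action are all hypothetical names chosen for illustration. In a real app the reviewer hook would call your frontend or webhook instead of a lambda.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalGate:
    """Run guarded actions only after a reviewer approves them."""

    # Hypothetical reviewer hook: receives (action name, arguments), returns True to approve.
    reviewer: Callable[[str, dict], bool]
    audit_log: list = field(default_factory=list)

    def run(self, name: str, action: Callable[..., object], **kwargs):
        approved = self.reviewer(name, kwargs)
        # Record every decision, approved or not, for later audit.
        self.audit_log.append({"action": name, "args": kwargs, "approved": approved})
        if not approved:
            return None  # the guarded action is never invoked
        return action(**kwargs)


def refund(order_id: str, amount: float) -> str:
    """Stand-in for a sensitive operation such as a billing update."""
    return f"refunded {amount:.2f} for {order_id}"


# Example policy: approve refunds up to 50, reject anything larger.
gate = ApprovalGate(reviewer=lambda name, args: args.get("amount", 0) <= 50)
small = gate.run("refund", refund, order_id="A100", amount=20.0)
large = gate.run("refund", refund, order_id="A101", amount=500.0)
```

The point of the design is that the decision and the audit entry happen in one place, so a rejected action leaves a log record but never executes.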

Pros & Cons

Pros

Provider-agnostic design supports 100+ LLMs so the same agent code can switch between OpenAI and other providers
Sandbox Agents run long tasks inside a controlled container with a persistent filesystem for multi-step or stateful jobs
Realtime Agents plus the voice extra enable interruptible voice agents with tracing and human-in-the-loop guardrails for safer production flows

Cons

Works best with OpenAI models; support for non-OpenAI providers is available but reportedly less polished
API costs can rise quickly for high-volume or compute-heavy agent workloads
Smaller community and ecosystem compared with more established frameworks like LangChain, which may mean fewer third-party integrations or examples

Getting Started

1 Install with pip install openai-agents (Python 3.10+ required), or add it to a uv project with uv add openai-agents
2 Create your first Agent with tools and run the examples, or start a Sandbox Agent
3 Trigger a multi-agent run and view the execution trace to confirm tool calls and handoffs work end-to-end
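
The steps above can be sketched as a small pre-flight check followed by a first run. The helper names `preflight` and `first_run` are made up for this example. The check assumes the SDK imports as `agents` and reads its key from `OPENAI_API_KEY`. An unrelated package that also happens to be named `agents` would make the import check pass by mistake.

```python
import importlib.util
import os
import sys


def preflight(version_info=sys.version_info, env=os.environ) -> list[str]:
    """Return a list of setup problems; an empty list means ready to run."""
    problems = []
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    # The package installs as 'openai-agents' but is imported as 'agents'.
    if importlib.util.find_spec("agents") is None:
        problems.append("run: pip install openai-agents")
    if not env.get("OPENAI_API_KEY"):
        problems.append("set OPENAI_API_KEY")
    return problems


def first_run(prompt: str) -> str:
    """Run one agent turn and return its final output (makes a real API call)."""
    # Imported lazily so preflight() works even when the SDK is missing.
    from agents import Agent, Runner

    agent = Agent(name="Assistant", instructions="Reply in one short sentence.")
    result = Runner.run_sync(agent, prompt)
    return result.final_output
```

A typical session would call `preflight()` first and only call `first_run("Say hello")` once the returned list is empty.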