OpenAI API

Build and deploy AI agents with GPT‑5.4 through one API

Paid Some setup needed Web · API

About

Call GPT‑5.4 family models, wire up tools, and ship agentic chat or voice interfaces without managing infrastructure. Developers use it to build visual or code‑first agents, realtime voice apps, and multimodal features with clear per‑token pricing. Standouts are the 1.05M‑token context window and a Realtime API suited for sub‑second, voice‑centric interactions.

Editor's Take

Worth trying if you need sub‑second voice interactions, very long context handling, or a mix of visual agent design and code deployment; expect some beta‑era rough edges and plan for token costs on large uploads.

Key Features

Open a new agent in Agent Builder → assemble a workflow visually and test it in minutes
Install the Agents SDK → define tools and policies in code and deploy an agent endpoint
Stream mic audio to the Realtime API → get spoken responses with tool calls in sub‑second interactions
Send very long prompts → GPT‑5.4 processes up to a 1.05M‑token context with 128K max output
Pick GPT‑5.4 mini for cheaper runs → use the 400K context window while cutting input cost, or enable cached inputs for 10× lower input pricing

Use Cases

A product engineer shipping a voice coaching assistant that listens, talks back, and calls APIs during the conversation
A support tooling developer building a web chat agent with ChatKit to triage tickets and fetch account data
A data apps team migrating prompts to GPT‑5.4 to handle large briefs and specs without chunking

Try It Like This

1
Build a voice coaching assistant
Sign up and get an API key → stream mic audio to the Realtime API and receive sub-second spoken responses with tool calls → deploy the agent endpoint created in the Agents SDK to handle live coaching sessions and logging.
2
Create a multimodal support chat
Use Agent Builder to visually assemble a workflow that accepts images and text → connect tools (ticket lookup, account fetch) and test the flow in minutes → export the agent as an endpoint and integrate it into your web chat widget.
3
Migrate long-form briefs to GPT-5.4
Select GPT-5.4 (1.05M-token context) or GPT-5.4 mini for cost control → send very long prompts or full specs without chunking and validate outputs → measure token usage and enable cached inputs to reduce input cost where appropriate.
4
Prototype a realtime voice game
Install the Realtime API and stream player audio events → route model tool calls for game state and return spoken responses within sub-second latency → iterate on prompts and policies in the Agents SDK to refine turn-taking.
5
Deploy an API-driven agent endpoint
Install the Agents SDK and define tools, policies, and handlers in code → deploy an agent endpoint and run a smoke test with a REST/SDK call → monitor usage and switch models (e.g., GPT-5.4 mini) to balance latency and cost.

Pros & Cons

Pros

GPT‑5.4 family supports an exceptionally large 1.05M‑token context window (128K max output), letting you send very long prompts without chunking.
Realtime API is optimized for sub-second, voice‑centric interactions and can stream mic audio and spoken responses while invoking tools during conversation.
Agent Builder (visual) plus an Agents SDK (code) provide both low‑code workflow assembly and programmatic control for deploying agent endpoints; pricing is clear per token with options like GPT‑5.4 mini and cached inputs for lower input cost.

Cons

Assistants/Agents functionality is still evolving and has beta limitations—reviews suggest it’s more suited to prototyping now than mature production use.
Uploading files for knowledge retrieval feeds the entire document to the model and is billed for the full document token count, which can increase costs for large files.

Getting Started

1 Create an account at platform.openai.com and generate an API key.
2 Choose a GPT‑5.4 family model and follow the docs to make your first API call (chat or Realtime).
3 Hook up a simple tool or UI (ChatKit or SDK) and see the agent return a live response.

Pricing

Plan	Price	Includes
GPT-5.5	Input $5.00 / 1M tokens; Cached input $0.50 / 1M tokens; Output $30.00 / 1M tokens	Primary model pricing for GPT-5.5 (coming soon)
GPT-5.4	Input $2.50 / 1M tokens; Cached input $0.25 / 1M tokens; Output $15.00 / 1M tokens	Primary model pricing for GPT-5.4
GPT-5.4 mini	Input $0.75 / 1M tokens; Cached input $0.075 / 1M tokens; Output $4.50 / 1M tokens	Primary model pricing for GPT-5.4 mini
Multimodal models	Pricing for GPT-realtime-1.5, GPT-image-2, and Web search tools listed under Multimodal models	Prices shown per 1M tokens or per 1k calls for various modalities