Tool Use
Plain Explanation
Before tool use, apps had to parse free-form text to guess what action an AI wanted to take, which was brittle and unsafe. You might write regex just to extract a date or detect that the model wanted to look up a price. That approach broke easily and could not trigger real actions reliably.
Tool use solves this by having the model request a specific tool with structured arguments, like submitting a form. The app (or the provider) then runs the actual code and returns the result to the model. This turns the model into a decision-maker rather than an executor, and it makes outputs predictable when you need a JSON shape instead of prose.
Mechanically, the model returns a tool call with a name and JSON arguments, and either your app executes it and replies with a tool result (client-executed), or the provider runs it (server-executed) and streams back execution traces. Client tools run in a loop driven by stop reasons such as "tool_use"; server tools iterate inside the provider and can pause or finish without an extra round trip through your code.
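Concretely, the client-executed loop looks something like the sketch below. It assumes the Anthropic Python SDK; the model id is illustrative and get_price is a hypothetical stand-in for your own code.

```python
# Minimal client-executed tool loop: run model -> execute tool -> return result,
# repeating until the model stops asking for tools.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

tools = [{
    "name": "get_price",
    "description": "Look up the latest price for a stock ticker.",
    "input_schema": {
        "type": "object",
        "properties": {"ticker": {"type": "string"}},
        "required": ["ticker"],
    },
}]

def get_price(ticker: str) -> str:
    return "189.84"  # placeholder for a real market-data query

messages = [{"role": "user", "content": "What is AAPL trading at?"}]
while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # example model id
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # the model answered in plain text; the loop is done
    # Echo the assistant turn back, then execute each requested tool locally
    messages.append({"role": "assistant", "content": response.content})
    tool_results = [
        {"type": "tool_result", "tool_use_id": block.id, "content": get_price(**block.input)}
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": tool_results})

print(next(b.text for b in response.content if b.type == "text"))
```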
Examples & Analogies
- IT helpdesk automation: The model decides to call a ticketing tool with fields like title, severity, and assignee (a schema sketch follows this list). Your service creates the ticket via the internal API and returns the ticket ID, which the model uses to confirm back to the user.
- Finance data lookup: The model calls a pricing tool with a ticker and timestamp range to fetch current or historical prices not in its training data. The app runs the query and returns structured results that the model then summarizes.
- Code scratchpad and file edits: The model invokes a code or text-edit tool to modify a local file with specific diffs. Your client applies the change and reports the updated content so the model can review or continue.
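The tool itself is declared as a name, a description, and a JSON Schema for its arguments. A sketch of the ticketing tool from the helpdesk example above, with hypothetical field names:

```python
# Hypothetical create_ticket tool definition; the schema is what lets the
# model emit well-formed, validated arguments instead of free text.
create_ticket_tool = {
    "name": "create_ticket",
    "description": "Open a helpdesk ticket in the internal ticketing system.",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string", "description": "One-line summary of the issue"},
            "severity": {"type": "string", "enum": ["low", "medium", "high", "critical"]},
            "assignee": {"type": "string", "description": "Team or person to route the ticket to"},
        },
        "required": ["title", "severity"],
    },
}
```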
At a Glance
| | Client-executed tools | Server-executed tools |
|---|---|---|
| Where code runs | Your app or environment | Provider infrastructure |
| Who drives the loop | You handle tool calls/results | Provider handles iterations |
| Response shape | Tool calls + your tool results | Final text + provider tool traces |
| Latency pattern | Extra round trips through your app | Internal iterations before reply |
| Failure handling | You catch errors and retry | Provider manages retries/caps |
Pick client-executed tools when you must run custom logic or touch private systems, and server-executed tools when you want provider-managed execution and simpler integration.
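By contrast, a server-executed tool needs no loop in your code. A sketch using Anthropic's web search tool, reusing the client from the earlier sketch; the tool type string follows current docs and may change across versions.

```python
# Server-executed tool: the provider runs searches internally and returns
# final text plus trace blocks describing what it executed.
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # example model id
    max_tokens=1024,
    tools=[{"type": "web_search_20250305", "name": "web_search", "max_uses": 3}],
    messages=[{"role": "user", "content": "What did the S&P 500 close at today?"}],
)
print(response.content)  # final answer plus server-side tool-use trace blocks
```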
Where and Why It Matters
- When side effects are required: Actions like sending emails, writing files, or updating records move from "the model described it" to "the system actually did it" via tools.
- Fresh/external data access: Tools fetch current information (e.g., web search or database queries) and feed it back, overcoming training cutoffs.
- Structured outputs on demand: Tool schemas enforce JSON shapes so downstream systems can trust fields without parsing free text; a validation sketch follows this list.
- Cost and latency controls: Agentic requests can span multiple turns; providers expose knobs like turn limits, and prompt caching can offset the overhead of re-sending long prompts each turn.
- Evaluation signals: Benchmarks that stress constrained tool use report low end-to-end success when strict compliance is required, highlighting reliability gaps teams must design around.
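A defensive executor makes the structured-outputs point concrete: validate arguments against the tool's schema before running anything, and return failures as data the model can recover from. A sketch assuming the jsonschema package and hypothetical tool_schemas/handlers registries:

```python
from jsonschema import validate

def run_tool(block, tool_schemas, handlers):
    """Validate a tool call's arguments, execute it, and build a tool_result."""
    schema = tool_schemas[block.name]["input_schema"]
    try:
        validate(instance=block.input, schema=schema)  # raises on malformed args
        content, is_error = handlers[block.name](**block.input), False
    except Exception as exc:
        # Report the failure as data so the model can read it and retry
        content, is_error = f"Tool call failed: {exc}", True
    return {
        "type": "tool_result",
        "tool_use_id": block.id,
        "content": content,
        "is_error": is_error,
    }
```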
Common Misconceptions
- Myth: The model executes the tool itself. → Reality: The model only emits a structured call; your app or the provider runs the code and returns results.
- Myth: Using tools eliminates hallucinations. → Reality: Tools reduce guesswork for data and actions, but the model can still miscall tools or misread results.
- Myth: One assistant turn equals one tool call. → Reality: A single turn can invoke multiple tools in parallel, and caps typically limit turns, not individual calls (see the sketch below).
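A sketch of handling such a multi-tool turn, reusing the hypothetical run_tool executor above: each tool_use block gets its own tool_result, and all results go back together in a single user message.

```python
def handle_turn(response, tool_schemas, handlers):
    # One assistant turn may contain several parallel tool_use blocks
    tool_blocks = [b for b in response.content if b.type == "tool_use"]
    results = [run_tool(b, tool_schemas, handlers) for b in tool_blocks]
    return {"role": "user", "content": results}
```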
How It Sounds in Conversation
- "Claude stopped with stop_reason: tool_use; I'll execute the DB tool and post the tool_result back."
- "Let's set a turn limit so the agent doesn't spend minutes iterating server-side tools."
- "The model proposed a function call with malformed args—add validation and return an error message the agent can recover from."
- "Cost-wise, our prompt tokens spike across turns, but cached prompt tokens should offset a lot of that."
- "Check provider-side tool usage traces vs tool_calls; only successful server-side executions are billable."
References
- How to implement tool use (Anthropic Docs)
Implementation guide covering tool schemas, tool_use blocks, host-side execution, and returning results.
- Tool use with Claude
Official overview of tool definitions, execution models, and the agentic loop.
- Tools — Model Context Protocol (MCP)
Specification for tools exposed by MCP servers and invoked by clients.
- Function calling
Describes function/tool calling with JSON schemas and host-side execution.
- Function calling with the Gemini API
Covers function declarations, automatic calling, and OpenAPI-compatible schemas.