Vol.01 · No.10 CS · AI · Infra May 13, 2026

AI Glossary

LLM & Generative AI

Tool Use

Plain Explanation

Before tool use, apps had to parse free-form text to guess what action an AI wanted to take, which was brittle and unsafe. You might write regex just to extract a date or detect that the model wanted to look up a price. That approach broke easily and could not trigger real actions reliably.

Tool use solves this by having the model request a specific tool with structured arguments, like submitting a form. The app (or the provider) then runs the actual code and returns the result to the model. This turns the model into a decision-maker rather than an executor, and it makes outputs predictable when you need a JSON shape instead of prose.
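A minimal sketch of this "form submission" idea, in the JSON Schema style several provider APIs use for tool definitions. The tool name, fields, and validation helper here are illustrative, not any specific API:

```python
# Illustrative tool definition: the schema tells the model which fields
# exist and which are required, so its call arrives as structured data.
price_tool = {
    "name": "get_stock_price",
    "description": "Look up the latest price for a ticker symbol.",
    "input_schema": {
        "type": "object",
        "properties": {
            "ticker": {"type": "string", "description": "e.g. 'AAPL'"},
        },
        "required": ["ticker"],
    },
}

# A tool call the model might emit: a name plus JSON arguments --
# like a submitted form, not free text the app must parse with regex.
tool_call = {"name": "get_stock_price", "arguments": {"ticker": "AAPL"}}

def validate_call(call, tool):
    """Check the call targets this tool and supplies all required fields."""
    if call["name"] != tool["name"]:
        return False
    required = tool["input_schema"].get("required", [])
    return all(field in call["arguments"] for field in required)

print(validate_call(tool_call, price_tool))  # True
```

Because the arguments are constrained by a schema, the app can trust field names and types instead of guessing at prose.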

Mechanically, the model returns a tool call with a name and JSON arguments, and either your app executes it and replies with a tool result (client-executed), or the provider runs it (server-executed) and streams back execution traces. Client tools run in a loop driven by stop reasons such as "tool_use"; server tools iterate inside the provider and can pause or finish without an extra round trip through your code.
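The client-executed loop can be sketched as follows. A scripted fake model stands in for the API call so the example runs offline; real SDK method and field names will differ, but the shape (stop reason `"tool_use"` → run tool → send back a tool result) is the point:

```python
def fake_model(messages):
    """Stand-in for a provider API call: requests a tool once, then answers."""
    if not any(m["role"] == "tool_result" for m in messages):
        return {"stop_reason": "tool_use",
                "tool_call": {"name": "get_time", "arguments": {}}}
    return {"stop_reason": "end_turn", "text": "It is 12:00."}

# Registry of tools the app is willing to execute on the model's behalf.
TOOLS = {"get_time": lambda **kw: "12:00"}

def run_agent(user_text):
    messages = [{"role": "user", "content": user_text}]
    while True:
        reply = fake_model(messages)
        if reply["stop_reason"] != "tool_use":
            return reply["text"]          # model is done; return final prose
        call = reply["tool_call"]
        result = TOOLS[call["name"]](**call["arguments"])  # app runs the code
        messages.append({"role": "tool_result", "content": result})

print(run_agent("What time is it?"))  # It is 12:00.
```

With server-executed tools this whole loop moves inside the provider, and your app only sees the final reply plus execution traces.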

Examples & Analogies

  • IT helpdesk automation: The model decides to call a ticketing tool with fields like title, severity, and assignee. Your service creates the ticket via the internal API and returns the ticket ID, which the model uses to confirm back to the user.
  • Finance data lookup: The model calls a pricing tool with a ticker and timestamp range to fetch current or historical prices not in its training data. The app runs the query and returns structured results that the model then summarizes.
  • Code scratchpad and file edits: The model invokes a code or text-edit tool to modify a local file with specific diffs. Your client applies the change and reports the updated content so the model can review or continue.
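The helpdesk example above can be made concrete. Here `create_ticket` is a hypothetical stand-in for an internal ticketing API; the model chose the tool and the field values, but the app performs the side effect and reports the ticket ID back:

```python
import itertools

_ids = itertools.count(1001)  # fake ID sequence for the pretend backend

def create_ticket(title, severity, assignee):
    """Pretend internal ticketing API: persist the ticket, return its ID."""
    return f"TICK-{next(_ids)}"

# Structured arguments as emitted by the model in its tool call:
call = {"name": "create_ticket",
        "arguments": {"title": "VPN down", "severity": "high",
                      "assignee": "netops"}}

ticket_id = create_ticket(**call["arguments"])   # the app executes
tool_result = {"tool": call["name"], "ticket_id": ticket_id}
print(tool_result)  # {'tool': 'create_ticket', 'ticket_id': 'TICK-1001'}
```

The model never touches the ticketing system; it only sees `tool_result` and can use the ID to confirm back to the user.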

At a Glance

                      Client-executed tools                Server-executed tools
Where code runs       Your app or environment              Provider infrastructure
Who drives the loop   You handle tool calls/results        Provider handles iterations
Response shape        Tool calls + your tool results       Final text + provider tool traces
Latency pattern       Extra round trips through your app   Internal iterations before reply
Failure handling      You catch errors and retry           Provider manages retries/caps

Pick client-executed tools when you must run custom logic or touch private systems, and server-executed tools when you want provider-managed execution and simpler integration.

Where and Why It Matters

  • When side effects are required: Actions like sending emails, writing files, or updating records move from "the model described it" to "the system actually did it" via tools.
  • Fresh/external data access: Tools fetch current information (e.g., web search or database queries) and feed it back, overcoming training cutoffs.
  • Structured outputs on demand: Tool schemas enforce JSON shapes so downstream systems can trust fields without parsing free text.
  • Cost and latency controls: Agentic requests can include multiple turns; providers expose knobs like turn limits and benefit from prompt caching to manage overhead.
  • Evaluation signals: Benchmarks that stress constrained tool use report low end-to-end success when strict compliance is required, highlighting reliability gaps teams must design around.
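One practical corollary of the points above: validate a tool call's arguments before executing it, and return a recoverable error message as the tool result rather than crashing, so the model can correct itself on the next turn. A small sketch (field names illustrative):

```python
# Required fields for a hypothetical pricing tool.
SCHEMA = {"required": ["ticker", "start", "end"]}

def check_args(args):
    """Return a recoverable error dict if fields are missing, else None."""
    missing = [k for k in SCHEMA["required"] if k not in args]
    if missing:
        # Send this back as the tool result so the model can retry correctly.
        return {"error": f"missing required fields: {', '.join(missing)}"}
    return None

print(check_args({"ticker": "AAPL"}))
# {'error': 'missing required fields: start, end'}
print(check_args({"ticker": "AAPL", "start": "2026-01-01", "end": "2026-02-01"}))
# None
```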

Common Misconceptions

  • Myth: The model executes the tool itself. → Reality: The model only emits a structured call; your app or the provider runs the code and returns results.
  • Myth: Using tools eliminates hallucinations. → Reality: Tools reduce guesswork for data and actions, but the model can still miscall tools or misread results.
  • Myth: One assistant turn equals one tool call. → Reality: A single turn can invoke multiple tools in parallel, and caps typically limit turns, not individual calls.
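The last point, one turn carrying several tool calls, looks roughly like this on the client side. The message structure and lookup table are illustrative, but the pattern of returning one result per call ID is common across providers:

```python
# One assistant turn containing two parallel tool calls.
turn = {
    "stop_reason": "tool_use",
    "tool_calls": [
        {"id": "c1", "name": "get_price", "arguments": {"ticker": "AAPL"}},
        {"id": "c2", "name": "get_price", "arguments": {"ticker": "MSFT"}},
    ],
}

# Fake price source so the example runs offline.
TOOLS = {"get_price": lambda ticker: {"AAPL": 187.0, "MSFT": 410.0}[ticker]}

# Execute each call and pair every result with its call ID, so the model
# can match results to requests when they come back in one message.
results = [
    {"tool_call_id": c["id"], "content": TOOLS[c["name"]](**c["arguments"])}
    for c in turn["tool_calls"]
]
print(results)
```

A turn cap would limit how many such assistant turns the agent may take, not how many calls appear inside each one.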

How It Sounds in Conversation

  • "Claude stopped with stop_reason: tool_use; I'll execute the DB tool and post the tool_result back."
  • "Let's set a turn limit so the agent doesn't spend minutes iterating server-side tools."
  • "The model proposed a function call with malformed args—add validation and return an error message the agent can recover from."
  • "Cost-wise, our prompt tokens spike across turns, but cached prompt tokens should offset a lot of that."
  • "Check provider-side tool usage traces vs tool_calls; only successful server-side executions are billable."
