Vol.01 · No.10 CS · AI · Infra May 14, 2026

AI Glossary


Model Router

Plain Explanation

A model router decides which AI model should handle a request. A simple classification task might go to a small model, while a hard reasoning or high-risk request goes to a stronger model. The goal is to reduce cost and latency without losing too much quality.

Examples & Analogies

It is like a support center that sends easy questions to automation and complex cases to a human specialist. A short document-classification request can go to an SLM, a complex debugging task to a frontier LLM, and an image-plus-text request to a multimodal model.
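
The examples above can be sketched as a small rule-based router. This is a minimal illustration, not a production design: the `Request` fields, the `route` function, and the tier names (`small-slm`, `frontier-llm`, `multimodal-model`) are all hypothetical placeholders, and a real router would usually rely on a learned classifier or richer policy rather than hand-written rules.

```python
from dataclasses import dataclass

# Hypothetical tier names, not real model identifiers.
SMALL = "small-slm"
FRONTIER = "frontier-llm"
MULTIMODAL = "multimodal-model"


@dataclass
class Request:
    task: str                # e.g. "classification", "debugging", "chat"
    has_image: bool = False  # True for image-plus-text requests
    high_risk: bool = False  # True when a wrong answer would be costly


def route(req: Request) -> str:
    """Pick a model tier using simple hand-written rules (assumed policy)."""
    if req.has_image:
        return MULTIMODAL
    if req.high_risk or req.task in {"debugging", "reasoning"}:
        return FRONTIER
    if req.task == "classification":
        return SMALL
    # Unknown task types default to the stronger model to protect quality.
    return FRONTIER


if __name__ == "__main__":
    print(route(Request(task="classification")))
    print(route(Request(task="debugging")))
    print(route(Request(task="chat", has_image=True)))
```

Defaulting unknown task types to the stronger model trades some cost for safety; a cost-sensitive product might choose the opposite default and lean on a fallback step instead.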

At a Glance

| Dimension | Single model | Model router |
| --- | --- | --- |
| Method | Same model for all requests | Select model per request |
| Strength | Simple operation | Cost and latency optimization |
| Risk | Easy requests become expensive | Wrong routing can hurt quality |
| Required pieces | One model and API | Classifier, policy, fallback, logs |

Where and Why It Matters

When an AI product uses multiple models, model choice becomes a product and infrastructure decision. A router can send easy traffic to cheaper models and escalate difficult traffic to stronger models. In agentic systems, it may also need to account for tool-use capability.

Common Misconceptions

  • Myth: A router just picks the cheapest model.
  • Reality: It should pick the cheapest model that still satisfies quality and safety requirements.
  • Myth: Prompt length is enough to estimate difficulty.
  • Reality: Task type, tool needs, risk, modality, and expected output matter.
  • Myth: Routing mistakes are harmless.
  • Reality: Bad routing can reduce quality, increase cost, or create safety failures.
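
The first two points can be illustrated with a sketch that filters candidate models by requirements and only then minimizes cost. Everything here is assumed for illustration: the `Candidate` fields, the `pick_cheapest_adequate` function, and the cost and quality figures are made-up placeholders, not measurements of any real model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    cost_per_request: float  # illustrative cost units
    est_quality: float       # assumed 0..1 quality estimate for the task type
    safety_approved: bool    # cleared for high-risk requests
    supports_tools: bool     # can handle tool-calling requests


# Hypothetical model pool with invented numbers.
CANDIDATES = [
    Candidate("small", 1.0, 0.70, safety_approved=False, supports_tools=False),
    Candidate("medium", 5.0, 0.85, safety_approved=True, supports_tools=True),
    Candidate("frontier", 20.0, 0.95, safety_approved=True, supports_tools=True),
]


def pick_cheapest_adequate(
    candidates: List[Candidate],
    min_quality: float,
    high_risk: bool = False,
    needs_tools: bool = False,
) -> Optional[Candidate]:
    """Return the cheapest candidate that meets every requirement, else None."""
    adequate = [
        c for c in candidates
        if c.est_quality >= min_quality
        and (c.safety_approved or not high_risk)
        and (c.supports_tools or not needs_tools)
    ]
    if not adequate:
        return None  # caller decides: refuse, queue, or escalate to a human
    return min(adequate, key=lambda c: c.cost_per_request)
```

Cost only enters after the quality, risk, and tool filters have run, which is why the same low quality bar can still select a more expensive model once a request is marked high-risk or needs tools.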

How It Sounds in Conversation

  • "Send routine requests to the small model and escalate failures to the frontier model."
  • "Router accuracy is less important than end-to-end correctness and cost per request."
  • "Tool-calling requests need a different routing policy from normal chat."
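
The fallback and cost-per-request ideas in these quotes can be sketched as a two-step cascade. This is a toy under stated assumptions: `small_model` and `frontier_model` are stubs rather than real API calls, the failure signal (returning `None`) stands in for whatever validation a real system would run, and the `COST` values are invented units.

```python
from typing import List, Optional, Tuple

# Illustrative cost units per call, not real prices.
COST = {"small": 1.0, "frontier": 20.0}


def small_model(prompt: str) -> Optional[str]:
    """Stub: pretend the small model cannot handle debugging requests."""
    if "debug" in prompt.lower():
        return None  # signals a failed or low-confidence answer
    return "small-model answer"


def frontier_model(prompt: str) -> str:
    """Stub: pretend the frontier model always produces an answer."""
    return "frontier-model answer"


def answer_with_fallback(prompt: str) -> Tuple[str, float]:
    """Try the small model first; escalate to the frontier model on failure."""
    cost = COST["small"]
    answer = small_model(prompt)
    if answer is None:
        answer = frontier_model(prompt)
        cost += COST["frontier"]  # a failed first attempt is still paid for
    return answer, cost


def mean_cost_per_request(prompts: List[str]) -> float:
    """Average total cost across requests, including escalations."""
    return sum(answer_with_fallback(p)[1] for p in prompts) / len(prompts)


if __name__ == "__main__":
    traffic = ["classify this ticket", "debug this stack trace"]
    print(mean_cost_per_request(traffic))
```

An escalated request costs more than sending it to the frontier model directly, so the cascade only pays off when most traffic succeeds on the small model. That is one reason to track cost per request end to end rather than router accuracy alone.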
