Tiered Model Routing & Fallback

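Use the cheapest model that can handle the task, and route each query dynamically: a fast, low-cost classifier labels the query as simple, medium, or complex, and that label selects the model tier.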
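To make the cost argument concrete, here is a minimal sketch that computes the expected cost per query under routing, using the per-query prices from the decision flow in this section. The traffic mix (70% simple, 20% medium, 10% complex) is a hypothetical assumption chosen only for illustration, and `blended_cost` is an illustrative helper name, not something from the guide.

```python
# Per-query prices (USD) as listed in this section's decision flow.
CLASSIFIER_COST = 0.0001
TIER_COST = {"simple": 0.001, "medium": 0.03, "complex": 0.15}


def blended_cost(mix: dict[str, float]) -> float:
    """Expected cost per query for a traffic mix whose shares sum to 1.

    Every query pays for the classifier once, then for the tier it lands on.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return CLASSIFIER_COST + sum(TIER_COST[label] * share for label, share in mix.items())


# ASSUMPTION: hypothetical traffic mix, not measured data.
mix = {"simple": 0.7, "medium": 0.2, "complex": 0.1}

routed = blended_cost(mix)          # cost with tiered routing
always_top = TIER_COST["complex"]   # cost if every query went to Tier 3

print(f"routed: ${routed:.4f}/query, always Tier 3: ${always_top:.4f}/query")
```

Under this assumed mix, routing costs about $0.0218 per query against $0.15 for sending everything to Tier 3. The saving depends entirely on how much of the real traffic is simple, so the mix should be measured rather than guessed.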

Tiered Model Routing Decision Flow
flowchart TD
    Q["User Query"] --> CL["⚡ CLASSIFIER<br/>Flash/Haiku, under 200ms<br/>$0.0001/query"]
    CL -->|"simple<br/>(FAQ, lookup)"| T1["Tier 1: Flash<br/>$0.001/query<br/>~500ms"]
    CL -->|"medium<br/>(analysis, summary)"| T2["Tier 2: Pro<br/>$0.03/query<br/>~2s"]
    CL -->|"complex<br/>(reasoning, planning)"| T3["Tier 3: Opus<br/>$0.15/query<br/>~5s"]
    T1 --> R["Response"]
    T2 --> R
    T3 --> R
    style CL fill:#fff7e6,stroke:#c47e0a,stroke-width:2px
    style T1 fill:#f0fff4,stroke:#2d8659,stroke-width:2px
    style T2 fill:#fff7e6,stroke:#c47e0a,stroke-width:2px
    style T3 fill:#fff0f0,stroke:#c0392b,stroke-width:2px
    style R fill:#f0fff4,stroke:#2d8659,stroke-width:3px
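The decision flow above can be sketched as a small router. This is a minimal illustration under stated assumptions, not the guide's implementation: the `Tier` dataclass, the `route` function, the word-count `keyword_classifier`, and the stub model callables are all hypothetical stand-ins for real model clients. The fallback policy shown here (escalate to the next more capable tier when a call raises) is one plausible reading of "fallback"; the section itself does not specify the policy.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tier:
    name: str
    cost_usd: float              # approximate per-query cost from the diagram
    call: Callable[[str], str]   # hypothetical model client: query -> answer


# Cheapest to most capable; escalation walks this list left to right.
TIER_ORDER = ["simple", "medium", "complex"]


def route(query: str,
          classify: Callable[[str], str],
          tiers: dict[str, Tier]) -> tuple[str, str]:
    """Classify the query, call the matching tier, escalate on failure."""
    label = classify(query)
    # ASSUMPTION: an unrecognised label is sent to the most capable tier.
    start = TIER_ORDER.index(label) if label in TIER_ORDER else len(TIER_ORDER) - 1
    last_error: Exception | None = None
    for level in TIER_ORDER[start:]:
        tier = tiers[level]
        try:
            return tier.name, tier.call(query)
        except Exception as exc:  # timeouts, rate limits, provider errors
            last_error = exc
    raise RuntimeError("all tiers failed") from last_error


def keyword_classifier(query: str) -> str:
    """Toy stand-in for the small classifier model: buckets by word count."""
    words = len(query.split())
    if words <= 6:
        return "simple"
    if words <= 30:
        return "medium"
    return "complex"


def flaky_flash(query: str) -> str:
    """Stub that simulates a Tier 1 outage to exercise the fallback path."""
    raise TimeoutError("tier 1 unavailable")


tiers = {
    "simple": Tier("Flash", 0.001, flaky_flash),
    "medium": Tier("Pro", 0.03, lambda q: f"pro answer to: {q}"),
    "complex": Tier("Opus", 0.15, lambda q: f"opus answer to: {q}"),
}

name, answer = route("What are your opening hours?", keyword_classifier, tiers)
print(name, "->", answer)
```

In this run the query is classified as simple, the Tier 1 stub fails, and the router escalates to Tier 2. Escalating upward trades cost for availability; a production router would likely also cap retries and log which tier finally served each query.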