Smart Router
How Franklin picks the best model for every request — routing in <1ms with 7-layer compression.
How Classification Works
Every time you send a message, the Smart Router classifies your request and picks the best model in under 1ms. Classification runs 100% locally, so it adds no network latency. It analyzes prompt complexity, required capabilities (code, reasoning, creativity, trading), and your active routing profile. Before routing, 7-layer prompt compression reduces token count by 15-40%, cutting costs before the request ever reaches the provider.
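The seven compression layers aren't spelled out here, so the passes below are purely illustrative. A minimal sketch of the layered design, assuming each layer is an independent, meaning-preserving text transform applied in sequence:

```python
import re

# Hypothetical passes -- the actual 7 layers are not documented publicly.
def collapse_whitespace(text: str) -> str:
    # Collapse runs of spaces/tabs; newlines are kept for line-level passes.
    return re.sub(r"[ \t]+", " ", text).strip()

def drop_filler(text: str) -> str:
    # Remove filler words that rarely change model output.
    for filler in ("please", "kindly", "basically"):
        text = re.sub(rf"\b{filler}\b\s*", "", text, flags=re.IGNORECASE)
    return text

def dedupe_lines(text: str) -> str:
    # Drop exact duplicate lines (e.g. repeated pasted context).
    seen, out = set(), []
    for line in text.splitlines():
        if line not in seen:
            seen.add(line)
            out.append(line)
    return "\n".join(out)

LAYERS = [collapse_whitespace, drop_filler, dedupe_lines]  # up to 7 in practice

def compress(prompt: str) -> str:
    for layer in LAYERS:
        prompt = layer(prompt)
    return prompt
```

Chaining small, independently testable passes is what makes per-layer savings (here, the 15-40% figure) easy to measure and tune.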
Complexity Tiers
The router classifies requests into four tiers, each mapped to a different class of model:
- SIMPLE — quick facts, greetings, formatting. Routed to fast, cheap models.
- MEDIUM — summarization, code edits, general Q&A. Routed to mid-tier models.
- COMPLEX — multi-step code generation, analysis, long-form writing. Routed to frontier models.
- REASONING — math proofs, deep debugging, architectural design. Routed to reasoning-optimized models.
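The exact signals the classifier uses aren't documented. As a rough heuristic sketch (the keywords and word-count thresholds below are illustrative assumptions, not Franklin's actual features), tier assignment could look like:

```python
import re

TIERS = ("SIMPLE", "MEDIUM", "COMPLEX", "REASONING")

def classify(prompt: str) -> str:
    # Hypothetical keyword/length heuristics -- placeholders for the real classifier.
    words = len(prompt.split())
    if re.search(r"\b(prove|derive|debug|architecture)\b", prompt, re.IGNORECASE):
        return "REASONING"
    if re.search(r"\b(implement|refactor|analy[sz]e)\b", prompt, re.IGNORECASE) or words > 200:
        return "COMPLEX"
    if words > 20 or re.search(r"\b(summariz|edit|explain)", prompt, re.IGNORECASE):
        return "MEDIUM"
    return "SIMPLE"
```

A purely local lookup like this is what makes sub-millisecond classification plausible: no model call, just cheap pattern checks.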
Routing Profiles
Switch between four profiles to control the quality-cost tradeoff:
- auto — best quality-to-cost ratio (default). Picks the optimal model for each tier.
- eco — cheapest option at every tier. Great for high-volume, cost-sensitive work.
- premium — most capable model at every tier. Use when quality matters more than cost.
- free — NVIDIA models only. No wallet required — always available.
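Conceptually, the active profile and the complexity tier together index into a routing table. A toy version (every model name below is a placeholder, not Franklin's actual mapping):

```python
# Illustrative routing table: profile -> tier -> model.
ROUTING = {
    "auto":    {"SIMPLE": "fast-mini", "MEDIUM": "mid-tier",  "COMPLEX": "frontier", "REASONING": "reasoner"},
    "eco":     {"SIMPLE": "fast-mini", "MEDIUM": "fast-mini", "COMPLEX": "mid-tier", "REASONING": "mid-tier"},
    "premium": {"SIMPLE": "frontier",  "MEDIUM": "frontier",  "COMPLEX": "frontier", "REASONING": "reasoner"},
    "free":    {"SIMPLE": "nvidia-s",  "MEDIUM": "nvidia-m",  "COMPLEX": "nvidia-l", "REASONING": "nvidia-l"},
}

def pick_model(profile: str, tier: str) -> str:
    return ROUTING[profile][tier]
```

Note how eco shifts every tier one class cheaper while premium pins everything to the top — the table is the whole quality-cost tradeoff.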
```shell
# Switch routing profiles
/model auto       # balanced (default)
/model eco        # cheapest
/model premium    # most capable
/model free       # NVIDIA only, no cost
```

Profile vs. specific model
You can also pin an exact model, e.g. /model claude-sonnet-4.6 or /model gpt-5. The router is only active when a profile is selected.

Tracking Spend
Use /cost to see a per-model spend breakdown for the current session:
```shell
/cost
```

This shows total tokens used, cost per model, and which routing tier each request was classified into.
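A per-session ledger along these lines could back the /cost report. This is a sketch for illustration only — the class and field names are assumptions, not Franklin's internals:

```python
from collections import defaultdict

class CostTracker:
    """Hypothetical session ledger: per-model tokens, spend, and tiers."""

    def __init__(self):
        self.by_model = defaultdict(lambda: {"tokens": 0, "cost": 0.0, "tiers": []})

    def record(self, model: str, tokens: int, usd_per_1k: float, tier: str):
        entry = self.by_model[model]
        entry["tokens"] += tokens
        entry["cost"] += tokens / 1000 * usd_per_1k
        entry["tiers"].append(tier)

    def report(self) -> dict:
        total = sum(e["cost"] for e in self.by_model.values())
        return {"total_cost": round(total, 4), "models": dict(self.by_model)}
```

Keying the ledger by model (rather than by request) is what makes the per-model breakdown a constant-time lookup at report time.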
Adaptive Learning
The router learns from your usage patterns over time. If you consistently override the router's choice for certain types of prompts, it adapts future routing decisions to match your preferences. This happens automatically — no configuration needed.
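One simple way to implement this kind of adaptation is to count per-tier overrides and switch once a clear preference emerges. A sketch under that assumption — the real learning mechanism isn't documented, and the threshold rule below is invented for illustration:

```python
from collections import Counter

class OverrideLearner:
    """Hypothetical preference learner: adopt the user's pick after repeated overrides."""

    def __init__(self, threshold: int = 3):
        # (tier, model) -> times the user picked that model over the router's choice
        self.overrides = Counter()
        self.threshold = threshold

    def record_override(self, tier: str, chosen_model: str):
        self.overrides[(tier, chosen_model)] += 1

    def preferred(self, tier: str):
        # Return the user's model for this tier once overrides pass the threshold.
        candidates = {m: n for (t, m), n in self.overrides.items() if t == tier}
        if candidates:
            model, count = max(candidates.items(), key=lambda kv: kv[1])
            if count >= self.threshold:
                return model
        return None  # fall back to the router's default choice
```

Returning None until the threshold is met keeps the default routing stable, so one-off experiments don't flip future decisions.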