Routing
TokenHub selects an upstream pool for each request based on routing policy and pool health, so your code does not need to switch vendors manually.
Policies
- Latency-first: prefer pools with lower recent p95 latency.
- Cost-first: prefer pools with lower unit cost when the SLA allows.
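The two policies can be pictured as different sort orders over the healthy pools. The following is a minimal sketch, not TokenHub's actual implementation: the `Pool` fields, the `rank_pools` function, the policy strings, and the `max_p95_ms` threshold (standing in for "when the SLA allows") are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pool:
    # Hypothetical fields; the real gateway's pool metadata may differ.
    name: str
    p95_latency_ms: float  # recent p95 latency
    unit_cost: float       # cost per unit of usage
    healthy: bool = True


def rank_pools(pools: List[Pool], policy: str,
               max_p95_ms: Optional[float] = None) -> List[Pool]:
    """Return healthy pools in preference order for the given policy."""
    eligible = [p for p in pools if p.healthy]
    if policy == "latency-first":
        # Lower recent p95 latency is preferred.
        return sorted(eligible, key=lambda p: p.p95_latency_ms)
    if policy == "cost-first":
        # Assumption: the SLA is modeled as a p95 latency ceiling.
        if max_p95_ms is not None:
            eligible = [p for p in eligible if p.p95_latency_ms <= max_p95_ms]
        return sorted(eligible, key=lambda p: p.unit_cost)
    raise ValueError(f"unknown policy: {policy}")


pools = [
    Pool("pool-a", p95_latency_ms=200, unit_cost=1.0),
    Pool("pool-b", p95_latency_ms=800, unit_cost=0.2),
    Pool("pool-c", p95_latency_ms=400, unit_cost=0.5, healthy=False),
]
print([p.name for p in rank_pools(pools, "latency-first")])
print([p.name for p in rank_pools(pools, "cost-first", max_p95_ms=500)])
```

In this sketch an unhealthy pool is never eligible, and under cost-first a cheaper pool is skipped if it exceeds the latency ceiling.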
Failover
If the selected pool returns an error or rate-limits the request, the gateway retries on the next eligible pool. You receive a single assistant message regardless of how many pools were tried, and billing is based on the successful completion.
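The failover behavior can be sketched as a loop over the eligible pools in preference order. This is an illustration under stated assumptions rather than the gateway's real code: `PoolError`, `RateLimited`, `complete_with_failover`, the `send` callback, and the shape of the returned dictionary are hypothetical, and attributing billing to the pool that completed is one reading of the rule above.

```python
from typing import Callable, Dict, List


class PoolError(Exception):
    """Hypothetical error raised when an upstream pool fails."""


class RateLimited(PoolError):
    """Hypothetical error raised when an upstream pool rate-limits."""


def complete_with_failover(pools: List[str],
                           send: Callable[[str], str]) -> Dict[str, object]:
    """Try each pool in order and return the first successful completion."""
    failed_attempts = []
    for pool in pools:
        try:
            message = send(pool)
        except PoolError as exc:
            # Record the failure and move on to the next eligible pool.
            failed_attempts.append((pool, type(exc).__name__))
            continue
        # Exactly one message is returned. Assumption: only this
        # successful attempt is billed.
        return {
            "message": message,
            "billed_pool": pool,
            "failed_attempts": failed_attempts,
        }
    raise PoolError("all eligible pools failed")


def fake_send(pool: str) -> str:
    # Stand-in for an upstream call: pool-a is rate-limited, pool-b works.
    if pool == "pool-a":
        raise RateLimited("429 from pool-a")
    return f"hello from {pool}"


result = complete_with_failover(["pool-a", "pool-b"], fake_send)
print(result["message"], result["billed_pool"])
```

The caller sees only the final message; the failed attempt on `pool-a` is recorded but never surfaces as a second response.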