Built for agents from day one
Purpose-built for agent workloads with steadier multi-step runs, strong batch throughput, and pricing that fits real production traffic.
Better prices, more stable runs, and global availability — designed for agent workflows and production reliability. Access 100+ AI models with one API.
Trusted model providers on TokenHub
Built for production agents: global top models, stable multi-step execution, flexible billing, and compliance-ready operations.
Use one OpenAI-compatible interface to call multiple providers and model families without rewriting your integration each time.
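The idea behind an OpenAI-compatible gateway can be sketched in a few lines: the request shape stays fixed and only the model string changes. The model IDs below are illustrative placeholders, not documented TokenHub identifiers.

```python
# Minimal sketch: an OpenAI-style /chat/completions payload where swapping
# providers means changing one string, not the request wiring.

def make_chat_request(model: str, user_text: str) -> dict:
    """Build an OpenAI-compatible chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }

# Same wiring, two different upstream models (IDs are hypothetical):
req_a = make_chat_request("provider-a/general-model", "Summarize this ticket.")
req_b = make_chat_request("provider-b/coding-model", "Summarize this ticket.")
assert req_a["messages"] == req_b["messages"]  # only "model" differs
```

Because the payload schema is shared, evaluation harnesses and production services can switch model families by editing configuration rather than code.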
Global coverage and reliability-first routing help your assistants stay responsive across regions, peak windows, and upstream variance.
Pin specific models directly or choose from TokenHub-evaluated top performers to balance quality, speed, and cost for each workload.
Choose among pay-as-you-go, token plans, and cache-priority strategies to optimize cost and keep your spending predictable.
Enterprise-ready controls, clear data policies, and compliance workflows help teams scale safely in regulated environments.
API keys are issued server-side and shown only once — store yours in the TOKENHUB_API_KEY environment variable.
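Since the key is shown only once, services should load it from the environment rather than hard-coding it. A minimal helper, failing fast if the variable is unset:

```python
import os

def load_api_key() -> str:
    """Read the TokenHub API key from the environment.

    Raising early gives a clear failure at startup instead of an
    authentication error deep inside a request path.
    """
    key = os.environ.get("TOKENHUB_API_KEY")
    if not key:
        raise RuntimeError("Set TOKENHUB_API_KEY before starting the service")
    return key
```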
Explore model groups by use case and product strategy with one consistent API experience.
General intelligence for production assistants
Strong multimodal reasoning and long-context tasks
Efficient enterprise deployment across scenarios
High-value reasoning and coding capability
Maximum reasoning depth for complex agent loops
Balanced capability for most production agents
Low-latency agent response for interactive workflows
Understand screenshots, documents, and scenes
Generate commercial-quality visuals with one API
Create short videos for product and workflow demos
Handle transcription, synthesis, and voice interaction
Maximum code reasoning for complex engineering tasks
Optimized coding quality with stable latency
Fast coding completion for interactive IDE workflows
From API service to enterprise-dedicated clusters.
One unified, OpenAI-compatible API for mainstream models — switch without rewriting your integration.
Comprehensive coverage of global top models, agent models, coding models, and multimodal models.
Enterprise-grade access with dedicated capacity, SLAs, and optional private deployment for your workflows.
“Same OpenAI-compatible code, different models. Our eval harness could swap models without changing request wiring.”
— ML engineer · tooling
“Fallback saved a live demo. When one upstream degraded, the next pool kept the assistant responsive.”
— Product engineer · launch day
“Agent tool calling just works. No vendor-specific adapters needed for our function execution layer.”
— Platform dev · tool orchestration
“We can safely run CI with per-key limits and clear usage accounting — fewer surprises, easier budgets.”
— DevOps · CI guardrails
“Prompt caching reduced repeated system/context costs for our agent workflows.”
— Founder · cost optimization
“Batch jobs for long documents complete reliably. The routing layer keeps throughput steady overnight.”
— Backend · batch inference
“Model IDs are consistent in our production config. Routing picks the best upstream for latency and context size.”
— Engineering manager
“Billing matches successful completions. Retries don’t turn into surprise token bills.”
— FinOps · usage reconciliation
“Global coverage matters for us: interactive assistants feel faster with edge-aware routing.”
— Full-stack · global users
“Enterprise privacy and data processing agreements were clear. We can restrict providers to trusted options.”
— Security lead · compliance
Product launches and model availability updates.
April 30, 2026
April 26, 2026