Text Model Guide
Call large language models through the TokenHub inference gateway with OpenAI Chat Completions compatibility.
1. Basic Calling Information
| Item | Description |
|---|---|
| Base URL | https://tokenhub.link |
| Endpoint | POST https://tokenhub.link/v1/chat/completions |
| Authentication | Authorization: Bearer <TokenHub API Key>. Create the key under API Keys in the console; it is shown only once, at creation time. |
| Model Identifier | Set model in the request body. For supported models, see the Model Catalog. You can use either the bare model name or the provider/model format, for example deepseek-v3.2 or alibaba/deepseek-v3.2. |
| Content-Type | Must be application/json. |
2. cURL Examples
Non-streaming response:
curl -sS "https://tokenhub.link/v1/chat/completions" \
-H "Authorization: Bearer $TOKENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-plus",
"messages": [
{"role": "user", "content": "Hello, who are you?"}
],
"max_tokens": 256,
"temperature": 0.7
}'
Sample non-streaming response structure (HTTP 200):
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "I am a large language model. Nice to meet you."
      },
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null
    }
  ],
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 3019,
    "completion_tokens": 104,
    "total_tokens": 3123,
    "prompt_tokens_details": {
      "cached_tokens": 2048
    }
  },
  "created": 1735120033,
  "system_fingerprint": null,
  "model": "qwen-plus",
  "id": "chatcmpl-6ada9ed2-7f33-9de2-8bb0-78bd4035025a"
}
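Given a response with the structure above, the assistant text and token usage can be read out as follows. A minimal sketch in Python; the sample payload is abridged from the response shown above.

```python
# Extract the assistant reply and token usage from a non-streaming
# Chat Completions response (field names as in the sample above).
import json

sample = json.loads("""
{
  "choices": [{"message": {"role": "assistant",
               "content": "I am a large language model. Nice to meet you."},
               "finish_reason": "stop", "index": 0}],
  "object": "chat.completion",
  "usage": {"prompt_tokens": 3019, "completion_tokens": 104, "total_tokens": 3123},
  "model": "qwen-plus"
}
""")

def extract_reply(resp: dict):
    """Return the first choice's text and the total token count."""
    text = resp["choices"][0]["message"]["content"]
    total = resp["usage"]["total_tokens"]
    return text, total

reply, total_tokens = extract_reply(sample)
```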
Streaming (SSE):
curl -N "https://tokenhub.link/v1/chat/completions" \
-H "Authorization: Bearer $TOKENHUB_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen-plus",
"messages": [{"role":"user","content":"Write a short four-line poem."}],
"stream": true
}'
Sample streaming response structure (SSE chunks):
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"","function_call":null,"refusal":null,"role":"assistant","tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"I am","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"a large","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"language","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"model,","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"nice to","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"meet you.","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[{"delta":{"content":"","function_call":null,"refusal":null,"role":null,"tool_calls":null},"finish_reason":"stop","index":0,"logprobs":null}],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":null}
data: {"id":"chatcmpl-e30f5ae7-3063-93c4-90fe-beb5f900bd57","choices":[],"created":1735113344,"model":"qwen-plus","object":"chat.completion.chunk","service_tier":null,"system_fingerprint":null,"usage":{"completion_tokens":17,"prompt_tokens":22,"total_tokens":39,"completion_tokens_details":null,"prompt_tokens_details":{"audio_tokens":null,"cached_tokens":0}}}
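To reassemble the full reply on the client side, concatenate delta.content across the chunks. A sketch assuming the chunk format shown above; the [DONE] sentinel check is a common OpenAI-style convention and may or may not be emitted by the gateway.

```python
# Reassemble the streamed reply from SSE "data:" lines.
import json

def collect_content(sse_lines):
    """Concatenate delta.content across chat.completion.chunk events."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data:"):
            continue  # skip comments / blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # terminal sentinel, if the gateway sends one
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)

stream = [
    'data: {"choices":[{"delta":{"content":"I am","role":"assistant"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":" a large"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":" language model."},"finish_reason":"stop","index":0}]}',
]
full_text = collect_content(stream)
```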
3. Parameter Reference
Request body is compatible with OpenAI Chat Completions. Field support may vary by upstream model and routing policy.
| Field | Type | Description |
|---|---|---|
| model | string | Required. TokenHub model ID. |
| messages | array | Required. Multi-turn messages. Each item includes a role (for example system / user / assistant) and content. |
| stream | boolean | Optional. When true, the response is streamed via SSE. |
| max_tokens | integer | Optional. Maximum number of generated tokens. |
| temperature | number | Optional. Sampling temperature. |
| stop | string / array | Optional. Stop sequence(s). |
| tools | array | Optional. Tool definitions such as function calling (effective only when supported by the model and route). |
| tool_choice | ... | Optional. Works together with tools. |
| enable_thinking | boolean | Optional. Enables thinking mode for hybrid reasoning models. |
| enable_search | boolean | Optional. Enables web search. Default is false. |
Response: On success, HTTP 200 with a JSON body containing fields such as choices[].message.content and usage; in streaming mode, the response is returned as chunked SSE lines.
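The tools and tool_choice fields follow the OpenAI function-calling shape. A minimal sketch of a request body; the get_weather function and its schema are hypothetical examples, not TokenHub built-ins:

```python
# Hypothetical request body using the optional `tools` and `tool_choice`
# fields; `get_weather` is an illustrative function definition.
request_body = {
    "model": "qwen-plus",
    "messages": [{"role": "user", "content": "How is the weather in Beijing?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

Whether a tool call is actually produced depends on the model and route, per the table above.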
4. Common Scenarios
4.1 Single-Turn Q&A
{"model":"deepseek-v3.2","messages":[{"role":"user","content":"What is 1+1?"}]}
4.2 Multi-Turn Conversation
{
  "model": "deepseek-v3.2",
  "messages": [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "How is the weather in Beijing today?"},
    {"role": "assistant", "content": "I cannot access real-time weather data."},
    {"role": "user", "content": "Then what is the usual way to check it?"}
  ]
}
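Because the API is stateless, the client keeps the history and resends it each turn: append the assistant's reply, then the next user message. A sketch of that bookkeeping (no network call shown):

```python
# Maintain conversation history across turns: each request carries the
# full message list, including prior assistant replies.
def next_payload(history, user_text, model="deepseek-v3.2"):
    """Append the new user turn and build the next request body."""
    history.append({"role": "user", "content": user_text})
    return {"model": model, "messages": history}

history = [{"role": "system", "content": "You are a concise assistant."}]
payload = next_payload(history, "How is the weather in Beijing today?")
# After the API responds, append the assistant message and continue:
history.append({"role": "assistant", "content": "I cannot access real-time weather data."})
payload = next_payload(history, "Then what is the usual way to check it?")
```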
4.3 Python (OpenAI SDK)
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.getenv("TOKENHUB_API_KEY"),
    base_url="https://tokenhub.link/v1",
)

completion = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ],
)
print(completion.model_dump_json())
4.4 Node.js (OpenAI SDK)
import OpenAI from "openai";
const openai = new OpenAI({
  apiKey: process.env.TOKENHUB_API_KEY,
  baseURL: "https://tokenhub.link/v1",
});

async function main() {
  const completion = await openai.chat.completions.create({
    model: "deepseek-v3.2",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Who are you?" },
    ],
  });
  console.log(JSON.stringify(completion));
}

main();
4.5 Java (OpenAI SDK)
// OpenAI SDK version: 2.6.0
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.chat.completions.ChatCompletion;
import com.openai.models.chat.completions.ChatCompletionCreateParams;
public class Main {
    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("TOKENHUB_API_KEY"))
                .baseUrl("https://tokenhub.link/v1")
                .build();
        ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
                .addUserMessage("Who are you?")
                .model("deepseek-v3.2")
                .build();
        try {
            ChatCompletion chatCompletion = client.chat().completions().create(params);
            System.out.println(chatCompletion);
        } catch (Exception e) {
            System.err.println("Error occurred: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
5. Common Errors
- 401: Invalid API key (the key is invalid or has been revoked).
- 400: Invalid request body (missing model, messages, etc.).
- 4xx/5xx: Upstream or platform-side errors. Use backoff retries for 429/5xx when appropriate.
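The backoff retry for 429/5xx can be sketched as follows; the retry count and delays are illustrative defaults, not platform requirements:

```python
# Exponential-backoff retry wrapper for transient failures (e.g. HTTP 429/5xx).
import time

def call_with_backoff(fn, retries=3, base_delay=1.0):
    """Retry fn() on exceptions, sleeping base_delay * 2**attempt between tries."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Demonstration with a function that fails twice, then succeeds:
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("HTTP 429")
    return "ok"

result = call_with_backoff(flaky, retries=3, base_delay=0.0)
```

In practice, only retry on retryable statuses (429 and 5xx), not on 400/401, which will fail identically on every attempt.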