API Documentation
Base URL: https://api.kimi.villamarket.ai
Authentication
All API requests except /health require a Bearer token in the Authorization header.
Authorization: Bearer your-api-key

POST /v1/chat/completions
Create a chat completion. Supports streaming via "stream": true. Kimi K2.5 is a reasoning model — responses may include internal reasoning tokens before the final answer.
Request Body
| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID. Use "kimi-k2.5" |
| messages | array | Array of message objects with role and content |
| max_tokens | integer | Maximum tokens to generate (default: 4096) |
| stream | boolean | Enable server-sent events streaming (default: false) |
| temperature | number | Sampling temperature, 0–2 (default: 0.6) |
| tools | array | List of tool/function definitions for function calling (see the sketch below) |
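The tools parameter follows the OpenAI function-calling format. A minimal sketch using the Python client shown further down; the get_weather tool and its schema are illustrative and not part of this API:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.kimi.villamarket.ai/v1",
    api_key="your-api-key"
)

# Hypothetical tool definition in the standard OpenAI function-calling
# shape: name, description, and JSON Schema parameters.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }
}]

response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[{"role": "user", "content": "What's the weather in Bangkok?"}],
    tools=tools
)

# If the model decides to call a tool, the call shows up on message.tool_calls.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)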
curl Example
curl https://api.kimi.villamarket.ai/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [{"role": "user", "content": "Explain quantum computing in simple terms"}],
    "max_tokens": 1024
  }'

Python Example
from openai import OpenAI

client = OpenAI(
    base_url="https://api.kimi.villamarket.ai/v1",
    api_key="your-api-key"
)

# Non-streaming
response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=1024
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[{"role": "user", "content": "Write a haiku about AI"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
print()

Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "kimi-k2.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing uses quantum bits..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 128,
    "total_tokens": 143
  }
}

GET /v1/models
List available models.
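With the OpenAI Python SDK (as in the chat example above), the endpoint can also be queried like this; a minimal sketch, assuming the server returns standard OpenAI-style model objects:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.kimi.villamarket.ai/v1",
    api_key="your-api-key"
)

# Iterate over the available models; expect "kimi-k2.5" among the IDs.
for model in client.models.list():
    print(model.id)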
curl https://api.kimi.villamarket.ai/v1/models \
  -H "Authorization: Bearer your-api-key"

Response
{
  "object": "list",
  "data": [
    {
      "id": "kimi-k2.5",
      "object": "model",
      "owned_by": "moonshot-ai"
    }
  ]
}

GET /health
Check server health. No authentication required.
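Equivalent to the curl call below, a quick check from Python (assumes the requests package is installed):

import requests

# The health endpoint needs no Authorization header.
resp = requests.get("https://api.kimi.villamarket.ai/health", timeout=5)
print(resp.status_code, resp.text)  # expect 200 and "healthy"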
curl https://api.kimi.villamarket.ai/health

Response
"healthy"Notes
Reasoning tokens: Kimi K2.5 is a reasoning model. Responses may include internal chain-of-thought tokens that count toward output token usage; the final answer follows the reasoning (see the usage sketch after these notes).
Rate limits: Usage is metered per API key. Default budget: $10 per key. Contact us for higher limits.
Streaming: Use "stream": true for real-time token-by-token output via Server-Sent Events (SSE).
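Because reasoning tokens are billed as output, usage.completion_tokens can be larger than the visible answer suggests. A minimal sketch of reading the usage block with the client from the examples above (the prompt is illustrative):

from openai import OpenAI

client = OpenAI(
    base_url="https://api.kimi.villamarket.ai/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    max_tokens=2048  # generous limit, since internal reasoning also counts as output
)

print(response.choices[0].message.content)
# completion_tokens includes any internal reasoning tokens.
print("completion tokens:", response.usage.completion_tokens)
print("total tokens:", response.usage.total_tokens)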