MaaS_Kimi
Request Protocol
http
Header
| Parameter Name | Type | Required | Description |
|---|---|---|---|
| Content-Type | string | Yes | Fixed to application/json |
| Authorization | string | Yes | Bearer {your_api_key} |
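As a minimal sketch, the two headers above could be assembled in Python like this. The helper name `build_headers` and the environment variable `MAAS_API_KEY` are assumptions made for illustration; they are not part of the API.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return the two required request headers for a given API key."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }


# MAAS_API_KEY is a hypothetical variable name; keep the key wherever suits you.
headers = build_headers(os.environ.get("MAAS_API_KEY", "your_api_key"))
```

Reading the key from the environment keeps it out of source files; the fallback string is only a placeholder.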
Request URL
POST https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions
Request Parameters
| Parameter | Type | Required | Default Value | Description |
|---|---|---|---|---|
| model | string | Yes | - | Model identifier. |
| messages | array | Yes | - | Input messages. Each message contains role and content |
| temperature | number | No | 0.6 | Sampling temperature, range 0 to 2. A lower value makes the output more deterministic |
| max_tokens | integer | No | 1024 | Maximum number of tokens generated. The model's maximum value is 128000 |
| top_p | number | No | 1.0 | Nucleus sampling threshold. An alternative to temperature |
| frequency_penalty | number | No | 0 | Penalize repeated tokens. Range: -2.0 to 2.0 |
| presence_penalty | number | No | 0 | Penalize tokens based on occurrence. Range: -2.0 to 2.0 |
| stream | boolean | No | false | Whether to stream the response incrementally |
| n | integer | No | 1 | Number of completions to generate |
| stop | string/array | No | null | Stop sequences (up to 4) |
| user | string | No | null | Unique identifier of the end user, used for tracking |
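The ranges and defaults in the table can be checked on the client before a request is sent. The following is a sketch only: `build_payload` is a hypothetical helper, and whether the service enforces the same limits on its side is not stated here.

```python
from typing import Optional, Sequence, Union


def build_payload(
    model: str,
    messages: list,
    temperature: float = 0.6,
    max_tokens: int = 1024,
    top_p: float = 1.0,
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
    stream: bool = False,
    n: int = 1,
    stop: Optional[Union[str, Sequence[str]]] = None,
) -> dict:
    """Check arguments against the documented ranges and build the request body."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs role and content")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens > 128000:
        raise ValueError("max_tokens must not exceed 128000")
    penalties = (
        ("frequency_penalty", frequency_penalty),
        ("presence_penalty", presence_penalty),
    )
    for name, value in penalties:
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")
    if not isinstance(stop, str) and stop is not None and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
        "stream": stream,
        "n": n,
    }
    if stop is not None:
        # Only send stop when it was set, so the server default (null) applies otherwise.
        payload["stop"] = stop if isinstance(stop, str) else list(stop)
    return payload


payload = build_payload(
    "moonshot-v1-8k",
    [{"role": "user", "content": "hello?"}],
    temperature=0.3,
)
```

Failing early on an out-of-range value gives a clearer error than a rejected HTTP request.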
Streaming
Example of Curl Request
curl 'https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions' \
--header 'Authorization: Bearer ${your AK}' \
--header 'Content-Type: application/json' \
--data '{
"model": "moonshot-v1-8k",
"stream": true,
"messages": [
{
"role": "system",
"content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in conversations in Chinese and English. You will provide users with safe, helpful, and accurate responses. At the same time, you will refuse to answer any questions related to terrorism, racial discrimination, explicit violence, or other inappropriate content. Moonshot AI is a proper noun and should not be translated into other languages."
},
{
"role": "user",
"content": "hello?"
},
{
"partial": true,
"role": "assistant",
"content": "hello,"
}
],
"temperature": 0.3
}'
Python Request Example
import requests
import json
url = "https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions"
headers = {
"Authorization": "Bearer ${your AK}",
"Content-Type": "application/json"
}
data = {
"model": "moonshot-v1-8k",
"stream": True,
"messages": [
{
"role": "system",
"content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in conversations in Chinese and English. You will provide users with safe, helpful, and accurate responses. At the same time, you will refuse to answer any questions related to terrorism, racial discrimination, explicit violence, or other inappropriate content. Moonshot AI is a proper noun and should not be translated into other languages."
},
{
"role": "user",
"content": "hello?"
},
{
"partial": True,
"role": "assistant",
"content": "hello,"
}
],
"temperature": 0.3
}
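# stream=True lets requests hand over the body as it arrives instead of buffering it
response = requests.post(url, headers=headers, json=data, stream=True)
# Handle response
if response.status_code == 200:
    # A streamed body is not a single JSON document, so print it line by line
    for line in response.iter_lines(decode_unicode=True):
        if line:
            print(line)
else:
    print(f"error: {response.status_code}")
    print(response.text)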
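To turn a streamed response into text, each received line has to be parsed. The sketch below assumes the endpoint follows the OpenAI-style server-sent-events convention: lines prefixed with `data:`, a final `data: [DONE]` line, and incremental text under `choices[0].delta.content`. This page does not confirm that format, so treat the field names as assumptions; `iter_stream_text` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (assumed OpenAI-style chunk format)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines for illustration, not captured from the service.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "nice to "}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "meet you"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With `requests`, the same function could be fed `response.iter_lines(decode_unicode=True)` from a call made with `stream=True`.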
Tool call
Example of Curl Request
curl 'https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions' \
--header 'Authorization: Bearer ${your AK}' \
--header 'Content-Type: application/json' \
--data '{
"model": "grok-4-0709",
"messages": [{"role": "user", "content": "Calculate the straight-line distance between Beijing and Shanghai"}],
"tools": [{
"type": "function",
"function": {
"name": "get_distance",
"description": "Calculate the straight-line distance between two locations",
"parameters": {
"type": "object",
"properties": {
"location1": {"type": "string", "description": "Starting city"},
"location2": {"type": "string", "description": "Destination city"}
},
"required": ["location1", "location2"]
}
}
}],
"tool_choice": "auto"
}'
Python Request
import requests
import json
url = "https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions"
headers = {
"Authorization": "Bearer ${your AK}",
"Content-Type": "application/json"
}
data = {
"model": "grok-4-0709",
"messages": [
{
"role": "user",
"content": "Calculate the straight-line distance between Beijing and Shanghai"
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_distance",
"description": "Calculate the straight-line distance between two locations",
"parameters": {
"type": "object",
"properties": {
"location1": {
"type": "string",
"description": "Starting city"
},
"location2": {
"type": "string",
"description": "Destination city"
}
},
"required": ["location1", "location2"]
}
}
}
],
"tool_choice": "auto"
}
# Send request
response = requests.post(url, headers=headers, json=data)
# Process response
if response.status_code == 200:
    result = response.json()
    print(json.dumps(result, indent=2, ensure_ascii=False))
    # Check if there are tool calls
    message = result.get("choices", [{}])[0].get("message", {})
    if message.get("tool_calls"):
        print("\nModel requested to call tool:")
        for tool_call in message["tool_calls"]:
            print(f"  Function name: {tool_call['function']['name']}")
            print(f"  Arguments: {tool_call['function']['arguments']}")
else:
    print(f"Request failed, status code: {response.status_code}")
    print(response.text)
Response Example
{
"id": "chatcmpl-68770593a58dcb3e99966225",
"choices": [
{
"index": 0,
"logprobs": null,
"message": {
"role": "assistant",
"content": "Let me help you calculate the straight-line distance between Beijing and Shanghai.",
"reasoning_content": null,
"function_call": null,
"tool_calls": [
{
"id": "get_distance:0",
"type": "function",
"function": {
"arguments": "{\"location1\": \"Beijing\", \"location2\": \"Shanghai\"}",
"name": "get_distance"
}
}
],
"reasoning_details": null
},
"finish_reason": "tool_calls",
"native_finish_reason": null
}
],
"created": 1752630675,
"model": "kimi-k2-0711-preview",
"object": "chat.completion",
"system_fingerprint": null,
"usage": {
"prompt_tokens": 83,
"completion_tokens": 33,
"total_tokens": 116,
"completion_tokens_details": null,
"prompt_tokens_details": null
}
}
⚠️ Attention: Copy the id of the tool call returned in the first response:
"tool_calls": [
{
"id": "get_distance:0",
"type": "function",
"function": {
"arguments": "{\"location1\": \"Beijing\", \"location2\": \"Shanghai\"}",
"name": "get_distance"
}
}
],
into the tool_call_id field of the tool message in the second request:
curl --location 'https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions' \
--header 'Authorization: Bearer ${your AK}' \
--header 'Content-Type: application/json' \
--data '{
"max_tokens": 4096,
"messages": [
{
"role": "user",
"content": "Calculate the straight-line distance between Beijing and Shanghai"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "get_distance:0",
"type": "function",
"function": {
"name": "get_distance",
"arguments": "{\"location1\":\"Beijing\",\"location2\":\"Shanghai\"}"
}
}
]
},
{
"role": "tool",
"tool_call_id": "get_distance:0",
"name": "get_distance",
"content": "1000"
}
]
}'
Response Example
{
"id": "90b265eb9fec4fdfb42275bf020deb47",
"choices": [
{
"index": 0,
"logprobs": null,
"message": {
"role": "assistant",
"content": "According to the calculation, **the straight-line distance between Beijing and Shanghai is approximately 1,000 km** (great-circle distance).\n\nAdditional notes:\n- This is the straight-line distance between the two cities, not accounting for terrain or routes\n- Actual travel distances are longer:\n - High-speed rail distance: approximately 1,318 km\n - Driving distance: approximately 1,200-1,300 km\n - Flight route distance: approximately 1,100 km (affected by flight path planning)\n\nBeijing is located at 39°54′N, 116°23′E; Shanghai is located at 31°14′N, 121°29′E. The longitudes of the two cities are similar, with a latitude difference of about 8.7°, which corresponds to a straight-line distance of approximately 1,000 km.",
"refusal": null,
"annotations": null,
"images": null,
"reasoning_content": "The user asked for the straight-line distance between Beijing and Shanghai. I need to use the get_distance function to calculate the distance between these two cities. I passed \"Beijing\" as location1 and \"Shanghai\" as location2 to the function.\n\nThe function returned 1000, which appears to be in kilometers. The straight-line (great-circle) distance between Beijing and Shanghai is approximately 1,000-1,100 km, so this result is reasonable.\n\nI should present this result to the user in a clear way, explaining that this is the straight-line distance (great-circle distance), not the actual travel distance.",
"function_call": null,
"tool_calls": null,
"reasoning_details": null
},
"finish_reason": "stop",
"native_finish_reason": null
}
],
"logprobs": null,
"created": 1776050977,
"model": "MaaS_Kimi_K2.5_20260127",
"object": "chat.completion",
"system_fingerprint": null,
"service_tier": null,
"usage": {
"prompt_tokens": 59,
"completion_tokens": 258,
"total_tokens": 317,
"completion_tokens_details": null,
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
},
"cache_creation_input_tokens": null,
"cache_creation": null,
"gemini_cache_tokens_details": null
}
}
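The two-request flow above can be sketched in Python as follows. The local `get_distance` function is a hypothetical stub that returns the same fixed value as the example, and `build_followup_messages` is an illustrative helper, not part of the API. The point is that each tool message reuses the id taken from the first response.

```python
import json


def get_distance(location1: str, location2: str) -> str:
    """Hypothetical local tool; returns a fixed value like the example above."""
    return "1000"


LOCAL_TOOLS = {"get_distance": get_distance}


def build_followup_messages(messages: list, first_response: dict) -> list:
    """Append the assistant tool-call message and one tool result per call."""
    assistant = first_response["choices"][0]["message"]
    tool_calls = assistant.get("tool_calls") or []
    followup = list(messages)
    followup.append({
        "role": "assistant",
        "content": assistant.get("content"),
        "tool_calls": tool_calls,
    })
    for call in tool_calls:
        name = call["function"]["name"]
        # The arguments arrive as a JSON-encoded string, not as an object.
        arguments = json.loads(call["function"]["arguments"])
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must equal the id from the first response
            "name": name,
            "content": LOCAL_TOOLS[name](**arguments),
        })
    return followup


# Trimmed copy of the first response example shown earlier.
first_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "get_distance:0",
                "type": "function",
                "function": {
                    "name": "get_distance",
                    "arguments": "{\"location1\": \"Beijing\", \"location2\": \"Shanghai\"}",
                },
            }],
        },
        "finish_reason": "tool_calls",
    }]
}
question = {
    "role": "user",
    "content": "Calculate the straight-line distance between Beijing and Shanghai",
}
second_request_messages = build_followup_messages([question], first_response)
```

The resulting list can be sent as the `messages` field of the second request.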
JSON Mode
Example of Curl Request
curl 'https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions' \
--header 'Authorization: Bearer ${your AK}' \
--header 'Content-Type: application/json' \
--data '{
"model": "moonshot-v1-8k",
"messages": [
{
"role": "system",
"content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in conversations in Chinese and English. You will provide users with safe, helpful, and accurate responses. At the same time, you will refuse to answer any questions related to terrorism, racial discrimination, explicit violence, or other inappropriate content. Moonshot AI is a proper noun and should not be translated into other languages."
},
{
"role": "system",
"content": "You are an intelligent customer service representative for Kimi. You are responsible for answering various questions from users. Please respond to user questions based on the document content. Your answers can include text, images, and links, and you may include text, images, and links together in a single response.\n\nPlease output your response in the following JSON format:\n\n{\n \"text\": \"text content\",\n \"image\": \"image URL\",\n \"url\": \"link URL\"\n}\n\nNote: Place text content in the `text` field, place images as links starting with `oss://` in the `image` field, and place regular links in the `url` field."
},
{
"role": "user",
"content": "Hello, my name is Li Lei. What is 1+1?"
}
],
"temperature": 0.3,
"response_format": {
"type": "json_object"
}
}'
Python Request Example
import requests
import json
url = "https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions"
headers = {
"Authorization": "Bearer ${your AK}",
"Content-Type": "application/json"
}
data = {
"model": "moonshot-v1-8k",
"messages": [
{
"role": "system",
"content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in conversations in Chinese and English. You will provide users with safe, helpful, and accurate responses. At the same time, you will refuse to answer any questions related to terrorism, racial discrimination, explicit violence, or other inappropriate content. Moonshot AI is a proper noun and should not be translated into other languages."
},
{
"role": "system",
"content": "You are an intelligent customer service representative for Kimi. You are responsible for answering various questions from users. Please respond to user questions based on the document content. Your answers can include text, images, and links, and you may include text, images, and links together in a single response.\n\nPlease output your response in the following JSON format:\n\n{\n \"text\": \"text content\",\n \"image\": \"image URL\",\n \"url\": \"link URL\"\n}\n\nNote: Place text content in the `text` field, place images as links starting with `oss://` in the `image` field, and place regular links in the `url` field."
},
{
"role": "user",
"content": "Hello, my name is Li Lei. What is 1+1?"
}
],
"temperature": 0.3,
"response_format": {
"type": "json_object"
}
}
# Send request
response = requests.post(url, headers=headers, json=data)
# Process response
if response.status_code == 200:
    result = response.json()
    print("Full response:")
    print(json.dumps(result, indent=2, ensure_ascii=False))
    # Parse the JSON format response returned by the model
    try:
        message_content = result.get("choices", [{}])[0].get("message", {}).get("content", "")
        if message_content:
            # The model should return a JSON formatted string
            parsed_response = json.loads(message_content)
            print("\nParsed response:")
            print(f"Text: {parsed_response.get('text', '')}")
            print(f"Image: {parsed_response.get('image', '')}")
            print(f"Link: {parsed_response.get('url', '')}")
    except json.JSONDecodeError:
        print(f"\nThe content returned by the model is not valid JSON: {message_content}")
else:
    print(f"Request failed, status code: {response.status_code}")
    print(response.text)
Response Example
{
"id": "chatcmpl-687715097b6d3f9aad6f416a",
"choices": [
{
"index": 0,
"logprobs": null,
"message": {
"role": "assistant",
"content": "{\n \"text\": \"Hello, Li Lei! 1 + 1 = 2.\",\n \"image\": \"\",\n \"url\": \"\"\n}",
"reasoning_content": null,
"function_call": null,
"tool_calls": null,
"reasoning_details": null
},
"finish_reason": "stop",
"native_finish_reason": null
}
],
"created": 1752634634,
"model": "kimi-k2-0711-preview",
"object": "chat.completion",
"system_fingerprint": null,
"usage": {
"prompt_tokens": 212,
"completion_tokens": 32,
"total_tokens": 244,
"completion_tokens_details": null,
"prompt_tokens_details": null
}
}
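The `content` field in the response above is itself a JSON-encoded string, so it needs a second decoding step. A small sketch of that step follows; `parse_json_reply` is a hypothetical helper, and the three field names come from the system prompt in this example rather than from the API.

```python
import json

# Field names requested in the example system prompt.
EXPECTED_FIELDS = ("text", "image", "url")


def parse_json_reply(content: str) -> dict:
    """Decode the assistant content and keep only the three prompted fields."""
    parsed = json.loads(content)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object at the top level")
    # Missing or null fields fall back to an empty string.
    return {field: parsed.get(field) or "" for field in EXPECTED_FIELDS}


# The content string from the response example above.
content = "{\n \"text\": \"Hello, Li Lei! 1 + 1 = 2.\",\n \"image\": \"\",\n \"url\": \"\"\n}"
reply = parse_json_reply(content)
print(reply["text"])
```

Checking the top-level type and the expected keys guards against a reply that is valid JSON but does not follow the prompted shape.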
Partial Mode
Partial Mode is an advanced feature of the Kimi API that lets developers prefill the beginning of the model's response, so that the model continues generating from exactly that preset content.
You can think of it as "feeding lines" to the model: the response is forced to start with the specified content.
Key Points:
- Add an additional message to the end of the messages list, setting role=assistant and partial=True;
- Put the text that should open the response in the content field; the Kimi model is then forced to begin its response with that text;
- Prepend the content from step 2 to the content generated by the Kimi model to form the complete response.
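The three key points can be sketched in Python as follows. Both helper names are hypothetical, and the `generated` string is a made-up continuation used only to show the splice, not real model output.

```python
def add_prefill(messages: list, prefix: str) -> list:
    """Return a copy of messages with a partial assistant message appended."""
    return list(messages) + [
        {"role": "assistant", "content": prefix, "partial": True}
    ]


def splice_response(prefix: str, generated: str) -> str:
    """Join the prefilled text and the model's continuation into the full reply."""
    return prefix + generated


prefix = "Dear user, hello. Let me introduce Kimi"
messages = add_prefill([{"role": "user", "content": "Hello?"}], prefix)

# Made-up continuation standing in for the content field of the response.
generated = ", an AI assistant provided by Moonshot AI."
full_reply = splice_response(prefix, generated)
```

`add_prefill` copies the list rather than mutating it, so the original conversation history can be reused for later turns without the partial message.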
Example of Curl Request
curl 'https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions' \
--header 'Authorization: Bearer ${your AK}' \
--header 'Content-Type: application/json' \
--data '{
"model": "moonshot-v1-8k",
"messages": [
{
"role": "system",
"content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in conversations in Chinese and English. You will provide users with safe, helpful, and accurate responses. At the same time, you will refuse to answer any questions related to terrorism, racial discrimination, explicit violence, or other inappropriate content. Moonshot AI is a proper noun and should not be translated into other languages."
},
{
"role": "user",
"content": "Hello?"
},
{
"partial": true,
"role": "assistant",
"content": "Dear user, hello. Let me introduce Kimi"
}
],
"temperature": 0.3
}'
Python Request
import requests
import json
url = "https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions"
headers = {
"Authorization": "Bearer ${your AK}",
"Content-Type": "application/json"
}
data = {
"model": "moonshot-v1-8k",
"messages": [
{
"role": "system",
"content": "You are Kimi, an AI assistant provided by Moonshot AI. You are more proficient in conversations in Chinese and English. You will provide users with safe, helpful, and accurate responses. At the same time, you will refuse to answer any questions related to terrorism, racial discrimination, explicit violence, or other inappropriate content. Moonshot AI is a proper noun and should not be translated into other languages."
},
{
"role": "user",
"content": "Hello?"
},
{
"partial": True,
"role": "assistant",
"content": "Dear user, hello. Let me introduce Kimi"
}
],
"temperature": 0.3
}
# Send request
response = requests.post(url, headers=headers, json=data)
# Process response
if response.status_code == 200:
    result = response.json()
    print(json.dumps(result, indent=2, ensure_ascii=False))
    # Extract and print the assistant's response content
    try:
        assistant_message = result.get("choices", [{}])[0].get("message", {}).get("content", "")
        if assistant_message:
            print("\n" + "="*50)
            print("Kimi's response:")
            print("="*50)
            print(assistant_message)
    except Exception as e:
        print(f"\nError parsing response: {e}")
else:
    print(f"Request failed, status code: {response.status_code}")
    print(response.text)
Response Example
{
"id": "4052eed298b04f6abba93680034825c2",
"choices": [
{
"index": 0,
"logprobs": null,
"message": {
"role": "assistant",
"content": "Hello! I am **Kimi**, an AI assistant developed by **Moonshot AI**.\n\nI specialize in conversations in both Chinese and English. I can help you answer questions, write articles, analyze data, write code, translate text, or just chat.\n\nI strive to provide you with **safe, helpful, and accurate** responses, while strictly adhering to content guidelines by refusing any requests involving terrorism, racial discrimination, explicit violence, or other inappropriate content.\n\nIs there anything I can help you with? Whether it's for study, work, or everyday life, feel free to let me know!",
"refusal": null,
"annotations": null,
"images": null,
"reasoning_content": "The user greeted in Chinese with \"Hello?\", which is a simple greeting. As Kimi, I should respond warmly in Chinese and ask what I can help with.\n\nThe user said \"let me introduce Kimi,\" but in this context, the user might be testing me or genuinely wants to know about Kimi (myself). Based on the system prompt, I am Kimi, an AI assistant provided by Moonshot AI.\n\nI need to:\n1. Respond warmly to the greeting\n2. Briefly introduce myself (Kimi) and my capabilities\n3. Ask the user what specific needs they have\n\nThe self-introduction should include:\n- I am Kimi, developed by Moonshot AI\n- Proficient in Chinese and English conversations\n- Provide safe, helpful, and accurate responses\n- Reject inappropriate content\n\nKeep it concise, warm, and professional.",
"function_call": null,
"tool_calls": null,
"reasoning_details": null
},
"finish_reason": "stop",
"native_finish_reason": null
}
],
"logprobs": null,
"created": 1776052369,
"model": "MaaS_Kimi_K2.5_20260127",
"object": "chat.completion",
"system_fingerprint": null,
"service_tier": null,
"usage": {
"prompt_tokens": 87,
"completion_tokens": 257,
"total_tokens": 344,
"completion_tokens_details": null,
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
},
"cache_creation_input_tokens": null,
"cache_creation": null,
"gemini_cache_tokens_details": null
}
}
Kimi K2.5 supports thinking
Thinking is enabled by default. The thinking parameter in the request body controls it:
curl --location --request POST 'https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions' \
--header 'Authorization: Bearer ${your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
"stream": false,
"messages": [
{
"role": "user",
"content": "how are you"
}
],
"thinking": {
"type": "false",
"budget_tokens": 2000
}
}'