MaaS_Gr
Request Protocol
https
Header
| Parameter Name | Type | Required | Description |
|---|---|---|---|
| Content-Type | string | Yes | Fixed to application/json |
| Authorization | string | Yes | Bearer {your_api_key} |
Request URL
POST https://{new_platform_domain}/v1/ai/{endpointPath}/chat/completions
Request Body Parameters
| Attribute Name | Type | Required/Optional | Description |
|---|---|---|---|
| deferred | boolean or null | Optional (default false) | If set to true, the request returns a request_id, and the deferred response can then be retrieved via GET /v1/chat/deferred-completion/{request_id}. |
| frequency_penalty | number or null | Optional (default 0, range -2 to 2) | Penalty based on how frequently a token already appears in the text so far. Positive values reduce the probability of repeating the same line. (Not supported by reasoning models.) |
| logit_bias | object or null | Optional | A JSON object that maps token IDs to bias values ranging from -100 to 100. (Not supported by reasoning models.) |
| logprobs | boolean or null | Optional (default false) | Whether to return the log probabilities of the output tokens. |
| max_completion_tokens | integer or null | Optional | Maximum number of tokens to generate for the completion (only applies to visible output tokens, excluding reasoning or function call tokens). |
| max_tokens | integer or null | Optional (deprecated) | Deprecated; use max_completion_tokens instead. |
| messages | array | Required | Conversation message list. Each message contains a role (system/user/assistant/tool/function) and content (a string or an array of content parts, supporting text, image URLs, file IDs, etc.). |
| model | string | Required | Name of the model to use. |
| n | integer or null | Optional (default 1, minimum 1) | Number of completion choices to generate for each input message. |
| parallel_tool_calls | boolean or null | Optional (default true) | If false, the model executes at most one tool call. |
| presence_penalty | number or null | Optional (default 0, range -2 to 2) | Penalty based on whether a token has already appeared in the text so far. Positive values encourage the model to discuss new topics. (Not supported by grok-3 and reasoning models.) |
| reasoning_effort | string or null | Optional | Limits how much effort a reasoning model spends thinking. Accepts low (uses fewer reasoning tokens) or high (uses more reasoning tokens). Not supported for grok-4. |
| response_format | object or null | Optional | Structured output format. Can be text, json_object, or json_schema with a detailed schema. |
| search_parameters | object or null | Optional | Parameters that control real-time data retrieval, including mode (off/on/auto), sources (x, web, news, rss), date range, and whether citations are returned. |
| seed | integer or null | Optional | Seed for deterministic sampling (best effort: the same seed and parameters should return the same result, but this is not guaranteed). |
| stop | array or null | Optional | Up to 4 stop sequences; generation stops when one is encountered. (Not supported by reasoning models.) |
| stream | boolean or null | Optional (default false) | Whether to enable streaming responses. When enabled, incremental messages are sent in SSE format. |
| stream_options | object or null | Optional | Streaming options. Includes include_usage (sends an additional chunk containing usage before the stream ends). |
| temperature | number or null | Optional (default 1, range 0 to 2) | Sampling temperature. A higher value makes the output more random, while a lower value makes it more deterministic. |
| tool_choice | string or object or null | Optional | Controls how the model selects tools: "none", "auto", "required", or a specific function name. |
| tools | array or null | Optional | List of tools that the model can call (currently only the function type is supported). Up to 128 functions, each containing a name, description, and parameter JSON schema. |
| top_logprobs | integer or null | Optional (range 0 to 8) | Number of most likely tokens to return at each token position, each with its log probability. Requires logprobs=true. |
| top_p | number or null | Optional (default 1, range 0 to 1, excluding 0) | Nucleus sampling probability mass. An alternative to temperature; typically only one of the two is adjusted. |
| user | string or null | Optional | Unique identifier for the end user, used for monitoring and abuse detection. |
| web_search_options | object or null | Optional | Field retained solely for OpenAI compatibility; contains filters, search_context_size, and user_location. |
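
The ranges and limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the official API: the function name validate_chat_payload and the wording of its messages are hypothetical, and it only covers the constraints stated in the table (it does not check model-specific restrictions such as parameters unsupported by reasoning models).

```python
from typing import Any


def validate_chat_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems: list[str] = []

    # Required fields per the parameter table. A missing or empty value fails.
    if not payload.get("model"):
        problems.append("model is required")
    if not payload.get("messages"):
        problems.append("messages is required")

    def check_range(name, low, high, low_exclusive=False):
        value = payload.get(name)
        if value is None:  # absent or null: the server default applies
            return
        too_low = value <= low if low_exclusive else value < low
        if too_low or value > high:
            problems.append(f"{name}={value} is outside the documented range")

    check_range("frequency_penalty", -2, 2)
    check_range("presence_penalty", -2, 2)
    check_range("temperature", 0, 2)
    check_range("top_p", 0, 1, low_exclusive=True)  # 0 itself is excluded
    check_range("top_logprobs", 0, 8)

    n = payload.get("n")
    if n is not None and n < 1:
        problems.append("n must be at least 1")

    # top_logprobs is only valid together with logprobs=true.
    if payload.get("top_logprobs") is not None and not payload.get("logprobs"):
        problems.append("top_logprobs requires logprobs=true")

    stop = payload.get("stop")
    if stop is not None and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")

    tools = payload.get("tools")
    if tools is not None and len(tools) > 128:
        problems.append("tools accepts at most 128 functions")

    return problems


example = {
    "model": "grok-4.3",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
    "top_p": 0,
}
print(validate_chat_payload(example))
# prints: ['top_p=0 is outside the documented range']
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass; the server remains the authority on what it accepts.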
Call Example
/chat/completions
/chat/completions Non-streaming Request
curl --location --request POST \
  'https://{new_platform_domain}/v1/ai/{endpointPath}/chat/completions' \
  --header 'Authorization: Bearer {api key}' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "model": "grok-4.3",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "thinking": {"type": "enabled"},
    "reasoning_effort": "high",
    "stream": false
  }'
/chat/completions Streaming Request
curl --location --request POST \
  'https://{new_platform_domain}/v1/ai/{endpointPath}/chat/completions' \
  --header 'Authorization: Bearer {api key}' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "model": "grok-4.3",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "thinking": {"type": "enabled"},
    "reasoning_effort": "high"
  }'
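
With stream set to true, the response arrives as SSE messages rather than a single JSON body. The sketch below shows one way a client might reassemble the text. It rests on assumptions this document does not confirm: that each message is a `data:` line carrying a JSON chunk, that the stream ends with a `data: [DONE]` line, and that incremental text sits under choices[0].delta.content, as in OpenAI-compatible APIs. The function names and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON object carried by each SSE `data:` line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines, comments, other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # ASSUMPTION: OpenAI-style end-of-stream marker
            break
        yield json.loads(data)


def collect_text(chunks: Iterable[dict]) -> str:
    """Join incremental text; ASSUMES the choices[0].delta.content layout."""
    parts = []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # e.g. a usage-only chunk sent when include_usage is set
        delta = choices[0].get("delta") or {}
        piece = delta.get("content")
        if piece:
            parts.append(piece)
    return "".join(parts)


# Hypothetical stream contents, not captured from the real service.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hi"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "!"}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 3}}',
    "data: [DONE]",
]
print(collect_text(iter_sse_chunks(sample_lines)))  # prints: Hi!
```

In a real client the lines would come from the HTTP response body, for example from the `iter_lines` method of a `requests` response opened with `stream=True`; the parsing stays the same.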
Unified Domain Access /v1/chat/completions
Curl Request
curl --location 'https://genaiapi.cloudsway.net/v1/chat/completions' \
  --header 'Authorization: Bearer YOUR_ACCESS_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "hi"
          }
        ]
      }
    ],
    "model": "MaaS_Gr_4.3_20260501",
    "stream": false,
    "stream_options": {"include_usage": true}
  }'
Python Request
import requests

# Replace with your own access key.
YOUR_ACCESS_KEY = "YOUR_ACCESS_KEY"
url = "https://genaiapi.cloudsway.net/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {YOUR_ACCESS_KEY}",
    "Content-Type": "application/json",
}

payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "hi",
                }
            ],
        }
    ],
    "model": "MaaS_Gr_4.3_20260501",
    "stream": False,
    "stream_options": {"include_usage": True},
}

try:
    # The timeout value is an arbitrary choice; adjust it to your workload.
    response = requests.post(url, headers=headers, json=payload, timeout=60)
    response.raise_for_status()  # raise on 4xx/5xx status codes
    print(response.json())
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
Return Example
{
  "id": "a6ce483d-99b6-9910-a25a-69ff67e41e45",
  "choices": [
    {
      "index": 0,
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "Hi! How can I help you today?",
        "refusal": null,
        "annotations": null,
        "images": null,
        "reasoning_content": "The user said \"hi\". This is a simple greeting. As an AI, I should respond in a friendly, engaging way.\n",
        "function_call": null,
        "tool_calls": null,
        "reasoning_details": null
      },
      "finish_reason": "stop",
      "native_finish_reason": null
    }
  ],
  "logprobs": null,
  "created": 1779095005,
  "model": "MaaS_Gr_4.3_20260501",
  "object": "chat.completion",
  "system_fingerprint": "fp_f06c287374635121",
  "service_tier": null,
  "usage": {
    "prompt_tokens": 131,
    "completion_tokens": 126,
    "total_tokens": 257,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 0,
      "audio_tokens": 0,
      "image_tokens": 0,
      "reasoning_tokens": 117,
      "rejected_prediction_tokens": 0
    },
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 128
    },
    "cache_creation_input_tokens": null,
    "cache_creation": null,
    "gemini_cache_tokens_details": null
  }
}
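
Most callers only need a few fields from a response shaped like the one above. The following sketch pulls out the reply text, the reasoning text, the finish reason, and two token counts. The function name summarize_completion is hypothetical, the sample dictionary is a trimmed copy of the Return Example, and the code assumes every response follows that same layout.

```python
from typing import Any


def summarize_completion(response: dict[str, Any]) -> dict[str, Any]:
    """Extract commonly used fields from a chat.completion response body."""
    choice = response["choices"][0]  # first (and by default only) choice
    message = choice["message"]
    usage = response.get("usage") or {}
    details = usage.get("completion_tokens_details") or {}
    return {
        "content": message.get("content"),
        "reasoning_content": message.get("reasoning_content"),
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
        "reasoning_tokens": details.get("reasoning_tokens"),
    }


# Trimmed copy of the Return Example above.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hi! How can I help you today?",
                "reasoning_content": "The user said \"hi\". This is a simple greeting.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 131,
        "completion_tokens": 126,
        "total_tokens": 257,
        "completion_tokens_details": {"reasoning_tokens": 117},
    },
}

summary = summarize_completion(sample_response)
print(summary["content"])       # prints: Hi! How can I help you today?
print(summary["total_tokens"])  # prints: 257
```

Using `.get` for the optional fields keeps the helper from failing when a field such as reasoning_content or usage is null or absent.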