Introduction

MaaS-DS Integration OpenAI API Specification

Length Limits: The API supports a maximum context length of 64K. The reasoning_content output cannot exceed 64K in length.

Unsupported Features: Function Call, JSON Output, FIM Completion (Beta)

Unsupported Parameters: temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs. For compatibility with existing software, setting temperature, top_p, presence_penalty, or frequency_penalty is accepted but has no effect; logprobs and top_logprobs are likewise not supported.

Access Example

Non-Streaming:

curl --request POST \
  --url https://genaiapi.cloudsway.net/v1/ai/sVUARttwSjilLOZY/chat/completions \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "MaaS-DS-R1",
    "messages": [
        {
            "role": "user",
            "content": "hello"
        }
    ],
    "stream": false
}'
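The same non-streaming request can be made from Python using only the standard library. This is a sketch: the endpoint path and YOUR_API_KEY are the placeholders from the example above, and the final call is commented out since it requires valid credentials.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder, as in the curl example
URL = "https://genaiapi.cloudsway.net/v1/ai/sVUARttwSjilLOZY/chat/completions"

payload = {
    "model": "MaaS-DS-R1",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": False,
}

def build_request(url: str = URL, api_key: str = API_KEY) -> urllib.request.Request:
    """Build the POST request with the JSON body and Bearer auth header."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Uncomment to send the request with a real key:
# with urllib.request.urlopen(build_request()) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```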

Streaming:

curl --request POST \
  --url https://genaiapi.cloudsway.net/v1/ai/{endpoint}/chat/completions \
  --header 'Accept: */*' \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
        {
            "role": "system",
            "content": "Hello"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Good morning"
                },
                {
                    "type": "text",
                    "text": "Good afternoon"
                }
            ]
        }
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    },
    "max_completion_tokens": 4000
}'
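With stream set to true, the response arrives as server-sent events: one `data: {...}` line per chunk, terminated by `data: [DONE]`. A minimal parser sketch, assuming the OpenAI-compatible chunk layout used in the examples above (the sample lines below are fabricated for illustration):

```python
import json

def parse_sse_lines(lines):
    """Yield the delta payload of each chunk from raw SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        # The final usage chunk (include_usage) may have an empty choices list.
        if chunk.get("choices"):
            yield chunk["choices"][0].get("delta", {})

# Fabricated sample chunks:
raw = [
    'data: {"choices":[{"delta":{"reasoning_content":"thinking..."}}]}',
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    "data: [DONE]",
]
for delta in parse_sse_lines(raw):
    print(delta)
```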

Frequently Asked Questions

  1. Responses can take a long time for lengthy outputs. How can this be optimized?
    Enable streaming mode (stream=true, as in the example above). Streaming improves perceived responsiveness by delivering the response incrementally as it is generated.

  2. How to obtain reasoning_content?

Non-Streaming

from openai import OpenAI
client = OpenAI(api_key="YOUR_API_KEY", base_url="YOUR_ENDPOINT")

# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=messages
)

reasoning_content = response.choices[0].message.reasoning_content
content = response.choices[0].message.content
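For multi-round conversations, only content should be carried into the next round's message history; the upstream DeepSeek API does not accept reasoning_content in input messages. A sketch continuing the example above (the assistant answer is a stand-in for the real response, and the follow-up question is illustrative):

```python
# Round 1 conversation history, as in the example above.
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]

# Stand-in for response.choices[0].message.content from Round 1.
content = "9.11 is greater."

# Round 2: append only the final answer, never the reasoning trace.
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Explain your answer."})

# response = client.chat.completions.create(
#     model="deepseek-r1",
#     messages=messages
# )
```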

Streaming

from openai import OpenAI
client = OpenAI(api_key="YOUR_API_KEY", base_url="YOUR_ENDPOINT")

# Round 1
messages = [{"role": "user", "content": "9.11 and 9.8, which is greater?"}]
response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=messages,
    stream=True  # Python boolean, not JSON true
)

reasoning_content = ""
content = ""

for chunk in response:
    # With include_usage enabled, the final chunk has no choices.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.reasoning_content:
        reasoning_content += delta.reasoning_content
    elif delta.content:
        content += delta.content