MaaS_Cl_Opus_4.8

Basic Information

URL: https://genaiapi.cloudsway.net
Request Method: POST
Authentication: HTTP Bearer token. The API key must be included in the `Authorization` request header.

Parameter Name	Type	Required	Description
`Content-Type`	string	Yes	Fixed to `application/json`
`Authorization`	string	Yes	`Bearer {your_api_key}`

Core Parameters

Parameter	Type	Required	Description
`model`	string	Yes	ID of the model to use
`messages`	array	Yes	List of conversation messages; each message has a `role` (`user` or `assistant`) and `content`
`max_tokens`	integer	Yes	Maximum number of tokens to generate
`system`	string	No	System prompt that sets the assistant's behavior and background
`temperature`	number	No	Sampling temperature; higher values make the output more random, lower values make it more deterministic. Only the fixed value 1 is supported
`top_p`	number	No	Nucleus sampling parameter; the model samples only from the tokens that make up the top `top_p` probability mass. Only the fixed value 0.99 is supported
`top_k`	integer	No	Samples only from the K highest-probability tokens at each step. Not supported by Opus 4.8
`stream`	boolean	No	Whether to enable streaming; defaults to `false`
`stop_sequences`	array	No	Custom stop sequences; generation stops when the model produces one of them
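
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function name `build_payload` and the choice to raise `ValueError` are assumptions made for illustration, and the service's own validation may differ.

```python
from typing import Any, Optional


def build_payload(
    model: str,
    messages: list[dict[str, Any]],
    max_tokens: int,
    system: Optional[str] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    stream: bool = False,
    stop_sequences: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a request body, enforcing the limits listed in the parameter table."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in ("user", "assistant"):
            raise ValueError("each message role must be 'user' or 'assistant'")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    # The table states that only these fixed values are accepted.
    if temperature is not None and temperature != 1:
        raise ValueError("temperature only supports the fixed value 1")
    if top_p is not None and top_p != 0.99:
        raise ValueError("top_p only supports the fixed value 0.99")
    if top_k is not None:
        raise ValueError("top_k is not supported by Opus 4.8")

    payload: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": stream,
    }
    # Optional fields are only sent when the caller provides them.
    if system is not None:
        payload["system"] = system
    if temperature is not None:
        payload["temperature"] = temperature
    if top_p is not None:
        payload["top_p"] = top_p
    if stop_sequences:
        payload["stop_sequences"] = stop_sequences
    return payload
```

Omitting `temperature`, `top_p`, and `top_k` entirely is the simplest way to stay within the supported values.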

Request and Response Example

Endpoint call: /v1/messages

curl --location --request POST 'https://genaiapi.cloudsway.net/{ENDPOINT}/v1/messages' \
--header 'Authorization: Bearer {Your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "max_tokens": 2048,
    "stream": false,
    "messages": [
        {
            "role": "user",
            "content": "test"
        }
    ],
    "thinking": {
        "type": "adaptive"
    }
}'
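
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. This is an illustrative equivalent under stated assumptions: the helper names `build_messages_request` and `send` are hypothetical, `my-endpoint` and `my-api-key` stand in for the `{ENDPOINT}` and `{Your AK}` placeholders, and the non-streaming response is assumed to be a JSON body.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "https://genaiapi.cloudsway.net"


def build_messages_request(
    endpoint: str, api_key: str, payload: dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST request to {BASE_URL}/{endpoint}/v1/messages."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{endpoint}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict[str, Any]:
    """Send the request and decode the body as JSON (assumed for stream=false).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the curl example above.
example_payload = {
    "max_tokens": 2048,
    "stream": False,
    "messages": [{"role": "user", "content": "test"}],
    "thinking": {"type": "adaptive"},
}
request = build_messages_request("my-endpoint", "my-api-key", example_payload)
```

Calling `send(request)` performs the network call; building the request separately makes it easy to inspect the URL, headers, and body first.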

OpenAI-format endpoint call: /chat/completions

curl --location --request POST 'https://genaiapi.cloudsway.net/{ENDPOINT}/v1/messages' \
--header 'Authorization: Bearer {Your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "max_tokens": 2048,
    "stream": false,
    "messages": [
        {
            "role": "user",
            "content": "test"
        }
    ],
    "thinking": {
        "type": "adaptive"
    }
}'

OpenAI-format integrated call: /chat/completions

curl --location --request POST 'https://genaiapi.cloudsway.net/{ENDPOINT}/v1/messages' \
--header 'Authorization: Bearer {Your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "MaaS_Cl_Opus_4.8_20260528",
    "stream": false,
    "messages": [
        {
            "role": "user",
            "content": "who are you?"
        }
    ],
    "thinking": {
        "type": "adaptive"
    }
}'

Feature Overview

thinking

Opus 4.8 does not support an extended thinking budget. Passing the following parameter results in a 400 error:

thinking: {"type": "enabled", "budget_tokens": N}

Opus 4.8 supports only the adaptive method:

thinking: {"type": "adaptive"}
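
Because a budgeted `thinking` object is rejected with a 400 error, it can be worth catching it locally before the request goes out. The sketch below is one possible client-side guard; the function name `validate_thinking` is hypothetical, and the check mirrors only the two rules stated above.

```python
from typing import Any


def validate_thinking(thinking: dict[str, Any]) -> dict[str, Any]:
    """Return the thinking config unchanged if it is acceptable for Opus 4.8.

    Raises ValueError for configurations the document says produce a 400 error.
    """
    if "budget_tokens" in thinking:
        raise ValueError("Opus 4.8 does not support budget_tokens")
    if thinking.get("type") != "adaptive":
        raise ValueError("Opus 4.8 only supports thinking type 'adaptive'")
    return thinking
```

For example, `validate_thinking({"type": "adaptive"})` returns the dictionary as is, while `validate_thinking({"type": "enabled", "budget_tokens": 1024})` raises `ValueError`.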

Minimum cacheable prompt length

The minimum cacheable prompt length on Claude Opus 4.8 is 1024 tokens, and no code changes are required.
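
As a small illustration of the threshold, the check below tells whether a prompt is long enough to be cacheable. It is only a sketch: the name `meets_cache_minimum` is hypothetical, and the token count is assumed to come from whatever tokenizer or usage figure the caller already has, since this document does not describe one.

```python
MIN_CACHEABLE_TOKENS = 1024  # minimum cacheable prompt length stated above


def meets_cache_minimum(prompt_tokens: int) -> bool:
    """Return True if a prompt of this many tokens reaches the caching minimum."""
    return prompt_tokens >= MIN_CACHEABLE_TOKENS
```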