MaaS_Cl_Opus_4.8
Basic Information
- URL: https://genaiapi.cloudsway.net
- API Request Endpoint: POST
- Authentication: HTTP Bearer token. The API key must be included in the request header.
| Parameter Name | Type | Required | Description |
|---|---|---|---|
| Content-Type | string | Yes | Fixed to `application/json` |
| Authorization | string | Yes | `Bearer {your_api_key}` |
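As a minimal sketch, the two headers in the table above can be assembled as follows. The helper name `build_headers` and the placeholder key are hypothetical and only serve the illustration; they are not part of the API.

```python
def build_headers(api_key: str) -> dict:
    """Return the two request headers listed in the table above.

    `build_headers` is a hypothetical helper name, not part of the API.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # The table fixes the content type to JSON.
        "Content-Type": "application/json",
        # Bearer authentication: the API key follows the word "Bearer".
        "Authorization": f"Bearer {api_key}",
    }


# Placeholder key for illustration only; substitute your own API key.
headers = build_headers("your_api_key")
print(headers)
```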
Explanation of Core Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use |
| messages | array | Yes | Conversation message list; each entry includes `role` (`user` or `assistant`) and `content` |
| max_tokens | integer | Yes | Maximum number of tokens to generate |
| system | string | No | System prompt, used to set the assistant's behavior and background |
| temperature | number | No | Sampling temperature. Higher values make the output more random, lower values make it more deterministic. Only the fixed value 1 is supported |
| top_p | number | No | Nucleus sampling parameter: the model considers only the tokens that make up the top_p probability mass. Only the fixed value 0.99 is supported |
| top_k | integer | No | Samples only from the K highest-probability tokens at each step. Not supported by Opus 4.8 |
| stream | boolean | No | Whether to enable streaming; default is `false` |
| stop_sequences | array | No | Custom stop sequences; generation stops when the model produces one of them |
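The constraints in the table can be checked on the client before a request is sent. The sketch below is one possible way to do that, not the service's own validation: the function name `build_payload` and the choice to raise `ValueError` are assumptions made for illustration, and the model ID is the one used in the request examples in this document.

```python
from typing import Any, Optional

ALLOWED_ROLES = {"user", "assistant"}


def build_payload(
    model: str,
    messages: list,
    max_tokens: int,
    *,
    system: Optional[str] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
    stream: bool = False,
    stop_sequences: Optional[list] = None,
) -> dict:
    """Build a request body that respects the parameter table (hypothetical helper)."""
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError("role must be 'user' or 'assistant'")
        if "content" not in message:
            raise ValueError("each message needs a 'content' field")
    if not isinstance(max_tokens, int) or max_tokens <= 0:
        raise ValueError("max_tokens must be a positive integer")
    # The table states that only these fixed values are accepted.
    if temperature is not None and temperature != 1:
        raise ValueError("temperature only supports the fixed value 1")
    if top_p is not None and top_p != 0.99:
        raise ValueError("top_p only supports the fixed value 0.99")
    if top_k is not None:
        raise ValueError("top_k is not supported by Opus 4.8")

    payload: dict = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": stream,
    }
    # Optional fields are included only when the caller sets them.
    optional: dict = {
        "system": system,
        "temperature": temperature,
        "top_p": top_p,
        "stop_sequences": stop_sequences,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_payload(
    "MaaS_Cl_Opus_4.8_20260528",
    [{"role": "user", "content": "test"}],
    2048,
)
print(payload)
```

Failing early on the client avoids a round trip for values the table already rules out, such as any `top_k`.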
Request and Response Example
Endpoint calls /v1/messages
curl --location --request POST 'https://genaiapi.cloudsway.net/{ENDPOINT}/v1/messages' \
--header 'Authorization: Bearer {Your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
"max_tokens": 2048,
"stream": false,
"messages": [
{
"role": "user",
"content": "test"
}
],
"thinking": {
"type": "adaptive"
}
}'
OpenAI format endpoint call /chat/completions
curl --location --request POST 'https://genaiapi.cloudsway.net/{ENDPOINT}/v1/messages' \
--header 'Authorization: Bearer {Your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
"max_tokens": 2048,
"stream": false,
"messages": [
{
"role": "user",
"content": "test"
}
],
"thinking": {
"type": "adaptive"
}
}'
OpenAI format integrated call /chat/completions
curl --location --request POST 'https://genaiapi.cloudsway.net/{ENDPOINT}/v1/messages' \
--header 'Authorization: Bearer {Your AK}' \
--header 'Content-Type: application/json' \
--data-raw '{
"model": "MaaS_Cl_Opus_4.8_20260528",
"max_tokens": 2048,
"stream": false,
"messages": [
{
"role": "user",
"content": "who are you?"
}
],
"thinking": {
"type": "adaptive"
}
}'
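For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The URL pattern and headers are taken from the curl examples; the helper names `make_request` and `send`, the timeout value, and the assumption that the response body is JSON are illustrative and are not confirmed by this document.

```python
import json
import urllib.request

BASE_URL = "https://genaiapi.cloudsway.net"


def make_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl examples."""
    url = f"{BASE_URL}/{endpoint}/v1/messages"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and parse the body, assuming the service returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same body as the first curl example above.
payload = {
    "max_tokens": 2048,
    "stream": False,
    "messages": [{"role": "user", "content": "test"}],
    "thinking": {"type": "adaptive"},
}

# Replace both placeholders with your own endpoint and API key before sending:
# result = send(make_request("{ENDPOINT}", "{Your AK}", payload))
```

Building the request separately from sending it makes it possible to inspect the URL, headers, and body without touching the network.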
Feature Introduction
thinking
Opus 4.8 does not support the extended thinking budget; passing the following parameters results in a 400 error.
Opus 4.8 supports only the adaptive method.
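A client-side guard can catch an unsupported thinking configuration before the service answers with a 400 error. This is a sketch under stated assumptions: the function name `require_adaptive_thinking` is hypothetical, and the only rule it enforces is the one stated above, namely that the `type` must be `adaptive` (the value used in the request examples).

```python
from typing import Optional


def require_adaptive_thinking(thinking: Optional[dict]) -> Optional[dict]:
    """Return the thinking config unchanged if it uses the adaptive method.

    Hypothetical helper: it only checks the `type` field and does not
    try to enumerate the parameters the service rejects.
    """
    if thinking is None:
        # No thinking block was supplied; nothing to validate.
        return None
    if thinking.get("type") != "adaptive":
        raise ValueError(
            "Opus 4.8 only supports the adaptive method; got type="
            + repr(thinking.get("type"))
        )
    return thinking


# The configuration used in the request examples passes the check.
print(require_adaptive_thinking({"type": "adaptive"}))
```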
Minimum cacheable tokens
The minimum cacheable prompt length on Claude Opus 4.8 is 1024 tokens, and no code changes are required.