
Text-to-Text

Chat Completion

Introduction

OpenAI's Chat Completion is a powerful feature that allows developers to build applications that can carry out dynamic and interactive conversations. It uses the same underlying model as the text completion feature, but it's designed to handle multi-turn conversations.

In a chat-based model, you send a list of messages instead of a single string as a prompt. Each message in the list has two properties: 'role' and 'content'. The 'role' can be 'system', 'user', or 'assistant', and 'content' contains the text of the message from the role.

The system role sets the behavior of the assistant, the user role provides the user's instructions or questions, and the assistant role contains the assistant's responses. The model generates the next assistant response based on the conversation history.
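In practice, a multi-turn conversation is just a list of these role/content dictionaries that grows with each exchange. A minimal sketch (the helper functions below are illustrative, not part of any API):

```python
# Illustrative helpers: maintain a chat history as a list of
# {"role", "content"} dictionaries, the shape the chat endpoint expects.

def new_history(system_prompt):
    """Start a conversation with a system message that sets behavior."""
    return [{"role": "system", "content": system_prompt}]

def add_turn(history, user_text, assistant_text):
    """Append one user/assistant exchange to the history."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

history = new_history("You are a concise math tutor.")
add_turn(history, "What is 2+2?", "4.")
print([m["role"] for m in history])  # ['system', 'user', 'assistant']
```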

This feature opens up new possibilities for creating interactive and dynamic applications, such as virtual assistants, chatbots, and more.

Best Practices

The chat completion feature works similarly to the text completion feature, but it takes a list of messages as input instead of a single prompt. Here's an example of how you can use OpenAI chat completion:

curl -X POST 'https://genaiapi.cloudsway.net/v1/ai/sVUARttwSjilLOZY/chat/completions' -H 'Authorization: Bearer YOUR_API_KEY' -H 'Content-Type: application/json' --data '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "hello"
        }
    ]
}'

In this example, we are sending a POST request to the chat completion endpoint with the following parameters:

  • model: The name of the model to use for completion (e.g., gpt-4-turbo). This parameter is optional because the model is already configured upstream in the Cloudsway endpoint.
  • messages: A list of messages containing the conversation history. Each message has a 'role' and 'content' property.

For other query parameters, you can refer to the OpenAI API documentation for more details.

The chat completion response will be a JSON object containing the generated response from the assistant. For example:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "I'm doing well, thank you for asking. How can I help you today?"
      }
    }
  ],
  "created": 1632345678,
  "model": "gpt-4-turbo",
  "object": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 100,
    "prompt_tokens": 20,
    "total_tokens": 120
  }
}

You can use the generated response to continue the conversation or display it in your application. The chat completion feature allows you to create engaging and interactive chat experiences for your users.
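Continuing the conversation amounts to appending the returned assistant message to the history before the next request. A minimal sketch, assuming the response has already been parsed into a Python dict with the shape shown above:

```python
# Pull the assistant's reply out of a parsed chat completion response
# and append it to the running history for the next request.

response = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I'm doing well, thank you for asking. How can I help you today?",
            },
        }
    ]
}

history = [{"role": "user", "content": "hello"}]

reply = response["choices"][0]["message"]
history.append({"role": reply["role"], "content": reply["content"]})

# The next user turn goes on top of the full history.
history.append({"role": "user", "content": "Tell me a joke."})
print(len(history))  # 3
```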

Seamless Integration

Here's an example of how you can integrate Cloudsway with the Python OpenAI library:

from openai import OpenAI

client = OpenAI(
    base_url="https://genaiapi.cloudsway.net/v1/ai/zUcfeMfrpNqyEhTN",
    api_key="YOUR_API_KEY",
)
message_text = [{"role":"user","content":"what's 1+1? Answer in one word."}]

completion = client.chat.completions.create(
  model="", # model = "deployment_name"
  messages = message_text,
  temperature=0.7,
  max_tokens=800,
  top_p=0.95,
)

print(completion)

By following these steps, you can effectively use OpenAI chat completion to generate responses based on the conversation history. Experiment with different messages and parameters to create engaging chat experiences for your users.

Stream Completion

OpenAI's Stream Completion lets applications receive the response incrementally, token by token, instead of waiting for the full completion. It uses the same underlying model and request format as chat completion; the only difference is that you add the stream parameter to the request. For example:

curl -X POST 'https://genaiapi.cloudsway.net/v1/ai/sVUARttwSjilLOZY/chat/completions' -H 'Authorization: Bearer YOUR_API_KEY' -H 'Content-Type: application/json' --data '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "hello"
        }
    ],
    "stream": true
}'

In Python, the equivalent parameter is:

completion = client.chat.completions.create(
  model="", # model = "deployment_name"
  stream = True,
  messages = message_text,
  temperature=0.7,
  max_tokens=800,
  top_p=0.95,
)

With streaming enabled, the response arrives as a sequence of chunk objects rather than a single JSON body. Each chunk carries a delta with the newly generated tokens. For example:

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

You can use the generated response to continue the conversation or display it in your application. The stream completion feature allows you to create engaging and interactive chat experiences for your users in real-time.
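Reassembling the streamed text is a matter of concatenating the delta fragments as the chunks arrive. A minimal sketch over already-parsed chunk dicts (with the real client, you would iterate over the completion object itself):

```python
# Concatenate the incremental "delta" fragments of parsed stream
# chunks into the full assistant message.

chunks = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": " there!"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
]

full_text = ""
for chunk in chunks:
    delta = chunk["choices"][0]["delta"]
    # The final chunk has an empty delta, so default to "".
    full_text += delta.get("content", "") or ""

print(full_text)  # Hello there!
```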

Multimodal

Introduction

A multimodal model is an artificial intelligence model capable of processing and comprehending various types of data, such as text, images, audio, and more. Unlike unimodal models, multimodal models can integrate information from different modalities, thereby offering a more comprehensive and precise understanding and generative ability. For instance, MaaS 4o is a multimodal model that can handle both text and image inputs, producing corresponding outputs.

Best Practices

Here is an example request demonstrating how to use MaaS 4o with text and image inputs, in this case analyzing the ingredients and calories in a food image.

{
    "messages": [
        {
            "role": "system", 
            "content": "As an expert in food analysis, I will analyze the provided image and provide the ingredients and calories in JSON format."
        }, 
        {
            "role": "user", 
            "content": [
                {
                    "type": "image_url", 
                    "image_url": {
                        "url": "https://img0.baidu.com/it/u=4078772867,1411546811&fm=253&fmt=auto&app=120&f=JPEG?w=500&h=500"
                    }
                }
            ]
        }
    ], 
    "stream": false,
    "stream_options": {
        "include_usage": true
    }
}

In this example, we send a POST request to the endpoint with the following parameters:

  • role: Represents the role of the message, which can be "system", "user", or "assistant".

  • content: Contains the content of the message, with different formats depending on the role.

  • stream: A boolean value indicating whether to use streaming. In this example, the value is false, indicating that streaming is not used.
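Building this payload programmatically is straightforward: a multimodal user message's content becomes a list of typed parts instead of a plain string. A sketch (the builder function is ours, and the image URL is a placeholder):

```python
# Illustrative builder for a multimodal chat request: the user
# message's "content" is a list of typed parts (here, one image_url).

def build_image_request(system_prompt, image_url, stream=False):
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
        ],
        "stream": stream,
    }

payload = build_image_request(
    "As an expert in food analysis, analyze the provided image.",
    "https://example.com/burger.jpg",  # placeholder URL
)
print(payload["messages"][1]["content"][0]["type"])  # image_url
```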

The response would be a JSON object containing the generated text. For example:

{
    "created": 1721984825,
    "usage": {
        "completion_tokens": 313,
        "prompt_tokens": 248,
        "total_tokens": 561
    },
    "model": null,
    "id": "chatcmpl-9pBCjZZoDwaTKYjiRz5HomWHnSEZC",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "function_call": null,
                "tool_calls": null,
                "content": "Here is an image of a burger. Based on the image, it contains the following ingredients:\n\n- Sesame bun (top and bottom)\n- Lettuce\n- Tomato slices\n- Bacon strips\n- Grilled beef patty\n- Possibly some sauces (such as mayonnaise or salad dressing)\n\nThe approximate calorie breakdown for each ingredient is as follows:\n\n1. **Sesame bun** (top and bottom): approximately 150 calories\n2. **Lettuce**: approximately 5 calories\n3. **Tomato slices**: approximately 10 calories\n4. **Bacon strips** (2-3 pieces): approximately 120 calories\n5. **Grilled beef patty**: approximately 250 calories\n6. **Sauces**: approximately 50 calories\n\nThe total estimated calories are as follows:\n\n```json\n{\n  \"components\": {\n    \"sesame_bun\": 150,\n    \"lettuce\": 5,\n    \"tomato_slices\": 10,\n    \"bacon_strips\": 120,\n    \"grilled_beef_patty\": 250,\n    \"sauces\": 50\n  },\n  \"total_calories\": 585\n}\n```\n\nPlease note that this is only an estimate, and the actual calorie count may vary depending on the specific quantity and type of ingredients."
            },
            "logprobs": null
        }
    ],
    "system_fingerprint": null,
    "object": "chat.completion"
}
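The usage block lets you meter consumption; total_tokens should equal the prompt and completion counts combined. A quick sanity check over the figures above:

```python
# Verify the token accounting in the usage block of the response above.
usage = {"completion_tokens": 313, "prompt_tokens": 248, "total_tokens": 561}

assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
print(usage["total_tokens"])  # 561
```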

Completion

Introduction

OpenAI completion is a powerful language model developed by OpenAI. It is designed to generate human-like text based on given prompts. By providing a partial sentence or paragraph as input, the completion model can generate coherent and contextually relevant text to complete the prompt.

The completion model uses advanced deep learning techniques, such as transformers, to understand the context and semantics of the input text. It has been trained on a vast amount of diverse data from the internet, making it capable of generating high-quality text across various domains and topics.

Using OpenAI completion, developers and researchers can automate the process of generating text for a wide range of applications, including content generation, chatbots, language translation, and much more. With its ability to generate natural and fluent text, OpenAI completion has become a valuable tool in the field of natural language processing.

Best Practices

The completion model works by taking a prompt as input and generating a continuation of the text based on that prompt. Here's how you can use OpenAI completion:

curl -X POST 'https://genaiapi.cloudsway.net/v1/ai/zUcfeMfrpNqyEhTN/completions' -H 'Authorization: Bearer YOUR_API_KEY' -H 'Content-Type: application/json' --data '{
    "model": "gpt-4",
    "prompt":"Once upon a time, there was a",
    "temperature":0.7,
    "max_tokens":100
}'

In this example, we are sending a POST request to the completion endpoint with the following parameters:

  • model: The name of the model to use for completion (e.g., gpt-4-turbo). This parameter is optional because the model is already configured upstream in the Cloudsway endpoint.
  • prompt: The partial text that serves as input to the model (e.g., "Once upon a time, there was a").
  • temperature (Optional): Controls the randomness of the generated text. Higher values produce more diverse and creative outputs; the default value is 1.
  • max_tokens: The maximum number of tokens to generate in the completion.
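These parameters map directly onto the JSON body of the request. A minimal sketch assembling the body that the curl command above sends as --data:

```python
import json

# Assemble the completion request body from the parameters above.
payload = {
    "model": "gpt-4",
    "prompt": "Once upon a time, there was a",
    "temperature": 0.7,
    "max_tokens": 100,
}

body = json.dumps(payload)
print(json.loads(body)["max_tokens"])  # 100
```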

For other query parameters, you can refer to the OpenAI API documentation for more details.

The completion response will be a JSON object containing the generated text. For example:

{
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": " beautiful princess named Cinderella. She lived in a small cottage in the woods with her wicked stepmother and two stepsisters. One day, she received an invitation to the royal ball, where she met the prince and fell in love. The clock struck midnight, and she had to leave in a hurry, leaving behind her glass slipper. The prince searched the kingdom to find her, and when he did, they lived happily ever after."
    }
  ],
  "created": 1632345678,
  "model": "gpt-4-turbo",
  "object": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 100,
    "prompt_tokens": 10,
    "total_tokens": 110
  }
}

You can use the generated text for various purposes, such as displaying it on a website, saving it to a file, or using it as input for further processing.
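Extracting the generated text is a single lookup, and checking finish_reason tells you whether the model stopped naturally ("stop") or hit the max_tokens limit ("length"). A sketch over a parsed response dict:

```python
# Read the generated text and completion status out of a parsed
# completion response (truncated sample text for illustration).

response = {
    "choices": [
        {
            "finish_reason": "length",
            "index": 0,
            "text": " beautiful princess named Cinderella...",
        }
    ]
}

choice = response["choices"][0]
text = choice["text"]
truncated = choice["finish_reason"] == "length"  # True if max_tokens was hit

print(truncated)  # True
```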

Seamless Integration

Here's an example of how you can integrate Cloudsway with the Python OpenAI library:

from openai import OpenAI

client = OpenAI(
    base_url="https://genaiapi.cloudsway.net/v1/ai/zUcfeMfrpNqyEhTN",
    api_key="YOUR_API_KEY",
)

completion = client.completions.create(
  model="", # model = "deployment_name"
  prompt="the result of 1 plus 1 is?",
  temperature=0.7,
  max_tokens=800,
  top_p=0.95,
)
print(completion)

By following these steps, you can effectively use OpenAI completion to generate text based on your prompts. Experiment with different prompts and parameters to get the desired output.