Text-to-Text
Chat Completion
Introduction
Chat Completion is a powerful feature that enables developers to build applications capable of dynamic and interactive conversations. It uses the same underlying model as the text completion feature but is specifically designed to handle multi-turn conversations.
In chat-based models, you send a list of messages as a prompt, rather than just a single string. Each message in the list has two attributes: 'role' and 'content'. 'role' can be 'system', 'user', or 'assistant', while 'content' contains the text of the message sent by the role.
The 'system' role sets the behavior of the assistant, the 'user' role carries the user's input and instructions, and the 'assistant' role contains the assistant's responses. The model generates the next assistant reply based on this conversation history.
This feature opens up new possibilities for creating interactive and dynamic applications, such as virtual assistants, chatbots, and so on.
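The message structure described above can be sketched as a plain Python list of role/content dicts (the content strings here are invented for illustration):

```python
# A conversation is a list of message dicts, each with exactly two
# attributes: 'role' and 'content'. 'system' sets behavior, 'user'
# carries the user's input, 'assistant' holds previous model replies.
conversation = [
    {"role": "system", "content": "You are a concise travel assistant."},
    {"role": "user", "content": "Suggest a city for a weekend trip."},
    {"role": "assistant", "content": "How about Lisbon? Mild weather, very walkable."},
    {"role": "user", "content": "What should I see there?"},  # the next turn
]

# Every message exposes exactly the two attributes described above.
assert all(set(m) == {"role", "content"} for m in conversation)
```

The whole list is sent as the `messages` parameter on each request, so the model always sees the full history.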
Best Practice
The Chat Completion feature works similarly to the Text Completion feature, but it accepts a series of messages as input rather than a single prompt. Here is an example of how you can use OpenAI Chat Completion:
curl -X POST 'https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions' -H 'Authorization: Bearer {YOUR_API_KEY}' -H 'Content-Type: application/json' --data '{
"model": "gpt-4",
"messages": [
{
"role": "user",
"content": "hello"
}
]
}'
In this example, we send a POST request to the chat completion endpoint with the following parameters:
- model: The name of the model used for completion (e.g., gpt-4-turbo). This parameter is optional because it is already configured upstream of the Cloudsway endpoint.
- messages: A list of messages containing the conversation history. Each message has a 'role' and a 'content' attribute.
For other query parameters, you can refer to the interface documentation for more detailed information.
The response of chat completion will be a JSON object containing the assistant-generated response. For example:
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"role": "assistant",
"content": "I'm doing well, thank you for asking. How can I help you today?"
}
}
],
"created": 1632345678,
"model": "gpt-4-turbo",
"object": null,
"system_fingerprint": null,
"usage": {
"completion_tokens": 100,
"prompt_tokens": 20,
"total_tokens": 120
}
}
You can use the generated responses to continue the conversation or display them in your application. The chat completion feature allows you to create engaging and interactive chat experiences for your users.
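To continue a conversation, a common pattern (sketched here, not part of the API itself) is to append the assistant's reply to the message list before sending the next request:

```python
def extend_conversation(history, assistant_reply, next_user_message):
    """Append the assistant's reply and the user's next turn to the history.

    The resulting list is sent as 'messages' in the next request, so the
    model sees the whole conversation so far.
    """
    return history + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]

history = [{"role": "user", "content": "hello"}]
# Suppose the endpoint returned this reply (invented for illustration):
reply = "I'm doing well, thank you for asking. How can I help you today?"
history = extend_conversation(history, reply, "Tell me a joke.")
```

Keep in mind that the full history counts against the model's context window, so long conversations may need truncation or summarization.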
Seamless Integration
This is an example of how to integrate Cloudsway with the Python OpenAI library:
from openai import OpenAI

client = OpenAI(
    base_url="https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}",
    api_key="YOUR_API_KEY"
)

message_text = [{"role": "user", "content": "what's 1+1? Answer in one word."}]

completion = client.chat.completions.create(
    model="",  # model = "deployment_name"
    messages=message_text,
    temperature=0.7,
    max_tokens=800,
    top_p=0.95,
)
print(completion)
By following these steps, you can effectively use OpenAI Chat Completion to generate responses based on conversation history. Try using different messages and parameters to create an engaging chat experience for your users.
Stream Completion
Streaming completion uses the same underlying model and request format as chat completion, but the response is delivered incrementally as it is generated rather than as a single batch. To enable it, add the stream parameter to your request. For example:
curl -X POST 'https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions' -H 'Authorization: Bearer {YOUR_API_KEY}' -H 'Content-Type: application/json' --data '{
"model": "gpt-4",
"messages": [
{
"role": "user",
"content": "hello"
}
],
"stream": true
}'
In Python, pass stream=True:
completion = client.chat.completions.create(
    model="",  # model = "deployment_name"
    stream=True,
    messages=message_text,
    temperature=0.7,
    max_tokens=800,
    top_p=0.95,
)
The streaming response is delivered as a sequence of JSON chunk objects, each carrying an incremental delta of the assistant's reply. For example:
{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-3.5-turbo-0125", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}
You can use the generated responses to continue the conversation or display them in your application. The streaming completion feature allows you to create an engaging real-time interactive chat experience for your users.
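Each streamed chunk carries an incremental delta rather than the full message, so the client concatenates the delta content fields to rebuild the reply. A minimal sketch over invented chunk dicts in the shape shown above:

```python
import json

def collect_content(chunks):
    """Concatenate the incremental 'delta' content fields from stream chunks."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:  # role-only and final chunks carry no content
                parts.append(content)
    return "".join(parts)

# Invented chunks, mirroring the chat.completion.chunk shape above:
raw = [
    '{"choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}',
    '{"choices":[{"index":0,"delta":{"content":"Hello"}}]}',
    '{"choices":[{"index":0,"delta":{"content":", world"}}]}',
    '{"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
]
print(collect_content(json.loads(c) for c in raw))  # → Hello, world
```

With the OpenAI SDK the same loop is simply `for chunk in completion:` over the object returned when `stream=True`.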
MultiModal Machine Learning
Introduction
A multimodal model is an artificial intelligence model capable of processing and understanding multiple types of data (such as text, images, and audio). Unlike single-modal models, multimodal models integrate information from different modalities, providing more comprehensive and accurate understanding and generation capabilities. For example, MaaS 4o is a multimodal model that can process text and image inputs and generate corresponding outputs.
MultiModal Machine Learning Input Image
The following is sample code for a best-practice call using MaaS 4o to process text and image inputs: given a food image, MaaS 4o analyzes the ingredients and calories in the food.
curl --request POST \
--url https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions \
--header 'Accept: */*' \
--header 'Accept-Encoding: gzip, deflate, br' \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--header 'Connection: keep-alive' \
--header 'Content-Type: application/json' \
--header 'User-Agent: PostmanRuntime-ApipostRuntime/1.1.0' \
--data '{
"messages": [
{
"role": "system",
"content": "Assume you are a food expert. You need to analyze the ingredients and calories from the provided image and return the result in JSON format"
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Analyze this"
},
{
"type": "image_url",
"image_url": {
"url": "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSqbvh1y4Kc3bOG60cNFQ1HRIYeMyg5hYhKNQ&s"
}
}
]
}
],
"stream": false,
"stream_options": {
"include_usage": true
}
}'
In this example, we send a POST request to the completion endpoint with the following parameters:
- role: The role of the message: "system", "user", or "assistant".
- content: The content of the message; its format varies depending on the role.
- stream: A boolean indicating whether to use streaming. In this example the value is false, so streaming is not used.
The completion response will be a JSON object containing the generated text. For example:
{
"created": 1721984825,
"usage": {
"completion_tokens": 313,
"prompt_tokens": 248,
"total_tokens": 561
},
"model": null,
"id": "chatcmpl-9pBCjZZoDwaTKYjiRz5HomWHnSEZC",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"function_call": null,
"tool_calls": null,
"content": "This is an image of a hamburger. Based on the image, it contains the following ingredients:\n\n- Sesame bun (top and bottom)\n- Lettuce\n- Tomato slices\n- Bacon slices\n- Grilled beef patty\n- Possibly some sauce (e.g., mayonnaise or salad dressing)\n\nHere is an approximate calorie analysis for each ingredient:\n\n1. **Sesame bun** (top and bottom): ~150 calories\n2. **Lettuce**: ~5 calories\n3. **Tomato slices**: ~10 calories\n4. **Bacon slices (2-3 slices)**: ~120 calories\n5. **Grilled beef patty**: ~250 calories\n6. **Sauce**: ~50 calories\n\nThe approximate total calories are as follows:\n\n```json\n{\n \"components\": {\n \"sesame_bun\": 150,\n \"lettuce\": 5,\n \"tomato_slices\": 10,\n \"bacon_slices\": 120,\n \"grilled_beef_patty\": 250,\n \"sauce\": 50\n },\n \"total_calories\": 585\n}\n```\n\nPlease note that this is only a rough estimate. Actual calories may vary depending on the quantity and type of ingredients."
},
"logprobs": null
}
],
"system_fingerprint": null,
"object": "chat.completion"
}
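Images can also be supplied inline as a base64 data URL instead of a public link. A minimal sketch of building such a message (the data-URL format follows the OpenAI vision convention; the byte string here is a stand-in for real JPEG data):

```python
import base64

def image_message(prompt_text, image_bytes, mime="image/jpeg"):
    """Build a user message combining text and an inline base64 image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt_text},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

msg = image_message("Analyze this", b"\xff\xd8\xff")  # stand-in JPEG bytes
```

The resulting dict drops straight into the "messages" array of the request shown above.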
MultiModal Machine Learning Input Video
curl --request POST \
--url https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--header 'content-type: application/json' \
--data '{
"stream": true,
"messages": [
{
"role": "system",
"content": "You are an intelligent assistant"
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Translate this video to text"
},
{
"type": "image_url",
"image_url": {
"url": "https://xxx.mp4 or similar data:video/mp4;base64,xxx base64 format"
}
}
]
}
]
}'
MultiModal Machine Learning Input Audio
curl --request POST \
--url https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--header 'content-type: application/json' \
--data '{
"stream": true,
"messages": [
{
"role": "system",
"content": "Convert speech to text"
},
{
"role": "user",
"content": [
{
"type": "input_audio",
"input_audio": {
"data": "audio base64, without prefix like data:audio/mp3;base64",
"format": "mp3"
}
}
]
}
]
}'
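In Python, the input_audio part can be built from raw audio bytes. A minimal sketch (note that, per the example above, 'data' is bare base64 without a data: prefix; the byte string here is a stand-in for a real MP3 file):

```python
import base64

def audio_message(audio_bytes, fmt="mp3"):
    """Build a user message carrying raw base64 audio.

    Per the request shape above, 'data' is bare base64 with no
    'data:audio/mp3;base64,' prefix.
    """
    return {
        "role": "user",
        "content": [
            {"type": "input_audio",
             "input_audio": {
                 "data": base64.b64encode(audio_bytes).decode("ascii"),
                 "format": fmt,
             }},
        ],
    }

msg = audio_message(b"ID3\x03\x00")  # stand-in MP3 bytes
```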
MaaS_GP Series
Supports two invocation methods: an OpenAI-compatible protocol and the native protocol.
Compatible with OpenAI Protocol
The endpoint format compatible with the OpenAI protocol is generally:
https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions
Example of REST API Call
curl "https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer RWYh************x0GC" \
-d '{
"messages": [
{
"role": "developer",
"content": "Talk like a pirate."
},
{
"role": "user",
"content": "Are semicolons optional in JavaScript?"
}
]
}'
SDK Call Example
from openai import OpenAI

client = OpenAI(
    api_key="xxxx",
    base_url="https://genaiapi.cloudsway.net/v1/ai/xxxx/"
)
resp = client.chat.completions.create(
    model="gpt5-pro",
    messages=[{
        "role": "user",
        "content": "How much gold would it take to coat the Statue of Liberty in a 1mm layer?"
    }]
)
print(resp.choices[0].message.content)
MaaS_Cl Series
Supports two invocation methods: an OpenAI-compatible protocol and the native protocol.
Compatible with OpenAI Protocol
The endpoint format compatible with the OpenAI protocol is generally: https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions
Example of REST API Call
curl --location 'https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions' \
--header 'Authorization: Bearer {YOUR_ACCESS_KEY}' \
--header 'Content-Type: application/json' \
--data '{
"model": "claude-3-7-sonnet-20250219",
"max_tokens": 2048,
"stream":false,
"messages": [
{
"role": "user",
"content": "Prove the Pythagorean theorem"
}
],
"thinking": { "type": "enabled", "budget_tokens": 2000 }
}'
SDK Call Example
from openai import OpenAI

client = OpenAI(
    base_url="https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}",
    api_key="{YOUR KEY}"
)

messages = [
    {
        "role": "user",
        "content": "hi"
    }
]

# chat completion
result = client.chat.completions.create(
    model="modelName",
    messages=messages,
    stream=False,
    extra_body={
        "thinking": {"type": "enabled", "budget_tokens": 2000}
    }
)
print(result.model_dump_json())
A second example uses the Anthropic SDK directly:
import anthropic

client = anthropic.Anthropic(
    base_url="https://genaiapi.cloudsway.net/xxx",
    api_key="xxx"
)
message = client.beta.messages.create(
    model="xxx",
    max_tokens=4096,
    system="You are an expert",
    messages=[{"role": "user", "content": "Write me a 300-word essay"}],
    stream=True,
    thinking={
        "type": "enabled",
        "budget_tokens": 4000
    }
)
for chunk in message:
    print(chunk)
Native Protocol
The format of endpoints following the native protocol is generally:
https://genaiapi.cloudsway.net/{YOUR_ENDPOINTPATH}/v1/messages
Example of REST API Call
curl --request POST \
--url https://genaiapi.cloudsway.net/{YOUR_ENDPOINTPATH}/v1/messages \
--header 'Accept: */*' \
--header 'Accept-Encoding: gzip, deflate, br' \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--header 'Connection: keep-alive' \
--header 'Content-Type: application/json' \
--header 'User-Agent: PostmanRuntime-ApipostRuntime/1.1.0' \
--data '{
"max_tokens": 5000,
"stream": false,
"system": [
{
"type": "text",
"text": "You are an AI assistant tasked with analyzing literary works"
}
],
"messages": [
{
"role": "user",
"content": "Analyze the major themes in Pride and Prejudice."
}
]
}'
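The same call can be assembled in Python. A minimal sketch that builds the request body shown above (the endpoint path and API key remain placeholders; any HTTP client can send it):

```python
import json

def native_messages_payload(system_text, user_text, max_tokens=5000):
    """Build a request body for the native /v1/messages endpoint,
    mirroring the curl example above."""
    return {
        "max_tokens": max_tokens,
        "stream": False,
        "system": [{"type": "text", "text": system_text}],
        "messages": [{"role": "user", "content": user_text}],
    }

body = json.dumps(native_messages_payload(
    "You are an AI assistant tasked with analyzing literary works",
    "Analyze the major themes in Pride and Prejudice.",
))
# POST `body` to https://genaiapi.cloudsway.net/{YOUR_ENDPOINTPATH}/v1/messages
# with headers 'Authorization: Bearer {YOUR_API_KEY}' and
# 'Content-Type: application/json'.
```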
MaaS_Ge Series
Supports two invocation methods: an OpenAI-compatible protocol and the native protocol.
Compatible with OpenAI Protocol
Basic Usage
- stream=true: streaming
- stream=false: non-streaming
Example of REST API Call
curl --location 'https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions' \
--header 'Authorization: Bearer {YOUR_ACCESS_KEY}' \
--header 'Content-Type: application/json' \
--data '{
"max_tokens": 2048,
"stream": true,
"messages": [
{
"role": "user",
"content": "What is 1+1"
}
],
"thinking": {
"type": "enabled",
"budget_tokens": 2000
}
}'
Outputting thinking content
curl --location 'https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/chat/completions' \
--header 'Authorization: Bearer {YOUR_ACCESS_KEY}' \
--header 'Content-Type: application/json' \
--data-raw '{
"model": "MaaS_2.5_Flash",
"messages": [
{
"role": "user",
"content": "1 + 2 + 3 + ... + 5 = ?"
}
],
"max_tokens":4096,
"stream": false,
"thinking": {
"type": "enabled",
"budget_tokens":1024
}
}'
SDK Call Example
from openai import OpenAI

client = OpenAI(
    api_key="YOUR AK",
    base_url="https://genaiapi.cloudsway.net/v1/ai/{endpointPath}/"
)

resp = client.chat.completions.create(
    messages=[{
        "role": "user",
        "content": "Who are you"
    }],
    model=""
)
print(resp.choices[0].message.content)
Native Protocol
The endpoint format following the native protocol is generally: https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/generateContent
Example of REST API Call
curl --location 'https://genaiapi.cloudsway.net/v1/ai/{YOUR_ENDPOINTPATH}/generateContent' \
--header 'Authorization: Bearer {YOUR_API_KEY}' \
--header 'Content-Type: application/json' \
--data '{
"systemInstruction": {
"parts": [
{
"text": "You are a helpful AI assistant"
}
]
},
"contents": [
{
"role": "user",
"parts": [
{
"text": "Write a 150-word fable"
}
]
}
],
"generationConfig": {
"temperature": 0.8,
"topP": 0.9,
"topK": 20,
"maxOutputTokens": 3000,
"candidateCount": 1,
"stopSequences": [
"history"
],
"thinkingConfig": {
"includeThoughts": true,
"thinkingBudget": 0
}
},
"safetySettings": [
{
"category": "HARM_CATEGORY_HARASSMENT",
"threshold": "BLOCK_NONE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"threshold": "BLOCK_NONE"
},
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"threshold": "BLOCK_NONE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"threshold": "BLOCK_NONE"
}
]
}'
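The generationConfig block above can also be assembled programmatically. A minimal sketch using the same camelCase field names (defaults mirror the example; candidateCount is fixed at 1 for illustration):

```python
def generation_config(temperature=0.8, top_p=0.9, top_k=20,
                      max_output_tokens=3000, thinking_budget=0,
                      include_thoughts=True):
    """Build the camelCase generationConfig block used by /generateContent.

    A thinking_budget of 0 disables extended thinking; a positive value
    caps the number of thinking tokens.
    """
    return {
        "temperature": temperature,
        "topP": top_p,
        "topK": top_k,
        "maxOutputTokens": max_output_tokens,
        "candidateCount": 1,
        "thinkingConfig": {
            "includeThoughts": include_thoughts,
            "thinkingBudget": thinking_budget,
        },
    }

cfg = generation_config(thinking_budget=1024)
```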