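Chat Completions API Documentation

POST https://genaiapi.cloudsway.net/v1/ai/{ENDPOINT_PATH}/chat/completions

Overview: Creates a model response for the given chat conversation.

Request parameters

Field	Description	Level	Type	Required	Notes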
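messages	List of messages that make up the conversation so far	1	array	Yes	Depending on the model used, different message content types (modalities) are supported, including text, images, and audio
Developer message	Developer-provided instructions that the model should follow regardless of the messages sent by the user	2	object		For o1 and newer models, developer messages replace the previous system messages
content	The content of the developer message	3	string or array	Yes
Text content	The content of the developer message	4	string
Array of content parts	An array of content parts with a defined type	4	array		For developer messages, only the text type (`text`) is supported
text	The text content	5	string	Yes
type	The type of the content part	5	string	Yes
role	The role of the message author	3	string	Yes	Always `developer`
name	An optional name for the participant. Gives the model information to distinguish between participants with the same role	3	string
System message	Developer-provided instructions that the model should follow regardless of the messages sent by the user	2	object		For o1 and newer models, use developer messages instead
content	The content of the system message	3	string or array	Yes
Text content	The content of the system message	4	string
Array of content parts	An array of content parts with a defined type	4	array		For system messages, only the text type (`text`) is supported
Text content part	A text content part	5	object
text	The text content	6	string	Yes
type	The type of the content part	6	string	Yes
role	The role of the message author	3	string	Yes	Always `system`
name	An optional name for the participant. Gives the model information to distinguish between participants with the same role	3	string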
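User message	A message sent by the end user, containing a prompt or additional context	2	object
content	The content of the user message	3	string or array	Yes
Text content	The content of the user message	4	string
Array of content parts	An array of content parts with a defined type	4	array		Supported content types vary by the model used to generate the response and can include text, image, or audio inputs
Text content part	A text content part	5	object
text	The text content	6	string	Yes
type	The type of the content part	6	string	Yes
Image content part		5	object
image_url		6	object	Yes
url	The URL of the image, or base64-encoded image data	7	string	Yes
detail	Specifies the detail level of the image	7	string		Defaults to `auto`
type	The type of the content part	6	string	Yes
role	The role of the message author	3	string	Yes	Always `user`
name	An optional name for the participant. Gives the model information to distinguish between participants with the same role	3	string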
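Assistant message	A message sent by the model in response to user messages	2	object
role	The role of the message author	3	string	Yes	Always `assistant`
audio	Data about a previous audio response from the model	3	object or null
id	Unique identifier of a previous audio response from the model	4	string	Yes
content	The content of the assistant message	3	string or array		Required unless `tool_calls` or `function_call` is specified
Text content	The text content of the assistant message	4	string
Array of content parts	An array of content parts with a defined type	4	array		Can contain one or more parts of type `text`, or exactly one part of type `refusal`
Text content part		5	object
text	The text content	6	string	Yes
type	The type of the content part	6	string	Yes
Refusal content part		5	object
refusal	The refusal message generated by the model	6	string	Yes
type	The type of the content part	6	string	Yes
function_call	Deprecated; replaced by `tool_calls`	3	object or null		Contains the name and arguments of the function that should be called, as generated by the model
name	The name of the function to call	4	string	Yes
arguments	The arguments to call the function with, generated by the model in JSON format	4	string	Yes	Note: the model does not always generate valid JSON and may produce parameters not defined in the function schema. Validate the arguments before calling the function
name	An optional name for the participant. Used to distinguish between participants with the same role	3	string
refusal	The refusal message from the assistant	3	string or null
tool_calls	The tool calls generated by the model, such as function calls	3	array
Function tool call	A call to a function tool created by the model	4	object
function	The function that the model called	5	object	Yes
arguments	The arguments to call the function with, generated by the model in JSON format	6	string	Yes	Note: the model does not always generate valid JSON and may hallucinate parameters not defined in the function schema. Validate the arguments in your code before calling the function
name	The name of the function to call	6	string	Yes
id	The ID of the tool call	5	string	Yes
type	The type of the tool. Currently only `function` is supported	5	string	Yes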
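Tool message		2	object
content	The content of the tool message	3	string or array	Yes
Text content	The text content of the tool message	4	string
Array of content parts	An array of content parts with a defined type. For tool messages, only the text type (`text`) is supported	4	array
Text content part		5	object
text	The text content	6	string	Yes
type	The type of the content part	6	string	Yes
role	The role of the message author, in this case `tool`	3	string	Yes
tool_call_id	The ID of the tool call that this message responds to	3	string	Yes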
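model	ID of the model used to generate the response, e.g. `gpt-4o` or `o3`	1	string	Yes
frequency_penalty	Number between -2.0 and 2.0. Defaults to 0	1	number or null		Positive values penalize tokens based on their existing frequency in the text so far, making the model less likely to repeat the same content verbatim
logit_bias	Modifies the likelihood of specified tokens appearing in the completion. Defaults to null	1	map		Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to a bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model before sampling. The exact effect varies by model, but typically values between -1 and 1 slightly decrease or increase the likelihood of selection, while extreme values such as -100 or 100 effectively ban or force selection of the token
max_completion_tokens	An upper bound on the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens	1	integer or null
n	How many chat completion choices to generate for each input message. Defaults to 1	1	integer or null		Note: you are charged based on the total number of tokens generated across all choices. Keep `n=1` to control costs
parallel_tool_calls	Whether to enable parallel function calling during tool use. Defaults to true	1	boolean
presence_penalty	Number between -2.0 and 2.0. Defaults to 0	1	number or null		Positive values penalize tokens based on whether they already appear in the text so far, making the model more likely to talk about new topics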
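response_format	An object specifying the format that the model must output	1	object
Text	Default text response format	2	object
type	The type of response format being defined. Always `text`	3	string	Yes
JSON schema	JSON Schema response format. Used to generate structured JSON responses	2	object
json_schema	Structured output configuration options, including a JSON Schema	3	object	Yes
name	The name of the response format. Must consist of a-z, A-Z, 0-9, underscores, and hyphens, with a maximum length of 64	4	string	Yes
description	A description of what the response format is for, used by the model to determine how to respond in the format	4	string
schema	The schema for the response format, described as a JSON Schema object	4	object
strict	Whether to enable strict schema adherence when generating the output. Defaults to false. If set to true, the model always follows the exact schema defined in the `schema` field. Only a subset of JSON Schema is supported when `strict` is true	4	boolean or null
type	The type of response format being defined. Always `json_schema`	3	string	Yes
JSON object	JSON object response format	2	object		An older method of generating JSON responses. Using `json_schema` is recommended for models that support it. Note that the model will not generate JSON unless a system or user message instructs it to do so
type	The type of response format being defined. Always `json_object`	3	string	Yes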
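seed	(Deprecated) If specified, the system makes a best effort to sample deterministically, so that repeated requests with the same `seed` and parameters should return the same result. Determinism is not guaranteed; use the `system_fingerprint` response parameter to monitor backend changes	1	integer or null		This feature is in beta
stop	Not supported by the latest reasoning models o3 and o4-mini. Defaults to null	1	string/array/null		Up to 4 stop sequences can be set. The API stops generating when it reaches one of them, and the returned text does not contain the stop sequence itself
store	Whether to store the output of this chat completion request for use in model distillation or evals products. Defaults to false	1	boolean or null		Supports text and image inputs. Note: image inputs over 10MB are dropped
stream	If set to true, the model response data is streamed using server-sent events (SSE). Defaults to false	1	boolean or null
stream_options	Options for streaming responses (only effective when `stream` is set to true). Defaults to null	1	object or null
include_obfuscation	When stream obfuscation is enabled (default true), random characters are added to streaming events to normalize payload sizes as a mitigation against side-channel attacks. Set to false to save bandwidth, provided the network links are trusted	2	boolean
include_usage	If set, an additional chunk is sent before the final `data: [DONE]` message. Its `usage` field contains the token usage statistics for the entire request, and its `choices` field is always an empty array. Note: if the stream is interrupted, the final chunk containing the total token usage may not be received	2	boolean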
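temperature	Sampling temperature, between 0 and 2. Defaults to 1	1	number or null		Higher values (such as 0.8) make the output more random; lower values (such as 0.2) make it more focused and deterministic. Adjust either this or `top_p`, but not both
tool_choice	Controls whether (and how) the model calls tools	1	string or object		`none`: the model calls no tool and generates a message instead (default when no tools are present). `auto`: the model chooses between generating a message and calling tools (default when tools are present). `required`: the model must call one or more tools. Specific tool: pass `{"type": "function", "function": {"name": "my_function"}}` to force a call to that tool
Tool choice mode	The tool choice mode	2	string		`none`: the model calls no tool and generates a message instead (default when no tools are present). `auto`: the model chooses between generating a message and calling tools (default when tools are present). `required`: the model must call one or more tools
Function tool choice	Specifies a particular tool that the model should call	2	object		Use this to force the model to call a specific function
function		3	object	Yes
name	The name of the function to call	4	string	Yes
type	The type of the tool. Currently only `function` is supported	3	string	Yes
tools	A list of tools the model may call	1	array		Currently only functions are supported as tools. Use this to provide a list of functions for which the model may generate JSON inputs. A maximum of 128 functions is supported
Function tool	A function tool that can be used to generate a response	2	object
function		3	object	Yes
name	The name of the function to be called. Only letters (a-z, A-Z), digits (0-9), underscores, and hyphens are allowed, with a maximum length of 64 characters	4	string	Yes
description	A description of what the function does, used by the model to decide when and how to call it	4	string
parameters	The parameters the function accepts, described as a JSON Schema object	4	object
strict	Whether to enable strict mode when generating the function call. Defaults to false. If set to `true`, the model strictly follows the JSON Schema structure defined in `parameters`	4	boolean or null		Only a subset of JSON Schema features is supported in strict mode
type	The type of the tool. Currently only `function` is supported	3	string	Yes
top_p	An alternative to temperature sampling, called nucleus sampling, in which the model considers only the tokens that make up the top `top_p` probability mass. Defaults to 1	1	number or null		For example, `0.1` means only the tokens comprising the top 10% of probability mass are considered. Recommendation: adjust this parameter or `temperature`, but not both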
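
The nesting described by the Level column can be easier to follow as a concrete request body. The sketch below builds one with a developer message and a user message that carries a text part and an image part. The helper name `build_user_message`, the example image URL, and the parameter values are illustrative assumptions; the content part type values `text` and `image_url` follow the OpenAI-compatible convention and should be confirmed for the model in use.

```python
import json

def build_user_message(text, image_url=None, detail="auto"):
    """Return a user message (hypothetical helper, not part of any SDK).

    Plain text uses the string form of `content`; attaching an image
    switches to the array-of-content-parts form.
    """
    if image_url is None:
        return {"role": "user", "content": text}
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }

# Level 1 fields sit at the top of the body; message fields nest below them.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "developer", "content": "You are a helpful assistant."},
        build_user_message("What is in this image?", "https://example.com/cat.png"),
    ],
    "max_completion_tokens": 256,
    "temperature": 0.2,  # leave top_p at its default when adjusting temperature
}

print(json.dumps(payload, indent=2))
```

The printed JSON can be used as the body of the POST request shown at the top of this page.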

Response

Returns a chat completion object, or a streamed sequence of chat completion chunk objects if streaming is enabled for the request.

Example requests and responses

Default request

Sample code

curl request


curl https://genaiapi.cloudsway.net/v1/ai/{ENDPOINT_PATH}/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {YOUR_ACCESS_KEY}" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {
        "role": "developer",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Python request


from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-5",
  messages=[
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)

Node.js request


import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "developer", content: "You are a helpful assistant." }],
    model: "gpt-5",
    store: true,
  });

  console.log(completion.choices[0]);
}

main();

C# request


using System;
using System.Collections.Generic;

using OpenAI.Chat;

ChatClient client = new(
    model: "gpt-4.1",
    apiKey: Environment.GetEnvironmentVariable("YOUR_ACCESS_KEY")
);

List<ChatMessage> messages =
[
    new SystemChatMessage("You are a helpful assistant."),
    new UserChatMessage("Hello!")
];

ChatCompletion completion = client.CompleteChat(messages);

Console.WriteLine(completion.Content[0].Text);

Default response

{
  "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
  "object": "chat.completion",
  "created": 1741569952,
  "model": "gpt-4.1-2025-04-14",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 10,
    "total_tokens": 29,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default"
}
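
As a rough illustration of how a caller might read this object, the sketch below pulls out the reply text, the finish reason, and the token totals from an abbreviated copy of the response above. The function name `summarize` and the shape of its return value are assumptions made for this example.

```python
def summarize(response):
    """Extract the commonly used fields from a chat completion object."""
    choice = response["choices"][0]  # with n=1 there is exactly one choice
    usage = response["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage["total_tokens"],
        # In the sample above, prompt and completion tokens sum to the total.
        "tokens_add_up": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
    }

# Abbreviated copy of the sample response shown above.
response = {
    "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I assist you today?",
                "refusal": None,
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 19, "completion_tokens": 10, "total_tokens": 29},
}

print(summarize(response))
```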

Streaming request

Sample code

curl request


curl https://genaiapi.cloudsway.net/v1/ai/{ENDPOINT_PATH}/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {YOUR_ACCESS_KEY}" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {
        "role": "developer",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'

Python request


from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-5",
  messages=[
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  stream=True
)

for chunk in completion:
  print(chunk.choices[0].delta)

Node.js request


import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const completion = await openai.chat.completions.create({
    model: "gpt-5",
    messages: [
      {"role": "developer", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    stream: true,
  });

  for await (const chunk of completion) {
    console.log(chunk.choices[0].delta.content);
  }
}

main();

C# request


using System;
using System.ClientModel;
using System.Collections.Generic;
using System.Threading.Tasks;

using OpenAI.Chat;

ChatClient client = new(
    model: "gpt-4.1",
    apiKey: Environment.GetEnvironmentVariable("YOUR_ACCESS_KEY")
);

List<ChatMessage> messages =
[
    new SystemChatMessage("You are a helpful assistant."),
    new UserChatMessage("Hello!")
];

AsyncCollectionResult<StreamingChatCompletionUpdate> completionUpdates = client.CompleteChatStreamingAsync(messages);

await foreach (StreamingChatCompletionUpdate completionUpdate in completionUpdates)
{
    if (completionUpdate.ContentUpdate.Count > 0)
    {
        Console.Write(completionUpdate.ContentUpdate[0].Text);
    }
}

Streaming response

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"role":"assistant","content":""},"logprobs":null,"finish_reason":null}]}

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}

....

{"id":"chatcmpl-123","object":"chat.completion.chunk","created":1694268190,"model":"gpt-4o-mini", "system_fingerprint": "fp_44709d6fcb", "choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}
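
When the stream is consumed without an SDK, the chunks have to be stitched together by the caller. The sketch below shows one way to do that, assuming each event arrives as a `data: ` line and the stream ends with `data: [DONE]`, as mentioned in the `include_usage` notes. The function name `collect_stream` is hypothetical, the sample lines are trimmed copies of the chunks above, and the logic assumes a single choice (`n=1`).

```python
import json

def collect_stream(lines):
    """Join the delta content of streamed chunks; return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines separate SSE events
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if line == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(line)
        # A usage-only chunk has an empty `choices` array and is skipped here.
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason

sample = [
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

text, reason = collect_stream(sample)
print(text, reason)
```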

Functions request

Sample code

curl request


curl https://genaiapi.cloudsway.net/v1/ai/{ENDPOINT_PATH}/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer {YOUR_ACCESS_KEY}" \
-d '{
  "model": "gpt-4.1",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston today?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}'

Python request


from openai import OpenAI
client = OpenAI()

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
          },
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
      },
    }
  }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]
completion = client.chat.completions.create(
  model="gpt-5",
  messages=messages,
  tools=tools,
  tool_choice="auto"
)

print(completion)

Node.js request


import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const messages = [{"role": "user", "content": "What's the weather like in Boston today?"}];
  const tools = [
      {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather in a given location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA",
              },
              "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
          },
        }
      }
  ];

  const response = await openai.chat.completions.create({
    model: "gpt-4.1",
    messages: messages,
    tools: tools,
    tool_choice: "auto",
  });

  console.log(response);
}

main();

C# request


using System;
using System.Collections.Generic;

using OpenAI.Chat;

ChatClient client = new(
    model: "gpt-4.1",
    apiKey: Environment.GetEnvironmentVariable("YOUR_ACCESS_KEY")
);

ChatTool getCurrentWeatherTool = ChatTool.CreateFunctionTool(
    functionName: "get_current_weather",
    functionDescription: "Get the current weather in a given location",
    functionParameters: BinaryData.FromString("""
        {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                    "type": "string",
                    "enum": [ "celsius", "fahrenheit" ]
                }
            },
            "required": [ "location" ]
        }
    """)
);

List<ChatMessage> messages =
[
    new UserChatMessage("What's the weather like in Boston today?"),
];

ChatCompletionOptions options = new()
{
    Tools =
    {
        getCurrentWeatherTool
    },
    ToolChoice = ChatToolChoice.CreateAutoChoice(),
};

ChatCompletion completion = client.CompleteChat(messages, options);

Functions response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699896916,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\n\"location\": \"Boston, MA\"\n}"
            }
          }
        ]
      },
      "logprobs": null,
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 17,
    "total_tokens": 99,
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
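
The parameter table notes that `arguments` may not be valid JSON and may contain parameters outside the function schema, so it should be validated before the function is called. The sketch below shows one possible way to do that and to build the tool message that answers the call. The local `get_current_weather` implementation, its fixed return value, and the helper name `run_tool_call` are hypothetical and exist only for illustration.

```python
import json

def get_current_weather(location, unit="celsius"):
    # Hypothetical local implementation; a real one would query a weather service.
    return {"location": location, "unit": unit, "temperature": 22}

# Mirrors the `parameters` schema sent in the request above.
REQUIRED = {"location"}
ALLOWED = {"location", "unit"}

def run_tool_call(tool_call):
    """Validate the model-generated arguments, run the function, build a tool message."""
    fn = tool_call["function"]
    try:
        args = json.loads(fn["arguments"])
    except json.JSONDecodeError as exc:
        result = {"error": f"invalid JSON arguments: {exc}"}
    else:
        if fn["name"] != "get_current_weather":
            result = {"error": f"unknown function: {fn['name']}"}
        elif not isinstance(args, dict) or not (REQUIRED <= set(args) <= ALLOWED):
            result = {"error": "arguments do not match the function schema"}
        else:
            result = get_current_weather(**args)
    # The tool message must echo the ID of the call it responds to.
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }

tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": "{\n\"location\": \"Boston, MA\"\n}",
    },
}

print(run_tool_call(tool_call))
```

In a follow-up request, the assistant message containing `tool_calls` and this tool message would both be appended to `messages` so that the model can produce its final answer.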

Logprobs request

Sample code

curl request


curl https://genaiapi.cloudsway.net/v1/ai/{ENDPOINT_PATH}/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {YOUR_ACCESS_KEY}" \
  -d '{
    "model": "gpt-5",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "logprobs": true,
    "top_logprobs": 2
  }'

Python request


from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-5",
  messages=[
    {"role": "user", "content": "Hello!"}
  ],
  logprobs=True,
  top_logprobs=2
)

print(completion.choices[0].message)
print(completion.choices[0].logprobs)

Node.js request


import OpenAI from "openai";

const openai = new OpenAI();

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "user", content: "Hello!" }],
    model: "gpt-5",
    logprobs: true,
    top_logprobs: 2,
  });

  console.log(completion.choices[0]);
}

main();

C# request


using System;
using System.Collections.Generic;

using OpenAI.Chat;

ChatClient client = new(
    model: "gpt-4.1",
    apiKey: Environment.GetEnvironmentVariable("YOUR_ACCESS_KEY")
);

List<ChatMessage> messages =
[
    new UserChatMessage("Hello!")
];

ChatCompletionOptions options = new()
{
    IncludeLogProbabilities = true,
    TopLogProbabilityCount = 2
};

ChatCompletion completion = client.CompleteChat(messages, options);

Console.WriteLine(completion.Content[0].Text);

Logprobs response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1702685778,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "logprobs": {
        "content": [
          {
            "token": "Hello",
            "logprob": -0.31725305,
            "bytes": [72, 101, 108, 108, 111],
            "top_logprobs": [
              {
                "token": "Hello",
                "logprob": -0.31725305,
                "bytes": [72, 101, 108, 108, 111]
              },
              {
                "token": "Hi",
                "logprob": -1.3190403,
                "bytes": [72, 105]
              }
            ]
          },
          {
            "token": "!",
            "logprob": -0.02380986,
            "bytes": [
              33
            ],
            "top_logprobs": [
              {
                "token": "!",
                "logprob": -0.02380986,
                "bytes": [33]
              },
              {
                "token": " there",
                "logprob": -3.787621,
                "bytes": [32, 116, 104, 101, 114, 101]
              }
            ]
          },
          {
            "token": " How",
            "logprob": -0.000054669687,
            "bytes": [32, 72, 111, 119],
            "top_logprobs": [
              {
                "token": " How",
                "logprob": -0.000054669687,
                "bytes": [32, 72, 111, 119]
              },
              {
                "token": "<|end|>",
                "logprob": -10.953937,
                "bytes": null
              }
            ]
          },
          {
            "token": " can",
            "logprob": -0.015801601,
            "bytes": [32, 99, 97, 110],
            "top_logprobs": [
              {
                "token": " can",
                "logprob": -0.015801601,
                "bytes": [32, 99, 97, 110]
              },
              {
                "token": " may",
                "logprob": -4.161023,
                "bytes": [32, 109, 97, 121]
              }
            ]
          },
          {
            "token": " I",
            "logprob": -3.7697225e-6,
            "bytes": [
              32,
              73
            ],
            "top_logprobs": [
              {
                "token": " I",
                "logprob": -3.7697225e-6,
                "bytes": [32, 73]
              },
              {
                "token": " assist",
                "logprob": -13.596657,
                "bytes": [32, 97, 115, 115, 105, 115, 116]
              }
            ]
          },
          {
            "token": " assist",
            "logprob": -0.04571125,
            "bytes": [32, 97, 115, 115, 105, 115, 116],
            "top_logprobs": [
              {
                "token": " assist",
                "logprob": -0.04571125,
                "bytes": [32, 97, 115, 115, 105, 115, 116]
              },
              {
                "token": " help",
                "logprob": -3.1089056,
                "bytes": [32, 104, 101, 108, 112]
              }
            ]
          },
          {
            "token": " you",
            "logprob": -5.4385737e-6,
            "bytes": [32, 121, 111, 117],
            "top_logprobs": [
              {
                "token": " you",
                "logprob": -5.4385737e-6,
                "bytes": [32, 121, 111, 117]
              },
              {
                "token": " today",
                "logprob": -12.807695,
                "bytes": [32, 116, 111, 100, 97, 121]
              }
            ]
          },
          {
            "token": " today",
            "logprob": -0.0040071653,
            "bytes": [32, 116, 111, 100, 97, 121],
            "top_logprobs": [
              {
                "token": " today",
                "logprob": -0.0040071653,
                "bytes": [32, 116, 111, 100, 97, 121]
              },
              {
                "token": "?",
                "logprob": -5.5247097,
                "bytes": [63]
              }
            ]
          },
          {
            "token": "?",
            "logprob": -0.0008108172,
            "bytes": [63],
            "top_logprobs": [
              {
                "token": "?",
                "logprob": -0.0008108172,
                "bytes": [63]
              },
              {
                "token": "?\n",
                "logprob": -7.184561,
                "bytes": [63, 10]
              }
            ]
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 9,
    "total_tokens": 18,
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "system_fingerprint": null
}
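
Each `logprob` is a natural-log probability, so exponentiating it gives the probability the model assigned to that token. The sketch below converts the first two entries of the response above and also multiplies them into a joint probability by summing the log values. The function names are assumptions for this example.

```python
import math

def token_probabilities(logprobs):
    """Return (token, probability) pairs for each generated token."""
    return [(item["token"], math.exp(item["logprob"])) for item in logprobs["content"]]

def sequence_probability(logprobs):
    """Joint probability of the listed tokens: exp of the summed log probabilities."""
    return math.exp(sum(item["logprob"] for item in logprobs["content"]))

# First two entries of `choices[0].logprobs` from the sample response above.
logprobs = {
    "content": [
        {"token": "Hello", "logprob": -0.31725305},
        {"token": "!", "logprob": -0.02380986},
    ]
}

for token, prob in token_probabilities(logprobs):
    print(f"{token!r}: {prob:.3f}")
print(f"joint: {sequence_probability(logprobs):.3f}")
```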