New API Features in GPT-5

Minimal reasoning effort

The reasoning effort parameter (reasoning_effort in chat completions requests) controls how many reasoning tokens the model generates before producing a response. Earlier reasoning models (such as o3) supported only low, medium, and high.

low favors speed and fewer reasoning tokens, while high favors more thorough reasoning.

The new minimal setting generates very few reasoning tokens, suitable for scenarios requiring the fastest response times. In general, having the model generate a small amount of reasoning when necessary performs better than generating none at all. The default setting is medium.

The minimal setting works particularly well for code generation and instruction following, where it adheres closely to the given instructions. However, at this setting the model may need explicit prompting to act more proactively. Even at minimal effort, reasoning quality can be improved by encouraging the model to "think" or list its steps before answering.

curl --request POST \
--url https://genaiapi.cloudsway.net/v1/ai/xxx/chat/completions \
--header "Authorization: Bearer $API_KEY" \
--header 'Content-type: application/json' \
--data '{
  "model": "gpt-5",
  "messages": [
    {
      "role": "user",
      "content": "How much gold would it take to coat the Statue of Liberty in a 1mm layer?"
    }
  ],
  "reasoning_effort": "minimal"
}'
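
The effort levels and the "think first" nudge described above can be wrapped in a small helper. The following is a minimal Python sketch, not an official client: build_payload, its think_first flag, and the nudge wording are hypothetical names chosen for illustration. It builds the same JSON body as the curl example and rejects effort values outside the four levels named in this section.

```python
import json

# The four effort levels named in this section.
ALLOWED_EFFORTS = ("minimal", "low", "medium", "high")

# Illustrative nudge text (an assumption, not wording from any official guide).
THINK_FIRST = "Briefly list the steps you will take before giving the final answer."


def build_payload(question, effort="medium", think_first=False):
    """Build a chat completions request body with a validated reasoning_effort."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    # Optionally prepend the nudge so the model outlines its steps first.
    content = f"{THINK_FIRST}\n\n{question}" if think_first else question
    return {
        "model": "gpt-5",
        "messages": [{"role": "user", "content": content}],
        "reasoning_effort": effort,
    }


payload = build_payload(
    "How much gold would it take to coat the Statue of Liberty in a 1mm layer?",
    effort="minimal",
    think_first=True,
)
print(json.dumps(payload, indent=2))
```

The printed body can be passed to --data in the curl command above or sent with any HTTP client.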

Verbosity

Verbosity determines how many output tokens are generated. Reducing the number of tokens can lower overall latency. Although the model's reasoning remains largely unchanged, it will look for more concise ways to answer, which may improve or reduce answer quality depending on your use case. The two ends of the verbosity spectrum suit different scenarios:

High verbosity: Suitable for scenarios where the model needs to provide detailed documentation explanations or perform large-scale code refactoring.
Low verbosity: Ideal for scenarios requiring concise answers or simple code generation, such as SQL queries.

Models prior to GPT-5 used medium verbosity by default. Starting from GPT-5, this option can be configured as high, medium, or low.

When generating code, medium and high verbosity produce longer, more structured code with inline comments, while low verbosity produces shorter, more concise code with fewer comments.

curl --request POST \
--url https://genaiapi.cloudsway.net/v1/ai/xxx/chat/completions \
--header "Authorization: Bearer $API_KEY" \
--header 'Content-type: application/json' \
--data '{
  "model": "gpt-5",
  "messages": [
    { "role": "user", "content": "What is the answer to the ultimate question of life, the universe, and everything?" }
  ],
  "verbosity": "low"
}'
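
One way to see the effect of verbosity is to send the same prompt at each level and compare the output lengths that come back. The sketch below only builds the three request bodies and leaves sending them to whichever HTTP client you use; verbosity_variants is a hypothetical helper name and the prompt is an arbitrary example.

```python
# The three verbosity levels named in this section.
VERBOSITY_LEVELS = ("low", "medium", "high")


def verbosity_variants(prompt, model="gpt-5"):
    """Return one request body per verbosity level for the same prompt."""
    return {
        level: {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "verbosity": level,
        }
        for level in VERBOSITY_LEVELS
    }


variants = verbosity_variants("Write a SQL query that counts orders per customer.")
for level, body in variants.items():
    print(level, "->", body["verbosity"])
```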

Custom tools

In GPT-5, we introduce a new feature called Custom Tools. This feature allows the model to send arbitrary raw text as input for tool calls while still constraining the output when needed.

Freeform inputs

By defining a tool with type: custom, you let the model send plain text directly to your tool instead of structured JSON. The model can send any raw text (such as code, SQL queries, shell commands, configuration files, or even long articles) directly to your tool.

This makes tool calls more flexible and suits scenarios that involve complex or unstructured input.

{
    "type": "custom",
    "name": "code_exec",
    "description": "Executes arbitrary python code"
}

curl --request POST \
--url https://genaiapi.cloudsway.net/v1/ai/xxx/chat/completions \
--header "Authorization: Bearer $API_KEY" \
--header 'Content-type: application/json' \
--data '{
  "model": "gpt-5",
  "messages": [
    { "role": "user", "content": "Use the code_exec tool to calculate the area of a circle with radius equal to the number of r letters in blueberry" }
  ],
  "tools": [
    {
      "type": "custom",
      "custom": {
        "name": "code_exec",
        "description": "Executes arbitrary python code"
      }
    }
  ]
}'
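
When the model calls a custom tool, the reply carries raw text rather than JSON arguments. The sketch below assumes the response mirrors the OpenAI chat completions shape, in which each entry of message.tool_calls has type custom and a custom object holding name and input. That shape is an assumption here and should be checked against a real reply from this endpoint; the sample response is hand-written for illustration.

```python
def extract_custom_inputs(response):
    """Return (tool_name, raw_text) pairs for every custom tool call in a response.

    Assumes an OpenAI-style chat completions response dict; the field names
    are an assumption and should be verified against an actual reply.
    """
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message") or {}
        for call in message.get("tool_calls") or []:
            if call.get("type") == "custom":
                custom = call.get("custom", {})
                calls.append((custom.get("name"), custom.get("input", "")))
    return calls


# Hand-written sample for illustration only, not captured from a real request.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_example",
                        "type": "custom",
                        "custom": {
                            "name": "code_exec",
                            "input": "import math\nprint(math.pi * 2 ** 2)",
                        },
                    }
                ],
            }
        }
    ]
}

for name, raw_text in extract_custom_inputs(sample_response):
    print(f"{name} received:\n{raw_text}")
```

Because the input is an unparsed string, the tool implementation decides how to interpret it.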

Constraining outputs

GPT-5 supports Context-Free Grammars (CFGs) for custom tools, allowing you to provide a Lark grammar to constrain the output to a specific syntax or domain-specific language (DSL).

By attaching a CFG (such as SQL or a DSL grammar), you can ensure that the text generated by the assistant conforms to your grammar rules.

This makes tool calls more precise and controlled: strict syntax or domain-specific formats are enforced directly in GPT-5's tool calls, which improves reliability in complex or restricted domains.
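
As an illustration, a custom tool restricted to simple integer arithmetic might be defined as in the Python sketch below. Several things here are assumptions: the format block (type grammar, with syntax and definition nested under grammar) is modeled on the OpenAI chat completions schema and has not been verified against this endpoint, and the tool name math_expr and the grammar itself are hypothetical.

```python
import json

# Hypothetical Lark grammar: integers combined with + and *,
# with a single space on each side of every operator.
ARITHMETIC_GRAMMAR = """
start: expr
expr: term (SP ADD SP term)*
term: factor (SP MUL SP factor)*
factor: INT
SP: " "
ADD: "+"
MUL: "*"
%import common.INT
"""

math_tool = {
    "type": "custom",
    "custom": {
        "name": "math_expr",
        "description": "Evaluates an arithmetic expression such as 4 + 5 * 6.",
        # Assumed field layout for grammar-constrained tool input.
        "format": {
            "type": "grammar",
            "grammar": {"syntax": "lark", "definition": ARITHMETIC_GRAMMAR},
        },
    },
}

request_body = {
    "model": "gpt-5",
    "messages": [
        {"role": "user", "content": "Use the math_expr tool to compute 4 plus 5 times 6."}
    ],
    "tools": [math_tool],
}
print(json.dumps(request_body, indent=2))
```

Under these assumptions, the text the model sends to math_expr would have to match the grammar, for example 4 + 5 * 6.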

Best practices for custom tools

Write concise and clear tool descriptions: The model will select what to send based on your description; if you want the model to always call the tool, state it explicitly.
Validate output on the server side: Free-form strings are powerful, but you must guard against injection attacks and unsafe commands before acting on them.
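
The second practice can be sketched as an allowlist check that runs before any tool input is executed. The Python example below assumes a hypothetical tool that accepts only arithmetic expressions; is_safe_arithmetic and the 200-character limit are illustrative choices, and the check is a pre-filter, not a sandbox.

```python
import ast

# Node types permitted in a purely arithmetic expression.
_ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub, ast.UAdd,
)

MAX_INPUT_LENGTH = 200  # arbitrary cap for this sketch


def is_safe_arithmetic(source):
    """Return True only if source is a short expression of numbers and arithmetic operators."""
    if len(source) > MAX_INPUT_LENGTH:
        return False
    try:
        tree = ast.parse(source, mode="eval")
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        # Reject names, calls, attribute access, and anything else not allowlisted.
        if not isinstance(node, _ALLOWED_NODES):
            return False
        # Reject string, bytes, and other non-numeric literals.
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            return False
    return True


print(is_safe_arithmetic("3.14159 * 2 ** 2"))               # True
print(is_safe_arithmetic("__import__('os').system('ls')"))  # False
```

Passing this check does not make evaluation cheap or safe on its own (an expression such as 9 ** 9 ** 9 is accepted but expensive to compute), so execution should still run with resource limits and in isolation.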

Allowed tools

allowed_tools under the tool_choice parameter allows you to pass N tool definitions but restrict the model to using only M of them (M < N).

You can list the complete set of tools in tools, then use the allowed_tools block to specify a subset and define the mode:

auto: The model may call any of the allowed tools, or respond without calling a tool.
required: The model must call one or more of the allowed tools.

By separating all possible tools from the currently available subset, you can achieve higher security, predictability, and better prompt caching. Additionally, you can avoid fragile prompt engineering issues, such as hard-coded call sequences. GPT-5 can dynamically call or request specific functions during a conversation while reducing the risk of accidental tool usage in long contexts.
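
The separation described above can be sketched as a helper that keeps the full tools list fixed and derives only the tool_choice block per request. In this Python sketch, tool_ref and allowed_tools_choice are hypothetical helper names; the output shape follows the tool_choice block used in this section's request example.

```python
def tool_ref(tool):
    """Reduce a full tool definition to the {type, name} reference used in allowed_tools."""
    kind = tool["type"]  # "function" or "custom"
    return {"type": kind, kind: {"name": tool[kind]["name"]}}


def allowed_tools_choice(all_tools, allowed_names, mode="auto"):
    """Build a tool_choice block that allows only the named subset of all_tools."""
    if mode not in ("auto", "required"):
        raise ValueError(f"unsupported mode: {mode!r}")
    by_name = {tool[tool["type"]]["name"]: tool for tool in all_tools}
    missing = [name for name in allowed_names if name not in by_name]
    if missing:
        raise ValueError(f"not declared in tools: {missing}")
    return {
        "type": "allowed_tools",
        "allowed_tools": {
            "mode": mode,
            "tools": [tool_ref(by_name[name]) for name in allowed_names],
        },
    }


# The complete tool set; this list stays the same across requests.
ALL_TOOLS = [
    {
        "type": "custom",
        "custom": {"name": "code_exec", "description": "Executes arbitrary python code"},
    },
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    },
]

# For this turn, force a call and allow only get_weather.
choice = allowed_tools_choice(ALL_TOOLS, ["get_weather"], mode="required")
print(choice)
```

Only tool_choice changes between turns, so the tools portion of the request stays byte-identical, which is what helps prompt caching.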

Aspect	Standard Tools	Allowed Tools
Model Scope	All tools listed under "tools": [...]	Only the subset of "tools": [...] named in tool_choice
Tool Invocation	The model may or may not invoke any listed tool	The model may invoke (mode auto) or must invoke (mode required) only the allowed tools
Purpose	Declare the full set of available functions	Restrict the set of functions usable in the current request

Using allowed tools with the chat completions interface

curl --location 'https://genaiapipre.cloudsway.net/v1/ai/xxx/chat/completions' \
--header "Authorization: Bearer $API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "gpt-5",
    "messages": [
        {
            "role":"user",
            "content":"Please use the code_exec tool to calculate the area of a circle with radius equal to the number of '\''r'\''s in strawberry"
        }
    ],
    "tools": [
        {
            "type": "custom",
            "custom": {
                "name": "code_exec",
                "description": "Executes arbitrary python code"
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA"
                        }
                    },
                    "required": [
                        "location"
                    ]
                }
            }
        }
    ],
    "tool_choice": {
        "type": "allowed_tools",
        "allowed_tools": {
            "mode": "auto",
            "tools": [
                {
                    "type": "function",
                    "function": {
                        "name": "get_weather"
                    }
                },
                {
                    "type": "custom",
                    "custom": {
                        "name": "code_exec"
                    }
                }
            ]
        }
    }
}'