MaaS_VEO Video Interface Documentation

Request Protocol

HTTP

Request Header

Parameter Name | Value
Authorization | Bearer {Your AK}
Content-Type | application/json

Generate Video

Request URL

https://genaiapi.cloudsway.net/v1/ai/{Your Endpoint}/veo/videos/generate

Request Body

Objects in instances

Attribute Name | Type | Required/Optional | Description
prompt | string | Required for text-to-video. Optional if an input image is provided (image-to-video). | A text string used to guide the first eight seconds of the video. For example:
  • Quick tracking shot: depicts a bustling dystopian world under bright neon lights, flying cars and fog, night, halos, volumetric lighting
  • Neon holograms of cars traveling at the speed of light, cinematic shots, incredible detail, volumetric lighting
  • Many spotted jellyfish pulsating underwater. Their bodies are transparent and glow in the deep sea
  • Extreme close-up of shallow depth-of-field puddles on the street. Reflecting a futuristic Tokyo city, bright neon lights, night, with halos
  • Time-lapse photography of dancing auroras, twinkling stars, snow-covered landscapes in the Arctic sky
  • A lone cowboy riding a horse across an open plain under a beautiful sunset, soft light, warm colors
image | Media | Optional | An image used to guide video generation. It can be supplied as a bytesBase64Encoded string of the encoded image or as a gcsUri string URI pointing to a Cloud Storage bucket location.
lastFrame | Media | Optional | An image for the last frame of the video; the model fills in the space between the first frame (image) and this frame. lastFrame can be supplied as a bytesBase64Encoded string of the encoded image or as a gcsUri string URI pointing to a Cloud Storage bucket location. lastFrame is supported by the following Preview models:
  • MaaS_Veo_3.1_generate_preview
  • MaaS_Veo_3.1_fast_generate_preview
referenceImages | list[referenceImages] | Optional | A list of up to three asset images or at most one style image that the model should use as references when generating the video. Important note: Veo 3.1 models do not support referenceImages.style. The following model supports referenceImages in Preview:
  • MaaS_Veo_3.1_generate_preview
referenceImages.image | Media | Optional | Each image can be supplied as a bytesBase64Encoded string or as a gcsUri string URI pointing to a Cloud Storage bucket location.
referenceImages.referenceType | string | Required in the referenceImages object | Specifies the type of the provided reference image. Supported values:
  • "asset": The reference image provides material for the generated video, such as scenes, objects, or characters.
  • "style": The reference image provides style information for the generated video, such as scene colors, lighting, or textures. Important note: Veo 3.1 models do not support referenceImages.style.

Media Object

Attribute Name | Type | Description
bytesBase64Encoded | string | A Base64-encoded string of the bytes of an image or video file.
gcsUri | string | A string URI pointing to a Cloud Storage bucket location.
mimeType | string | Specifies the MIME type of the image or video. Required on the following objects: image, video, mask, lastFrame, referenceImages.image. Accepted image MIME types: image/jpeg, image/png, image/webp. Accepted video MIME types: video/mov, video/mpeg, video/mp4, video/mpg, video/avi, video/wmv, video/mpegps, video/flv.
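
As an illustration of the Media object, the sketch below builds one from raw image bytes or from a Cloud Storage URI. The helper names make_image_media and make_gcs_media are hypothetical, and the check that gcsUri starts with gs:// is an assumption; the document states that prefix only for storageUri.

```python
import base64

# Image MIME types listed as accepted in the Media Object table.
IMAGE_MIME_TYPES = {"image/jpeg", "image/png", "image/webp"}


def make_image_media(data: bytes, mime_type: str) -> dict:
    """Build a Media object carrying inline image bytes (hypothetical helper)."""
    if mime_type not in IMAGE_MIME_TYPES:
        raise ValueError(f"unsupported image MIME type: {mime_type}")
    return {
        "bytesBase64Encoded": base64.b64encode(data).decode("ascii"),
        "mimeType": mime_type,
    }


def make_gcs_media(gcs_uri: str, mime_type: str) -> dict:
    """Build a Media object that points at a Cloud Storage location instead."""
    # Assumption: gcsUri uses the same gs:// scheme shown for storageUri.
    if not gcs_uri.startswith("gs://"):
        raise ValueError("gcsUri is expected to start with gs://")
    return {"gcsUri": gcs_uri, "mimeType": mime_type}
```

A Media object carries either bytesBase64Encoded or gcsUri, never both, which is why the two cases are kept in separate helpers.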

parameters Object

Attribute Name | Type | Required/Optional | Description
aspectRatio | string | Optional | Specifies the aspect ratio of the generated video. Acceptable values are 16:9 (default) or 9:16.
compressionQuality | string | Optional | Specifies the compression quality of the generated video. Acceptable values are "optimized" (default) or "lossless".
durationSeconds | integer | Required | The duration, in seconds, of the video to be generated:
  • Veo 2 models: 5-8. The default value is 8.
  • Veo 3 models: 4, 6, or 8. The default is 8.
  • When using referenceImages: 8.
enhancePrompt | boolean | Optional | Use Gemini to optimize the prompt. Acceptable values are true (default) or false.
generateAudio | boolean | Required for MaaS_Veo_3_generate_preview only | Whether to generate audio for the video. Acceptable values are true or false. When the vendor's native interface is called directly and this field is omitted for MaaS_Veo_3_generate_preview, audio is generated by default.
negativePrompt | string | Optional | A text string describing what you want to prevent the model from generating. For example:
  • Top lighting, bright colors
  • People, animals
  • Multiple cars, wind
personGeneration | string | Optional | Safety setting that controls whether generating people or faces is allowed:
  • allow_adult (default): Only allow generation of adults
  • dont_allow: Prohibit people and faces in the generated video
resizeMode | string | Optional | Veo 3 models only, for image-to-video requests that supply image. The mode the model uses to resize the video. Accepted values are "pad" (default) or "crop".
resolution | string | Optional | Veo 3 models only. The resolution of the generated video. Acceptable values are 720p (default) or 1080p.
sampleCount | integer | Optional | The number of output videos requested. Accepted values are 1-4.
seed | uint32 | Optional | A seed value used to make video generation deterministic. The range is 0-4,294,967,295.
storageUri | string | Optional | The Cloud Storage bucket URI for storing the output video, in the format gs://BUCKET_NAME/SUBDIRECTORY. If not provided, the video bytes are returned encoded in Base64.
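
The constraints in the parameters table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service: check_parameters and its veo3 and uses_reference_images flags are hypothetical names, and the function covers only the value ranges stated in the table.

```python
def check_parameters(params: dict, veo3: bool = True,
                     uses_reference_images: bool = False) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # durationSeconds is the only parameter marked Required for every model.
    duration = params.get("durationSeconds")
    if duration is None:
        problems.append("durationSeconds is required")
    elif uses_reference_images:
        if duration != 8:
            problems.append("durationSeconds must be 8 with referenceImages")
    elif veo3:
        if duration not in (4, 6, 8):
            problems.append("durationSeconds must be 4, 6, or 8 for Veo 3")
    elif not 5 <= duration <= 8:
        problems.append("durationSeconds must be 5-8 for Veo 2")

    # String options with the accepted values listed in the table.
    allowed = {
        "aspectRatio": {"16:9", "9:16"},
        "compressionQuality": {"optimized", "lossless"},
        "resizeMode": {"pad", "crop"},
        "resolution": {"720p", "1080p"},
    }
    for name, values in allowed.items():
        if name in params and params[name] not in values:
            problems.append(f"{name} must be one of {sorted(values)}")

    if "sampleCount" in params and not 1 <= params["sampleCount"] <= 4:
        problems.append("sampleCount must be in the range 1-4")
    if "seed" in params and not 0 <= params["seed"] <= 4294967295:
        problems.append("seed must be in the range 0-4294967295")
    return problems
```

The server remains the authority on validity; a check like this only catches the errors documented here before a request is spent on them.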

Request Examples

Text-to-Video

curl 'https://genaiapi.cloudsway.net/v1/ai/{Your Endpoint}/veo/videos/generate' \
-H 'Authorization: Bearer {Your AK}' \
-H 'Content-Type: application/json' \
-d '{
    "instances": [
        {
            "prompt": ""
        }
    ],
    "parameters": {
        "durationSeconds": 8,
        "generateAudio": true
    }
}'

Image-to-Video

curl 'https://genaiapi.cloudsway.net/v1/ai/{Your Endpoint}/veo/videos/generate' \
-H 'Authorization: Bearer {Your AK}' \
-H 'Content-Type: application/json' \
-d '{
    "instances": [
        {
            "prompt": "",
            "image": {
                "bytesBase64Encoded": "INPUT_IMAGE",
                "mimeType": "MIME_TYPE"
            }
        }
    ],
    "parameters": {
        "durationSeconds": 8,
        "generateAudio": true
    }
}'

Video Using Asset Images

curl 'https://genaiapi.cloudsway.net/v1/ai/{Your Endpoint}/veo/videos/generate' \
-H 'Authorization: Bearer {Your AK}' \
-H 'Content-Type: application/json' \
-d '{
    "instances": [
        {
            "prompt": "",
            "referenceImages": [
                {
                    "image": {
                        "bytesBase64Encoded": "",
                        "mimeType": "image/png"
                    },
                    "referenceType": "asset"
                }
            ]
        }
    ],
    "parameters": {
        "durationSeconds": 8,
        "generateAudio": false
    }
}'
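
The curl examples above can be mirrored in Python. This is a sketch using only the standard library; build_generate_request is a hypothetical helper, and the endpoint and access key passed to it stand in for {Your Endpoint} and {Your AK}. It sends JSON, as the curl examples do.

```python
import json
import urllib.request

BASE_URL = "https://genaiapi.cloudsway.net/v1/ai"


def build_generate_request(endpoint, access_key, prompt, parameters, image=None):
    """Assemble the POST request for the generate URL without sending it."""
    instance = {"prompt": prompt}
    if image is not None:
        # Image-to-video: attach a Media object alongside the prompt.
        instance["image"] = image
    body = {"instances": [instance], "parameters": parameters}
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}/veo/videos/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result to urllib.request.urlopen and decode the response body with json.load; the returned object should carry the operation name used by the task query.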

Complete Request Body Parameters

{
    "instances": [{
        "prompt": string,
        "image": {
            // Union field can be only one of the following:
            "bytesBase64Encoded": string,
            "gcsUri": string,
            // End of list of possible types for union field.
            "mimeType": string
        },
        "lastFrame": {
            // Union field can be only one of the following:
            "bytesBase64Encoded": string,
            "gcsUri": string,
            // End of list of possible types for union field.
            "mimeType": string
        },
        "video": {
            // Union field can be only one of the following:
            "bytesBase64Encoded": string,
            "gcsUri": string,
            // End of list of possible types for union field.
            "mimeType": string
        },
        "referenceImages": [
            // A list of up to three asset images or at most one style image for the
            // model to use when generating videos.
            //
            // referenceImages is supported by the following models in Preview:
            //
            // *   veo-2.0-generate-exp
            // *   veo-3.1-generate-preview
            {
                "image": {
                    // Union field can be only one of the following:
                    "bytesBase64Encoded": string,
                    "gcsUri": string,
                    // End of list of possible types for union field.
                    "mimeType": string
                },
                "referenceType": string
            }
        ]
    }],
    "parameters": {
        "aspectRatio": string,
        "compressionQuality": string,
        "durationSeconds": integer,
        "enhancePrompt": boolean,
        "generateAudio": boolean,
        "negativePrompt": string,
        "personGeneration": string,
        "resizeMode": string, // Veo 3 image-to-video only
        "resolution": string, // Veo 3 models only
        "sampleCount": integer,
        "seed": uint32,
        "storageUri": string
    }
}

Return Value Example

{
    "name": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
}

Query Task

Request URL

https://genaiapi.cloudsway.net/v1/ai/{Your Endpoint}/veo/videos/task

Input Parameter Example

{
  "operationName": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/MODEL_ID/operations/OPERATION_ID"
}

Return Value Example

Task Completed, Video Generation Successful

{
   "name": string,
   "done": boolean,
   "response":{
      "@type":"type.googleapis.com/cloud.ai.large_models.vision.GenerateVideoResponse",
      "raiMediaFilteredCount": integer,
      "videos":[
         {
           // Union field can be only one of the following:
           "gcsUri": string,
           "bytesBase64Encoded": string,
           // End of list of possible types for union field.
           "mimeType": string
         },
         {
           // Union field can be only one of the following:
           "gcsUri": string,
           "bytesBase64Encoded": string,
           // End of list of possible types for union field.
           "mimeType": string
         },
         {
           // Union field can be only one of the following:
           "gcsUri": string,
           "bytesBase64Encoded": string,
           // End of list of possible types for union field.
           "mimeType": string
         },
         {
           // Union field can be only one of the following:
           "gcsUri": string,
           "bytesBase64Encoded": string,
           // End of list of possible types for union field.
           "mimeType": string
         }
      ]
   }
}

Task Completed, Video Generation Error

{
    "name": string,
    "done": boolean,
    "error": {
        "code": integer,
        "message": string
    }
}
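
The two result shapes above can be told apart with a small classifier. The sketch below is illustrative only: classify_task and its status strings are hypothetical, and unwrapping a "data" object is an assumption based on one example in this document that nests the task that way.

```python
def classify_task(result: dict) -> str:
    """Map a task query result to 'pending', 'failed', 'succeeded' or 'no_videos'."""
    # Assumption: the task may arrive either bare or nested under "data".
    task = result.get("data", result)
    if not task.get("done"):
        return "pending"  # not finished yet; query again later
    if "error" in task:
        return "failed"  # finished, generation error reported
    videos = task.get("response", {}).get("videos", [])
    # A finished task without videos may mean every output was filtered.
    return "succeeded" if videos else "no_videos"
```

A caller would typically repeat the task query while the result is "pending", pausing between attempts, and stop on any other status.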

Error Codes

Task Creation Phase

{
    "error": {
        "code": "400",
        "message": "{\n  \"error\": {\n    \"code\": 400,\n    \"message\": \"Invalid compression quality type: notvalid\",\n    \"status\": \"INVALID_ARGUMENT\"\n  }\n}\n"
    }
}
HTTP Status Code | error.code | error.message
200 | 0 |
400 | 400 | Status: INVALID_ARGUMENT. Possible messages:
  • Invalid resize mode
  • Invalid compression quality type
  • No inputs provided
  • Invalid referenceType
  • image is empty
  • Invalid resolution
401 | 401 | You are not authorized to access this resource.
500 | | Internal Server Error
429 | 429 | The request frequency exceeds the client's set limit. Please contact customer service for adjustment.
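
In the 400 example above, error.message is itself a JSON document serialized as a string, while other rows carry plain text. A sketch of unwrapping it follows; unwrap_creation_error is a hypothetical helper, and falling back to the outer message is an assumption about how non-JSON messages should be treated.

```python
import json


def unwrap_creation_error(payload: dict) -> dict:
    """Flatten a task-creation error into code, message and status."""
    outer = payload["error"]
    detail = {"code": int(outer["code"]), "message": outer["message"], "status": None}
    try:
        # The 400 example nests a second {"error": {...}} inside the message.
        inner = json.loads(outer["message"])["error"]
        detail["message"] = inner["message"]
        detail["status"] = inner.get("status")
    except (ValueError, KeyError, TypeError, AttributeError):
        # Plain-text message (for example the 401 row): keep it unchanged.
        pass
    return detail
```

Applied to the 400 example, this yields the message "Invalid compression quality type: notvalid" with status "INVALID_ARGUMENT".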

Task Query Phase

The 'done' field indicates whether the task has finished: 'true' means finished, whether or not generation succeeded. To determine the outcome, check the 'error' field and the contents of the 'response' field.

{
    "data": {
        "name": "projects/my-project-xx-xxxxx/locations/us-central1/publishers/google/models/veo-3.1-generate-preview/operations/xxxxxxx",
        "done": true,
        "error": {
            "code": 3,
            "message": "Reference to video does not support this mix of reference images."
        }
    }
}
HTTP Status Code | error.code | error.message
200 | 0 |
200 | 3 | For example:
  • Unsupported output video duration 4 seconds, supported durations are [8] for feature reference_to_video.
  • Unsupported output video duration 34 seconds, supported durations are [8,4,6] for feature text_to_video.
  • Unsupported output video duration 14 seconds, supported durations are [8,4,6] for feature image_to_video.
  • Generated video is large, an output storage uri is required.
  • Reference to video does not support this mix of reference images.
  • Unsupported output storage uri
  • Invalid sample count 6. The sample count should be in the range of [1, 4].
200 | 7 |
401 | 401 | You are not authorized to access this resource.
500 | | Internal Server Error
429 | 429 | The request frequency exceeds the client's set limit. Please contact customer service for adjustment.

Even when a task is created successfully, the query may report that videos were filtered out because of sensitive content. An example response follows. The support codes at the end of each raiMediaFilteredReasons entry may not correspond to a specific filtering reason, and some models do not provide specific filtering reasons.

{
  "name": "projects/xxxx-veo/locations/us-central1/publishers/google/models/veo-3.0-generate-preview/operations/9a0be521-bf8b-449f-9c2d-f22f9d5088c5",
  "done": true,
  "response": {
    "raiMediaFilteredCount": 1,
    "raiMediaFilteredReasons": [
      "Your current safety settings for people/face generation filtered out 1 videos. You will not be charged for blocked videos. Try rephrasing the prompt. If you think this was an error, send feedback. Support codes: 39322892, 63236870"
    ]
  }
}

The relationship between codes and specific reasons is as follows:

Support Code | Safety Category | Description
58061214, 17301594 | Children | Requests to generate content depicting children are rejected if personGeneration is not set to "allow_all" or the project is not on the allowlist for this feature.
29310472, 15236754 | Celebrities | Rejects requests to generate realistic depictions of well-known individuals, or requests where the project is not on the allowlist for this feature.
64151117, 42237218 | Video Safety Violation | Detects content that violates safety policies.
62263041 | Dangerous Content | Detects content that is potentially dangerous in nature.
57734940, 22137204 | Hateful | Detects hate-related themes or content.
74803281, 29578790, 42876398 | Other | Detects various other safety issues in the request.
92201652 | Personal Information | Detects personally identifiable information (PII) in text, such as mentions of credit card numbers, addresses, or other such information.
89371032, 49114662, 72817394 | Prohibited Content | Detects prohibited content in the request.
90789179, 63429089, 43188360 | Explicit Content | Detects sexually explicit content.
78610348 | Harmful Content | Detects harmful themes or content in text.
61493863, 56562880 | Violence | Detects violence-related content in videos or text.
32635315 | Vulgar | Detects vulgar themes or content in text.
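
The support codes can be pulled out of a raiMediaFilteredReasons entry and looked up in the table above. The sketch below assumes the codes always follow the text "Support codes:" as in the example response; support_codes, categorize and the "Unlisted" label are hypothetical.

```python
import re

# Support code -> safety category, transcribed from the table above.
SUPPORT_CODE_CATEGORIES = {
    "58061214": "Children", "17301594": "Children",
    "29310472": "Celebrities", "15236754": "Celebrities",
    "64151117": "Video Safety Violation", "42237218": "Video Safety Violation",
    "62263041": "Dangerous Content",
    "57734940": "Hateful", "22137204": "Hateful",
    "74803281": "Other", "29578790": "Other", "42876398": "Other",
    "92201652": "Personal Information",
    "89371032": "Prohibited Content", "49114662": "Prohibited Content",
    "72817394": "Prohibited Content",
    "90789179": "Explicit Content", "63429089": "Explicit Content",
    "43188360": "Explicit Content",
    "78610348": "Harmful Content",
    "61493863": "Violence", "56562880": "Violence",
    "32635315": "Vulgar",
}


def support_codes(reason: str) -> list:
    """Extract the numeric support codes from one filtered-reason string."""
    match = re.search(r"Support codes?:\s*([\d,\s]+)", reason)
    return re.findall(r"\d+", match.group(1)) if match else []


def categorize(reason: str) -> dict:
    """Map each support code in the reason to its category, if listed."""
    return {code: SUPPORT_CODE_CATEGORIES.get(code, "Unlisted")
            for code in support_codes(reason)}
```

Codes that are absent from the table, such as the two in the example response, come back as "Unlisted", consistent with the note that codes may not correspond to a specific filtering reason.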