MaaS-nar shortTextToSpeech
Request Method:
POST
Request Path:
/tts-n/text-to-speech/{responseFormat}
Path Variables
| Parameter | Description | Example |
|---|---|---|
| responseFormat | Audio format (mp3/m4a) | mp3 |
Query Parameters
| Parameter | Description | Example |
|---|---|---|
| voice | Voice to use (see Possible values) | Yifei |
| voice-speed | Speech speed. Possible values: fast, normal, slow, or a number between 0.3 and 2 | normal |
| voice-volume | Volume. Possible values: x-loud, loud, standard, soft, x-soft, normalized | loud |
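The accepted values for `voice-speed` and `voice-volume` can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the API itself: the helper name `build_query` is hypothetical, and it assumes that both ends of the 0.3-2 range are allowed, which the table does not state.

```python
from urllib.parse import urlencode

SPEED_KEYWORDS = {"fast", "normal", "slow"}
VOLUME_KEYWORDS = {"x-loud", "loud", "standard", "soft", "x-soft", "normalized"}


def build_query(voice, speed, volume):
    """Return a query string carrying voice, voice-speed and voice-volume."""
    speed = str(speed)
    if speed not in SPEED_KEYWORDS:
        try:
            numeric = float(speed)
        except ValueError:
            raise ValueError(f"unsupported voice-speed: {speed!r}") from None
        # Assumption: the documented range 0.3-2 is inclusive at both ends.
        if not 0.3 <= numeric <= 2:
            raise ValueError(f"voice-speed out of range 0.3-2: {speed}")
    if volume not in VOLUME_KEYWORDS:
        raise ValueError(f"unsupported voice-volume: {volume!r}")
    return urlencode({"voice": voice, "voice-speed": speed, "voice-volume": volume})
```

The returned string can be appended to the request path after a `?`.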
Request Headers
| Parameter | Description | Example |
|---|---|---|
| Accept | Fixed value: application/octet-stream | application/octet-stream |
| Authorization | Access key, sent as Bearer ${AccessKey} | Bearer RWXXXX0Gd |
| Content-Type | Possible values: text/plain, application/x-www-form-urlencoded, text/vtt, application/x-subrip, text/srt | text/plain |
The request body format must match the Content-Type header.
Request body
Please ensure that the Content-Type and the body format are properly aligned.
Sending a UTF-8 string in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-type: text/plain' \
--data 'hello'
```
Sending a URL-encoded string in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data '%E4%BD%A0%E5%A5%BD%E5%95%8A'
```
Sending a UTF-8 text file in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-type: text/plain' \
--data '@test.txt'
```
Sending a VTT file in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: text/vtt' \
--data '@sing-song_2024-07-29_103928.vtt'
```
Sending an SRT file in the body (Method 1):
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: application/x-subrip' \
--data '@sing-song_2024-07-29_103928.srt'
```
Sending an SRT file in the body (Method 2):
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: text/srt' \
--data '@sing-song_2024-07-29_103928.srt'
```
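Because the body format and the Content-Type must agree, a client that uploads files may want to derive the header from the file extension. The sketch below shows one way to do that in Python. The mapping from extensions to header values is a client-side convention assumed here, not something the API defines, and `content_type_for` is a hypothetical helper name.

```python
from pathlib import Path

# Header values come from the examples above; the extension mapping is an assumption.
CONTENT_TYPES = {
    ".txt": "text/plain",
    ".vtt": "text/vtt",
    ".srt": "application/x-subrip",  # "text/srt" is the documented alternative
}


def content_type_for(path):
    """Pick the Content-Type header value that matches a file's extension."""
    suffix = Path(path).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"no known Content-Type for {suffix!r} files") from None
```

Raising on unknown extensions avoids silently sending a body under a header that does not describe it.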
Response headers
| Parameter | Description | Example |
|---|---|---|
| x-duration-seconds | Duration of the audio in seconds | 3 |
Response Body
File stream
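For callers who prefer Python to curl, the request and the streamed response described in this section could be handled roughly as follows. This is a sketch using only the standard library. The function names are hypothetical, and the value of `x-duration-seconds` is returned as the raw header string because its exact numeric format is not specified here.

```python
import urllib.request


def build_short_tts_request(endpoint, access_key, text, response_format="mp3"):
    """Build (but do not send) a POST request carrying plain UTF-8 text."""
    url = f"{endpoint}/tts-n/text-to-speech/{response_format}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/octet-stream",
            "Authorization": f"Bearer {access_key}",
            "Content-Type": "text/plain",
        },
    )


def save_audio(response, out_path):
    """Write the streamed audio to out_path; return the x-duration-seconds header value."""
    duration = response.headers.get("x-duration-seconds")
    with open(out_path, "wb") as f:
        # Copy in chunks so large responses are not held in memory at once.
        while chunk := response.read(64 * 1024):
            f.write(chunk)
    return duration
```

To send the request, pass it to `urllib.request.urlopen` and hand the resulting response object to `save_audio`, supplying your own endpoint and access key.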
MaaS-nar longTextToSpeech
API Flow
- Call the longTextToSpeech API to obtain the statusUrl.
- Poll the statusUrl (recommended interval of 5-10s) to get the task result.
- If the task has completed successfully, download the audio file using the URL provided in the result field.
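The polling step of this flow can be sketched in Python as below. It assumes the statusUrl returns a JSON object whose `finished` field becomes true when the task ends. The helper names, the `timeout` safeguard, and the injectable `fetch` and `sleep` parameters are illustrative additions, not part of the API.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_result(status_url, interval=5.0, timeout=600.0,
                    fetch=fetch_json, sleep=time.sleep):
    """Poll status_url until `finished` is true, then return that status payload."""
    waited = 0.0
    while True:
        status = fetch(status_url)
        if status.get("finished"):
            return status
        if waited >= timeout:
            raise TimeoutError(f"task still running after {waited:.0f}s")
        # The recommended polling interval is 5-10 seconds.
        sleep(interval)
        waited += interval
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access or real delays.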
Request Method:
POST
Request Path:
/tts-n/text-to-speech/{responseFormat}
Path Variables
| Parameter | Description | Example |
|---|---|---|
| responseFormat | Audio format: mp3/m4a/wav | wav |
Query Parameters
| Parameter | Description | Example |
|---|---|---|
| voice | Voice option (see Available Options) | Yifei |
| voice-speed | Speech speed. Available options: fast, normal, slow, or a number between 0.3 and 2 | normal |
| voice-volume | Volume level. Available options: x-loud, loud, standard, soft, x-soft, or normalized | loud |
Request Headers
| Parameter | Description | Example |
|---|---|---|
| Authorization | Access key, sent as Bearer ${AccessKey} | Bearer RWYhqXXXux0Gd |
| Content-Type | Available options: text/plain, application/x-www-form-urlencoded, text/vtt, application/x-subrip, text/srt | text/plain |
Do not set the Accept header.
The request body format must match the Content-Type specified in the header.
Request Body
The examples below pair each Content-Type with a matching body format.
Sending a UTF-8 string in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-type: text/plain' \
--data 'hello'
```
Sending a URL-encoded string in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data '%E4%BD%A0%E5%A5%BD%E5%95%8A'
```
Sending a UTF-8 text file in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-type: text/plain' \
--data '@test.txt'
```
Sending a VTT file in the body:
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: text/vtt' \
--data '@sing-song_2024-07-29_103928.vtt'
```
Sending an SRT file in the body (Method 1):
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: application/x-subrip' \
--data '@sing-song_2024-07-29_103928.srt'
```
Sending an SRT file in the body (Method 2):
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: text/srt' \
--data '@sing-song_2024-07-29_103928.srt'
```
Response
| Parameter | Description | Example |
|---|---|---|
| statusUrl | URL to obtain the task execution result. It can be accessed directly through a GET request without any authorization information. Recommended polling interval is 5-10s. | |
| taskId | Task ID | 1 |
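A Python counterpart to the curl examples for this endpoint might look like the sketch below. It builds the request without an Accept header and reads `statusUrl` and `taskId` from the response. The function names are hypothetical, and the response body is assumed to be a JSON object, which this page does not state explicitly.

```python
import json
import urllib.request


def build_long_tts_request(endpoint, access_key, body, content_type,
                           response_format="mp3"):
    """Build (but do not send) a longTextToSpeech POST; no Accept header is set."""
    return urllib.request.Request(
        f"{endpoint}/tts-n/text-to-speech/{response_format}",
        data=body,  # bytes whose format matches content_type
        method="POST",
        headers={
            "Authorization": f"Bearer {access_key}",
            "Content-Type": content_type,
        },
    )


def parse_submit_response(raw):
    """Extract (statusUrl, taskId) from the response body, assumed to be JSON."""
    payload = json.loads(raw)
    return payload["statusUrl"], payload["taskId"]
```

The returned `statusUrl` is what the polling step then requests with a plain GET.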
Getting the task result through statusUrl
| Parameter | Description | Type |
|---|---|---|
| finished | Indicates whether the task has finished. When true, stop polling. | Boolean |
| percent | Progress of audio generation, ranging from 0 to 100. | Integer |
| succeeded | Indicates whether the audio was generated successfully. | Boolean |
| result | URL to download the audio if it has been successfully generated. This URL is valid for 10 minutes. | String |
| message | Reason for audio generation failure. | String |
| durationInSeconds | Duration of the audio in seconds, rounded to the nearest whole second. | Integer |
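The fields in this table combine into three outcomes: still running, failed, or succeeded with a download URL. A small Python sketch of that decision follows, assuming the status payload has already been decoded into a dictionary. The function name `audio_url_from_status` is hypothetical.

```python
def audio_url_from_status(status):
    """Return the download URL for a finished task, or None while it is still running."""
    if not status.get("finished"):
        return None
    if not status.get("succeeded"):
        # `message` carries the reason for the failure.
        raise RuntimeError(status.get("message") or "audio generation failed")
    return status["result"]
```

Since the `result` URL is valid for only 10 minutes, the audio should be downloaded soon after this function returns it.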