MaaS-nar shortTextToSpeech
Request Method:
POST
Request Path:
/tts-n/text-to-speech/{responseFormat}
Path Variables
Parameter | Description | Example |
---|---|---|
responseFormat | Audio format (mp3/m4a) | mp3 |
Query Parameters
Parameter | Description | Example |
---|---|---|
voice | Voice (see possible values) | Yifei |
voice-speed | Speech speed. Possible values: fast, normal, slow, or a number between 0.3 and 2 | normal |
voice-volume | Volume. Possible values: x-loud, loud, standard, soft, x-soft, normalized | loud |
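The path variable and query parameters above combine into a single request URL. The following is a minimal Python sketch of that assembly, not part of the service's own tooling. The function name `build_tts_url`, the choice to omit parameters that are not supplied, and the treatment of the 0.3 to 2 range as inclusive are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Values taken from the tables above.
FORMATS = {"mp3", "m4a"}
NAMED_SPEEDS = {"fast", "normal", "slow"}
VOLUMES = {"x-loud", "loud", "standard", "soft", "x-soft", "normalized"}


def _valid_speed(speed):
    """Accept a named preset or a number in [0.3, 2] (inclusive bounds assumed)."""
    if str(speed) in NAMED_SPEEDS:
        return True
    try:
        return 0.3 <= float(speed) <= 2
    except ValueError:
        return False


def build_tts_url(endpoint, response_format="mp3", voice=None, speed=None, volume=None):
    """Return the request URL; parameters left as None are omitted (assumption)."""
    if response_format not in FORMATS:
        raise ValueError(f"unsupported responseFormat: {response_format!r}")
    params = {}
    if voice is not None:
        params["voice"] = voice
    if speed is not None:
        if not _valid_speed(speed):
            raise ValueError(f"invalid voice-speed: {speed!r}")
        params["voice-speed"] = str(speed)
    if volume is not None:
        if volume not in VOLUMES:
            raise ValueError(f"invalid voice-volume: {volume!r}")
        params["voice-volume"] = volume
    url = f"{endpoint.rstrip('/')}/tts-n/text-to-speech/{response_format}"
    return f"{url}?{urlencode(params)}" if params else url
```

Validating on the client side is optional; it only surfaces obvious mistakes before a request is sent.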
Request Headers
Parameter | Description | Example |
---|---|---|
Accept | Fixed value: application/octet-stream | application/octet-stream |
Authorization | Access key, in the form Bearer ${AccessKey} | Bearer RWYhq1NsLPAMmieux0Gd |
Content-Type | Possible values: text/plain, application/x-www-form-urlencoded, text/vtt, application/x-subrip, text/srt | text/plain |
The request body format and the Content-Type in the header need to correspond.
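One way to keep the body and the Content-Type in step when sending files is to derive the header from the file extension. The sketch below assumes a hypothetical helper named `content_type_for` and an extension-to-type mapping inferred from the file names used in this document; neither is defined by the service.

```python
from pathlib import Path

# Assumed mapping from file extension to one of the documented Content-Type values.
CONTENT_TYPES = {
    ".txt": "text/plain",
    ".vtt": "text/vtt",
    ".srt": "application/x-subrip",  # text/srt is documented as an alternative
}


def content_type_for(path):
    """Return the Content-Type for a text, VTT, or SRT file, based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"no Content-Type mapping for {suffix!r}") from None
```

For example, `content_type_for("test.txt")` yields `text/plain`, and an unknown extension raises `ValueError` rather than sending a mismatched header.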
Request Body
Please ensure that the Content-Type and the body format are properly aligned.
- Sending a UTF-8 string in the body:
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Accept: application/octet-stream' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/plain' --data 'hello'
- Sending a URL-encoded string in the body:
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Accept: application/octet-stream' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: application/x-www-form-urlencoded' --data '%E4%BD%A0%E5%A5%BD%E5%95%8A'
- Sending a UTF-8 text file in the body (--data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Accept: application/octet-stream' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/plain' --data-binary '@test.txt'
- Sending a VTT file in the body (--data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Accept: application/octet-stream' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/vtt' --data-binary '@sing-song_2024-07-29_103928.vtt'
- Sending an SRT file in the body (Method 1; --data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Accept: application/octet-stream' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: application/x-subrip' --data-binary '@sing-song_2024-07-29_103928.srt'
- Sending an SRT file in the body (Method 2; --data-binary is used because curl's --data strips newlines when reading from a file):
```
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' \
--header 'Accept: application/octet-stream' \
--header 'Authorization: Bearer ${AccessKey}' \
--header 'Content-Type: text/srt' \
--data-binary '@sing-song_2024-07-29_103928.srt'
```
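For callers who are not using curl, the plain-text request can be sketched in Python with only the standard library. The function names `build_short_tts_request` and `fetch_audio`, the 60-second timeout, and the host in the usage comment are hypothetical; the path and headers are the ones documented above.

```python
import urllib.request


def build_short_tts_request(endpoint, access_key, text, response_format="mp3"):
    """Build (but do not send) a POST request carrying `text` as a UTF-8 body."""
    url = f"{endpoint.rstrip('/')}/tts-n/text-to-speech/{response_format}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/octet-stream",
            "Authorization": f"Bearer {access_key}",
            "Content-Type": "text/plain",
        },
    )


def fetch_audio(request, timeout=60):
    """Send the request and return the raw audio bytes (timeout value is an assumption)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Usage (placeholder host and key):
# request = build_short_tts_request("https://tts.example.invalid", "my-access-key", "hello")
# with open("hello.mp3", "wb") as out:
#     out.write(fetch_audio(request))
```

Building the request separately from sending it makes the headers easy to inspect before any network call is made.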
Response headers
Parameter | Description | Example |
---|---|---|
x-duration-seconds | Duration of the audio in seconds | 3 |
Response Body
File stream
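Because the body is a file stream, it can be written to disk in chunks while the duration is read from the response header. This is an illustrative sketch: `save_audio` is a hypothetical helper, `response` is assumed to be the object returned by `urllib.request.urlopen`, and parsing the header as a float is a defensive choice (the documented example value is the whole number 3).

```python
import shutil


def save_audio(response, out_path):
    """Stream the response body to out_path; return the reported duration in seconds or None."""
    with open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)  # copies in chunks, not all at once
    raw = response.headers.get("x-duration-seconds")
    return float(raw) if raw is not None else None
```

Returning `None` when the header is absent keeps the helper usable even if a response omits it.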
MaaS-nar longTextToSpeech
API Flow
- Call the longTextToSpeech API to obtain the statusUrl.
- Poll the statusUrl (recommended interval of 5-10s) to get the task result.
- If the task has completed successfully, download the audio file using the URL provided in the result field.
Request Method:
POST
Request Path:
/tts-n/text-to-speech/{responseFormat}
Path Variables
Parameter | Description | Example |
---|---|---|
responseFormat | Audio format: mp3/m4a/wav | wav |
Query Parameters
Parameter | Description | Example |
---|---|---|
voice | Voice option (see available options) | Yifei |
voice-speed | Speech speed. Available options: fast, normal, slow, or a number between 0.3 and 2 | normal |
voice-volume | Volume level. Available options: x-loud, loud, standard, soft, x-soft, or normalized | loud |
Request Headers
Parameter | Description | Example |
---|---|---|
Authorization | AccessKey Bearer ${AccessKey} | Bearer RWYhq1NsLPAMmieux0Gd |
Content-Type | Available options: text/plain, application/x-www-form-urlencoded, text/vtt, application/x-subrip, text/srt | text/plain |
Do not set the Accept header.
Request body format must match the Content-Type specified in the header.
Request Body
Make sure the Content-Type header matches the format of the body.
- Sending a UTF-8 string in the body:
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/plain' --data 'hello'
- Sending a URL-encoded string in the body:
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: application/x-www-form-urlencoded' --data '%E4%BD%A0%E5%A5%BD%E5%95%8A'
- Sending a UTF-8 text file in the body (--data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/plain' --data-binary '@test.txt'
- Sending a VTT file in the body (--data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/vtt' --data-binary '@sing-song_2024-07-29_103928.vtt'
- Sending an SRT file in the body (Method 1; --data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: application/x-subrip' --data-binary '@sing-song_2024-07-29_103928.srt'
- Sending an SRT file in the body (Method 2; --data-binary is used because curl's --data strips newlines when reading from a file):
curl --location '${endpointPath}/tts-n/text-to-speech/mp3' --header 'Authorization: Bearer ${AccessKey}' --header 'Content-Type: text/srt' --data-binary '@sing-song_2024-07-29_103928.srt'
Response
Parameter | Description | Example |
---|---|---|
statusUrl | URL to obtain the task execution result. It can be accessed directly through a GET request without any authorization information. Recommended polling interval is 5-10s. | |
taskId | Task ID | 1 |
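Submitting a long-text task and reading the two response fields might look like the following Python sketch. It assumes the response body is a JSON object containing `statusUrl` and `taskId`, which the table implies but does not state outright. The function names are hypothetical. `urllib.request` does not send an Accept header unless one is set explicitly, which fits the requirement above.

```python
import json
import urllib.request


def build_long_tts_request(endpoint, access_key, text, response_format="wav"):
    """Build the submission request; no Accept header is set, as the API requires."""
    url = f"{endpoint.rstrip('/')}/tts-n/text-to-speech/{response_format}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_key}",
            "Content-Type": "text/plain",
        },
    )


def parse_submit_response(body):
    """Extract (statusUrl, taskId) from the response body (JSON format assumed)."""
    payload = json.loads(body)
    return payload["statusUrl"], payload["taskId"]


def submit_long_tts(request, timeout=60):
    """Send the request and return (statusUrl, taskId); timeout value is an assumption."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_submit_response(response.read())
```

The returned `statusUrl` needs no authorization, so it can be handed to a separate polling step as a plain URL.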
Getting the task result through statusUrl
Parameter | Description | Type |
---|---|---|
finished | Indicates whether the task has finished. When true, stop polling. | Boolean |
percent | Progress of audio generation, ranging from 0 to 100. | Integer |
succeeded | Indicates whether the audio was generated successfully. | Boolean |
result | URL to download the audio if it has been successfully generated. This URL is valid for 10 minutes. | String |
message | Reason for audio generation failure. | String |
durationInSeconds | Duration of the audio in seconds, rounded to the nearest whole second. | Integer |
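The polling step can be sketched in Python as a loop over the fields in this table. The sketch assumes the status endpoint returns a JSON object. The names `fetch_status` and `wait_for_audio`, the 5-second default interval (the low end of the recommended 5-10s), and the 600-second overall timeout are illustrative choices, not values defined by the service.

```python
import json
import time
import urllib.request


def fetch_status(status_url):
    """GET the status URL (no authorization needed) and decode the JSON body (assumed)."""
    with urllib.request.urlopen(status_url) as response:
        return json.load(response)


def wait_for_audio(status_url, fetch=fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until `finished` is true; return the download URL or raise on failure."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(status_url)
        if status.get("finished"):
            if status.get("succeeded"):
                return status["result"]  # download URL, valid for 10 minutes
            raise RuntimeError(status.get("message") or "audio generation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task not finished after {timeout} seconds")
        sleep(interval)
```

Since the `result` URL expires after 10 minutes, the download is best started as soon as the function returns. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a network connection.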