Text-to-Speech
Introduction
A Text-to-Speech (TTS) model converts written text into spoken audio so that computers can read text aloud. It is useful in navigation systems, virtual assistants, audiobooks, accessibility tools, and customer service systems. Modern TTS models vary intonation, speed, and tone, which brings generated speech closer to natural human speech and improves the user experience.
Best Practices
Using MaaS-nar as an example
curl --request POST \
  --url 'https://genaiapijp.cloudsway.net/v1/ai/kXfKrPc/tts-n/text-to-speech/mp3?voice=Beatrice&voice-speed=fast&voice-volume=standard' \
  --header 'Accept: application/octet-stream' \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: text/plain' \
  --data hello
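The request above returns binary audio, which curl writes to standard output unless told otherwise. The following is a minimal sketch of wrapping the call so the audio is saved to a file. It assumes only the endpoint, headers, and query parameters shown above. The function names tts_nar_url and tts_nar_speak, the TTS_API_KEY environment variable, and the default file name speech.mp3 are hypothetical and not part of the API.

```shell
# Hypothetical helper: build the MaaS-nar request URL from voice options.
# The base URL and parameter names are copied from the example above;
# the defaults are simply the values that example uses.
tts_nar_url() {
  voice="${1:-Beatrice}"
  speed="${2:-fast}"
  volume="${3:-standard}"
  printf '%s?voice=%s&voice-speed=%s&voice-volume=%s\n' \
    'https://genaiapijp.cloudsway.net/v1/ai/kXfKrPc/tts-n/text-to-speech/mp3' \
    "$voice" "$speed" "$volume"
}

# Hypothetical helper: send the text in $1 and save the audio to $2.
# TTS_API_KEY is an assumed environment variable holding your API key.
tts_nar_speak() {
  text="$1"
  out="${2:-speech.mp3}"
  # --fail makes curl exit non-zero on an HTTP error instead of saving
  # the error body as if it were audio; --data-raw sends the text as-is.
  curl --fail --silent --show-error --request POST \
    --url "$(tts_nar_url)" \
    --header 'Accept: application/octet-stream' \
    --header "Authorization: Bearer ${TTS_API_KEY}" \
    --header 'Content-Type: text/plain' \
    --data-raw "$text" \
    --output "$out"
}
```

With TTS_API_KEY exported, calling tts_nar_speak hello hello.mp3 would send the same request as above and write the result to hello.mp3. The voice, speed, and volume values are passed through unchanged, so any value other than those in the example above should be checked against the service's own parameter list.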
Using MaaS-Ele as an example
curl --request POST \
  --url 'https://genaiapihk.cloudsway.net/v1/ai/ZMfNI/tts-e/text-to-speech/pqHfZKP75CvOlQylNhV4' \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "text": "你好",
    "voice_settings": {
      "stability": 0,
      "similarity_boost": 1.0,
      "use_speaker_boost": false
    }
  }'
The response body contains the generated speech as an MP3 file.
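For the MaaS-Ele request, the text has to be embedded in a JSON body, so quotes and backslashes in it need escaping before the call. The following is a minimal sketch under that assumption, reusing the endpoint and voice_settings values from the example above. The function names tts_ele_body and tts_ele_speak, the TTS_API_KEY environment variable, and the default file name speech.mp3 are hypothetical.

```shell
# Hypothetical helper: build the JSON request body for the MaaS-Ele example.
# Escapes backslashes and double quotes only; text containing newlines or
# other control characters would need a real JSON encoder.
tts_ele_body() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"text": "%s", "voice_settings": {"stability": 0, "similarity_boost": 1.0, "use_speaker_boost": false}}\n' "$escaped"
}

# Hypothetical helper: send the text in $1 and save the audio to $2.
# TTS_API_KEY is an assumed environment variable holding your API key.
tts_ele_speak() {
  out="${2:-speech.mp3}"
  # --fail makes curl exit non-zero on an HTTP error instead of saving
  # the error body as if it were audio.
  curl --fail --silent --show-error --request POST \
    --url 'https://genaiapihk.cloudsway.net/v1/ai/ZMfNI/tts-e/text-to-speech/pqHfZKP75CvOlQylNhV4' \
    --header "Authorization: Bearer ${TTS_API_KEY}" \
    --header 'Content-Type: application/json' \
    --data-raw "$(tts_ele_body "$1")" \
    --output "$out"
}
```

With TTS_API_KEY exported, calling tts_ele_speak '你好' hello.mp3 would send the same body as the example above and write the audio to hello.mp3. Keeping the body builder separate from the curl call makes it possible to inspect the JSON by running tts_ele_body on its own before sending anything.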