Pricing for SpeechKit
To calculate the cost of using the service, use the calculator or see the prices on this page.
What goes into the cost of using SpeechKit
Using speech synthesis
The cost of using SpeechKit for speech synthesis depends on the version of the API used. For API v1, the cost is calculated based on the total number of characters sent to generate speech from text in a calendar month (Reporting period).
The character count of a request includes spaces and special characters. An empty request costs the same as one character.
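The API v1 character-based calculation above can be sketched as follows. This is a minimal illustration, not the service's billing code: the function names are hypothetical, the rate is taken from the price table on this page, and counting characters with Python's `len` (Unicode code points) is an assumption about how the service counts them.

```python
from decimal import Decimal

# USD per 1 million characters, copied from the price table on this page.
RATE_PER_MILLION_CHARS = Decimal("10.56")


def billable_characters(texts):
    """Sum the billable characters over all requests in the reporting period.

    Spaces and special characters count; an empty request counts as one character.
    Assumption: one Python code point equals one billable character.
    """
    return sum(max(len(text), 1) for text in texts)


def synthesis_v1_cost(texts):
    """Estimated API v1 synthesis cost in USD for a list of request texts."""
    return RATE_PER_MILLION_CHARS * billable_characters(texts) / Decimal(1_000_000)


requests_sent = ["Hello, world!", ""]
print(billable_characters(requests_sent))  # 13 characters + 1 for the empty request = 14
print(synthesis_v1_cost(requests_sent))
```

`Decimal` is used instead of `float` so that small per-character amounts do not pick up binary rounding noise.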
The cost of using API v3 depends on the number of synthesis requests sent. Each speech synthesis request is limited to 250 characters and 24 seconds.
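For API v3, a rough estimate multiplies the request count by the per-request rate. The sketch below assumes the rate from the price table on this page and uses hypothetical helper names. Deriving a minimum request count from the 250-character limit is also an assumption: it ignores the 24-second limit, which cannot be checked from the text alone, so the real count may be higher.

```python
import math
from decimal import Decimal

# USD per synthesis request, copied from the price table on this page.
RATE_PER_REQUEST = Decimal("0.001280")
MAX_CHARS_PER_REQUEST = 250


def min_requests_for_text(text):
    """Lower bound on the requests needed for a text, by character limit only.

    Assumption: the text can be split at arbitrary positions and even an
    empty text takes one request.
    """
    return max(1, math.ceil(len(text) / MAX_CHARS_PER_REQUEST))


def synthesis_v3_cost(request_count):
    """Estimated API v3 synthesis cost in USD for a number of requests."""
    return RATE_PER_REQUEST * request_count


print(min_requests_for_text("a" * 251))  # 2
print(synthesis_v3_cost(1000))
```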
Using speech recognition
The cost of using SpeechKit for speech recognition depends on the recognition type and duration of a recognized audio fragment. The cost is calculated for a calendar month (Reporting period).
Streaming speech recognition
The cost of using SpeechKit streaming recognition is calculated based on the pricing rules for synchronous recognition.
Billable unit — a 15-second segment of single-channel audio. Shorter segments are rounded up (1 second becomes 15 seconds).
One 37-second audio fragment is billed as 45 seconds.
Explanation: the audio is divided into two 15-second segments and one 7-second segment. The last segment is rounded up to 15 seconds. Total: 3 segments of 15 seconds each.
Two audio fragments of 5 and 8 seconds are billed as 30 seconds.
Explanation: the length of each fragment is rounded up to 15 seconds. Total: 2 segments of 15 seconds each.
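The 15-second rounding in the examples above can be reproduced with a short calculation. This is a sketch for single-channel audio with positive durations. The function names are hypothetical, and applying the synchronous recognition rate from the price table on this page to each 15-second segment is an assumption.

```python
import math
from decimal import Decimal

SEGMENT_SECONDS = 15
# USD per billable unit for synchronous recognition, from the price table on this page.
RATE_PER_SEGMENT = Decimal("0.001280")


def billable_segments(duration_seconds):
    """Number of 15-second segments billed for one audio fragment."""
    return math.ceil(duration_seconds / SEGMENT_SECONDS)


def billed_seconds(durations):
    """Total billed seconds; each fragment is rounded up on its own."""
    return sum(billable_segments(d) for d in durations) * SEGMENT_SECONDS


def sync_recognition_cost(durations):
    """Estimated cost in USD for a list of fragment durations in seconds."""
    return RATE_PER_SEGMENT * sum(billable_segments(d) for d in durations)


print(billed_seconds([37]))    # 45
print(billed_seconds([5, 8]))  # 30
```

Rounding is applied per fragment rather than to the summed duration, which is why 5 and 8 seconds give 30 billed seconds, not 15.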
The following rules apply to asynchronous recognition.
Billable unit — 1 second of two-channel audio. Shorter segments are rounded up. The number of channels is rounded up to an even number.
The minimum billable amount is 15 seconds for every pair of channels. Audio that is shorter is billed as 15 seconds.
Examples of rounding audio length:
|Length||Number of channels||Seconds charged|
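The asynchronous rounding rules can be sketched in the same way. The function name is hypothetical. The sketch assumes that the seconds charged are the rounded length, raised to the 15-second minimum, multiplied by the number of channel pairs, which is one reading of the rules above.

```python
import math

MIN_SECONDS_PER_PAIR = 15


def async_seconds_charged(duration_seconds, channels):
    """Seconds charged for one audio file under the asynchronous rules.

    The length is rounded up to a whole second and raised to at least
    15 seconds; the channel count is rounded up to an even number, and
    each pair of channels is charged separately.
    """
    seconds = max(math.ceil(duration_seconds), MIN_SECONDS_PER_PAIR)
    channel_pairs = math.ceil(channels / 2)
    return seconds * channel_pairs


print(async_seconds_charged(1, 1))     # 15
print(async_seconds_charged(15.5, 2))  # 16
print(async_seconds_charged(1, 3))     # 30
```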
|Service||Rate for billable unit,
|Speech synthesis using API v1, per 1 million characters||$10.560000|
|Speech synthesis using API v3, per request||$0.001280|
|Service||Rate for request,
|SpeechKit Brand Voice API request||$0.001280|
Hosting Brand Voice models
|Service||Rate per month,
|SpeechKit Brand Voice Adaptive||$1923.076950|
|SpeechKit Brand Voice Full||By request|
|Service||Rate for the billable unit,
|Synchronous file recognition||$0.001280|
|Asynchronous file recognition||$0.000128|
|Asynchronous file recognition, deferred mode model||$0.000032|