Yandex SpeechKit

The service recognizes speech and synthesizes speech from any text in several languages.
The service infrastructure is designed with high loads in mind to ensure that the system is available and fault-free even if the number of concurrent requests is high.
SpeechKit is the power behind Alice, the Yandex voice assistant.
The Yandex.Cloud infrastructure is protected in accordance with Russian Federal Law No. 152-FZ on personal data.
  • Support for three languages
    The service handles audio and text in three languages: Russian, English, and Turkish. You can easily add support for any of these languages at any time, which means you won’t need third-party services to reach a new audience.
  • Natural-sounding speech
    If you try to synthesize speech from words recorded by an actor, the resulting sound is far from natural. Yandex SpeechKit composes speech from more than a million individual phonemes, with intonation set by a neural network trained on numerous real-life examples. The result is synthesized speech that is pleasing to the ear.
  • Real-time synthesis
    As soon as your service or app sends a text for speech synthesis, it immediately receives an audio recording in response. The delay is so small that you can create software with voice streaming support.
  • Transparent pricing
    When we receive audio recordings from you, we charge for the length of the recording. Text is charged based on character count. This way you can accurately forecast your spending.
  • Easy-to-use API
    The service offers an HTTP API for data exchange. This means that you can implement new features on a tight deadline, without needing to deploy and support your own infrastructure.
  • New: context analysis
    Premium voice synthesis analyzes the entire text before it starts, rather than processing each sentence separately. This produces a more fitting tone of voice, closer to how a real person speaks.
  • New: attention to detail
    The use of deep neural networks in premium voice synthesis lets us capture far more detail from the original voice. This produces a much clearer, more nuanced voice and avoids the distortions that may appear in standard voices.
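
The pricing model described above (audio billed by recording length, text billed by character count) makes spending straightforward to estimate. The following is a minimal sketch of such a forecast. The rates and the function name are hypothetical placeholders chosen for illustration, not actual SpeechKit prices; substitute the rates from the current price list.

```python
from decimal import Decimal

# Hypothetical rates for illustration only; these are NOT actual SpeechKit prices.
RECOGNITION_RATE_PER_SECOND = Decimal("0.001")
SYNTHESIS_RATE_PER_CHARACTER = Decimal("0.00002")


def estimate_cost(audio_seconds: int, text_characters: int) -> Decimal:
    """Estimate spend: recognition by audio length, synthesis by character count."""
    if audio_seconds < 0 or text_characters < 0:
        raise ValueError("usage values must be non-negative")
    recognition = RECOGNITION_RATE_PER_SECOND * audio_seconds
    synthesis = SYNTHESIS_RATE_PER_CHARACTER * text_characters
    return recognition + synthesis


if __name__ == "__main__":
    # One hour of recognized audio plus 100,000 synthesized characters.
    print(estimate_cost(audio_seconds=3600, text_characters=100_000))
```

Decimal is used instead of float so that small per-unit rates add up without rounding drift.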

Use cases

  • When many customers call to provide the same kind of information, you can automate saving that data to your database. Yandex SpeechKit can recognize the caller's last name, the preferred date and time of an appointment, and other details. This frees your call center staff to focus on more complex issues.

  • Add voice control to your app: many of your users will find this much faster and easier. Yandex SpeechKit can decode voice commands so that the app can respond to them.

  • Let’s say you need to automatically notify a large customer audience of the same thing, but you would also like to give each message a personal touch. For example, you want to address each person by name, state their unique customer number, or personalize the message in other ways. Using speech synthesis, you can call your customers in bulk without involving your call center.

  • Add a voice interface to your service so that the users of your products or services can skip on-screen text. This makes it easier for people with limited vision to become your customers.

  • With voice synthesis you can share your knowledge more easily. Prepare a text version of your intended video or webinar, and let Yandex SpeechKit do the rest of the job for you. No need to waste time on voiceover: delegate this to an automated system.
