Yandex SpeechKit

Yandex SpeechKit provides speech recognition and speech synthesis (text-to-speech) technologies. The service infrastructure is designed for high loads, so the system remains available and fault-free even when the number of concurrent requests is high. SpeechKit is the technology behind Alice, the Yandex voice assistant.
The Yandex.Cloud infrastructure is protected in accordance with Russian Federal Law No. 152-FZ "On Personal Data".
  • Support for three languages
    The service handles audio and text in three languages: Russian, English, and Turkish. You can easily add support for any of these languages at any time, which means you won’t need third-party services to reach a new audience.
  • Natural-sounding speech
    Speech assembled from whole words recorded by an actor sounds far from natural. Yandex SpeechKit instead composes speech from more than a million individual phonemes, with intonation set by a neural network trained on numerous real-life examples. The result is synthesized speech that is pleasing to the ear.
  • Real-time synthesis
    As soon as your service or app sends a text for speech synthesis, it immediately receives an audio recording in response. The delay is so small that you can create software with voice streaming support.
  • Transparent pricing
    Audio that you send for recognition is billed by the length of the recording. Text that you send for synthesis is billed by character count. This lets you forecast your spending accurately.
  • Easy-to-use API
    The service offers an HTTP API for data exchange. This means that you can implement new features on a tight deadline, without needing to deploy and support your own infrastructure.
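
The pricing model described above (audio billed by recording length, text billed by character count) lends itself to a simple spending forecast. The sketch below is illustrative only: the two rates are made-up placeholders, not actual SpeechKit prices, and the function name `estimate_cost` is hypothetical. It also assumes no rounding or minimum billing units, which the real price list may apply.

```python
from decimal import Decimal

# Placeholder rates for illustration only (NOT real SpeechKit prices).
# Replace them with the values from the current price list.
RATE_PER_AUDIO_SECOND = Decimal("0.01")    # assumed cost per second of audio
RATE_PER_CHARACTER = Decimal("0.0002")     # assumed cost per character of text


def estimate_cost(audio_seconds: int, text_characters: int) -> Decimal:
    """Estimate spending for a given volume of audio and text.

    Assumes plain linear pricing: no rounding of recording length,
    no minimum charge, and no volume discounts.
    """
    if audio_seconds < 0 or text_characters < 0:
        raise ValueError("volumes must be non-negative")
    recognition_cost = RATE_PER_AUDIO_SECOND * audio_seconds
    synthesis_cost = RATE_PER_CHARACTER * text_characters
    return recognition_cost + synthesis_cost


if __name__ == "__main__":
    # One hour of call recordings plus 100,000 characters of text.
    print(estimate_cost(audio_seconds=3600, text_characters=100_000))
```

Decimal arithmetic is used instead of floats so that small per-unit rates add up without rounding drift.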

Use cases

  • When you receive many customer calls that all provide the same kind of information, you can automate saving that data to a database. Yandex SpeechKit can recognize the caller's last name, the preferred date and time of an appointment, and other details. This lets your call center staff focus on more complex issues.

  • Add voice control to your app: many of your users will find this much faster and easier. Yandex SpeechKit can decode voice commands so that the app can respond to them.

  • Let’s say you need to automatically inform a large customer audience of the same thing, but you would also like to add a personal touch to each message. For example, you want to address each person by name, state their unique customer number, or personalize the message in other ways. Using speech synthesis, you can mass-dial your customers without involving your call center.

  • Add a voice interface to your service so that the users of your products or services can skip on-screen text. This makes it easier for people with limited vision to become your customers.

  • With speech synthesis you can share your knowledge more easily. Prepare a text version of your intended video or webinar, and let Yandex SpeechKit do the rest of the job for you. No need to spend time on voiceover: delegate it to an automated system.
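
For the personalized mass-notification use case above, the text of each message has to be assembled per customer before it is sent for synthesis. The following is a minimal sketch of that preparation step only. The template wording, the field names `name` and `customer_number`, and the function `build_messages` are all hypothetical, and the code does not call SpeechKit itself.

```python
from string import Template

# Hypothetical message template; the wording and placeholders are assumptions.
MESSAGE = Template("Hello, $name. Your customer number is $customer_number. $body")


def build_messages(customers: list[dict[str, str]], body: str) -> list[str]:
    """Return one personalized message text per customer.

    Each customer record is assumed to have 'name' and 'customer_number' keys.
    The resulting strings are what would be passed on to speech synthesis.
    """
    return [
        MESSAGE.substitute(
            name=customer["name"],
            customer_number=customer["customer_number"],
            body=body,
        )
        for customer in customers
    ]


if __name__ == "__main__":
    customers = [
        {"name": "Anna", "customer_number": "1042"},
        {"name": "Mehmet", "customer_number": "2077"},
    ]
    for text in build_messages(customers, "Your appointment is confirmed."):
        # Text is billed by character count, so the length is worth tracking.
        print(len(text), text)
```

Keeping the shared announcement in `body` and the personal details in the template means a single campaign text can be reused across the whole audience.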


