Yandex SpeechKit

Yandex SpeechKit provides speech recognition and speech synthesis (text-to-speech) technologies. The service infrastructure is designed for high loads, so the system stays available and fault-free even when the number of concurrent requests is high. SpeechKit is the technology behind Alice, the Yandex voice assistant.


  • Support for four languages

    The service handles audio and text in four languages: Russian, English, Ukrainian, and Turkish. You can enable any of these languages at any time, so you won’t need third-party services to reach a new audience.

  • Natural-sounding speech

    Speech assembled from whole words recorded by an actor sounds far from natural. Yandex SpeechKit instead composes speech from more than a million individual phonemes, with intonation set by a neural network trained on numerous real-life examples. The result is synthesized speech that is pleasing to the ear.

  • Real-time synthesis

    As soon as your service or app sends a text for speech synthesis, it immediately receives an audio recording in response. The delay is so small that you can create software with voice streaming support.

  • Transparent pricing

    Speech recognition is charged by the length of the audio recording you send. Speech synthesis is charged by the number of characters in the text. This way you can accurately forecast your spending.

  • Easy-to-use API

    The service offers an HTTP API for data exchange. This means that you can implement new features on a tight deadline, without needing to deploy and support your own infrastructure.
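
To illustrate what data exchange over an HTTP API can look like, here is a minimal sketch in Python that builds a speech synthesis request. The endpoint URL, the parameter names (`text`, `lang`), and the `Bearer` authorization scheme are assumptions made for illustration only; take the real values from the SpeechKit documentation.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint, used only for illustration.
# Replace it with the address given in the official documentation.
SYNTHESIS_URL = "https://speechkit.example/synthesize"


def build_synthesis_request(text: str, lang: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for synthesized speech."""
    # Parameter names "text" and "lang" are assumptions, not the documented API.
    body = urllib.parse.urlencode({"text": text, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        SYNTHESIS_URL,
        data=body,
        headers={
            # The authorization scheme is also an assumption.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def synthesize(text: str, lang: str, api_key: str) -> bytes:
    """Send the request and return the audio bytes from the response body."""
    request = build_synthesis_request(text, lang, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()
```

Keeping request construction separate from sending makes the request easy to inspect and test without network access.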

Use cases

  • When many customers call to provide the same kind of information, you can automate saving that data to your database. Yandex SpeechKit can recognize the caller’s last name, the preferred date and time of an appointment, and other details. This frees your call center staff to focus on more complex issues.

  • Add voice control to your app: many of your users will find this much faster and easier. Yandex SpeechKit can decode voice commands so that the app can respond to them.

  • Let’s say you need to automatically inform a large customer audience of the same thing, but you would also like to add a personal touch to each message. For example, you want to address each person by name, state their unique customer number, or personalize the message in other ways. Using speech synthesis, you can mass-dial your customers without involving your call center.

  • Add a voice interface to your service so that the users of your products or services can skip on-screen text. This makes it easier for people with limited vision to become your customers.

  • With voice synthesis you can share your knowledge more easily. Prepare a text version of your intended video or webinar, and let Yandex SpeechKit do the rest of the job for you. No need to waste time on voiceover: delegate this to an automated system.
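
As a rough illustration of the personalized-messaging use case, the sketch below fills a message template for each customer and estimates the synthesis cost from the character count. The template, the customer records, and the price per character are all hypothetical; it also assumes that every character, including spaces and punctuation, is billed.

```python
from string import Template

# Hypothetical rate in currency units per character. Not a real SpeechKit price.
PRICE_PER_CHARACTER = 0.0001

# Hypothetical message template with per-customer placeholders.
MESSAGE = Template("Hello, $name. Your customer number is $number.")

# Hypothetical customer records.
customers = [
    {"name": "Anna", "number": "1042"},
    {"name": "Timur", "number": "2310"},
]


def personalize(template: Template, customer: dict) -> str:
    """Fill the template with one customer's details."""
    return template.substitute(customer)


def forecast_cost(texts: list[str], price_per_character: float) -> float:
    """Estimate the cost as total character count times the per-character rate."""
    return sum(len(text) for text in texts) * price_per_character


texts = [personalize(MESSAGE, customer) for customer in customers]
for text in texts:
    print(text)
print(f"Estimated cost: {forecast_cost(texts, PRICE_PER_CHARACTER):.4f}")
```

Each resulting text can then be sent for synthesis, and the same character count gives a spending forecast before any request is made.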
