Yandex SpeechKit release notes: Archive

Written by Yandex Cloud

Updated at April 17, 2024

SpeechKit provides updates based on the system model and version.

For recognition

For a detailed description of the available versions, see Recognition models.

For synthesis

In speech synthesis, the service provides two types of voices: standard and premium. Premium voices use new speech synthesis technology.

For more information about voice models, see About technology.

Current version

For information about synthesis model updates, see Yandex SpeechKit release notes: Speech synthesis.

For information about recognition model updates, see Yandex SpeechKit release notes: Speech recognition.

Previous versions

Release on 30/09/21

Major upgrade of the premium voices available in the REST API. The updated voices are available under the alena:rc and filipp:rc tags.

Various improvements in synthesis quality, including the synthesis of questions. Fixed a rare problem with looping synthesis.

For testing purposes, you can now mark specific words as stressed. This gives you better control over intonation, especially when synthesizing questions. To stress a word, add <[accented]> right after it. For example, in Are you glad <[accented]> to see me?, the word glad is emphasized.
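The marker placement described above can be automated when preparing text for synthesis. The following is a minimal sketch, not part of SpeechKit: the helper name add_accent is hypothetical, and it only builds the text string, assuming the marker is separated from the word by a single space as in the example.

```python
import re

ACCENT_MARKER = "<[accented]>"


def add_accent(text: str, word: str) -> str:
    """Insert the accent marker after the first whole-word match of `word`.

    Hypothetical helper: it only prepares the text and does not call SpeechKit.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    # count=1 so that only the first occurrence is marked.
    return pattern.sub(lambda m: f"{m.group(0)} {ACCENT_MARKER}", text, count=1)


print(add_accent("Are you glad to see me?", "glad"))
# prints: Are you glad <[accented]> to see me?
```

If the word does not occur in the text, the text is returned unchanged.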

Release on 09/03/21

A new version of the general model, Demosthenes, is now available under the general:rc tag in streaming speech recognition, transcription, and short audio recognition. It features improved basic recognition quality and recognizes names of healthcare professions and words related to jewelry.

We invite you to help test this version. Any feedback is appreciated.

Version availability by tags

In transcription only:

hqa: The Amati version.

In streaming, transcription, and short audio recognition:

general: The Galen version.
general:rc: Demosthenes version.
general:deprecated: Zeno version.
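The tag list above is effectively a lookup table from a tag to a model version. As an illustration only, it can be written down as a small mapping; the names TAGS_2021_03_09 and version_for are hypothetical and are not part of any SpeechKit SDK.

```python
# Versions behind each `general` tag as of the 09/03/21 release,
# copied from the list above (streaming, transcription, short audio).
TAGS_2021_03_09 = {
    "general": "Galen",
    "general:rc": "Demosthenes",
    "general:deprecated": "Zeno",
}


def version_for(tag: str) -> str:
    """Return the version name behind a tag, or raise ValueError if unknown."""
    try:
        return TAGS_2021_03_09[tag]
    except KeyError:
        raise ValueError(f"unknown tag: {tag!r}") from None


print(version_for("general:rc"))  # prints: Demosthenes
```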

Release on 26/02/21

In transcription by the hqa model tag, a new version named Guarneri is now available. It features greatly improved recognition quality.

Version availability by tags

In transcription only:

hqa: The Guarneri version.

In streaming, transcription, and short audio recognition:

general: The Galen version.
general:rc: Galen version.
general:deprecated: Zeno version.

Release on 03/02/21

The Galen version of the basic recognition model was tested successfully and is the main version of the recognition model as of February 3.

Version availability by tags

In transcription only:

hqa: The Amati version.

In streaming, transcription, and short audio recognition:

general: The Galen version.
general:rc: Galen version.
general:deprecated: Zeno version.

Release on 14/12/20

In transcription by the hqa model tag, a new version named Amati is now available. Fixed issues where speech was recognized as silence. Recognition in the news and medicine domains has been improved.

Version availability by tags

In transcription only:

hqa: The Amati version.

In streaming, transcription, and short audio recognition:

general: Zeno version.
general:rc: Galen version.
general:deprecated: Anaximander version.

Release on 01/12/20

A new version of the general model, Galen, is now available under the general:rc tag in streaming, transcription, and short audio recognition. It provides significantly better basic recognition quality and recognizes words related to COVID-19.

Version availability by tags

In transcription only:

hqa: The Stradivarius version.

In streaming, transcription, and short audio recognition:

general: Zeno version.
general:rc: Galen version.
general:deprecated: Anaximander version.

Release on 24/11/20

After successful testing, the Zeno version is now the main released version of the general model in streaming, transcription, and short audio recognition.

Version availability by tags

In transcription only:

hqa: The Stradivarius version.

In streaming, transcription, and short audio recognition:

general and general:rc: Zeno version.
general:deprecated: Anaximander version.

Release on 17/11/20

Numerous corrections to the pronunciation of individual words thanks to improved normalization. Fixed the declension of numerals. A new version of the alena premium voice is now available by the alena tag.

Version availability by tags

No changes.

Release on 26/10/20

A next-generation recognition model, hqa, is now available in transcription. This model has a richer vocabulary, so recognition results are much better and easier to read. The difference is especially noticeable with long audio recognition.

Version availability by tags

In transcription:

hqa: The Stradivarius version.
general: Anaximander version.
general:rc: The Zeno version.
general:deprecated: The Marcus Aurelius version.

In streaming and short audio recognition: no changes.

Release on 12/10/20

A new version of the general model is now available in streaming, transcription, and short audio recognition. It provides significantly better basic recognition quality.

Version availability by tags

general: Anaximander version.
general:rc: Zeno version.
general:deprecated: Marcus Aurelius version.

Release on 18/08/20

Update for transcription in the Anaximander version:

Improved handling of dense speech with no detectable pauses for more than 30 seconds.
Timing fixed.
Fixed an error with partial recognition results arriving after the final result.

The acoustic and language properties of the model have not changed.
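The ordering fix listed above concerns partial results that arrive after the final result. As an illustration of that ordering rule, here is a minimal client-side sketch that drops such late partials. It assumes results arrive as (utterance_id, text, is_final) tuples; this shape and the name drop_late_partials are hypothetical and do not reflect the SpeechKit response format.

```python
from typing import Iterable, Iterator, Tuple

Result = Tuple[int, str, bool]  # (utterance_id, text, is_final), assumed shape


def drop_late_partials(results: Iterable[Result]) -> Iterator[Result]:
    """Yield results in order, skipping partials for already finalized utterances."""
    finalized = set()
    for utterance_id, text, is_final in results:
        if is_final:
            finalized.add(utterance_id)
        elif utterance_id in finalized:
            continue  # stale partial: the final result was already delivered
        yield utterance_id, text, is_final


stream = [(1, "he", False), (1, "hello", True), (1, "hel", False), (2, "wo", False)]
print(list(drop_late_partials(stream)))
# prints: [(1, 'he', False), (1, 'hello', True), (2, 'wo', False)]
```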

Version availability by tags

These versions are available for streaming recognition, transcription, and short audio recognition:

general: Anaximander version.
general:rc: Anaximander version (updated).
general:deprecated: Marcus Aurelius version.

Release on 21/07/20

Anaximander is now the main operating version for streaming recognition, transcription, and short audio recognition.

Version availability by tags

general and general:rc: Anaximander version.
general:deprecated: Marcus Aurelius version.

Release on 27/05/20

New versions of the general model are now available in transcription and short audio recognition.

Version availability by tags

Versions of the general model available for transcription and short audio recognition:

general:rc: The Anaximander version.
general and general:deprecated: The Marcus Aurelius version.

Versions of the general model available for streaming recognition:

general: The Marcus Aurelius version.
general:rc: Anaximander version.
general:deprecated: Diogenes version.

Release on 15/05/20

For streaming speech recognition, a new version of the general model, Anaximander, is now available with the general:rc tag.

Version availability by tags

general: Marcus Aurelius version.
general:rc: Anaximander version.
general:deprecated: Diogenes version.

The versions for short and long audio recognition remain unchanged.

Release on 16/04/20

For streaming speech recognition, a new version of the general model, Marcus Aurelius, is now available with the general tag.

Version availability by tags

general and general:rc: Marcus Aurelius version.
general:deprecated: Diogenes version.

The versions for short and long audio recognition available with the general tag remain unchanged.
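Taken together, the releases above describe which version the general tag pointed to in streaming recognition at any given time. The sketch below collects those switch dates (release dates read as DD/MM/YY) and looks up the version for a given day. The names GENERAL_STREAMING and general_version_on are hypothetical, and the table covers only what this archive states: earlier dates return None.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional

# Dates on which the `general` tag in streaming recognition switched versions,
# taken from the release entries in this archive.
GENERAL_STREAMING = [
    (date(2020, 4, 16), "Marcus Aurelius"),
    (date(2020, 7, 21), "Anaximander"),
    (date(2020, 11, 24), "Zeno"),
    (date(2021, 2, 3), "Galen"),
]


def general_version_on(day: date) -> Optional[str]:
    """Return the version behind `general` on `day`, or None before the first entry."""
    switch_dates = [d for d, _ in GENERAL_STREAMING]
    i = bisect_right(switch_dates, day)  # number of switches on or before `day`
    return GENERAL_STREAMING[i - 1][1] if i else None


print(general_version_on(date(2020, 12, 1)))  # prints: Zeno
```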
