
YC SpeechKit releases

  • Current version
    • Release 14.12.20
  • Previous versions
    • Release 01.12.20
    • Release 24.11.20
    • Release 17.10.20
    • Release 26.10.20
    • Release 12.10.20
    • Release 18.08.20
    • Release 21.07.20
    • Release 27.05.20
    • Release 15.05.20
    • Release 16.04.20

Yandex SpeechKit delivers updates through a system of models and versions.

For speech recognition

Versions of the general model are available under several tags:

  • general: This tag indicates the main version.
  • general:rc: This tag indicates a release candidate that you can test.
  • general:deprecated: This tag indicates the previous version, which remains available for two weeks after the new main version is published.

For a detailed description of the available versions, see Recognition models.
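
As an illustration of how a tag selects a version, the following minimal sketch builds a request URL for short audio recognition. The endpoint and the `topic` and `lang` query parameter names are assumptions not stated on this page; check them against the API reference before use. The helper `build_short_audio_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for short audio recognition; verify against the API reference.
STT_ENDPOINT = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"

# Tags of the general model listed above.
GENERAL_TAGS = {"general", "general:rc", "general:deprecated"}


def build_short_audio_url(tag: str, lang: str = "ru-RU") -> str:
    """Return a request URL that selects a model version by tag."""
    if tag not in GENERAL_TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    # `topic` and `lang` are assumed query parameter names.
    query = urlencode({"topic": tag, "lang": lang})
    return f"{STT_ENDPOINT}?{query}"


# Opt in to the release candidate for testing.
url = build_short_audio_url("general:rc")
print(url)
```

Passing `general` instead follows whichever version is currently the main one, so the client needs no change when a release candidate is promoted.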

The next-generation hqa model is available in transcription only. Its current version is Amati.

For speech synthesis

In speech synthesis, the service provides two types of voices: standard and premium. Premium voices use new speech-synthesis technology.

For more information about voice models, see About the technology.

Current version

Release 14.12.20

In transcription, a new version named Amati is now available with the hqa tag. Issues where silence was recognized instead of speech have been fixed. Recognition has been improved for the news and medicine subject domains.

Version availability by tag

In transcription only:

  • hqa: The Amati version.

In streaming, transcription, and short audio recognition:

  • general: The Zeno version.
  • general:rc: The Galen version.
  • general:deprecated: The Anaximander version.
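
The availability list for this release can be expressed as a small lookup table. This is only a sketch of the data above: the mode names (`streaming`, `transcription`, `short_audio`) and the helper `resolve_version` are hypothetical labels, not API identifiers.

```python
from typing import Optional

ALL_MODES = {"streaming", "transcription", "short_audio"}

# Tag -> version and the recognition modes where the tag is available,
# copied from the list for release 14.12.20.
CURRENT_VERSIONS = {
    "hqa": {"version": "Amati", "modes": {"transcription"}},
    "general": {"version": "Zeno", "modes": ALL_MODES},
    "general:rc": {"version": "Galen", "modes": ALL_MODES},
    "general:deprecated": {"version": "Anaximander", "modes": ALL_MODES},
}


def resolve_version(tag: str, mode: str) -> Optional[str]:
    """Return the version served for a tag in a mode, or None if unavailable."""
    entry = CURRENT_VERSIONS.get(tag)
    if entry is None or mode not in entry["modes"]:
        return None
    return entry["version"]


print(resolve_version("hqa", "transcription"))
print(resolve_version("hqa", "streaming"))
```

The second call returns `None` because hqa is listed for transcription only.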

Previous versions

Release 01.12.20

In streaming, transcription, and short audio recognition, a new version named Galen is now available with the general:rc tag. It provides significantly better basic recognition quality and recognizes words related to COVID-19.

Version availability by tag

In transcription only:

  • hqa: The Stradivarius version.

In streaming, transcription, and short audio recognition:

  • general: The Zeno version.
  • general:rc: The Galen version.
  • general:deprecated: The Anaximander version.
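
To see what changes between two releases, their tag lists can be compared. The sketch below uses the lists for releases 01.12.20 and 14.12.20 as given on this page; the function `diff_releases` is a hypothetical helper, not part of the service.

```python
# Tag -> version, copied from the availability lists for each release.
RELEASE_01_12_20 = {
    "hqa": "Stradivarius",
    "general": "Zeno",
    "general:rc": "Galen",
    "general:deprecated": "Anaximander",
}
RELEASE_14_12_20 = {
    "hqa": "Amati",
    "general": "Zeno",
    "general:rc": "Galen",
    "general:deprecated": "Anaximander",
}


def diff_releases(old: dict, new: dict) -> dict:
    """Return {tag: (old_version, new_version)} for tags whose version changed."""
    changed = {}
    for tag in sorted(set(old) | set(new)):
        before, after = old.get(tag), new.get(tag)
        if before != after:
            changed[tag] = (before, after)
    return changed


changes = diff_releases(RELEASE_01_12_20, RELEASE_14_12_20)
print(changes)
```

Only the hqa tag differs between these two releases: it moves from Stradivarius to Amati.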

Release 24.11.20

After successful testing, the Zeno version is now the main released version of the general model in streaming, transcription, and short audio recognition.

Version availability by tag

In transcription only:

  • hqa: The Stradivarius version.

In streaming, transcription, and short audio recognition:

  • general and general:rc: The Zeno version.
  • general:deprecated: The Anaximander version.

Release 17.10.20

The pronunciation of many individual words has been corrected thanks to improved normalization. The declension of numerals has been fixed. A new version of the alena premium voice is now available with the alena tag.

Version availability by tag

No changes.

Release 26.10.20

A next-generation recognition model, hqa, is now available in transcription. This model has a richer vocabulary, so its recognition results are much better and easier to read. The difference is especially noticeable in long audio recognition.

Version availability by tag

In transcription:

  • hqa: The Stradivarius version.
  • general: The Anaximander version.
  • general:rc: The Zeno version.
  • general:deprecated: The Marcus Aurelius version.

In streaming and short audio recognition: no changes.
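
A transcription request that opts in to the hqa model might look like the sketch below. The body layout (`config.specification.model`, `languageCode`, `audio.uri`) is an assumption based on the long audio recognition API and is not confirmed on this page; the audio URI is a hypothetical placeholder.

```python
import json


def build_transcription_request(
    audio_uri: str, model: str = "hqa", language_code: str = "ru-RU"
) -> dict:
    """Build a request body for long audio recognition (assumed field names)."""
    return {
        "config": {
            "specification": {
                "languageCode": language_code,
                # Model tag: "hqa" here, or a general tag such as "general:rc".
                "model": model,
            }
        },
        "audio": {"uri": audio_uri},
    }


# Hypothetical audio location, used only for illustration.
body = build_transcription_request("https://storage.example.com/bucket/recording.ogg")
print(json.dumps(body, indent=2))
```

Since hqa is listed for transcription only, the same tag would not apply to streaming or short audio recognition requests.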

Release 12.10.20

A new version of the general model is now available in streaming, transcription, and short audio recognition. It provides significantly better basic recognition quality.

Version availability by tag

  • general: The Anaximander version.
  • general:rc: The Zeno version.
  • general:deprecated: The Marcus Aurelius version.

Release 18.08.20

Update for transcription in the Anaximander version:

  • Improved handling of dense speech with no detectable pauses for more than 30 seconds.
  • Fixed timing.
  • Fixed an error where partial recognition results arrived after the final result.

The acoustic and language properties of the model have not changed.

Version availability by tag

These versions are available for streaming recognition, transcription, and short audio recognition:

  • general: The Anaximander version.
  • general:rc: The Anaximander version (updated).
  • general:deprecated: The Marcus Aurelius version.

Release 21.07.20

Anaximander is now the main operating version for streaming recognition, transcription, and short audio recognition.

Version availability by tag

  • general and general:rc: The Anaximander version.
  • general:deprecated: The Marcus Aurelius version.

Release 27.05.20

New versions of the general model are now available in transcription and short audio recognition.

Version availability by tag

Versions of the general model available for transcription and short audio recognition:

  • general:rc: The Anaximander version.
  • general and general:deprecated: The Marcus Aurelius version.

Versions of the general model available for streaming recognition:

  • general: The Marcus Aurelius version.
  • general:rc: The Anaximander version.
  • general:deprecated: The Diogenes version.

Release 15.05.20

For streaming speech recognition, a new version named Anaximander is now available with the general:rc tag.

Version availability by tag

  • general: The Marcus Aurelius version.
  • general:rc: The Anaximander version.
  • general:deprecated: The Diogenes version.

The versions for short and long audio recognition remain unchanged.

Release 16.04.20

For streaming speech recognition, a new version named Marcus Aurelius is now available with the general tag.

Version availability by tag

  • general and general:rc: The Marcus Aurelius version.
  • general:deprecated: The Diogenes version.

The versions for short and long audio recognition available with the general tag remain unchanged.

© 2021 Yandex.Cloud LLC