YC SpeechKit releases
Yandex SpeechKit provides updates based on the model and version system.
For speech recognition
Versions of the general model have several tags:
general: The main version.
general:rc: A release candidate that you can test.
general:deprecated: The previous version, which remains available for 2 weeks after the new main version is published.
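To illustrate the tag scheme above, a model reference can be split into a model name and an optional tag. This is a minimal Python sketch, not part of the SpeechKit API: the names ModelRef and parse_model_tag, and the label "main" for a reference without a suffix, are assumptions made for this example.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    model: str  # for example "general" or "hqa"
    tag: str    # "main", "rc", or "deprecated"


# Suffixes described in this document. "main" is a hypothetical label
# used in this sketch for a reference that has no suffix.
KNOWN_TAGS = {"rc", "deprecated"}


def parse_model_tag(value: str) -> ModelRef:
    """Split a reference such as "general:rc" into model and tag."""
    model, sep, tag = value.partition(":")
    if not model:
        raise ValueError(f"missing model name in {value!r}")
    if not sep:
        # No colon: the reference points at the main version.
        return ModelRef(model, "main")
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown tag {tag!r} in {value!r}")
    return ModelRef(model, tag)
```

For example, parse_model_tag("general:rc") yields the model "general" with the tag "rc", while a suffix not listed above is rejected with a ValueError.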
For a detailed description of the available versions, see Recognition models.
The next-generation hqa model is only available in transcription for the Stradivarius version.
For speech synthesis
In speech synthesis, the service provides two types of voices: standard and premium. Premium voices use new speech-synthesis technology.
For more information about voice models, see About the technology.
Current version
Release 14.12.20
A new version named Amati is now available in transcription with the hqa tag. Issues where silence was recognized instead of speech have been fixed. Recognition has been improved for texts in the news and medicine subject domains.
Version availability by tag
In transcription only:
hqa: The Amati version.
In streaming, transcription, and short audio recognition:
general: The Zeno version.
general:rc: The Galen version.
general:deprecated: The Anaximander version.
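The availability list for release 14.12.20 can also be read as a lookup table. The sketch below encodes it in Python. The mode keys ("streaming", "transcription", "short_audio") and the function resolve_version are hypothetical names chosen for this illustration; only the tag-to-version pairs come from the list above.

```python
# Availability as listed for release 14.12.20.
# The mode names are hypothetical keys chosen for this sketch.
ALL_MODES = frozenset({"streaming", "transcription", "short_audio"})

AVAILABILITY = {
    "hqa": ("Amati", frozenset({"transcription"})),
    "general": ("Zeno", ALL_MODES),
    "general:rc": ("Galen", ALL_MODES),
    "general:deprecated": ("Anaximander", ALL_MODES),
}


def resolve_version(tag: str, mode: str) -> str:
    """Return the version name behind a tag for a recognition mode."""
    # An unknown tag raises KeyError, which is a LookupError subclass.
    version, modes = AVAILABILITY[tag]
    if mode not in modes:
        raise LookupError(f"{tag!r} is not available in {mode}")
    return version
```

With this table, resolve_version("general:rc", "streaming") returns "Galen", and asking for "hqa" in any mode other than transcription raises a LookupError.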
Previous versions
Release 01.12.20
A new version named Galen is now available in streaming, transcription, and short audio recognition with the general:rc tag. It provides significantly better basic recognition quality and recognizes words related to COVID-19.
Version availability by tag
In transcription only:
hqa: The Stradivarius version.
In streaming, transcription, and short audio recognition:
general: The Zeno version.
general:rc: The Galen version.
general:deprecated: The Anaximander version.
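Because every release repeats the full tag list, the change between two releases can be found by comparing the lists. The sketch below does this in Python for releases 24.11.20 and 01.12.20. The dictionary names and the function changed_tags are hypothetical; the tag-to-version pairs are copied from the lists in this document.

```python
from typing import Dict, Optional, Tuple

# Tag-to-version pairs as listed in this document for two releases.
RELEASE_24_11_20 = {
    "hqa": "Stradivarius",
    "general": "Zeno",
    "general:rc": "Zeno",
    "general:deprecated": "Anaximander",
}
RELEASE_01_12_20 = {
    "hqa": "Stradivarius",
    "general": "Zeno",
    "general:rc": "Galen",
    "general:deprecated": "Anaximander",
}


def changed_tags(
    before: Dict[str, str], after: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each tag whose version differs to an (old, new) pair."""
    changes = {}
    for tag in sorted(set(before) | set(after)):
        old, new = before.get(tag), after.get(tag)
        if old != new:
            changes[tag] = (old, new)
    return changes
```

Calling changed_tags(RELEASE_24_11_20, RELEASE_01_12_20) reports a single change: general:rc moved from Zeno to Galen.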
Release 24.11.20
After successful testing, Zeno is now the main version of the general model in streaming, transcription, and short audio recognition.
Version availability by tag
In transcription only:
hqa: The Stradivarius version.
In streaming, transcription, and short audio recognition:
general and general:rc: The Zeno version.
general:deprecated: The Anaximander version.
Release 17.11.20
The pronunciation of many individual words has been corrected thanks to improved normalization. The declension of numerals has been fixed. A new version of the alena premium voice is now available with the alena tag.
Version availability by tag
No changes.
Release 26.10.20
A next-generation recognition model, hqa, is now available in transcription. This model has a richer vocabulary, so recognition results are much better and easier to read. The difference is especially noticeable in long audio recognition.
Version availability by tag
In transcription:
hqa: The Stradivarius version.
general: The Anaximander version.
general:rc: The Zeno version.
general:deprecated: The Marcus Aurelius version.
In streaming and short audio recognition: no changes.
Release 12.10.20
A new version of the general model is now available in streaming, transcription, and short audio recognition. It provides significantly better basic recognition quality.
Version availability by tag
general: The Anaximander version.
general:rc: The Zeno version.
general:deprecated: The Marcus Aurelius version.
Release 18.08.20
Update for transcription in the Anaximander version:
- Improved handling of dense speech with no detectable pauses for more than 30 seconds.
- Fixed timing.
- Fixed an error with partial recognition results arriving after the final result.
The acoustic and language properties of the model have not changed.
Version availability by tag
These versions are available for streaming recognition, transcription, and short audio recognition:
general: The Anaximander version.
general:rc: The Anaximander version (updated).
general:deprecated: The Marcus Aurelius version.
Release 21.07.20
Anaximander is now the main operating version for streaming recognition, transcription, and short audio recognition.
Version availability by tag
general and general:rc: The Anaximander version.
general:deprecated: The Marcus Aurelius version.
Release 27.05.20
New versions of the general model are now available in transcription and short audio recognition.
Version availability by tag
In transcription and short audio recognition:
general:rc: The Anaximander version.
general and general:deprecated: The Marcus Aurelius version.
Versions of the general model available for streaming recognition:
general: The Marcus Aurelius version.
general:rc: The Anaximander version.
general:deprecated: The Diogenes version.
Release 15.05.20
A new version named Anaximander is now available for streaming speech recognition with the general:rc tag.
Version availability by tag
general: The Marcus Aurelius version.
general:rc: The Anaximander version.
general:deprecated: The Diogenes version.
The versions for short and long audio recognition remain unchanged.
Release 16.04.20
A new version named Marcus Aurelius is now available for streaming speech recognition with the general tag.
Version availability by tag
general and general:rc: The Marcus Aurelius version.
general:deprecated: The Diogenes version.
The versions for short and long audio recognition available with the general tag remain unchanged.