Short audio recognition

Short audio recognition provides a fast response time and is suited to short, single-channel audio.

If you want to recognize speech over the same connection, use streaming mode. In streaming mode, you can get intermediate recognition results.

Audio requirements

The audio you send must meet the following requirements:

  1. Maximum file size: 1 MB.
  2. Maximum length: 30 seconds.
  3. Maximum number of audio channels: 1.

If your file is larger, longer, or has more audio channels, use long audio recognition.
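The limits above can be checked on the client before sending a request. The following is a minimal sketch, not part of the service API: the function name check_short_audio_limits is hypothetical, the caller is expected to supply the size, duration, and channel count, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
# Limits taken from the list above.
# Treating 1 MB as 1024 * 1024 bytes is an assumption.
MAX_SIZE_BYTES = 1024 * 1024
MAX_DURATION_SECONDS = 30
MAX_CHANNELS = 1


def check_short_audio_limits(size_bytes, duration_seconds, channels):
    """Return a list of violations; an empty list means the audio fits."""
    problems = []
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 1 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 30 seconds")
    if channels > MAX_CHANNELS:
        problems.append("audio has more than one channel")
    return problems


if __name__ == "__main__":
    # A 400 KB, 12-second mono clip passes; the same clip in stereo does not.
    print(check_short_audio_limits(400 * 1024, 12.0, 1))
    print(check_short_audio_limits(400 * 1024, 12.0, 2))
```

If the returned list is not empty, the audio is a candidate for long audio recognition instead.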

HTTP request


Use the "Transfer-Encoding: chunked" header for data streaming.
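With chunked transfer, the request body is sent as a sequence of blocks rather than as one buffer. As a rough illustration, the sketch below splits a binary stream into blocks. The helper name iter_chunks and the 4096-byte block size are assumptions made for this example, not values required by the service.

```python
import io


def iter_chunks(stream, chunk_size=4096):
    """Yield blocks of at most chunk_size bytes until the stream is exhausted."""
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        yield block


if __name__ == "__main__":
    # 10000 bytes are split into two full blocks and one partial block.
    demo = io.BytesIO(b"x" * 10000)
    print([len(block) for block in iter_chunks(demo)])  # [4096, 4096, 1808]
```

In Python 3.6 and later, urllib.request sends an iterable body such as this generator with "Transfer-Encoding: chunked" when no Content-Length header is set, so the generator can be passed as the data argument of urllib.request.Request.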

Query parameters

Parameter Description
lang string
The language to use for recognition.
Acceptable values:
  • ru-RU (by default) — Russian.
  • en-US — English.
  • tr-TR — Turkish.
topic string
The language model to be used for recognition.
The more closely the model matches the content of the audio, the better the recognition result. You can only specify one model per request.
Acceptable values depend on the selected language. Default value: general.
profanityFilter boolean
Controls whether profanities are filtered out of the recognized text.
Acceptable values:
  • false (default) — Profanities aren't excluded from recognition results.
  • true — Profanities are excluded from recognition results.
format string
The format of the submitted audio.
Acceptable values:
sampleRateHertz string
The sampling frequency of the submitted audio.
Used if format is set to lpcm. Acceptable values:
  • 48000 (default) — Sampling rate of 48 kHz.
  • 16000 — Sampling rate of 16 kHz.
  • 8000 — Sampling rate of 8 kHz.
folderId string

ID of the folder that you have access to. Required for authorization with a user account (see the UserAccount resource). Don't specify this field if you make a request on behalf of a service account.

Maximum string length: 50 characters.
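The query parameters above can be assembled into a query string in code. The sketch below is one possible way to do it. The function name build_query and its argument names are hypothetical. Parameter names, defaults, the sample rate values, and the 50-character limit are taken from the descriptions above. The format value is passed through unchecked because its acceptable values are not listed here.

```python
from urllib.parse import urlencode

ALLOWED_SAMPLE_RATES = (48000, 16000, 8000)  # values listed for sampleRateHertz
MAX_FOLDER_ID_LENGTH = 50


def build_query(folder_id=None, lang="ru-RU", topic="general",
                profanity_filter=False, audio_format=None,
                sample_rate_hertz=None):
    """Return a URL-encoded query string for the parameters described above."""
    params = {
        "lang": lang,
        "topic": topic,
        # The documented values are lowercase true/false, not Python's True/False.
        "profanityFilter": "true" if profanity_filter else "false",
    }
    if folder_id is not None:  # omitted for service account requests
        if len(folder_id) > MAX_FOLDER_ID_LENGTH:
            raise ValueError("folderId is limited to 50 characters")
        params["folderId"] = folder_id
    if audio_format is not None:
        params["format"] = audio_format
    if sample_rate_hertz is not None:
        if sample_rate_hertz not in ALLOWED_SAMPLE_RATES:
            raise ValueError("unsupported sampleRateHertz: %r" % sample_rate_hertz)
        params["sampleRateHertz"] = sample_rate_hertz
    return urlencode(params)


if __name__ == "__main__":
    print(build_query("b1gvmob95yysaplct532"))
    print(build_query(audio_format="lpcm", sample_rate_hertz=16000))
```

Leaving folder_id unset matches the case of a request made on behalf of a service account, where the field must not be specified.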

Parameters in the request body

The request body has to contain the binary content of an audio file.


The recognized text is returned in the result field of the response.

{
  "result": <recognized text>
}

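Reading the result field can be sketched as follows. This assumes the response body is a JSON object, and that a failed request carries the error_code and error_message fields that the sample code on this page reads. The function name extract_text is hypothetical.

```python
import json


def extract_text(body):
    """Return the recognized text, or raise RuntimeError describing the error."""
    decoded = json.loads(body)
    if "result" in decoded:
        return decoded["result"]
    # Field names follow the sample code on this page; treat them as an assumption.
    raise RuntimeError("%s: %s" % (decoded.get("error_code"),
                                   decoded.get("error_message")))


if __name__ == "__main__":
    print(extract_text('{"result": "your number is 212-85-06"}'))
```

Raising an exception keeps error handling separate from the normal path; returning None instead would be an equally reasonable choice.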

To recognize speech in Russian, send an audio fragment (for example, speech.ogg) to the service.

Sample request

POST /speech/v1/stt:recognize?topic=general&lang=ru-RU&folderId={folder ID} HTTP/1.1
Authorization: Bearer <IAM-TOKEN>

... (binary content of an audio file)
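
cURL

$ export FOLDER_ID=b1gvmob95yysaplct532
$ export IAM_TOKEN=CggaATEVAgA...
$ # Replace <API-HOST> with the host of the recognition service.
$ curl -X POST \
     -H "Authorization: Bearer ${IAM_TOKEN}" \
     -H "Transfer-Encoding: chunked" \
     --data-binary "@speech.ogg" \
     "https://<API-HOST>/speech/v1/stt:recognize?topic=general&lang=ru-RU&folderId=${FOLDER_ID}"
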
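Python

import urllib.request
import json

FOLDER_ID = "b1gvmob95yysaplct532" # ID of the folder
IAM_TOKEN = "CggaATEVAgA..." # IAM token

# Read the whole audio file into memory as the request body.
with open("speech.ogg", "rb") as f:
    data = f.read()

# Query parameters, matching the sample request above.
params = "&".join([
    "topic=general",
    "folderId=%s" % FOLDER_ID,
    "lang=ru-RU"
])

# Replace <API-HOST> with the host of the recognition service.
url = urllib.request.Request("https://<API-HOST>/speech/v1/stt:recognize?%s" % params, data=data)
url.add_header("Authorization", "Bearer %s" % IAM_TOKEN)

responseData = urllib.request.urlopen(url).read().decode('UTF-8')
decodedData = json.loads(responseData)

# Print the recognized text if the service did not report an error.
if decodedData.get("error_code") is None:
    print(decodedData.get("result"))
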
PHP

<?php

$token = 'CggaATEVAgA...'; # IAM token
$folderId = "b1gvmob95yysaplct532"; # ID of the folder
$audioFileName = "speech.ogg";

$file = fopen($audioFileName, 'rb');

$ch = curl_init();
# Replace <API-HOST> with the host of the recognition service.
curl_setopt($ch, CURLOPT_URL, "https://<API-HOST>/speech/v1/stt:recognize?lang=ru-RU&folderId={$folderId}&format=oggopus");
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Authorization: Bearer ' . $token, 'Transfer-Encoding: chunked'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);

# Stream the request body from the opened audio file.
curl_setopt($ch, CURLOPT_INFILE, $file);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($audioFileName));
$res = curl_exec($ch);
fclose($file);

$decodedResponse = json_decode($res, true);
if (isset($decodedResponse["result"])) {
    echo $decodedResponse["result"];
} else {
    echo "Error code: " . $decodedResponse["error_code"] . "\r\n";
    echo "Error message: " . $decodedResponse["error_message"] . "\r\n";
}


Response example

HTTP/1.1 200 OK
YaCloud-Billing-Units: 15

{
  "result": "your number is 212-85-06"
}