Short audio recognition
Short audio recognition provides a fast response and is suited to short, single-channel audio.
If you want to recognize speech over the same connection, use streaming mode. In streaming mode, you can get intermediate recognition results.
Audio requirements
The audio you send must meet the following requirements:
- Maximum file size: 1 MB.
- Maximum length: 30 seconds.
- Maximum number of audio channels: 1.
If your file is larger, longer, or has more audio channels, use long audio recognition.
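The limits above can be checked locally before a request is sent. The following is a minimal sketch, not part of the API: the function name `check_audio_limits` is hypothetical, 1 MB is assumed to mean 1,048,576 bytes, and the length estimate assumes uncompressed LPCM audio with 16-bit samples. For compressed formats the duration cannot be derived from the byte count this way.

```python
MAX_BYTES = 1024 * 1024   # assumption: "1 MB" means 1,048,576 bytes
MAX_SECONDS = 30
MAX_CHANNELS = 1


def check_audio_limits(num_bytes, sample_rate_hertz, channels=1, bytes_per_sample=2):
    """Return the names of the violated limits for raw LPCM audio (empty if none).

    bytes_per_sample=2 assumes 16-bit samples; adjust it if your audio differs.
    """
    problems = []
    if num_bytes > MAX_BYTES:
        problems.append("size")
    if channels > MAX_CHANNELS:
        problems.append("channels")
    # Duration of raw PCM = bytes / (sample rate * channels * bytes per sample).
    seconds = num_bytes / (sample_rate_hertz * channels * bytes_per_sample)
    if seconds > MAX_SECONDS:
        problems.append("length")
    return problems
```

Under these assumptions, mono audio sampled at 48,000 Hz reaches the size limit after roughly 11 seconds, well before the 30-second length limit.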
HTTP request
POST https://stt.api.cloud.yandex.net/speech/v1/stt:recognize
Use the "Transfer-Encoding: chunked" header for data streaming.
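One way to stream the request body from Python is to pass an iterable of byte blocks as the request data. This is a sketch under assumptions: `iter_chunks` and `build_streaming_request` are hypothetical helpers, the 4096-byte block size is arbitrary, and it relies on `urllib.request` (Python 3.6 and later) falling back to chunked transfer encoding when the data is an iterable and no Content-Length header is set. The request is only built here, not sent.

```python
import urllib.request

STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def iter_chunks(fileobj, chunk_size=4096):
    """Yield successive blocks of bytes from a binary file object."""
    while True:
        block = fileobj.read(chunk_size)
        if not block:
            break
        yield block


def build_streaming_request(fileobj, iam_token, query=""):
    """Build (but do not send) a POST request whose body is an iterable."""
    url = STT_URL + ("?" + query if query else "")
    # No Content-Length is set, so urllib is expected to add
    # "Transfer-Encoding: chunked" itself when the request is opened.
    request = urllib.request.Request(url, data=iter_chunks(fileobj), method="POST")
    request.add_header("Authorization", "Bearer %s" % iam_token)
    return request
```

To send it, open the audio file in binary mode and pass the result to `urllib.request.urlopen` while the file is still open.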
Query parameters
Parameter | Description |
---|---|
lang | string. The language to use for recognition. Acceptable values: |
topic | string. The language model to be used for recognition. The closer the model is matched, the better the recognition result. You can only specify one model per request. Acceptable values depend on the selected language. Default value: general. |
profanityFilter | boolean. This parameter controls the profanity filter in recognized speech. Acceptable values: |
format | string. The format of the submitted audio. Acceptable values: |
sampleRateHertz | string. The sampling frequency of the submitted audio. Used if format is set to lpcm. Acceptable values: |
folderId | string. ID of the folder that you have access to. Required for authorization with a user account (see the UserAccount resource). Don't specify this field if you make a request on behalf of a service account. Maximum string length: 50 characters. |
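The query string can be assembled from these parameters programmatically. A minimal sketch: `build_recognize_url` is a hypothetical helper, not part of any SDK. It omits parameters that are not supplied so that the service defaults apply, enforces only the documented 50-character limit on folderId, and assumes that boolean values are written as lowercase `true` and `false`.

```python
from urllib.parse import urlencode

STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_recognize_url(lang=None, topic=None, profanity_filter=None,
                        audio_format=None, sample_rate_hertz=None, folder_id=None):
    """Return the request URL with only the supplied query parameters."""
    if folder_id is not None and len(folder_id) > 50:
        raise ValueError("folderId must be at most 50 characters")
    params = {
        "lang": lang,
        "topic": topic,
        "profanityFilter": profanity_filter,
        "format": audio_format,
        "sampleRateHertz": sample_rate_hertz,
        "folderId": folder_id,
    }
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # leave the parameter out so the default is used
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase "true"/"false".
            value = "true" if value else "false"
        query[name] = value
    return STT_URL + ("?" + urlencode(query) if query else "")
```

For example, `build_recognize_url(lang="ru-RU", topic="general", folder_id="b1gvmob95yysaplct532")` produces a URL equivalent to the one used in the sample request in the Examples section.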
Parameters in the request body
The request body has to contain the binary content of an audio file.
Response
The recognized text is returned in the result field of the response.
{
"result": <recognized text>
}
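A sketch of reading this response body in Python. The names `extract_result` and `RecognitionError` are hypothetical; `error_code` and `error_message` are the field names read by the examples in this section, and treating a missing `result` field as a failure is an assumption.

```python
import json


class RecognitionError(Exception):
    """Raised when the response body carries no recognized text."""


def extract_result(body):
    """Return the recognized text from a JSON response body (bytes or str)."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    decoded = json.loads(body)
    if "result" not in decoded:
        # Assumption: a body without "result" describes an error.
        raise RecognitionError(
            "%s: %s" % (decoded.get("error_code"), decoded.get("error_message"))
        )
    return decoded["result"]
```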
Examples
To recognize speech in Russian, send an audio fragment (for example, speech.ogg) to the service.
Sample request
POST /speech/v1/stt:recognize?topic=general&lang=ru-RU&folderId={folder ID} HTTP/1.1
Host: stt.api.cloud.yandex.net
Authorization: Bearer <IAM-TOKEN>
... (binary content of an audio file)
$ export FOLDER_ID=b1gvmob95yysaplct532
$ export IAM_TOKEN=CggaATEVAgA...
$ curl -X POST \
    -H "Authorization: Bearer ${IAM_TOKEN}" \
    -H "Transfer-Encoding: chunked" \
    --data-binary "@speech.ogg" \
    "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize?topic=general&folderId=${FOLDER_ID}"
import json
import urllib.request

FOLDER_ID = "b1gvmob95yysaplct532"  # ID of the folder
IAM_TOKEN = "CggaATEVAgA..."  # IAM token

# Read the audio file as binary data.
with open("speech.ogg", "rb") as f:
    data = f.read()

params = "&".join([
    "topic=general",
    "folderId=%s" % FOLDER_ID,
    "lang=ru-RU"
])

# Supplying data makes urllib send a POST request.
request = urllib.request.Request("https://stt.api.cloud.yandex.net/speech/v1/stt:recognize?%s" % params, data=data)
request.add_header("Authorization", "Bearer %s" % IAM_TOKEN)

responseData = urllib.request.urlopen(request).read().decode("UTF-8")
decodedData = json.loads(responseData)

if decodedData.get("error_code") is None:
    print(decodedData.get("result"))
<?php
$token = 'CggaATEVAgA...'; # IAM token
$folderId = "b1gvmob95yysaplct532"; # ID of the folder
$audioFileName = "speech.ogg";
$file = fopen($audioFileName, 'rb');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize?lang=ru-RU&folderId={$folderId}&format=oggopus");
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Authorization: Bearer ' . $token, 'Transfer-Encoding: chunked'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
# The request body is read from the open file handle.
curl_setopt($ch, CURLOPT_INFILE, $file);
curl_setopt($ch, CURLOPT_INFILESIZE, filesize($audioFileName));
$res = curl_exec($ch);
curl_close($ch);
$decodedResponse = json_decode($res, true);
if (isset($decodedResponse["result"])) {
    echo $decodedResponse["result"];
} else {
    echo "Error code: " . $decodedResponse["error_code"] . "\r\n";
    echo "Error message: " . $decodedResponse["error_message"] . "\r\n";
}
fclose($file);
Response example
HTTP/1.1 200 OK
YaCloud-Billing-Units: 15
{
"result": "your number is 212-85-06"
}
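The response example also carries a YaCloud-Billing-Units header. A sketch of reading it in Python: `billing_units` is a hypothetical helper, and parsing the value as an integer is an assumption based on the single example above.

```python
def billing_units(headers):
    """Return the YaCloud-Billing-Units value as an int, or None if absent.

    `headers` is any iterable of (name, value) pairs. Header names are
    compared case-insensitively, as HTTP field names are case-insensitive.
    """
    for name, value in headers:
        if name.lower() == "yacloud-billing-units":
            # Assumption: the value is a plain integer, as in the example.
            return int(value.strip())
    return None
```

With `urllib.request`, the pairs can be obtained from `response.getheaders()` on the object returned by `urlopen`.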