Method batchAnalyze

  • HTTP request
  • Body parameters
  • Response

Analyzes a batch of images and returns results with annotations.

HTTP request

POST https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze

Body parameters

{
  "analyzeSpecs": [
    {
      "features": [
        {
          "type": "string",

          // `analyzeSpecs[].features[]` includes only one of the fields `classificationConfig`, `textDetectionConfig`
          "classificationConfig": {
            "model": "string"
          },
          "textDetectionConfig": {
            "languageCodes": [
              "string"
            ],
            "model": "string"
          },
          // end of the list of possible fields `analyzeSpecs[].features[]`

        }
      ],
      "mimeType": "string",

      // `analyzeSpecs[]` includes only one of the fields `content`, `signature`
      "content": "string",
      "signature": "string",
      // end of the list of possible fields `analyzeSpecs[]`

    }
  ],
  "folderId": "string"
}
Field Description
analyzeSpecs[] object

Required. A list of specifications. Each specification contains the file to analyze and features to use for analysis.

Restrictions:

  • Supported file formats: JPEG, PNG.
  • Maximum file size: 1 MB.
  • Image size must not exceed 20M pixels (width x height).

The number of elements must be in the range 1-8.
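The restrictions above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper names are hypothetical, "1 MB" is read as 1024 * 1024 bytes and "20M pixels" as 20,000,000 (both are assumptions), and the pixel count is only checked for PNG files because reading JPEG dimensions needs a fuller parser.

```python
import struct

MAX_FILE_BYTES = 1024 * 1024   # assumption: "1 MB" read as 1 MiB
MAX_PIXELS = 20_000_000        # assumption: "20M pixels" read as 20 * 10**6
MAX_SPECS = 8                  # from "the number of elements must be in the range 1-8"

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def check_image(data: bytes) -> list:
    """Return a list of problems found in one image; an empty list means none found."""
    problems = []
    if len(data) > MAX_FILE_BYTES:
        problems.append("file is larger than 1 MB")
    if data.startswith(PNG_MAGIC):
        # A PNG stores width and height as big-endian uint32 values
        # at byte offsets 16..24 (inside the IHDR chunk).
        if len(data) >= 24:
            width, height = struct.unpack(">II", data[16:24])
            if width * height > MAX_PIXELS:
                problems.append("image has more than 20M pixels")
    elif not data.startswith(JPEG_MAGIC):
        problems.append("file is neither JPEG nor PNG")
    return problems


def check_batch(images: list) -> list:
    """Check the batch size and every image in it."""
    problems = []
    if not 1 <= len(images) <= MAX_SPECS:
        problems.append("batch must contain 1-8 images")
    for index, data in enumerate(images):
        problems += [f"image {index}: {p}" for p in check_image(data)]
    return problems
```

The service remains the authority on what it accepts; a check like this only avoids sending requests that are certain to be rejected.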

analyzeSpecs[].features[] object

Required. Requested features to use for analysis.

The maximum number of requested features per file is 8.

The number of elements must be in the range 1-8.

analyzeSpecs[].features[].type string
Type of requested feature.
  • TEXT_DETECTION: Text detection (OCR) feature.
  • CLASSIFICATION: Classification feature.
  • FACE_DETECTION: Face detection feature.
  • IMAGE_COPY_SEARCH: Image copy search.
analyzeSpecs[].features[].classificationConfig object
Required for the CLASSIFICATION type. Specifies configuration for the classification feature.
analyzeSpecs[].features[] includes only one of the fields classificationConfig, textDetectionConfig

analyzeSpecs[].features[].classificationConfig.model string

Model to use for image classification.

The maximum string length in characters is 256.

analyzeSpecs[].features[].textDetectionConfig object
Required for the TEXT_DETECTION type. Specifies configuration for the text detection (OCR) feature.
analyzeSpecs[].features[] includes only one of the fields classificationConfig, textDetectionConfig

analyzeSpecs[].features[].textDetectionConfig.languageCodes[] string

Required. List of languages in which to recognize text. Specified in ISO 639-1 format (for example, ru).

The number of elements must be in the range 1-8. The maximum string length in characters for each value is 3.

analyzeSpecs[].features[].textDetectionConfig.model string

Model to use for text detection. Possible values:

  • page (default) — this model is suitable for detecting multiple text entries in an image.
  • line — this model is suitable for cropped images with one line of text.

The maximum string length in characters is 50.
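As an illustration of the feature fields described above, the sketch below builds `features[]` entries as plain dictionaries and enforces the documented limits locally. The function names are hypothetical, and the classification model name `quality` used in the usage line is an assumption, not a value confirmed by this page.

```python
def text_detection_feature(language_codes, model="page"):
    """Build a TEXT_DETECTION feature entry, checking the documented limits."""
    if not 1 <= len(language_codes) <= 8:
        raise ValueError("languageCodes must contain 1-8 items")
    for code in language_codes:
        if not 1 <= len(code) <= 3:
            raise ValueError(f"language code must be 1-3 characters: {code!r}")
    if model not in ("page", "line"):
        raise ValueError("model must be 'page' or 'line'")
    return {
        "type": "TEXT_DETECTION",
        "textDetectionConfig": {
            "languageCodes": list(language_codes),
            "model": model,
        },
    }


def classification_feature(model):
    """Build a CLASSIFICATION feature entry for the given model name."""
    if len(model) > 256:
        raise ValueError("model name must be at most 256 characters")
    return {"type": "CLASSIFICATION", "classificationConfig": {"model": model}}


# Each entry carries exactly one of the two config objects.
features = [text_detection_feature(["ru", "en"]), classification_feature("quality")]
```

Keeping one builder per feature type makes it hard to put both `classificationConfig` and `textDetectionConfig` into the same entry by accident.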

analyzeSpecs[].mimeType string

MIME type of content (for example, application/pdf).

The maximum string length in characters is 255.

analyzeSpecs[].content string (byte)
analyzeSpecs[] includes only one of the fields content, signature

Image content, represented as a stream of bytes. Note: As with all bytes fields, Protocol Buffers use a pure binary representation, whereas JSON representations use Base64.

The maximum string length in characters is 10485760.
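For JSON requests, the image bytes therefore need to be Base64-encoded before they go into `content`. A minimal sketch using the Python standard library follows; the helper name `encode_content` is hypothetical.

```python
import base64

MAX_CONTENT_CHARS = 10485760  # documented maximum length of the `content` string


def encode_content(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes for the JSON `content` field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    if len(encoded) > MAX_CONTENT_CHARS:
        raise ValueError("encoded content exceeds the documented length limit")
    return encoded
```

Base64 output is about one third larger than the input, so the string limit and the file size limit are checked separately.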

analyzeSpecs[].signature string
analyzeSpecs[] includes only one of the fields content, signature

The maximum string length in characters is 16384.

folderId string

ID of the folder to which you have access. Required for authorization with a user account (see UserAccount resource). Don't specify this field if you make the request on behalf of a service account.

The maximum string length in characters is 50.
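Putting the body parameters together, the sketch below assembles a request for text detection on one image. It is an illustration under stated assumptions: the function name is hypothetical, the language code `en` is an arbitrary choice, and the `Authorization: Bearer <IAM token>` header is an assumption because this page does not describe authentication (see the "Authentication in the API" section).

```python
import base64
import json
import urllib.request

URL = "https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze"


def build_request(image_bytes, iam_token, folder_id=None):
    """Return a urllib Request that asks for text detection on one image."""
    body = {
        "analyzeSpecs": [
            {
                "content": base64.b64encode(image_bytes).decode("ascii"),
                "features": [
                    {
                        "type": "TEXT_DETECTION",
                        "textDetectionConfig": {"languageCodes": ["en"]},
                    }
                ],
            }
        ]
    }
    # folderId is only needed for user accounts; omit it for service accounts.
    if folder_id is not None:
        body["folderId"] = folder_id
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: IAM token passed as a Bearer token.
            "Authorization": f"Bearer {iam_token}",
        },
        method="POST",
    )


# Sending the request needs network access and valid credentials:
# with urllib.request.urlopen(build_request(data, token, folder)) as resp:
#     response_body = json.load(resp)
```

Separating request construction from sending keeps the body easy to inspect and test without calling the service.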

Response

HTTP Code: 200 - OK

{
  "results": [
    {
      "results": [
        {
          "error": {
            "code": "integer",
            "message": "string",
            "details": [
              "object"
            ]
          },

          // `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`, `imageCopySearch`
          "textDetection": {
            "pages": [
              {
                "width": "string",
                "height": "string",
                "blocks": [
                  {
                    "boundingBox": {
                      "vertices": [
                        {
                          "x": "string",
                          "y": "string"
                        }
                      ]
                    },
                    "lines": [
                      {
                        "boundingBox": {
                          "vertices": [
                            {
                              "x": "string",
                              "y": "string"
                            }
                          ]
                        },
                        "words": [
                          {
                            "boundingBox": {
                              "vertices": [
                                {
                                  "x": "string",
                                  "y": "string"
                                }
                              ]
                            },
                            "text": "string",
                            "confidence": "number",
                            "languages": [
                              {
                                "languageCode": "string",
                                "confidence": "number"
                              }
                            ],
                            "entityIndex": "string"
                          }
                        ],
                        "confidence": "number"
                      }
                    ]
                  }
                ],
                "entities": [
                  {
                    "name": "string",
                    "text": "string"
                  }
                ]
              }
            ]
          },
          "classification": {
            "properties": [
              {
                "name": "string",
                "probability": "number"
              }
            ]
          },
          "faceDetection": {
            "faces": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": "string",
                      "y": "string"
                    }
                  ]
                }
              }
            ]
          },
          "imageCopySearch": {
            "copyCount": "string",
            "topResults": [
              {
                "imageUrl": "string",
                "pageUrl": "string",
                "title": "string",
                "description": "string"
              }
            ]
          },
          // end of the list of possible fields `results[].results[]`

        }
      ],
      "error": {
        "code": "integer",
        "message": "string",
        "details": [
          "object"
        ]
      }
    }
  ]
}
Field Description
results[] object

Request results. Results have the same order as specifications in the request.

results[].results[] object

Results for each requested feature. Feature results have the same order as in the request.
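Because both levels of results keep the request order, a response can be matched back to the request by position. The sketch below shows one way to do that; the function name and the tuple layout are hypothetical choices.

```python
def pair_results(request_body, response_body):
    """Yield (spec_index, feature_type, feature_result) by position."""
    specs = request_body["analyzeSpecs"]
    file_results = response_body.get("results", [])
    for index, (spec, file_result) in enumerate(zip(specs, file_results)):
        feature_results = file_result.get("results", [])
        for feature, feature_result in zip(spec["features"], feature_results):
            yield index, feature["type"], feature_result


request_body = {
    "analyzeSpecs": [
        {"features": [{"type": "TEXT_DETECTION"}, {"type": "FACE_DETECTION"}]}
    ]
}
response_body = {
    "results": [{"results": [{"textDetection": {}}, {"faceDetection": {}}]}]
}
pairs = list(pair_results(request_body, response_body))
```

Here `pairs` holds one entry per requested feature, each labeled with the feature type taken from the request.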

results[].results[].error object
Error returned if processing of the specified feature fails.

The error result of the operation in case of failure or cancellation.

results[].results[].error.code integer (int32)

Error code. An enum value of google.rpc.Code.
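The numeric values of google.rpc.Code are fixed (0 through 16), so a small lookup table is enough to turn `code` into a readable name. The helper name `code_name` is hypothetical.

```python
# Names of google.rpc.Code values, indexed by their numeric code.
RPC_CODE_NAMES = (
    "OK", "CANCELLED", "UNKNOWN", "INVALID_ARGUMENT", "DEADLINE_EXCEEDED",
    "NOT_FOUND", "ALREADY_EXISTS", "PERMISSION_DENIED", "RESOURCE_EXHAUSTED",
    "FAILED_PRECONDITION", "ABORTED", "OUT_OF_RANGE", "UNIMPLEMENTED",
    "INTERNAL", "UNAVAILABLE", "DATA_LOSS", "UNAUTHENTICATED",
)


def code_name(code: int) -> str:
    """Return the google.rpc.Code name for a numeric error code."""
    if 0 <= code < len(RPC_CODE_NAMES):
        return RPC_CODE_NAMES[code]
    return f"UNKNOWN_CODE_{code}"
```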

results[].results[].error.message string

An error message.

results[].results[].error.details[] object

A list of messages that carry the error details.

results[].results[].textDetection object
Text detection (OCR) result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch

results[].results[].textDetection.pages[] object

Pages of the recognized file.

For JPEG and PNG files, contains only one page.

results[].results[].textDetection.pages[].width string (int64)

Page width in pixels.

results[].results[].textDetection.pages[].height string (int64)

Page height in pixels.

results[].results[].textDetection.pages[].blocks[] object

Recognized text blocks in this page.

results[].results[].textDetection.pages[].blocks[].boundingBox object

Area on the page where the text block is located.

results[].results[].textDetection.pages[].blocks[].boundingBox.vertices[] object

The bounding polygon vertices.

results[].results[].textDetection.pages[].blocks[].boundingBox.vertices[].x string (int64)

X coordinate in pixels.

results[].results[].textDetection.pages[].blocks[].boundingBox.vertices[].y string (int64)

Y coordinate in pixels.

results[].results[].textDetection.pages[].blocks[].lines[] object

Recognized lines in this block.

results[].results[].textDetection.pages[].blocks[].lines[].boundingBox object

Area on the page where the line is located.

results[].results[].textDetection.pages[].blocks[].lines[].boundingBox.vertices[] object

The bounding polygon vertices.

results[].results[].textDetection.pages[].blocks[].lines[].boundingBox.vertices[].x string (int64)

X coordinate in pixels.

results[].results[].textDetection.pages[].blocks[].lines[].boundingBox.vertices[].y string (int64)

Y coordinate in pixels.

results[].results[].textDetection.pages[].blocks[].lines[].words[] object

Recognized words in this line.

results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox object

Area on the page where the word is located.

results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox.vertices[] object

The bounding polygon vertices.

results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox.vertices[].x string (int64)

X coordinate in pixels.

results[].results[].textDetection.pages[].blocks[].lines[].words[].boundingBox.vertices[].y string (int64)

Y coordinate in pixels.

results[].results[].textDetection.pages[].blocks[].lines[].words[].text string

Recognized word value.

results[].results[].textDetection.pages[].blocks[].lines[].words[].confidence number (double)

Confidence of the OCR results for the word. Range [0, 1].

results[].results[].textDetection.pages[].blocks[].lines[].words[].languages[] object

A list of detected languages together with confidence.

results[].results[].textDetection.pages[].blocks[].lines[].words[].languages[].languageCode string

Detected language code.

results[].results[].textDetection.pages[].blocks[].lines[].words[].languages[].confidence number (double)

Confidence of detected language. Range [0, 1].

results[].results[].textDetection.pages[].blocks[].lines[].words[].entityIndex string (int64)

ID of the recognized word in the entities array.

results[].results[].textDetection.pages[].blocks[].lines[].confidence number (double)

Confidence of the OCR results for the line. Range [0, 1].

results[].results[].textDetection.pages[].entities[] object

Recognized entities.

results[].results[].textDetection.pages[].entities[].name string

Entity name.

results[].results[].textDetection.pages[].entities[].text string

Recognized entity text.
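The text detection result is nested as pages, blocks, lines, and words. The sketch below walks that structure to rebuild each line of text and to read the page size. The function names are hypothetical, and joining words with a single space is an assumption about how the text should be reassembled. Note that int64 fields such as width and height arrive as strings and need converting.

```python
def page_lines(page):
    """Return the recognized text of a page, one string per line."""
    lines = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            # Assumption: words within a line are separated by one space.
            lines.append(" ".join(words))
    return lines


def page_size(page):
    """Return (width, height) in pixels; int64 values arrive as strings."""
    return int(page["width"]), int(page["height"])


def page_entities(page):
    """Return recognized entities as a {name: text} dictionary."""
    return {entity["name"]: entity["text"] for entity in page.get("entities", [])}


sample_page = {
    "width": "800",
    "height": "600",
    "blocks": [
        {"lines": [{"words": [{"text": "Hello"}, {"text": "world"}]}]},
        {"lines": [{"words": [{"text": "Bye"}]}]},
    ],
    "entities": [{"name": "greeting", "text": "Hello world"}],
}
```

For the sample page, `page_lines` returns the two lines "Hello world" and "Bye".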

results[].results[].classification object
Classification result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch

results[].results[].classification.properties[] object

Properties extracted by a specified model.

For example, if you ask to evaluate the image quality, the service could return such properties as good and bad.

results[].results[].classification.properties[].name string

Property name.

results[].results[].classification.properties[].probability number (double)

Probability of the property, from 0 to 1.
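A common way to use the classification result is to pick the property with the highest probability. The sketch below does that; the function name and the 0.5 threshold are hypothetical, and the property names `good` and `bad` follow the example given in the description above.

```python
def best_property(classification, threshold=0.5):
    """Return the most probable property name, or None if below the threshold."""
    properties = classification.get("properties", [])
    if not properties:
        return None
    top = max(properties, key=lambda prop: prop["probability"])
    return top["name"] if top["probability"] >= threshold else None


sample = {
    "properties": [
        {"name": "good", "probability": 0.91},
        {"name": "bad", "probability": 0.09},
    ]
}
```

The threshold is an application-level choice; the service itself only reports probabilities.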

results[].results[].faceDetection object
Face detection result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch

results[].results[].faceDetection.faces[] object

An array of detected faces for the specified image.

results[].results[].faceDetection.faces[].boundingBox object

Area on the image where the face is located.

results[].results[].faceDetection.faces[].boundingBox.vertices[] object

The bounding polygon vertices.

results[].results[].faceDetection.faces[].boundingBox.vertices[].x string (int64)

X coordinate in pixels.

results[].results[].faceDetection.faces[].boundingBox.vertices[].y string (int64)

Y coordinate in pixels.
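Bounding boxes are returned as polygon vertices with string-encoded coordinates. The sketch below converts one box into an axis-aligned rectangle. The function names are hypothetical, and treating a missing coordinate as 0 is an assumption based on the usual proto3 JSON behavior of omitting default values.

```python
def bounding_rect(bounding_box):
    """Return (left, top, right, bottom) in pixels for a bounding polygon."""
    vertices = bounding_box["vertices"]
    # Assumption: a coordinate omitted from the JSON means 0.
    xs = [int(vertex.get("x", 0)) for vertex in vertices]
    ys = [int(vertex.get("y", 0)) for vertex in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def face_rects(face_detection):
    """Return one rectangle per detected face."""
    return [bounding_rect(face["boundingBox"]) for face in face_detection.get("faces", [])]


sample = {
    "faces": [
        {
            "boundingBox": {
                "vertices": [
                    {"x": "10", "y": "20"},
                    {"x": "10", "y": "120"},
                    {"x": "90", "y": "120"},
                    {"x": "90", "y": "20"},
                ]
            }
        }
    ]
}
```

The same `bounding_rect` helper applies to the block, line, and word bounding boxes in text detection results, which use the same vertex layout.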

results[].results[].imageCopySearch object
Image Copy Search result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection, imageCopySearch

results[].results[].imageCopySearch.copyCount string (int64)

Number of image copies.

results[].results[].imageCopySearch.topResults[] object

Most relevant results of the image copy search.

results[].results[].imageCopySearch.topResults[].imageUrl string

URL of the image.

results[].results[].imageCopySearch.topResults[].pageUrl string

URL of the page that contains the image.

results[].results[].imageCopySearch.topResults[].title string

Title of the page that contains the image.

results[].results[].imageCopySearch.topResults[].description string

Image description.

results[].error object

Error returned if processing of the file fails.

The error result of the operation in case of failure or cancellation.

results[].error.code integer (int32)

Error code. An enum value of google.rpc.Code.

results[].error.message string

An error message.

results[].error.details[] object

A list of messages that carry the error details.
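Errors can appear at two levels: for a whole file (`results[].error`) and for a single feature (`results[].results[].error`), so a 200 response may still contain failures. The sketch below gathers both kinds; the function name and the tuple layout are hypothetical.

```python
def collect_errors(response_body):
    """Return (file_index, feature_index, code, message) for every error found.

    feature_index is None for file-level errors.
    """
    errors = []
    for file_index, file_result in enumerate(response_body.get("results", [])):
        file_error = file_result.get("error")
        if file_error:
            errors.append(
                (file_index, None, file_error.get("code"), file_error.get("message", ""))
            )
        for feature_index, feature_result in enumerate(file_result.get("results", [])):
            feature_error = feature_result.get("error")
            if feature_error:
                errors.append(
                    (
                        file_index,
                        feature_index,
                        feature_error.get("code"),
                        feature_error.get("message", ""),
                    )
                )
    return errors


sample = {
    "results": [
        {"results": [{"textDetection": {}}, {"error": {"code": 3, "message": "bad model"}}]},
        {"error": {"code": 13, "message": "cannot read file"}},
    ]
}
```

Checking both levels after every call avoids treating a partially failed batch as a complete success.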

© 2021 Yandex.Cloud LLC