Method batchAnalyze

Analyzes a batch of images and returns results with annotations.

HTTP request

POST https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze
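
As a minimal sketch of preparing this call, the snippet below builds (but does not send) a POST request with a JSON body using Python's standard library. The helper name `build_request` is hypothetical, and the bearer-token `Authorization` header is an assumption: this page does not describe how requests are authenticated.

```python
import json
import urllib.request

BATCH_ANALYZE_URL = "https://vision.api.cloud.yandex.net/vision/v1/batchAnalyze"


def build_request(body, token):
    """Prepare (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        BATCH_ANALYZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authorization; not stated in this reference.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.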

Body parameters

{
  "analyzeSpecs": [
    {
      "features": [
        {
          "type": "string",

          // `analyzeSpecs[].features[]` includes only one of the fields `classificationConfig`, `textDetectionConfig`
          "classificationConfig": {
            "model": "string"
          },
          "textDetectionConfig": {
            "languageCodes": [
              "string"
            ],
            "model": "string"
          },
          // end of the list of possible fields `analyzeSpecs[].features[]`

        }
      ],
      "mimeType": "string",
      "content": "string"
    }
  ],
  "folderId": "string"
}
Field Description
analyzeSpecs[] object

Required. A list of specifications. Each specification contains the file to analyze and features to use for analysis.

Restrictions:

  • Supported file formats: JPEG, PNG.
  • Maximum file size: 1 MB.
  • Image size must not exceed 20 megapixels (width × height).

The number of elements must be in the range 1-8.
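
The restrictions above could be checked on the client before a request is sent. The following is a sketch only: the function name and the shape of the `files` records are hypothetical, and it assumes "1 MB" means 1,048,576 bytes and "20M pixels" means 20,000,000 pixels, neither of which this page spells out.

```python
MAX_SPECS = 8
MAX_FILE_BYTES = 1024 * 1024  # assumption: "1 MB" read as 1,048,576 bytes
MAX_PIXELS = 20_000_000       # assumption: "20M pixels" read as 20,000,000


def check_batch(files):
    """Return a list of problems; an empty list means no limit was exceeded.

    Each item in `files` is a hypothetical local record (a dict) with the
    keys `size_bytes`, `width`, and `height`.
    """
    problems = []
    if not 1 <= len(files) <= MAX_SPECS:
        problems.append(f"batch has {len(files)} files, expected 1-{MAX_SPECS}")
    for index, info in enumerate(files):
        if info["size_bytes"] > MAX_FILE_BYTES:
            problems.append(f"file {index}: larger than 1 MB")
        if info["width"] * info["height"] > MAX_PIXELS:
            problems.append(f"file {index}: more than 20M pixels")
    return problems
```

The check does not verify the file format; detecting JPEG or PNG would need an image library or a look at the file signature.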

analyzeSpecs[].
features[]
object

Required. Requested features to use for analysis.

The maximum number of requested features per file is 8.

The number of elements must be in the range 1-8.

analyzeSpecs[].
features[].
type
string
Type of requested feature.
  • TEXT_DETECTION: Text detection (OCR) feature.
  • CLASSIFICATION: Classification feature.
  • FACE_DETECTION: Face detection feature.
analyzeSpecs[].
features[].
classificationConfig
object
Required for the CLASSIFICATION type. Specifies configuration for the classification feature.
analyzeSpecs[].features[] includes only one of the fields classificationConfig, textDetectionConfig

analyzeSpecs[].
features[].
classificationConfig.
model
string

The model to use for the image analysis.

The maximum string length in characters is 256.

analyzeSpecs[].
features[].
textDetectionConfig
object
Required for the TEXT_DETECTION type. Specifies configuration for the text detection (OCR) feature.
analyzeSpecs[].features[] includes only one of the fields classificationConfig, textDetectionConfig

analyzeSpecs[].
features[].
textDetectionConfig.
languageCodes[]
string

Required. List of languages to recognize in the text, specified in ISO 639-1 format (for example, ru).

The number of elements must be in the range 1-8. The maximum string length in characters for each value is 3.

analyzeSpecs[].
features[].
textDetectionConfig.
model
string

Do not specify this field: custom models are not supported yet.

The maximum string length in characters is 50.

analyzeSpecs[].
mimeType
string

MIME type of content (for example, application/pdf).

The maximum string length in characters is 255.

analyzeSpecs[].
content
string (byte)

Image content, represented as a stream of bytes. Note: As with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use Base64.

The maximum string length in characters is 1048576.

folderId string

ID of the folder to which you have access. Required for authorization with a user account (see UserAccount resource). Don't specify this field if you make the request on behalf of a service account.

The maximum string length in characters is 50.
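
Putting the body parameters together, a request body might be assembled as in the sketch below. The helper names `make_spec` and `make_body` are hypothetical; the field names come from the schema above, and the image bytes are Base64-encoded because the body is sent as JSON.

```python
import base64
import json


def make_spec(image_bytes, mime_type, features):
    """Build one analyzeSpecs[] element from raw image bytes."""
    return {
        "features": features,
        "mimeType": mime_type,
        # JSON cannot carry raw bytes, so the content is Base64-encoded.
        "content": base64.b64encode(image_bytes).decode("ascii"),
    }


def make_body(specs, folder_id=None):
    """Build the request body; folderId is omitted when not provided."""
    body = {"analyzeSpecs": specs}
    if folder_id is not None:
        body["folderId"] = folder_id
    return body


# Example feature list: OCR for two languages plus face detection.
example_features = [
    {"type": "TEXT_DETECTION", "textDetectionConfig": {"languageCodes": ["en", "ru"]}},
    {"type": "FACE_DETECTION"},
]
```

For example, `json.dumps(make_body([make_spec(data, "image/png", example_features)]))` would produce the JSON text to send, where `data` holds the bytes of a PNG file.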

Response

HTTP Code: 200 - OK

{
  "results": [
    {
      "results": [
        {
          "error": {
            "code": "integer",
            "message": "string",
            "details": [
              "object"
            ]
          },

          // `results[].results[]` includes only one of the fields `textDetection`, `classification`, `faceDetection`
          "textDetection": {
            "pages": [
              {
                "width": "string",
                "height": "string",
                "blocks": [
                  {
                    "boundingBox": {
                      "vertices": [
                        {
                          "x": "string",
                          "y": "string"
                        }
                      ]
                    },
                    "lines": [
                      {
                        "boundingBox": {
                          "vertices": [
                            {
                              "x": "string",
                              "y": "string"
                            }
                          ]
                        },
                        "words": [
                          {
                            "boundingBox": {
                              "vertices": [
                                {
                                  "x": "string",
                                  "y": "string"
                                }
                              ]
                            },
                            "text": "string",
                            "confidence": "number",
                            "languages": [
                              {
                                "languageCode": "string",
                                "confidence": "number"
                              }
                            ]
                          }
                        ],
                        "confidence": "number"
                      }
                    ]
                  }
                ]
              }
            ]
          },
          "classification": {
            "properties": [
              {
                "name": "string",
                "probability": "number"
              }
            ]
          },
          "faceDetection": {
            "faces": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": "string",
                      "y": "string"
                    }
                  ]
                }
              }
            ]
          },
          // end of the list of possible fields `results[].results[]`

        }
      ],
      "error": {
        "code": "integer",
        "message": "string",
        "details": [
          "object"
        ]
      }
    }
  ]
}
Field Description
results[] object

Request results. Results have the same order as specifications in the request.

results[].
results[]
object

Results for each requested feature. Feature results have the same order as in the request.
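
Because both levels of results keep the request order, a response can be matched back to the request by position. The sketch below assumes the request and response bodies are already parsed into Python dictionaries; `pair_results` is a hypothetical helper name.

```python
def pair_results(request_body, response_body):
    """Yield (spec_index, feature_type, feature_result) tuples by position."""
    specs = request_body["analyzeSpecs"]
    file_results = response_body.get("results", [])
    for spec_index, (spec, file_result) in enumerate(zip(specs, file_results)):
        feature_results = file_result.get("results", [])
        # Feature results follow the order of features[] in the request.
        for feature, feature_result in zip(spec["features"], feature_results):
            yield spec_index, feature["type"], feature_result
```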

results[].
results[].
error
object
Error returned if processing of the specified feature fails.

The error result of the operation in case of failure or cancellation.

results[].
results[].
error.
code
integer (int32)

Error code. An enum value of google.rpc.Code.

results[].
results[].
error.
message
string

An error message.

results[].
results[].
error.
details[]
object

A list of messages that carry the error details.
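
Since `code` is a `google.rpc.Code` value, it can be turned into a readable name on the client. The enum below lists the standard `google.rpc.Code` values; the `describe_error` helper and its output format are illustrative assumptions, not part of this API.

```python
from enum import IntEnum


class RpcCode(IntEnum):
    """Numeric values of google.rpc.Code."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


def describe_error(error):
    """Format an error object as 'CODE_NAME: message'."""
    code = error.get("code", 0)
    try:
        name = RpcCode(code).name
    except ValueError:
        # Keep unknown numeric codes visible instead of failing.
        name = f"UNRECOGNIZED({code})"
    return f"{name}: {error.get('message', '')}"
```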

results[].
results[].
textDetection
object
Text detection (OCR) result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection

results[].
results[].
textDetection.
pages[]
object

Pages of the recognized file.

For JPEG and PNG files, contains only one page.

results[].
results[].
textDetection.
pages[].
width
string (int64)

Page width in pixels.

results[].
results[].
textDetection.
pages[].
height
string (int64)

Page height in pixels.

results[].
results[].
textDetection.
pages[].
blocks[]
object

Recognized text blocks in this page.

results[].
results[].
textDetection.
pages[].
blocks[].
boundingBox
object

Area on the page where the text block is located.

results[].
results[].
textDetection.
pages[].
blocks[].
boundingBox.
vertices[]
object

The bounding polygon vertices.

results[].
results[].
textDetection.
pages[].
blocks[].
boundingBox.
vertices[].
x
string (int64)

X coordinate in pixels.

results[].
results[].
textDetection.
pages[].
blocks[].
boundingBox.
vertices[].
y
string (int64)

Y coordinate in pixels.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[]
object

Recognized lines in this block.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
boundingBox
object

Area on the page where the line is located.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
boundingBox.
vertices[]
object

The bounding polygon vertices.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
boundingBox.
vertices[].
x
string (int64)

X coordinate in pixels.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
boundingBox.
vertices[].
y
string (int64)

Y coordinate in pixels.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[]
object

Recognized words in this line.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
boundingBox
object

Area on the page where the word is located.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
boundingBox.
vertices[]
object

The bounding polygon vertices.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
boundingBox.
vertices[].
x
string (int64)

X coordinate in pixels.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
boundingBox.
vertices[].
y
string (int64)

Y coordinate in pixels.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
text
string

Recognized word value.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
confidence
number (double)

Confidence of the OCR results for the word. Range [0, 1].

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
languages[]
object

A list of detected languages together with confidence.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
languages[].
languageCode
string

Detected language code.

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
words[].
languages[].
confidence
number (double)

Confidence of detected language. Range [0, 1].

results[].
results[].
textDetection.
pages[].
blocks[].
lines[].
confidence
number (double)

Confidence of the OCR results for the line. Range [0, 1].
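
The nested pages, blocks, lines, and words structure can be flattened into plain text lines. The sketch below is one way to do that; `page_lines` is a hypothetical helper, joining words with a single space is an assumption about how the text should be rebuilt, and the optional threshold simply drops words whose `confidence` is below it.

```python
def page_lines(text_detection, min_confidence=0.0):
    """Return recognized lines as strings, in document order."""
    lines = []
    for page in text_detection.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [
                    word["text"]
                    for word in line.get("words", [])
                    if word.get("confidence", 0.0) >= min_confidence
                ]
                if words:  # skip lines where every word was filtered out
                    lines.append(" ".join(words))
    return lines
```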

results[].
results[].
classification
object
Classification result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection

results[].
results[].
classification.
properties[]
object

Properties extracted by a specified model.

For example, if you ask to evaluate the image quality, the service could return such properties as good and bad.

results[].
results[].
classification.
properties[].
name
string

Property name.

results[].
results[].
classification.
properties[].
probability
number (double)

Probability of the property, from 0 to 1.
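
A common way to use a classification result is to pick the property with the highest probability. A minimal sketch, with the hypothetical helper name `top_property`:

```python
def top_property(classification):
    """Return (name, probability) of the most probable property, or None."""
    properties = classification.get("properties", [])
    if not properties:
        return None
    best = max(properties, key=lambda item: item["probability"])
    return best["name"], best["probability"]
```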

results[].
results[].
faceDetection
object
Face detection result.
results[].results[] includes only one of the fields textDetection, classification, faceDetection

results[].
results[].
faceDetection.
faces[]
object

An array of detected faces for the specified image.

results[].
results[].
faceDetection.
faces[].
boundingBox
object

Area on the image where the face is located.

results[].
results[].
faceDetection.
faces[].
boundingBox.
vertices[]
object

The bounding polygon vertices.

results[].
results[].
faceDetection.
faces[].
boundingBox.
vertices[].
x
string (int64)

X coordinate in pixels.

results[].
results[].
faceDetection.
faces[].
boundingBox.
vertices[].
y
string (int64)

Y coordinate in pixels.
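
Coordinates are int64 values, which the JSON representation carries as strings, so they need converting before any arithmetic. The sketch below turns each face's bounding polygon into an axis-aligned rectangle; `face_rects` is a hypothetical helper, and reducing the polygon to its minimum and maximum coordinates is a simplification chosen for illustration.

```python
def face_rects(face_detection):
    """Return one (left, top, right, bottom) tuple of ints per detected face."""
    rects = []
    for face in face_detection.get("faces", []):
        vertices = face["boundingBox"]["vertices"]
        # x and y arrive as strings in JSON; convert them to integers.
        xs = [int(vertex["x"]) for vertex in vertices]
        ys = [int(vertex["y"]) for vertex in vertices]
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects
```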

results[].
error
object

Error returned if processing of the file fails.

The error result of the operation in case of failure or cancellation.

results[].
error.
code
integer (int32)

Error code. An enum value of google.rpc.Code.

results[].
error.
message
string

An error message.

results[].
error.
details[]
object

A list of messages that carry the error details.
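
Errors can appear at two levels: for a whole file (`results[].error`) and for a single feature (`results[].results[].error`). The sketch below gathers both into one list. The helper name `collect_errors` is hypothetical, and it assumes the `error` field is absent or empty when processing succeeded.

```python
def collect_errors(response_body):
    """Return (file_index, feature_index, message) tuples for every error.

    feature_index is None for file-level errors.
    """
    errors = []
    for file_index, file_result in enumerate(response_body.get("results", [])):
        file_error = file_result.get("error")
        if file_error:
            errors.append((file_index, None, file_error.get("message", "")))
        for feature_index, feature_result in enumerate(file_result.get("results", [])):
            feature_error = feature_result.get("error")
            if feature_error:
                errors.append((file_index, feature_index, feature_error.get("message", "")))
    return errors
```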