Vision API, gRPC: VisionService
Updated at September 8, 2023
A set of methods for the Vision service.
Call | Description |
---|---|
BatchAnalyze | Analyzes a batch of images and returns results with annotations. |
Calls VisionService
BatchAnalyze
Analyzes a batch of images and returns results with annotations.
rpc BatchAnalyze (BatchAnalyzeRequest) returns (BatchAnalyzeResponse)
BatchAnalyzeRequest
Field | Description |
---|---|
analyze_specs[] | AnalyzeSpec A list of specifications. Each specification contains the file to analyze and features to use for analysis. Restrictions:
|
folder_id | string ID of the folder to which you have access. Required for authorization with a user account (see yandex.cloud.iam.v1.UserAccount resource). Don't specify this field if you make the request on behalf of a service account. The maximum string length in characters is 50. |
AnalyzeSpec
Field | Description |
---|---|
source | oneof: content or signature |
content | bytes Image content, represented as a stream of bytes. Note: As with all bytes fields, protocol buffers use a pure binary representation, whereas JSON representations use base64. The maximum string length in characters is 10485760. |
signature | string The maximum string length in characters is 16384. |
features[] | Feature Requested features to use for analysis. The number of elements must be in the range 1-8. |
mime_type | string MIME type of the content (for example, application/pdf). The maximum string length in characters is 255. |
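The limits in the AnalyzeSpec table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service's SDK: `build_analyze_spec` is a hypothetical helper, the dictionary mirrors the field names from the table in JSON form, and it assumes the 10485760 limit applies to the raw image bytes rather than to the base64 text.

```python
import base64

MAX_CONTENT_BYTES = 10485760  # limit stated for the content field (assumed to mean raw bytes)
MAX_FEATURES = 8              # upper bound stated for features[]


def build_analyze_spec(image_bytes, features, mime_type=None):
    """Return a JSON-ready dict shaped like AnalyzeSpec (hypothetical helper)."""
    if len(image_bytes) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds 10485760 bytes")
    if not 1 <= len(features) <= MAX_FEATURES:
        raise ValueError("features must contain 1 to 8 elements")
    spec = {
        # JSON carries bytes fields as base64 text; binary protobuf sends them raw.
        "content": base64.b64encode(image_bytes).decode("ascii"),
        "features": list(features),
    }
    if mime_type is not None:
        if len(mime_type) > 255:
            raise ValueError("mime_type exceeds 255 characters")
        spec["mime_type"] = mime_type
    return spec


# Usage: wrap the spec in a request body; the folder ID is a placeholder.
spec = build_analyze_spec(b"abc", [{"type": "TEXT_DETECTION"}], mime_type="image/png")
request_body = {"folder_id": "<your-folder-id>", "analyze_specs": [spec]}
```

When the generated gRPC stubs are used instead of JSON, the `content` field takes the raw bytes directly and no base64 step is needed.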
Feature
Field | Description |
---|---|
type | enum Type Type of requested feature.
|
config | oneof: classification_config or text_detection_config |
classification_config | FeatureClassificationConfig Required for the CLASSIFICATION type. Specifies configuration for the classification feature. |
text_detection_config | FeatureTextDetectionConfig Required for the TEXT_DETECTION type. Specifies configuration for the text detection (OCR) feature. |
FeatureClassificationConfig
Field | Description |
---|---|
model | string Model to use for image classification. The maximum string length in characters is 256. |
FeatureTextDetectionConfig
Field | Description |
---|---|
language_codes[] | string List of languages to recognize in the text. Specified in ISO 639-1 format (for example, ru). The number of elements must be in the range 1-8. The maximum string length in characters for each value is 3. |
model | string Model to use for text detection. Possible values:
|
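A Feature entry for text detection pairs the TEXT_DETECTION type with a FeatureTextDetectionConfig. The sketch below assembles one as a plain dictionary and enforces the documented limits on language_codes; `text_detection_feature` is a hypothetical helper, and the model value is passed through unchecked because the list of possible values is not reproduced here.

```python
def text_detection_feature(language_codes, model=None):
    """Build a Feature dict for the TEXT_DETECTION type (hypothetical helper)."""
    codes = list(language_codes)
    if not 1 <= len(codes) <= 8:
        raise ValueError("language_codes must contain 1 to 8 elements")
    for code in codes:
        # Each value is limited to 3 characters, for example "ru".
        if not 1 <= len(code) <= 3:
            raise ValueError(f"invalid language code: {code!r}")
    config = {"language_codes": codes}
    if model is not None:
        config["model"] = model  # not validated: possible values are service-defined
    return {"type": "TEXT_DETECTION", "text_detection_config": config}


feature = text_detection_feature(["ru", "en"])
```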
BatchAnalyzeResponse
Field | Description |
---|---|
results[] | AnalyzeResult Request results. Results have the same order as specifications in the request. |
AnalyzeResult
Field | Description |
---|---|
results[] | FeatureResult Results for each requested feature. Feature results have the same order as in the request. |
error | google.rpc.Status Error returned if the file could not be processed. |
FeatureResult
Field | Description |
---|---|
feature | oneof: text_detection, classification, face_detection or image_copy_search |
text_detection | TextAnnotation Text detection (OCR) result. |
classification | ClassAnnotation Classification result. |
face_detection | FaceAnnotation Face detection result. |
image_copy_search | ImageCopySearchAnnotation Image Copy Search result. |
error | google.rpc.Status Error returned if processing of the specified feature failed. |
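Because results keep the order of the request, a response can be matched back to its inputs by position, with errors checked at both the file level and the feature level. The following sketch assumes the response has already been decoded into plain dictionaries (for example, from its JSON form) in which unset fields are absent; `summarize_response` and the labels passed to it are illustrative names, not part of the API.

```python
FEATURE_KEYS = ("text_detection", "classification", "face_detection", "image_copy_search")


def summarize_response(spec_labels, response):
    """Return (label, feature_index, kind, payload) rows, matched by position."""
    rows = []
    for label, analyze_result in zip(spec_labels, response.get("results", [])):
        if "error" in analyze_result:
            # File-level failure: there are no per-feature results to inspect.
            rows.append((label, None, "error", analyze_result["error"].get("message", "")))
            continue
        for index, feature_result in enumerate(analyze_result.get("results", [])):
            if "error" in feature_result:
                rows.append((label, index, "error", feature_result["error"].get("message", "")))
                continue
            # Exactly one member of the `feature` oneof is expected to be set.
            kind = next((key for key in FEATURE_KEYS if key in feature_result), None)
            rows.append((label, index, kind, feature_result.get(kind) if kind else None))
    return rows


example_response = {
    "results": [
        {"results": [{"text_detection": {"pages": []}},
                     {"error": {"code": 3, "message": "bad model"}}]},
        {"error": {"code": 3, "message": "unreadable file"}},
    ]
}
rows = summarize_response(["a.png", "b.pdf"], example_response)
```

The `message` key read here is the standard field of google.rpc.Status; the error texts in the example are made up for illustration.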
TextAnnotation
Field | Description |
---|---|
pages[] | Page Pages of the recognized file. JPEG and PNG files contain only one page. |
Page
Field | Description |
---|---|
width | int64 Page width in pixels. |
height | int64 Page height in pixels. |
blocks[] | Block Recognized text blocks in this page. |
entities[] | Entity Recognized entities. |
Block
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the text block is located. |
lines[] | Line Recognized lines in this block. |
Polygon
Field | Description |
---|---|
vertices[] | Vertex The bounding polygon vertices. |
Vertex
Field | Description |
---|---|
x | int64 X coordinate in pixels. |
y | int64 Y coordinate in pixels. |
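A Polygon is a list of vertices, but many drawing and cropping tools expect an axis-aligned rectangle. The sketch below derives one from the vertices; `bounding_rect` is a hypothetical helper. It converts each coordinate with `int()` because the protobuf JSON mapping encodes int64 values as strings, while generated stubs return integers.

```python
def bounding_rect(polygon):
    """Return (left, top, right, bottom) in pixels for a Polygon dict."""
    vertices = polygon.get("vertices", [])
    if not vertices:
        raise ValueError("polygon has no vertices")
    # int() accepts both "10" (JSON form of int64) and 10 (native integer).
    xs = [int(vertex["x"]) for vertex in vertices]
    ys = [int(vertex["y"]) for vertex in vertices]
    return min(xs), min(ys), max(xs), max(ys)


box = bounding_rect({
    "vertices": [
        {"x": "10", "y": "20"},
        {"x": "10", "y": "60"},
        {"x": "110", "y": "60"},
        {"x": "110", "y": "20"},
    ]
})
```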
Line
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the line is located. |
words[] | Word Recognized words in this line. |
confidence | double Confidence of the OCR results for the line. Range [0, 1]. |
Word
Field | Description |
---|---|
bounding_box | Polygon Area on the page where the word is located. |
text | string Recognized word value. |
confidence | double Confidence of the OCR results for the word. Range [0, 1]. |
languages[] | DetectedLanguage A list of detected languages together with confidence. |
entity_index | int64 ID of the recognized word in the entities array. |
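The recognized text is nested as pages, blocks, lines, and words, so rebuilding plain text means walking that hierarchy. The sketch below joins the words of each line and drops words under a confidence threshold; `page_text` and the `min_confidence` parameter are illustrative choices, and the input is assumed to be a Page already decoded into a dictionary.

```python
def page_text(page, min_confidence=0.0):
    """Join recognized words into lines of text, one output line per Line."""
    lines_out = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            words = [
                word["text"]
                for word in line.get("words", [])
                # Confidence is documented as a value in the range [0, 1].
                if word.get("confidence", 0.0) >= min_confidence
            ]
            if words:
                lines_out.append(" ".join(words))
    return "\n".join(lines_out)


example_page = {
    "blocks": [
        {"lines": [
            {"words": [{"text": "Hello", "confidence": 0.98},
                       {"text": "w0rld", "confidence": 0.41}]},
            {"words": [{"text": "again", "confidence": 0.9}]},
        ]}
    ]
}
text = page_text(example_page, min_confidence=0.5)
```

Filtering at the word level is one design choice among several; the Line confidence could be used instead when whole lines should be kept or dropped together.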
DetectedLanguage
Field | Description |
---|---|
language_code | string Detected language code. |
confidence | double Confidence of detected language. Range [0, 1]. |
Entity
Field | Description |
---|---|
name | string Entity name. |
text | string Recognized entity text. |
ClassAnnotation
Field | Description |
---|---|
properties[] | Property Properties extracted by a specified model. For example, if you ask to evaluate the image quality, the service could return such properties as good and bad. |
Property
Field | Description |
---|---|
name | string Property name. |
probability | double Probability of the property, from 0 to 1. |
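A classification result is a flat list of named properties with probabilities, so the usual next step is to pick the most likely one. The following is a minimal sketch over a ClassAnnotation decoded into a dictionary; `top_property` is a hypothetical helper, and the property names reuse the good and bad example from the table above.

```python
def top_property(class_annotation):
    """Return (name, probability) of the most likely property, or None if empty."""
    properties = class_annotation.get("properties", [])
    if not properties:
        return None
    best = max(properties, key=lambda prop: prop["probability"])
    return best["name"], best["probability"]


best = top_property({
    "properties": [
        {"name": "good", "probability": 0.8},
        {"name": "bad", "probability": 0.2},
    ]
})
```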
FaceAnnotation
Field | Description |
---|---|
faces[] | Face An array of detected faces for the specified image. |
Face
Field | Description |
---|---|
bounding_box | Polygon Area on the image where the face is located. |
ImageCopySearchAnnotation
Field | Description |
---|---|
copy_count | int64 Number of image copies found. |
top_results[] | CopyMatch Most relevant results of the image copy search. |
CopyMatch
Field | Description |
---|---|
image_url | string URL of the image. |
page_url | string URL of the page that contains the image. |
title | string Title of the page that contains the image. |
description | string Image description. |