Yandex Monitoring metric reference
Updated at March 28, 2024
This section describes the database metrics delivered to Monitoring.
The metric name is written in the name label.
Resource usage metrics
Metric name | Type, units of measurement | Description, labels |
---|---|---|
resources.storage.used_bytes | IGAUGE, bytes | Size of user and service data stored in distributed network storage. Service data includes the data from the primary and secondary indexes. |
resources.storage.limit_bytes | IGAUGE, bytes | Limit on the size of user and service data that a database can store in distributed network storage. |
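
As an illustration of how the two storage metrics relate, the following minimal sketch computes the share of the storage limit currently in use. The function name and the sample values are hypothetical, and fetching the values from Monitoring is left out.

```python
def storage_usage_percent(used_bytes: int, limit_bytes: int) -> float:
    """Return resources.storage.used_bytes as a percentage of resources.storage.limit_bytes."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    return 100.0 * used_bytes / limit_bytes


# Hypothetical sample values: 30 GiB used out of a 40 GiB limit.
print(storage_usage_percent(30 * 1024**3, 40 * 1024**3))  # 75.0
```

A value approaching 100 means the database is close to its storage limit.
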
API metrics
Metric name | Type, units of measurement | Description, labels |
---|---|---|
api.grpc.request.bytes | RATE, bytes | Size of requests received by the database over a period. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery). |
api.grpc.request.dropped_count | RATE, number | Number of requests dropped at the transport (gRPC) layer due to an error. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery). |
api.grpc.request.inflight_count | IGAUGE, number | Number of requests concurrently processed by the database over a period. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery). |
api.grpc.request.inflight_bytes | IGAUGE, bytes | Size of requests concurrently processed by the database over a period. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery). |
api.grpc.response.bytes | RATE, bytes | Size of responses sent by the database over a period. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery). |
api.grpc.response.count | RATE, number | Number of responses sent by the database over a period. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery); status (query execution status; for more information about statuses, see Handling errors). |
api.grpc.response.dropped_count | RATE, number | Number of responses dropped at the transport (gRPC) layer due to an error. Labels: api_service (name of the gRPC API service, e.g., table); method (name of the gRPC API method, e.g., ExecuteDataQuery). |
api.grpc.response.issues | RATE, number | Number of errors of a certain type that occurred when executing queries over a period. Labels: issue_type (error type; the only value is optimistic_locks_invalidation). For more information about lock invalidation, see Transactions and queries to YDB. |
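
To show how the api_service and method labels split a single metric into separate series, here is a minimal sketch that sums sample values by one label. The sample list, its numeric values, and the sum_by_label helper are hypothetical; only the label names and the method names come from this reference.

```python
from collections import defaultdict

# Hypothetical samples of api.grpc.request.bytes; the values are made up for illustration.
samples = [
    {"api_service": "table", "method": "ExecuteDataQuery", "value": 1200.0},
    {"api_service": "table", "method": "BulkUpsert", "value": 800.0},
    {"api_service": "table", "method": "ExecuteDataQuery", "value": 300.0},
]


def sum_by_label(samples, label):
    """Sum sample values grouped by the value of the given label."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample[label]] += sample["value"]
    return dict(totals)


print(sum_by_label(samples, "method"))
```

Grouping by api_service instead of method would collapse all three samples into a single total for the table service.
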
Session metrics
Metric name | Type, units of measurement | Description, labels |
---|---|---|
table.session.active_count | IGAUGE, number | Number of client sessions currently running. |
table.session.closed_by_idle_count | RATE, number | Number of sessions closed by the database server over a certain period of time because they exceeded the lifetime allowed for an idle session. |
Transaction processing metrics
You can analyze a transaction's execution time using a histogram counter. The intervals are set in milliseconds. The chart shows the number of transactions whose duration falls within a certain time interval.
Metric name | Type, units of measurement | Description, labels |
---|---|---|
table.transaction.total_duration_milliseconds | HIST_RATE, number | Number of transactions of a certain duration on the server and client. The duration is counted from the explicit or implicit start of the transaction to its commit or rollback. It includes the transaction processing time on the server and the time on the client between sending different requests within the same transaction. Labels: tx_kind (transaction type; the possible values are read_only, read_write, write_only, and pure). |
table.transaction.server_duration_milliseconds | HIST_RATE, number | Number of transactions of a certain duration on the server. The duration is the time spent executing requests within a transaction on the server. It does not include the waiting time on the client between sending separate requests within a single transaction. Labels: tx_kind (transaction type; the possible values are read_only, read_write, write_only, and pure). |
table.transaction.client_duration_milliseconds | HIST_RATE, number | Number of transactions of a certain duration on the client. The duration is the waiting time on the client between sending individual requests within a single transaction. It does not include the time spent executing requests on the server. Labels: tx_kind (transaction type; the possible values are read_only, read_write, write_only, and pure). |
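
A histogram counter of this kind can be used to estimate a duration quantile. The sketch below is one simple way to do so, assuming the histogram is available as a list of bucket upper bounds and per-bucket transaction counts; the bucket bounds, the counts, and the function name are hypothetical, and the result is only the upper bound of the bucket that contains the quantile, not an exact duration.

```python
def quantile_upper_bound(bounds_ms, counts, q):
    """Return the upper bound (ms) of the histogram bucket that contains the q-quantile.

    bounds_ms: ascending upper bounds of the buckets, in milliseconds.
    counts: number of transactions that fell into each bucket.
    q: quantile in the range (0, 1].
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    threshold = q * total
    running = 0
    for bound, count in zip(bounds_ms, counts):
        running += count
        if running >= threshold:
            return bound
    return bounds_ms[-1]


# Hypothetical buckets and counts for table.transaction.total_duration_milliseconds.
bounds = [1, 5, 10, 50, 100, 500]
counts = [10, 40, 30, 15, 4, 1]
print(quantile_upper_bound(bounds, counts, 0.5))  # 5
print(quantile_upper_bound(bounds, counts, 0.9))  # 50
```

With these sample values, half of the transactions completed within 5 ms and 90% within 50 ms.
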
Query processing metrics
Metric name | Type, units of measurement | Description, labels |
---|---|---|
table.query.request.bytes | RATE, bytes | Size of YQL query text and query parameter values received by the database over a certain period of time. |
table.query.request.parameters_bytes | RATE, bytes | Size of query parameters received by the database over a certain period of time. |
table.query.response.bytes | RATE, bytes | Size of responses sent by the database over a certain period of time. |
table.query.compilation.latency_milliseconds | HIST_RATE, number | Histogram counter. The intervals are set in milliseconds. Shows the number of successfully completed query compilations whose duration falls within a certain time interval. |
table.query.compilation.active_count | IGAUGE, number | Number of compilations currently in progress. |
table.query.compilation.count | RATE, number | Number of compilations completed successfully over a certain period of time. |
table.query.compilation.errors | RATE, number | Number of compilations that failed over a certain period of time. |
table.query.compilation.cache_hits | RATE, number | Number of queries over a certain period of time that did not require compilation because a plan already existed in the cache of prepared queries. |
table.query.compilation.cache_misses | RATE, number | Number of queries over a certain period of time that required compilation. |
table.query.execution.latency_milliseconds | HIST_RATE, number | Histogram counter. The intervals are set in milliseconds. Shows the number of queries whose execution time falls within a certain interval. |
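
The cache_hits and cache_misses metrics can be combined into a hit ratio for the cache of prepared queries. This is a minimal sketch, assuming both values cover the same period and that every query counts as either a hit or a miss; the function name and sample values are hypothetical.

```python
def cache_hit_ratio(hits: float, misses: float) -> float:
    """Return the share of queries served from the cache of prepared queries."""
    total = hits + misses
    if total == 0:
        # No queries in the period, so there is nothing to measure.
        return 0.0
    return hits / total


# Hypothetical values of table.query.compilation.cache_hits and cache_misses.
print(cache_hit_ratio(90.0, 10.0))  # 0.9
```

A low ratio suggests that many queries are being compiled from scratch rather than reusing a cached plan.
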
Table partition metrics
Metric name | Type, units of measurement | Description, labels |
---|---|---|
table.datashard.row_count | GAUGE, number | Number of rows in DB tables. |
table.datashard.size_bytes | GAUGE, bytes | Size of data in DB tables. |
table.datashard.used_core_percents | HIST_GAUGE, % | Histogram counter. The intervals are set as a percentage. Shows the number of table partitions whose use of computing resources falls within a certain interval. |
table.datashard.read.rows | RATE, number | Number of rows read by all partitions of all DB tables over a certain period of time. |
table.datashard.read.bytes | RATE, bytes | Size of data read by all partitions of all DB tables over a certain period of time. |
table.datashard.write.rows | RATE, number | Number of rows written by all partitions of all DB tables over a certain period of time. |
table.datashard.write.bytes | RATE, bytes | Size of data written by all partitions of all DB tables over a certain period of time. |
table.datashard.scan.rows | RATE, number | Number of rows read through the StreamExecuteScanQuery or StreamReadTable gRPC API calls by all partitions of all DB tables over a certain period of time. |
table.datashard.scan.bytes | RATE, bytes | Size of data read through the StreamExecuteScanQuery or StreamReadTable gRPC API calls by all partitions of all DB tables over a certain period of time. |
table.datashard.bulk_upsert.rows | RATE, number | Number of rows added through the BulkUpsert gRPC API call to all partitions of all DB tables over a certain period of time. |
table.datashard.bulk_upsert.bytes | RATE, bytes | Size of data added through the BulkUpsert gRPC API call to all partitions of all DB tables over a certain period of time. |
table.datashard.erase.rows | RATE, number | Number of rows deleted from the database over a certain period of time. |
table.datashard.erase.bytes | RATE, bytes | Size of data deleted from the database over a certain period of time. |
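
As a small example of combining partition metrics, the sketch below derives an average row size from table.datashard.size_bytes and table.datashard.row_count. The function name and the sample values are hypothetical, and the result is only a rough average across all tables.

```python
def average_row_size_bytes(size_bytes: float, row_count: float) -> float:
    """Return the average number of bytes stored per row across all DB tables."""
    if row_count <= 0:
        raise ValueError("row_count must be positive")
    return size_bytes / row_count


# Hypothetical values: 2,000,000 bytes of data in 10,000 rows.
print(average_row_size_bytes(2_000_000, 10_000))  # 200.0
```
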
Resource usage metrics (for Dedicated mode only)
Metric name | Type, units of measurement | Description, labels |
---|---|---|
resources.cpu.used_core_percents | RATE, % | CPU usage. A value of 100 means that one core is fully used. The value may be greater than 100 for multi-core configurations. Labels: pool (computing pool; the possible values are user, system, batch, io, and ic). |
resources.cpu.limit_core_percents | IGAUGE, % | Percentage of CPU available to a database. For example, for a database of three nodes with 4 cores with pool=user in each node, the value of this metric is 1200. Labels: pool (computing pool; the possible values are user, system, batch, io, and ic). |
resources.memory.used_bytes | IGAUGE, bytes | Amount of RAM used by the database nodes. |
resources.memory.limit_bytes | IGAUGE, bytes | Amount of RAM available to the database nodes. |
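
The CPU limit example above (three nodes with 4 cores each giving 1200) can be reproduced with a short calculation. This sketch assumes each core contributes 100 to the limit, as the example implies; the function names and the sample usage value are hypothetical.

```python
def cpu_limit_core_percents(nodes: int, cores_per_node: int) -> int:
    """Return the expected resources.cpu.limit_core_percents for one pool."""
    return nodes * cores_per_node * 100


def cpu_utilization(used_core_percents: float, limit_core_percents: float) -> float:
    """Return the share of the available CPU that is in use, between 0 and 1."""
    if limit_core_percents <= 0:
        raise ValueError("limit_core_percents must be positive")
    return used_core_percents / limit_core_percents


limit = cpu_limit_core_percents(nodes=3, cores_per_node=4)
print(limit)  # 1200
# Hypothetical usage: three cores fully busy in the user pool.
print(cpu_utilization(300.0, limit))  # 0.25
```

Both values should be taken for the same pool label, since the limit and the usage are reported per pool.
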
Query processing metrics (for Dedicated mode only)
Metric name | Type, units of measurement | Description, labels |
---|---|---|
table.query.compilation.cache_evictions | RATE, number | Number of queries evicted from the cache of prepared queries. |
table.query.compilation.cache_size_bytes | IGAUGE, bytes | Size of the cache of prepared queries. |
table.query.compilation.cached_query_count | IGAUGE, number | Number of queries in the cache of prepared queries. |