General questions about Yandex Database
YDB is a fault-tolerant Distributed SQL DBMS. YDB provides high availability and scalability while ensuring strict consistency and ACID transaction support. Queries are written in an SQL dialect called YQL.
YDB is a fully managed database. DB instances are created through the YDB database management service.
What features does YDB provide?
YDB provides high availability and data security through synchronous replication across three availability zones. YDB also distributes load evenly across available hardware resources. This means that you don't need to order resources: Yandex Database automatically allocates and releases them based on user load.
What consistency model does YDB use?
For data reads, YDB uses a strict consistency model.
How do I design a primary key?
To design the primary key properly, follow the rules given below.
Avoid situations where the main load falls on a single partition of a table. With even load distribution, it's easier to achieve high overall performance.
This rule implies that you shouldn't use a monotonically increasing sequence, such as a timestamp, as a table's primary key.
The fewer table partitions a query touches, the faster it runs. For greater performance, follow the "one query, one partition" rule.
Avoid situations where a small part of the DB is under much heavier load than the rest of the DB.
For more information, see Schema design.
How do I evenly distribute the load across table partitions?
You can use the following techniques to evenly distribute the load across table partitions and increase overall DB performance.
To avoid using monotonically increasing primary key values, you can:
- Change the order of its components.
- Use a hash of the key column values as the primary key.
Reduce the number of partitions used in a single query.
For more information, see Techniques that let you evenly distribute the load across table partitions.
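As a rough illustration of why a hash helps, the sketch below compares how writes spread over a range-partitioned table when the key starts with a timestamp and when it starts with a hash of that timestamp. The partition boundaries, the `partition_of` and `hash_prefix` helpers, and the choice of SHA-256 are all assumptions made for this example; this is a plain Python simulation, not YDB code.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical split points for a table with four range partitions
# over an unsigned 64-bit key space.
BOUNDARIES = [2**62, 2**63, 3 * 2**62]


def partition_of(first_key_component: int) -> int:
    """Return the index of the range partition that holds the key."""
    return bisect.bisect_right(BOUNDARIES, first_key_component)


def hash_prefix(value: int) -> int:
    """Map a value to an unsigned 64-bit integer derived from SHA-256."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


# 1000 monotonically increasing timestamps (seconds).
timestamps = [1_700_000_000 + i for i in range(1000)]

# Key starts with the timestamp: every write lands in the same partition.
monotonic_load = Counter(partition_of(ts) for ts in timestamps)

# Key starts with a hash of the timestamp: writes spread across partitions.
hashed_load = Counter(partition_of(hash_prefix(ts)) for ts in timestamps)

print(dict(monotonic_load))  # {0: 1000}
print(sorted(hashed_load.items()))
```

The trade-off is that a hashed leading component gives up range scans over the original value, so it suits tables that are read mostly by exact key.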
Can I use NULL in a key column?
In YDB, all columns, including key ones, may contain a NULL value, but we don't recommend using NULL as a value in key columns.
According to the SQL standard (ISO/IEC 9075), you can't compare NULL with other values. Therefore, concise SQL statements with simple comparison operators may, for example, skip rows containing NULL during filtering.
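The filtering pitfall can be reproduced with any engine that follows standard NULL semantics. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so it shows SQLite behavior rather than YQL; the `items` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO items (id, tag) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, None)],
)

# For the row with tag = NULL the comparison evaluates to NULL, not true,
# so the row is filtered out.
simple = conn.execute(
    "SELECT id FROM items WHERE tag <> 'a' ORDER BY id"
).fetchall()

# Handling NULL explicitly keeps the row.
explicit = conn.execute(
    "SELECT id FROM items WHERE tag <> 'a' OR tag IS NULL ORDER BY id"
).fetchall()

print(simple)    # [(2,)]
print(explicit)  # [(2,), (3,)]
conn.close()
```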
Is there an optimal size of a database row?
To achieve high performance, we don't recommend writing rows larger than 8 MB or key columns larger than 2 KB to the DB.
For more information about limits, see Database limits.
How are secondary indexes used in YDB?
Secondary indexes in YDB are global and can be non-unique.
For more information, see Secondary indexes.
How is paginated output performed?
To organize paginated output, we recommend selecting data sorted by primary key sequentially, limiting the number of rows with the LIMIT keyword. We do not recommend using the OFFSET keyword to solve this problem.
For more information, see Paginated output.
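A minimal sketch of this approach: each page is selected with a condition on the last primary key value seen, plus LIMIT, instead of OFFSET. It runs against an in-memory SQLite database through Python's `sqlite3` module purely for illustration; the `events` table, the `fetch_page` helper, and the page size are assumptions, and the query text is SQLite SQL rather than YQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (id, payload) VALUES (?, ?)",
    [(i, f"event-{i}") for i in range(1, 8)],
)


def fetch_page(conn, last_id, page_size):
    """Return up to page_size rows whose key is greater than last_id."""
    return conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()


pages = []
last_id = 0  # Assumes all keys are positive.
while True:
    page = fetch_page(conn, last_id, 3)
    if not page:
        break
    pages.append([row[0] for row in page])
    last_id = page[-1][0]  # Continue from the last key seen.

print(pages)  # [[1, 2, 3], [4, 5, 6], [7]]
conn.close()
```

Because every page starts from a key value, the cost of fetching a page does not grow with the number of rows already read, which is the usual drawback of OFFSET.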
How do I effectively upload large amounts of data to YDB?
To speed up uploading large amounts of data, follow these recommendations:
When creating a table, explicitly specify the required number of partitions or their boundaries. This will help you effectively use system bandwidth as soon as you start uploading data by avoiding unnecessary re-partitioning of the table.
Don't insert data in separate transactions for each row. It's more efficient to insert multiple rows at once (batch inserts). This reduces the overhead on the transaction mechanism itself.
In addition to the previous step, within each transaction (batch), insert rows from a data set sorted by primary key to minimize the number of partitions that the transaction affects.
Avoid writing data sequentially in ascending or descending order of the primary key value to evenly distribute the load across all table partitions.
For more information, see Uploading large data volumes.
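The batching recommendations can be sketched on the client side as follows. The `make_batches` helper is hypothetical, and shuffling the order of batches is only one possible way to avoid strictly ascending writes; the actual insert call depends on the SDK you use and is left out.

```python
import random


def make_batches(rows, batch_size, seed=None):
    """Split rows into batches that each cover a contiguous key range.

    rows is a list of tuples whose first element is the primary key.
    """
    # Sort by primary key so that each batch touches few partitions.
    ordered = sorted(rows, key=lambda row: row[0])
    batches = [
        ordered[start:start + batch_size]
        for start in range(0, len(ordered), batch_size)
    ]
    # Send batches in random order so that writes do not walk the key
    # space in ascending order and concentrate on one partition.
    random.Random(seed).shuffle(batches)
    return batches


rows = [(key, f"value-{key}") for key in (5, 1, 9, 3, 7, 2, 8)]
batches = make_batches(rows, batch_size=3, seed=0)

for batch in batches:
    # Each batch would be written in a single transaction.
    print([key for key, _ in batch])
```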
How do I delete expired data?
To effectively remove expired data, we recommend using Time to Live (TTL).
Can I get logs of my operations with services?
Yes, you can request log records about your resources from Yandex.Cloud services. For more information, see Data requests.