A dataset describes a set of data and its structure. Data in a dataset is represented as fields. For more information, see Data model.
Yandex DataLens lets you create datasets based on data sources that have a connection. Only one table can be a data source for a dataset.
DataLens offers several modes for datasets to work with data sources. For more information, see Data source operation mode.
Data model
Data in a dataset is represented as a set of fields.
Fields define the structure and format of a dataset. A field can be one of the following types:
- Dimension. Contains values that define data characteristics, such as city, date of purchase, or product category. Aggregation functions are not applied to dimensions: if you apply one, the field becomes a measure. In the interface, dimensions are displayed in green.
- Measure. Contains numeric values that aggregation functions are applied to, such as the number of clicks or the number of click-throughs. If you remove the aggregation function from a measure, it becomes a dimension. In the interface, measures are displayed in blue.
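The difference between the two field types can be illustrated outside DataLens with a minimal Python sketch. The sample rows, the field names `city` and `clicks`, and the helper `sum_by_dimension` are hypothetical and are not part of DataLens. The sketch only shows the general idea: a dimension groups rows, and a measure is the result of aggregating a field within each group.

```python
from collections import defaultdict

# Hypothetical sample rows: "city" plays the role of a dimension,
# "clicks" holds the raw values that a measure would aggregate.
rows = [
    {"city": "Moscow", "clicks": 10},
    {"city": "Moscow", "clicks": 5},
    {"city": "Kazan", "clicks": 7},
]


def sum_by_dimension(rows, dimension, field):
    """Group rows by a dimension and sum the given field within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[field]
    return dict(totals)


# With an aggregation applied, "clicks" behaves like a measure.
clicks_by_city = sum_by_dimension(rows, "city", "clicks")
print(clicks_by_city)  # {'Moscow': 15, 'Kazan': 7}

# Without an aggregation, "clicks" is just a per-row value, like a dimension.
raw_clicks = [row["clicks"] for row in rows]
print(raw_clicks)  # [10, 5, 7]
```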
When creating datasets, you can duplicate existing fields and create new ones.
DataLens lets you create calculated fields using aggregation functions and functions available for the data source.
DataLens offers the following data types:
Date and time(in
The following aggregation functions are available for measures:
| Function | Description | Data types |
|---|---|---|
| No | Without aggregation | All types |
| Average | Arithmetic mean value | |
| Amount | Number of records | |
| Number of unique | Number of unique records | |
| Amount | Sum of values | |
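As a rough illustration of what each aggregation computes, the following Python sketch applies the five functions to one small list of values. The keys `none`, `average`, `count`, `count_unique`, and `sum` are hypothetical labels chosen for this example. They are not DataLens identifiers, and the sketch does not reflect how DataLens evaluates aggregations internally.

```python
from statistics import mean

# Hypothetical column of numeric values taken from four records.
values = [4, 8, 8, 10]


def aggregate(values, how):
    """Apply one of the aggregations described above to a list of values."""
    if how == "none":
        return list(values)      # no aggregation: values are returned as-is
    if how == "average":
        return mean(values)      # arithmetic mean value
    if how == "count":
        return len(values)       # number of records
    if how == "count_unique":
        return len(set(values))  # number of unique records
    if how == "sum":
        return sum(values)       # sum of values
    raise ValueError(f"unknown aggregation: {how}")


results = {
    how: aggregate(values, how)
    for how in ("none", "average", "count", "count_unique", "sum")
}
print(results)
```

For the sample values, the average is 7.5, the record count is 4, the unique count is 3, and the sum is 30.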
Data source operation mode
Datasets work with data sources in the following modes:
- Direct access. All data requests are executed on the data source side. If you use the Metrica API as the data source, DataLens uses direct data access.
- One-time materialization. Data is only loaded to the Yandex DataLens internal storage once. All subsequent requests are processed using the loaded data. To sync Yandex DataLens storage with the source, you can reload the data. If you use a CSV file as the data source, DataLens automatically materializes the dataset.
- Periodic materialization. Data is periodically loaded to Yandex DataLens storage according to certain rules. The rules are defined in the dataset settings.
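The practical difference between the three modes is where a request is served from and when stored data is refreshed. The Python sketch below models this under simplifying assumptions. All class and method names (`Source`, `DirectAccess`, `OneTimeMaterialization`, `PeriodicMaterialization`, `read`, `reload`) are hypothetical and do not correspond to any DataLens API. The periodic mode is also simplified to a fixed interval that is checked on read, whereas the actual rules are defined in the dataset settings.

```python
class Source:
    """Hypothetical data source that counts how many times it is queried."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.queries = 0

    def fetch(self):
        self.queries += 1
        return list(self.rows)


class DirectAccess:
    """Every request is executed on the data source side."""

    def __init__(self, source):
        self.source = source

    def read(self):
        return self.source.fetch()


class OneTimeMaterialization:
    """Data is loaded once; later requests use the stored copy."""

    def __init__(self, source):
        self.source = source
        self.storage = source.fetch()

    def read(self):
        return list(self.storage)

    def reload(self):
        # Manual re-sync of the stored copy with the source.
        self.storage = self.source.fetch()


class PeriodicMaterialization(OneTimeMaterialization):
    """Stored data is refreshed once a fixed interval has elapsed (assumed rule)."""

    def __init__(self, source, interval, now=0):
        super().__init__(source)
        self.interval = interval
        self.loaded_at = now

    def read(self, now):
        if now - self.loaded_at >= self.interval:
            self.reload()
            self.loaded_at = now
        return list(self.storage)
```

In this model, direct access always reflects the current state of the source at the cost of one source query per request, while the materialized modes serve possibly stale data from storage until the next reload.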
You can configure dataset permissions. For more information, see Access management.