Datasets

A dataset describes a set of data and its structure. Data in a dataset is represented as fields. For more information, see Data model.

Yandex DataLens lets you create datasets from data sources for which a connection has been created. A dataset can use only one table as its data source.

DataLens offers several modes for datasets to work with data sources. For more information, see Data source operation mode.

Data model

Data in a dataset is represented as a set of fields.

Data field

Fields define the structure and format of a dataset. A field can be one of the following types:

  • Dimension. Contains values that describe characteristics of the data, such as city, date of purchase, or product category. No aggregation function is applied to a dimension; if one is applied, the field becomes a measure. In the interface, dimensions are displayed in green.
  • Measure. Contains numeric values that aggregation functions are applied to (see Data aggregation), such as the number of clicks or the number of click-throughs. If you remove the aggregation function from a measure, it becomes a dimension. In the interface, measures are displayed in blue.
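
The difference between the two field types can be illustrated with a minimal sketch in plain Python. It does not use any DataLens API; the sample rows, the field names city and clicks, and the aggregate helper are all hypothetical and exist only for illustration. The dimension defines the groups, and the measure is the result of applying an aggregation function to the values in each group.

```python
from collections import defaultdict

# Hypothetical sample rows: "city" plays the role of a dimension,
# "clicks" is a numeric field that acts as a measure once aggregated.
rows = [
    {"city": "Moscow", "clicks": 10},
    {"city": "Moscow", "clicks": 5},
    {"city": "Kazan", "clicks": 7},
]


def aggregate(rows, dimension, field, func):
    """Group rows by a dimension and apply an aggregation function to a field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[field])
    return {key: func(values) for key, values in groups.items()}


# Summing clicks per city: the dimension is left as is, the measure is aggregated.
clicks_by_city = aggregate(rows, "city", "clicks", sum)
print(clicks_by_city)  # {'Moscow': 15, 'Kazan': 7}
```

Without the aggregation function, clicks would simply be another per-row value, which corresponds to the statement that a measure with its aggregation removed becomes a dimension.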

When creating datasets, you can duplicate existing fields and create new ones.

Calculated field

DataLens lets you create calculated fields using aggregation functions and functions available for the data source.

Data types

DataLens offers the following data types:

  • Logical
  • Date (in YYYY-MM-DD format)
  • Date and time (in YYYY-MM-DD hh:mm:ss format)
  • Fractional number
  • Integer
  • String
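
When preparing source data, it can help to check that date values match the formats listed above. The following is a rough sketch using Python's standard datetime module, not a DataLens function. The helper names are hypothetical, and it assumes that hh means a 24-hour clock, which the list above does not state.

```python
from datetime import date, datetime

# Format strings corresponding to the Date and "Date and time" types.
# Assumption: "hh" is a 24-hour value, hence %H rather than %I.
DATE_FORMAT = "%Y-%m-%d"
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_date(text: str) -> date:
    """Parse a YYYY-MM-DD string; raises ValueError if it does not match."""
    return datetime.strptime(text, DATE_FORMAT).date()


def parse_datetime(text: str) -> datetime:
    """Parse a YYYY-MM-DD hh:mm:ss string; raises ValueError if it does not match."""
    return datetime.strptime(text, DATETIME_FORMAT)


print(parse_date("2019-03-15"))               # 2019-03-15
print(parse_datetime("2019-03-15 14:30:00"))  # 2019-03-15 14:30:00
```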

Data aggregation

The following aggregation functions are available for measures:

Name             | Description              | Supported types
No               | Without aggregation      | All types
Average          | Arithmetic mean value    | Fractional number, Integer
Count            | Number of records        | String, Date, Date and time, Fractional number, Integer
Number of unique | Number of unique records | String, Date, Date and time, Fractional number, Integer
Maximum          | Maximum value            | Date, Date and time, Fractional number, Integer
Minimum          | Minimum value            | Date, Date and time, Fractional number, Integer
Sum              | Sum of values            | Fractional number, Integer
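
The relationship between aggregation functions and supported types in the table above can be sketched as a small lookup in Python. This is an illustration only, not how DataLens implements aggregation. The keys such as "count" and "count_unique" and the apply_aggregation helper are hypothetical names chosen to match the descriptions in the table.

```python
from statistics import mean

# Groups of data types, taken from the "Supported types" column.
NUMERIC = {"Fractional number", "Integer"}
ORDERED = NUMERIC | {"Date", "Date and time"}
COUNTABLE = ORDERED | {"String"}

# Hypothetical identifiers mapped to (Python function, supported types).
AGGREGATIONS = {
    "average": (mean, NUMERIC),
    "count": (len, COUNTABLE),
    "count_unique": (lambda values: len(set(values)), COUNTABLE),
    "maximum": (max, ORDERED),
    "minimum": (min, ORDERED),
    "sum": (sum, NUMERIC),
}


def apply_aggregation(name, values, data_type):
    """Apply an aggregation if the field's data type supports it."""
    func, supported = AGGREGATIONS[name]
    if data_type not in supported:
        raise ValueError(f"'{name}' does not support type '{data_type}'")
    return func(values)


print(apply_aggregation("sum", [2, 4, 6], "Integer"))              # 12
print(apply_aggregation("count_unique", ["a", "b", "a"], "String"))  # 2
```

Requesting an unsupported combination, such as the average of a String field, raises ValueError in this sketch.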

Data source operation mode

Datasets work with data sources in the following modes:

  • Direct access.
  • One-time materialization.
  • Periodic materialization.

Direct access

All data requests are executed on the data source side.

Note

If you use the Metrica API as the data source, DataLens uses direct data access.

One-time materialization

Data is loaded to the Yandex DataLens internal storage only once. All subsequent requests are processed using the loaded data. To sync the Yandex DataLens storage with the source, you can reload the data.

Note

If you use a CSV file as the data source, DataLens automatically materializes the dataset.

Periodic materialization

Data is periodically loaded to Yandex DataLens storage according to certain rules. The rules are defined in the dataset settings.
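
The practical difference between the modes is when the source is read. The following conceptual sketch in Python models that behavior. It is not DataLens code: the Dataset class, its reload method, and the fetch_rows function are hypothetical, and the real storage and scheduling mechanisms are not described in this article.

```python
class Dataset:
    """Toy model of a dataset reading from a source in different modes."""

    def __init__(self, fetch, materialized=False):
        self._fetch = fetch              # callable that reads the data source
        self._materialized = materialized
        self._storage = None             # stands in for internal storage
        if materialized:
            self.reload()                # initial load into storage

    def reload(self):
        """Copy the current source data into storage."""
        self._storage = list(self._fetch())

    def rows(self):
        # Direct access: every request goes to the source.
        # Materialization: requests are served from the loaded copy.
        if self._materialized:
            return list(self._storage)
        return list(self._fetch())


source = [1, 2]


def fetch_rows():
    return list(source)


direct = Dataset(fetch_rows)
snapshot = Dataset(fetch_rows, materialized=True)
source.append(3)  # the source changes after materialization

print(direct.rows())    # [1, 2, 3]
print(snapshot.rows())  # [1, 2]
snapshot.reload()       # manual reload; a scheduler would do this periodically
print(snapshot.rows())  # [1, 2, 3]
```

In this model, periodic materialization amounts to calling reload on a schedule rather than by hand.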

Access management

You can configure dataset permissions. For more information, see Access management.
