Yandex Cloud
  • Services
  • Solutions
  • Why Yandex Cloud
  • Pricing
  • Documentation
  • Contact us
Get started
Language / Region
© 2022 Yandex.Cloud LLC
Yandex DataLens
  • Getting started
  • Practical guidelines
    • All tutorials
    • Visualizing data from a CSV file
    • Creating and publishing a chart with a map of Moscow from a CSV file
    • Analyzing a retail chain's sales from a ClickHouse database
    • Analyzing public data on road accidents in Russia
    • Analyzing sales and locations of pizzerias based on data from Clickhouse DB and Marketplace
    • Web analytics with a connection to Yandex Metrica
    • Web analytics with funnels and cohorts calculated based on Yandex Metrica data
    • Mobile app analytics based on AppMetrica data
    • Analyzing Yandex Music podcast statistics (for podcasters)
    • Visualizing data with a QL chart
    • Building customer journey charts based on AppMetrica data
  • Concepts
    • Service overview
    • Connection
    • Data types
    • Datasets
      • Overview
      • Data model
      • Dataset settings
    • Charts
      • Overview
      • Chart types
      • Chart settings
    • Dashboards
    • Combining data
    • Using Markdown in DataLens
    • DataLens Public
    • Aggregations in DataLens
    • Calculated fields
      • Overview
      • Formula syntax
    • Parameterization
    • Marketplace
    • Backups
    • Caching
    • Quotas and limits
    • Organizations in DataLens
  • Step-by-step instructions
    • All instructions
    • Working with connections
      • Creating a ClickHouse connection
      • Creating a connection to a CSV file
      • Creating a Google Sheets connection
      • Creating a MySQL connection
      • Creating a PostgreSQL connection
      • Creating an MS SQL Server connection
      • Creating an Oracle Database connection
      • Creating a YDB connection
      • Creating a Yandex Metrica API connection
      • Creating an AppMetrica connection
      • Creating a Yandex Cloud Billing connection
      • Creating a Greenplum connection
      • Managing connection access
    • Working with datasets
      • Creating a dataset
      • Combining data from multiple tables
      • Combining data from multiple CSV connections
      • Creating a data field
      • Creating a calculated data field
      • Creating a default filter for new charts
      • Updating fields in datasets
      • Describing a dataset via a source SQL query
      • Dataset materialization
      • Managing dataset access
      • Managing access to data rows
      • Adding parameters to a dataset
    • Working with charts
      • Creating a line chart
      • Creating an area chart
      • Creating a pie chart
      • Creating a donut chart
      • Creating a column chart
      • Creating a bar chart
      • Creating a scatter chart
      • Creating a map
      • Creating a table
      • Creating a pivot table
      • Creating an indicator
      • Creating a tree chart
      • Creating a QL chart
      • Creating a multi-dataset chart
      • Adding a hierarchy
      • Configuring the navigator
      • Publishing a chart
      • Managing chart access
      • Adding guid as a parameter
      • Adding parameters to a chart
    • Working with dashboards
      • Creating dashboards
      • Adding charts to dashboards
      • Adding selectors to dashboards
      • Creating an alias
      • Deleting an alias field
      • Publishing dashboards
      • Managing dashboard access
      • Dashboard auto-update
    • Working with permissions
      • Granting permissions
      • Deleting permissions
      • Request permissions
    • Working with DataLens Marketplace
      • Adding a product from Marketplace
      • Removing a product from Marketplace
      • Creating and using a connector (for partners)
    • Working with organizations
      • Switching between DataLens instances
  • Access management
    • Managing access to DataLens
    • Managing access at the data row level
  • Pricing policy
  • Function reference
    • All Functions
    • Aggregate functions
      • Overview
      • ALL_CONCAT
      • ANY
      • ARG_MAX
      • ARG_MIN
      • AVG
      • AVG_IF
      • COUNT
      • COUNTD
      • COUNTD_APPROX
      • COUNTD_IF
      • COUNT_IF
      • MAX
      • MEDIAN
      • MIN
      • QUANTILE
      • QUANTILE_APPROX
      • STDEV
      • STDEVP
      • SUM
      • SUM_IF
      • TOP_CONCAT
      • VAR
      • VARP
    • Array functions
      • Overview
      • ARRAY
      • ARR_AVG
      • ARR_MAX
      • ARR_MIN
      • ARR_PRODUCT
      • ARR_STR
      • ARR_SUM
      • CAST_ARR_FLOAT
      • CAST_ARR_INT
      • CAST_ARR_STR
      • CONTAINS
      • COUNT_ITEM
      • GET_ITEM
      • REPLACE
      • SLICE
      • STARTSWITH
      • UNNEST
    • Date/Time functions
      • Overview
      • DATEADD
      • DATEPART
      • DATETRUNC
      • DAY
      • DAYOFWEEK
      • HOUR
      • MINUTE
      • MONTH
      • NOW
      • QUARTER
      • SECOND
      • TODAY
      • WEEK
      • YEAR
    • Geographical functions
      • Overview
      • GEOCODE
      • GEOINFO
      • TOPONYM_TO_GEOPOINT
      • TOPONYM_TO_GEOPOLYGON
    • Logical functions
      • Overview
      • CASE
      • IF
      • IFNULL
      • ISNULL
      • ZN
    • Mathematical functions
      • Overview
      • ABS
      • ACOS
      • ASIN
      • ATAN
      • ATAN2
      • CEILING
      • COMPARE
      • COS
      • COT
      • DEGREES
      • DIV
      • EXP
      • FLOOR
      • GREATEST
      • LEAST
      • LN
      • LOG
      • LOG10
      • PI
      • POWER
      • RADIANS
      • ROUND
      • SIGN
      • SIN
      • SQRT
      • SQUARE
      • TAN
    • Operators
      • Overview
      • AND
      • Addition and concatenation (+)
      • BETWEEN
      • Comparison
      • Division (/)
      • IN
      • IS FALSE
      • IS TRUE
      • LIKE
      • Modulo (%)
      • Multiplication (*)
      • NOT
      • Negation (-)
      • OR
      • Power (^)
      • Subtraction (-)
    • String functions
      • Overview
      • ASCII
      • CHAR
      • CONCAT
      • CONTAINS
      • ENDSWITH
      • FIND
      • ICONTAINS
      • IENDSWITH
      • ISTARTSWITH
      • LEFT
      • LEN
      • LOWER
      • LTRIM
      • REGEXP_EXTRACT
      • REGEXP_EXTRACT_NTH
      • REGEXP_MATCH
      • REGEXP_REPLACE
      • REPLACE
      • RIGHT
      • RTRIM
      • SPACE
      • SPLIT
      • STARTSWITH
      • SUBSTR
      • TRIM
      • UPPER
      • UTF8
    • Text markup functions
      • Overview
      • BOLD
      • ITALIC
      • MARKUP
      • URL
    • Time series functions
      • Overview
      • AGO
      • AT_DATE
    • Type conversion functions
      • Overview
      • BOOL
      • DATE
      • DATETIME
      • DATETIME_PARSE
      • DATE_PARSE
      • DB_CAST
      • FLOAT
      • GEOPOINT
      • GEOPOLYGON
      • INT
      • STR
    • Window functions
      • Overview
      • AVG
      • AVG_IF
      • COUNT
      • COUNT_IF
      • FIRST
      • LAG
      • LAST
      • MAVG
      • MAX
      • MCOUNT
      • MIN
      • MMAX
      • MMIN
      • MSUM
      • RANK
      • RANK_DENSE
      • RANK_PERCENTILE
      • RANK_UNIQUE
      • RAVG
      • RCOUNT
      • RMAX
      • RMIN
      • RSUM
      • SUM
      • SUM_IF
    • Function Availability
  • Tutorial on functions
    • Aggregate functions
    • Window functions
    • LOD expressions and filtering in aggregate functions
  • Public materials
    • Educational projects
    • Webinars and conferences
    • Articles and publications
    • Public dashboards and charts
  • Troubleshooting
    • Questions and answers
    • DataLens errors
  1. Practical guidelines
  2. Analyzing public data on road accidents in Russia

Analyzing open data on road accidents in Russia

Written by
Yandex Cloud
  • Before you start
  • Step 1. Create a connection to a CSV file
  • Step 2. Create a dataset and configure its fields
  • Step 3. Create your first chart: a heat map
  • Step 4. Create your second chart: a bar chart
  • Step 5. Create more charts: line charts
  • Step 6. Create a dashboard
  • Step 7. Add charts to the dashboard
  • Step 8. Add selectors and create a dashboard
  • Step 9. Connect an additional source with region geolayers
  • Step 10. Add new fields to the dataset
  • Step 11. Create a chart using the new fields
  • Step 12. Add a new chart to the dashboard
  • Step 13. Publish the dashboard

This scenario analyzes public data on road accidents in Russia. In addition to identifying the facts of accident statistics in Russia, during the analysis you'll learn how to:

  • Work with the DataLens key entities: Connections, Datasets, Charts, and Dashboards.
  • Combine multiple sources at the level of a single dataset.
  • Use the Date and time data type and edit chart-level groupings.
  • Work with geodata: geopoints and geopolygons.
  • Create public dashboards that will be available to everyone from any device without authentication.

To visualize and analyze the data, set up DataLens and follow these steps:

  1. Create a connection to a CSV file.
  2. Create a dataset and configure its fields.
  3. Create your first chart: a heat map.
  4. Create your second chart: a bar chart.
  5. Create more charts: line charts.
  6. Create a dashboard.
  7. Add charts to the dashboard.
  8. Add selectors and create a dashboard.
  9. Connect an additional source with region geolayers.
  10. Add new fields to the dataset.
  11. Create a chart using the new fields.
  12. Add a new chart to the dashboard.
  13. Publish the dashboard.

Before you start

To get started with DataLens:

New user
I'm already using Yandex Cloud
  1. Log in to your Yandex account. If you don't have an account, create one.
  2. Open the homepage DataLens.
  3. Click Open Datalens.
  4. Click Log in.
  1. Log in to your Yandex account.

  2. Open the homepage DataLens.

  3. Click Open Datalens.

  4. Select one of the options:

    • If you already have an organization, select it from the drop-down menu in the Organizations tab and click Open DataLens.

      Note

      To activate a DataLens instance, the user must have the admin or owner role. For more information about roles, see Granting permissions.

    • If you have a cloud but no organization, click Add new DataLens. In the window that opens, enter your organization's name and description and click Create organization and DataLens. For more information about working with organizations, see Getting started with organizations.

If you have a technical question about the service, please contact support. To ask for advice or discuss the solution to your problem or best service practices, write to the DataLens chat in Telegram.

Step 1. Create a connection to a CSV file

  1. Download the file with traffic accident statistics for April-December 2018 in CSV format.

  2. Open the DataLens homepage and click Create connection.

    image

  3. Select the CSV connection type.

    image

  4. Click Select CSV file and select the downloaded file. It may take up to several minutes to load it, depending on the speed of your internet connection.

    image

    When loaded, you'll see a preview of the file data. To view it, use the scroll bars on the right and at the bottom.

    image

  5. In the upper-right corner, click Create.

    image

The connection to the CSV file is created. The data is presented in the same form as in the file. To use it, you need to create a dataset.

Step 2. Create a dataset and configure its fields

The dataset will consist of a single source: the CSV file.

  1. Click Create dataset.

    image

  2. Go to the Fields tab.

    image

    Here you can add data fields and calculated fields, change field data types and aggregation rules, as well as rename fields.

    image

  3. Change the field names to Russian ones in the following order:

    • Region code
    • Region name
    • Road code
    • Road name
    • Road type
    • OKTMO code
    • Address
    • Accident type

    image

  4. The accident date and time are shown in the following fields:

    • crash_date: The date in numeric format like 20190218.
    • crash_time: The time written like 19:34.

    They won't be used in the dataset in this format. Click the visibility icon to hide them.

    image

  5. To display information about the accident date and time, add a new field.

    1. Click Add field.

      image

    2. Specify the following data for it:

      • Field name: Date and time
      • Formula: datetime_parse(str([crash_date])+'-'+str([crash_time]))

      image

      Tip

      To review the documentation for these functions yourself, click Reference.

    3. Click Create.

    The preview section now correctly displays the accident date and time as a value of the Date and time type. The new Date and time field is at the top of the table. Click to edit the field formula.

    image

  6. Continue renaming the fields in the following order:

    • Cause of accident
    • Death toll
    • Number of victims
    • Number of vehicles
    • Number of participants

    Set the Sum aggregation type for the following fields:

    • Death toll
    • Number of victims
    • Number of vehicles
    • Number of participants

    image

  7. Add a field to display a geopoint.

    1. Click Add field at the top right.

      image

    2. Enter data for the new field:

      • Field name: Geopoint
      • Formula: GEOPOINT([latitude],[longitude])

      where [latitude] and [longitude] are dataset fields.

      image

    3. Click Create.

      Note

      The example is based on ready-to-use data in coordinate format. In other cases, to convert data to coordinate format, you can use the geocoding function.

    4. Make sure that the table and preview contain the new Geopoint type field.

      image

  8. Add a field to display the number of accidents.

    1. Click Add field.

      image

    2. Enter data for the new field:

      • Field name: Number of accidents
      • Formula: SUM(1)

      image

    3. Click Create.

    You don't need to set the aggregation type for this field: when you add it to the chart, DataLens applies the aggregation function automatically.

  9. Add a field for data on deaths in road accidents.

    1. Click Add field.

      image

    2. Enter data for the new field:

      • Field name: Accident death rate
      • Formula: [Death toll]/[Number of accidents]

      image

    3. Click Create

    Note

    You can create calculated fields directly in charts. However, it's more convenient to do this at the dataset level. This makes them available in all charts.

  10. Save the dataset by clicking Save at the top right.

    image

  11. Specify the dataset name: Accidents. Click Create.

    image

The dataset is created.

Step 3. Create your first chart: a heat map

Proceed to creating the first chart.

  1. Click Create chart.

    image

  2. In the wizard window that opens, click on the field that defaults to Column chart and select the Map chart type.

    image

  3. Under Points, select the Points (heat map) layer type.

    image

  4. Drag the Geopoint field to the Geopoints section.

    image

    That's your first chart on the map.

  5. To save it, click the arrow icon next to the Save button and select Save as.

    image

  6. In the dialog box, name the chart Heat map and click Save.

    image

Step 4. Create your second chart: a bar chart

  1. Analyze the number of accidents by region.

    1. In the same window, select Bar chart as the chart type.

      image

    2. Drag:

      • The Region name field to the Y section.
      • The Number of accidents field to the X section.
      • The Number of accidents field to the Sorting section.

      image

      The chart shows the number of accidents by region. Most accidents occurred in the Moscow region.

      Note

      The legend may be missing for some of the chart lines. If this is the case, hover over the line and you'll see a tooltip with the region name.

    3. Click Save as to save the chart.

      image

      Name the chart Number of accidents by region and save it.

      image

  2. Analyze the death rate in accidents by region.

    1. Replace the Number of accidents measure with Accident death rate by dragging and dropping.

      image

      This time, Ingushetia is far ahead in the number of deaths, followed by Tuva and Kalmykia.

    2. Save the chart by clicking Save as

      image

      and name it Death rate by region.

      image

Step 5. Create more charts: line charts

Let's see how the number of accidents and related deaths are distributed by week, day of the week, and time of day.

  1. Analyze the number of accidents and death rate by week.

    1. In the same window, select Line chart as the chart type.

      image

    2. The chart management sections contain the previous values. Delete them by clicking .

      image

    3. Drag the Date and time field to the X section and click the green calendar icon.

      image

    4. In the Grouping field, choose Date part ⟶ Week and click Apply.

      image

    5. Drag:

      • The Number of accidents field to the Y section.
      • The Accident death rate field to the Y2 section.

      image

      Now you can see a chart with two graphs: the number of accidents and death rate. If you hover over a point on the chart, a tooltip appears with specific values.

    6. Save the chart by clicking Save as

      image

      and name it Number of accidents and death rate by week.

      image

  2. Now analyze the number of accidents and death rate by day of the week.

    1. Change grouping by date and time to Date part ⟶ Day of week.

      image

      This chart shows more obvious trends.

      image

    2. Save the chart by clicking Save as

      image

      and name it Number of accidents and death rate by day of week.

      image

  3. Analyze the number of accidents and death rate by hour of day.

    1. Change grouping by date and time to Date part ⟶ Hour.

      image

      The chart shows that less accidents occur at night, but they result in more casualties.

      image

    2. Save the chart by clicking Save as

      image

      and name it Number of accidents and death rate by hour of day.

      image

Step 6. Create a dashboard

  1. Open the menu by clicking the icon to the left of the logo and select Dashboards.

    image

  2. Click Create.

    image

    and select Dashboard.

    image

  3. Name the dashboard Accidents in Russia and click Create.

    image

Note

If this is the first dashboard you create in the folder, it opens as soon as it's created. If the folder already contains dashboards, the list of them opens. In this case, select the Accidents in Russia dashboard from the list.

Step 7. Add charts to the dashboard

  1. Add the first chart.

    1. Click Add and select Chart from the drop-down list.

      image

    2. In the Chart field, select the previously created Heat Map from the list of charts.

      image

    3. Click Add.

      image

      The chart is displayed on the dashboard.

  2. Repeat the previous three steps for the Number of accidents by region, Number of accidents and death rate by week, Number of accidents and death rate by day of week, and Number of accidents and death rate by hour of day charts by adding them to the dashboard.

  3. Create a switch for the Number of accidents by region and Death rate by region charts on the dashboard.

    1. Click the Number of accidents by region chart settings icon to open them.

      image

    2. Click Add to the left of the screen.

      image

    3. In the Chart field, click Select and choose the Death rate by region chart.

      image

    4. Click Save.

      image

Step 8. Add selectors and create a dashboard

Selectors let you filter data by value.

  1. Add a selector to show statistics by region name.

    1. Click Add and choose Selector.

      image

    2. Select Based on dataset, click Select, and choose the previously created Accidents dataset.

      image

    3. In the Field property, choose Region name and select Show next to the selector name. Click Add.

      image

    The selector is displayed on the dashboard as a rectangle.

  2. Repeat the previous three steps for the fields:

    • Cause of accident
    • Accident type
    • Road type
    • Road name
  3. You can drag and resize dashboard elements. Place the elements according to the screenshot or however is convenient for you and click Save to save the dashboard.

    image

If you choose any values in the selectors, the charts will display data for these values.

You can switch the Number of accidents by region chart to Death rate by region.

image

Step 9. Connect an additional source with region geolayers

Heat maps aren't always the most informative. Upload the polygon reference and add a chart with a color fill for Russian regions.

  1. Open the menu by clicking the button to the left of the logo and select Connections.

    image

  2. At the top right, click Create

    image

    and select Connection.

    image

  3. Select the CSV file type.

    image

    Download the Regions.csv file. Then, in the create connection window, upload it by clicking Select CSV file.

  4. Take a look at a preview of the file data. For the Table header parameter, set the Available value. At the top right, click Create.

    image

Step 10. Add new fields to the dataset

You need to add new fields to the previously created Accidents dataset.

  1. In the left menu, select Datasets.

    image

  2. Choose the Accidents dataset.

    image

  3. Go to the Sources tab.

    image

  4. Click Add.

    image

  5. Choose Connections and then Regions.

    image

    Since the connection data has not yet been merged, an error message may appear. Follow these steps.

  6. Click the merge data icon.

    image

  7. Click Add link.

    image

  8. Select the fields to link: reg_name and Accident region. Then click Save.

    image

    Make sure the data preview is displayed correctly.

    image

  9. Go to the Fields tab.

    image

  10. New fields are displayed at the bottom of the list. For the Polygon field, specify the Geopolygon data type.

    image

  11. Click Save to save the dataset.

    image

Step 11. Create a chart using the new fields

  1. Click Create chart.

    image

  2. Select the Map chart type.

    image

  3. Add polygons to the map. Drag the Polygon field from the Dimensions section to the Polygons section.

  4. Change colors of the polygons based on the number of accidents. Drag the Number of accidents field from the Measures section to the Colors section.

  5. Drag the following fields to the Tooltips section:

    • Region name
    • Number of accidents
    • Death toll
    • Number of victims
    • Number of vehicles
    • Number of participants
    • Accident death rate

    The chart shows a map with a color fill for regions. If you hover over a region, information for that region appears.

    image

  6. Click Save at the top right to save the chart.

    image

  7. Name the chart Map of regions and click Save once again.

    image

Step 12. Add a new chart to the dashboard

  1. Select Dashboards in the top-left menu.

    image

  2. Select the previously created dashboard with accident data.

    image

  3. Click Edit at the top right.

    image

  4. Click the Heat map settings icon to open them.

    image

  5. Click Add to the left of the screen.

    image

  6. In the Chart field, click Select and choose the last chart named Map of regions.

    image

  7. Click Save.

    image

You can now switch the type of map data visualization: heat map or region fill.

image

Step 13. Publish the dashboard

  1. To set up public access to the dashboard, click .

    image

  2. Enable access via link. You can copy and share the link that appears. Everyone can use the link to access the dashboard from any device and without authentication. Click Apply.

    image

    Note

    You can enable or disable public access separately for each chart and dataset associated with the dashboard.

Was the article helpful?

Language / Region
© 2022 Yandex.Cloud LLC
In this article:
  • Before you start
  • Step 1. Create a connection to a CSV file
  • Step 2. Create a dataset and configure its fields
  • Step 3. Create your first chart: a heat map
  • Step 4. Create your second chart: a bar chart
  • Step 5. Create more charts: line charts
  • Step 6. Create a dashboard
  • Step 7. Add charts to the dashboard
  • Step 8. Add selectors and create a dashboard
  • Step 9. Connect an additional source with region geolayers
  • Step 10. Add new fields to the dataset
  • Step 11. Create a chart using the new fields
  • Step 12. Add a new chart to the dashboard
  • Step 13. Publish the dashboard