Yandex Cloud
  • Services
  • Solutions
  • Why Yandex Cloud
  • Pricing
  • Documentation
  • Contact us
Get started
Language / Region
© 2022 Yandex.Cloud LLC
Practical guidelines
  • Web service
    • All tutorials
    • Static website in Object Storage
    • Website on LAMP or LEMP stack
    • Fault-tolerant website with load balancing by Network Load Balancer
    • Fault-tolerant website using DNS load balancing
    • Joomla website with PostgreSQL
    • WordPress website
    • WordPress website on a MySQL database
    • Transferring a WordPress website from a different hosting provider to Yandex Cloud
    • 1C-Bitrix website
    • Integrating an L7 load balancer with the Cloud CDN and Object Storage
    • Blue-green and canary deployment of service versions
  • Online stores
    • All tutorials
    • 1C-Bitrix online store
    • Opencart online store
  • Data archive
    • All tutorials
    • Single-node file server
    • Configuring an SFTP server on Centos 7
    • Backup to Object Storage via Acronis Backup
    • Backup to Object Storage via CloudBerry Desktop Backup
    • Backup to Object Storage via Duplicati
    • Backup to Object Storage via Bacula
    • Backup to Object Storage via Veritas Backup Exec
    • Digitizing archives in Yandex Vision
  • Test environment
    • All tutorials
    • Testing applications with GitLab
    • Creating test VMs using GitLab CI
    • High-performance computing on preemptible VMs
    • Emulating multiple IoT devices
    • gRPC service load testing
    • Using Phantom to run a fixed-load HTTPS test
    • Using Pandora to run a step-load HTTPS test
  • Performing infrastructure management
    • All tutorials
    • Getting started with Terraform
    • Uploading Terraform states to Object Storage
    • Getting started with Packer
    • Automating VM image builds using Jenkins
    • Continuous deployment of containerized applications using GitLab
    • Creating a cluster of 1C:Enterprise Linux servers with a Managed Service for PostgreSQL cluster
    • Creating a cluster of 1C:Enterprise Windows servers with SQL Server
    • Migrating to Yandex Cloud using Hystax Acura
    • Emergency recovery in Yandex Cloud using Hystax Acura
    • Configuring a fault-tolerant architecture in Yandex Cloud
  • Building a data platform
    • All tutorials
    • Syncing MySQL data using Yandex DataTransfer
    • Using schema registries with Managed Service for Apache Kafka®
      • Overview
      • Using Managed Schema Registry with Yandex Managed Service for Apache Kafka®
      • Using Confluent Schema Registry with Yandex Managed Service for Apache Kafka®
    • Delivering data using Debezium
    • Migrating databases from Yandex Managed Service for MySQL to MySQL
    • Migrating databases to Managed Service for Microsoft SQL Server
    • Transferring data from PostgreSQL to ClickHouse using Yandex Data Transfer
    • Configuring Yandex Cloud DNS for accessing managed database clusters from other cloud networks
    • Configuring Kafka Connect for Yandex Managed Service for Apache Kafka® clusters
  • Windows in Yandex Cloud
    • All tutorials
    • Deploying Active Directory
    • Deploying Microsoft Exchange
    • Deploying Remote Desktop Services
    • Deploying an Always On availability group
    • Deploying an Always On availability group with an internal network load balancer
    • Deploying Remote Desktop Gateway
  • Network routing
    • All tutorials
    • Routing through a NAT instance
    • Creating a VPN tunnel
    • Installing a Cisco CSR1000v virtual router
    • Installing a Mikrotik CHR virtual router
    • Creating a VPN connection using OpenVPN
  • Data visualization and analytics
    • All tutorials
    • Visualizing data from a CSV file
    • Creating and publishing a chart with a map of Moscow from a CSV file
    • Analyzing a store chain's sales based on data from a ClickHouse DB
    • Analyzing open data on road accidents in Russia
    • Analyzing sales and locations of pizzerias based on data from ClickHouse DB and Marketplace
    • Web analytics with a connection to Yandex.Metrica
    • Web analytics with funnels and cohorts calculated based on Yandex.Metrica data
    • Mobile app analytics based on AppMetrica data
    • Analyzing Yandex Music podcast statistics (for podcasters)
    • Visualizing data with a SQL chart
    • Mobile app customer journey analytics based on AppMetrica data
    • Analyzing Object Storage logs in DataLens
  • Internet of things
    • Tutorials for the internet of things
    • Status monitoring of geographically distributed devices
    • Monitoring sensor readings and event notifications
  • Serverless technologies
    • URL shortener
    • Entering data into storage systems
    • Storing application runtime logs
  1. Data visualization and analytics
  2. Analyzing open data on road accidents in Russia

Analyzing open data on road accidents in Russia

Written by
Yandex Cloud
  • Before you start
  • Step 1. Create a connection to a CSV file
  • Step 2. Create a dataset and configure its fields
  • Step 3. Create your first chart: a heat map
  • Step 4. Create your second chart: a bar chart
  • Step 5. Create more charts: line charts
  • Step 6. Create a dashboard
  • Step 7. Add charts to the dashboard
  • Step 8. Add selectors and create a dashboard
  • Step 9. Connect an additional source with region geolayers
  • Step 10. Add new fields to the dataset
  • Step 11. Create a chart using the new fields
  • Step 12. Add a new chart to the dashboard
  • Step 13. Publish the dashboard

This scenario analyzes public data on road accidents in Russia. In addition to identifying the facts of accident statistics in Russia, during the analysis you'll learn how to:

  • Work with the DataLens key entities: Connections, Datasets, Charts, and Dashboards.
  • Combine multiple sources at the level of a single dataset.
  • Use the Date and time data type and edit chart-level groupings.
  • Work with geodata: geopoints and geopolygons.
  • Create public dashboards that will be available to everyone from any device without authentication.

To visualize and analyze the data, set up DataLens and follow these steps:

  1. Create a connection to a CSV file.
  2. Create a dataset and configure its fields.
  3. Create your first chart: a heat map.
  4. Create your second chart: a bar chart.
  5. Create more charts: line charts.
  6. Create a dashboard.
  7. Add charts to the dashboard.
  8. Add selectors and create a dashboard.
  9. Connect an additional source with region geolayers.
  10. Add new fields to the dataset.
  11. Create a chart using the new fields.
  12. Add a new chart to the dashboard.
  13. Publish the dashboard.

Before you start

To get started with DataLens:

New user
I'm already using Yandex Cloud
  1. Log in to your Yandex account. If you don't have an account, create one.
  2. Open the homepage DataLens.
  3. Click Open Datalens.
  4. Click Log in.
  1. Log in to your Yandex account.

  2. Open the homepage DataLens.

  3. Click Open Datalens.

  4. Select one of the options:

    • If you already have an organization, select it from the drop-down menu in the Organizations tab and click Open DataLens.

      Note

      To activate a DataLens instance, the user must have the admin or owner role. For more information about roles, see Granting permissions.

    • If you have a cloud but no organization, click Add new DataLens. In the window that opens, enter your organization's name and description and click Create organization and DataLens. For more information about working with organizations, see Getting started with organizations.

If you have a technical question about the service, please contact support. To ask for advice or discuss the solution to your problem or best service practices, write to the DataLens chat in Telegram.

Step 1. Create a connection to a CSV file

  1. Download the file with traffic accident statistics for April-December 2018 in CSV format.

  2. Open the DataLens homepage and click Create connection.

    image

  3. Select the CSV connection type.

    image

  4. Click Select CSV file and select the downloaded file. It may take up to several minutes to load it, depending on the speed of your internet connection.

    image

    When loaded, you'll see a preview of the file data. To view it, use the scroll bars on the right and at the bottom.

    image

  5. In the upper-right corner, click Create.

    image

The connection to the CSV file is created. The data is presented in the same form as in the file. To use it, you need to create a dataset.

Step 2. Create a dataset and configure its fields

The dataset will consist of a single source: the CSV file.

  1. Click Create dataset.

    image

  2. Go to the Fields tab.

    image

    Here you can add data fields and calculated fields, change field data types and aggregation rules, as well as rename fields.

    image

  3. Change the field names to Russian ones in the following order:

    • Region code
    • Region name
    • Road code
    • Road name
    • Road type
    • OKTMO code
    • Address
    • Accident type

    image

  4. The accident date and time are shown in the following fields:

    • crash_date: The date in numeric format like 20190218.
    • crash_time: The time written like 19:34.

    They won't be used in the dataset in this format. Click the visibility icon to hide them.

    image

  5. To display information about the accident date and time, add a new field.

    1. Click Add field.

      image

    2. Specify the following data for it:

      • Field name: Date and time
      • Formula: datetime_parse(str([crash_date])+'-'+str([crash_time]))

      image

      Tip

      To review the documentation for these functions yourself, click Reference.

    3. Click Create.

    The preview section now correctly displays the accident date and time as a value of the Date and time type. The new Date and time field is at the top of the table. Click to edit the field formula.

    image

  6. Continue renaming the fields in the following order:

    • Cause of accident
    • Death toll
    • Number of victims
    • Number of vehicles
    • Number of participants

    Set the Sum aggregation type for the following fields:

    • Death toll
    • Number of victims
    • Number of vehicles
    • Number of participants

    image

  7. Add a field to display a geopoint.

    1. Click Add field at the top right.

      image

    2. Enter data for the new field:

      • Field name: Geopoint
      • Formula: GEOPOINT([latitude],[longitude])

      where [latitude] and [longitude] are dataset fields.

      image

    3. Click Create.

      Note

      The example is based on ready-to-use data in coordinate format. In other cases, to convert data to coordinate format, you can use the geocoding function.

    4. Make sure that the table and preview contain the new Geopoint type field.

      image

  8. Add a field to display the number of accidents.

    1. Click Add field.

      image

    2. Enter data for the new field:

      • Field name: Number of accidents
      • Formula: SUM(1)

      image

    3. Click Create.

    You don't need to set the aggregation type for this field: when you add it to the chart, DataLens applies the aggregation function automatically.

  9. Add a field for data on deaths in road accidents.

    1. Click Add field.

      image

    2. Enter data for the new field:

      • Field name: Accident death rate
      • Formula: [Death toll]/[Number of accidents]

      image

    3. Click Create

    Note

    You can create calculated fields directly in charts. However, it's more convenient to do this at the dataset level. This makes them available in all charts.

  10. Save the dataset by clicking Save at the top right.

    image

  11. Specify the dataset name: Accidents. Click Create.

    image

The dataset is created.

Step 3. Create your first chart: a heat map

Proceed to creating the first chart.

  1. Click Create chart.

    image

  2. In the wizard window that opens, click on the field that defaults to Column chart and select the Map chart type.

    image

  3. Under Points, select the Points (heat map) layer type.

    image

  4. Drag the Geopoint field to the Geopoints section.

    image

    That's your first chart on the map.

  5. To save it, click the arrow icon next to the Save button and select Save as.

    image

  6. In the dialog box, name the chart Heat map and click Save.

    image

Step 4. Create your second chart: a bar chart

  1. Analyze the number of accidents by region.

    1. In the same window, select Bar chart as the chart type.

      image

    2. Drag:

      • The Region name field to the Y section.
      • The Number of accidents field to the X section.
      • The Number of accidents field to the Sorting section.

      image

      The chart shows the number of accidents by region. Most accidents occurred in the Moscow region.

      Note

      The legend may be missing for some of the chart lines. If this is the case, hover over the line and you'll see a tooltip with the region name.

    3. Click Save as to save the chart.

      image

      Name the chart Number of accidents by region and save it.

      image

  2. Analyze the death rate in accidents by region.

    1. Replace the Number of accidents measure with Accident death rate by dragging and dropping.

      image

      This time, Ingushetia is far ahead in the number of deaths, followed by Tuva and Kalmykia.

    2. Save the chart by clicking Save as

      image

      and name it Death rate by region.

      image

Step 5. Create more charts: line charts

Let's see how the number of accidents and related deaths are distributed by week, day of the week, and time of day.

  1. Analyze the number of accidents and death rate by week.

    1. In the same window, select Line chart as the chart type.

      image

    2. The chart management sections contain the previous values. Delete them by clicking .

      image

    3. Drag the Date and time field to the X section and click the green calendar icon.

      image

    4. In the Grouping field, choose Date part ⟶ Week and click Apply.

      image

    5. Drag:

      • The Number of accidents field to the Y section.
      • The Accident death rate field to the Y2 section.

      image

      Now you can see a chart with two graphs: the number of accidents and death rate. If you hover over a point on the chart, a tooltip appears with specific values.

    6. Save the chart by clicking Save as

      image

      and name it Number of accidents and death rate by week.

      image

  2. Now analyze the number of accidents and death rate by day of the week.

    1. Change grouping by date and time to Date part ⟶ Day of week.

      image

      This chart shows more obvious trends.

      image

    2. Save the chart by clicking Save as

      image

      and name it Number of accidents and death rate by day of week.

      image

  3. Analyze the number of accidents and death rate by hour of day.

    1. Change grouping by date and time to Date part ⟶ Hour.

      image

      The chart shows that less accidents occur at night, but they result in more casualties.

      image

    2. Save the chart by clicking Save as

      image

      and name it Number of accidents and death rate by hour of day.

      image

Step 6. Create a dashboard

  1. Open the menu by clicking the icon to the left of the logo and select Dashboards.

    image

  2. Click Create.

    image

    and select Dashboard.

    image

  3. Name the dashboard Accidents in Russia and click Create.

    image

Note

If this is the first dashboard you create in the folder, it opens as soon as it's created. If the folder already contains dashboards, the list of them opens. In this case, select the Accidents in Russia dashboard from the list.

Step 7. Add charts to the dashboard

  1. Add the first chart.

    1. Click Add and select Chart from the drop-down list.

      image

    2. In the Chart field, select the previously created Heat Map from the list of charts.

      image

    3. Click Add.

      image

      The chart is displayed on the dashboard.

  2. Repeat the previous three steps for the Number of accidents by region, Number of accidents and death rate by week, Number of accidents and death rate by day of week, and Number of accidents and death rate by hour of day charts by adding them to the dashboard.

  3. Create a switch for the Number of accidents by region and Death rate by region charts on the dashboard.

    1. Click the Number of accidents by region chart settings icon to open them.

      image

    2. Click Add to the left of the screen.

      image

    3. In the Chart field, click Select and choose the Death rate by region chart.

      image

    4. Click Save.

      image

Step 8. Add selectors and create a dashboard

Selectors let you filter data by value.

  1. Add a selector to show statistics by region name.

    1. Click Add and choose Selector.

      image

    2. Select Based on dataset, click Select, and choose the previously created Accidents dataset.

      image

    3. In the Field property, choose Region name and select Show next to the selector name. Click Add.

      image

    The selector is displayed on the dashboard as a rectangle.

  2. Repeat the previous three steps for the fields:

    • Cause of accident
    • Accident type
    • Road type
    • Road name
  3. You can drag and resize dashboard elements. Place the elements according to the screenshot or however is convenient for you and click Save to save the dashboard.

    image

If you choose any values in the selectors, the charts will display data for these values.

You can switch the Number of accidents by region chart to Death rate by region.

image

Step 9. Connect an additional source with region geolayers

Heat maps aren't always the most informative. Upload the polygon reference and add a chart with a color fill for Russian regions.

  1. Open the menu by clicking the button to the left of the logo and select Connections.

    image

  2. At the top right, click Create

    image

    and select Connection.

    image

  3. Select the CSV file type.

    image

    Download the Regions.csv file. Then, in the create connection window, upload it by clicking Select CSV file.

  4. Take a look at a preview of the file data. For the Table header parameter, set the Available value. At the top right, click Create.

    image

Step 10. Add new fields to the dataset

You need to add new fields to the previously created Accidents dataset.

  1. In the left menu, select Datasets.

    image

  2. Choose the Accidents dataset.

    image

  3. Go to the Sources tab.

    image

  4. Click Add.

    image

  5. Choose Connections and then Regions.

    image

    Since the connection data has not yet been merged, an error message may appear. Follow these steps.

  6. Click the merge data icon.

    image

  7. Click Add link.

    image

  8. Select the fields to link: reg_name and Accident region. Then click Save.

    image

    Make sure the data preview is displayed correctly.

    image

  9. Go to the Fields tab.

    image

  10. New fields are displayed at the bottom of the list. For the Polygon field, specify the Geopolygon data type.

    image

  11. Click Save to save the dataset.

    image

Step 11. Create a chart using the new fields

  1. Click Create chart.

    image

  2. Select the Map chart type.

    image

  3. Add polygons to the map. Drag the Polygon field from the Dimensions section to the Polygons section.

  4. Change colors of the polygons based on the number of accidents. Drag the Number of accidents field from the Measures section to the Colors section.

  5. Drag the following fields to the Tooltips section:

    • Region name
    • Number of accidents
    • Death toll
    • Number of victims
    • Number of vehicles
    • Number of participants
    • Accident death rate

    The chart shows a map with a color fill for regions. If you hover over a region, information for that region appears.

    image

  6. Click Save at the top right to save the chart.

    image

  7. Name the chart Map of regions and click Save once again.

    image

Step 12. Add a new chart to the dashboard

  1. Select Dashboards in the top-left menu.

    image

  2. Select the previously created dashboard with accident data.

    image

  3. Click Edit at the top right.

    image

  4. Click the Heat map settings icon to open them.

    image

  5. Click Add to the left of the screen.

    image

  6. In the Chart field, click Select and choose the last chart named Map of regions.

    image

  7. Click Save.

    image

You can now switch the type of map data visualization: heat map or region fill.

image

Step 13. Publish the dashboard

  1. To set up public access to the dashboard, click .

    image

  2. Enable access via link. You can copy and share the link that appears. Everyone can use the link to access the dashboard from any device and without authentication. Click Apply.

    image

    Note

    You can enable or disable public access separately for each chart and dataset associated with the dashboard.

Was the article helpful?

Language / Region
© 2022 Yandex.Cloud LLC
In this article:
  • Before you start
  • Step 1. Create a connection to a CSV file
  • Step 2. Create a dataset and configure its fields
  • Step 3. Create your first chart: a heat map
  • Step 4. Create your second chart: a bar chart
  • Step 5. Create more charts: line charts
  • Step 6. Create a dashboard
  • Step 7. Add charts to the dashboard
  • Step 8. Add selectors and create a dashboard
  • Step 9. Connect an additional source with region geolayers
  • Step 10. Add new fields to the dataset
  • Step 11. Create a chart using the new fields
  • Step 12. Add a new chart to the dashboard
  • Step 13. Publish the dashboard