Visualizing data from Yandex Object Storage on Yandex DataLens dashboards
Yandex DataLens is a data visualization and analysis service. Yandex Query and Yandex DataLens integration lets you visualize data stored in Yandex Object Storage. Yandex DataLens generates a YQL query, Yandex Query runs the query and returns its execution results, while Yandex DataLens visualizes these results.
Below is the architecture of the solution with Cloud Logging.
In the given example, we'll show the cost of New York City yellow taxi trips on the map, depending on the passenger drop-off area, with the driver tip amount highlighted. The data was previously uploaded to Yandex Object Storage and stored in a public bucket named yq-sample-data
, in the nyc_taxi_csv
folder.
Note
Yandex Cloud provides the New York City taxi trips dataset as is. Yandex Cloud makes no representations, express or implied, warranties, or conditions pertaining to your use of the specified dataset. To the extent allowed by your local laws, Yandex Cloud shall not be liable for any loss or damage, including direct, consequential, special, indirect, incidental, or exemplary, resulting from your use of the dataset.
NYC Taxi and Limousine Commission (TLC):
The data was collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP). The taxi trip data is not generated by the TLC, and the TLC makes no representations whatsoever about the accuracy of this data.
Take a look at the Dataset source and its Use Policy.
Setup
Setup steps:
- In Yandex Query, create a connection to data stored in Yandex Object Storage.
- In Yandex Query, write a query that will generate a data structure.
- Create a data source in Yandex DataLens.
- In Yandex DataLens, create a dataset.
- Create a chart in Yandex DataLens for data visualization.
Creating a data connection in Yandex Query
To get started, create a connection to Yandex Object Storage:
- Go to the Yandex Query interface, the Tutorial section, select Batch mode, and click Create infrastructure for training.
- In the Create connection window that opens, all the necessary field values are preset, so just click Create to create a connection.
- In the Create binding window that opens, all the necessary field values are preset, so just click Create to create a data binding.
Creating a data source in Yandex DataLens
To create a connection, you'll need a service account with the editor role.
To create a Yandex Query connection:
-
Go to the connections page.
-
Click Create connection.
-
Select the Yandex Query connection.
-
Select the folder to create a Yandex Query connection in.
-
Specify the connection parameters:
-
Connection name. Name it
yellow-taxi
. -
Cloud and folder. Select the folder where your service account will be located.
-
**Service account **. Select an existing service account or create a new one.
-
SQL query access level. Enables you to use an ad-hoc SQL query to generate a dataset.
-
-
Click Create in the lower-left corner. The connection appears in the list.
Note
You can check the host connection before creating it. To do this, click Check connection.
Creating a dataset
-
On the Yandex DataLens homepage, click Create dataset.
-
On the left side of the screen, click
-
Select the
yellow-taxi
connection you created in the previous step. -
In the connection properties, go to the SQL tab and paste the query text into the Subquery field:
SELECT * FROM bindings.`tutorial-analytics`
-
Click Create. The dataset appears in the list.
-
If everything is set up correctly, the Preview field will contain data like
.
-
Create a calculated field:
- Go to the Fields tab.
- Click Add field.
- In the Field name field, specify
dropoff_point
. - In the Formula field, enter:
GEOPOINT(ROUND([dropoff_latitude],1),ROUND([dropoff_longitude],1))
- Click Save.
- The Preview section will contain a new field named
dropoff_point
with data like[40.7,-74]
.
-
Click Save in the top right-hand corner to save the dataset.
Charts with data
To create a chart:
- Go to the Charts section in the Yandex DataLens interface.
- Click Create chart and select Chart.
- Select the
yellow-taxi
dataset. - Select the Map chart type.
- Drag the
dropoff_point
dimension into the Points field. - Drag the
fare_amount
dimension into the Points size field. - Drag the
tip_amount
dimension into the Color field.
The result of running the query is a map showing the distribution of trip fares by region and with the driver tip amount highlighted.