Window functions in DataLens
Window functions are similar to aggregate functions. They allow you to get additional information about the original sample. For example, you can calculate the cumulative total and the moving average or rank values.
The difference is that when calculating window functions, the rows are not combined into one but continue to be separate. The result of the calculation is displayed in each row. The original number of rows doesn't change. For more detail on data aggregate and grouping in DataLens, please review Data aggregation in DataLens.
These examples will use Selling.csv containing sales data by city as the data source to process.
Applying window functions
In DataLens, only measures can be the arguments of window functions. The groups of values that a function is calculated for are specified as a list of dimensions and are called windows. Groupings may only use the dimensions involved in building a chart. These include all the dimensions in one chart section.
Let's take a look at the Selling
table with data on sales in cities:
#  City  Category  Date  Sales  Profit  Day's discount 

1  Detroit  Office Supplies  20140102  10  7  0.05 
2  Portland  Office Supplies  20140405  14  10  0.00 
3  Portland  Office Supplies  20140121  20  12  0.20 
4  San Francisco  Office Supplies  20140311  8  3  0.10 
5  Detroit  Furniture  20140101  12  3  0.00 
6  Portland  Furniture  20140121  7  2  0.05 
7  San Francisco  Technology  20140102  7  3  0.10 
8  San Francisco  Technology  20140117  13  5  0.20 
Example 1
The chart based on the Selling
table and grouped by City
and Category
must compute total sales (TotalSales
) as well as each category's share in a city as a percentage of the total (% Total
). To do this, you need to create two measures using the SUM window function:
 TotalSales —
SUM(SUM([Sales]) TOTAL)
 % Total —
SUM([Sales]) / [TotalSales]
For example, for the Table chart, the result looks like this:
Example 2
You need to arrange the rows in the Selling
table based on the sales amount. To do this, you can use the RANK window function RANK(SUM([Sales]))
. As a result, each row is assigned a sequence number: the row with the largest sales amount is 1, and the smallest is 6.
Window functions can be nested. You can also specify a custom grouping for each function used in the formula.
Example 3
You need to arrange the rows in the Selling
table based on the the average sales amount for all dates in the city. You can compute an average sales amount for a city using the AVG function: AVG(SUM([Sales]) WITHIN [City])
. City names repeat in the table, so for ranking, use the RANK_DENSE function. It doesn't skip sequence numbers for rows with the same value. The resulting formula is RANK_DENSE(AVG(SUM([Sales]) WITHIN [City]) TOTAL)
.
Example 4
Let's review the most complicated window function example. Let's a build a dataset from a connection to the demo DB (SampleLite
table) to use as our data source. Let's build a chart of the sales statistics by product subcategory. Let's only include in the chart those subcategories that made it into the daily top 3 sellers at least once a day.

Let's order our product subcategories in descending order by sales amount within each date. To accomplish this, we'll use the RANK window function to create a metric:
 Sales Rank —
RANK([Sales] WITHIN [Date])
As a result, for each date, the subcategory with the highest total sales will be numbered
1
, the subcategory with the next highest total2
, etc. For convenience, let us place the data in a Table chart:  Sales Rank —

Let's highlight the categories in the top 3 most sold for the same date. To do this, create the following metric:
 Top3 —
IF([Sales Rank] <= 3, 1, 0)
The subcategories that make it into the top 3 based on sales for each date will have the
[Top3]
metric equal to1
whereas for the remaining categories on the same date, it will be equal to0
.  Top3 —

Using the
[Top3]
flag, we have highlighted the categories for the same date. Now, we need to highlight these subcategories for the other dates. To accomplish this, we create a metric using the MAX window function: [Show Category] —
MAX([Top3] WITHIN [SubCategory])
In each product subcategory, the
[Show Category]
flag will be equal to1
both for the date when the subcategory made it into the top 3 based on sales and for all the other dates. If a subcategory did not make it into the 3 most sold on any date, its[Show Category]
flag will be equal to0
.  [Show Category] —

Let's add the following filter to the chart:
[Show Category] = 1
. This will give us a list of the product subcategories that are to be displayed on the chart. 
Let's now change our chart type to Line chart. Configure visualization:

Drag the
Date
dimension under X. 
Drag the
Sales
metric under Y. 
Drag the
SubCategory
dimension under Colors. 
Under Chart filters, we'll keep the filter for
Show Category
equal to1
. 
In the Y axis settings, select that Null values are to Displayed as 0.

Grouping in window functions
Just like aggregate functions, window functions can be calculated:
 For a single window.
 For several windows.
For more information on groupings in window functions, please review under Grouping.
Grouping for a single window
With this grouping option, the function is calculated for a single window that includes all the rows. The TOTAL
grouping is used. It enables you to calculate totals, rank rows, and perform other operations that require information about all the source data.
Example
You need to calculate the average sales amount (AvgSales
) and deviations from it for each category in the city (DeltaFromAvg
). The best function for this is AVG:
 AvgSales —
AVG(SUM([Sales]) TOTAL)
 DeltaFromAvg —
SUM([Sales])  [AvgSales]
Grouping for several windows
Sometimes the window function needs to be calculated separately by group, and not across all records. In this case, the WITHIN
and AMONG
groupings are used.
WITHIN
WITHIN
: Similar to GROUP BY
in SQL
. It lists all the dimensions by which splitting into windows is performed. In WITHIN
, you can also use measures. In this case their values are similarly included in the window grouping.
Warning
In WITHIN
, the dimensions that aren't included in chart grouping are ignored. For example, in a chart grouped by the City
and Category
dimensions for the SUM(SUM([Sales]) WITHIN [Date])
measure, the Date
dimension is ignored and it becomes the same as the SUM(SUM([Sales]) TOTAL)
measure.
Example
Calculating the share of each category (% Total
) of the total sales amount by city (TotalSales
):
 TotalSales —
SUM(SUM([Sales]) WITHIN [City])
 % Total —
SUM([Sales]) / [TotalSales]
For example, this is the result for the Column chart:
AMONG
In this case, splitting into windows is performed for all dimensions that are included in the chart grouping but are not listed in AMONG
. That's why this grouping type is contrary to WITHIN
. When calculating the function, AMONG
transforms to WITHIN
, which performs grouping by all dimensions that are not listed in AMONG
.
For example, for a chart with grouping by the City
and Category
dimensions, the following measures are the same:
SUM(SUM([Sales]) AMONG [Category])
andSUM(SUM([Sales]) WITHIN [City])
SUM(SUM([Sales]) AMONG [City], [Category])
andSUM(SUM([Sales]) TOTAL)
This option is provided only for convenience and is used when you don't know which dimensions the chart will be built across in advance, but you need to exclude certain dimensions from the window grouping.
Warning
The dimensions listed in AMONG
should be added to the chart sections. Otherwise, the chart returns an error.
Sorting
Some window functions support sorting, the direction of which affects the calculation value. To specify sorting for the window function:
 Specify dimensions or measures in the
ORDER BY
section.  In the chart, move the dimensions or measures to the Sorting section.
Dimensions and measures for sorting are first taken from the ORDER BY
section in the formula and then from the Sorting chart section.
Example
You need to calculate the change in the total sales amount (IncTotal
) for the entire period, from the earliest to the latest date. To do this, you can use the RSUM function sorted by the Date
dimension: RSUM(SUM([Sales]) TOTAL ORDER BY [Date])
.
Result for an example Line chart:
You'll get a similar result if you set the IncTotal
measure with the RSUM(SUM([Sales]) TOTAL)
formula and add the Date
dimension to the Sorting section.
Filtering
Function values in charts are calculated after applying filters across the dimensions and measures added to the Filters section. For window functions, you can override this order. To do this, specify the necessary dimensions or measures in the BEFORE FILTER BY
section of the formula. In this case, the function value is calculated before filtering is applied.
The calculation order is changed when you need to calculate the function value for the original dataset but the chart data is limited by the filter.
Example
You need to calculate the change in the total sales amount (IncTotal
) for the period from 17.01.2014
through 11.03.2014
. If you add a Date
filter and create the RSUM(SUM([Sales]) TOTAL ORDER BY [Date])
measure, the function is calculated only for the data limited by the filter:
To calculate a function for all the data, but only display the result for a certain period, you need to add the Date
dimension to the BEFORE FILTER BY
section: RSUM(SUM([Sales]) TOTAL ORDER BY [Date] BEFORE FILTER BY [Date])
.
Questions and answers
As an example, let's consider a line chart showing a plot of the change in total sales by date (see Selling table.). The cumulative total (IncTotal
) is calculated using the RSUM window function: RSUM(SUM([Sales]))
.
To display the change in the sales amount for each product category, add the Category
dimension to the Colors section.
After that, the chart displays a separate graph for each category but the totals are calculated incorrectly: for Furniture
, it's 49 instead of 19, for Office Supplies
, 91 instead of 52, for Technology
, 42 instead of 20. This is because the dimension in the Colors (Category
) section is included in the grouping the same way as the dimension in the X section (Date
). To calculate the amount correctly, you need to add the Category
dimension to the WITHIN
section or the Date
dimension to the AMONG
section: RSUM(SUM([Sales]) WITHIN [Category])
or RSUM(SUM([Sales]) AMONG [Date])
.
When adding a grouping (rounding) for a date in the chart, the original field is replaced with an automatically generated one. For example, when rounding to a month, the [Date]
dimension is replaced with a new field using the DATETRUNC([Date], "month")
formula. Because the original [Date]
field disappears from the list of chart dimensions, the window function it's used in no longer works. For the function to work correctly, you need to round the original [Date]
dimension in the formula using the DATETRUNC function.