Yandex.Cloud
  • Services
  • Why Yandex.Cloud
  • Pricing
  • Documentation
  • Contact us
Get started
Yandex Database
  • Getting started
    • Overview
    • Create databases
    • Examples of YQL queries
    • Examples of operations in the YDB CLI
    • Launch a test app
    • Document API
    • Developing in NodeJS through the Document API
  • Step-by-step instructions
    • Database management
    • How to connect to a database
    • Table management
    • Reading and writing data
    • Working with secondary indexes
  • Working with the SDK
  • Concepts
    • Overview
    • Data model and schema
    • Serverless and Dedicated operation modes
    • Data types
    • Transactions
    • Secondary indexes
    • Time to Live (TTL)
    • Terms and definitions
    • Quotas and limits
  • Access management
  • Pricing policy
    • Overview
    • Serverless mode
    • Dedicated mode
  • Recommendations
    • Schema design
    • Partitioning tables
    • Secondary indexes
    • Paginated output
    • Loading large data volumes
    • Using timeouts
  • YDB API and API reference
    • Database limits
    • Handling errors in the API
  • Amazon DynamoDB-compatible HTTP API
    • API reference
      • All methods
      • Actions
        • BatchGetItem
        • BatchWriteItem
        • CreateTable
        • DeleteItem
        • DeleteTable
        • DescribeTable
        • GetItem
        • ListTables
        • PutItem
        • Query
        • Scan
        • TransactGetItems
        • TransactWriteItems
        • UpdateItem
      • Common errors
  • YQL reference guide
    • Overview
    • Data types
      • Simple
      • Optional
      • Containers
      • Special
    • Syntax
      • Unsupported statements
      • For text representation of data types
      • Expressions
      • CREATE TABLE
      • DROP TABLE
      • INSERT INTO
      • UPSERT INTO
      • REPLACE INTO
      • UPDATE
      • DELETE
      • SELECT
      • GROUP BY
      • JOIN
      • FLATTEN
      • ACTION
      • DISCARD
      • PRAGMA
      • DECLARE
      • OVER, PARTITION BY, and WINDOW
    • Built-in functions
      • Basic
      • Aggregate
      • Window
      • For lists
      • For dictionaries
      • For JSON
      • For structures
      • For types
    • Preset user-defined functions
      • HyperScan
      • Pcre
      • Pire
      • Re2
      • String
      • Unicode
      • Datetime
      • Url
      • Ip
      • Digest
      • Math
      • Histogram
    • For text representation of data types
  • YQL tutorial
    • Overview
    • Creating a table
    • Adding data to a table
    • Selecting data from all columns
    • Selecting data from specific columns
    • Sorting and filtering
    • Data aggregation
    • Additional selection criteria
    • Joining tables by JOIN
    • Data insert and update by REPLACE
    • Data insert and update by UPSERT
    • Data insert by INSERT
    • Data update by UPDATE
    • Deleting data
    • Adding and deleting columns
    • Deleting a table
  • Maintenance
    • Backups
  • Diagnostics
    • System views
  • Questions and answers
    • General questions
    • Errors
    • YQL
    • All questions on the same page
  • Public materials
  1. YQL reference guide
  2. Syntax
  3. OVER, PARTITION BY, and WINDOW

OVER, PARTITION BY, and WINDOW

    Window functions were introduced in the SQL:2003 standard and expanded in the SQL:2011 standard. They let you run calculations on a set of table rows that are related to the current row in some way.

    Unlike aggregate functions, using window functions doesn't cause rows to become grouped into a single output row: the number of rows in the results table always matches that in the source table.

    If a query contains both aggregate and window functions, grouping is performed and aggregate function values are calculated first. The calculated values of aggregate functions can be used as window function arguments (but not the other way around).

    Syntax

    General syntax for calling a window function is as follows

    function_name([expression [, expression ...]]) OVER window_name
    

    Here, window_name (window name) is an arbitrary ID that is unique within the query and expression is an arbitrary expression that contains no window function calls.

    In the query, each window name must be mapped to the window definition (window_definition):

    SELECT
        F1(...) OVER w1,
        F2(...) OVER w2,
        ...
    FROM `my_table`
    WINDOW
        w1 AS (window_definition_1),
        ...
        w2 AS (window_definition_2)
    ;
    

    Here, the window_definition is written as

    [ PARTITION BY (expression AS column_identifier | column_identifier) [, ...] ]
    [ ORDER BY expression [ASC | DESC] ]
    [ frame_definition ]
    

    You can set an optional frame definition (frame_definition) in one of two ways:

    • ROWS frame_begin
    • ROWS BETWEEN frame_begin AND frame_end

    The frame start (frame_begin) and frame end (frame_end) are set in one of the following ways:

    • UNBOUNDED PRECEDING
    • offset PRECEDING
    • CURRENT ROW
    • offset FOLLOWING
    • UNBOUNDED FOLLOWING

    Here, the frame offset is a non-negative numeric literal. If the frame end isn't set, the CURRENT ROW is assumed.

    There should be no window function calls in any of the expressions inside the window definition.

    Calculation algorithm

    If PARTITION BY is set, the source table rows are grouped into partitions which are then handled independently of each other.
    If PARTITION BY isn't set, all rows in the source table get in the same partition. If ORDER BY is set, it determines the order of rows in a partition.
    If ORDER BY is omitted, the order of rows in the partition is undefined.

    The frame_definition specifies a set of partition rows that fall into the window frame associated with the current row.

    In ROWS mode (the only one that YQL currently supports), the window frame contains rows with the specified offsets relative to the current row in the partition.

    • For example, if ROWS BETWEEN 3 PRECEDING AND 5 FOLLOWING is used, the window frame contains 3 rows preceding the current one, the current row, and 5 rows following it.

    The set of rows in the window frame may change depending on which row is the current one.

    • For example, for the first row in the partition, the ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING window frame will have no rows.

    Setting UNBOUNDED PRECEDING as the frame start means "from the first partition row" and UNBOUNDED FOLLOWING as the frame end — "up to the last partition row". Setting CURRENT ROW means "from/to the current row".

    If no frame_definition is specified, a set of rows to be included in the window frame depends on whether there is ORDER BY in the window_definition.
    Namely, if there is ORDER BY, then ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is implicitly assumed. If none, then ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

    Further, depending on the specific window function, it's calculated either based on the set of rows in the partition or the set of rows in the window frame.

    List of available window functions

    Implementation specifics

    • Functions calculated on the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING or ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW window frame are implemented efficiently (do not require additional memory and their computation runs on a partition in O(partition size) time).

    • For the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING window frame, you can choose the execution strategy in RAM by specifying the COMPACT hint after the PARTITION keyword.

      For example, PARTITION COMPACT BY key or PARTITION COMPACT BY () (if PARTITION BY was missing initially).

      If the COMPACT hint is specified, this requires additional memory equal to O(partition size), but then no extra JOIN operation is made.

    • If the window frame doesn't start with UNBOUNDED PRECEDING, calculating window functions on this window requires additional memory equal to O(the maximum number of rows from the window boundaries to the current row), while the computation time is equal to O(number_of_partition_rows * window_size).

    • For the window frame starting with UNBOUNDED PRECEDING and ending with N, where N is neither CURRENT ROW nor UNBOUNDED FOLLOWING, additional memory equal to O(N) is required and the computation time is equal to O(N * number_of_partition_rows).

    • The LEAD(expr, N) and LAG(expr, N) functions always require O(N) of RAM.

    Given the above, a query with ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING should, if possible, be changed to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW by reversing the ORDER BY sorting order.

    Examples:

    SELECT
        COUNT(*) OVER w AS rows_count_in_window,
        some_other_value -- access the current row
    FROM `my_table`
    WINDOW w AS (
        PARTITION BY partition_key_column
        ORDER BY int_column
    );
    
    SELECT
        LAG(my_column, 2) OVER w AS row_before_previous_one
    FROM `my_table`
    WINDOW w AS (
        PARTITION BY partition_key_column
    );
    
    SELECT
        -- AVG (like all aggregate functions used as window functions)
        -- is calculated on the window frame
        AVG(some_value) OVER w AS avg_of_prev_current_next,
        some_other_value -- access the current row
    FROM `my_table`
    WINDOW w AS (
        PARTITION BY partition_key_column
        ORDER BY int_column
        ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    );
    
    SELECT
        -- LAG doesn't depend on the window frame position
        LAG(my_column, 2) OVER w AS row_before_previous_one
    FROM `my_table`
    WINDOW w AS (
        PARTITION BY partition_key_column
        ORDER BY my_column
    );
    
    Language
    Careers
    Privacy policy
    Terms of use
    © 2021 Yandex.Cloud LLC