ClickHouse Window Functions — Current State of the Art

Window functions have long been a top feature request in ClickHouse. Thanks to excellent work by Alexander Kuzmenkov, ClickHouse now has experimental support, and users can begin to try them out. Naturally we did the same at Altinity. In fact, we tested the functionality extensively to find what currently works, what doesn’t, and what is not supported.

The result of this work was a set of tests that we have developed and added to ClickHouse repository. In addition to just writing tests, we have, as always, described the functionality that we test using requirements that help us understand and manage the process of testing each feature. In this article, we will take a comprehensive look at window functions, including an intro for new users, detailed syntax, and a summary of current limitations. For this article, we used the latest stable ClickHouse version 21.5.5.12.

Note: Our friends at Tinybird.co also blogged about window functions recently. You can expect more community articles and talks as this long-awaited feature becomes fully available in stable releases.

Window What?

Window functions allow users to perform calculations on data rows based on some relation of these rows to the current row inside a so-called window. Therefore, with window functions you always have to keep the current row in mind. A simple example of using window functions is a calculation of a moving average. For example, we can easily calculate a moving average for the last four values in a series by using ROWS frame and applying the avg aggregate function over it.

SELECT
    number,
    avg(number) OVER (ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS mv4
FROM values('number Int8', 1, 2, 3, 4, 5, 6, 7, 8)
┌─number─┬─mv4─┐
│      1 │   1 │
│      2 │ 1.5 │
│      3 │   2 │
│      4 │ 2.5 │
│      5 │ 3.5 │
│      6 │ 4.5 │
│      7 │ 5.5 │
│      8 │ 6.5 │
└────────┴─────┘

Note that the output before number 4 is not valid as there is not enough data.

In addition the current row, there is also a concept of peers of the current row that applies to RANGE frames. Peers of the current row are the rows that fall into the same sort bucket if the ORDER BY clause is specified. We can observe this behavior in the following example.

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC RANGE BETWEEN CURRENT ROW AND CURRENT ROW)
FROM numbers(1, 3)
┌─number─┬─sum(number) OVER (ORDER BY number ASC RANGE BETWEEN CURRENT ROW AND CURRENT ROW)─┐
│      1 │                                                                                1 │
│      2 │                                                                                2 │
│      3 │                                                                                3 │
└────────┴──────────────────────────────────────────────────────────────────────────────────┘

A detailed explanation might help. What we see above is that a window function is evaluated for each result row. Because we don’t have any PARTITION BY clause it means we only have one window. Now, because we have a RANGE frame clause it means for each row the window function evaluates a corresponding frame for that row. The frame is always sliding with the current row inside a window. The frame is specified as BETWEEN CURRENT ROW AND CURRENT ROW which means each frame only includes current row peers. Because we have an ORDER BY clause that assigns each value into its own sort bucket, then we have the case where current row is the same as current row peers. Therefore, each frame only includes the value of the current row which is passed to the sum function that in turn returns the value of the current row for each frame.

If no ORDER BY clause is specified, then all rows in the window partition are considered to be the peers of the current row as the query below illustrates this.

SELECT
    number,
    sum(number) OVER (RANGE BETWEEN CURRENT ROW AND CURRENT ROW)
FROM numbers(1, 3)
┌─number─┬─sum(number) OVER (RANGE BETWEEN CURRENT ROW AND CURRENT ROW)─┐
│      1 │                                                            6 │
│      2 │                                                            6 │
│      3 │                                                            6 │
└────────┴──────────────────────────────────────────────────────────────┘

What we can see is that now the window function returned the same result for each row. Why? Simple: we don’t have any ORDER BY clause and all rows in the result are considered to be the peers of the current row. While the frame is still sliding inside a window with each current row, the frame for each row always contains all the rows.

In general, a window can either include all rows in the result set or result set can be partitioned into separate windows using the PARTITION BY clause. So when no PARTITION BY clause is specified then all the rows fall into one big partition, also known as a window. Rows in each window then can be sorted using the ORDER BY clause. The window analogy comes in handy when you try to understand this functionality. Just create a mental image of a window that allows you to see either all the rows in the result set or just some chunk of them as the concept diagram below shows.

Now within a window you can also define a frame using a frame clause. The frame clause can either be defined in terms of the ROWS or the RANGE relationship to the current row. Nifty!

Of course, those who have already worked with window functions in either MySQL or PostgreSQL will find the syntax and functionality to be familiar and the basic explanation above unnecessary. However, if you are not familiar with window functions, then they can initially be hard to understand. Thus we find our first limitation of window functions in ClickHouse; the current ClickHouse documentation does not go into many details and so if you need to get familiar with window functions you have to use either documentation from PostgreSQL or MySQL. Some examples and explanations of the expected behavior can also be found in our requirements.

The Mini Ninja-course For Defining Windows

Windows can either be named or inline defined using the OVER clause. Named windows are defined using WINDOW clause which is defined as

WINDOW window_name AS (window_spec)
    [, window_name AS (window_spec)] ..

where window_name is a unique name for the specific window and window_spec is defined as

window_spec:
  	[partition_clause] [order_clause] [frame_clause]

where the partition_clause is

partition_clause:
  	PARTITION BY expr [, expr] ...

and the order_clause is

order_clause:
    ORDER BY expr [ASC|DESC] [, expr [ASC|DESC]] ...

and the frame_clause is

frame_clause:
    {ROWS | RANGE } frame_extent

and the frame_extent is

frame_extent:
    {frame_start | frame_between}

and frame_start is

frame_start: {
    CURRENT ROW
  | UNBOUNDED PRECEDING
  | UNBOUNDED FOLLOWING
  | expr PRECEDING
  | expr FOLLOWING
}

and frame_between is

frame_between:
    BETWEEN frame_start AND frame_end

and frame_end is

frame_end: {
    CURRENT ROW
  | UNBOUNDED PRECEDING
  | UNBOUNDED FOLLOWING
  | expr PRECEDING
  | expr FOLLOWING
}

oh, and again it is important to keep in mind that CURRENT ROW in the frame_start or frame_end for RANGE frames actually means peers of the current row. Again, the peers of the current row are any rows that fall into the same sort bucket defined by the ORDER BY clause, if any.

Let’s also not forget the OVER clause that is defined as

OVER ()|(window_spec)|named_window 

where the special case of the OVER () is used to define a window that contains all the rows.

Don’t you love it? We do, but if you are not familiar with window functions then please resist the urge to hit something with a window or a window with a frame before digesting the summary above. It all makes sense once you start using the window functions! Also, don’t forget that some frame_between clauses might not be valid as you can define something that has a start before the end. Don’t do that unless you want to receive an error as we will see in the examples below.

What About The Function Part?

After one or more windows are defined a function can be applied over them. There are two types of functions that we can apply to windows. First type are the so-called window native functions such as:

  • row_number
  • first_value
  • last_value
  • rank
  • dense_rank
  • lead (not currently supported but there is a workaround)
  • lag (not currently supported but there is a workaround)

These functions can only be used over the windows. The second type are regular aggregate functions and ClickHouse supports applying over the windows all the standard aggregate functions such as min, max, avg etc. as well as ClickHouse specific aggreate functions such as argMin, argMax, avgWeighted etc.

Note that there are special aggregate functions such as initializeAggregation, aggregate function combinators, and parametric aggregate functions that we haven’t tested using over windows.

There are also some aggregate functions that were not easy to set up for testing as they required arrays so they were skipped. If you are interested, you can find the full list of aggregate functions tested with windows in tests/testflows/window_functions/tests/aggregate_funcs.py.

Test Coverage

In addition to stateless window function tests, we at Altinity have also developed an additional test suite using our QA process. Given that window functions have a reference implementation, as a base, in our testing we have compared ClickHouse window functions’ behavior against PostgreSQL regression tests. We could not use these tests as-is, as there are enough differences between ClickHouse and PostgreSQL that require manual changes to test queries. Nonetheless, we have ported as many queries as we could to make sure ClickHouse’s window functions implementation returns the same results as in PostgreSQL.

We have also added additional test cases to provide better coverage for the ROWS and RANGE frame clauses including error corner cases. In addition, we have modified some tests to work when reference tables are sharded between a three-node cluster accessed using a Distributed table. This was needed because when the table is sharded the order of rows changes and to make test work with both distributed and non-distributed reference tables correct ORDER BY clauses had to be added.

Overall, with the requirements that we have defined, ClickHouse satisfies 93% of them. Here is a summary of the coverage report.

SRS019 ClickHouse Window Functions
  130 requirements (121 satisfied 93.1%, 7 unsatisfied 5.4%, 2 untested 1.5%) 

If you are interested, you can find our tests and requirements in the tests/testflows/window_functions folder inside ClickHouse’s GitHub repository.

Potential Problems With Distributed Tables

For distributed tables, the current implementation forces window function calculation to be performed on the initiator node. This means that window function calculation is not spread out between shards but instead each shard has to send its data to the initiator to perform calculations. If the amount of data is large, then the initiator node can be left receiving and processing a lot of data with the potential of running out of memory.

Additionally, there are still bugs that need to be ironed out for distributed tables, for example issue 23902. This issue seems to be a regression from 21.5.5.12 in the latest master and hopefully can be resolved before the next official 21.6 release.

Documented Unsupported Features

The EXCLUDE Clause

The EXCLUDE clause is not supported and is not recognized by the parser. For example, the following query will produce an error.

SELECT 
    sum(unique1) OVER (ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE NO OTHERS) AS sum,
    unique1, 
    four 
FROM tenk1 
WHERE unique1 < 10
Syntax error: failed at position 68 ('exclude')

The GROUPS Frame

The GROUPS frame is also not supported right now. The parser does recognize it and a not implemented error is returned. For example,

SELECT
    sum(unique1) OVER (ORDER BY four ASC GROUPS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
    unique1,
    four
FROM tenk1
WHERE unique1 < 10
Code: 48. DB::Exception: Received from localhost:9000. DB::Exception: Window frame 'GROUPS' is not implemented (while processing 'ORDER BY four ASC GROUPS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW').

No INTERVAL Support For Date And DateTime In RANGE Frames

The INTERVAL syntax is not supported for Date and DateTime types in the RANGE frame offsets. The workaround is just to specify the offset as a positive integer. For example, if you have a Date type column and you need a range frame being 1 year preceding and 1 year following the date of the current row then you can try to do the following.

SELECT
    sum(salary) OVER (ORDER BY enroll_date ASC RANGE BETWEEN 365 PRECEDING AND 365 FOLLOWING) AS sum,
    salary,
    enroll_date
FROM empsalary
ORDER BY empno ASC

The gotcha is that for Date type the integer value specifies the number of days and for DateTime type the value specifies the number of seconds. So, if you have DateTime column and you need 1 year interval you can try to do the following.

SELECT
    id,
    f_timestamp,
    first_value(id) OVER w AS first_value,
    last_value(id) OVER w AS last_value
FROM datetimes
WINDOW w AS (ORDER BY f_timestamp ASC RANGE BETWEEN 31536000 PRECEDING AND 31536000 FOLLOWING)
ORDER BY id ASC

The problem with the workarounds above is that it really does not work well in practice as calculation of date time intervals is much more complex. A simple example below shows what happens when we try to add one year to 2020-02-28 using fixed number of days.

SELECT toDate('2020-02-28') + 365

┌─plus(toDate('2020-02-28'), 365)─┐
│                      2021-02-27 │
└─────────────────────────────────┘

SELECT toDate('2020-02-28') + toIntervalYear(1)

┌─plus(toDate('2020-02-28'), toIntervalYear(1))─┐
│                                    2021-02-28 │
└───────────────────────────────────────────────┘

The obvious problem above is that the number of days in each year is not constant.

Lack of lag(value, offset) And lag(value, offset) Functions

The current lack of proper support for lead(value, offset) and lag(value, offset) functions is also unfortunate. The workaround for lag(value, offset) is to use

any(value) over (.... rows between <offset> preceding and <offset> preceding)

and for lead(value, offset) use

any(value) over (.... rows between <offset> following and <offset> following)

Therefore, where instead of just using lead(ten), with the default value of offset being 1, you have to write the following:

SELECT
    any(ten) OVER (PARTITION BY four ORDER BY ten ASC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS lead,
    ten,
    four
FROM tenk1
WHERE unique2 < 10

Oh, and don’t forget that if you have NULLs and you don’t want to get the default value instead of a proper NULL value you have to use anyOrNull function as follows:

SELECT
    anyOrNull(ten) OVER (PARTITION BY four ORDER BY ten ASC ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS lead,
    ten,
    four
FROM tenk1
WHERE unique2 < 10

Expressions Involving Window Functions

Expressions using window functions is also currently not supported. To illustrate, you can have a query as follows.

SELECT count(*) OVER (PARTITION BY four ORDER BY ten ASC) + sum(hundred) OVER (PARTITION BY four ORDER BY ten ASC) AS cntsum
FROM tenk1
WHERE unique2 < 10

If you do try it, you will get a very long error message.

Code: 47. DB::Exception: Received from localhost:9000. DB::Exception: Unknown identifier: count() OVER (PARTITION BY four ORDER BY ten ASC); there are columns: less(unique2, 10), 10, 10, unique2, four, ten, hundred: While processing count() OVER (PARTITION BY four ORDER BY ten ASC) + sum(hundred) OVER (PARTITION BY four ORDER BY ten ASC) AS cntsum

Other Unsupported Features

RANGE Frame With Named Window

The RANGE frame with named window can’t be used. For instance, if you define a named window and you try to use it with RANGE frame it will not work.

SELECT 
    * 
FROM (
    SELECT 
        sum(unique1) OVER (w range BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS sum,
        unique1, 
        four 
    FROM tenk1 
    WHERE unique1 < 10 
    WINDOW w AS (order by four)
)
ORDER BY unique1

You instead get a lengthy syntax error.

Code: 62. DB::Exception: Syntax error: failed at position 42 ('w'): w range between current row and unbounded following) AS sum,unique1, four FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four)) ORDER BY unique1 FORMAT Tab. Expected one of: GROUPS, ClosingRoundBracket, ORDER BY, PARTITION BY, ROWS, RANGE, token

RANGE Frame Data Types And Offsets

The RANGE frame offsets also work only for floats and integer data types. Nullable or other data types are not supported. So, if you try using a Decimal column in the RANGE frame as below

SELECT
    id,
    f_numeric,
    first_value(id) OVER w AS first_value,
    last_value(id) OVER w AS last_value
FROM numerics
WINDOW w AS (ORDER BY f_numeric ASC RANGE BETWEEN 1 PRECEDING AND 1 FOLLOWING)

you will be reminded that Decimal type is not supported with RANGE frame.

DB::Exception: The RANGE OFFSET frame for 'DB::ColumnDecimal<DB::Decimal<long> >' ORDER BY column is not implemented

Also, RANGE frame is not supported for Nullable columns. The example query below

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC RANGE BETWEEN CURRENT ROW AND 1 FOLLOWING) AS sum
FROM values('number Nullable(Int8)', 1, 1, 2, 3, NULL)

will return the following error.

DB::Exception: The RANGE OFFSET frame for 'DB::ColumnNullable' ORDER BY column is not implemented

Additionally, you can’t use a sub-query for offsets. If you have a query like

SELECT 
    sum(unique1) OVER (ORDER BY unique1 ROWS (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING) AS sum,
    unique1 
FROM tenk1 
WHERE unique1 < 10

you will also get a syntax error.

DB::Exception: Syntax error: failed at position 49 ('('): (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING) AS sum, unique1 FROM tenk1 WHERE unique1 < 10

Similarly, column values can’t be used in range offsets. Thus, the following query will not work.

SELECT 
    any(ten) OVER (PARTITION BY four ORDER BY ten ROWS BETWEEN four PRECEDING AND four PRECEDING) AS lag,
    ten,
    four
FROM tenk1 
WHERE unique2 < 10

Instead, you will again get a syntax error.

DB::Exception: Syntax error: failed at position 67 ('four'): four PRECEDING AND four PRECEDING) AS lag , ten, four FROM tenk1 WHERE unique2 < 10

Other Things To Keep In Mind

RANGE Frame Offsets Must Be Positive

Don’t forget that negative offsets can’t be used. Therefore, if you try to execute a query that has a negative offset as below

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC RANGE BETWEEN CURRENT ROW AND -1 FOLLOWING) AS sum
FROM values('number Float64', 1, 1, 2)

you will get the following error.

DB::Exception: Window frame start offset must be nonnegative, -1 given

ORDER BY Clause Required For RANGE Frame

When you specify a RANGE frame the ORDER BY clause is required. In comparison, for the ROWS frame it is not. For example, if you use RANGE frame without ORDER BY clause like so

SELECT
    number,
    sum(number) OVER (RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS sum
FROM values('number Float64', 1, 1, 2)

you will get an error.

DB::Exception: The RANGE OFFSET window frame requires exactly one ORDER BY column, 0 given

Note that the ORDER BY clause can use only one column. If we modify the query above and try to specify more than one column then you will receive an error.

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC, number ASC RANGE BETWEEN 1 FOLLOWING AND 2 FOLLOWING) AS sum
FROM values('number Float64', 1, 1, 2)
DB::Exception: The RANGE OFFSET window frame requires exactly one ORDER BY column, 2 given

Frame Range Must Be Valid

Not all frame ranges that can be defined are valid. Any frame that has start offset which is before the end offset is not valid. For example, this query

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC RANGE BETWEEN 2 FOLLOWING AND 1 FOLLOWING) AS sum
FROM values('number Float64', 1, 1, 2)

will return an error.

DB::Exception: Received from localhost:9000. DB::Exception: Frame start offset 2 FOLLOWING does not precede the frame end offset 1 FOLLOWING

RANGE Frame Corner Case

There is a corner case for RANGE frame that will not work as expected. For example,

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC RANGE BETWEEN 0 FOLLOWING AND CURRENT ROW) AS sum
FROM values('number Int8', 1, 1, 2, 3)

seems like a valid query with a valid range where one would expect that the 0 FOLLOWING start of the frame would be treated as the CURRENT ROW. That is not the case, and instead you will get the following error.

DB::Exception: Window frame 'RANGE BETWEEN 0 FOLLOWING AND CURRENT ROW' is invalid

To fix this just use CURRENT ROW as the start frame instead. The fun part is that if you say 0 PRECEDING for the start frame it will work and the result will be the same as if you specified RANGE BETWEEN CURRENT ROW AND CURRENT ROW.

SELECT
    number,
    sum(number) OVER (ORDER BY number ASC RANGE BETWEEN 0 PRECEDING AND CURRENT ROW) AS sum
FROM values('number Int8', 1, 1, 2, 3)
┌─number─┬─sum─┐
│      1 │   2 │
│      1 │   2 │
│      2 │   2 │
│      3 │   3 │
└────────┴─────┘

Conclusion

In this article, we have provided a summary of the syntax to define SQL windows and frames in ClickHouse. Along with the syntax we have given numerous examples of queries. We have highlighted the testing that we at Altinity have performed to ensure that window functions behave correctly. We also looked at unsupported features and the corresponding workarounds as well as other small things that you might run into when using window functions.

Overall, windows functions, while still in experimental stage, are maturing very fast. We invite you to use this exciting new feature in your work with ClickHouse and report any issues that you find on GitHub.

Share