ClickHouse RoadMap 2019

Dec 17, 2018

The year 2018 is coming to an end. It has been a great year for ClickHouse and the ClickHouse community: a lot of events, new features and interesting projects. Now it is time to see what is next. The ClickHouse development team, led by Alexey Milovidov, has unveiled some of its plans and allowed us to share them with you.

There is still some time left before the New Year, and new features can still arrive. Rumor has it that the next release will be published on December 31st, though it may be ready earlier. The following features are planned for it:

HDFS import/export via table functions
Parquet file format support for importing/exporting data. That makes it easier to integrate ClickHouse with the Hadoop ecosystem.
Column-level compression/encoding. The initial release will include lz4, zstd and delta encoding. Double delta, Gorilla and blosc algorithms are to be released later.
Ability to add new columns to the MergeTree storage engine index. This is especially useful for Summing/Aggregating MergeTree tables, which require all non-aggregated columns to be in the index.
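To illustrate why delta encoding helps with columns such as timestamps, here is a minimal Python sketch of the general idea. It is not ClickHouse's implementation; the function names `delta_encode` and `delta_decode` are made up for this example. Each value is stored as the difference from the previous one, so a slowly growing sequence turns into small numbers that a general-purpose compressor such as lz4 or zstd can pack more tightly.

```python
def delta_encode(values):
    """Keep the first value as-is, then store each value as the
    difference from its predecessor (illustrative helper)."""
    if not values:
        return []
    encoded = [values[0]]
    for prev, cur in zip(values, values[1:]):
        encoded.append(cur - prev)
    return encoded


def delta_decode(deltas):
    """Invert delta_encode by accumulating a running sum."""
    decoded = []
    total = 0
    for d in deltas:
        total += d
        decoded.append(total)
    return decoded


# Example data: Unix timestamps a few seconds apart (made-up values).
timestamps = [1545000000, 1545000001, 1545000003, 1545000006]
encoded = delta_encode(timestamps)
restored = delta_decode(encoded)
```

After the first element, the encoded column holds only the small gaps between neighbouring timestamps, and decoding restores the original values exactly.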

The first major releases of 2019 will bring the following integration extensions:

Amazon S3 import/export via table functions
Dictionaries as first-class citizens defined with common ‘CREATE TABLE’ SQL syntax

Security and fine-grained access control are highly desired by many companies, and ClickHouse will properly support them in Q1/2019:

Table, column and row level security
Role-based access control (RBAC) model
Pluggable external authentication (LDAP, Kerberos)

MergeTree is the core ClickHouse technology and it will be improved further for even better performance and usability. Q1-Q2/2019 plans include:

Adaptive index granularity for MergeTree tables
Secondary index structures (min/max, bloom filter)
Using index for better ORDER BY / GROUP BY performance

A lot of work has already been done this year on improving ClickHouse support for SQL joins. It will continue in Q2-Q3/2019, both in terms of SQL standard compliance and better performance. That includes:

Multi-table joins
Merge join for big tables
Bucket-shuffle algorithm for distributed joins
ASOF joins for time series data
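For readers unfamiliar with ASOF joins, the following Python sketch shows the general matching rule: each left-hand row is paired with the most recent right-hand row whose timestamp is not later than its own. This is a generic illustration, not ClickHouse syntax or its planned implementation; the function name `asof_join`, the tuple layout and the sample data are all assumptions made for the example.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Pair each (ts, value) row in `left` with the latest row in `right`
    whose timestamp is <= ts. `right` must be sorted by timestamp.
    Rows with no earlier match get None (illustrative helper)."""
    right_ts = [ts for ts, _ in right]
    joined = []
    for ts, left_value in left:
        # Index of the first right-hand timestamp strictly greater than ts.
        i = bisect_right(right_ts, ts)
        right_value = right[i - 1][1] if i > 0 else None
        joined.append((ts, left_value, right_value))
    return joined


# Made-up time series: trades matched against the latest known quote.
quotes = [(2, 100.0), (7, 101.5), (10, 102.0)]
trades = [(5, "t1"), (10, "t2"), (1, "t0")]
matched = asof_join(trades, quotes)
```

An exact-equality join would match only the trade at timestamp 10; the as-of rule also matches the trade at timestamp 5 with the quote from timestamp 2, which is what makes it useful for time series whose timestamps rarely line up.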

Resource pools and support for multiple storage volumes were planned for 2018 but delayed in favor of other features. Those are still in the plan for Q2-Q3/2019 with resource pools coming first:

Resource pools (fine-grained CPU, memory, network, RAM allocation)
Layered storage HDD/SSD for cold/hot data
JBOD storage support

ClickHouse has sometimes been criticized for its limited support of geospatial data structures. We cannot expect it to be as feature-rich as PostGIS, but some extensions for geospatial applications are planned for Q3/2019, though priorities may change and they may appear earlier:

Geohash support
Polygonal dictionaries

Among the other things that the ClickHouse development team plans to work on, we would like to highlight two in particular:

Advanced algorithms for searching strings, making ClickHouse more full-text-search-friendly
Machine learning algorithms as aggregate functions. That opens up a lot of possibilities, so we are eager to see how it works.

This is just a list of projects that the core development team is going to work on. There are many community contributors who add significant features to ClickHouse as well. Altinity is going to be active there too: we have several ClickHouse projects and code contributions planned for 2019 that will make ClickHouse easier and safer to use.

Stay tuned!

2 Comments

According to the roadmap published on the Yandex site: https://clickhouse.yandex/docs/en/roadmap/ it sounds as if RBAC and external authentication is slated for Q3 2019, rather than Q1… Any more word on this scheduling?

Alexander Zaitsev says:

15th February 2019 at 5:58 am

We recently checked it with them, the target is still Q1 or early Q2.
