
Project Antalya 2026 Roadmap

Citius, Altius, Fortius (Faster, Higher, Stronger)

Project Antalya started internally a year ago and was first announced to the community in April 2025. We wanted to solve the problem of the high storage cost of big data sets and add compute-storage separation for users of open source ClickHouse®. Project Antalya is based on three simple ideas: Parquet on object storage as an open data format, Iceberg as an open table format, and compute swarms as an open execution model. We have been implementing those ideas in ClickHouse throughout 2025. What comes next in 2026?

Iceberg as a first-class citizen in ClickHouse

During 2025, Iceberg became the de facto standard for data lakes. There are strong reasons for that. Iceberg provides a truly universal interface to tables on object storage. It also has a transaction model that allows it to be used as a storage layer for a DBMS. And it is open, supported by hundreds of products and libraries.

While there was a lot of progress integrating Iceberg into ClickHouse in 2025, it is still far from perfect. Our goal is to make Iceberg tables as easy to use as MergeTree for storing and querying data. In particular, we want full CREATE TABLE functionality that supports partitioning, sorting, schema inference from a source table, and table settings. Other DDL commands are needed as well, such as DROP TABLE, TRUNCATE TABLE, and ALTER TABLE. The latter is especially intriguing to implement and test.
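To make the goal concrete, here is a sketch of what such DDL might look like. The IcebergS3 engine name exists in ClickHouse today, but the PARTITION BY / ORDER BY clauses for Iceberg, the bucket URL, and the table names are illustrative assumptions, not final syntax:

```sql
-- Illustrative sketch only: the final Antalya DDL may differ.
CREATE TABLE events_iceberg
(
    event_date Date,
    user_id    UInt64,
    payload    String
)
ENGINE = IcebergS3('https://bucket.s3.amazonaws.com/warehouse/events/')
PARTITION BY toYYYYMM(event_date)  -- roadmap item, not yet final
ORDER BY (event_date, user_id);

-- The other DDL commands named in the roadmap:
ALTER TABLE events_iceberg ADD COLUMN region String;
TRUNCATE TABLE events_iceberg;
DROP TABLE events_iceberg;
```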

The same is true for writing into Iceberg tables. While basic functionality was added to ClickHouse in Q3-Q4/2025, it still lacks the convenience of MergeTree tables. Full support should come in the coming months.
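A minimal sketch of the write path, assuming a hypothetical `events_iceberg` table and the basic INSERT support added in Q3-Q4/2025:

```sql
-- Append rows directly to an Iceberg table:
INSERT INTO events_iceberg VALUES ('2026-01-15', 42, 'click');

-- Or backfill from an existing MergeTree table:
INSERT INTO events_iceberg
SELECT event_date, user_id, payload FROM events_mergetree;
```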

Swarming around Data

Compute Swarms are among the best features of Project Antalya, and the most successful so far. Swarm nodes do not keep any data or schema, and therefore can be launched almost instantly and used to query Parquet data from Iceberg tables or object storage directly. We put a lot of effort into proper cache tuning and job distribution, and proved that ClickHouse with the Antalya Swarm extension can outperform MergeTree tables containing the same data.
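As an illustration, a swarm-backed scan of Parquet files might look like the sketch below. The cluster name `swarm`, the bucket URL, and the `object_storage_cluster` setting are assumptions drawn from Antalya examples and may differ in your deployment:

```sql
-- Fan the Parquet scan out to stateless swarm nodes.
SELECT
    toStartOfHour(ts) AS hour,
    count() AS events
FROM s3('https://bucket.s3.amazonaws.com/data/*.parquet')
GROUP BY hour
ORDER BY hour
SETTINGS object_storage_cluster = 'swarm';
```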

While the swarm computational model is extremely powerful, we want to make it even better. In particular, we plan to introduce native support for Standalone Swarms that can be used from any ClickHouse instance, not just Antalya builds. Other planned features include better support for joins. 

The bigger story is extending the swarm computing model beyond queries. We plan to add features that make it easy to use swarms for data transformations such as inserting new data and compacting Parquet files. These operations typically required heavyweight Spark jobs in the past, but with compute swarms they will be much easier and faster.

Tiered Storage and Hybrid Tables

When we started Antalya, we wanted to solve a problem for our users that sounded trivial: how to store a lot of data in the cloud in the most efficient way without sacrificing legendary ClickHouse performance. This problem is not easy to solve. Project Antalya approaches it by using Iceberg as a cold tier for ClickHouse MergeTree data. We have made two major additions to ClickHouse for that. ALTER TABLE EXPORT PART and EXPORT PARTITION allow exporting data from MergeTree to Iceberg. Hybrid Tables provide an interface to query the MergeTree and Iceberg tiers as a single table.
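A hedged sketch of how the two features fit together. The table names are hypothetical, and the exact EXPORT syntax may differ from what ships:

```sql
-- Push a cold partition from the MergeTree tier to Iceberg:
ALTER TABLE events_mergetree
    EXPORT PARTITION '202501' TO TABLE events_iceberg;

-- The Hybrid Table then answers queries across both tiers:
SELECT count()
FROM events_hybrid
WHERE event_date >= '2025-01-01';
```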

We are continuing to invest in both features to ensure that they can operate at scale and support transparent, transactional movement of data between MergeTree and Iceberg. We aim to have this complete by Q3 2026.

Performance, Security and Other Features

Project Antalya gained a lot of interest from ClickHouse users and the community. Engineers started to try it out for various use cases and discovered new applications. That generated user requests, which we always take seriously. One example from 2025 is how Antalya Compute Swarms can be used as a query engine for S3 table buckets, a new AWS service.

We plan more features like this. We have already tested support for Snowflake Polaris. The next step is to extend Antalya to use Google Metastore, Unity, and other popular catalogs. There are very few efficient execution engines for data lakes, and data is often abandoned because there is no easy way to query it. ClickHouse query speed and the scalability of Antalya swarms will convert such data swamps into data jets!
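For illustration, attaching an external REST catalog could look like the sketch below. The `DataLakeCatalog` database engine and `catalog_type` setting follow current experimental ClickHouse conventions, while the endpoint, warehouse, and table names are made up:

```sql
-- Attach a Polaris (Iceberg REST) catalog as a database:
CREATE DATABASE polaris_lake
ENGINE = DataLakeCatalog('https://polaris.example.com/api/catalog')
SETTINGS catalog_type = 'rest', warehouse = 'analytics';

-- Browse and query catalog tables like local ones:
SHOW TABLES FROM polaris_lake;
SELECT count() FROM polaris_lake.`sales.orders`;
```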

Another important focus is security. The security model for data lakes is very convoluted. We need to support OAuth for accessing catalogs and respect the security models configured via catalogs. ClickHouse RBAC and row-level access policies allow fine-grained access control over sensitive data. We plan to apply this efficiently to Iceberg as well.
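The building blocks already exist in ClickHouse; applying them to Iceberg-backed tables is the roadmap item. A sketch with hypothetical table and role names:

```sql
-- Standard ClickHouse RBAC, here targeting a lake table:
CREATE ROLE analyst;
GRANT SELECT ON lake.events TO analyst;

-- Row-level policy: analysts only see EU rows.
CREATE ROW POLICY eu_only ON lake.events
    FOR SELECT USING region = 'EU' TO analyst;
```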

Project Antalya Full Throttle

2025 was the incubation year for Project Antalya. It proved that all three ideas (Parquet, Iceberg, and Swarms) are feasible technologies for delivering a capable real-time data lake engine. In 2026, Antalya builds will mature into a production-grade technology suitable for a wide range of use cases. Data sizes are growing, and Project Antalya is the right technology to keep up with this growth.

Project Antalya is an open source project based on ClickHouse that everybody can use and contribute back to. We do not develop it just for ourselves. We believe that open source, open storage, and an open query execution model are the desired foundation for every user. When it is open, there is no vendor lock-in.

The more detailed Project Antalya 2026 roadmap is publicly available. If you know C++ and want to develop some cool stuff, give us a call. We welcome contributions and new community members. Join us!


ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc.
