Webinars

Five Things Every New ClickHouse® User Should Know (Part 1: Applications)

Recorded: July 22 @ 08:00 am PDT
Presenters: Robert Hodges

In this practical, beginner-friendly webinar, Altinity CEO Robert Hodges presents five concepts he considers essential for any developer building applications on ClickHouse®. The talk is framed as “five things I really like about ClickHouse and want to share,” combining genuine enthusiasm with hard-won operational wisdom.

The session opens with a brief overview of what ClickHouse is: a real-time analytic database combining open-source accessibility with the columnar, parallel, and vectorized query execution of modern analytical systems. Robert then works through the five lessons in sequence.

Lesson 1 covers the many ways to run ClickHouse, from a single-binary curl install to Docker, Docker Compose, the Altinity Kubernetes Operator for ClickHouse®, Altinity.Cloud, and managed cloud options.

Lesson 2 digs into what makes ClickHouse tables fundamentally different: the MergeTree engine, columnar storage, partitioning by time, sorting by tenant and key columns, and compression. Robert illustrates the difference by walking through a 196-million-row query that executes in under a second because only 17 MB of compressed column data needs to be read and parallelized across eight vCPUs.

Lesson 3 addresses the “too many parts” problem. Small, frequent inserts cause ClickHouse to spend enormous energy merging tiny parts, slowing queries and potentially triggering insert refusals. The solutions are streaming large files via the HTTP interface, enabling async inserts, and using Kafka with the Kafka table engine and a materialized view.

Lesson 4 explores ClickHouse joins in depth: the standard hash join, dictionaries as an in-RAM alternative to repeated right-side scans, query rewrites to move joins into subqueries, and the powerful technique of using conditional aggregation to avoid joins entirely for multi-entity data in a single table.

Lesson 5 explains why ClickHouse prioritizes speed over transactional guarantees and how that leads to the ReplacingMergeTree, which uses immutable inserts and eventual consistency to implement row-level deduplication and updates at scale. The FINAL keyword and version columns complete the picture.

The session closes with a Q&A on handling pre-1900 dates using the DateTime64 type and user-defined functions.

Here are the slides:

Key Moments (Timestamps)

Key moments generated with AI assistance.

  • 00:13 – Welcome, housekeeping, and what this talk is about
  • 01:32 – What is ClickHouse? Real-time analytics meets open-source SQL
  • 04:24 – Lesson 1: ClickHouse runs anywhere (Linux, Docker, Kubernetes, cloud)
  • 10:14 – Lesson 2: ClickHouse tables and what makes them different (partitioning, sorting, compression)
  • 22:28 – Lesson 3: ClickHouse likes big parts (async inserts, HTTP streaming, Kafka)
  • 29:49 – Lesson 4: ClickHouse joins (hash join, dictionaries, subquery rewrites, conditional aggregation)
  • 41:45 – Lesson 5: ClickHouse prioritizes speed (ReplacingMergeTree, FINAL, eventual consistency)
  • 49:11 – Overview of other MergeTree variants (SummingMergeTree, AggregatingMergeTree, CoalescingMergeTree)
  • 54:02 – Summary and wrap-up
  • 54:37 – Q&A: handling pre-1900 dates with DateTime64 and UDFs

Webinar Transcript

[00:13] – Welcome and Housekeeping

Robert: Welcome everyone to our webinar on five new things every new ClickHouse® user should know. We are going to be particularly focused on things you would like to know if you are building applications. My name is Robert Hodges. I am CEO of Altinity. Another title I thought about for this talk is “five things I really like about ClickHouse and want to share today.” I hope you will appreciate the enthusiasm about these different features.

Let us do a little bit of housekeeping. This talk is being recorded. You do not have to scramble to take notes. We will distribute a link to the recording as well as the slides within about 24 hours, probably less. You will get it in your mailbox.

We do have time for questions. You can put them into the Q&A box or into the chat. If things are relevant to the talk and pop up, I will probably just stop and answer them right there. Otherwise, we will have a bit of time at the end. My colleague Josh Lee, who is our dev advocate, will also be watching for questions and queuing them up.

Quick message from our sponsor: we are Altinity and we help you run open-source ClickHouse better. We have been around since 2017. We have hundreds of customers. Our two main offerings are Altinity.Cloud, which allows you to run open-source ClickHouse, practically any build, on any platform you want, and enterprise support. Want to run ClickHouse on Android phones? Give us a call. Want to run it in clusters on Azure with hundreds of servers? Give us a call.

One more thing: we are not ClickHouse Inc. It is very important to stress that we are a different company. We compete with ClickHouse Inc. at a business level, but we collaborate on code. We have submitted something like 600-plus merged PRs in ClickHouse. We are the authors of the Altinity Kubernetes Operator for ClickHouse®, Altinity Backup for ClickHouse®, and a number of other features you may already be using.


[01:32] – What Is ClickHouse?

Robert: For those of you who are new and wondering what ClickHouse® is, here is a quick introduction. It is a real-time analytic database. You can think of it as combining the best features of an open-source relational database like MySQL with an analytic database. From the MySQL side, you get things like SQL understanding, running anywhere, and being open source. From the analytic database side, you get storing data in columns, being able to use parallel and vectorized execution for very fast results, and the ability to scale to many petabytes. ClickHouse is open source under the Apache 2.0 license, which places virtually no restrictions on what you can do with it. Because of these properties, ClickHouse has become extremely popular, with about 41,000-plus stars on GitHub, roughly the same as Spark. It is used for all kinds of use cases, from observability to real-time marketing.


[04:24] – Lesson 1: ClickHouse Runs Anywhere

Robert: Lesson number one: ClickHouse runs anywhere. I mentioned running on Android phones. That is actually not a joke. Alexander Kuzmanov wrote a blog article back in 2020 where he did exactly that. Probably not something you want to do for production systems, but it really does run anywhere.

Single-binary install. One of the simplest ways to run ClickHouse is a one-liner: curl the site, pipe it to Bash, and it grabs the ClickHouse binary, drops it into your directory, and sets it to executable. You can just start it from there. It is super easy if you want to play around with ClickHouse. It is a little bit harder for beginners to configure, but if you know your way around ClickHouse configuration files, it is very convenient. You can watch the logs right there in the terminal.

Linux packages. If you are on Linux and want something more conventional, and in fact this is how many enterprise environments do it, you install using apt or RPM. Both packaging types are supported by ClickHouse, along with regular binaries. You have a couple of choices of builds. There are official builds directly from ClickHouse Inc., which appear every month with two long-term support releases per year and a year of support each. We at Altinity do what we call Altinity Stable® Builds for ClickHouse®, designed for people who want stability. We give a three-year tail on support, and we do not release them until we are very sure they are stable. In addition to fixing bugs over those three years, we backport features so you do not have to upgrade. These are a great option if you prefer to upgrade major versions every two years or so.

Containers. Containers are very popular because they are so simple. You can drop the container onto your desktop and run it with Docker. Here is an example of a docker run command pulling up an Altinity Stable build. Just run it and off you go. This is particularly nice on Mac OS where there are Mac OS builds for ClickHouse, but it is usually more convenient to just run it in Docker. You can also run it in Docker Compose, which is very common. ClickHouse itself uses Docker Compose heavily for testing, so it is extremely well tested as a development environment.

Kubernetes. If you want to run at scale, you can set it up on Kubernetes. The Altinity Kubernetes Operator for ClickHouse® allows you to, with a simple YAML resource definition, create a full cluster. We have many videos and examples of this. About 12 lines of YAML and you have a full cluster. The big advantage of Kubernetes for building systems is that it allows you to iterate through many different combinations of CPU, storage, replicas, and shards very quickly. You can set these up in a few minutes.

Cloud. Finally, if you are all about convenience, you can run ClickHouse in a cloud. Two major options: ClickHouse Cloud is the SaaS version with Snowflake-like convenience. Altinity.Cloud, which we run, has been around since 2020, runs on four different platforms, runs any version of open-source ClickHouse, and is focused on the bring-your-own-cloud model where ClickHouse runs in your own environment.

Those are five ways to run ClickHouse. Pick the right one for you and jump in.


[10:14] – Lesson 2: ClickHouse Tables and What Makes Them Different

Robert: The next thing we want to talk about is what is going on inside ClickHouse tables and particularly what makes them different from other databases you may have worked with. You want to think about partitioning, sorting, and compression, because these are what make ClickHouse column storage work.

Columnar storage and speed. ClickHouse tables are built to be fast. When you look at table definitions, you will see visible examples of this. You have ENGINE = MergeTree, the standard engine for high-performance analytics on very large data sets. Compression is just baked in: when you create a MergeTree table, it defaults to LZ4 compression. You do not have to do anything about it, but you may decide to optimize it later for higher compression levels. There is a PARTITION BY clause, which says to break up the table into pieces by year in this example. And there is an ORDER BY, which says within the table, what is the ordering of the rows we would like to have.

Why columnar storage is so fast. Let us illustrate with a query that finds airlines with the most cancellations in a year using the popular ontime data set. This query actually scanned about 196 million rows. It ran on a relatively small host with only a few vCPUs and came back in less than a second, scanning at about 300 million rows per second. How does ClickHouse do this?

In a row database like PostgreSQL or MySQL, when you read one column, you have to touch all the values for the other columns in that row. In ClickHouse, we flip that: each column has its own separate file. When you read, you only read the exact parts of disk that contain your column data and nothing else. Moreover, those columns are highly compressed. Columns are arrays of a single data type, which compresses much better than rows with heterogeneous data types.

Let us walk through concretely how much storage that query has to touch. In MySQL, with no good row compression and a row store, you would end up reading 61 GB of data off storage. In ClickHouse, because it is a column store, we drop most of the data immediately. This particular table has 109 columns, but we are reading only about 1.5% of them. Right away we are down to less than 1 GB of data, a factor of 60 reduction. Now add compression: those columns are not 937 MB, they are actually 17 MB. A tiny amount of data, readable in a fraction of a second. And in a query, we parallelize across eight vCPUs, each reading just 2 MB. Even a full table scan can be done extremely cheaply, which opens up a lot of doors for analysis.

Partitioning best practices. A really strong best practice across a wide range of data sets is to partition by time. In this example, we partition by year because the data set covers about 30 to 40 years and we end up with about half a million rows per partition. One thing to keep in mind when partitioning: try to end up with something like a thousand parts or fewer. ClickHouse works best at that count. You can go higher, but that means more files to open and it will slow down restarts and other operations.

Why is time-based partitioning so good? One reason is that for many data sets, roughly the same amount of data arrives over time. If you partition by something else, like carrier in the airline data set, you get very lumpy distributions ranging from a few hundred thousand rows to 30 million. This results in parts that are difficult to manage. Time partitioning tends to give you balanced parts naturally.

Sorting by tenant and key columns. Within the parts, ordering also matters a lot. If you are building a multi-tenant system, the question often comes up: should we partition by tenant? The answer in ClickHouse is generally no, because tenants tend to be lumpy. What you do instead is sort by them. If you imagine this airline data set as a multi-tenant system with airlines as tenants, you sort by carrier followed by flight date. That means all Southwest Airlines data will always be in a contiguous stretch of storage, making those queries vastly faster because you minimize the amount of storage you need to touch. In a multi-tenant system you will order by tenant first and then by other fields, typically in order of decreasing cardinality. The ORDER BY also builds a sparse index that helps locate data quickly, but the ordering is the key thing because that controls where your data lives on disk.

Compression. When you are starting out as a developer, you actually do not have to think about compression very much because ClickHouse has powerful compression capabilities that you can change at any time, even on very large tables that are online. Here is a cool trick called a TTL recompress. In this example, after 6 months, the table recompresses LZ4 data to ZSTD level 1. That gives faster queries and smaller size on disk. After 12 months, it recompresses to ZSTD level 10, getting even better on-disk compression. This lets you improve cost efficiency automatically as data ages.

ClickHouse’s system tables make this easy to monitor. There is a query against system.columns that shows you the exact compressed and uncompressed size of any column you want to inspect. There is also system.parts which tells you everything about parts, and system.tables for table-level information. Totally easy to find.

The partition scheme, the ordering, and compression are the top three things to think about in table design. One final reason to get them right early: once you have picked a partition scheme you cannot change it without restructuring the table. And there are limitations on changing ordering too. So think about them up front.


[22:28] – Lesson 3: ClickHouse Likes Big Parts

Robert: Let us look at another issue: parts. ClickHouse really likes big parts, and I will show you why in a moment. What we want to do is help ClickHouse get big parts as quickly as possible by inserting large chunks of data.

Why small inserts are a problem. Lots of small inserts can crush your server. Imagine a bunch of devices, maybe running Fluent Bit or OpenTelemetry collectors, blasting small chunks of data into ClickHouse from all directions. This causes two problems. First, when you are doing a query against newly arrived data, you have to scan a very large number of parts. Parts are expensive to scan. We want about a thousand if we can get it. Having many tiny ones can double, triple, or quadruple your query time. Second, because ClickHouse automatically merges data in the background to coalesce parts into efficiently queryable sizes, sending many 10 MB parts multiple times per second means ClickHouse ends up doing a huge amount of merge work. This bogs down your server. There is a famous “too many parts” error that can arise where inserts are creating too many parts at once and ClickHouse will simply start rejecting inserts. This is something you need to think about.

Fix 1: Stream large files via the HTTP interface. If you are loading data with your own application, just load big chunks. Put the data in big files. ClickHouse does not have a problem with very large files. Here is an example using the ClickHouse HTTP interface with curl, piping a large CSV file directly to ClickHouse. This is a very simple way to load large amounts of data because it streams through without using a lot of memory. Whether it is huge CSV files, Parquet files, or any other format, you can blast them up exactly like this. The bigger they are, the less merge work ClickHouse has to do.

Fix 2: Async inserts. A great and very handy ClickHouse feature is called async inserts, or asynchronous inserts. The basic idea is that if you have a bunch of clients doing small inserts, ClickHouse will wait until it has accumulated a certain amount of data or enough time has elapsed, and then it will take all the data from all the clients and write it at once. It is like group commit if you are familiar with relational databases. This feature allows us to make bigger parts even when clients send small batches.

The settings are a little tricky and you have to play around with them, but there are two or three parameters to set. You can set them on the insert statement itself, but that is painful. What we typically do is put them in a profile, which is a group of settings automatically applied for a particular account. In this example, a user called async has a profile that makes it wait about 10 seconds before committing. Async inserts used to be unstable in earlier releases, but now they are quite stable and are really the standard tool for handling this problem.

Fix 3: Kafka table engine. The third way is to use Kafka or something equivalent like Redpanda to buffer data from upstream producers before writing it to ClickHouse. Many systems already use Kafka because they are collecting from many different sources and want things completely decoupled. ClickHouse has good support for Kafka integration using the Kafka table engine. The Kafka table engine wraps a Kafka topic and makes it look like a table. Every time you do a SELECT off a Kafka table, you are actually reading from Kafka. You set up the Kafka table engine, then set up a materialized view that fires automatically whenever blocks of data appear in its source table, and then have a MergeTree table as the target. Every time something arrives in Kafka, it buffers up for say 10 seconds (configurable), then the materialized view automatically selects the data and puts it where it belongs. This is a very simple way to have this integration, and it scales very well.


[29:49] – Lesson 4: ClickHouse Joins

Robert: Let us talk about joins. I want to thank one of the folks who signed up for this talk and mentioned wanting to know more about joins in ClickHouse. Sometimes we hear “ClickHouse cannot do joins well.” That is actually not true. ClickHouse does joins. It is a SQL database just like any other. Here is an example using the airline data set where we join the ontime table with an airports table to expand airport IDs into full airport names. Those four lines of join SQL would work in practically any database, and they run fast.

Join types. ClickHouse has a huge number of join types. Between two tables, you have the standard inner, left, right, full outer, cross, and also semi-joins, anti-joins, and as-of joins. An as-of join, if you have worked with KDB+, is where you have an approximate key. An anti-join brings back rows only where there is no match in the join table. There is also an increasing number of join algorithms, at least six and growing, reflecting that this is a deep area of active research and roadmap investment.

What ClickHouse joins do not handle well. One important thing to know: with all these features, ClickHouse does not handle BI-style situations well where you might need to join a very large table of products with a large table of SKUs with a large table of customer sentiment information. ClickHouse does not have a query optimizer powerful enough to order those joins efficiently, and it does not have very well-developed ways of moving data around. These are things that are still hard. However, there is a lot of work going on and joins are constantly improving. When you hit this, you do have to put a little thought into it.

How the hash join works. If you do not specify anything, you get a hash join. In the join shown earlier, the left side is your big table. ClickHouse will first go fetch the right-side table or tables, filter that data, and put it into an in-memory hash table. Then it scans the left side table in parallel, looks up matching data in the hash table, and combines them. This is a very common join type and very fast, but one problem is that it requires memory, and it often has to load the same data into memory time after time after time for repeated queries.

Dictionaries as in-memory join tables. One of the innovations in ClickHouse is what is called a dictionary. A dictionary is basically a permanent in-RAM hash table. Instead of constantly scanning a right-side table and loading it into memory for every query, you load the dictionary once in a separate operation and then it is shared by everyone who needs to join on that table. This is great for reference tables, even fairly large ones if you have enough memory. Dictionaries are very easy to create in ClickHouse using SQL that looks very much like a table definition. Once created, you can join on it as if it were a table, with no repeated loading cost.

Subquery rewrites. There is another thing we often do in ClickHouse: restructure joins to reduce data scanning. The whole point of systems like observability platforms is to write SQL that returns results in a second or less. A conventional join will often get pushed down and processed as part of the scan of the large table, which means you are dragging around a bunch of extra data in your intermediate results. For example, airport names are being carried around throughout the scan. What we do instead is look at the join and, if the right side is relatively small, rewrite it as a subquery. The subquery processes the scan and minimizes the data, and then we join only on the results that come back. That dropped query time by a factor of four or five in this example. In the fullness of time this will probably be done automatically by the query analyzer, but for now you tend to do these rewrites yourself.

Conditional aggregation: avoiding joins entirely. There is a final, powerful technique that starts to make you think like a distributed systems engineer rather than just a relational database person. For very large tables, or when data is spread across many ClickHouse nodes, getting it all into a single place to join efficiently is incredibly expensive. The question is: how do we avoid joins in the first place?

Here is a device data example. You have devices reading temperatures, and you have restart times from the same device. You want to get aggregates of temperature readings aligned with restart times in the same row. Many databases would force you to join two separate tables or do a self-join, both of which require scanning the table twice. In ClickHouse, we can solve this differently using conditional aggregation.

Here we have a table with about a trillion rows of test data, where both reading data and restart data live in the same table with a type column. We use minIf, which says: take the min of this value if it is a reading row. This is not null handling; it is saying “only aggregate this if the row is of type reading.” The result is that we scan the data in a single pass and get the results we want without any join. This is really powerful. I learned this trick from one of our users who does it on trading data, but it can be used wherever you have multiple types of entities in a single table and can lean on aggregation.


[41:45] – Lesson 5: ClickHouse Prioritizes Speed (ReplacingMergeTree and Eventual Consistency)

Robert: The last lesson is that ClickHouse prioritizes speed and scalability over everything else. Want fast or want transactions? You are going to get fast. This is why ClickHouse has not spent a huge amount of time on transactions. Want BI joins or fast? ClickHouse prioritizes fast.

You can look at this as deficiencies in ClickHouse, but there is another way to look at it: this is a different way of thinking about large systems. When you are designing really large systems with very large data sets, it is simplest and most cost-efficient if as much as possible you can work with data that never changes. Changing things is expensive. Going to look up a row to delete it across 50 servers is just not an operation you want to do; it will slow you down. And if data is arriving very quickly and you want to accept it as quickly as possible, receiving that data and being completely consistent across the entire system at the same time is very expensive. This is why large systems tend to use some form of eventual consistency, accepting that at any given time nodes will see somewhat different views of data in exchange for much higher insert rates.

ReplacingMergeTree. ClickHouse leans into this and provides features that use it. The most common example is solving the problem of updating rows. Let us say you have a data set that is constantly receiving new data and from time to time a value needs to be changed. That kind of update is expensive if you have a very large data set. But we can solve this using immutable data and eventual consistency through a table type called ReplacingMergeTree.

ReplacingMergeTree automatically deduplicates rows that share the same ORDER BY key. Any row that arrives with a matching language ID, studio ID, and film ID, for example, is assumed to be the same row. You can also have a version column, which ClickHouse uses to decide which of the duplicate rows should be the winner. This is a classic approach to eventual consistency: you have a series of possible values and a mechanism to pick the one you actually want to show.

How it works under the covers. As these row values arrive, the first row gets inserted. Then new data with the same key arrives: it just gets inserted as a new row. Finally a third version arrives, maybe to mark it as deleted. When ClickHouse merges the blocks in the background, it finds these matching keys sorted together and picks the last one using the version field. The rows are replaced when the merges occur.

There is an important subtlety, however. If we naively select from this table, we might see multiple rows because the inserts went into different parts that have not merged yet. We need to do a little extra work to handle that.

The FINAL keyword. ClickHouse is designed for really big systems, and in really big systems there is no guarantee that data will ever fully merge, because parts at some point reach an optimum size and ClickHouse no longer merges them in the background. The solution is the FINAL keyword. When you add FINAL to a SELECT, it says: at read time, do an extra scan and make sure that within a partition there are not multiple copies of the data. Make sure they are fully merged before you run aggregations and present results to the user. So that is how immutable data and eventual consistency give us a way to do row-level updates and deduplication. For backfills, the same mechanism works: you insert the new values with higher version numbers and they win.


[49:11] – Overview of Other MergeTree Variants

Robert: ClickHouse uses this notion of being able to deduplicate or merge data based on the ORDER BY across several table types.

ReplacingMergeTree is what I just showed you: it keeps the latest version of any set of rows that match the ORDER BY key.

SummingMergeTree and AggregatingMergeTree work for aggregations: the ORDER BY becomes effectively your GROUP BY key. These merge tree types automatically compute sums or general aggregates as parts merge.

CollapsingMergeTree and VersionedCollapsingMergeTree are older variations on ReplacingMergeTree, still useful in some specific cases.

There is a really interesting newer table type called CoalescingMergeTree. It is similar to ReplacingMergeTree but solves a slightly different problem. Suppose you have metric data for the same device arriving at different times, each row containing only some of the metric values. You would like to get them all combined into one row for queries. That is exactly what CoalescingMergeTree does. It deduplicates, but when it deduplicates, it unions the data. One row inserted with only a value for field one, and another with only a value for field two, will be merged into a single row containing both fields. Combining this with the FINAL keyword ensures this happens at query time. This is a recently added table type and a great example of how ClickHouse keeps finding new ways to solve these consistency problems.


[54:02] – Summary

Robert: Here are the five things every new ClickHouse user building applications should know.

Lesson 1: There are many ways to run ClickHouse. Beyond developer convenience, pick the right option based on cost efficiency, how much management you want to handle, and where you want to run it. You can do a huge amount just on your laptop, particularly on Linux or WSL2, and then you have a full range of choices for production.

Lesson 2: Columnar data is what makes analytic systems fast, along with parallel execution and fast aggregates. Pay close attention to PARTITION BY, ORDER BY, and as your systems grow, compression. One word of warning: with a few hundred million rows of test data you may not see performance problems because ClickHouse is so fast. At a trillion rows, problems become very obvious. Think about partitioning and ordering up front.

Lesson 3: Make inserts as big as possible. This is true across analytic systems generally, but ClickHouse has some very specific problems you can avoid with big inserts. Async inserts are probably your number one tool for this when you cannot control client-side batching.

Lesson 4: Joins in ClickHouse are definitely different from other databases. There is a huge amount of roadmap work here and things are continuing to improve. Understand how joins work internally and think about alternatives like dictionaries and conditional aggregation, which can give you join-equivalent results in constant time with a single scan.

Lesson 5: Immutable data and eventual consistency are tools for building very large systems. ClickHouse has features that embrace them. Learning to use those features, and thinking about them as you design your own systems, will make your applications much more efficient at scale.


[54:37] – Q&A

Robert: We have a question: what is the best way to deal with pre-1900 dates and datetimes?

Machad: There is a DateTime64 type that uses 64 bits for storing datetime values. If you have dates like those, you should use that type.

Robert: Great, thank you. So Anthony has shared that DateTime64 now supports dates back to 1900. For dates further back than that, for example historical weather data, it gets tricky. One approach is to use a schema where you store a negative offset using intervals, but that loses some of the built-in date functions.

Machad: The best option if it can be done in SQL is to do so, because there are no external dependencies. If you have a custom conversion that cannot be expressed in SQL, you can use user-defined functions, or UDFs. ClickHouse UDFs can be written in Python, which is not necessarily super fast, or in C++, which is quite performant and relatively easy to integrate with ClickHouse. The key is that the UDF needs to handle the conversion logic and return something ClickHouse can work with as a DateTime64 plus some additional column indicating how many years or months to shift.

Robert: If you want to try out solutions for your specific historical data use case, feel free to join our Slack and post the question. We will have a look and see if we can come up with a good scheme.


FAQ Section

Q: What are the most important things to get right when designing a ClickHouse table schema?

A: The three most important things to decide up front are the PARTITION BY clause, the ORDER BY clause, and eventually compression. For partitioning, use time-based partitioning and aim for about 1,000 parts or fewer across your table. For ordering, put your most important filtering columns first, typically in order of increasing cardinality, which places tenant IDs and coarse grouping columns ahead of timestamps. Compression defaults to LZ4 and works well out of the box, but for large tables you can use TTL recompress to automatically apply higher compression codecs like ZSTD to older data. These decisions are hard to reverse after the fact, so getting them approximately right at the start saves significant restructuring work later.

Q: Why does ClickHouse perform so much better than row-oriented databases for analytics?

A: ClickHouse stores each column in its own separate file. When you run an analytical query that aggregates a few columns from a table with dozens of columns, ClickHouse only reads the exact column files it needs, typically a tiny fraction of the total data. Those column files are also highly compressed, because an array of a single data type compresses much more efficiently than rows mixing many different types. Finally, ClickHouse parallelizes reads and aggregations across all available CPU cores. The cumulative effect is that a query against 196 million rows might only require reading 17 MB of compressed data spread across eight CPU threads, each seeing just 2 MB, which is why sub-second results on large tables are routine.

Q: What is the “too many parts” problem and how do I avoid it?

A: Every ClickHouse insert creates one or more immutable data files called parts. ClickHouse merges these in the background into larger parts, but if inserts arrive faster than merges can keep up, the part count grows out of control. This slows queries dramatically because every part requires its own file operations, and ClickHouse will eventually start throttling or refusing inserts with a “too many parts” error. The three main solutions are: stream large batches via the HTTP interface, enable async inserts so ClickHouse buffers many small client inserts into a single write, and use the Kafka table engine with a materialized view to buffer data upstream before it hits ClickHouse.

Q: How does the ReplacingMergeTree handle row-level updates?

A: ReplacingMergeTree never actually updates rows in place. Instead, you insert a new version of the row with the same ORDER BY key and a higher version number. Both the old and new rows coexist in different parts until ClickHouse merges those parts in the background, at which point it keeps only the row with the highest version and discards the others. Since merges happen asynchronously and may never fully complete for very large parts, the FINAL keyword in your SELECT statement applies the deduplication logic at query time to guarantee you always see the correct latest version. This pattern of immutable inserts plus eventual consistency is far more efficient than traditional row-level updates for large analytical data sets.

Q: When should I use a dictionary instead of a regular join in ClickHouse?

A: Use a dictionary when you have a reference table that many queries need to join against and the data changes infrequently enough that you can reload it periodically. A regular hash join fetches the right-side table, decompresses it, and loads it into an in-memory hash table for every single query. A dictionary does that loading once and keeps the hash table in RAM permanently, shared across all users and queries. This eliminates the repeated loading cost for high-frequency queries. Dictionaries are easy to create in ClickHouse using SQL similar to a CREATE TABLE statement and can be used directly in joins or via the dictGet() function family.

Q: What is conditional aggregation and when should I use it?

A: Conditional aggregation uses functions like minIf, maxIf, sumIf, and countIf to compute aggregates over a subset of rows in a single table scan. It is particularly useful when you have multiple types of entities (such as temperature readings and device restart events) stored in the same table, and you want to produce a result that combines both in one row without needing a self-join. Rather than scanning the table twice and joining the results, a single GROUP BY with conditional aggregate functions produces the same result in one pass. This can be dramatically faster than joins for large tables and is especially valuable in distributed ClickHouse deployments where moving data for joins is expensive.


© 2026 Altinity, Inc. All rights reserved. Altinity®, Altinity.Cloud®, and Altinity Stable® are registered trademarks of Altinity, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc. Kubernetes, MySQL, and PostgreSQL are trademarks and property of their respective owners.

Join our Slack

ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc.

Related:

Leave a Reply

Your email address will not be published. Required fields are marked *