Webinars

Five Things Every New ClickHouse® User Should Know (Part 2: Admin)

Recorded: August 20 @ 08:00 am PDT
Presenters: Robert Hodges & Tatiana Saltykova

In this webinar — a companion to the previous month’s developer-focused session — Robert Hodges (CEO, Altinity) and Tatiana Saltykova (ClickHouse® Administration Instructor, Altinity) cover five essential ClickHouse® administration topics for new users. The session walks through: (1) choosing and understanding your deployment environment — Kubernetes with the Altinity Kubernetes Operator for ClickHouse®, bare-metal/VM deployments, and managed clouds; (2) understanding replication, Keeper vs. ZooKeeper, replication queues, and mutation management; (3) schema changes — what can and can’t be changed online, partition key design, and the mutations queue; (4) upgrade strategies — LTS vs. monthly releases, Altinity Stable® Builds for ClickHouse®, rolling upgrades via the operator, and API change risk; and (5) system tables — a tour of the richest diagnostic toolset in any database, including system.query_log, system.parts, system.replication_queue, and more. The session closes with Q&A on multi-tenant schema design, lost replica recovery, and settings recommendations.

Here are the slides:

Key Moments (Timestamps)

AI helped write this. Mistakes may happen.

  • 00:00 – Welcome, housekeeping, and introductions
  • 01:29 – What is ClickHouse®? Quick overview for newcomers
  • 03:35 – Topic #1: Choose and understand your environment
  • 03:48 – Kubernetes deployment: EKS, Prometheus, Grafana, the Altinity Operator
  • 06:34 – Operators explained: custom resources and reconciliation
  • 10:10 – Running ClickHouse® on bare metal or VMs
  • 13:49 – Managed clouds: Altinity.Cloud® vs. running it yourself
  • 15:38 – Topic #2: Replication — how it works and how to manage it
  • 16:04 – Sharding and replication patterns
  • 19:06 – ReplicatedMergeTree, ZooKeeper paths, and cluster definitions
  • 22:35 – ZooKeeper vs. Keeper: which should you use?
  • 25:17 – How insert replication works step by step
  • 27:00 – Replication queue: system.replication_queue and what to do when it backs up
  • 28:27 – Topic #3: Schema changes — what’s mutable and what isn’t
  • 32:17 – Mutations: the mutation queue and system.mutations
  • 32:32 – Topic #4: Upgrade strategies
  • 33:08 – Leading edge vs. trailing edge: two upgrade philosophies
  • 36:19 – ClickHouse® release schedule: monthly builds vs. LTS
  • 37:01 – Altinity Stable® Builds for ClickHouse®: what they are and when to use them
  • 38:00 – Upgrade commands and the Altinity upgrade guide
  • 40:38 – Rolling upgrades and canary updates
  • 42:57 – Rolling upgrades with the Altinity Kubernetes Operator for ClickHouse®
  • 47:31 – Topic #5: System tables — ClickHouse®’s best feature
  • 47:38 – Monitoring tables: system.metrics, system.events, system.asynchronous_metrics
  • 49:08 – Diagnosing I/O and storage: system.parts, system.tables, system.columns
  • 50:11 – system.processes and history tables: system.part_log, system.query_log
  • 51:00 – CPU and merge diagnostics: system.merges
  • 53:38 – Example queries: compression ratios and resource-heavy query patterns
  • 57:26 – Summary and key takeaways
  • 59:47 – Q&A: multi-tenant settings, lost replica recovery

Transcript


[00:00–01:28] – Welcome, Housekeeping, and Introductions

Robert: Hi everybody, and welcome to our webinar on five things every new ClickHouse® user should know — focusing today on administration. In the webinar last month, we talked about development, so this is the companion session.

A bit of housekeeping before we dive in. This is being recorded. We’ll send you a link to the recording and the slides within a few hours of the webinar ending, so no need for frantic notes. We also have time for questions — lots of them. My colleague Tatiana Saltykova is here with me. She was responsible for many of these slides, teaches our administration course, and runs production ClickHouse® clusters. She is an authority on everything we’re going to discuss today. Drop your questions into the Q&A box or the webinar chat, and we’ll get to as many as we can — we may even go a few minutes over if needed.

My name is Robert Hodges. I’m CEO of Altinity. I’ve been working with ClickHouse® since January 1st, 2019, and it’s my favorite database.

Also joining us as a panelist: Josh Lee, our developer advocate, who does regular talks on ClickHouse® and is here helping answer questions.

A quick word on who Altinity is. We are a vendor for ClickHouse®. We have two main offerings. The first, which we started with back in 2017, is enterprise support — 24×7 support for organizations that run ClickHouse® themselves, from embedded systems to massive clusters. That’s about half our customers. The other half runs on Altinity.Cloud®, our managed ClickHouse® service that can run in our account or yours. We are not the same as ClickHouse, Inc. — we are, in fact, competitors, though we collaborate extensively on making the open source code better.


[01:29–03:34] – What Is ClickHouse®?

Robert: For the ~2% of you who may be newer to this: ClickHouse® is a real-time analytic database. You can think of it as combining the best features of a traditional database — SQL support, open source, runs anywhere — with the hallmarks of an analytic database: shared-nothing architecture, columnar storage, parallel vectorized execution, and the ability to scale to many petabytes. We have customers ingesting up to 20 petabytes of uncompressed data per day into ClickHouse®. It’s incredibly popular.

Today we won’t be covering what makes it popular — we’ll be focusing on what you need to do to run it and make it work well for your users.


[03:35–15:37] – Topic #1: Choose and Understand Your Environment

Robert: Whatever environment you choose to run ClickHouse® in, you need to understand it inside and out. ClickHouse® can literally run anywhere — someone even wrote a blog post a few years ago about running it on a mobile phone. Let’s cover the main options.

Kubernetes (the most common production pattern)

For teams building SaaS or large-scale environments, Kubernetes has become the dominant deployment model. Here’s the standard stack:

  • A managed Kubernetes service — EKS (Amazon), GKE (Google), AKS (Azure) — as the base
  • Prometheus and Grafana for monitoring and observability
  • The Altinity Kubernetes Operator for ClickHouse® to manage the ClickHouse® lifecycle
  • ClickHouse® server processes and Keeper (more on that shortly) running as pods
  • Terraform for cloud infrastructure setup, plus Helm, raw manifests, or Argo CD for Kubernetes management

If you’re using EKS, we have Terraform blueprints that set up the entire stack — Kubernetes, storage classes, the operator, and a sample ClickHouse® cluster. It takes about 20 minutes for EKS to come up, and then you’re ready to go. Most people take these blueprints and tweak them for their own environment.

Understanding operators:

If you’re running on Kubernetes, using an operator is really important. Operators extend Kubernetes by allowing you to define new kinds of resources. In the case of ClickHouse®, there’s a resource called ClickHouseInstallation. If you want to change the number of replicas, you just change the number in the manifest and the operator performs the change safely and correctly — following best practices automatically.

To make operators work efficiently, you need to understand how Kubernetes allocates resources in the background:

  • Cluster Autoscaler — automatically spins VMs up and down as Kubernetes needs capacity
  • CSI drivers — manage storage provisioning (persistent volume claims → persistent volumes)
  • IAM roles and permissions — EKS needs appropriate roles to allocate and manage resources

This is the part of Kubernetes that’s genuinely complex. If you know it, you’ll be fine. If you don’t, you’ll need to learn it or consider a different deployment model.

Here’s an example of a basic ClickHouse® installation defined for the operator — the core definition is only a few lines of YAML, plus optional pod and volume templates:

apiVersion: "clickhouse.altinity.com/v1"
kind: ClickHouseInstallation
metadata:
  name: my-cluster
spec:
  configuration:
    clusters:
      - name: my-cluster
        layout:
          shardsCount: 2
          replicasCount: 2
    zookeeper:
      nodes:
        - host: clickhouse-keeper
  templates:
    podTemplates:
      - name: default
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:24.3.5.46.altinitystable
    volumeClaimTemplates:
      - name: data
        spec:
          resources:
            requests:
              storage: 100Gi

To upgrade ClickHouse®, you change that image tag. The operator handles the rest — rolling upgrade, draining queries, zero downtime. More on that in the upgrade section.

Bare metal and VMs

Many organizations — particularly financial services — run ClickHouse® directly on racked hardware or on VMs they allocate in the cloud. The ClickHouse® packages support ARM, Intel, AMD, and even PowerPC architectures and are easy to install via apt or yum.

Key things to understand when running directly on a host:

  • Standard directory structure: configuration in /etc/clickhouse-server/, data in /var/lib/clickhouse/, logs in /var/log/clickhouse-server/
  • Configuration files: config.xml controls server operation (ports, system tables, etc.); users.xml defines built-in users
  • Never modify config.xml or users.xml directly. Instead, add your customizations to config.d/ (for server settings) and users.d/ (for user definitions). These overlay files are applied on top of the base configuration — and when you upgrade, they won’t be overwritten
  • Verify processed configuration: You can view the fully merged configuration at runtime to debug anything that’s not working as expected
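The merged result of the base files plus your overlays is written to /var/lib/clickhouse/preprocessed_configs/ on disk. On recent ClickHouse® releases you can also inspect effective settings from SQL; a minimal sketch (note that system.server_settings only exists in newer versions):

-- Server-level settings that differ from the compiled-in defaults
SELECT name, value
FROM system.server_settings
WHERE changed;

-- Session-level settings changed for the current user/profile
SELECT name, value
FROM system.settings
WHERE changed;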

Using the official packages is strongly recommended over installing binaries manually — the packages handle directory setup, permissions, and configuration file management correctly, and they handle upgrades gracefully.

Managed clouds

A large number of ClickHouse® users run on managed clouds, either Altinity.Cloud® or ClickHouse Cloud. The key trade-off to understand: with a managed cloud you give up some raw performance and cost efficiency in exchange for convenience. That said, the total cost of a managed cloud — including the engineering time you don’t spend managing ClickHouse® — can actually be lower than self-managed, especially at a smaller scale.

If you’re operating at scale with stable workloads and data center expertise, running ClickHouse® on racked hardware with 128 cores and 256 GB of RAM will deliver extremely high performance at lower unit cost. There’s no single right answer — it depends entirely on what your business needs.


[15:38–28:26] – Topic #2: Replication — How It Works and How to Manage It

Robert: Replication is probably the single most important feature of open-source ClickHouse®. You need to understand how it works.

Sharding and replication patterns

ClickHouse® supports three main cluster patterns:

| Pattern | Description | Use case |
|---|---|---|
| Sharded and replicated | Large table split across shards, each shard has 2+ replicas | Most common pattern for large tables — increases both throughput and availability |
| Replicated only | All servers have a full copy of the table | Small reference/dimension tables you join on (e.g., business unit names) |
| All-sharded | Each server has its own copy, but queries span all copies | System tables — allows you to query across all nodes from one place |

Defining a replicated table

The engine is ReplicatedMergeTree. The path argument tells ClickHouse® where to store replication metadata in Keeper/ZooKeeper:

-- Replica 1 of Shard 1
CREATE TABLE ontime ON CLUSTER my_cluster
(FlightDate Date, Carrier String, …)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/shard_1/ontime', 'replica_1')
PARTITION BY toYear(FlightDate)
ORDER BY (Carrier, FlightDate);

If two replicas share the same path, they’re replicas of the same shard. If paths differ, they’re different shards. That’s the entire mechanism.
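In practice, most teams avoid hard-coding those strings by using macros: each server substitutes its own values, so one CREATE TABLE statement works on every node. A sketch, assuming the common {shard} and {replica} macro convention:

-- Same definition on every node; each server expands {shard} and {replica}
-- from its own macros configuration
CREATE TABLE ontime ON CLUSTER my_cluster
(FlightDate Date, Carrier String)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/ontime', '{replica}')
PARTITION BY toYear(FlightDate)
ORDER BY (Carrier, FlightDate);

-- See what the macros expand to on this server
SELECT * FROM system.macros;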

Cluster definition

The cluster is defined in remote_servers.xml (by convention):

<remote_servers>
  <my_cluster>
    <shard>
      <replica><host>ch-01.internal</host><port>9000</port></replica>
      <replica><host>ch-02.internal</host><port>9000</port></replica>
    </shard>
    <shard>
      <replica><host>ch-03.internal</host><port>9000</port></replica>
      <replica><host>ch-04.internal</host><port>9000</port></replica>
    </shard>
  </my_cluster>
</remote_servers>

You define whatever shard/replica topology you want. ClickHouse® just uses what you say.
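To confirm that ClickHouse® parsed the topology the way you intended, query system.clusters; this sketch assumes the cluster name from the example above:

-- One row per replica in each defined cluster
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'my_cluster';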

ZooKeeper vs. Keeper — which should you use?

ClickHouse® originally used Apache ZooKeeper for consensus — tracking cluster membership, part ownership, and replication tasks. About four to five years ago, the ClickHouse® team wrote Keeper — a C++ reimplementation of the ZooKeeper API bundled directly into the ClickHouse® binary.

Which should you use?

  • Running ZooKeeper now, and it works? Keep it. Not worth the migration.
  • Starting a new system? Use Keeper. It’s the future — actively developed, easier to maintain, and increasingly required for new ClickHouse® features (like the S3Queue engine) that depend on Keeper-specific functionality.

Long-term, Keeper is the strategic direction. ZooKeeper development has slowed significantly, and bug fixes are harder to get. Keeper can be fixed and extended by both Altinity and the ClickHouse® core team.

How insert replication works

  1. A client inserts data into one ClickHouse® server
  2. That server writes the part to local storage
  3. The server registers the new part in Keeper
  4. Other replicas are watching Keeper for new part notifications
  5. They see the new part, fetch it from the original server over port 9009 (the internal interserver replication port), and write it to their own storage

This works 99.8% of the time without any intervention. Keeper nodes should run on dedicated hosts, away from ClickHouse® servers — they need low disk latency, dedicated CPU, and no interference from other workloads.
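If you want to see that bookkeeping yourself, the system.zookeeper table exposes the Keeper/ZooKeeper tree through SQL. A quick peek, assuming the replication path from the earlier example (the table requires an exact path in the WHERE clause):

-- List the replicas registered under the table's replication path
SELECT name, value
FROM system.zookeeper
WHERE path = '/clickhouse/tables/shard_1/ontime/replicas';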

The replication queue

Replication operations (adding parts, merging parts) are processed as an ordered queue per table. If something gets stuck — for example, a part that can’t be replicated due to a disk error or a mismatch — it will block everything behind it in the queue for that table.

To investigate:

-- Count items in the replication queue per table
SELECT database, table, count()
FROM system.replication_queue
GROUP BY database, table
ORDER BY count() DESC;

-- Look at what's stuck
SELECT database, table, type, create_time, last_exception
FROM system.replication_queue
WHERE last_exception != ''
ORDER BY create_time;

Most replication problems resolve on their own once the underlying cause is addressed. The usual cure is: identify what’s causing the overload, stop doing it, and wait. ClickHouse® is very robust about clearing the queue once pressure is relieved.
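Alongside the queue itself, system.replicas gives a per-table health summary. A minimal sketch; the thresholds here are arbitrary, so tune them to your environment:

-- Replicas that are read-only, lagging, or accumulating queue entries
SELECT database, table, is_readonly, absolute_delay,
       queue_size, inserts_in_queue, merges_in_queue
FROM system.replicas
WHERE is_readonly OR absolute_delay > 60 OR queue_size > 100
ORDER BY absolute_delay DESC;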

Warning for developers: Don’t constantly refresh entire table contents. In some databases (MySQL, PostgreSQL), it’s common to truncate and reload a report table nightly. In ClickHouse®, doing this on replicated tables — especially multiple tables at once — generates enormous replication traffic and can severely back up the queue.


[28:27–32:31] – Topic #3: Schema Changes and the Mutations Queue

Robert: ClickHouse® databases aren’t static — schemas change over time. Here’s what you need to know.

What can and can’t be changed easily

Two settings in MergeTree tables are effectively permanent:

  1. PARTITION BY — Once set, you cannot change the partition key without copying all data to a new table. Get this right. As a rule of thumb, partition by time — this produces evenly sized partitions. Partition by month for most time-oriented tables; adjust based on your data retention window.
  2. ORDER BY — The sort order within parts is critical for query performance in ClickHouse®. You can extend the ORDER BY by appending newly added columns at the end, but you can’t restructure the existing ordering without rewriting data.

If you need to change either of these, you’ll need to copy the data to a new table — that’s painful, so get them right from the start. See the Altinity KB on schema design for detailed guidance.
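If you do end up changing one of them, the copy normally follows a create, load, swap pattern. A sketch with hypothetical keys and paths:

-- New table with the corrected partition key and sort order
CREATE TABLE ontime_new AS ontime
ENGINE = ReplicatedMergeTree('/clickhouse/tables/shard_1/ontime_new', 'replica_1')
PARTITION BY toYYYYMM(FlightDate)
ORDER BY (Carrier, FlightDate);

-- Copy the data (for very large tables, do this partition by partition)
INSERT INTO ontime_new SELECT * FROM ontime;

-- Swap names so applications see the new table
RENAME TABLE ontime TO ontime_old, ontime_new TO ontime;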

Online schema changes (lightweight)

Most schema changes in ClickHouse® are online and nearly instantaneous:

-- Adding a column — almost instant
ALTER TABLE ontime ADD COLUMN TailNum String;

-- Dropping a column — almost instant
ALTER TABLE ontime DROP COLUMN TailNum;

-- Adding an index — lightweight
ALTER TABLE ontime ADD INDEX my_idx (Carrier) TYPE minmax GRANULARITY 1;

These are so fast that you’re unlikely to notice any impact on running queries.

Heavyweight schema changes and mutations

Some changes require rematerializing data — for example, changing a column’s data type. These generate mutations, which are processed through a queue:

-- Check active mutations
SELECT database, table, mutation_id, command, create_time, is_done
FROM system.mutations
WHERE NOT is_done
ORDER BY create_time;

Mutations are applied part by part across your table. For large tables, this can take considerable time. Combined with active replication traffic, a heavy mutation load can slow down your system. Monitor system.mutations whenever you’re making heavyweight schema changes.
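If a mutation is stuck or was issued by mistake, you can cancel it. A sketch with a hypothetical mutation_id (look up the real one in system.mutations):

-- Cancel a specific mutation
KILL MUTATION
WHERE database = 'default' AND table = 'ontime'
  AND mutation_id = '0000000042';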


[32:32–47:30] – Topic #4: Upgrade Strategies

Robert: Upgrades are probably a DBA’s least favorite topic — they’re where things can go really wrong. Let’s talk about how to do them well in ClickHouse®.

Two upgrade philosophies

ClickHouse® is a massive open-source project with several hundred unique contributors per year. It changes fast. That creates two distinct upgrade strategies:

Leading edge (always current):

  • Upgrade to each new monthly release shortly after it comes out
  • Benefit: you get the latest features and bug fixes immediately (e.g., active data lake integration work requires the very latest builds)
  • Risk: monthly releases can have bugs, API changes, performance regressions, and breaking changes to SQL behavior
  • Required: fast automation to test and roll back quickly if problems arise
  • Best for: teams with strong engineering culture, heavy automation, and dependency on cutting-edge features

Trailing edge (LTS-based):

  • Upgrade only to Long-Term Support releases, which come out twice a year (typically March and August)
  • Test thoroughly against your actual application before upgrading
  • Less frequent upgrades, but each one is well-validated
  • Best for: large production applications with complex SQL that would be expensive to regression-test frequently

The API change problem

ClickHouse® deliberately prioritizes speed and new features over API stability. Aggregate function signatures change. SQL behavior changes. New settings appear; old ones are deprecated. For large applications, API changes are often the biggest upgrade risk — not crashes or data loss, but subtle behavioral differences that break application queries.

For trailing-edge upgrades, a rigorous testing approach is critical. Some organizations go even further: spin up new ClickHouse® servers running the new version, replicate data from the old servers, test the application against the new cluster, and then cut over at a controlled time.

ClickHouse® release schedule

| Release type | Frequency | Bug fix support window |
|---|---|---|
| Monthly releases | Monthly | 3 months |
| LTS releases | Twice yearly (March, August) | 12 months |
| Altinity Stable® Builds for ClickHouse® | After each LTS | 3 years |

Altinity Stable® Builds for ClickHouse®

Altinity Stable® Builds for ClickHouse® are based on LTS releases, published approximately 3 months after the upstream LTS appears (after additional hardening), and maintained for 3 years. They’re designed for organizations that want the trailing-edge upgrade strategy — confident, infrequent upgrades with a long support window.

These builds are open source and free to download. Bug fixes for the Altinity Stable® build require a support contract (similar to the Red Hat model, but without restricting access to the repo).

Upgrade commands (bare metal / VM)

# Pin the target version
CLICKHOUSE_VERSION=24.3.5.46

# Upgrade the three packages
sudo apt-get install \
  clickhouse-client=$CLICKHOUSE_VERSION \
  clickhouse-server=$CLICKHOUSE_VERSION \
  clickhouse-common-static=$CLICKHOUSE_VERSION

# This does NOT restart ClickHouse automatically.
# Restart when you're ready:
sudo systemctl restart clickhouse-server

The apt install step installs the new software without restarting the server — giving you control over when the actual upgrade takes effect. See the Altinity upgrade guide (by Dmitry Baf) for detailed plans and common pitfalls.

Rolling upgrades with the Altinity Kubernetes Operator for ClickHouse®

This is one of the most powerful features of running ClickHouse® on Kubernetes with the Altinity operator. To trigger a rolling upgrade across your entire cluster, you just change the image tag in your CHI manifest:

spec:
  templates:
    podTemplates:
      - name: default
        spec:
          containers:
            - name: clickhouse
              image: altinity/clickhouse-server:24.8.5.10.altinitystable  # ← change this

The operator then:

  1. Identifies the version change
  2. Upgrades one server as a canary — if it fails, stops there
  3. If the canary succeeds, it proceeds across all shards in parallel
  4. For each replica being upgraded: removes it from the load balancer, removes it from remote_servers, waits for running queries to drain, performs the upgrade, then re-adds it in reverse order

You get a fully safe, zero-downtime rolling upgrade from a one-line YAML change. This is one of the most compelling reasons to use the Altinity Kubernetes Operator for ClickHouse® in production.

Canary upgrades outside Kubernetes

Even without Kubernetes, you can do canary upgrades. Update one replica first and monitor it. If versions are too far apart, replication may not work between mixed versions — but for adjacent versions or LTS-to-LTS upgrades, mixed-version clusters are generally fine for short periods.
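During a canary or rolling upgrade it’s worth confirming which version each node actually runs. One way to check from SQL, assuming a cluster named my_cluster:

-- Report the running version on every replica in the cluster
SELECT hostName() AS host, version() AS version
FROM clusterAllReplicas('my_cluster', system.one)
ORDER BY host;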

In rare cases — typically when checksum calculation algorithms change — you may need to upgrade all nodes simultaneously. This is documented in the release notes when it applies (Tatiana estimates this happens once every one to two years). Checksum mismatches during a rolling upgrade are typically transient — they resolve once all nodes are on the same version.

Tatiana: Yes, exactly. It’s only a problem while you have mixed versions in the cluster. It resolves itself after the upgrade completes. There are also rare cases where metadata format changes make large version jumps (e.g., 2020 to 2025) problematic — but these are uncommon and well-documented.


[47:31–57:25] – Topic #5: System Tables — ClickHouse®’s Best Feature

Robert: I love system tables. They are the single best feature in ClickHouse® — better than any other database I’ve worked with. There are so many of them it would take hours to cover in depth, so I’ll give you a tour of the most important ones.

Monitoring tables

These are the foundation of your observability stack:

| Table | Contents |
|---|---|
| system.metrics | Current values — active queries, connections, memory usage |
| system.events | Monotonic counters — total inserts, queries executed, etc. |
| system.asynchronous_metrics | Background-updated metrics — memory usage, uptime, etc. |

The Altinity Kubernetes Operator for ClickHouse® automatically exports these in Prometheus format from port 8888, ready to scrape into Prometheus and visualize in Grafana. This gives you time-series monitoring — queries per second, insert rates, memory pressure — with no additional configuration.
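Even without Prometheus you can spot-check these tables directly from SQL. A few illustrative probes; the metric names shown are a small, arbitrary selection:

-- Current activity
SELECT metric, value FROM system.metrics
WHERE metric IN ('Query', 'TCPConnection', 'MemoryTracking');

-- Cumulative counters since startup
SELECT event, value FROM system.events
WHERE event IN ('Query', 'SelectQuery', 'InsertedRows');

-- Background-collected values such as uptime
SELECT metric, value FROM system.asynchronous_metrics
WHERE metric = 'Uptime';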

I/O and storage diagnostics

-- Find tables with too many parts (a common performance problem)
SELECT database, table, count() AS part_count
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY part_count DESC
LIMIT 20;

-- Check compression at the column level
SELECT
    database,
    table,
    name AS column,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 2) AS ratio
FROM system.columns
ORDER BY data_compressed_bytes DESC
LIMIT 20;

A table with 100,000 parts is almost certainly a performance problem. system.parts is the first place to check when queries are slow.

Processes and query history

-- What's running right now?
SELECT query_id, user, elapsed, query
FROM system.processes
ORDER BY elapsed DESC;

For history, two tables you should always have enabled:

  • system.part_log — Records everything that happened to parts: inserts, merges, downloads, removals. Vital for understanding ingestion throughput and block sizes.
  • system.query_log — Records every query executed: what SQL was run, how long it took, how many rows and bytes were read, memory used, and much more. This is one of your most important diagnostic tools.
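As a starting point, here’s a simple sketch that pulls the slowest recent queries out of system.query_log (the time window and limit are arbitrary):

-- Ten slowest queries in the last hour, with basic resource usage
SELECT event_time, query_duration_ms, read_rows,
       formatReadableSize(memory_usage) AS memory, query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 1 HOUR
ORDER BY query_duration_ms DESC
LIMIT 10;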

CPU and merge diagnostics

-- What merges are currently active?
SELECT database, table, elapsed, progress,
       formatReadableSize(total_size_bytes_compressed) AS size
FROM system.merges
ORDER BY elapsed DESC;

Merges are a major consumer of CPU and I/O. If your system is slow, check whether merges are the bottleneck.

Example: Find your compression ratio by table

SELECT
    database,
    table,
    formatReadableSize(sum(data_compressed_bytes)) AS compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
FROM system.columns
GROUP BY database, table
ORDER BY sum(data_compressed_bytes) DESC;

Run this to quickly check compression levels across your database. If a table compresses poorly, consider adding codecs or changing compression settings — both can be done online.
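For example, switching a poorly compressing column to ZSTD is a single online ALTER. A sketch with a hypothetical column; the new codec applies to newly written parts, and rewriting old parts is optional and expensive:

-- Change the column's codec; takes effect for new parts immediately
ALTER TABLE ontime MODIFY COLUMN Carrier String CODEC(ZSTD(3));

-- Optionally force existing parts to be rewritten with the new codec
OPTIMIZE TABLE ontime FINAL;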

Example: Find resource-heavy query patterns (from Mika Filimonov)

SELECT
    normalized_query_hash,
    any(query) AS sample_query,
    count() AS query_count,
    avg(query_duration_ms) AS avg_duration_ms,
    sum(ProfileEvents['RealTimeMicroseconds']) AS total_cpu_microseconds,
    sum(read_bytes) AS total_bytes_read
FROM cluster('all_replicas', system.query_log)
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 1 DAY
GROUP BY normalized_query_hash
ORDER BY total_cpu_microseconds DESC
LIMIT 20;

Key features of this query:

  • normalized_query_hash — hashes structurally similar queries (same pattern, different argument values) to the same value, letting you see query patterns rather than individual queries
  • ProfileEvents — a map column in system.query_log with detailed per-query resource usage including CPU microseconds, S3 requests, network bytes, and more
  • cluster(‘all_replicas’, system.query_log) — runs this across every node in your cluster simultaneously, giving you a cluster-wide view rather than a single server’s history

This is one of the most powerful queries for identifying performance bottlenecks across a production cluster. See the Altinity KB on query_log for more examples.

Example: Trace a distributed query across all subqueries

-- Find all subqueries spawned from a specific distributed query
SELECT
    query_id,
    hostName() AS host,
    query_duration_ms,
    ProfileEvents['S3ReadMicroseconds'] AS s3_read_us,
    ProfileEvents['ReadBufferFromS3Bytes'] AS s3_bytes
FROM cluster('all_replicas', system.query_log)
WHERE initial_query_id = 'your-query-id-here'
  AND type = 'QueryFinish'
ORDER BY host;

When a distributed query runs, it spawns subqueries on each shard. This lets you trace what each shard actually did — useful for diagnosing S3 read patterns, uneven shard load, or unexpected query behavior.


[57:26–59:46] – Summary: Five Key Takeaways

Robert: Here’s a quick summary of the five areas we covered:

1. Environment: ClickHouse® can run anywhere. Choose based on what your business actually needs. Kubernetes is great for most teams. Bare metal is great for maximum performance and stability. Managed clouds reduce operational overhead. Whatever you choose, understand how it works deeply.

2. Replication: The most powerful open-source ClickHouse® feature — but you need to understand it. Key risks: excessive bulk data refreshes, mutations + heavy replication traffic combined, and queues that back up. Most replication problems are self-healing once you remove the pressure causing them. Know where to look: system.replication_queue and system.mutations.

3. Schema changes: ClickHouse® is excellent at online schema changes — adding or dropping columns is nearly instant. But PARTITION BY and ORDER BY must be right from the start — changing them requires copying data. For multi-tenant schemas, partition by time and sort by tenant ID. Never partition by tenant.

4. Upgrade: Plan, test, and automate. The biggest risk is API changes that break your application’s SQL. For large systems with complex queries, use the LTS-based trailing edge strategy and Altinity Stable® Builds for ClickHouse®. Use the Altinity Kubernetes Operator for ClickHouse® for automatic rolling upgrades. For bare-metal systems, build automation.

5. System tables: Use them constantly. Export metrics to Prometheus/Grafana for time-series monitoring. Use system.query_log for query performance analysis. Use system.parts and system.columns for storage diagnostics. Treat these tables as your primary source of truth about what ClickHouse® is doing.

On using AI for ClickHouse® administration: AI tools like ChatGPT or Gemini search overviews can be helpful, but be cautious. ClickHouse® administration is a specialized domain with limited training data compared to general programming topics. Verify anything an AI tells you — especially advice about operational tasks like detached parts, mutations, or replication repair. If in doubt, ask us.


[59:47–1:04:31] – Q&A

Q: What are the most important settings for a multi-tenant ClickHouse® environment? (from audience)

Tatiana: For multi-tenancy specifically, there isn’t one particular setting that stands out. Generally, the defaults are well-chosen. You start with what you have and adjust based on how your system actually behaves and what your workload needs.

Robert: I’d add that the most important thing for multi-tenant systems isn’t actually a setting — it’s schema design. If you’re coming from MySQL, your instinct will be to create a separate database (or table) per tenant. That’s the right pattern in MySQL. In ClickHouse®, it’s almost always wrong.

The right pattern:

  • Partition by time — this creates evenly sized partitions regardless of tenant distribution
  • Sort by tenant — put the tenant ID first (or early) in the ORDER BY clause

CREATE TABLE events
(
    tenant_id UInt32,
    event_time DateTime,
    user_id UInt64,
    event_type String,
    …
)
ENGINE = ReplicatedMergeTree
PARTITION BY toYYYYMM(event_time)  -- even-sized partitions
ORDER BY (tenant_id, event_time);  -- all tenant data contiguous in storage

This places all data for a given tenant in contiguous storage sections, making per-tenant queries very fast. The trade-off: if you need to delete a tenant, you’ll have to issue a mutation (ALTER TABLE … DELETE WHERE tenant_id = X) rather than dropping a partition. For most observability-style workloads, that’s acceptable.

See the Altinity KB on multi-tenant schema design for more details.


Q: How do you best recover a lost replica when the host and data are gone? (from C in the chat)

Tatiana: If the data is completely lost — host gone, storage gone — the simplest approach is:

  1. Drop the old replica’s metadata from Keeper using SYSTEM DROP REPLICA
  2. Create a new replica

ClickHouse® will automatically replicate data from the surviving replicas to the new one.
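For example, with hypothetical table and replica names, the cleanup step looks like this (run it on a surviving replica):

-- Remove the dead replica's metadata from Keeper
SYSTEM DROP REPLICA 'replica_2' FROM TABLE default.ontime;

-- Then create the table on the replacement host with its own replica name;
-- ClickHouse® backfills the data from the surviving replicas automatically.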

Robert: In Kubernetes with the Altinity operator, this is even simpler. Delete the PVC (Persistent Volume Claim) for the failed node and then delete the pod. The operator will recreate the pod, provision new storage, and replication will automatically repopulate the data — no manual steps needed.

The other scenario is when a replica loses connectivity for a long time and gets detached from its Keeper metadata, but the storage is still intact. In that case:

SYSTEM RESTORE REPLICA table_name;

This reads the data back from disk and re-syncs the replica’s metadata with Keeper. Your storage is good, so no data needs to be transferred over the network.

Tatiana: Yes, SYSTEM RESTORE REPLICA is the right command when you have the data but lost the Keeper sync. SYSTEM DROP REPLICA is for when you need to clean up a dead replica’s metadata so you can create a fresh one.

Robert: The SYSTEM RESTORE REPLICA feature is something Altinity contributed to ClickHouse® — we use it constantly.


Robert: We’re over time — thank you so much, everybody. Thanks to Tatiana and Josh for all your help. Good luck running your ClickHouse® systems. It is a wonderful database, and you can build amazing applications with it. Come visit us on Slack, check out altinity.com, and let us know how it goes. Have a great day!


FAQ Section

Q1: Should I run ClickHouse® on Kubernetes or bare metal?

It depends on your priorities. Kubernetes with the Altinity Kubernetes Operator for ClickHouse® is excellent for teams that need flexibility, easy scaling, rolling upgrades, and multi-cluster management. Bare metal or VM deployments are better for maximum raw performance and for organizations (like financial services) that need absolute control and minimal latency. Managed clouds like Altinity.Cloud® reduce operational overhead and can be more cost-effective when factoring in engineering time. There is no universally right answer.

Q2: Should I use ZooKeeper or ClickHouse® Keeper?

For new systems, use Keeper — it’s the strategic direction, actively developed, and bundled with ClickHouse®. Some new features (like the S3Queue engine) require Keeper-specific functionality. If you’re running ZooKeeper in production and it’s working well, there’s no urgent reason to migrate — but plan to move to Keeper for your next major upgrade or new cluster.

Q3: How do I handle multi-tenant schema design in ClickHouse®?

Don’t partition by tenant. Partition by time (by month for most use cases) to create evenly sized partitions, and put the tenant ID first in the ORDER BY clause. This places each tenant’s data in contiguous storage sections, making per-tenant queries fast while keeping partition count manageable. See the Altinity KB on schema design for detailed examples.

Q4: What is the safest way to upgrade ClickHouse® in production?

For large production systems, use the LTS-based trailing-edge strategy with Altinity Stable® Builds for ClickHouse®: upgrade infrequently, test thoroughly against your actual application SQL before upgrading, and automate rollback. The biggest risk is API changes that break existing queries. If running on Kubernetes, the Altinity Kubernetes Operator for ClickHouse® performs automatic rolling upgrades with zero downtime from a single image tag change.

Q5: What system tables should every ClickHouse® administrator know?

Start with these: system.query_log (query history, performance, resource usage), system.parts (part counts by table — too many parts is a common problem), system.replication_queue (replication backlog and errors), system.mutations (active and pending schema changes), system.merges (active merge operations), system.processes (currently running queries), and system.columns (compression ratios by column). Export system.metrics, system.events, and system.asynchronous_metrics to Prometheus/Grafana for time-series monitoring.

Q6: How do I recover a lost ClickHouse® replica?

If the data is completely lost: use SYSTEM DROP REPLICA 'replica_name' FROM TABLE db.table to clean up the dead replica’s metadata in Keeper, then create a new replica — ClickHouse® will automatically replicate data from surviving replicas. In Kubernetes with the Altinity operator, simply delete the PVC and pod; the operator will recreate everything automatically. If the storage is intact but the replica lost its Keeper sync (e.g., was offline for a long time), use SYSTEM RESTORE REPLICA table_name to re-sync the replica’s metadata from disk without transferring data over the network.


Copyright Notice: This content is © Altinity, Inc. All rights reserved. Altinity®, Altinity.Cloud®, and Altinity Stable® are registered trademarks of Altinity, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc. Altinity is not affiliated with or associated with ClickHouse, Inc. Kubernetes, MySQL, and PostgreSQL are trademarks and property of their respective owners.
