What's new in the Altinity Kubernetes Operator for ClickHouse®?

The Altinity Operator is the most popular way to run ClickHouse in Kubernetes, whether it is on-prem or in cloud. It quietly celebrated its 5th birthday in 2024. So it is time to go to school and learn something new. As you probably know, in the past ClickHouse required ZooKeeper to coordinate replication. And though the operator could easily spin up ClickHouse clusters and configure replication, it relied on an external ZooKeeper installation.

Two years ago, when the ClickHouse team introduced the replacement for ZooKeeper – ClickHouse Keeper – we planned to add native support to the operator for that. The first version was contributed by a community member a year ago (see the PR from Frank Wong), and after some refactoring we have released it as an experimental feature in the 0.23 version of the operator. However, it lacked many things that operator users expect, like flexible volume management, service templates, and others. That resulted in a lot of confusion and numerous bug reports. It took us a few months to rewrite Keeper support from scratch unifying functionality with ClickHouse operator. We are happy to announce that Keeper is finally supported in Altinity Operator for ClickHouse as of 0.24.0 version!

Note: To keep things clear, throughout this post the word “Keeper” by itself refers to ClickHouse Keeper.

How to manage ClickHouse Keeper with the Altinity Operator

Managing ClickHouse Keeper with the Altinity Operator is very simple and looks familiar to operator users. Just make sure you have installed operator version 0.24.0 or above, and then apply this manifest:

apiVersion: "clickhouse-keeper.altinity.com/v1" kind: "ClickHouseKeeperInstallation" metadata: name: simple-1 spec: configuration: clusters: - name: cluster1

Once applied, the operator will deploy Kubernetes resources to make Keeper to work properly. In order to reference it from ClickHouseInstallation service service/keeper-simple-1 can be used as follows:

apiVersion: "clickhouse.altinity.com/v1" kind: "ClickHouseInstallation" metadata: name: simple-with-keeper spec: configuration: zookeeper: nodes: - host: keeper-simple-1 # This is a service name of chk/simple-1 port: 2181 clusters: - name: default replicasCount: 2

That’s it. ClickHouse replicated cluster with Keeper is ready to go!

Of course, examples above do not have persistence and can only be used for demonstration purposes. ClickHouseKeeperInstallation (or, CHK) resource supports all the features needed for production operation, such as volume claims, pod and service templates, replica counts and configuration changes.. Here is an example that shows more features in action:

apiVersion: "clickhouse-keeper.altinity.com/v1"
kind: "ClickHouseKeeperInstallation"
metadata:
  name: clickhouse-keeper
spec:
  defaults:
    templates:
      podTemplate: default
      volumeClaimTemplate: default
  templates:
    podTemplates:
      - name: default
        spec:
          containers:
            - name: clickhouse-keeper
              image: "clickhouse/clickhouse-keeper:24.3.5.46"
    volumeClaimTemplates:
      - name: default
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
  configuration:
    clusters:
      - name: default
        layout:
          replicasCount: 3
    settings:
      logger/level: trace

As you can see, a ClickHouseKeeperInstallation is very similar to a ClickHouseInstallation, so if you’re already familiar with Altinity Operator, there is nothing new.

Upgrade from 0.23.x

Altinity Operator 0.23.x used a different implementation of Keeper resource. For example, if we would use the 0.23.x operator with the simple-1 ClickHouseKeeperInstallation above, the default service name would be simple-1 instead of keeper-simple-1. But there is a bigger problem: the storage mapping is completely different. If one upgrades the operator from 0.23.x to 0.24.0 and reconciles Keeper resources, those will lose storage. What should users do? There are two possibilities.

Recover Keeper metadata

The first approach is to recreate Keeper metadata from ClickHouse.There is a SYSTEM RESTORE REPLICA command for that purpose. This needs to be done for every ReplicatedMergeTree table and for every replica. That works for small clusters. However, it does not recover other data that may be stored in Keeper. For example, it does not recover users or user defined functions.

Remap Persistent Volume

If recovering Keeper is not an option, it is possible to do a migration and remap the persistent volume created by operator 0.23.x. Before doing this, let’s take a look at the differences between objects created by 0.23.x and 0.24+.

Let’s consider that single node CHK installation is named test. So the following objects would be created:

	0.23.x	0.24+
Pod name	`test-0`	`chk-test-simple-0-0-0`
Service name	`test`	`keeper-test`
PVC name	`both-paths-test-0`	`default-chk-test-0-0-0`
Volume mounts	`- mountPath: /var/lib/clickhouse_keeper name: working-dir - mountPath: /var/lib/clickhouse_keeper/coordination/logs name: both-paths subPath: logs - mountPath: /var/lib/clickhouse_keeper/coordination/snapshots name: both-paths subPath: snapshots`	`- mountPath: /var/lib/clickhouse-keeper name: default`

So, in order to remap the volume, the following steps need to be done:

Find Persistent Volume (PV) in old CHK installation
Patch it, setting the persistentVolumeReclaimPolicy to Retain:

kubectl patch pv $PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

Delete the old CHK installation
Delete the old PVC, since it is not deleted automatically
Patch the PV one more time, removing the claimRef. That will make the volume available for remounting:

kubectl patch pv $PV -p '{"spec":{"claimRef": null}}'

Upgrade the operator to 0.24.x
Deploy the new CHK with the following changes:
1. Add volumeName to volumeClaimTemplate, referencing the old volume
2. Add settings to mount logs and raft coordination to folders matching old operator:
  keeper_server/log_storage_path: /var/lib/clickhouse-keeper/logs keeper_server/snapshot_storage_path: /var/lib/clickhouse-keeper/snapshots
3. Add a serviceTemplate to match the old name

Please refer to the migration procedure for more detail.

Once the CHK is up, ClickHouse will resume working.

Limitations and Future Work

The current implementation is not ideal. (There is nothing ideal in the world.) We can see at least two things that can be improved.

Reference CHK from CHI

As you could see in examples above, we have to reference CHK from CHI in “the old style”. This is fully compatible with ZooKeeper, but requires knowing service names. After all, it does not look pretty. Instead, we could reference CHK directly by name, as follows:

 configuration:
   zookeeper:
     keeper:
       - name: my-keeper
         namespace: my-namespace
         serviceType: loadBalancer | replicas

So instead of the service name we would use CHK name and also specify the way ClickHouse should be configured to access it: using load balancer service or individual replica services. The latter one may unlock some interesting capabilities to optimize network costs in public clouds.

Dynamic Reconfiguration

Sometimes it is necessary to change the Keeper cluster, e.g. add more replicas. That requires more than just spinning extra pods and services. The Keeper cluster needs to be notified of the change. Of course, it can be done via full restart but it may affect ClickHouse users. There is a better way – dynamic reconfiguration. Once implemented in the operator, it will make all changes go smoothly.

Both features are already in development for the next operator release.

Get started!

We encourage you to get the latest version of the operator to take advantage of the ClickHouse Keeper support. Please subscribe to the Altinity clickhouse-operator project in GitHub to be notified about new releases. And of course, we’ll post all the details here as we continue to add more features to the operator.

PRODUCTS

OPEN SOURCE SOFTWARE

CLICKHOUSE^® SOLUTIONS

Get in touch with ClickHouse experts.

What’s new in the Altinity Kubernetes Operator for ClickHouse®?

How to manage ClickHouse Keeper with the Altinity Operator

Upgrade from 0.23.x

Recover Keeper metadata

Remap Persistent Volume

Limitations and Future Work

Reference CHK from CHI

Dynamic Reconfiguration

Get started!

Related:

PRODUCTS

OPEN SOURCE SOFTWARE

CLICKHOUSE® SOLUTIONS

Get in touch with ClickHouse experts.

How to manage ClickHouse Keeper with the Altinity Operator

Upgrade from 0.23.x

Recover Keeper metadata

Remap Persistent Volume

Limitations and Future Work

Reference CHK from CHI

Dynamic Reconfiguration

Get started!

Related:

Altinity.Cloud Anywhere: An Open Cloud for ClickHouse®

Bring up ClickHouse® on Kubernetes with Argo CD

Build a Low-Cost, High-Performance Analytic Platform with Kubernetes and Open Source

CLICKHOUSE^® SOLUTIONS