Safety First! Using Altinity Backup for ClickHouse® for ClickHouse Backup and Restore

Recorded: October 25 @ 07:00 am PDT
Presenters: Robert Hodges & Eugene Klimov
In this webinar, Robert Hodges and clickhouse-backup maintainer Eugene Klimov walk you through the complete landscape of ClickHouse backup and restore, from first principles through live demonstrations and advanced operational topics.
Robert opens by framing why backups matter beyond disaster recovery: restoring a production table to a development environment for debugging, testing upgrades safely on a restored copy before touching the live cluster, and cloning an empty schema and configuration for new tenant provisioning. He then enumerates what actually needs to be protected in a ClickHouse deployment: configuration files (config.xml and related files), table schema definitions, table data, and RBAC metadata created via CREATE USER and related SQL commands. RBAC metadata can be stored either in local files or in ZooKeeper, and both cases must be handled. For tables using tiered storage, object-backed data in S3 or equivalent must also be addressed.
The talk surveys the backup options available before focusing on clickhouse-backup. Options include replication alone (insufficient because it does not cover configs or RBAC), clickhouse-copier (primarily a migration tool), the built-in BACKUP/RESTORE command (covers schema and data but not configs or RBAC metadata), and clickhouse-backup itself (covers everything). Robert covers the history of the project: originally written by Alex Akulov, it was donated to Altinity in late 2019 and has been maintained by Eugene Klimov ever since. It is a statically linked Go binary, Apache 2.0 licensed, hosted at the Altinity GitHub organization.
Installation is simple: download the tarball, extract the binary to /usr/local/bin, and generate a starting configuration with clickhouse-backup default-config. The tool must run on the same host as ClickHouse because it relies on Linux hard links for efficient local backups, and hard links only work within a single filesystem.
The hard-link mechanism is explained in detail. ALTER TABLE FREEZE creates hard links inside a shadow/ subdirectory pointing to the same inodes as the live part files, giving an instantaneous stable snapshot of the table. clickhouse-backup uses these hard links as the local backup copy, then uploads them to remote storage (S3, GCS, Azure, FTP, SFTP, and others). ALTER TABLE UNFREEZE removes the shadow links when done. On restore, the tool downloads the backup, creates hard links in the table’s detached/ directory, and issues ALTER TABLE ATTACH PART to hand the data to ClickHouse.
The live demo shows the full backup lifecycle: clickhouse-backup create, clickhouse-backup list, clickhouse-backup upload, dropping a table to simulate an accident, and restoring it from the local backup in seconds. A second demo shows restoring selected tables into a different target database using the --table and --restore-database-mapping options.
Incremental backups are explained with the --diff-from-remote flag: clickhouse-backup creates a complete local snapshot via hard links (fast, no data movement), then compares part names against the remote full backup and only uploads new or changed parts. An important caveat: mutations that rename all parts (such as ALTER TABLE DELETE) will invalidate the part-name comparison, causing the next incremental to upload everything as if it were a full backup.
The REST API server mode allows clickhouse-backup to run as a daemon (clickhouse-backup server), accepting backup and restore commands via HTTP rather than the command line. This is the mode used in Kubernetes deployments and in Altinity.Cloud. Integration tables (system.backup_list, system.backup_actions) expose the API directly through ClickHouse SQL.
The session closes with operational tips: using {shard} macros in remote storage paths so shards do not overwrite each other’s backups, configuring upload and download concurrency for S3 performance, using clean to remove shadow directories after backups, and clean_remote_broken to remove partially uploaded backups identified by the absence of metadata.json.
The Q&A covers schema changes between full and incremental backups (each backup captures the current schema; old parts carry the old schema and attach correctly on restore), cross-shard partition rebalancing (clickhouse-backup is not the ideal tool; ALTER TABLE FETCH PARTITION is better), the per-table-at-a-time freeze behavior (each table is frozen and unfrozen independently, so a backup is not a true cross-table snapshot), and the critical operational lesson from Robert’s career: always test your backups before you need them.
Key Moments (Timestamps)
Key moments generated with AI assistance.
- 00:07 – Welcome and housekeeping
- 01:40 – Speaker introductions: Robert Hodges and Eugene Klimov
- 03:46 – Why backups matter: disaster recovery, debugging, upgrade testing, cloning
- 06:28 – ClickHouse overview for context
- 08:32 – What to protect in a ClickHouse deployment: configs, schema, data, RBAC metadata
- 11:20 – Backup options: replication, copier, built-in BACKUP/RESTORE, clickhouse-backup
- 12:58 – clickhouse-backup: history, Go binary, Apache 2.0, Altinity maintainership
- 14:22 – Installation: download, extract, put in /usr/local/bin, must run on same host
- 15:35 – Generating and customizing config.yml
- 18:07 – Basic backup lifecycle: create, upload, download, restore
- 21:26 – Live demo: create a backup, list, upload to S3
- 24:28 – Live demo: drop a table, restore it from local backup
- 26:10 – Live demo: restore tables into a different database with --restore-database-mapping
- 27:44 – What is stored inside a backup: shadow/, metadata JSON, part information
- 30:15 – How hard links work in Linux: inodes, multiple names, reference counting
- 33:22 – ALTER TABLE FREEZE and the shadow/ directory mechanism
- 35:15 – How clickhouse-backup uses hard links for create and restore
- 36:05 – Restore internals: download → detached/ → ALTER TABLE ATTACH PART
- 38:00 – Restore and manage commands overview
- 38:57 – Incremental backups: --diff-from-remote flag
- 44:43 – Advanced: managing backup storage, clean, clean_remote_broken, retention settings
- 47:31 – Remote storage tips: upload/download concurrency, shard macros
- 48:43 – Incremental backup caveats: mutations that rename all parts break incrementals
- 50:00 – REST API server mode, Kubernetes integration, and integration tables (system.backup_list, system.backup_actions)
- 51:15 – Further topics: sharded cluster examples, converting MergeTree to replicated, custom storage types
- 51:57 – Roadmap: improved incremental support, S3-backed MergeTree support
- 53:10 – Q&A: schema changes between backups (Eugene explains)
- 55:23 – Final tip: always test your backups before you need them
- 57:50 – Wrap-up and thank-you
Webinar Transcript
[00:07] – Welcome and Housekeeping
Robert: Hi everybody and welcome to our webinar “Safety First: Using clickhouse-backup to Back Up and Restore ClickHouse® Databases.” My name is Robert Hodges and I am joined today by Eugene Klimov, the maintainer of clickhouse-backup. We will cover a wide range of topics starting with why we need backups, all the way to some of the operational details of using the project.
This webinar is being recorded. We will send you a link to the recording as well as the slides shortly after the talk, certainly by early Friday. The session is interactive. You can ask questions in the Q&A box on the Zoom control bar or in the chat. We have an hour and it is not a long presentation, so there should be plenty of time for questions at any point.
[01:40] – Speaker Introductions
Robert: Eugene Klimov is a very experienced developer who has become the maintainer of clickhouse-backup. His day job is on Altinity.Cloud but he has been deeply involved in ClickHouse for several years, and most of the software you will hear about today is either something Eugene worked on, implemented, or designed from scratch.
I am Robert Hodges again. I am a database geek who has been working on databases for about 40 years, including doing a lot of backups. My day job is Altinity CEO, but I present frequently on ClickHouse internals and have a deep interest in this topic.
A little bit about Altinity if you are not familiar with us: we are an enterprise provider for ClickHouse. We run Altinity.Cloud, which runs on Amazon and GCP in our accounts, and also supports a bring-your-own-cloud model for Azure, Hetzner, and anywhere you can run Kubernetes. It was the first cloud for ClickHouse in the United States running on Amazon. We are the authors of the Kubernetes Operator for ClickHouse® and the maintainers of clickhouse-backup, along with a number of other open-source projects.
[03:46] – Why Backups Matter
Robert: Let us talk about why we back up databases. Backups are critical but painful. I started in the days when they ran on tape and you had to sit around mounting reels. The primary reason people think about backups is a catastrophe or a serious mistake: a system failure, a data center down from a natural disaster, or an accidental delete. I accidentally wiped out a company’s data about six months into my IT career in the late seventies. Backups handle that because you have a copy somewhere far away to pull back in.
But backups are useful for several other things too. You can restore a production table to a development environment to debug a problem that requires real data. You can test upgrades safely: take a recent production backup, restore it to a new location, run your upgrade procedure, verify your applications and database still function correctly, and then upgrade the real system with confidence. You can also use backups just to clone: if you need an empty copy of the schema and configuration to bring up a new tenant, grab the empty shell from a backup and restore as much data as you need to get started.
[06:28] – ClickHouse Overview
Robert: For those less familiar: ClickHouse is a real-time analytic database, one of the most popular on the internet. It is like the child of MySQL and Vertica. Like MySQL, it understands SQL, is open source, and runs practically anywhere. Like Vertica, it has a shared-nothing architecture with servers communicating across a network, stores data in columns, has parallel and vectorized execution, and scales to many petabytes. Even five years ago, companies like Cloudflare had 15 to 20 petabytes in ClickHouse. It has since evolved to support S3-backed storage with separation of storage and compute.
[08:32] – What to Protect in a ClickHouse Deployment
Robert: If we are copying data to keep it safe, what exactly do we need to protect? Several things in ClickHouse do not need to go into a backup: the ClickHouse process itself, logs directories, certificates and keys. Those we assume will be regenerated fresh for each machine.
The things that do matter are marked clearly. First, configuration files: ClickHouse depends on config.xml and related files at startup to know where it is storing data, what ports it is listening on, and so on. Without these you cannot bring up a server that behaves like the original. Second, schema: your table definitions including indexes and settings. Third, data: the actual content of your tables. Fourth, RBAC metadata: if you use SQL commands like CREATE USER and CREATE PROFILE, you need copies of those definitions. When RBAC metadata is stored in ZooKeeper (which Altinity.Cloud uses), you need to fetch it from there specifically. And fifth, for tables using tiered storage, object-backed data in S3 or equivalent must also be addressed, since the data references can point to remote storage.
[11:20] – Backup Options for ClickHouse
Robert: There are several options for backup and restore in ClickHouse, and clickhouse-backup is just one of them.
Originally there was no backup tool. People used replication and assumed that if a host caught fire, replicas on other machines would be unaffected. But replication does not copy configs, RBAC, or even schema; you have to do those yourself.
There is also clickhouse-copier, a utility that uses ZooKeeper to migrate data between locations. It can be used as a backup tool but was designed primarily for migration.
The built-in BACKUP/RESTORE command was introduced a couple of years ago. It copies schema and data but does not handle configs or RBAC metadata.
And then there is clickhouse-backup, which covers everything: config files, schema, data, and RBAC. We will assume you have chosen clickhouse-backup and work from there.
[12:58] – clickhouse-backup: History and Project Overview
Robert: clickhouse-backup was originally written by Alex Akulov several years ago. He recognized that there was just no solution and people were writing their own scripts and handmade mechanisms. He wrote the original tool, which is a Go language program and therefore a statically linked binary that builds practically anywhere. Around late 2019 he reached out to us and asked if Altinity could maintain it. We have been running and maintaining it ever since. It is Apache 2.0 licensed, completely open source, and hosted at the Altinity GitHub organization. Eugene Klimov is the principal maintainer.
[14:22] – Installation
Robert: Installation is straightforward. Download the latest tarball from the GitHub releases page, extract it, and use the install command to place it in /usr/local/bin. Because it is a statically linked Go binary, that is all there is to it. Run clickhouse-backup --version to verify it is alive.
One critical constraint: clickhouse-backup must run on the same host as ClickHouse. It does not work remotely. The reason will be clear once we explain the hard-link mechanism.
Run clickhouse-backup as the clickhouse user or as root. Use sudo -u clickhouse as a convention throughout your backup scripts.
[15:35] – Configuring config.yml
Robert: Once installed, prepare a config.yml file. clickhouse-backup has a detailed YAML configuration with a wide range of options, because backup is naturally somewhat complex and there are different ways to handle different situations.
Rather than hunting for a template, clickhouse-backup can generate a default configuration for you. Create the directory /etc/clickhouse-backup, run clickhouse-backup default-config > /etc/clickhouse-backup/config.yml, then open that file and fill in your specific values. Detailed documentation for every parameter is in the README in the project repository.
The sections you typically need to fill in are: general settings (a small handful of values), the ClickHouse connection section, and if you are using remote storage, the section for your storage provider (S3, GCS, Azure, FTP, etc.). Usually about half a dozen values total and you are done.
[18:07] – Basic Backup Lifecycle
Robert: Backups work in two stages, though you can combine them into a single command.
The first stage is create, which builds the local backup on the same filesystem as ClickHouse. The second stage is upload, which sends the local backup to remote storage such as S3. For disaster protection you must get the backup off the host; if the machine is gone you want your backup somewhere far away and unaffected.
The backup lives under /var/lib/clickhouse/backup/ by default. Inside each named backup you will find a shadow/ directory containing the data (as hard links, explained shortly) and a metadata/ directory containing the schema.
For restore, first run clickhouse-backup download to pull the backup from remote storage back to the local filesystem, then run clickhouse-backup restore.
These two stages can also be combined into single commands: clickhouse-backup create_remote creates a local backup and immediately uploads it to remote storage, and clickhouse-backup restore_remote downloads from remote storage and restores in one step.
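As a rough illustration of the two-stage flow and its one-step equivalents, the sketch below only assembles the command lines in Python and prints them; it does not execute anything. The backup name my-backup and the helper name lifecycle_commands are placeholders for illustration, and the one-step subcommands are written with underscores (create_remote, restore_remote), which is how the project README spells them to the best of our knowledge.

```python
import shlex


def lifecycle_commands(backup_name, user="clickhouse"):
    """Return the two-stage and one-step command lines for one backup name."""
    base = ["sudo", "-u", user, "clickhouse-backup"]
    return {
        # Stage 1 builds the local backup, stage 2 ships it to remote storage.
        "two_stage_backup": [base + ["create", backup_name],
                             base + ["upload", backup_name]],
        # Restore is the mirror image: fetch first, then restore locally.
        "two_stage_restore": [base + ["download", backup_name],
                              base + ["restore", backup_name]],
        # One-step equivalents.
        "one_step_backup": [base + ["create_remote", backup_name]],
        "one_step_restore": [base + ["restore_remote", backup_name]],
    }


commands = lifecycle_commands("my-backup")
for flow, argvs in commands.items():
    for argv in argvs:
        print(f"{flow}: {shlex.join(argv)}")
```

Keeping the commands as argument lists rather than strings makes it straightforward to hand them to subprocess.run later without shell quoting surprises.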
[21:26] – Live Demo: Create, Upload, and List Backups
Robert: Let us go ahead and create a backup. Here is the command: sudo -u clickhouse clickhouse-backup create my-backup with options to also back up RBAC metadata and configuration files. This is a full backup of the entire system.
By default, clickhouse-backup prints all the SQL commands it uses. We recommend keeping that turned on. If something goes wrong you will see the exact trace of what was executed.
You will see ALTER TABLE FREEZE and ALTER TABLE UNFREEZE commands go by as we process each table. We will explain those in a moment.
After the backup completes, clickhouse-backup list shows the backup sitting locally on the filesystem. Now let us upload it: clickhouse-backup upload my-backup. This sends the backup to the S3 bucket configured in config.yml. Running clickhouse-backup list again shows it is now present in remote storage as a tar file.
[24:28] – Live Demo: Dropping a Table and Restoring It
Robert: Time to prove the backup works by destroying something. Let us drop a table named ex2. It is gone. To restore it: sudo -u clickhouse clickhouse-backup restore --table default.ex2 my-backup. The --table option accepts wildcard patterns (* and ?), so you can restore multiple tables matching a pattern at once. That was quick. The data is back.
[26:10] – Live Demo: Restoring Tables into a Different Database
Robert: A common need is restoring tables into a different database so you can work on them without disturbing the original. Here is how: use clickhouse-backup restore --table with a comma-separated list of table names and add --restore-database-mapping default:default_2. This maps the source database default to the target default_2. The two tables are restored and placed in the target database without overwriting anything in default.
[27:44] – What Is Stored Inside a Backup
Robert: Looking inside a backup directory: there is a shadow/ directory (where the data lives as hard links) and a metadata/ directory containing JSON files. Each JSON file captures the schema for a table, including the parts that make up the table and their names. This is the combination of schema and part-name information that clickhouse-backup needs to do a restore. The metadata directory can be inspected at any time to understand exactly what was captured.
[30:15] – How Hard Links Work on Linux
Robert: clickhouse-backup depends on a Linux filesystem feature called hard links, and understanding them is essential to understanding why backups work the way they do.
In a Linux filesystem, the actual data in a file is an unnamed blob pointed to by something called an inode. A directory entry is just a name, a hard link, that points to an inode. The key property is that multiple directory entries can point to the same inode: the file has multiple names. If you delete one name the data does not disappear, because the other names still reference the inode. Linux files are only truly released when all hard links to the inode are removed, at which point the filesystem releases the underlying storage. This is why we sometimes say “unlink” instead of “delete.”
Hard links only work within a single filesystem. This is why clickhouse-backup must be on the same filesystem as ClickHouse data: if the data and the backup were on different filesystems, hard links would not work and the whole mechanism would fail.
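The inode behavior described above can be observed directly. The following is a minimal sketch using Python's standard library in a temporary directory; the file names data.bin and alias.bin are arbitrary, and it assumes the temporary directory lives on a filesystem that supports hard links.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "data.bin")
    alias = os.path.join(d, "alias.bin")

    with open(original, "wb") as f:
        f.write(b"column data")

    # Create a second directory entry (hard link) for the same inode.
    os.link(original, alias)

    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    links_before = os.stat(original).st_nlink

    # "Unlink" one name; the data stays reachable through the other name.
    os.remove(original)
    with open(alias, "rb") as f:
        survived = f.read()
    links_after = os.stat(alias).st_nlink

print(same_inode, links_before, links_after, survived)
```

The link count drops from two to one when the first name is removed, and the storage would only be released once the last name is gone.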
[33:22] – ALTER TABLE FREEZE and the shadow/ Directory
Robert: ClickHouse has a command called ALTER TABLE FREEZE that uses hard links to create an instant backup snapshot. When you run ALTER TABLE default.ontime FREEZE WITH NAME 'my_backup', ClickHouse goes to every part in the table and creates a directory under /var/lib/clickhouse/shadow/my_backup/ containing hard links back to the real part files. The operation is nearly instantaneous regardless of table size because no data is copied.
The result is a stable copy of the table: even if ClickHouse continues inserting new data and merging parts, the original files referenced from the shadow directory will not be deleted. Those hard links keep the inodes alive.
After the backup process has made its own copies, ALTER TABLE UNFREEZE WITH NAME 'my_backup' removes the shadow hard links. The underlying files continue to exist as long as ClickHouse itself still uses them.
clickhouse-backup runs ALTER TABLE FREEZE for each table, captures the hard links in the backup’s shadow/ directory, uploads those files to remote storage, and then runs ALTER TABLE UNFREEZE to release the shadow links.
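To make the freeze mechanism concrete, here is a small simulation in Python. It is not ClickHouse's implementation: the functions freeze and unfreeze, the part names, and the simplified directory layout are all assumptions made for illustration. It shows that parts hard-linked into a shadow directory stay readable after the live copies are removed, which is roughly what happens when ClickHouse merges parts after a freeze.

```python
import os
import shutil
import tempfile


def freeze(table_dir, shadow_dir):
    """Hard-link every file of every part directory into shadow_dir."""
    os.makedirs(shadow_dir, exist_ok=True)
    for part in sorted(os.listdir(table_dir)):
        src_part = os.path.join(table_dir, part)
        if not os.path.isdir(src_part):
            continue
        dst_part = os.path.join(shadow_dir, part)
        os.makedirs(dst_part)
        for name in os.listdir(src_part):
            os.link(os.path.join(src_part, name), os.path.join(dst_part, name))


def unfreeze(shadow_dir):
    """Drop the shadow links; live parts are untouched."""
    shutil.rmtree(shadow_dir)


root = tempfile.mkdtemp()
table_dir = os.path.join(root, "data", "default", "ontime")
shadow_dir = os.path.join(root, "shadow", "my_backup")

# Two fake parts, each with one data file.
for part in ("2023_1_1_0", "2023_2_2_0"):
    os.makedirs(os.path.join(table_dir, part))
    with open(os.path.join(table_dir, part, "data.bin"), "wb") as f:
        f.write(part.encode())

freeze(table_dir, shadow_dir)

# Simulate a merge that removes the original parts from the live table.
shutil.rmtree(os.path.join(table_dir, "2023_1_1_0"))
shutil.rmtree(os.path.join(table_dir, "2023_2_2_0"))

frozen_parts = sorted(os.listdir(shadow_dir))
with open(os.path.join(shadow_dir, "2023_1_1_0", "data.bin"), "rb") as f:
    frozen_bytes = f.read()

unfreeze(shadow_dir)
shadow_exists = os.path.exists(shadow_dir)
shutil.rmtree(root)

print(frozen_parts, frozen_bytes, shadow_exists)
```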
[36:05] – How Restore Works Internally
Robert: Restore is essentially the same process in reverse. First the backup is downloaded from remote storage, reconstructing the local backup directory with all the part files. For each table being restored, clickhouse-backup runs CREATE TABLE using the schema from the JSON metadata, dropping any existing copy of the table first to avoid conflicts. It then creates hard links to the downloaded part files inside the table’s detached/ subdirectory, which is the standard ClickHouse staging area for parts being moved into a table. Finally it issues ALTER TABLE ATTACH PART for each part, and ClickHouse moves the hard-linked files from detached/ into the live table storage.
Once ClickHouse takes ownership of those hard links, you can safely delete the local backup copy. The data will remain accessible through ClickHouse’s own references. The underlying files are only reclaimed when ClickHouse eventually merges or drops the parts through its normal lifecycle.
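The staging step can be sketched the same way. The Python snippet below is an illustration under simplifying assumptions, not the tool's code: stage_parts is a hypothetical helper, the directory layout is reduced to the essentials, and the generated SQL only approximates what clickhouse-backup sends. It hard-links backed-up parts into a detached/ directory, builds one ATTACH PART statement per part, and shows that deleting the local backup copy afterwards does not remove the staged data.

```python
import os
import shutil
import tempfile


def stage_parts(backup_table_dir, detached_dir, database, table):
    """Hard-link backed-up parts into detached/ and return ATTACH statements."""
    statements = []
    os.makedirs(detached_dir, exist_ok=True)
    for part in sorted(os.listdir(backup_table_dir)):
        src = os.path.join(backup_table_dir, part)
        dst = os.path.join(detached_dir, part)
        os.makedirs(dst)
        for name in os.listdir(src):
            os.link(os.path.join(src, name), os.path.join(dst, name))
        statements.append(f"ALTER TABLE {database}.{table} ATTACH PART '{part}'")
    return statements


root = tempfile.mkdtemp()
backup_dir = os.path.join(root, "backup", "my-backup", "shadow", "default", "ex2")
detached_dir = os.path.join(root, "data", "default", "ex2", "detached")

# One fake downloaded part.
os.makedirs(os.path.join(backup_dir, "all_1_1_0"))
with open(os.path.join(backup_dir, "all_1_1_0", "data.bin"), "wb") as f:
    f.write(b"rows")

statements = stage_parts(backup_dir, detached_dir, "default", "ex2")

# Deleting the local backup copy leaves the staged hard links intact.
shutil.rmtree(os.path.join(root, "backup"))
with open(os.path.join(detached_dir, "all_1_1_0", "data.bin"), "rb") as f:
    staged_bytes = f.read()

shutil.rmtree(root)
print(statements, staged_bytes)
```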
If you restored configs or RBAC metadata, a ClickHouse server restart is needed to pick up those changes.
[38:57] – Incremental Backups
Robert: Taking a full backup every time is expensive for large databases. clickhouse-backup supports incremental backups using the --diff-from-remote flag.
The workflow is: run a full backup first with clickhouse-backup create_remote my_full_backup, then delete the local copy to release the hard links. Later, when you want an incremental, run clickhouse-backup create_remote my_incremental --diff-from-remote my_full_backup.
What happens: clickhouse-backup creates a complete local snapshot using hard links as usual, because hard links are fast and involve no data movement. It then compares the part names in this snapshot against the parts already available in the remote full backup. Only parts with new names, meaning parts that were created since the full backup, are uploaded. The result is a small upload representing only what has changed.
To restore from an incremental, simply give the name of the last incremental backup. clickhouse-backup will automatically look back through the backup chain to find any parts not present in the incremental and pull them from the full backup.
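One way to picture the chain lookup is the sketch below. It is a simplified model, not the tool's actual data structures: the catalog dictionary, its keys (parts, uploaded, required), and the function resolve_part_sources are hypothetical names chosen for illustration.

```python
def resolve_part_sources(backup_name, catalog):
    """Map each part needed for a restore to the backup that holds its data."""
    sources = {}
    wanted = set(catalog[backup_name]["parts"])
    current = backup_name
    # Walk from the requested backup back through the backups it depends on.
    while current is not None and wanted:
        entry = catalog[current]
        for part in entry["uploaded"]:
            if part in wanted:
                sources[part] = current
                wanted.discard(part)
        current = entry["required"]
    if wanted:
        raise LookupError(f"parts not found in chain: {sorted(wanted)}")
    return sources


catalog = {
    "my_full_backup": {
        "parts": ["202310_1_1_0", "202310_2_2_0"],
        "uploaded": ["202310_1_1_0", "202310_2_2_0"],
        "required": None,
    },
    "my_incremental": {
        "parts": ["202310_1_1_0", "202310_2_2_0", "202310_3_3_0"],
        "uploaded": ["202310_3_3_0"],
        "required": "my_full_backup",
    },
}

sources = resolve_part_sources("my_incremental", catalog)
print(sources)
```

Only the newest part comes from the incremental; the two older parts are resolved to the full backup it was diffed against.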
Important caveat: if you run a mutation such as ALTER TABLE DELETE, it will rewrite all affected parts with new names. When clickhouse-backup compares part names for the next incremental, it will see all new names and treat the entire table as changed, effectively uploading a full backup. Frequent large mutations make incremental backups much less efficient. See the Altinity Knowledge Base article on differential backups for detailed examples.
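The part-name comparison and the mutation caveat can be illustrated with plain set arithmetic. This is a sketch of the idea rather than the tool's code: parts_to_upload is a hypothetical helper, the part names are made up, and the mutation is modeled by appending an assumed mutation version (4) to each name, which mirrors how ClickHouse names mutated parts.

```python
def parts_to_upload(local_parts, remote_parts):
    """Return the part names present locally but missing from the remote backup."""
    return sorted(set(local_parts) - set(remote_parts))


full_backup = ["202310_1_1_0", "202310_2_2_0", "202310_3_3_0"]

# Normal case: one new part was inserted since the full backup.
after_insert = full_backup + ["202310_4_4_0"]
normal_diff = parts_to_upload(after_insert, full_backup)

# After a mutation that rewrites every part, every name changes.
after_mutation = [f"{name}_4" for name in full_backup]
mutation_diff = parts_to_upload(after_mutation, full_backup)

print(normal_diff)
print(mutation_diff)
```

In the first case a single part is uploaded. In the second, none of the local names match the remote backup, so the incremental degenerates into a full upload.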
[44:43] – Managing Backup Storage
Robert: A few important housekeeping commands.
clickhouse-backup clean removes the shadow directory from the filesystem. You should run this after every backup because accumulated shadow directories hold hard links and consume inodes and disk space that ClickHouse cannot account for. This is the most common source of the apparent discrepancy where ClickHouse thinks you have 500 GB of storage but your filesystem shows 1 TB in use.
clickhouse-backup clean_remote_broken removes partially uploaded backups from remote storage. clickhouse-backup writes metadata.json as the very last file when completing an upload. If that file is absent, the backup is incomplete. This command identifies and removes those broken uploads. You can also configure the upload to resume rather than restart.
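The "missing metadata.json means broken" rule is simple enough to sketch. The snippet below models a remote listing as a dictionary from backup name to the set of files found under it; the listing contents and the function find_broken_backups are illustrative assumptions, not the tool's internals.

```python
def find_broken_backups(remote_listing):
    """Return names of backups whose file list lacks metadata.json."""
    return sorted(
        name
        for name, files in remote_listing.items()
        if "metadata.json" not in files
    )


remote_listing = {
    "my_full_backup": {"metadata.json", "shadow/default/ex2/default.tar"},
    # Upload was interrupted before the final metadata.json was written.
    "my_incremental": {"shadow/default/ex2/default.tar"},
}

broken = find_broken_backups(remote_listing)
print(broken)
```

Writing the marker file last is what makes this check safe: a backup is either complete with its marker or detectably incomplete.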
In config.yml, the backups_to_keep_local and backups_to_keep_remote settings limit the number of retained backups. When incrementals are involved, the tool is smart enough to keep the full backup and all increments associated with it even when the count limit would otherwise trigger deletion.
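A retention rule that respects incremental chains might look like the sketch below. This is one plausible reading of the behavior described above, not the tool's actual algorithm: the function backups_to_delete, the backup names, and the treatment of a non-positive limit as "keep everything" are all assumptions.

```python
def backups_to_delete(backups, keep):
    """Pick backups to delete while preserving every dependency of a kept backup.

    backups: list of (name, required_backup_or_None), ordered oldest to newest.
    keep: number of newest backups to retain (assumed: <= 0 means no limit).
    """
    if keep <= 0:
        return []
    required = dict(backups)
    names = [name for name, _ in backups]
    kept = set(names[-keep:])
    # Pull in the whole dependency chain of every retained backup.
    for name in list(kept):
        dep = required[name]
        while dep is not None:
            kept.add(dep)
            dep = required[dep]
    return [name for name in names if name not in kept]


history = [
    ("full_mon", None),
    ("inc_tue", "full_mon"),
    ("inc_wed", "inc_tue"),
    ("full_thu", None),
    ("inc_fri", "full_thu"),
]

delete_keep_2 = backups_to_delete(history, keep=2)
delete_keep_3 = backups_to_delete(history, keep=3)
print(delete_keep_2)
print(delete_keep_3)
```

With a limit of two, the whole Monday chain can go. With a limit of three, inc_wed is retained, so inc_tue and full_mon must stay as well and nothing is deleted.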
[47:31] – Remote Storage Tips
Robert: When uploading to S3, parallelization helps significantly. Configure upload_concurrency and download_concurrency in config.yml to control the number of simultaneous connections. Do not set this too high since backups consume CPU, particularly if you are compressing data, and you do not want to starve ClickHouse.
For sharded clusters, back up at least one replica per shard. A critically important tip: use the {shard} macro in the path setting of your remote storage configuration. If multiple shards share the same remote storage bucket without path differentiation, they can overwrite each other’s backup metadata or accidentally delete each other’s backups when retention cleanup runs.
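The effect of the {shard} macro on remote paths can be shown with a toy substitution. In a real deployment the macro values come from the ClickHouse server's macros configuration; here expand_macros, the path templates, and the shard values are hypothetical stand-ins.

```python
def expand_macros(template, macros):
    """Replace {name} placeholders with values from a macros mapping."""
    path = template
    for name, value in macros.items():
        path = path.replace("{" + name + "}", value)
    return path


shards = [{"shard": "0"}, {"shard": "1"}]

# Without the macro, both shards write to the same remote prefix.
shared = [expand_macros("backup", m) for m in shards]

# With the macro, each shard gets its own prefix.
separate = [expand_macros("backup/shard-{shard}", m) for m in shards]

print(shared)
print(separate)
```

Two identical prefixes mean a backup named the same on both shards lands in the same place, which is exactly the overwrite and cleanup hazard described above.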
[50:00] – REST API Server Mode and Kubernetes Integration
Robert: clickhouse-backup can run as a daemon by starting it with clickhouse-backup server. In this mode it accepts backup and restore commands via a REST API over HTTP, which is how it is used in Kubernetes deployments and in Altinity.Cloud. The api: section of config.yml has all the options, including enabling a Prometheus metrics exporter, pprof profiling, and integration tables.
When integration tables are enabled (create_integration_tables: true), ClickHouse automatically gets two system tables: system.backup_list and system.backup_actions. You can query these with standard SQL to see available backups and issue commands directly through ClickHouse instead of via REST calls. For example, inserting a row into system.backup_actions triggers a backup. This is extremely convenient in environments where direct shell access is constrained.
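As a hedged sketch of both access paths, the snippet below builds (but does not send) HTTP requests for the REST API and the SQL text for the integration table. The address localhost:7171, the /backup/create and /backup/list endpoints, the name query argument, and the command column are taken from the project README as we understand it and should be checked against your version; the helper names are hypothetical.

```python
import urllib.parse
import urllib.request

API = "http://localhost:7171"  # assumed default listen address


def create_request(name):
    """Build a POST request asking the server to create a backup."""
    query = urllib.parse.urlencode({"name": name})
    return urllib.request.Request(f"{API}/backup/create?{query}", method="POST")


def list_request():
    """Build a GET request for the list of known backups."""
    return urllib.request.Request(f"{API}/backup/list", method="GET")


def actions_insert_sql(command):
    """Build the INSERT that submits a command through system.backup_actions."""
    if "'" in command or "\\" in command:
        raise ValueError("command must not contain quotes or backslashes")
    return f"INSERT INTO system.backup_actions(command) VALUES('{command}')"


req = create_request("my-backup")
sql = actions_insert_sql("create_remote my-backup")
print(req.get_method(), req.full_url)
print(list_request().full_url)
print(sql)
```

Passing a request object to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a server.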
[51:15] – Further Topics and Project Resources
Robert: The examples repository contains working command examples for all the scenarios discussed here and more, including sharded cluster backup workflows, converting MergeTree to ReplicatedMergeTree using backup, and custom remote storage using shell scripts.
[51:57] – Roadmap
Robert: We are continuing to improve incremental backup support. Beta support for backing up S3-backed MergeTree tables is also in progress. Better support for the built-in BACKUP/RESTORE command is on the roadmap. The detailed backlog is public in the GitHub issues.
If you use clickhouse-backup, help us make it better: log issues when you find problems, send pull requests for features or fixes, and tell others about the project. This is community software and broad participation makes it better for everyone.
[53:10] – Q&A
Audience: What happens if the schema changes between a full backup and an incremental backup?
Eugene: We collect the schema on every backup, including incrementals. An old part carries the old schema in its metadata, and a new part carries the new schema. During an incremental backup we calculate the new set of data part names and upload only the new incremental parts. During restore, when we attach old parts with ALTER TABLE ATTACH PART, ClickHouse will attach them using the old column set that is recorded in the part metadata. So yes, schema changes between backups are handled correctly.
Audience: Can clickhouse-backup be used to move partitions between shards for rebalancing?
Robert: You could do it that way but there are better tools. The ALTER TABLE FETCH PARTITION FROM command allows direct cross-replica partition movement and is more efficient for that purpose. clickhouse-backup is not the best tool for rebalancing, though it will work.
Audience: Are restored data files immutable?
Robert: Yes, they are. ClickHouse data parts are write-once and immutable by design.
Audience: Is the backup a true consistent snapshot across all tables?
Robert: This is an important point. clickhouse-backup processes one table at a time. Each table is frozen and unfrozen individually before we move to the next. This means that each table’s snapshot comes from a slightly different point in time. If there are active inserts on the system, tables in the backup will be slightly inconsistent with one another. However, this is not something you should rely on in your application in any case. ClickHouse does not provide cross-table transactional consistency at the application level; when you are running inserts into one table you cannot reliably correlate them with inserts into another. ClickHouse applications should not assume strong cross-table consistency, so the per-table backup behavior is acceptable in practice.
[55:23] – Final Advice: Always Test Your Backups
Robert: This is the most important takeaway from personal experience. One of the biggest mistakes you can make is to run backups and never test them. I have first-hand experience of wiping out all the data for an employer and it was not good. Test your backups before you need them. clickhouse-backup is thoroughly tested, including in production on Altinity.Cloud, and the project includes a detailed test suite. But you must also verify that your specific configuration and commands are backing up and restoring what you think they are. If you get a command wrong you may not be backing up what you expect.
Thank you all for attending. A mail message with the recording link and slide link will go out shortly. Watch for related blog articles as well. Eugene, thank you very much for helping make this a great presentation.
FAQ Section
Q: What exactly does clickhouse-backup protect, and how does it differ from using ClickHouse replication alone?
A: clickhouse-backup captures everything needed to fully reconstruct a ClickHouse® deployment: configuration files, table schema definitions, table data, and RBAC metadata (users, roles, profiles, quotas created via SQL). Replication alone cannot substitute for a backup because it provides no protection against logical errors such as accidental drops and does not copy configuration files, RBAC definitions, or schema to a separate location. If an accidental DROP TABLE is replicated to all replicas, replication makes the damage worse, not better. A proper backup that lives somewhere entirely separate is the only solution for true data protection.
Q: Why must clickhouse-backup run on the same host as ClickHouse, and what does that mean for operations?
A: clickhouse-backup uses Linux hard links as the mechanism for creating instant local snapshots via ALTER TABLE FREEZE. Hard links can only reference inodes within the same filesystem. Since ClickHouse stores its data on a local filesystem, clickhouse-backup must also run on that same host to create hard links into the data directories. The practical consequence is that backup processes consume CPU and I/O resources on the ClickHouse host. For very large backups (40-plus terabytes), running on a host that is serving production queries will create some impact. The recommended mitigation is to dedicate one replica in each shard to backup duties, keeping production query load on other replicas.
Q: How do incremental backups work and what can break them?
A: clickhouse-backup’s incremental backups work by comparing part names in the current snapshot against part names already stored in the remote full backup. Only parts with names not present in the full backup are uploaded. This is fast because creating the local snapshot is just hard-linking, with no data copying. The mechanism breaks down when parts are renamed by mutations. Running ALTER TABLE DELETE WHERE ... rewrites all parts that contain matching rows and gives the new parts new names. When clickhouse-backup runs the next incremental, it sees all new part names and treats the entire set as changed, uploading everything as if it were a full backup. For workloads with frequent large-scale deletes, incremental backups are significantly less efficient. Using TTL-based expiry or partition drops instead of row-level deletes keeps part names more stable and makes incrementals more effective.
Q: What is the shadow/ directory and why do I sometimes see a discrepancy between ClickHouse’s reported disk usage and my filesystem?
A: The shadow/ directory under /var/lib/clickhouse/ is where ClickHouse places hard links during ALTER TABLE FREEZE. Each hard link in the shadow directory points to the same inode as a real part file, so it consumes no additional disk space for the data bytes. However, as long as those hard links exist, the inodes they reference cannot be released. If you forget to unfreeze a backup or do not delete an old local backup, those shadow hard links will keep the inode alive even after ClickHouse has merged or dropped the original parts. The disk space is still occupied, and df reports it as used, but system.parts does not show it because ClickHouse has no knowledge of the shadow directory contents. Running clickhouse-backup clean removes accumulated shadow directories and resolves the discrepancy.
Q: How does the REST API server mode work and when should I use it?
A: When you start clickhouse-backup with clickhouse-backup server, it runs as a daemon exposing an HTTP REST API instead of requiring direct command-line access. This is the standard deployment model for Kubernetes environments and for Altinity.Cloud. With integration tables enabled in config.yml, the API is also accessible via SQL through ClickHouse system tables system.backup_list and system.backup_actions, allowing you to trigger and monitor backups purely through SQL without needing shell access to the clickhouse-backup process. This is particularly useful in containerized environments where exec-ing into pods for routine operations is undesirable.
© 2026 Altinity, Inc. All rights reserved. Altinity®, Altinity.Cloud®, and Altinity Stable® are registered trademarks of Altinity, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc. Kubernetes, MySQL, and PostgreSQL are trademarks and property of their respective owners.