Keeping Your Cloud Native Data Safe: A Common-Sense Guide to Kubernetes, ClickHouse®, and Security

Recorded: September 27 @ 08:00 am PDT
Presenter: Robert Hodges, CEO @Altinity

In this webinar, Robert Hodges presents a practical, layered guide to securing ClickHouse® running on Kubernetes. Rather than attempting to tackle security all at once across an infinitely complex system, he introduces a three-layer threat model: protect the database itself, protect the Kubernetes layer, and protect external data such as backups and object storage. This structure allows teams to make meaningful progress one layer at a time.

At the database layer, Robert shows how the Altinity Kubernetes Operator for ClickHouse® dramatically reduces complexity by collapsing the many individual Kubernetes resources a ClickHouse cluster requires into a single ClickHouseInstallation manifest. He walks through using this manifest to configure SHA-256 hashed passwords for the default user, restrict that user to cluster-local IPs, and enable shared-secret authentication for inter-cluster distributed queries. He then covers the three main ways to use Kubernetes Secrets with the operator: supplying user passwords by referencing a Secret key, passing cloud credentials (such as AWS S3 access keys) as environment variables, and mounting TLS certificates and private keys as files inside the pod. He also shows how to configure encrypted EBS storage through a custom StorageClass with a single encrypted: "true" parameter.

At the GitOps and container security layer, Robert explains how Argo CD makes consistent, auditable deployments possible by storing all service definitions in GitHub and syncing them to Kubernetes clusters on demand. He addresses the challenge of storing Secrets in Git with Bitnami Sealed Secrets (the kubeseal project) and HashiCorp Vault, and covers container image scanning with Trivy and Docker Scout.

At the Kubernetes infrastructure layer, he recommends using managed Kubernetes on any of the major clouds, noting that EKS, GKE, and AKS each cost about 10 cents per hour to operate, which makes them a straightforward choice over self-managed clusters. He covers short-lived credentials, disabling SSH to worker nodes, using private subnets and VPC peering, and using Amazon GuardDuty for anomaly detection.

Finally, at the external data layer, he addresses securing backups with Altinity Backup for ClickHouse® (encryption enabled, proper S3 bucket policies blocking public access), securing S3-backed MergeTree tables, and being thoughtful about database log exports to external systems to prevent leakage of personally identifiable information.

Key Moments (Timestamps)

Key moments generated with AI assistance.

  • 0:04 – Welcome, introductions, and housekeeping
  • 1:31 – About Altinity: products and open source projects
  • 3:34 – What is Kubernetes?
  • 6:01 – Why databases on Kubernetes need a security model
  • 9:49 – Layer 1: Using operators to manage complexity and enable security
  • 16:06 – Layer 1 continued: Credentials and Kubernetes Secrets
  • 20:57 – Layer 1 continued: TLS configuration and mounting certificates
  • 25:58 – Layer 1 continued: Encrypted storage with custom StorageClasses
  • 29:04 – Live demo: ClickHouseInstallation with passwords, TLS, and encrypted storage
  • 32:43 – Layer 2: Argo CD for consistent GitOps-based service management
  • 37:29 – Sealed Secrets and managing credentials safely in Git
  • 38:34 – Container scanning with Trivy and Docker Scout
  • 39:43 – Layer 2 continued: Protecting Kubernetes itself with managed Kubernetes
  • 41:47 – Layer 3: Protecting external data (backups, S3 tables, and log exports)
  • 45:11 – Summary and Altinity.Cloud® overview

Webinar Transcript

[0:04] – Welcome, Introductions, and Housekeeping

Robert: Hi, everybody. Welcome to our webinar today on keeping your cloud-native data safe. We’re going to present a common-sense guide to Kubernetes, ClickHouse®, and security, and show you some interesting tricks for combining all three together. My name is Robert Hodges. I’m CEO of Altinity, and I’m backed up by the Altinity engineering team who did the legwork and developed the software that makes all of this possible.

Before we go further, a couple of hints. This is being recorded and you’ll get an email containing a link to the recording and the slides within about 24 hours. There is time for questions. You can enter them into the Q&A box or throw them into the chat. If it’s relevant to the current topic, I’ll dive in and answer right away; otherwise we’ll cover them at the end.


[1:31] – About Altinity

Robert: I’m Robert Hodges. My day job is CEO of Altinity. I spend a lot of time looking at spreadsheets, but I’ve been working on databases for about 40 years (I say 30-plus in my slides because 40 sounds older) and on Kubernetes since 2018. The Altinity engineering team is about two-thirds of our staff, around 30 people spread across 16 countries, with deep experience in both databases and Kubernetes.

Altinity is an enterprise provider for ClickHouse. We enable you to run analytics based on ClickHouse anywhere you need to. Our main offerings are Altinity.Cloud®, one of the first clouds built for ClickHouse, available on Amazon, GCP, and Azure; enterprise support for self-managed clusters; and we are the authors of the Altinity Kubernetes Operator for ClickHouse®, which is the main subject of this talk. We’ve been working on it since early 2019 and have invested a huge amount of effort to make ClickHouse fully cloud-native.


[3:34] – What Is Kubernetes?

Robert: Kubernetes is a system for orchestrating container-based applications. If you have an application deployable as one or more Docker containers, Kubernetes can deploy those containers anywhere you want, in any combination, and can scale them up and down and manage storage. It’s a very capable system for developing modern applications.

A typical simple example is a ClickHouse database running in a container and talking to block storage. Kubernetes defines this using resources: a StatefulSet to manage stateful containers, a Pod to represent the running process, a PersistentVolumeClaim to describe the storage the pod needs, and a PersistentVolume to represent the actual allocated storage. Kubernetes maps these definitions to physical infrastructure, for example a host running on a VM in Amazon talking to EBS block storage.
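To make the resource names above concrete, here is a minimal Python sketch that builds a StatefulSet manifest as a plain dictionary and prints it as JSON (which kubectl accepts alongside YAML). The name clickhouse-demo, the 10Gi size, and the mount path are illustrative assumptions, not values from the talk; the Pod comes from the template section and the PersistentVolumeClaim from volumeClaimTemplates.

```python
import json


def clickhouse_statefulset(name: str, storage: str = "10Gi") -> dict:
    """Build a minimal StatefulSet manifest: one ClickHouse pod, one volume claim."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": 1,
            "selector": {"matchLabels": labels},
            # The Pod that the StatefulSet creates and keeps running.
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "clickhouse",
                        "image": "clickhouse/clickhouse-server",
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/clickhouse"}
                        ],
                    }],
                },
            },
            # One PersistentVolumeClaim is created per pod from this template;
            # Kubernetes binds it to a PersistentVolume backed by real storage.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


# "clickhouse-demo" is a hypothetical name used only for illustration.
manifest = clickhouse_statefulset("clickhouse-demo")
print(json.dumps(manifest, indent=2))
```

The volume mount and the claim template share the name data, which is how Kubernetes knows which claim to attach to the container.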


[6:01] – Why Databases on Kubernetes Need a Security Model

Robert: Kubernetes is actually a great platform for databases. For a long time it wasn’t, because storage management and stateful application management were cumbersome, but those problems were solved years ago. The challenge is that when you run a database on Kubernetes, it’s another place you have to protect.

When you start looking at security, you need a threat model: what are the components of the system, what are the API surfaces an attacker could exploit, and where could data be grabbed from? When you combine ClickHouse, which is a complex application in its own right, with Kubernetes, which is a complex distributed system, and add the underlying infrastructure like EBS volumes and object storage, you end up with something very complex. It’s hard to know where to even start.

The key is to think about security in layers, tackling one layer at a time:

Layer 1: Protect the database itself, everything within the scope of the deployed ClickHouse cluster.

Layer 2: Protect the Kubernetes environment the database runs in.

Layer 3: Protect external data, things that exist outside of Kubernetes such as object storage and backups.

One additional choice to note: you can take a “protect everything” approach and encrypt communications between every internal component, even those not visible from outside the cluster. Or you can assume that internal Kubernetes traffic is reasonably well protected and only harden the external-facing parts. Both are valid depending on your requirements. In this webinar we’ll assume you want to protect everything.


[9:49] – Layer 1: Using Operators to Manage Complexity and Enable Security

Robert: The first problem is that ClickHouse is complicated, and real production clusters are even more so. Our task is to deal with that complexity in a way that makes it manageable. The answer is Kubernetes operators.

The Altinity Kubernetes Operator for ClickHouse® runs as a container inside Kubernetes. It does two things: it defines a custom resource called ClickHouseInstallation (CHI) that lets you describe a ClickHouse cluster in a relatively compact YAML file, and it processes those definitions by reconciling the actual Kubernetes resources to match what you specified.

A simple ClickHouse definition might be a few lines of YAML. A real production definition might be 24 inches of YAML covering availability zones, security features, storage, and more. The operator reads all of that and maps it to a fully replicated ClickHouse cluster with all the right services, StatefulSets, PersistentVolumeClaims, and configuration.

Here’s an example of a ClickHouseInstallation with security features built in. We have a cluster definition, and in the user settings we define the default ClickHouse user with a SHA-256 hashed password. This is a single line in the operator YAML. The operator also automatically restricts the default user to localhost and to IPs within the cluster, so it can’t be used for external attacks. Another security feature available is a shared secret for inter-cluster communications, which is used when distributed queries need to authenticate between ClickHouse servers. If you’re familiar with sharding and replication in ClickHouse, you know how important this is. Using the operator, you get these features essentially for free by setting a small amount of YAML.
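As a rough illustration of the hashed-password setting, the Python sketch below computes a SHA-256 hex digest and places it in a ClickHouseInstallation manifest built as a dictionary. The field names (default/password_sha256_hex, shardsCount, replicasCount) reflect our understanding of the operator's published examples and should be checked against the operator documentation; the installation name, cluster name, and password are hypothetical.

```python
import hashlib
import json


def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of a password as a 64-character hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def clickhouse_installation(name: str, password: str) -> dict:
    """Build a small ClickHouseInstallation with a hashed default-user password."""
    return {
        "apiVersion": "clickhouse.altinity.com/v1",
        "kind": "ClickHouseInstallation",
        "metadata": {"name": name},
        "spec": {
            "configuration": {
                # Only the hash goes into the manifest, never the plaintext.
                "users": {"default/password_sha256_hex": sha256_hex(password)},
                "clusters": [{
                    "name": "demo",  # hypothetical cluster name
                    "layout": {"shardsCount": 1, "replicasCount": 2},
                }],
            },
        },
    }


# Hypothetical values for illustration only.
chi = clickhouse_installation("secure-demo", "example-password")
print(json.dumps(chi, indent=2))
```

Hashing keeps the plaintext out of the manifest, but the hash is still sensitive, which is one reason to move it into a Kubernetes Secret.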


[16:06] – Layer 1 Continued: Credentials and Kubernetes Secrets

Robert: We want to look more carefully at credentials because they’re the keys to everything. In a typical ClickHouse deployment you might be talking to Kafka (using the Kafka table engine, which needs credentials), to other ClickHouse databases for distributed queries, to object storage like Amazon S3 (where leaked credentials expose everything), and you have passwords for the database users themselves.

Kubernetes Secrets are the right tool for passing credentials safely into pods. Here’s an example of a Secret resource: it has a name, and its data is a base64-encoded key-value pair. When loaded, it’s stored securely in etcd. The Altinity operator supports Secrets in three ways:

First, for passwords: The operator has special syntax where you can tell it: for the SHA-256 password value of the default user, go to the Secret called db-passwords and look up the key root-login. One line of YAML, no plaintext password anywhere in your manifest.

Second, for cloud credentials: A common way to pass S3 credentials into a pod is as environment variables. You reference the Secret in the pod spec’s env section, and Kubernetes materializes those values as environment variables inside the running container. Your application then picks them up to connect to S3. This is a very standard approach for cloud credential injection.

Third, for TLS certificates: Secrets can be mounted as files inside pods. This is enormously useful. You store your server certificate, private key, and CA certificate in a Secret, then in the pod spec you define a volume that mounts the Secret as a directory of files, and a volumeMount that places those files at the correct paths inside the container. ClickHouse then reads them from disk via its OpenSSL configuration.
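The Secret resource and the environment-variable pattern described above can be sketched with standard Kubernetes fields only. This Python example is an illustration under assumptions: the Secret name s3-credentials, its keys, and the placeholder values are hypothetical, and the operator-specific password reference syntax is deliberately not shown.

```python
import base64
import json


def make_secret(name: str, values: dict) -> dict:
    """Build an Opaque Secret; Kubernetes expects each value base64-encoded."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


def env_from_secret(env_name: str, secret_name: str, key: str) -> dict:
    """Build a container env entry whose value is read from a Secret key."""
    return {
        "name": env_name,
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
    }


# Hypothetical Secret name, keys, and placeholder values.
secret = make_secret(
    "s3-credentials",
    {"access-key": "EXAMPLE_ACCESS_KEY", "secret-key": "EXAMPLE_SECRET_KEY"},
)
container_env = [
    env_from_secret("AWS_ACCESS_KEY_ID", "s3-credentials", "access-key"),
    env_from_secret("AWS_SECRET_ACCESS_KEY", "s3-credentials", "secret-key"),
]
print(json.dumps({"secret": secret, "env": container_env}, indent=2))
```

The env list would go into the container section of the pod spec; the manifest itself then contains only references, not the credential values.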


[20:57] – Layer 1 Continued: TLS Configuration and Mounting Certificates

Robert: Let’s cover TLS more specifically. TLS is the standard for encrypting connections to and from databases. In a ClickHouse deployment, you want to encrypt client-to-server connections and, optionally, server-to-server connections for replication and distributed queries.

To set up TLS in ClickHouse you need three things: a server certificate (containing the public key), a matching private key, and optionally a Certificate Authority (CA) certificate. In many enterprise environments, teams create their own private CA rather than using a public one, which is simpler for internal services. If you do that, you need to pass the CA certificate to ClickHouse so it can verify the server certificate.

In the ClickHouseInstallation manifest you configure three things:

  1. Enable secure: yes to activate TLS ports.
  2. In the settings section, enable the secure TCP port (9440) and HTTPS port (8443), and optionally leave the standard TCP port open for localhost-only access without exposing it externally.
  3. Include an OpenSSL configuration file (passed in via a ConfigMap or Secret) that tells ClickHouse where to find the certificate, key, and CA certificate files, and what verification level to use.

Mounting the actual certificate files uses the Secret-as-files pattern described above. You store the three files in a Secret, mount it as a volume in the pod spec, and map those files to the paths your OpenSSL config expects. Once set up, the most you’ll ever pass in is three files, so it’s really not complicated at all.
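The three pieces above (secure ports, an OpenSSL configuration, and the mounted certificate files) can be sketched in Python as follows. This is an illustration, not the manifest from the talk: the Secret name clickhouse-certs, the mount directory, and the file names are assumptions, and the relaxed verification mode is just one possible choice.

```python
import json
import xml.etree.ElementTree as ET

CERT_DIR = "/etc/clickhouse-server/certs"  # hypothetical mount path

# Server settings that enable the TLS-protected ports.
secure_ports = {"tcp_port_secure": 9440, "https_port": 8443}


def secret_volume(volume_name: str, secret_name: str) -> dict:
    """Pod-level volume that exposes each Secret key as a file."""
    return {"name": volume_name, "secret": {"secretName": secret_name}}


def volume_mount(volume_name: str, mount_path: str) -> dict:
    """Container-level mount that places the Secret's files at mount_path."""
    return {"name": volume_name, "mountPath": mount_path, "readOnly": True}


def openssl_config(cert_dir: str) -> str:
    """Render a ClickHouse openSSL server section pointing at the mounted files."""
    root = ET.Element("clickhouse")
    server = ET.SubElement(ET.SubElement(root, "openSSL"), "server")
    for tag, value in [
        ("certificateFile", f"{cert_dir}/server.crt"),
        ("privateKeyFile", f"{cert_dir}/server.key"),
        ("caConfig", f"{cert_dir}/ca.crt"),
        ("verificationMode", "relaxed"),
    ]:
        ET.SubElement(server, tag).text = value
    return ET.tostring(root, encoding="unicode")


# "clickhouse-certs" is a hypothetical Secret holding the three files.
volumes = [secret_volume("certs", "clickhouse-certs")]
mounts = [volume_mount("certs", CERT_DIR)]
config_xml = openssl_config(CERT_DIR)
print(json.dumps({"settings": secure_ports, "volumes": volumes, "volumeMounts": mounts}, indent=2))
print(config_xml)
```

The file names under the mount directory correspond to the keys in the Secret, so the paths in the OpenSSL section have to match those keys exactly.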


[25:58] – Layer 1 Continued: Encrypted Storage with Custom StorageClasses

Robert: Before we look at the full demo, let’s cover encrypted storage.

In Kubernetes, when a pod has a PersistentVolumeClaim, a StorageClass is responsible for actually provisioning the storage. Storage classes are connected to provisioners that allocate physical storage matching the claim. On modern cloud providers, the EBS CSI provisioner and its equivalents support encrypted storage natively.

Creating a custom StorageClass with encryption enabled is very simple. You define a StorageClass resource using the EBS provisioner, and you add encrypted: "true" to the parameters. That’s it. Any EBS volume provisioned by this storage class will be encrypted. Amazon manages the keys; you don’t have to do anything else. If someone somehow accessed that storage through another path, they’d see garbage. We find it makes relatively little difference to ClickHouse performance.

To apply it, in the ClickHouseInstallation manifest you simply reference this storage class by name in the volume claim template. One word gets you encrypted storage. It’s so easy that you should always do it.
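A minimal sketch of the two pieces, written as Python dictionaries that serialize to JSON manifests. The class name gp3-encrypted, the gp3 volume type, the 100Gi size, and the binding mode are assumptions for illustration; the encrypted parameter is a string because StorageClass parameters are string-valued.

```python
import json


def encrypted_storage_class(name: str) -> dict:
    """StorageClass for the AWS EBS CSI driver with encryption at rest enabled."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "ebs.csi.aws.com",
        # Parameter values must be strings, hence "true" rather than True.
        "parameters": {"type": "gp3", "encrypted": "true"},
        "volumeBindingMode": "WaitForFirstConsumer",
    }


def volume_claim_template(name: str, storage_class: str, size: str) -> dict:
    """Volume claim template that requests storage from the named class."""
    return {
        "name": name,
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size for illustration.
storage_class = encrypted_storage_class("gp3-encrypted")
claim = volume_claim_template("data", "gp3-encrypted", "100Gi")
print(json.dumps([storage_class, claim], indent=2))
```

The only link between the two is the class name, which is the "one word" that switches a volume claim over to encrypted storage.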


[29:04] – Live Demo: ClickHouseInstallation with Passwords, TLS, and Encrypted Storage

Robert: Let me jump over and show you what all of this looks like assembled. The samples directory has the YAML I’ve developed for various talks.

Here’s a real ClickHouseInstallation resource. Looking through it: we have the cluster definition with two replicas, one shard. Here’s the configuration to get the ports set correctly, enabling the secure TCP and HTTPS ports. Here’s the connection to ZooKeeper or ClickHouse Keeper. Here’s the password being set by referencing a Secret. Here’s a configuration file passed in that contains the OpenSSL settings, pointing ClickHouse to the certificate files. And here’s the pod definition where we’re mounting the CA certificate, server certificate, and server key. Down here is our encrypted storage class reference.

Everything I showed you piece by piece is right there. It takes a few minutes to set up and then it all works.

The other tool running in this environment is Argo CD. The ClickHouse operator itself was installed from Argo CD, and the ZooKeeper instance for this cluster is also managed by Argo CD. This is an incredibly helpful tool and we’ll discuss it in the next section.


[32:43] – Layer 2: Argo CD for Consistent GitOps-Based Service Management

Robert: To do security effectively, you need to be consistent. Operators organize the ClickHouse service; you also need a way to organize all the services you’re using. The ideal is to have definitions in GitHub and apply them to specific Kubernetes clusters, so that everything is defined in one place and is easy to sync to reality.

This is exactly what Argo CD solves. Other approaches like Terraform or Ansible also work. We actually use Ansible ourselves in some cases, and about a third of the Altinity customers I’ve spoken to use Argo CD. Argo CD supports Kubernetes manifests, Kustomize projects, and Helm charts, and handles all of them with the same consistent commands.

Here’s how it works: you store all your application and service definitions in a GitHub project, install Argo CD into your Kubernetes cluster, and then use the Argo CD CLI to define applications pointing at specific paths in GitHub. When you run argocd app sync, it takes what’s in GitHub and applies it to the cluster. The whole process is auditable because every change goes through Git. It’s also addictive: once you get used to having everything defined in one place that you can sync to any cluster, you won’t want to go back to running scripts manually.
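Applications can also be declared as manifests rather than through the CLI. The Python sketch below builds an Argo CD Application resource as a dictionary; the repository URL, path, application name, and namespaces are hypothetical, and it assumes Argo CD is installed in the argocd namespace and deploys to the cluster it runs in.

```python
import json


def argocd_application(name: str, repo_url: str, path: str, namespace: str) -> dict:
    """Build an Argo CD Application that tracks one path in a Git repository."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Where the desired state lives.
            "source": {"repoURL": repo_url, "path": path, "targetRevision": "HEAD"},
            # Where it should be applied: the cluster Argo CD itself runs in.
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
        },
    }


# Hypothetical repository and path for illustration.
app = argocd_application(
    "clickhouse-operator",
    "https://github.com/example-org/platform-config.git",
    "apps/clickhouse-operator",
    "clickhouse",
)
print(json.dumps(app, indent=2))
```

With a definition like this in place, argocd app sync clickhouse-operator would apply whatever is at that path in Git to the target namespace.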


[37:29] – Sealed Secrets and Managing Credentials Safely in Git

Robert: One problem with GitOps is that if you check Kubernetes Secrets directly into GitHub, even base64-encoded, you’re exposing your credentials publicly. If you’ve done this by accident, you know that Amazon will send you an email within minutes. Base64 encoding is not encryption and provides no protection.
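To see why base64 offers no protection, consider this short Python illustration. The password is a made-up placeholder; the point is that decoding needs no key at all.

```python
import base64

# What ends up in the "data" field of a Secret manifest.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Anyone who can read the manifest can reverse it with one call.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Base64 exists to carry arbitrary bytes safely inside text formats, not to hide them.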

A clean solution is Bitnami Sealed Secrets (the kubeseal project). It gives you a command-line tool to encrypt a Secret before checking it into GitHub. Once encrypted, the values look like garbage to anyone browsing the repository. When the manifest is applied to Kubernetes, a Sealed Secrets controller running inside the cluster decrypts them, turning them back into ordinary Secrets that your applications can use.

Another popular approach is HashiCorp Vault, which provides a full secrets management service with access controls, audit logging, and dynamic secret rotation. Both are solid choices depending on your needs.


[38:34] – Container Scanning with Trivy and Docker Scout

Robert: Another important layer of protection is scanning the container images you run. We scan all Altinity builds: we run Trivy and Docker Scout against every Altinity Stable® Build for ClickHouse® and every official ClickHouse release. If either tool reports a CVE marked High, we stop, investigate, fix it, and rebuild.

Trivy is excellent: it runs from the command line and can scan any container image. Docker Scout is provided by Docker and scans images in Docker Hub repositories; it’s currently free. Both are great tools that are freely available and easy to integrate into CI/CD pipelines. Scanning containers before deployment is a basic hygiene step that’s worth doing.
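One way to wire a scan into a pipeline is a small wrapper around the Trivy command line. This Python sketch is an assumption about how such a gate might look, not Altinity's actual build tooling; the image name is hypothetical, and it relies on Trivy's --severity and --exit-code options.

```python
import subprocess


def trivy_command(image: str, severities=("HIGH", "CRITICAL")) -> list:
    """Build a Trivy invocation that exits non-zero if matching CVEs are found."""
    return [
        "trivy", "image",
        "--severity", ",".join(severities),
        "--exit-code", "1",
        image,
    ]


def image_passes_scan(image: str) -> bool:
    """Run Trivy (must be installed) and report whether the image passed.

    Any non-zero exit code is treated as a failure, so a Trivy error
    blocks the build in the same way a finding does.
    """
    result = subprocess.run(trivy_command(image), check=False)
    return result.returncode == 0


# Hypothetical image name; only the command is built here, nothing is executed.
command = trivy_command("example/clickhouse-server:latest")
print(" ".join(command))
```

Calling image_passes_scan in a CI step and failing the job when it returns False gives the stop-and-investigate behavior described above.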


[39:43] – Layer 2 Continued: Protecting Kubernetes Itself with Managed Kubernetes

Robert: We’ve spent about 20 slides securing the database. Now let’s protect Kubernetes and the external data.

For protecting Kubernetes itself, the answer in two words is managed Kubernetes. Services like Amazon EKS, Google GKE, and Azure AKS ensure that etcd encryption is enabled, that control plane components are kept up to date, that Kubernetes security patches are applied, and that worker node health is monitored. They’re an excellent deal: EKS, GKE, and AKS all cost 10 cents per hour to operate the cluster. You were going to pay for VMs anyway, so the overhead is minimal. Most Altinity users run managed Kubernetes, and very few operate their own unless they’re on-premises.

Beyond using managed Kubernetes, a few practical steps:

Use short-lived credentials to access the cluster. I accidentally demonstrated this live when I couldn’t log into the demo cluster because my token had expired. That’s the right behavior.

Disable SSH to worker nodes. There’s no reason to shell into a Kubernetes worker node in normal operations. Removing that access reduces the attack surface.

Use private subnets and VPC peering so that your Kubernetes workers have no public endpoints. Keep everything off the public internet where possible.

Use Amazon GuardDuty (or the equivalent on your cloud). GuardDuty monitors your account for anomalous behavior. It’s not a panacea, but if something unusual is happening inside your Kubernetes environment, GuardDuty will typically send you a notification so you can investigate.


[41:47] – Layer 3: Protecting External Data (Backups, S3 Tables, and Log Exports)

Robert: External data is a broad category, but three types are particularly common with ClickHouse.

Backups: Use Altinity Backup for ClickHouse® with encryption enabled for uploads to S3 or GCS. Use the same Secrets-based credential injection techniques we covered earlier so access keys aren’t hard-coded in config files. Critically, apply appropriate S3 bucket policies that block public access. If an attacker can download your backups, they don’t need to touch Kubernetes at all: they already have all your data.

S3-backed MergeTree tables: S3-backed MergeTree is increasingly popular for storing ClickHouse data cheaply. The same principle applies: the S3 bucket containing your table data must be protected with bucket policies that prevent public access and restrict access to only the roles and credentials that need it. Leaked S3 credentials for a bucket containing ClickHouse table data are just as bad as leaked database credentials.
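For both backups and table data, the first control is usually S3's public access block. The Python sketch below shows the four settings as they appear in the S3 PublicAccessBlockConfiguration structure, plus a small check that could run in a review script; how the configuration is applied (console, CLI, or infrastructure code) is left out, and the helper name is our own.

```python
import json

# The four flags of S3's PublicAccessBlockConfiguration, all enabled.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def is_public_access_blocked(config: dict) -> bool:
    """Return True only if every one of the four flags is present and enabled."""
    return all(config.get(flag) is True for flag in PUBLIC_ACCESS_BLOCK)


print(json.dumps(PUBLIC_ACCESS_BLOCK, indent=2))
print(is_public_access_blocked(PUBLIC_ACCESS_BLOCK))
```

Blocking public access is a floor rather than a complete policy; the bucket policy still needs to limit access to the specific roles that read and write the data.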

Database logs: This one is often overlooked. If you’re forwarding ClickHouse logs to an external system, such as Grafana Cloud’s Loki or Datadog, log messages may contain personally identifiable information or other sensitive data. ClickHouse supports defining regex-based filter expressions to suppress specific log message patterns before they leave the system. Consider applying those filters. Alternatively, simply don’t forward logs externally: keep them inside the Kubernetes cluster and access them there when needed.
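The idea behind regex-based log filtering can be shown in a few lines of Python. This is an illustration of the technique only, not ClickHouse's own configuration syntax; the email pattern and the replacement token are assumptions, and real deployments would need patterns for whatever sensitive fields their queries contain.

```python
import re

# Simplified email pattern; deliberately loose, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_line(line: str) -> str:
    """Replace anything that looks like an email address before export."""
    return EMAIL.sub("<email>", line)


def filter_logs(lines):
    """Mask every line in a stream of log messages."""
    return [mask_line(line) for line in lines]


sample = [
    "login failed for alice@example.com from 10.0.0.1",
    "merge completed in 42 ms",
]
for line in filter_logs(sample):
    print(line)
```

Masking keeps the log line useful for debugging while removing the sensitive value; dropping matching lines entirely is the stricter alternative.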


[45:11] – Summary and Altinity.Cloud® Overview

Robert: In summary, if you want to sleep well while running cloud-native ClickHouse, here are the tools and techniques we’ve covered:

Use the Altinity Kubernetes Operator for ClickHouse® to manage cluster complexity and get security features built in. Use Kubernetes Secrets for passwords, cloud credentials, and TLS certificates. Enable TLS on all ClickHouse ports. Use a custom StorageClass with encryption enabled for EBS volumes. Use Argo CD or a similar GitOps tool for consistent, auditable deployments. Use Sealed Secrets or Vault to protect credentials in Git. Scan container images regularly with Trivy or Docker Scout. Use managed Kubernetes. Use short-lived credentials, private subnets, and security monitoring tools like GuardDuty. Encrypt backups, protect S3 object storage with appropriate bucket policies, and be careful about log forwarding.

This isn’t a complete list, but it covers the most important, high-return steps. Security is hard and you can’t protect against every possible attack. But there are a lot of easy wins available and I’ve shown you most of them.

Of course, you might want to save effort by using Altinity.Cloud®, which implements most of what we’ve discussed out of the box. All the hardening we’ve shown was developed working with Altinity.Cloud®; we use it ourselves. Altinity.Cloud® can run in our own Kubernetes accounts on EKS and GKE, or it can run in your own Kubernetes environment using the Bring Your Own Kubernetes (BYOK) model. You can’t tell the difference in the UI; you get the same interface and API either way. We manage the database for you, so it’s not just security but many other operational concerns as well.

If you want to run it yourself, the resources are all publicly available: the Altinity Kubernetes Operator for ClickHouse® with its security hardening guide in the docs directory, the clickhouse-sql-examples repository with worked examples specific to this topic, the Argo CD project, and the Altinity documentation with a wide range of Kubernetes and security guidance.

Thank you very much for attending today.


FAQ Section

What is the three-layer security model for ClickHouse on Kubernetes?

The model divides protection into three layers. Layer 1 covers the database itself: passwords, user restrictions, TLS encryption, inter-cluster authentication, encrypted storage, and using Kubernetes Secrets to pass credentials safely. Layer 2 covers Kubernetes: using managed Kubernetes, short-lived credentials, private subnets, disabling SSH to worker nodes, and enabling anomaly detection tools like Amazon GuardDuty. Layer 3 covers external data: encrypting and restricting access to backups in object storage, protecting S3 buckets used for MergeTree table data, and filtering personally identifiable information from log exports.

How does the Altinity Kubernetes Operator for ClickHouse® help with security?

The operator collapses a complex ClickHouse cluster, which requires many individual Kubernetes resources, into a single ClickHouseInstallation YAML manifest. This makes security configuration manageable. The operator has built-in support for SHA-256 password hashing for ClickHouse users, automatic IP restriction of the default user to cluster-local addresses, shared secret authentication for inter-cluster distributed queries, Kubernetes Secret references for passing passwords and credentials, and mounting TLS certificates as pod files. Using the operator means these features are consistently applied rather than configured manually and potentially missed.

What are the three ways to use Kubernetes Secrets with ClickHouse?

The Altinity Operator supports three uses for Kubernetes Secrets. First, password references: instead of putting a hashed password directly in the manifest, you reference a Secret key, keeping credentials out of your YAML files. Second, environment variable injection: S3 credentials and other cloud keys are passed as environment variables into the running pod, available to ClickHouse without appearing in plaintext anywhere in the manifest. Third, file mounting: TLS certificates, private keys, and CA certificates are stored in a Secret and mounted as actual files at specific paths inside the pod, which ClickHouse then reads via its OpenSSL configuration.

Why should I use Sealed Secrets or Vault when checking Kubernetes Secrets into GitHub?

Kubernetes Secrets are only base64-encoded, not encrypted. Checking them into a public or even private GitHub repository exposes your credentials: cloud providers like Amazon actively scan public repositories and will send alerts if they find exposed credentials. Bitnami Sealed Secrets (kubeseal) lets you encrypt the Secret values with a public key before checking them in, and a controller inside Kubernetes decrypts them at apply time. HashiCorp Vault provides a more comprehensive secrets management solution with access controls and audit logging. Either approach is appropriate depending on the complexity of your requirements.

What is the simplest way to get encrypted storage for ClickHouse on Kubernetes?

Create a custom Kubernetes StorageClass using the EBS CSI provisioner (or equivalent) with encrypted: "true" in the parameters. Reference this storage class by name in the volumeClaimTemplate of your ClickHouseInstallation manifest. Any EBS volume provisioned for ClickHouse data will then be encrypted at rest. Amazon manages the encryption keys automatically. The performance impact is minimal, and the protection is significant: anyone who somehow accessed the raw storage would see only encrypted garbage.

Why is managed Kubernetes recommended for securing the Kubernetes layer?

Managed Kubernetes services like Amazon EKS, Google GKE, and Azure AKS handle critical security hygiene automatically: etcd encryption, control plane security patches, Kubernetes version upgrades, and certificate rotation. They all cost approximately 10 cents per hour to operate, making them essentially free relative to the VM costs you would pay regardless. Self-managed Kubernetes requires ongoing operational effort to apply security patches and manage the control plane, effort that is better spent elsewhere. Most Altinity customers running ClickHouse on Kubernetes use managed Kubernetes for this reason.


© Altinity, Inc. Altinity®, Altinity.Cloud®, and Altinity Stable® are registered trademarks of Altinity, Inc. ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc. Kubernetes, MySQL, and PostgreSQL are trademarks and property of their respective owners.
