Bring up ClickHouse on Kubernetes with Argo CD
ClickHouse users are creative and teach us new tricks constantly. Our customers introduced us to one of my favorites: Argo CD. It lets you store Kubernetes service configurations in GitHub projects, then deploy them on Kubernetes in a few simple commands or even automatically when you check in changes.
The Argo CD way of managing services is known as GitOps. You have probably heard of GitOps or use it today. Once you get it set up it can be addictively easy to use.
This article explains what Argo CD does, how to install it, and how to set up a simple ClickHouse stack. To follow along, you’ll need your own Kubernetes cluster. The Argo CD configuration used in this blog is located in the Altinity argocd-examples-clickhouse project on GitHub. I tested on Minikube running Kubernetes 1.27.4, but the examples should run anywhere.
Introducing Argo CD
Kubernetes is great for running container-based services, but it has a fundamental issue. There is no consistent definition of the “service” itself. Depending on the service, users may need to use Helm charts, YAML manifests, or Kustomize projects to install. This makes deployment confusing, especially for systems that consist of many services with different install patterns.
Argo CD solves the deployment problem. Users define Argo CD Applications, which map source configuration files (the “state”) in GitHub to a target Kubernetes cluster and namespace. Argo CD handles standard deployment methods, including Helm charts, manifest files, Kustomize projects, and Jsonnet, to name a few. Argo CD provides simple commands to define the application, install it, and keep it up to date when the definition changes in GitHub.
Here’s a simple illustration.
In short, Argo CD is a way to implement GitOps on Kubernetes. If you want to know more about how it works, there is a nice overview in the documentation. Meanwhile, let’s show how to use it.
Installing Argo CD
Install Argo CD using the Getting Started instructions. Argo CD runs on a Kubernetes cluster. It does not have to be the same cluster where you install your stack. Here are sample commands to install it from a Linux system. (Note: This is a non-HA installation suitable for development. See the Argo CD docs for tips on HA installation.)
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
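The Argo CD pods take a little while to start after the manifest is applied. Here is a minimal sketch of one way to block until they are ready before moving on. The function name wait_for_argocd and the default timeout are my own choices for illustration, not part of Argo CD.

```shell
# Hypothetical helper: block until every Deployment in the Argo CD
# namespace reports the Available condition, or give up after a timeout.
# Arguments (both optional): namespace, then a kubectl-style timeout.
wait_for_argocd() {
  local namespace="${1:-argocd}"
  local timeout="${2:-300s}"
  kubectl wait --for=condition=Available deployment --all \
    -n "$namespace" --timeout="$timeout"
}
```

Calling wait_for_argocd with no arguments checks the argocd namespace created above.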
You’ll also need to talk to Argo CD once it’s installed. Install the argocd command line tool.
curl -sSL -o argocd-linux-amd64 https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
sudo install -m 555 argocd-linux-amd64 /usr/local/bin/argocd
rm argocd-linux-amd64
Finally, you’ll need to log in. Expose the Argo CD API port, log in using the default password, and reset it to something you can use permanently.
# Forward the Argo CD API port in the background; output goes to server.out.
nohup kubectl port-forward svc/argocd-server \
  -n argocd 8080:443 > server.out 2>&1 &
# Prints a password for one-time use. Use that to log in.
argocd admin initial-password -n argocd
TEMPPWD=`argocd admin initial-password -n argocd | cut -f1 -d ' '`
argocd login localhost:8080 --username=admin \
  --password="$TEMPPWD" --insecure
echo $TEMPPWD
# Update password to 'secretsecret' using $TEMPPWD value to log in
argocd account update-password
argocd login localhost:8080 --username=admin \
  --password="secretsecret" --insecure
Argo CD also has a UI. The Argo CD Getting Started pages explain how to enable it, but what you see above is plenty for this exercise.
Bringing up a ClickHouse Stack with Argo CD
OK, let’s now bring up a ClickHouse stack with three components: the Altinity Kubernetes operator for ClickHouse, ZooKeeper, and a ClickHouse cluster. We’ll start with the operator because it’s required to manage ClickHouse.
Installing clickhouse-operator
First, let’s create the namespace.
kubectl create namespace ch
Next, we need to create the app, which we will call clickhouse-operator. This app maps the clickhouse-operator resource files (a GitHub repo + path) to a specific Kubernetes cluster and namespace.
argocd app create clickhouse-operator \
  --repo https://github.com/Altinity/argocd-examples-clickhouse.git \
  --path apps/clickhouse-operator \
  --dest-server https://kubernetes.default.svc --dest-namespace ch
The create command just defines the app. It does not actually install it on Kubernetes. For that we need to “sync” the clickhouse-operator app to Kubernetes using the following command.
argocd app sync clickhouse-operator
A short flood of messages appears as Argo CD applies the operator manifest. Now we run the following two commands to see the state of the application and the resources Argo CD created from it.
$ argocd app list
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS REPO PATH TARGET
argocd/clickhouse-operator https://kubernetes.default.svc ch default Synced Healthy <none> <none> https://github.com/Altinity/argocd-examples-clickhouse.git apps/clickhouse-operator
$ kubectl get all -n ch
NAME READY STATUS RESTARTS AGE
pod/clickhouse-operator-altinity-clickhouse-operator-6bb6949dcg2fzj 2/2 Running 0 58s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/clickhouse-operator-altinity-clickhouse-operator-metrics ClusterIP 10.110.94.177 <none> 8888/TCP 58s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/clickhouse-operator-altinity-clickhouse-operator 1/1 1 1 58s
NAME DESIRED CURRENT READY AGE
replicaset.apps/clickhouse-operator-altinity-clickhouse-operator-6bb6949dc4 1 1 1 58s
That’s it. The operator is up and running. There is no need to remember where the installation scripts are located or how to run them. Argo CD takes care of everything.
If you are curious what the resource files look like in GitHub, follow this link to see them. You will find that the clickhouse-operator is installed from a Helm Chart.
Installing the Remaining Services
We can now quickly install the ZooKeeper and ClickHouse cluster services using the same patterns.
argocd app create zookeeper \
  --repo https://github.com/Altinity/argocd-examples-clickhouse.git \
  --path apps/zookeeper \
  --dest-server https://kubernetes.default.svc --dest-namespace ch
argocd app create clickhouse \
  --repo https://github.com/Altinity/argocd-examples-clickhouse.git \
  --path apps/clickhouse \
  --dest-server https://kubernetes.default.svc --dest-namespace ch
argocd app sync zookeeper
argocd app sync clickhouse
Once the commands complete, we run argocd app list and kubectl get all -n ch to see the Argo CD status and the resources in Kubernetes.
At this point the stack is deployed and we can begin to use it. Once again, it’s no longer necessary to remember the location of a bunch of scripts to install the services in a ClickHouse stack. We use argocd app create and argocd app sync for each application, regardless of how they are actually deployed.
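Since every application follows the same create-then-sync pattern, the steps are easy to wrap in a small script. The sketch below assumes each app lives under apps/<name> in the sample repo, as the commands above do. The helper names deploy_app and deploy_stack and the three variables are hypothetical, not something the repo or Argo CD provides.

```shell
# Values taken from the commands used in this article.
REPO_URL="https://github.com/Altinity/argocd-examples-clickhouse.git"
DEST_SERVER="https://kubernetes.default.svc"
DEST_NAMESPACE="ch"

# Hypothetical helper: define one Argo CD app from apps/<name>, then sync it.
# The sync only runs if the create succeeds.
deploy_app() {
  local name="$1"
  argocd app create "$name" \
    --repo "$REPO_URL" \
    --path "apps/$name" \
    --dest-server "$DEST_SERVER" \
    --dest-namespace "$DEST_NAMESPACE" &&
  argocd app sync "$name"
}

# Hypothetical helper: deploy apps in the order given and stop at the
# first failure, so a dependency is in place before the apps that need it.
deploy_stack() {
  local name
  for name in "$@"; do
    deploy_app "$name" || return 1
  done
}
```

For the stack in this article the call would be deploy_stack clickhouse-operator zookeeper clickhouse, with the operator first because the ClickHouse cluster depends on it.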
Important note: using argocd app commands is just one way to install Argo CD applications. Applications are actually Kubernetes resources that live in the argocd namespace. You can define applications yourself in Kubernetes using manifest files, like this. There are also multiple ways to sync applications to Kubernetes.
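To make the manifest approach concrete, here is a rough sketch that prints an Application resource equivalent to the argocd app create commands used above. The helper name render_app_manifest is hypothetical, and the manifests in the sample repo may well differ in detail. The project and targetRevision values are my assumptions of sensible defaults.

```shell
# Hypothetical helper: print an Argo CD Application manifest for one app
# in the sample repository. Arguments: app name, then path inside the repo.
render_app_manifest() {
  local name="$1"
  local path="$2"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default            # assumption: the default Argo CD project
  source:
    repoURL: https://github.com/Altinity/argocd-examples-clickhouse.git
    path: ${path}
    targetRevision: HEAD      # assumption: track the tip of the default branch
  destination:
    server: https://kubernetes.default.svc
    namespace: ch
EOF
}
```

The output can be piped straight to Kubernetes, for example render_app_manifest clickhouse apps/clickhouse | kubectl apply -f -. As with argocd app create, this only defines the application and a sync is still needed to deploy it.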
Further Operations on ClickHouse Stacks Using Argo CD
Deleting Services
It’s easy to remove services installed by Argo CD. The following commands eliminate the ClickHouse cluster and ZooKeeper from the stack.
argocd app delete clickhouse
argocd app delete zookeeper
Argo CD will ask to confirm deletion in each case. To avoid this we can add the ‘--yes’ option.
Note to users: The ClickHouse manifest has a reclaimPolicy setting to preserve storage even after cluster deletion. To delete storage you’ll need to find the corresponding Persistent Volume Claims (PVCs) using ‘kubectl get pvc -n ch’ and delete them manually. See this blog for more information.
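The manual PVC cleanup can be scripted along the following lines. This is only a sketch: the helper name delete_pvcs is hypothetical, it removes every PVC in the namespace rather than just the ClickHouse ones, and whether the underlying data is destroyed depends on your storage configuration. Treat it as destructive.

```shell
# Hypothetical helper: delete every PVC in a namespace (default: ch).
# Destructive: run 'kubectl get pvc -n <namespace>' first to review the list.
delete_pvcs() {
  local namespace="${1:-ch}"
  local pvc
  # '-o name' prints one resource per line, e.g. persistentvolumeclaim/<name>
  for pvc in $(kubectl get pvc -n "$namespace" -o name); do
    echo "Deleting $pvc in namespace $namespace"
    kubectl delete -n "$namespace" "$pvc"
  done
}
```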
Adding Different Services
We can now reconstitute the stack to set up ClickHouse using ClickHouse Keeper instead of ZooKeeper. The ClickHouse manifest uses the same DNS name for the [Zoo]Keeper connection, which allows the stack to use them interchangeably. Here are the commands.
argocd app create clickhouse-keeper \
  --repo https://github.com/Altinity/argocd-examples-clickhouse.git \
  --path apps/clickhouse-keeper \
  --dest-server https://kubernetes.default.svc --dest-namespace ch
argocd app create clickhouse \
  --repo https://github.com/Altinity/argocd-examples-clickhouse.git \
  --path apps/clickhouse \
  --dest-server https://kubernetes.default.svc --dest-namespace ch
argocd app sync clickhouse-keeper
argocd app sync clickhouse
We can of course add other services to the stack. The sample project also has application definitions for Prometheus, Grafana, and CloudBeaver. We’re working on improvements that will configure Prometheus metric storage and Grafana dashboards automatically.
Making Changes to Service Configurations
The whole point of GitOps with Argo CD is to make controlled and repeatable changes to applications. Up until now we’ve been using the Altinity argocd-examples-clickhouse repo. To make changes we’ll need our own copy so we can commit changes.
Let’s start by forking the repository in GitHub. Log in to GitHub, navigate to the argocd-examples-clickhouse project, and bring up the page to create a new fork.
Let’s say the new fork is yourgithub/argocd-examples-clickhouse. We now delete the current ClickHouse cluster application in Argo CD and create a new one from the fork in GitHub.
argocd app delete clickhouse
argocd app create clickhouse \
  --repo https://github.com/yourgithub/argocd-examples-clickhouse.git \
  --path apps/clickhouse \
  --dest-server https://kubernetes.default.svc --dest-namespace ch
argocd app sync clickhouse
Within a minute or two a new ClickHouse cluster appears. We check the pods and see the following.
$ kubectl get pods -n ch --selector=app.kubernetes.io/instance=clickhouse
NAME READY STATUS RESTARTS AGE
chi-argocd-demo-0-0-0 1/1 Running 0 3m1s
chi-argocd-demo-0-1-0 1/1 Running 0 2m48s
Let’s add a shard. We’ll clone the repo to a nearby host and edit the cluster definition in demo.yaml to increase the number of shards.
git clone git@github.com:yourgithub/argocd-examples-clickhouse.git
cd argocd-examples-clickhouse/apps/clickhouse
vi demo.yaml
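If you would rather script the edit than open an editor, something like the following sketch would do. The helper name set_shards_count is hypothetical. It rewrites every shardsCount line in the file, which is fine under the assumption that the file defines a single cluster, as the demo appears to.

```shell
# Hypothetical helper: set shardsCount in a ClickHouse cluster definition.
# Arguments: file path, then the new shard count. Indentation is preserved.
set_shards_count() {
  local file="$1"
  local count="$2"
  # Write to a temporary file first so a failed sed leaves the original intact.
  sed -E "s/^([[:space:]]*shardsCount:[[:space:]]*)[0-9]+/\1${count}/" \
    "$file" > "$file.tmp" &&
  mv "$file.tmp" "$file"
}
```

Here the call would be set_shards_count demo.yaml 2.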
After finishing, the differences in the file are as shown below. We check in and push the changes to our fork.
git diff
...
--- a/apps/clickhouse/demo.yaml
+++ b/apps/clickhouse/demo.yaml
@@ -7,7 +7,7 @@ spec:
     clusters:
       - name: "demo"
         layout:
-          shardsCount: 1
+          shardsCount: 2
           replicasCount: 2
         templates:
           podTemplate: server
...
git add demo.yaml
git commit -m 'Increase shard count to 2'
git push origin main
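Once the change is pushed, it can be reassuring to preview what Argo CD intends to change in the cluster. The sketch below wraps argocd app diff, which as far as I know exits with 0 when there is no difference, 1 when a difference is found, and 2 on errors. The wrapper name preview_changes is hypothetical.

```shell
# Hypothetical helper: report whether the live state differs from Git.
# Returns 0 when the diff ran (with or without differences), non-zero on error.
preview_changes() {
  local app="$1"
  local status=0
  # Capture the exit code instead of letting a found diff look like a failure.
  argocd app diff "$app" || status=$?
  case "$status" in
    0) echo "$app: no differences" ;;
    1) echo "$app: differences found" ;;
    *) echo "$app: argocd app diff failed (exit $status)" >&2
       return "$status" ;;
  esac
}
```

Running preview_changes clickhouse at this point should print the pending shard change followed by the summary line.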
Finally, we tell Argo CD to sync the clickhouse app.
argocd app sync clickhouse
Argo CD applies the new application resources. After a couple of minutes the pods should look like the following.
$ kubectl get pods -n ch --selector=app.kubernetes.io/instance=clickhouse
NAME READY STATUS RESTARTS AGE
chi-argocd-demo-0-0-0 1/1 Running 0 11m
chi-argocd-demo-0-1-0 1/1 Running 0 11m
chi-argocd-demo-1-0-0 1/1 Running 0 113s
chi-argocd-demo-1-1-0 1/1 Running 0 83s
The change has propagated and the new shard is up and running. Welcome to GitOps!
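Two small conveniences round out this workflow. The sketch below shows one way to wait for a sync to settle instead of polling kubectl get pods, and one way to let Argo CD sync on its own when the repo changes. The helper names sync_and_wait and enable_auto_sync are hypothetical. Also note that Argo CD's health status may not reflect every pod the ClickHouse operator manages, so a final kubectl check is still worthwhile.

```shell
# Hypothetical helper: sync an app, then block until Argo CD reports it
# synced and healthy, or the timeout (in seconds) expires.
sync_and_wait() {
  local app="$1"
  local timeout="${2:-300}"
  argocd app sync "$app" &&
  argocd app wait "$app" --sync --health --timeout "$timeout"
}

# Hypothetical helper: let Argo CD sync the app automatically whenever
# the tracked Git revision changes, so no manual 'argocd app sync' is needed.
enable_auto_sync() {
  argocd app set "$1" --sync-policy automated
}
```

With enable_auto_sync clickhouse in place, a git push to the fork should be enough to roll out a change like the added shard.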
Conclusion
The simple examples in this article show how Argo CD lets you deploy, change, and remove analytic stacks on Kubernetes. We’ve seen how customers use Argo CD to combine ClickHouse with open source software like Prometheus, Superset, Grafana, and others into purpose-built stacks that beat Snowflake and BigQuery. We call this the modern analytic stack.
There’s a lot more to learn about using Argo CD and GitOps. Check out the Argo CD documentation if you have not done so already. Refer back to the Altinity argocd-examples-clickhouse project on GitHub. See also our recent webinars (here and here) on the modern analytic stack, Kubernetes, and Argo CD.
I would like to thank the many Altinity customers who taught us about the power of Argo CD. We’re returning the favor with new Altinity.Cloud tools to help Argo CD and ClickHouse work even better together. To find out how Altinity can help you deliver modern analytic stacks, contact us to schedule an appointment or join our Slack workspace. See you soon!