Squeezing JuiceFS with ClickHouse® (Part 1): Setting Things Up in Kubernetes

JuiceFS is a POSIX-compatible filesystem that can be used over S3 object storage. It is a distributed, cloud-native file system with many features, including consistency, encryption in transit and at rest, BSD and POSIX file locks, and data compression. Every application that works with S3 directly needs to adapt to the object model, and for more complex applications such as ClickHouse, storing table data on S3 is hard; native S3 support still has issues. If you want to avoid making your applications aware of how to work with S3 storage directly, JuiceFS can be a tempting choice. And if you are already using JuiceFS and ClickHouse as part of your infrastructure, combining the two is a natural step. In this two-part blog series, we’ll walk through the configuration of JuiceFS over S3 inside Kubernetes and how to use it as a disk to store data for ClickHouse MergeTree tables. Once everything is set up, in Part 2 we’ll look at the performance you can expect and the potential problems with combining JuiceFS and ClickHouse.
The idea of using JuiceFS with ClickHouse is not new. Back in 2021, JuiceFS published the blog article Exploring storage and computing separation for ClickHouse, which explored using JuiceFS as storage for ClickHouse MergeTree tables, and more recently Low-Cost Read/Write Separation: Jerry Builds a Primary-Replica ClickHouse Architecture. We’ll try it out for ourselves. However, directly managing servers to run ClickHouse and JuiceFS is complex, so we’ll use Kubernetes. Specifically, we’ll combine clickhouse-operator and the JuiceFS CSI (Container Storage Interface) driver, which simplifies using JuiceFS as the storage class for our cluster’s persistent volume claims (PVC).
I’ve chosen an economical and developer-friendly setup for our testing environment. We’ll employ self-managed Kubernetes on Hetzner Cloud with Wasabi S3 object storage.
Deploying Kubernetes cluster
I’ll use the hetzner-k3s utility to set up a low-cost Kubernetes cluster quickly. The setup is the same as described in the Low-Cost ClickHouse clusters using Hetzner Cloud with Altinity.Cloud Anywhere article.
First, download and install the hetzner-k3s utility.
curl -fsSLO "https://github.com/janosmiko/hetzner-k3s/releases/latest/download/hetzner-k3s_`uname -s`_`uname -m`.deb"
sudo dpkg -i "hetzner-k3s_`uname -s`_`uname -m`.deb"
hetzner-k3s -v
hetzner-k3s version v0.1.9
My Kubernetes cluster configuration will be as follows: I’ll use CPX31 servers that provide 4 vCPU with 8GB RAM in the us-east location (Ashburn, VA). For S3, I’ll use a Wasabi S3 bucket in the us-east-1 region.
Here is my k3s_cluster.yaml. Use your Hetzner project API token and your public and private SSH keys.
---
hetzner_token: <YOUR HETZNER PROJECT API TOKEN>
cluster_name: clickhouse-cloud
kubeconfig_path: "kubeconfig"
k3s_version: v1.29.4+k3s1
public_ssh_key_path: "/home/user/.ssh/<YOUR PUBLIC SSH KEY>.pub"
private_ssh_key_path: "/home/user/.ssh/<YOUR PRIVATE SSH KEY>"
image: "ubuntu-22.04"
verify_host_key: false
location: ash
schedule_workloads_on_masters: false
masters:
  instance_type: cpx31
  instance_count: 1
worker_node_pools:
- name: clickhouse
  instance_type: cpx31
  instance_count: 3
Now, I’ll create my Kubernetes cluster using the following command:
hetzner-k3s create-cluster -c k3s_cluster.yaml
After the cluster setup, check that all nodes are up and working. Remember to point kubectl to the cluster’s kubeconfig file, which was created in the same directory as k3s_cluster.yaml.
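If you prefer, you can export the KUBECONFIG environment variable instead of passing --kubeconfig to every command. I’ll keep the explicit flag in the examples below so they are copy-paste safe.
export KUBECONFIG="$PWD/kubeconfig"
kubectl get nodes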
kubectl --kubeconfig ./kubeconfig get nodes
NAME                                             STATUS   ROLES                       AGE     VERSION
clickhouse-cloud-cpx31-master1                   Ready    control-plane,etcd,master   2m32s   v1.29.4+k3s1
clickhouse-cloud-cpx31-pool-clickhouse-worker1   Ready    <none>                      2m20s   v1.29.4+k3s1
clickhouse-cloud-cpx31-pool-clickhouse-worker2   Ready    <none>                      2m21s   v1.29.4+k3s1
clickhouse-cloud-cpx31-pool-clickhouse-worker3   Ready    <none>                      2m22s   v1.29.4+k3s1
Installing JuiceFS CSI driver
The best way to use JuiceFS in Kubernetes is through the JuiceFS CSI driver. The CSI driver lets us create persistent volumes (PV) by defining persistent volume claims (PVC) that ClickHouse will use to store data, and it provides a standard way of exposing storage to applications running on Kubernetes. I’ll install the driver using the kubectl method.
kubectl --kubeconfig kubeconfig apply -f https://raw.githubusercontent.com/juicedata/juicefs-csi-driver/master/deploy/k8s.yaml
serviceaccount/juicefs-csi-controller-sa created
serviceaccount/juicefs-csi-dashboard-sa created
serviceaccount/juicefs-csi-node-sa created
clusterrole.rbac.authorization.k8s.io/juicefs-csi-dashboard-role created
clusterrole.rbac.authorization.k8s.io/juicefs-csi-external-node-service-role created
clusterrole.rbac.authorization.k8s.io/juicefs-external-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/juicefs-csi-dashboard-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/juicefs-csi-node-service-binding created
clusterrolebinding.rbac.authorization.k8s.io/juicefs-csi-provisioner-binding created
service/juicefs-csi-dashboard created
deployment.apps/juicefs-csi-dashboard created
statefulset.apps/juicefs-csi-controller created
daemonset.apps/juicefs-csi-node created
csidriver.storage.k8s.io/csi.juicefs.com created
Verify installation.
kubectl --kubeconfig kubeconfig -n kube-system get pods -l app.kubernetes.io/name=juicefs-csi-driver
NAME READY STATUS RESTARTS AGE
juicefs-csi-controller-0 4/4 Running 0 45s
juicefs-csi-controller-1 4/4 Running 0 34s
juicefs-csi-dashboard-58d9c54877-jrphl 1/1 Running 0 45s
juicefs-csi-node-7tsr8 3/3 Running 0 44s
juicefs-csi-node-mmxbk 3/3 Running 0 44s
juicefs-csi-node-njftm 3/3 Running 0 44s
juicefs-csi-node-rnkmz 3/3 Running 0 44s
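The installation also deploys a JuiceFS CSI dashboard, which you can optionally browse by port-forwarding its service. The dashboard typically listens on port 8088; check the service output first if yours differs. This step is not required for the rest of the setup.
kubectl --kubeconfig kubeconfig -n kube-system get svc juicefs-csi-dashboard
kubectl --kubeconfig kubeconfig -n kube-system port-forward svc/juicefs-csi-dashboard 8088:8088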
Creating Redis cluster for JuiceFS metadata store
Having installed the JuiceFS CSI driver, we need to set up a metadata store. JuiceFS needs a metadata store because it uses a data-metadata separated architecture, with data being the file contents presented to the users and the metadata being the information describing the files and the filesystem itself, such as file attributes, filesystem structure, mapping of file contents to S3 objects, etc. The metadata store can be implemented using different databases. I’ll create a simple Redis cluster. For more information on choosing a metadata store, see Guidance on selecting metadata engine in JuiceFS.
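For reference, what the CSI driver will do for us under the hood is roughly equivalent to formatting a JuiceFS filesystem manually against this metadata store and the S3 bucket. Here is a sketch of that manual command, using the same placeholder values as the secret we create later (you don’t need to run it yourself, and the Redis URL only resolves from inside the cluster):
juicefs format \
  --storage s3 \
  --bucket https://<BUCKET>.s3.<REGION>.wasabisys.com \
  --access-key <ACCESS_KEY> \
  --secret-key <SECRET_KEY> \
  redis://redis-service.redis:6379/1 \
  ten-pb-fs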
First, let’s create a service.
---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: redis
  labels:
    app: redis
spec:
  ports:
  - port: 6379
  clusterIP: None
  selector:
    app: redis
Then, we need to create a ConfigMap with simple configuration files for both the Redis master and secondary nodes.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: redis
  labels:
    app: redis
data:
  master.conf: |
    maxmemory 1024mb
    maxmemory-policy allkeys-lru
    maxclients 20000
    timeout 300
    appendonly no
    dbfilename dump.rdb
    dir /data
  secondary.conf: |
    slaveof redis-0.redis-service.redis 6379
    maxmemory 1024mb
    maxmemory-policy allkeys-lru
    maxclients 20000
    timeout 300
    dir /data
I will use a StatefulSet to define the Redis cluster. Note that master and secondary nodes get different configurations, and the pod’s ordinal index determines whether a node is a master or a secondary. However, for testing, we can get away with a single replica and 10Gi volumes for the /data and /etc folders.
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
  namespace: redis
spec:
  serviceName: "redis-service"
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      initContainers:
      - name: init-redis
        image: redis:7.2.4
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate redis server-id from pod ordinal index.
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          # Copy the appropriate redis config file from the config-map.
          if [[ $ordinal -eq 0 ]]; then
            cp /mnt/master.conf /etc/redis-config.conf
          else
            cp /mnt/secondary.conf /etc/redis-config.conf
          fi
        volumeMounts:
        - name: redis-claim
          mountPath: /etc
        - name: config-map
          mountPath: /mnt/
      containers:
      - name: redis
        image: redis:7.2.4
        ports:
        - containerPort: 6379
          name: redis
        command:
        - redis-server
        - "/etc/redis-config.conf"
        volumeMounts:
        - name: redis-data
          mountPath: /data
        - name: redis-claim
          mountPath: /etc
      volumes:
      - name: config-map
        configMap:
          name: redis-config
  volumeClaimTemplates:
  - metadata:
      name: redis-claim
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
  - metadata:
      name: redis-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
I’ll also create a dedicated namespace for it.
kubectl --kubeconfig kubeconfig create namespace redis
namespace/redis created
Now, we can create our Redis cluster.
kubectl --kubeconfig kubeconfig apply -n redis -f redis_cluster.yaml
service/redis-service created
configmap/redis-config created
statefulset.apps/redis created
Let’s check the created resources.
kubectl --kubeconfig kubeconfig get configmap -n redis
NAME               DATA   AGE
kube-root-ca.crt   1      3m22s
redis-config       2      57s
kubectl --kubeconfig kubeconfig get svc -n redis
NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
redis-service   ClusterIP   None         <none>        6379/TCP   88s
kubectl --kubeconfig kubeconfig get statefulset -n redis
NAME    READY   AGE
redis   1/1     96s
kubectl --kubeconfig kubeconfig get pods -n redis
NAME      READY   STATUS    RESTARTS   AGE
redis-0   1/1     Running   0          115s
kubectl --kubeconfig kubeconfig get pvc -n redis
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     VOLUMEATTRIBUTESCLASS   AGE
redis-claim-redis-0   Bound    pvc-0d4c5dba-67f0-43df-a8e4-d44575e0bc1b   10Gi       RWO            hcloud-volumes   <unset>                 3m57s
redis-data-redis-0    Bound    pvc-fc763f30-397f-455f-bd7e-c49db7a8d36e   10Gi       RWO            hcloud-volumes   <unset>                 3m57s
We can also check the master node by accessing it directly.
kubectl --kubeconfig kubeconfig exec -it redis-0 -c redis -n redis -- /bin/bash
bash-5.2# redis-cli
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:0
master_failover_state:no-failover
master_replid:5c09f7cf01eadafe8120d2c0862cb6c69b43c438
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:0
second_repl_offset:-1
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
Everything looks good. Our metadata store is ready!
Creating JuiceFS storage class
The last thing we need to do is create a storage class for JuiceFS. For that, we first define a secret with the credentials for our S3 bucket and then the storage class itself.
Here is my juicefs-sc.yaml. You’ll need to specify your bucket, access key, and secret key.
---
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-secret
  namespace: default
  labels:
    juicefs.com/validate-secret: "true"
type: Opaque
stringData:
  name: ten-pb-fs
  metaurl: redis://redis-service.redis:6379/1
  storage: s3
  bucket: https://<BUCKET>.s3.<REGION>.wasabisys.com
  access-key: <ACCESS_KEY>
  secret-key: <SECRET_KEY>
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: default
Create the storage class as follows.
kubectl --kubeconfig kubeconfig apply -f juicefs-sc.yaml
secret/juicefs-secret created
storageclass.storage.k8s.io/juicefs created
Now, check that the storage class is available.
kubectl --kubeconfig kubeconfig get sc
NAME      PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
juicefs   csi.juicefs.com   Delete          Immediate           false                  5m43s
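As a side note, if you later need to tune how JuiceFS is mounted (for example, the local cache size or writeback mode), the StorageClass accepts standard mountOptions that the CSI driver passes through to the JuiceFS mount. The sketch below is illustrative only; the values are assumptions and not used in this setup.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: juicefs-tuned
provisioner: csi.juicefs.com
parameters:
  csi.storage.k8s.io/provisioner-secret-name: juicefs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/node-publish-secret-name: juicefs-secret
  csi.storage.k8s.io/node-publish-secret-namespace: default
mountOptions:
- cache-size=10240   # local cache size in MiB (assumed value)
- writeback          # upload asynchronously; trades durability for write speed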
Testing JuiceFS
Having defined a storage class for JuiceFS, we can now test it by defining a persistent volume claim that we can then use in a test application pod.
Here is my juicefs-pvc-sc.yaml that uses the new juicefs storage class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: juicefs-pvc
  namespace: default
spec:
  accessModes:
  - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: juicefs
  resources:
    requests:
      storage: 10P
Let’s apply it.
kubectl --kubeconfig kubeconfig apply -f juicefs-pvc-sc.yaml
persistentvolumeclaim/juicefs-pvc created
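Optionally, confirm that the claim is bound and that a persistent volume was dynamically provisioned (the exact output will vary).
kubectl --kubeconfig kubeconfig get pvc juicefs-pvc -n default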
Test it by creating a simple pod using the following juicefs-app.yaml manifest.
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-app
  namespace: default
spec:
  containers:
  - name: juicefs-app
    image: centos
    command:
    - sleep
    - "infinity"
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: juicefs-pvc
Start the pod and check that /data was successfully mounted.
kubectl --kubeconfig kubeconfig apply -f juicefs-app.yaml
pod/juicefs-app created
kubectl --kubeconfig kubeconfig exec -it -n default juicefs-app -- /bin/bash
[root@juicefs-app /]# ls /data
[root@juicefs-app /]#
We have /data mounted in our pod. JuiceFS is ready for use!
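As a quick sanity check before the benchmarks, you can write a small file over the mount and read it back from inside the pod (the file name here is just an example).
[root@juicefs-app /]# echo "hello from juicefs" > /data/hello.txt
[root@juicefs-app /]# cat /data/hello.txt
hello from juicefs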
Running benchmarks provided by JuiceFS
Install the juicefs client into the pod, and let’s run the benchmarks to check that the filesystem is working as expected.
curl -sSL https://d.juicefs.com/install | sh -
Run the benchmark that uses only 1 thread.
juicefs bench -p 1 /data
Write big blocks: 1024/1024 [==============================================================] 96.9/s used: 10.566310616s
Read big blocks: 1024/1024 [==============================================================] 39.9/s used: 25.661374375s
Write small blocks: 100/100 [==============================================================] 11.0/s used: 9.059618651s
Read small blocks: 100/100 [==============================================================] 755.4/s used: 132.409568ms
Stat small files: 100/100 [==============================================================] 2566.2/s used: 39.04321ms
Benchmark finished!
BlockSize: 1 MiB, BigFileSize: 1024 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 1
+------------------+----------------+---------------+
|       ITEM       |     VALUE      |     COST      |
+------------------+----------------+---------------+
|   Write big file |    96.91 MiB/s |  10.57 s/file |
|    Read big file |    39.91 MiB/s |  25.66 s/file |
| Write small file |   11.0 files/s | 90.59 ms/file |
|  Read small file |  758.5 files/s |  1.32 ms/file |
|        Stat file | 2604.5 files/s |  0.38 ms/file |
+------------------+----------------+---------------+
Run the benchmark that uses 4 parallel threads.
juicefs bench -p 4 /data
Write big blocks: 4096/4096 [==============================================================] 267.1/s used: 15.333303679s
Read big blocks: 4096/4096 [==============================================================] 112.3/s used: 36.469782933s
Write small blocks: 400/400 [==============================================================] 16.5/s used: 24.257067969s
Read small blocks: 400/400 [==============================================================] 2392.4/s used: 167.231047ms
Stat small files: 400/400 [==============================================================] 10742.0/s used: 37.281491ms
Benchmark finished!
BlockSize: 1 MiB, BigFileSize: 1024 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 4
+------------------+-----------------+----------------+
|       ITEM       |      VALUE      |      COST      |
+------------------+-----------------+----------------+
|   Write big file |    267.14 MiB/s |   15.33 s/file |
|    Read big file |    112.32 MiB/s |   36.47 s/file |
| Write small file |    16.5 files/s | 242.56 ms/file |
|  Read small file |  2401.8 files/s |   1.67 ms/file |
|        Stat file | 10901.5 files/s |   0.37 ms/file |
+------------------+-----------------+----------------+
Let’s increase parallelism by a factor of 4 again and re-run the benchmark with 16 threads.
juicefs bench -p 16 /data
Write big blocks: 16384/16384 [==============================================================] 307.2/s used: 53.335911256s
Read big blocks: 16384/16384 [==============================================================] 331.4/s used: 49.440177355s
Write small blocks: 1600/1600 [==============================================================] 48.9/s used: 32.723927882s
Read small blocks: 1600/1600 [==============================================================] 3181.2/s used: 503.016108ms
Stat small files: 1600/1600 [==============================================================] 9822.4/s used: 162.940025ms
Benchmark finished!
BlockSize: 1 MiB, BigFileSize: 1024 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 16
+------------------+----------------+----------------+
|       ITEM       |     VALUE      |      COST      |
+------------------+----------------+----------------+
|   Write big file |   307.19 MiB/s |   53.34 s/file |
|    Read big file |   331.39 MiB/s |   49.44 s/file |
| Write small file |   48.9 files/s | 327.23 ms/file |
|  Read small file | 3184.9 files/s |   5.02 ms/file |
|        Stat file | 9852.3 files/s |   1.62 ms/file |
+------------------+----------------+----------------+
The benchmarks show that the JuiceFS filesystem is working and that write and read throughput increases as we add worker threads. We are ready to combine JuiceFS with ClickHouse.
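If you don’t plan to poke at the test volume any further, you can optionally remove the test pod and PVC before moving on.
kubectl --kubeconfig kubeconfig delete -f juicefs-app.yaml
kubectl --kubeconfig kubeconfig delete -f juicefs-pvc-sc.yaml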
Conclusion
We have successfully set up the JuiceFS CSI driver in our Kubernetes environment and checked that the JuiceFS storage class is working by running the benchmarks provided by the juicefs client. Now, stay tuned. In Part 2, we will create a ClickHouse cluster and examine the performance you can expect when combining JuiceFS with ClickHouse using clickhouse-operator.
ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc.