Bring Your Own Cloud for ClickHouse

“On an open shelf I keep my box.
Its key is in the lock.”

– Gillian Clarke

Altinity.Cloud Anywhere is a unique ClickHouse deployment and management technology. We sometimes call it Open Cloud for ClickHouse, since it provides a unified cloud management experience with any cloud provider. All resources are controlled by users in their own cloud infrastructure, while deployment and management are provided by Altinity.Cloud. We have users running it in AWS, GCP, Azure, and even Hetzner Cloud.

Altinity.Cloud Anywhere was introduced a year ago, and has matured since then. We started with the Bring Your Own Kubernetes (BYOK) model, which requires a user to manage Kubernetes infrastructure. In this article, we will explain the Bring Your Own Cloud (BYOC) deployment model, which provides a much easier user experience.

Overview of Anywhere Deployment Models

Before going into the details, let me explain how Altinity.Cloud Anywhere works.

The key component is the Altinity Connector, a secure management channel between Altinity.Cloud and the managed environment. It is deployed inside the user’s Kubernetes cluster, dials back to the management plane, and enables Altinity.Cloud to manage ClickHouse and infrastructure components. An important property of the Altinity Connector is that it only establishes outbound connections, which pleases corporate Infosec teams.

If the user manages the Kubernetes cluster, the user is responsible for setting it up properly and for deploying the Altinity Connector. We provide guidelines that help with this, but it still requires the expertise and resources of the user’s IT team, not to mention knowledge of Kubernetes itself.

To make it easier for users, Altinity.Cloud Anywhere can also provision and manage the Kubernetes infrastructure, thus building the full stack in the user’s cloud account. This is the Bring Your Own Cloud (BYOC) model. With BYOC, the user grants Altinity.Cloud permissions to manage cloud resources directly. This access can be provided in two different ways: via a separate VM running Altinity Connector, or directly, depending on the cloud provider’s security model. 

Altinity.Cloud uses the VM-based approach on AWS. It provides the best isolation between Altinity and the user’s account, with the VM serving as a secure bridge between the two. The VM can be deployed using a CloudFormation template that creates an EC2 instance and installs the Altinity Connector on it. Altinity.Cloud does not have direct access to the user’s account in this case; all management operations are routed via the EC2 instance.

The other approach requires users to grant an Altinity.Cloud service account direct access to their cloud infrastructure. It works well in GCP, where it is easy to create a dedicated new project and manage security at the project level.

Let’s have a quick tour to see how this can be done.  

Creating a GCP Project

We start with creating a project. A separate project provides convenient isolation of resource and cost management, not to mention security. To create the project, use the gcloud command line or the GCP web console. Here’s a UI example. 

We will need the ProjectID soon, so let’s memorize ‘gcp-anywhere-test’.
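
For readers who prefer the command line to the web console, here is a minimal Python sketch that checks a project ID against GCP's documented format (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen) and builds the matching gcloud command. The helper names are hypothetical and only illustrate the step; the sketch prints the command rather than running it.

```python
import re
from typing import List

# GCP project IDs: 6-30 characters, lowercase letters, digits and hyphens,
# starting with a letter and not ending with a hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def is_valid_project_id(project_id: str) -> bool:
    """Return True if project_id matches the GCP project ID format."""
    return PROJECT_ID_RE.fullmatch(project_id) is not None


def create_project_command(project_id: str) -> List[str]:
    """Build (but do not run) the gcloud command that creates the project."""
    if not is_valid_project_id(project_id):
        raise ValueError(f"invalid GCP project ID: {project_id!r}")
    return ["gcloud", "projects", "create", project_id]


if __name__ == "__main__":
    # Hypothetical usage with the project ID from this article.
    print(" ".join(create_project_command("gcp-anywhere-test")))
```

Running the printed command requires an installed and authenticated gcloud CLI with permission to create projects.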

Once the project is created, we need to grant roles to the Altinity.Cloud anywhere-admin@altinity.com account. Quite a few project-level admin roles are required, but they all apply only to the newly created project, so organization security is not compromised.

Connecting to Altinity.Cloud

Now it is time to connect the GCP project to Altinity.Cloud. If there is no Altinity.Cloud account yet, one can be requested via the trial signup form. Once logged into the Altinity.Cloud management console, we need to press the ‘SETUP ENVIRONMENT’ button and select the ‘Bring Your Own Cloud (BYOC)’ option in the dropdown:

On the next screen, enter an environment name – this is going to be the third-level domain for ClickHouse clusters. By default it starts with the user’s email domain.

Finally, Altinity.Cloud needs to know which cloud to use. We select ‘GKE Provisioned by Altinity’.

The preparations are almost done. Altinity.Cloud requires a few more bits of information in order to set up the environment, in particular:

  • Region where resources are going to be deployed
  • ProjectID that we created earlier (gcp-anywhere-test)
  • Number of availability zones to use. We recommend starting with two; the third zone can be added later if needed.
  • CIDR block for GKE VPC. Since Altinity.Cloud knows nothing about the user’s cloud configuration, we need some help here in order to avoid conflicts with other internal networks. If there are no other networks, use something like 10.1.0.0/21. Note that this cannot be changed later, so consult with your cloud administrators.
  • Instance types used for System, ZooKeeper and ClickHouse workloads. 
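
Since the CIDR block cannot be changed after provisioning, it is worth checking a candidate before submitting it. The following is a minimal sketch using Python's standard ipaddress module; the function name and the list of existing networks are hypothetical and stand in for whatever ranges your cloud administrators already use.

```python
import ipaddress


def check_vpc_cidr(cidr, existing=()):
    """Parse a candidate VPC block and verify it against known networks.

    Raises ValueError if the block is not aligned to its prefix length
    (host bits set) or overlaps any network listed in `existing`.
    """
    # strict=True rejects misaligned blocks such as 10.1.1.0/21
    network = ipaddress.ip_network(cidr, strict=True)
    for other in existing:
        other_net = ipaddress.ip_network(other, strict=True)
        if network.overlaps(other_net):
            raise ValueError(f"{network} overlaps with {other_net}")
    return network


if __name__ == "__main__":
    # Hypothetical existing ranges; replace with your organization's own.
    net = check_vpc_cidr("10.1.0.0/21", existing=["10.0.0.0/24", "192.168.0.0/16"])
    print(net, "provides", net.num_addresses, "addresses")
```

A /21 block covers 2048 addresses, which GKE then has to share between nodes, pods and services, so larger environments may want a wider range.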

The instance types may require some more explanation. We recommend the following node types and configure them by default:

  • e2-standard-2 for both System and ZooKeeper nodes – these are low-cost, low-performance nodes, since the workloads there are light
  • n2d-standard-2/4/8/16/32 for differently sized ClickHouse nodes. N2D is the best for ClickHouse in terms of performance. It is also possible to have custom RAM/CPU ratios, though 4GB of RAM per vCPU is optimal for most ClickHouse workloads. Node pools can be modified later, but at least one is required to start.
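
To make the sizing concrete, here is a small sketch that lists the vCPU and memory of each n2d-standard size mentioned above, relying on the fact that n2d-standard machine types come with 4 GB of memory per vCPU. The helper that picks the smallest node for a given memory requirement is hypothetical and only illustrates the arithmetic.

```python
# n2d-standard machine types provide 4 GB of memory per vCPU.
GB_PER_VCPU = 4
N2D_STANDARD_VCPUS = [2, 4, 8, 16, 32]


def n2d_standard_shapes():
    """Map each machine type name to its (vCPUs, memory in GB)."""
    return {f"n2d-standard-{v}": (v, v * GB_PER_VCPU) for v in N2D_STANDARD_VCPUS}


def smallest_node_for(ram_gb_needed):
    """Return the smallest listed type with enough memory, or None."""
    for name, (_vcpus, ram_gb) in n2d_standard_shapes().items():
        if ram_gb >= ram_gb_needed:
            return name
    return None


if __name__ == "__main__":
    for name, (vcpus, ram_gb) in n2d_standard_shapes().items():
        print(f"{name}: {vcpus} vCPU, {ram_gb} GB RAM")
```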

Once everything is configured, press PROCEED and go grab some coffee. Altinity.Cloud will provision cloud resources in the user’s GCP project, create the GKE cluster, and deploy infrastructure components inside GKE. The process takes about 30 minutes. 

When the environment is connected, the user can start ClickHouse clusters in the user’s cloud.

Managing Cloud Resources

Curious readers may wonder what cloud resources are managed by Altinity.Cloud in the user’s account (GCP project). Here is an incomplete list:

  • Kubernetes cluster (GKE). Altinity.Cloud sets it up, manages upgrades and ensures proper health.
  • Node pools. Altinity.Cloud can add, modify and delete node pools on user request. Node pools can also be tagged for cost attribution.
  • Network settings. Private and public internet access, private service connect and others.
  • Object storage buckets for backups and log storage
  • Roles, service accounts, and permissions for secure access to cloud resources from inside the Kubernetes cluster.

In short, Altinity.Cloud takes full responsibility for configuration and management of all cloud resources that are required for efficient ClickHouse operations. That saves the cost of the SRE and DevOps teams that would otherwise be required to manage the cloud.

Most cloud management operations can be done directly from the Altinity.Cloud management plane – configuring node pools, for example. For more exotic cases like setting up private networking, users need to contact Altinity support in order to make changes.

‘Boxed’ Cloud Experience

‘The box’ for Altinity.Cloud is the Kubernetes cluster. Altinity.Cloud not only sets up the cloud infrastructure, including Kubernetes, but deploys services inside Kubernetes itself. That includes logging and monitoring infrastructure, load balancer, volume properties management (for AWS only), clickhouse-operator, and others. The stack inside Kubernetes is exactly the same in all public clouds, which makes it easy to deploy anywhere. The only difference is how the box itself is created and handled.

For example, in order to deploy Altinity.Cloud in Azure for the first time, we did not have to do anything special except for providing custom annotations for the load balancer. It takes engineering time to implement Azure cloud provisioning logic for the BYOC model, but once the Azure Kubernetes cluster (AKS) is up, the rest of the stack works out of the box.

Of course, one user may put multiple boxes in different places. It is not unusual to run ClickHouse clusters in many regions and even in different cloud providers. User experience does not change.

It is important to note that ClickHouse is rarely operated as a single application; it is usually part of an application stack. Altinity.Cloud allows sharing Kubernetes clusters with other user deployments, which makes it much easier to build and configure applications end-to-end. The box is open on the user side and the user holds the keys.

Final Words

Altinity.Cloud Anywhere is a deployment and management technology that commoditizes the cloud experience for ClickHouse users. Its flexibility allows users to choose the model of ownership and control that best suits their needs: pure SaaS, managed cloud, or Altinity.Cloud integrated into their own Kubernetes deployments. Best of all, the ClickHouse server as well as key infrastructure like Grafana, Prometheus, and ZooKeeper are open source.

This approach is very different from the typical cloud database. With solutions from Snowflake, BigQuery, Redshift, and even other SaaS vendors for ClickHouse, users have to give control to the cloud vendor. With Altinity.Cloud Anywhere, users have full control of the data in their own account. Moreover, they can disconnect at any time and run the stack directly. Altinity.Cloud Anywhere builds an analytic stack that is 100% open source. Feel free to do anything you want with it!
