Adventures in ClickHouse Development: Running Integration Tests, Part 1

Altinity is a large contributor to ClickHouse, and our team has submitted hundreds of PRs and issues over the years. We provide our customers and the community with Altinity Stable Builds for ClickHouse, which are releases with an extended service period that undergo rigorous testing to verify they are secure and ready for production use. As part of our work on Altinity Stable Builds, we maintain an in-house CI/CD pipeline that builds binaries and runs extensive tests. Our build pipeline includes many stages from the upstream pipeline, which consists of many GitHub Actions workflows. The two largest workflows are the pull request and the master workflows. These workflows include testing jobs such as functional stateless tests, functional stateful tests, stress tests, integration tests, and many others. Most ClickHouse developers rely heavily on tests executed by the CI/CD pipeline. As the complexity of test setups and the CI/CD pipeline grew over time, so did developers' dependency on running tests exclusively as part of the pipeline. Nonetheless, developers still need to be able to execute tests locally. Not just some tests, but all the tests!

In this two-part article, we will show how you can execute ClickHouse integration tests by hand as well as look into the complexity of the CI/CD pipeline that is used to run all these tests. We will also introduce you to our helper test program that we have developed at Altinity to help our QA engineers run all the integration tests locally. If you are a seasoned ClickHouse developer, you might think that running integration tests is not that difficult. Let’s see if you are aware of all the pieces that are needed to get it right! Hopefully this article will help you discover new subtleties or at least serve as a reminder of something that is easy to forget unless you have to run these tests every day.

Getting started

In order to get started, we need to check out the ClickHouse source code. ClickHouse development moves fast. So fast that using the master branch would be of no use. Therefore, we'll pin to recent source code, the v23.12.1.1368-stable branch, so that results can be reproduced consistently.

I'll also use a Hetzner cloud machine. For this article, that will be a CCX43 instance running the default Ubuntu 22.04 image.

After SSH’ing into my new server, I will just do a shallow clone of the branch.

git clone --depth 1 --branch v23.12.1.1368-stable https://github.com/ClickHouse/ClickHouse.git

Now, we have the integration tests at ClickHouse/tests/integration.

cd ClickHouse/tests/integration/

Integration tests depend on test environments that are created with Docker Compose. In fact, they still rely on the old 1.29.2 version, which is deprecated. Therefore, we must have Docker installed; I will use these Docker installation instructions for Ubuntu. Docker is also needed for another reason: unless we want to run the tests natively, which is not recommended, we need it to execute the runner script, which uses a Docker image that provides a complete environment for running integration tests.

Before we continue, let’s make sure Docker is working correctly.

docker run hello-world

If successful, you should see a message starting with the following lines:

Hello from Docker!
This message shows that your installation appears to be working correctly.

Running just a few tests

At first glance, one might think that running integration tests is easy. After all, the procedure is documented in README.md right where the source code for the integration tests lives. However, right from the start, we hit the first decision point: do we want to run the tests natively, using pytest installed on the local host, or using the runner script, which relies on the https://hub.docker.com/r/clickhouse/integration-tests-runner Docker image that contains all the dependencies needed to run integration tests?

Running tests using pytest installed on the local host is not recommended. The main reason is that it requires a lot of dependencies, which are only partially documented; a complete list can be obtained from the integration tests runner image Dockerfile. Another reason is that integration tests can mess with system networking, leave artifacts behind, and have other unwanted side effects. Therefore, it is recommended to execute integration tests using the runner script, which launches a Docker container that isolates the tests from the host system. Before we can run any integration tests, we need to satisfy a couple of dependencies.

First, we need to get our hands on the ClickHouse binaries that the integration tests will run against. Specifically, we need to specify the paths of the following binaries:

binary                       runner script option
clickhouse                   --binary
clickhouse-odbc-bridge       --odbc-bridge-binary
clickhouse-library-bridge    --library-bridge-binary

However, if the clickhouse-odbc-bridge and clickhouse-library-bridge binaries are located in the same directory as the clickhouse server binary, then the --odbc-bridge-binary and --library-bridge-binary options can be skipped: the runner script defaults to picking up these binaries from the same directory where the clickhouse binary is.
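The defaulting behavior described above can be sketched in plain shell. This is our own illustration, not the runner script's actual code; the `*_opt` variables stand in for the corresponding command-line options.

```shell
# Sketch: if the bridge binaries are not given explicitly, derive their
# paths from the directory that holds the clickhouse binary.
binary="/opt/23.12.1.1368-alpine/clickhouse"
bridge_dir=$(dirname "$binary")

odbc_bridge_opt=""      # imagine this came from --odbc-bridge-binary
library_bridge_opt=""   # imagine this came from --library-bridge-binary

odbc_bridge="${odbc_bridge_opt:-$bridge_dir/clickhouse-odbc-bridge}"
library_bridge="${library_bridge_opt:-$bridge_dir/clickhouse-library-bridge}"

echo "$odbc_bridge"
echo "$library_bridge"
```

With the options left empty, both bridge paths resolve next to the clickhouse binary, which is exactly why a directory copied from a package or image "just works".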

Specifying binaries as paths on the host filesystem works well if you just want to run integration tests against a locally installed ClickHouse, in which case the binaries will be located in the /usr/bin folder, or against a locally built ClickHouse, in which case the binaries will be located in the ClickHouse/build/programs folder. Nevertheless, if you want to compare the behavior of the same tests against other versions, the following cases are not covered by default:

  • pointing to binaries provided by a Docker image
  • pointing to binaries provided by a deb package at some URL
  • pointing to binaries provided by a deb package on the local filesystem

We already have the ClickHouse source code available, so we could build the binaries locally. But this would take a lot of time and would require additional system setup. Instead, let's manually grab the ClickHouse binaries from a Docker image. For our branch, there is a corresponding Docker image that we can pull from Docker Hub.

docker pull clickhouse/clickhouse-server:23.12.1.1368-alpine

The procedure is simple:

  • run the image in the background
  • create a local folder where we’ll copy the binaries
  • copy all the binaries from the container to our local folder

docker run -d --name "clickhouse" clickhouse/clickhouse-server:23.12.1.1368-alpine
mkdir 23.12.1.1368-alpine
docker cp clickhouse:/usr/bin/clickhouse 23.12.1.1368-alpine/clickhouse
docker cp clickhouse:/usr/bin/clickhouse-odbc-bridge 23.12.1.1368-alpine/clickhouse-odbc-bridge
docker cp clickhouse:/usr/bin/clickhouse-library-bridge 23.12.1.1368-alpine/clickhouse-library-bridge
docker stop clickhouse
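The steps above can be wrapped into a small dry-run helper. This is our own convenience function, not part of the ClickHouse repo: it only prints the extraction commands for a given tag, so you can review them first and then pipe the output to sh to execute.

```shell
# Print the commands needed to extract ClickHouse binaries for a given
# image tag; pipe the output to sh to actually run them.
extract_binaries() {
  tag="$1"
  echo "docker run -d --name clickhouse clickhouse/clickhouse-server:$tag"
  echo "mkdir -p $tag"
  for bin in clickhouse clickhouse-odbc-bridge clickhouse-library-bridge; do
    echo "docker cp clickhouse:/usr/bin/$bin $tag/$bin"
  done
  echo "docker stop clickhouse"
}

extract_binaries 23.12.1.1368-alpine
```

Running `extract_binaries 23.12.1.1368-alpine | sh` reproduces the manual procedure above, and swapping the tag lets you pull binaries for any other published version.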

With the ClickHouse binaries ready, we can run integration tests against them. But before we do, we also need to worry about the version of the clickhouse/integration-tests-runner image that the runner script will use, as well as the versions of any images needed by the integration tests themselves. If the runner image version is not pinned to a specific tag using the --docker-image-version option, the latest tag is used by default. In most cases this works just fine; however, if you are working with older ClickHouse source code, watch out: the clickhouse/integration-tests-runner:latest image might not work, or might not work as expected. To avoid this problem, you have to build the clickhouse/integration-tests-runner image by hand locally; only then can you ensure that the image exactly matches the source code of your tests.
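A pinned invocation could look like the sketch below. The tag value here is an assumption, standing in for whatever tag you built locally or pulled; we print the command instead of running it so it can be reviewed first.

```shell
# Sketch: pin the runner image tag instead of relying on :latest.
# "23.12.1.1368" is a placeholder; substitute the tag you actually have.
runner_tag="23.12.1.1368"
cmd="./runner --binary 23.12.1.1368-alpine/clickhouse --docker-image-version $runner_tag 'test_ssl_cert_authentication'"
echo "$cmd"
```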

You can find the Dockerfile needed to build the runner image in the /docker/test/integration/runner folder. But wait: in the /docker/test/integration folder, you will find many more images that are needed for integration tests. In addition, docker/images.json specifies image dependencies, where some images need to be built before others, further complicating the process.
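To get a feel for how docker/images.json encodes the build order, here is an abbreviated, hand-written sample in the same shape (paths and entries trimmed, so don't treat the contents as the real file): each entry's "dependent" list names the images that must be rebuilt after it.

```shell
# Write an abbreviated sample in the shape of docker/images.json.
cat > /tmp/images_sample.json <<'EOF'
{
  "docker/test/integration/base": {
    "name": "clickhouse/integration-test",
    "dependent": ["docker/test/integration/runner"]
  },
  "docker/test/integration/runner": {
    "name": "clickhouse/integration-tests-runner",
    "dependent": []
  }
}
EOF

# Print, for each image, which images have to be rebuilt after it.
python3 - <<'EOF'
import json

with open("/tmp/images_sample.json") as f:
    images = json.load(f)

for path, info in images.items():
    print(f'{info["name"]} -> rebuild after it: {info["dependent"] or "nothing"}')
EOF
```

In this sample, the base image must be built first, because the runner image has to be rebuilt whenever the base changes; the real file applies the same rule across many more images.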

Let's ignore these complexities for now and just use the default tags for all the images, including the runner image itself. If not specified, the default tag is latest, and if you are using source code close to master, then, fingers crossed, it might just work.

Note that the 'test_ssl_cert_authentication -ss' argument mentioned in the README.md will not work and will result in the following error:

ERROR: file or directory not found: test_ssl_cert_authentication -ss

But we can just slightly modify our command and run all tests/integration/test_ssl_cert_authentication tests as follows:

./runner --binary 23.12.1.1368-alpine/clickhouse 'test_ssl_cert_authentication'

We got our first integration tests running by hand against binaries that we took from the clickhouse/clickhouse-server:23.12.1.1368-alpine Docker image! However, remember that we kind of cheated and got lucky that the clickhouse/integration-tests-runner:latest and other images worked for us. We can't rely on luck all the time, and the problem of using the correct images that exactly match our ClickHouse source code remains.

How integration tests are run in CI/CD

Having launched the first integration tests by hand, let's look at how integration tests execute as part of the CI/CD pipeline. Unfortunately, if we start tracing the pipeline code from the workflow YAML files down to where integration tests are actually run, we'll quickly find ourselves going down a rabbit hole. Ignoring the YAML files and the code in ci.py, the diagram below shows the trace where, in total, 1800 lines of Python code sit above the code that actually runs the integration tests.

First, we have integration_test_check.py, which runs ci-runner.py, which in turn uses the runner script to launch a Docker container from the clickhouse/integration-tests-runner image, where the integration tests are actually executed.

When examining the above files, we need to separate the code that is related to CI/CD from the code that is related to running the tests. A quick examination shows that all the logic for running all the integration tests can be found in ci-runner.py. Therefore, we will use this file to get an idea of how we need to execute all the tests. As it turns out, it is not that simple, and we will cover this in the next part of this article. So stay tuned for part two!

Conclusion

It should be evident by now that running ClickHouse integration tests is not as trivial a task as one might think. In this first part, we set up our environment and learned how to launch a few integration tests by hand. However, we did cheat a little bit and, for now, ignored the problem of properly handling the Docker images that are needed to run all the tests. We also briefly looked at how the CI/CD pipeline launches integration tests and discovered how much wrapper code sits in front of the actual test execution. To make the problem worse, the CI/CD scripts are written to run mostly inside GitHub Actions and are not user-friendly if you try to run them by hand. In the second part, we will look at solving these problems and see how we enabled running ClickHouse integration tests by hand using our own solution.
