Is ClickHouse Moving Away from Open Source?

Note: To avoid any possible confusion for readers, we would like to state clearly that the ClickHouse Apache 2.0 license is not changing. Please read the article for further details. Thanks!

ClickHouse is one of the best open source databases, with a very active community. Released in 2016 under the permissive Apache 2.0 license, it now counts more than 30,000 GitHub stargazers, hundreds of code contributors, a rich ecosystem, and thousands of businesses using ClickHouse in production. All are excellent indicators of the outstanding success of this open source technology.

In 2021 ClickHouse Inc. emerged in order to commercialize ClickHouse. Two years later there are now signs that ClickHouse might be moving away from its open source origins – important new features are available only in ClickHouse Cloud. That raises a lot of questions about ClickHouse’s future.

Closing Open Source?

The ClickHouse Cloud service was started in September 2022, and since then there has been evidence that ClickHouse Inc. has begun to ship certain features in its private version only. For a long time these were only minor ones around the Replicated database engine, and they had little impact on open source use.

But the recent announcement of SharedMergeTree and lightweight updates, which are available only in ClickHouse Cloud, shows that this strategy may be changing. It’s now apparent that important features are not going to be available in the open source version. 

Community members of course raised questions to the ClickHouse team about their plans. As Alexey Milovidov, ClickHouse Inc. CTO, responded:

“It’s good to have a small, limited number of modifications exclusive to ClickHouse Cloud, but only those that do not compromise the features or operation in self-managed usages, but in the same way, are crucial and distinguishing for the Cloud.”

Unfortunately, this statement is incomplete. The current closed source features include not only those that are essential for cloud operation at scale (like the new SharedMergeTree storage engine, which allows true separation of storage from compute), but also generic features like lightweight UPDATE that any ClickHouse user would use, whether in the cloud or on prem, and S3 role-based access, which is a key security feature in public clouds.

Moreover, the current object storage implementation in open source ClickHouse looks neglected. Significant improvements were planned in the 2023 Roadmap, and some, like shared metadata, were a part of the 2022 Roadmap and even earlier, so the community expected those to be available in ClickHouse. Now it is clear that those improvements were implemented in ClickHouse Inc.’s internal fork as a part of SharedMergeTree, and are not going to be available in open source at all. This is a disappointment for the community, which has been waiting several years for those features.

So is ClickHouse now Open Core?

Open core is the common name for an open source project with a set of advanced features that are closed source. It is one of several models to monetize open source projects, an increasing concern for software businesses looking for a big return on investment. It also means tectonic changes for the ClickHouse open source community.

Diagram from “How to Make Money from Open Source Platforms – Linux.com”

In the true open source model that ClickHouse has followed for years, the community is the main development driver. The community helps to define the roadmap, submits new features or feature requests, develops the ecosystem, and starts new businesses. The core development team develops strategic features, and acts as a moderator and the housekeeper, trying to ensure sustainable community growth. The more active the community, the more successful applications emerge, and the more popular the project becomes. This is a positive feedback loop.

In a full open core project, it is quite the opposite. The business and product of the project owner are the main drivers. The core development team focuses on product development. The open source community, to the extent it exists, is valued as a lead generation source. One can start developing an application using an open source version, but in order to run it in production at scale, one has to switch to the closed source product.

What we can see now in ClickHouse is a shift to the latter model. Using object storage efficiently is crucial for big data analytics, yet this feature is not fully available in open source. Roadmap items are being delivered only as closed source cloud features, while the others are abandoned.

What’s Next?

ClickHouse is a great database that has proven its value many times over. It runs everywhere, from edge devices to huge server farms, with outstanding results. Its extreme performance, flexibility, and portability were the main reasons for its success. This could not have happened without an outstanding open source community that has grown for years, thanks to Apache 2.0 licensing and a caring approach to users.

Unfortunately, a switch to an open core model undermines several factors that made ClickHouse successful. The focus on the product does not allow the core team to maintain the same level of community support as before. The focus on ClickHouse Cloud features ignores the needs of users who want to use ClickHouse elsewhere. Those and other factors erode the trust of the community that made ClickHouse so popular. The community will have to adapt to continue the amazing growth of the past few years.

First, it is necessary to distinguish the open source roadmap from the ClickHouse cloud roadmap. Users need to know what is going to be available in open source and when.

Second, the community must step up to drive development of strategic features. We recently submitted an RFC for object storage support improvements. It is based on abundant feedback from other open source users and is truly a community effort. We hope this work can be done and will be merged upstream even if it overlaps with ClickHouse cloud-only features.

Third, the ClickHouse team will have to delegate more authority to community contributors. The ClickHouse team is emerging as a bottleneck for reviewing and merging pull requests, which is understandable given their focus on the product.

We’ve always dreamed that ClickHouse would eventually move to a foundation with independent governance. This would make many users happy. We may still dream, right?

2 Comments

  1. As a user of clickhouse since 2018 I’m fully aligned with the content of this article. This technology is one of the best I’ve been using in my career.

    The choice of clickhouse for a new project in my company has always been a no-brainer, but the recent move from clickhouse.inc to a closed source version has made this choice less straightforward.

    Btw, there is an interesting discussion of this article in HN: https://news.ycombinator.com/item?id=37608186

  2. Agree on most parts here. Clickhouse is a really mature technology and, ideally, donating the core to an Apache/CNCF foundation would really strengthen the confidence in using clickhouse as an open source project.