May 23, 2019
ClickHouse offers incredible flexibility to solve almost any business problem in multiple ways, and schema design plays a major role in this. For our recent benchmarking with the Time Series Benchmark Suite (TSBS), we replicated the TimescaleDB schema in order to have a fair comparison. In that design every metric is stored in a separate column. This is the best layout for ClickHouse from a performance perspective, as it fully utilizes the column store and type specialization.
Sometimes, however, the schema is not known in advance, or time series data from multiple device types needs to be stored in the same table. Having a separate column per metric may not be very convenient in these cases, so a different approach is required. In this article we discuss multiple ways to design a schema for time series data and run some benchmarks to validate each approach.
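To make the trade-off concrete, here is a minimal Python sketch of the two row shapes involved. It is an illustration only: the column names (`created_at`, `device_id`, `usage_user`, `usage_system`) and the helper `to_narrow_rows` are hypothetical and are not taken from the TSBS schema, and the narrow "one row per metric" layout is just one possible alternative to a column per metric.

```python
from datetime import datetime

# Hypothetical wide row: every metric lives in its own column.
# Column names are illustrative assumptions, not the TSBS schema.
wide_row = {
    "created_at": datetime(2019, 5, 23, 12, 0, 0),
    "device_id": 42,
    "usage_user": 17.5,
    "usage_system": 3.2,
}

# Columns that identify the measurement rather than carry a metric value.
KEY_COLUMNS = ("created_at", "device_id")


def to_narrow_rows(row, key_columns=KEY_COLUMNS):
    """Turn one wide row into one (keys, metric_name, value) row per metric."""
    keys = {k: row[k] for k in key_columns}
    return [
        {**keys, "metric_name": name, "value": value}
        for name, value in row.items()
        if name not in key_columns
    ]


narrow_rows = to_narrow_rows(wide_row)
for r in narrow_rows:
    print(r["device_id"], r["metric_name"], r["value"])
```

In the wide shape, adding a new metric means adding a new column. In the narrow shape, a new metric is just a new value of `metric_name`, so devices with different metric sets can share one table, at the cost of more rows and a single generic value type.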