In the fight between you and the world, back the world
Important notice for our beloved Apache Kafka users. We continue to improve the reliability, performance and usability of the Kafka engine, and as a part of this entertaining process we have released the 18.104.22.168 Altinity Stable ClickHouse release. This release supersedes the previous stable release 22.214.171.124 and addresses the following problems:
When partitions were rebalanced between consumers, occasional data duplication and data loss were possible. It is fixed now.
When data was polled from several partitions with one poll and committed partially, occasional data losses were possible. It is fixed now.
The block size threshold (‘kafka_max_block_size’ setting) now triggers a block flush correctly, which reduces the chance of data loss under high load.
When new columns were added to the Kafka pipeline (Kafka engine -> Materialized View -> Table), it was previously not possible to alter the destination table first, adding the column with a default value, and then alter the MV in order to minimize downtime. The problem was not directly related to Kafka, but to the general implementation of materialized views. It is fixed now.
A few other minor problems have been addressed as well.
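The order of operations described above could be sketched roughly as follows. All table, column, topic and broker names here (`events`, `events_queue`, `events_mv`, `country`, `kafka:9092`) are hypothetical, and recreating the Kafka table and the view with DROP/CREATE is only one assumed way to change them, not necessarily the procedure used in any particular deployment.

```sql
-- Hypothetical pipeline: events_queue (Kafka engine) -> events_mv (MV) -> events (destination table).

-- 1. Alter the destination table first. The DEFAULT keeps inserts from the
--    old materialized view working while the rest of the pipeline is updated.
ALTER TABLE events ADD COLUMN country String DEFAULT '';

-- 2. Recreate the consuming side with the new column. Dropping the view
--    first stops consumption; offsets are kept in the Kafka consumer group.
DROP TABLE events_mv;
DROP TABLE events_queue;

CREATE TABLE events_queue
(
    ts DateTime,
    user_id UInt64,
    country String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',        -- assumed broker address
         kafka_topic_list = 'events',             -- assumed topic
         kafka_group_name = 'clickhouse_events',  -- assumed consumer group
         kafka_format = 'JSONEachRow';

-- 3. Recreate the materialized view so it passes the new column through.
CREATE MATERIALIZED VIEW events_mv TO events AS
SELECT ts, user_id, country
FROM events_queue;
```

Because the destination table already has the column when the view is recreated, the window in which nothing is consumed is limited to steps 2 and 3.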
Since ClickHouse now respects the ‘kafka_max_block_size’ setting, which defaults to 65535, we recommend increasing it to a larger value for high-volume streaming. Setting it to 524288 or 1048576 may increase consumer throughput by up to 20%.
If you use the Kafka engine, we recommend upgrading to this release. For those who are new to ClickHouse integration with Kafka, please watch our webinar “Fast Insight From Fast Data: Integrating ClickHouse and Apache Kafka”.
We are looking forward to certifying the 20.1.x release soon. The year 2020 brings a lot of new features and performance improvements. Stay tuned.
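As an illustration, the setting can be given per table in the SETTINGS clause of the Kafka engine definition. This is a minimal sketch: the table name, columns, broker address, topic and consumer group are made up for the example, and only the value 524288 comes from the recommendation above.

```sql
-- Hypothetical Kafka engine table with a larger block size threshold.
CREATE TABLE readings_queue
(
    ts DateTime,
    sensor_id UInt32,
    value Float64
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',          -- assumed broker address
         kafka_topic_list = 'readings',             -- assumed topic
         kafka_group_name = 'clickhouse_readings',  -- assumed consumer group
         kafka_format = 'JSONEachRow',
         kafka_max_block_size = 524288;             -- flush after this many rows
```

Larger blocks mean fewer, bigger inserts into the destination table. Whether the throughput gain materializes depends on message size and load, so it is worth measuring on your own stream.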