Altinity Stable Release for ClickHouse 22.3

A few months ago we certified 21.8 as an Altinity Stable release. It was delivered together with the Altinity Stable build for ClickHouse. Since then a lot has happened in ClickHouse. At Altinity we continued to work on newer releases and run them in-house. We completed several new features, and many more have been added by community contributors. We started testing the new ClickHouse LTS release 22.3 as soon as it was out in late March. It took us more than three months to confirm that 22.3 is ready for production use and to make sure upgrades go smoothly. As of 22.3.8 we are confident in certifying 22.3 as an Altinity Stable release.

This release is a significant upgrade over the previous Altinity Stable release. It includes more than 3000 pull requests from 415 contributors. See below for detailed release notes.

Major new features in 22.3 since the previous stable release 21.8

The new release introduces many changes and new functions. It is hard to pick the most essential ones, so refer to the full list in the Appendix. The following major features are worth highlighting:

  • SQL features:
    • User defined functions as lambda expressions.
    • User defined functions as external executables.
    • Schema inference for INSERT and SELECT from external data sources.
    • -Map combinator for Map data type.
    • WINDOW VIEW for stream processing (experimental). See our blog article “Battle of Views” comparing it to LIVE VIEW.
    • INTERSECT, EXCEPT, ANY, ALL, EXISTS operators.
    • EPHEMERAL columns.
    • Asynchronous inserts.
    • Support expressions in JOIN ON.
    • OPTIMIZE DEDUPLICATE on a subset of columns a).
    • COMMENT on schema objects a).
  • Security features:
    • Predefined named connections (or named collections) for external data sources. Can be used in table functions, dictionaries, table engines.
    • system.session_log table that tracks connections and login attempts a).
    • Disk-level encryption. See the meetup presentation for many interesting details.
    • Support server-side encryption keys for S3 a).
    • Support authentication of users connected via SSL by their X.509 certificate.
  • Replication and Cluster improvements:
    • ClickHouse Keeper, an in-process ZooKeeper replacement, has been graduated to production-ready by the ClickHouse team. We also keep testing it on our side: the core functionality looks good and stable, but some operational issues and edge cases still exist.
    • Automatic replica discovery – no need to alter remote_servers anymore.
    • Parallel reading from multiple replicas (experimental).
  • Dictionary features:
    • Array attributes, Nullable attributes.
    • New hashed_array dictionary layout that reduces RAM usage for big dictionaries (idea proposed by Altinity).
    • Custom queries for dictionary source.
  • Integrations:
    • SQLite table engine, table function, database engine.
    • Executable function, storage engine and dictionary source.
    • FileLog table engine.
    • Huawei OBS storage support.
    • Aliyun OSS storage support.
  • Remote file system and object storage features:
    • Zero-copy replication for HDFS.
    • Partitioned writes into S3 a), File, URL and HDFS storages.
    • Local data cache for remote filesystems.
  • Other:
    • Store user access management data in ZooKeeper.
    • Production ready ARM support.
    • Significant rework of clickhouse-local that is now as advanced as clickhouse-client.
    • Projections and window functions are graduated and not experimental anymore.

As usual with ClickHouse, there are many performance and operational improvements in different server components.

a) – contributed by Altinity developers.

Backward Incompatible Changes

The following changes are backward incompatible and require user attention during an upgrade:

  • Do not output trailing zeros in text representation of Decimal types. Example: 1.23 will be printed instead of 1.230000 for decimal with scale 6. Serialization in output formats can be controlled with the setting output_format_decimal_trailing_zeros.
  • Now, scalar subquery always returns a Nullable result if its type can be Nullable.
  • Introduce syntax for here documents. Example: SELECT $doc$ VALUE $doc$. This change is backward incompatible if there are identifiers that contain $.
  • Now indices can handle Nullable types, including isNull and isNotNull functions. This required an index file format change – the idx2 file extension. ClickHouse 21.8 cannot read those files, so a correct downgrade may not be possible.
  • MergeTree table-level settings replicated_max_parallel_sends, replicated_max_parallel_sends_for_table, replicated_max_parallel_fetches, replicated_max_parallel_fetches_for_table were replaced with max_replicated_fetches_network_bandwidth, max_replicated_sends_network_bandwidth and background_fetches_pool_size.
  • Change the order of json_path and json arguments in SQL/JSON functions to be consistent with the standard.
  • The “leader election” mechanism is removed from ReplicatedMergeTree, because multiple leaders have been supported since 20.6. If you are upgrading from a ClickHouse version older than 20.6, and some replica with an old version is a leader, then the server will fail to start after the upgrade. Stop replicas with the old version to make the new version start. Downgrading to versions older than 20.6 is not possible.
  • Change implementation specific behavior on overflow of the function toDateTime. It will be saturated to the nearest min/max supported instant of datetime instead of wraparound. This change is highlighted as “backward incompatible” because someone may unintentionally rely on the old behavior.
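
For the Decimal change listed above, one way to keep the old text output during a transition is to enable the setting in a settings profile. The fragment below is a minimal sketch, assuming the setting is applied through the default profile in users.xml or a file under users.d/ (the file name in the comment is hypothetical). The same setting can also be passed per session or per query.

```xml
<!-- Sketch only: users.d/decimal_compat.xml is a hypothetical file name. -->
<yandex>
    <profiles>
        <default>
            <!-- 1 = print trailing zeros again, e.g. 1.230000 for a Decimal with scale 6 -->
            <output_format_decimal_trailing_zeros>1</output_format_decimal_trailing_zeros>
        </default>
    </profiles>
</yandex>
```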

Upgrade Notes

There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended for the following reasons:

  • Data after merge is not byte-identical for tables with MINMAX indexes.
  • Data after merge is not byte-identical for tables created with the old syntax – count.txt was added in 22.1.

Rolling upgrade from 20.4 and older is impossible because the “leader election” mechanism is removed from ReplicatedMergeTree.

Known Issues in 22.3.x

The development team continues to improve the quality of the 22.3 release. The following issues still exist in the 22.3.8 version and may affect ClickHouse operation. Please inspect them carefully to decide if those are applicable to your applications.

If you started using 22.3 from one of the earlier versions, please note that the following important bugs have been fixed, especially those related to PREWHERE functionality:

You may also look into GitHub issues with the v22.3-affected label.

Other Important Changes

  • ClickHouse now recreates system tables if the Engine definition differs from the one in the config (see https://github.com/ClickHouse/ClickHouse/pull/31824 and the edge case in https://github.com/ClickHouse/ClickHouse/issues/34929).
  • The behavior of some metrics has changed. For example, written_rows / result_rows may be reported differently (see https://gist.github.com/filimonov/c83fdf988398c062f6fe5b3344c35e80). BackgroundPoolTask was split into BackgroundMergesAndMutationsPoolTask and BackgroundCommonPoolTask.
  • New setting background_merges_mutations_concurrency_ratio=2. It means ClickHouse can schedule twice as many merges/mutations as background_pool_size, which is still 16 by default, so up to 32 with the defaults. For some scenarios with stale replicas the behavior may be harder to predict or explain. If needed, you can return to the old behavior by setting background_merges_mutations_concurrency_ratio=1.
  • Pool sizes should now be configured in config.xml (the fallback of reading them from the default profile of users.xml still exists).
  • In queries like SELECT a, b, ... GROUP BY (a, b, ...) ClickHouse does not untuple the GROUP BY expression anymore. The same is true for ORDER BY and PARTITION BY.
  • When dropping and renaming schema objects ClickHouse now checks dependencies and throws an Exception if the operation may break the dependency: 
    Code: 630. DB::Exception: Cannot drop or rename X because some tables depend on it: Y
    This check may be disabled by setting check_table_dependencies=0.
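
The merge concurrency ratio and pool size mentioned in the list above can be set explicitly in the server configuration. The fragment below is a minimal sketch, assuming both settings are placed at the top level of config.xml or a file under config.d/ (the file name in the comment is hypothetical). The values shown restore the old behavior, where no more merges/mutations are scheduled than background_pool_size allows.

```xml
<!-- Sketch only: config.d/background_pools.xml is a hypothetical file name. -->
<yandex>
    <!-- Default pool size, repeated here only for context -->
    <background_pool_size>16</background_pool_size>
    <!-- 1 = schedule at most background_pool_size merges/mutations (16 * 1) -->
    <background_merges_mutations_concurrency_ratio>1</background_merges_mutations_concurrency_ratio>
</yandex>
```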

ClickHouse embedded monitoring has changed since 21.8. It now collects host-level metrics and stores them every second in the table system.asynchronous_metric_log. This can be visible as an increase in background writes, storage usage, etc. To return to the old rate of metrics refresh / flush, adjust these settings in config.xml:

<asynchronous_metrics_update_period_s>60</asynchronous_metrics_update_period_s>
<asynchronous_metric_log>
    <flush_interval_milliseconds>60000</flush_interval_milliseconds>
</asynchronous_metric_log>

Alternatively, metric_log and asynchronous_metric_log tables can be completely disabled:

<yandex>
    <asynchronous_metric_log remove="1"/>
    <metric_log remove="1"/>
</yandex>

Some new ClickHouse features are now enabled by default. It may lead to a change in behavior, so review those carefully and disable features that may affect your system:

  • check_table_dependencies
  • empty_result_for_aggregation_by_constant_keys_on_empty_set
  • http_skip_not_found_url_for_globs
  • input_format_allow_seeks
  • input_format_csv_empty_as_default
  • input_format_with_types_use_header
  • log_query_views
  • optimize_distributed_group_by_sharding_key
  • remote_filesystem_read_prefetch
  • remote_fs_enable_cache
  • use_hedged_requests
  • use_local_cache_for_remote_storage
  • use_skip_indexes
  • wait_for_async_insert
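
If one of the settings listed above turns out to affect a workload, it can be switched back in a settings profile. The fragment below is a minimal sketch, assuming the default profile in users.xml or a file under users.d/ is used. The two settings chosen are only examples, not a recommendation.

```xml
<yandex>
    <profiles>
        <default>
            <!-- Example: skip dependency checks on DROP / RENAME -->
            <check_table_dependencies>0</check_table_dependencies>
            <!-- Example: turn off hedged requests for distributed queries -->
            <use_hedged_requests>0</use_hedged_requests>
        </default>
    </profiles>
</yandex>
```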

In the previous releases we recommended disabling optimize_on_insert. This recommendation still stands for 22.3, because inserts into SummingMergeTree and AggregatingMergeTree tables can otherwise slow down.
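
The recommendation above can be applied through a settings profile. This is a minimal sketch, assuming the default profile is the right place for your deployment. It can equally be set in a dedicated profile for the users that perform inserts.

```xml
<yandex>
    <profiles>
        <default>
            <!-- 0 = do not apply merge-like transformations to each inserted block -->
            <optimize_on_insert>0</optimize_on_insert>
        </default>
    </profiles>
</yandex>
```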

Changes Compared to Community Build

ClickHouse Altinity Stable builds are based on the community LTS versions. Altinity.Stable 22.3.8.40 is based on community 22.3.8.39-lts, but we have additionally backported several features we were working on for our clients:

Let’s Install!

ClickHouse Altinity Stable releases are based on the community versions. 

Linux packages can be found at https://packages.clickhouse.com for community builds, and at https://builds.altinity.cloud for Altinity Stable builds.

Note: The naming scheme for Altinity.Stable build packages has changed since 21.8.x.

21.8.x | 22.3.x
<package>_<ver>.altinitystable_all.deb | <package>_<ver>.altinitystable_amd64.deb
<package>-<ver>.altinitystable-2.noarch.rpm | <package>-<ver>.altinitystable.x86_64.rpm

Docker images for community versions have been moved from the ‘yandex’ to the ‘clickhouse’ organization, and should be referenced as ‘clickhouse/clickhouse-server:22.3’. Altinity Stable build images are available as ‘altinity/clickhouse-server:22.3’.

Mac users are welcome to use Homebrew Formulae. Ready-to-use bottles are available for both M1 and Intel Macs running Monterey.

For more information on installing ClickHouse from either the Altinity Builds or the Community Builds, see the ClickHouse Altinity Stable Release Build Install Guide.

Please contact us at info@altinity.com if you experience any issues with the upgrade.

Also, please refer to the release notes from the development team available at the following URLs:
