ClickHouse Altinity Stable Release 21.1

Several months ago we certified ClickHouse 20.8 as Altinity Stable. Since then we have worked on newer releases and run them in-house. We completed quite a few new features, and even more have been added by community contributors. It is always difficult to stop and pick a community release for certification: newer releases promise new features, but older ones are easier to upgrade to. We have been running ClickHouse 21.1 for two months, powering our public datasets instance at Altinity.Cloud and testing it in our environments. We are now confident in certifying 21.1.7.1 as an Altinity Stable release.

This release is a significant step forward from the previous Altinity Stable release. It includes 1462 pull requests from 208 contributors, with many new features and performance improvements! See below for the detailed release notes. Special thanks to Alexey Milovidov for reviewing this.

Major new features since the previous stable release 20.8.x

The new release introduces a lot of changes and new features. These are organized by category in the Appendix, so refer to that section for more detail. The following new features are worth mentioning on the front page:

  • Security features:
    • LDAP external users directory, see the article in our blog a)
    • AES Encryption functions, see the article in our blog for more detail a)
    • Migration from OpenSSL to BoringSSL library
    • Support SNI in https connections to remote resources
    • Support SNI in ClickHouse server TLS endpoint a)
    • Security context propagation in distributed queries
  • MergeTree features:
    • New TTL extension: TTL RECOMPRESS (see the examples below)
    • ALTER UPDATE/DELETE IN PARTITION for replicated MergeTree tables a)
    • DETACH TABLE/VIEW PERMANENTLY a)
    • OPTIMIZE DEDUPLICATE BY — deduplicate MergeTree tables by a subset of columns a)
    • SimpleAggregateFunction in SummingMergeTree
    • Option to disable merges for a cold storage in tiered storage configuration a)
  • Integrations:
    • gRPC protocol
    • zstd and xz compression for file-based engines
    • EmbeddedRocksDB engine
  • SQL compatibility:
    • UNION DISTINCT (previously only UNION ALL was supported). The default can be altered by the union_default_mode setting
    • Improved CTE compatibility
    • REPLACE TABLE and CREATE OR REPLACE TABLE DDL statements for Atomic database engine

As usual with ClickHouse, there are many performance and operational improvements in different server components.

a) – contributed by Altinity developers.
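
To give a flavour of the new SQL, here is a short sketch of several features from the list above. The table and column names (events, user_id, event_date) are illustrative only, not from the release notes:

-- TTL recompression: re-encode month-old data with a heavier codec
ALTER TABLE events
    MODIFY TTL event_date + INTERVAL 1 MONTH RECOMPRESS CODEC(ZSTD(7));

-- Deduplicate a MergeTree table by a subset of columns
OPTIMIZE TABLE events FINAL DEDUPLICATE BY user_id, event_date;

-- UNION DISTINCT alongside the previously supported UNION ALL
SELECT user_id FROM events_2020
UNION DISTINCT
SELECT user_id FROM events_2021;

-- CREATE OR REPLACE TABLE (requires the Atomic database engine)
CREATE OR REPLACE TABLE events_tmp (user_id UInt64)
ENGINE = MergeTree ORDER BY user_id;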

Backward Incompatible Changes

The following changes are backward incompatible and require user attention during an upgrade:

  • Atomic database engine is enabled by default. It does not affect existing databases, but new databases will be created with Engine = Atomic, and the engine cannot be changed once a database is created. Database Atomic has been used for system tables since 20.5 and is a good feature in the long term, but it changes the data layout on storage. We recommend disabling it for now, however, especially if you use backup tools such as clickhouse-backup 0.6.4 or earlier (see the examples after this list). In order to disable it by default, add the following configuration section to the default profile:
<yandex>
    <profiles>
        <default>
            <default_database_engine>Ordinary</default_database_engine>
        </default>
    </profiles>
</yandex>
  • toUUID(N) is no longer supported. If there is a DEFAULT column with this expression, ClickHouse won’t start.
  • The following functions were removed: sumburConsistentHash, timeSeriesGroupSum, timeSeriesGroupRateSum.
  • avg and avgWeighted functions now always return Float64. In previous versions they returned Decimal for Decimal arguments.
  • User settings related to file formats (e.g. format_csv_delimiter) are now accepted in the SETTINGS clause when creating a table that uses the File engine, and these settings are used in all INSERT and SELECT statements on that table. Session-level settings are ignored in this case (see the examples after this list).
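
The new Atomic default matters mostly for tooling that expects the Ordinary on-disk layout. As a sketch, the engine can also be pinned per database explicitly (the database name here is illustrative), and existing databases can be checked via system.databases:

CREATE DATABASE legacy_layout ENGINE = Ordinary;

-- Check which engine each existing database uses
SELECT name, engine FROM system.databases;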
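And a minimal sketch of the File engine change, assuming a hypothetical pipe-delimited table:

CREATE TABLE pipe_delimited (id UInt64, s String)
ENGINE = File(CSV)
SETTINGS format_csv_delimiter = '|';

-- The delimiter from the table definition applies here,
-- even if the session sets format_csv_delimiter differently
INSERT INTO pipe_delimited VALUES (1, 'hello');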

Upgrade Notes

There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended.

  • The replication protocol was changed in 20.10 in order to improve the reliability of TTL merges. Replication between versions prior to 20.10 and 20.10+ is incompatible if ReplicatedMergeTree tables with TTL are used. See https://github.com/ClickHouse/ClickHouse/pull/14490 for more information.
    • For a safe upgrade all replicas should be upgraded at once. 
    • Alternatively, SYSTEM STOP TTL MERGES should be used during the upgrade.
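
A minimal sequence for the second option, run on each replica (the SYSTEM statements are those referenced above):

-- On each replica, before upgrading
SYSTEM STOP TTL MERGES;

-- ... upgrade the servers ...

-- After all replicas run the new version
SYSTEM START TTL MERGES;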

Other Important Changes

Some new ClickHouse features are now enabled by default. This may lead to changes in behaviour, so review the items below carefully and disable features that may affect your system:

  • Insert quorum behaviour has been changed: insert_quorum_parallel is enabled by default. It breaks sequential consistency and may have other side effects. We recommend disabling this feature if you were using quorum inserts with previous versions.
  • optimize_on_insert is enabled by default. This new feature applies the logic of Replacing, Summing, Collapsing, and AggregatingMergeTree to each inserted block. Unfortunately, it still has some issues, so we recommend disabling it after upgrading (see the example after this list).
  • use_compact_format_in_distributed_parts_names is enabled by default.
  • input_format_null_as_default is enabled by default.
  • Background fetches are now limited by the background_fetches_pool_size setting. The default value is 3, which may be too low in some cases. In previous versions, the common background pool with a default size of 16 was used for both merges and fetches.
  • Compact MergeTree parts are enabled by default for parts below 10MB uncompressed. See the min_bytes_for_wide_part setting.
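
To restore the previous behaviour, a minimal sketch at the session level (server-wide, the same settings can go into the default profile, as in the XML snippet shown earlier):

SET insert_quorum_parallel = 0;  -- restore sequentially consistent quorum inserts
SET optimize_on_insert = 0;      -- do not transform blocks on INSERT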

Known issues in 21.1.7.1

The development team continues to improve the quality of the 21.1 release. The following issues still exist in 21.1.7.1 and may affect ClickHouse operation. Please inspect them carefully to decide whether they apply to your applications:

The ClickHouse Altinity Stable release is based on the community version. It can be downloaded from repo.clickhouse.tech, and RPM packages are available from the Altinity Stable Repository.

Please contact us at info@altinity.com if you experience any issues with the upgrade.

——————

Appendix

New data types

  • DateTime32 (alias to DateTime)
  • Map (Experimental)
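
A minimal sketch of the experimental Map type (table and key names are illustrative); note that it must be enabled explicitly via the allow_experimental_map_type setting listed in the Appendix:

SET allow_experimental_map_type = 1;

CREATE TABLE kv (m Map(String, UInt64)) ENGINE = Memory;
INSERT INTO kv VALUES (map('clicks', 10, 'views', 100));
SELECT m['clicks'] FROM kv;  -- 10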

New formats

  • JSONCompactStrings
  • JSONCompactStringsEachRow
  • JSONCompactStringsEachRowWithNamesAndTypes
  • JSONStrings
  • JSONStringsEachRow
  • JSONStringsEachRowWithProgress
  • LineAsString
  • ProtobufSingle
  • RawBLOB

New functions

  • DateTime functions:
    • fromModifiedJulianDay, toModifiedJulianDay
    • fromModifiedJulianDayOrNull, toModifiedJulianDayOrNull
    • parseDateTime32BestEffort, parseDateTime32BestEffortOrNull, parseDateTime32BestEffortOrZero, toDateTime32
  • String functions:
    • countMatches, countMatchesCaseInsensitive
    • countSubstrings, countSubstringsCaseInsensitive, countSubstringsCaseInsensitiveUTF8
    • isIPv4String, isIPv6String
  • Functions for semi-duplicate search and strings proximity:
    • bitHammingDistance, tupleHammingDistance
    • ngramMinHash*, ngramSimHash*
  • Encoding/Formatting:
    • encodeXMLComponent
    • formatReadableQuantity
    • formatReadableTimeDelta
  • Array functions:
    • arrayAvg, arrayMax, arrayMin
    • mapPopulateSeries
  • Machine-learning and statistics:
    • mannWhitneyUTest
    • rankCorr
    • hypot
    • studentTTest, welchTTest
  • Encryption:
    • aes_decrypt_mysql, aes_encrypt_mysql
    • decrypt, encrypt
  • URL functions:
    • cutToFirstSignificantSubdomainCustom, cutToFirstSignificantSubdomainCustomWithWWW, cutToFirstSignificantSubdomainWithWWW
    • firstSignificantSubdomainCustom
  • Aggregate functions:
    • New combinator: -SimpleState
  • Functions to work with new Map data type:
    • map
  • Math/Statistics:
    • acosh, asinh, atan2, atanh, cosh, sinh
    • log1p
  • Type related:
    • accurateCast
    • accurateCastOrNull
    • byteSize
    • reinterpretAsUUID
    • toUUIDOrNull
    • toUUIDOrZero
  • Other:
    • farmFingerprint64
  • Internal ClickHouse:
    • errorCodeToName
    • logTrace
    • tid
    • tcpPort
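
A few one-liners illustrating some of the new functions; the results in the comments are what we expect these calls to return:

SELECT countSubstrings('ClickHouse ClickHouse', 'Click');  -- 2
SELECT formatReadableTimeDelta(7320);                      -- '2 hours, 2 minutes'
SELECT isIPv4String('192.168.0.1');                        -- 1
SELECT accurateCastOrNull(-1, 'UInt8');                    -- NULL instead of wrap-around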

New table functions

  • null
  • view

New table engines

  • EmbeddedRocksDB
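
A minimal EmbeddedRocksDB sketch (table and column names are illustrative). The engine requires a PRIMARY KEY and is intended for key-value point lookups:

CREATE TABLE kv_store (key String, value UInt32)
ENGINE = EmbeddedRocksDB
PRIMARY KEY key;

INSERT INTO kv_store VALUES ('k1', 1);
SELECT value FROM kv_store WHERE key = 'k1';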

New metrics and events

system.metrics:

  • BackgroundFetchesPoolTask
  • BackgroundMessageBrokerSchedulePoolTask
  • MaxDDLEntryID
  • TablesToDropQueueSize
  • REMOVED MemoryTrackingForMerges
  • REMOVED MemoryTrackingInBackgroundBufferFlushSchedulePool
  • REMOVED MemoryTrackingInBackgroundDistributedSchedulePool
  • REMOVED MemoryTrackingInBackgroundMoveProcessingPool
  • REMOVED MemoryTrackingInBackgroundProcessingPool
  • REMOVED MemoryTrackingInBackgroundSchedulePool

system.asynchronous_metrics:

  • HTTPThreads
  • InterserverThreads
  • MySQLThreads
  • TCPThreads
  • TotalBytesOfMergeTreeTables
  • TotalPartsOfMergeTreeTables
  • TotalRowsOfMergeTreeTables

system.events:

  • RWLockReadersWaitMilliseconds
  • REMOVED CreatedWriteBufferOrdinary

New system tables

  • errors
  • replicated_fetches
  • replicated_merge_tree_settings

New columns in system tables

  • grants.access_type
  • merges.merge_type, merge_algorithm
  • parts.uuid, default_compression_codec, recompression_ttl_info.expression, recompression_ttl_info.min, recompression_ttl_info.max
  • privileges.privilege, parent_group
  • processes.forwarded_for
  • query_log.event_time_microseconds, query_start_time_microseconds, normalized_query_hash, query_kind, databases, tables, columns, forwarded_for
  • query_thread_log.event_time_microseconds, query_start_time_microseconds, normalized_query_hash, current_database, forwarded_for
  • quotas.keys
  • replication_queue.merge_type
  • storage_policies.prefer_not_to_merge
  • table_engines.supports_parallel_insert
  • trace_log.event_time_microseconds
  • user_directories.path, readonly, params

system.merge_tree_settings added/changed

Name | Old value | New value | Description
assign_part_uuids | | 0 | Generate UUIDs for parts. Before enabling, check that all replicas support the new format.
execute_merges_on_single_replica_time_threshold | | 0 | When greater than zero, only a single replica starts the merge immediately; others wait up to that amount of time to download the result instead of merging locally. If the chosen replica doesn’t finish the merge during that time, it falls back to the standard behaviour.
fsync_after_insert | | 0 | Do fsync for every inserted part. Significantly decreases performance of inserts; not recommended to use with wide parts.
fsync_part_directory | | 0 | Do fsync for the part directory after all part operations (writes, renames, etc.).
in_memory_parts_insert_sync | | 0 | If true, an insert of a part in in-memory format will wait for the fsync of the WAL.
max_compress_block_size | | 0 | Compress the pending uncompressed data in a buffer if its size is larger than or equal to the specified threshold. A block of data will be compressed even if the current granule is not finished. If this setting is not set, the corresponding global setting is used.
max_number_of_merges_with_ttl_in_pool | | 2 | When there are more than the specified number of merges with TTL entries in the pool, do not assign a new merge with TTL. This leaves free threads for regular merges and avoids "Too many parts".
max_partitions_to_read | | -1 | Limit the maximum number of partitions that can be accessed in one query. <= 0 means unlimited. This is the default, which can be overridden by the query-level setting with the same name.
max_replicated_merges_with_ttl_in_queue | | 1 | How many tasks for merging parts with TTL are allowed simultaneously in the ReplicatedMergeTree queue.
merge_with_recompression_ttl_timeout | | 14400 | Minimal time in seconds before a merge with recompression TTL can be repeated.
merge_with_ttl_timeout | 86400 | 14400 | Minimal time in seconds before a merge with delete TTL can be repeated.
min_bytes_for_wide_part | 0 | 10485760 | Minimal uncompressed size in bytes to create a part in wide format instead of compact.
min_compress_block_size | | 0 | When a granule is written, compress the data in a buffer if the size of the pending uncompressed data is larger than or equal to the specified threshold. If this setting is not set, the corresponding global setting is used.
min_compressed_bytes_to_fsync_after_fetch | | 0 | Minimal number of compressed bytes to do fsync for a part after fetch (0 – disabled).
min_compressed_bytes_to_fsync_after_merge | | 0 | Minimal number of compressed bytes to do fsync for a part after merge (0 – disabled).
min_rows_to_fsync_after_merge | | 0 | Minimal number of rows to do fsync for a part after merge (0 – disabled).
remove_empty_parts | | 1 | Remove empty parts after they were pruned by TTL, mutation, or the collapsing merge algorithm.
try_fetch_recompressed_part_timeout | | 7200 | Recompression works slowly in most cases, so a merge with recompression is not started until this timeout expires, while trying to fetch the recompressed part from the replica that was assigned this merge.
write_ahead_log_bytes_to_fsync | | 104857600 | Amount of bytes accumulated in the WAL to do fsync.
write_ahead_log_interval_ms_to_fsync | | 100 | Interval in milliseconds after which fsync for the WAL is done.
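
The effective values, and whether they differ from the compiled-in defaults, can be checked directly, e.g. for the new TTL-related settings:

SELECT name, value, changed
FROM system.merge_tree_settings
WHERE name LIKE '%ttl%';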

system.settings added/changed

Name | Old value | New value | Description
aggregate_functions_null_for_empty | | 0 | Rewrite all aggregate functions in a query, adding the -OrNull suffix to them.
allow_experimental_cross_to_join_conversion | | 1 |
allow_experimental_data_skipping_indices | | 1 |
allow_experimental_low_cardinality_type | | 1 |
allow_experimental_map_type | | 0 | Allow data type Map.
allow_experimental_multiple_joins_emulation | | 1 |
allow_experimental_window_functions | | 0 | Allow experimental window functions.
asterisk_include_alias_columns | | 0 | Include ALIAS columns for wildcard queries.
asterisk_include_materialized_columns | | 0 | Include MATERIALIZED columns for wildcard queries.
async_socket_for_remote | | 1 | Asynchronously read from socket while executing a remote query.
background_fetches_pool_size | | 3 | Number of threads performing background fetches for replicated tables. Only has meaning at server startup.
background_message_broker_schedule_pool_size | | 16 | Number of threads performing background tasks for message streaming. Only has meaning at server startup.
compile | | 0 |
database_atomic_wait_for_drop_and_detach_synchronously | | 0 | When executing DROP or DETACH TABLE in an Atomic database, wait for the table data to be finally dropped or detached.
date_time_output_format | | simple | Method to write DateTime to text output. Possible values: 'simple', 'iso', 'unix_timestamp'.
default_database_engine | Ordinary | Atomic | Default database engine.
do_not_merge_across_partitions_select_final | | 0 | Merge parts only within one partition in SELECT FINAL.
enable_global_with_statement | | 0 | Propagate WITH statements to UNION queries and all subqueries.
experimental_use_processors | | 1 |
force_data_skipping_indices | | | Comma-separated list of strings or literals with the names of the data-skipping indices that should be used during query execution; otherwise an exception will be thrown.
force_optimize_skip_unused_shards_no_nested | | 0 |
format_regexp_escaping_rule | Escaped | Raw | Field escaping rule (for Regexp format).
input_format_csv_arrays_as_nested_csv | | 0 | When reading Array from CSV, expect that its elements were serialized in nested CSV and then put into a string. Example: "[""Hello"", ""world"", ""42"""" TV""]". Braces around an array can be omitted.
input_format_csv_enum_as_number | | 0 | Treat inserted enum values in CSV formats as enum indices.
input_format_null_as_default | 0 | 1 | For text input formats, initialize null fields with default values if the data type of the field is not nullable.
input_format_tsv_enum_as_number | | 0 | Treat inserted enum values in TSV formats as enum indices.
insert_distributed_one_random_shard | | 0 | If enabled, an INSERT into a distributed table will choose a random shard to write to when there is no sharding key.
insert_quorum_parallel | | 1 | For quorum INSERT queries – enable parallel inserts without linearizability.
limit | | 0 | Limit on rows read from the 'end' of the result for a SELECT query; the default 0 means no limit.
load_balancing_first_offset | | 0 | Which replica to preferably send a query to when the FIRST_OR_RANDOM load balancing strategy is used.
log_queries_min_query_duration_ms | | 0 | Minimal time the query has to run to get into query_log/query_thread_log.
mark_cache_min_lifetime | | 0 |
max_bytes_to_read_leaf | | 0 | Limit on read bytes (after decompression) on the leaf nodes for distributed queries. The limit is applied to local reads only, excluding the final merge stage on the root node.
max_concurrent_queries_for_all_users | | 0 | The maximum number of concurrent requests for all users.
max_partitions_to_read | | -1 | Limit the maximum number of partitions that can be accessed in one query. <= 0 means unlimited.
max_rows_to_read_leaf | | 0 | Limit on read rows on the leaf nodes for distributed queries. The limit is applied to local reads only, excluding the final merge stage on the root node.
merge_tree_uniform_read_distribution | | 1 |
min_count_to_compile | | 0 |
multiple_joins_rewriter_version | 2 | 0 | Obsolete setting, does nothing. Will be removed after 2021-03-31.
mysql_datatypes_support_level | | | Which MySQL types should be converted to corresponding ClickHouse types (rather than being represented as String). Can be empty or any combination of 'decimal' or 'datetime64'. When empty, MySQL's DECIMAL and DATETIME/TIMESTAMP with non-zero precision are seen as String on ClickHouse's side.
offset | | 0 | Offset on rows read from the 'end' of the result for a SELECT query.
opentelemetry_start_trace_probability | | 0 | Probability to start an OpenTelemetry trace for an incoming query.
optimize_move_functions_out_of_any | 1 | 0 | Move functions out of aggregate functions 'any', 'anyLast'.
optimize_on_insert | | 1 | Perform the same transformation for an inserted block of data as if a merge was done on this block.
optimize_skip_merged_partitions | | 0 | Skip partitions with one part with level > 0 in OPTIMIZE FINAL.
output_format_json_array_of_rows | | 0 | Output a JSON array of all rows in JSONEachRow(Compact) format.
output_format_json_named_tuples_as_objects | | 0 | Serialize named tuple columns as JSON objects.
output_format_parallel_formatting | | 1 | Enable parallel formatting for some data formats.
output_format_pretty_row_numbers | | 0 | Add row numbers before each row for pretty output formats.
output_format_tsv_null_representation | | | Custom NULL representation in TSV format.
partial_merge_join | | 0 |
read_backoff_min_concurrency | | 1 | Setting to try keeping the minimal number of threads in case of slow reads.
read_overflow_mode_leaf | | throw | What to do when the leaf limit is exceeded.
remerge_sort_lowered_memory_bytes_ratio | | 2 | If memory usage after remerge is not reduced by this ratio, remerge will be disabled.
s3_max_redirects | | 10 | Maximum number of S3 redirect hops allowed.
s3_max_single_part_upload_size | | 67108864 | The maximum size of an object to upload using single-part upload to S3.
special_sort | | not_specified |
system_events_show_zero_values | | 0 | Include all metrics, even those with zero values.
union_default_mode | | | Set the default UNION mode in SELECT ... UNION queries. Possible values: empty string, 'ALL', 'DISTINCT'. If empty, a query without an explicit UNION mode will throw an exception.
use_antlr_parser | | 0 | Parse incoming queries using the ANTLR-generated experimental parser.
use_compact_format_in_distributed_parts_names | 0 | 1 | Changes the format of directory names for distributed table insert parts.
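
After an upgrade it is useful to audit which session settings deviate from the new defaults:

SELECT name, value
FROM system.settings
WHERE changed;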

———-

Also, please refer to the release notes from the development team.
