ClickHouse Altinity Stable Release 21.1

By Alexander Zaitsev on March 24th, 2021

ClickHouse Altinity Stable Release 21.1

Alexander Zaitsev Altinity StableClickHouseReleases

Several months ago we certified ClickHouse 20.8 as Altinity Stable. Since then we have worked on newer releases and run them in-house. We completed quite a few new features, and even more have been added by community contributors. It is always difficult to stop and pick a community release for certification. Newer releases promise new features, but older ones are easier to upgrade to. We were running ClickHouse 21.1 to power our public datasets instance at Altinity.Cloud for two months, and testing it in our environments. We are now confident to certify 21.1.7.1 as an Altinity Stable release.

This release is a significant step forward since the previous Altinity Stable release. It includes 1462 pull requests from 208 contributors with many new features and performance improvements! Please look below for detailed release notes. Special thanks to Alexey Milovidov for reviewing this.

Major new features since the previous stable release 20.8.x

A new release introduces a lot of changes and new features. These are organized by feature in the Appendix, so refer to this section for more detail. The following new features are worth mentioning on the front page:

  • Security features:
    • LDAP external users directory, see the article in our blog a)
    • AES Encryption functions, see the article in our blog for more detail a)
    • Migration from OpenSSL to BoringSSL library
    • Support SNI in https connections to remote resources
    • Support SNI in ClickHouse server TLS endpoint a)
    • Security context propagation in distributed queries
  • MergeTree features:
    • New TTL extension: TTL RECOMPRESS.
    • ALTER UPDATE/DELETE IN PARTITION for replicated MergeTree tables a)
    • DETACH TABLE/VIEW PERMANENTLY a)
    • OPTIMIZE DEDUPLICATE BY — deduplicate MergeTree tables by a subset of columns a)
    • SimpleAggregateFunction in SummingMergeTree
    • Option to disable merges for a cold storage in tiered storage configuration a)
  • Integrations:
    • gRPC protocol
    • zstd and xz compression for file-based engines
    • EmbeddedRocksDB engine
  • SQL compatibility:
    • UNION DISTINCT (previously only UNION ALL was supported). The default can be altered by union_default_mode setting
    • Improved CTE compatibility
    • REPLACE TABLE and CREATE OR REPLACE TABLE DDL statements for Atomic database engine
  • Other:

As usual with ClickHouse, there are many performance and operational improvements in different server components.

a) – contributed by Altinity developers.

Backward Incompatible Changes

The following changes are backward incompatible and require user attention during an upgrade:

  • Atomic database engine is enabled by default. It does not affect existing databases but new databases will be created with Engine = Atomic. The engine can not be modified for the database once created. Database Atomic has been used for system tables since 20.5, and it is a good feature in the long term. We recommend disabling it for now, however, especially if you use some backup tools, including clickhouse-backup 0.6.4 or earlier. The data layout on the storage has been changed. In order to disable it by default, add the following configuration section to the default profile:
<yandex>
    <profiles>
        <default>
            <default_database_engine>Ordinary</default_database_engine>
        </default>
    </profiles>
</yandex>
  • toUUID(N) is no longer supported. If there is a DEFAULT column with this expression ClickHouse won’t start.
  • Following functions where removed: sumburConsistentHash, timeSeriesGroupSum, timeSeriesGroupRateSum.
  • avg and avgWeighted functions now always return Float64. In previous versions they returned Decimal for Decimal arguments.
  • Accept user settings related to file formats (e.g. format_csv_delimiter) in the SETTINGS clause when creating a table that uses File engine, and use these settings in all INSERT and SELECT statements. Session level settings are ignored in this case.

Upgrade Notes

There were several changes between versions that may affect the rolling upgrade of big clusters. Upgrading only part of the cluster is not recommended

  • Replication protocol has been changed in 20.10 in order to improve reliability of TTL merges. Replication between versions prior to 20.10 and 20.10+ is incompatible if ReplicatedMergeTree tables with TTL are used. See https://github.com/ClickHouse/ClickHouse/pull/14490 for more information.
    • For a safe upgrade all replicas should be upgraded at once. 
    • Alternatively, SYSTEM STOP TTL MERGES should be used during the upgrade.

Other Important Changes

Some new ClickHouse features are now enabled by default. It may lead to a change in behaviour, so review those carefully and disable features that may affect your system:

  • Insert quorum behaviour has been changed. insert_quorum_parallel is enabled by default. It breaks sequential consistency and may have other side effects. We recommend disabling this feature if you are using quorum inserts on previous versions.
  • optimize_on_insert is enabled by default. This is a new feature that applies logic of Replacing, Summing, Collapsing and AggregatingMergeTree on the inserted block. Unfortunately, it still has some issues so we recommend disabling this after upgrading.
  • use_compact_format_in_distributed_parts_names  is enabled by default. 
  • input_format_null_as_default is enabled by default.
  • Background fetches are now limited by background_fetches_pool_size setting. The default value is 3 that may be low in some cases. In previous versions the common background pool has been used for merges and fetches with the default size 16.
  • Compact MergeTree parts are enabled by default for parts below 10MB of size uncompressed. See min_bytes_for_wide_part setting. 

Known issues in 21.1.7.1

Development team continues to improve the quality of the 21.1 release. Following issues still exist in the 21.1.7.1 version and may affect ClickHouse operation. Please inspect them carefully in order to decide if those are applicable to your applications:

ClickHouse Altinity Stable release is based on community version. It can be downloaded from repo.clickhouse.tech, and RPM packages are available from the Altinity Stable Repository

Please contact us at [email protected] if you experience any issues with the upgrade.

——————

Appendix

New data types

  • DateTime32 (alias to DateTime)
  • Map (Experimental)

New formats

  • JSONCompactStrings
  • JSONCompactStringsEachRow
  • JSONCompactStringsEachRowWithNamesAndTypes
  • JSONStrings
  • JSONStringsEachRow
  • JSONStringsEachRowWithProgress
  • LineAsString
  • ProtobufSingle
  • RawBLOB

New functions

  • DateTime functions:
    • fromModifiedJulianDay, toModifiedJulianDay
    • fromModifiedJulianDayOrNull, toModifiedJulianDayOrNull
    • parseDateTime32BestEffort, parseDateTime32BestEffortOrNull, parseDateTime32BestEffortOrZero, toDateTime32
  • String functions:
    • countMatches, countMatchesCaseInsensitive
    • countSubstrings, countSubstringsCaseInsensitive, countSubstringsCaseInsensitiveUTF8
    • isIPv4String, isIPv6String
  • Functions for semi-duplicate search and strings proximity:
    • bitHammingDistance, tupleHammingDistance
    • ngramMinHash*, ngramSimHash*
  • Encoding/Formatting:
    • encodeXMLComponent
    • formatReadableQuantity
    • formatReadableTimeDelta
  • Array functions:
    • arrayAvg, arrayMax, arrayMin
    • mapPopulateSeries
  • Machine-learning and statistics:
    • mannWhitneyUTest
    • rankCorr
    • hypot
    • studentTTest, welchTTest
  • Encryption:
    • aes_decrypt_mysql, aes_encrypt_mysql
    • dencrypt, encrypt
  • URL functions
    • cutToFirstSignificantSubdomainCustom, cutToFirstSignificantSubdomainCustomWithWWW, cutToFirstSignificantSubdomainWithWWW
    • firstSignificantSubdomainCustom
  • Aggregate functions
    • New combiantor -SimpleState
  • Functions to work with new Map data type:
    • map
  • Math/Statistics:
    • acosh, asinh, atan2, atanh, cosh, sinh
    • log1p
  • Type related:
    • accurateCast
    • accurateCastOrNull
    • byteSize
    • reinterpretAsUUID
    • toUUIDOrNull
    • toUUIDOrZero
  • Other:
    • farmFingerprint64
  • Internal ClickHouse:
    • errorCodeToName
    • logTrace
    • tid
    • tcpPort

New table functions

  • null
  • view

New table engines

  • EmbeddedRocksDB

New metrics and events

system.metrics

  • BackgroundFetchesPoolTask
  • BackgroundMessageBrokerSchedulePoolTask
  • MaxDDLEntryID
  • TablesToDropQueueSize
  • REMOVED MemoryTrackingForMerges
  • REMOVED MemoryTrackingInBackgroundBufferFlushSchedulePool
  • REMOVED MemoryTrackingInBackgroundDistributedSchedulePool
  • REMOVED MemoryTrackingInBackgroundMoveProcessingPool
  • REMOVED MemoryTrackingInBackgroundProcessingPool
  • REMOVED MemoryTrackingInBackgroundSchedulePool

system.asynchronous_metrics:

  • HTTPThreads
  • InterserverThreads
  • MySQLThreads
  • TCPThreads
  • TotalBytesOfMergeTreeTables
  • TotalPartsOfMergeTreeTables
  • TotalRowsOfMergeTreeTables

system.events:

  • RWLockReadersWaitMilliseconds
  • REMOVED CreatedWriteBufferOrdinary

New system tables

  • errors
  • replicated_fetches
  • replicated_merge_tree_settings

New columns in system tables

  • grants.access_type
  • merges.merge_type, merge_algorithm
  • parts.uuid, default_compression_codec, recompression_ttl_info.expression, recompression_ttl_info.min, recompression_ttl_info.max
  • privileges.privilege, parent_group
  • processes.forwarded_for
  • query_log.event_time_microseconds, query_start_time_microseconds, normalized_query_hash, query_kind, databases, tables, columns, forwarded_for
  • query_thread_log.event_time_microseconds, query_start_time_microseconds, normalized_query_hash, current_database, forwarded_for
  • quotas.keys
  • replication_queue.merge_type
  • storage_policies.prefer_not_to_merge
  • table_engines.supports_parallel_insert
  • trace_log.event_time_microseconds
  • user_directories.path, readonly, params

system.merge_tree_settings added/changed

Name Old value New value Description
assign_part_uuids 0 Generate UUIDs for parts. Before enabling check that all replicas support the new format.
execute_merges_on_single_replica_time_threshold 0 When greater than zero only a single replica starts the merge immediately, others wait up to that amount of time to download the result instead of doing merges locally. If the chosen replica doesn’t finish the merge during that amount of time, fallback to standard behavior happens.
fsync_after_insert 0 Do fsync for every inserted part. Significantly decreases performance of inserts, not recommended to use with wide parts.
fsync_part_directory 0 Do fsync for part directory after all part operations (writes, renames, etc.).
in_memory_parts_insert_sync 0 If true insert of part with in-memory format will wait for fsync of WAL
max_compress_block_size 0 Compress the pending uncompressed data in a buffer if its size is larger or equal than the specified threshold. Block of data will be compressed even if the current granule is not finished. If this setting is not set, the corresponding global setting is used.
max_number_of_merges_with_ttl_in_pool 2 When there is more than a specified number of merges with TTL entries in the pool, do not assign a new merge with TTL. This is to leave free threads for regular merges and avoid “Too many parts”
max_partitions_to_read -1 Limit the max number of partitions that can be accessed in one query. <= 0 means unlimited. This setting is the default that can be overridden by the query-level setting with the same name.
max_replicated_merges_with_ttl_in_queue 1 How many tasks of merging parts with TTL are allowed simultaneously in the ReplicatedMergeTree queue.
merge_with_recompression_ttl_timeout 14400 Minimal time in seconds, when merge with recompression TTL can be repeated.
merge_with_ttl_timeout 86400 14400 Minimal time in seconds, when merge with delete TTL can be repeated.
min_bytes_for_wide_part 0 10485760 Minimal uncompressed size in bytes to create part in wide format instead of compact
min_compress_block_size 0 When granule is written, compress the data in a buffer if the size of pending uncompressed data is larger or equal than the specified threshold. If this setting is not set, the corresponding global setting is used.
min_compressed_bytes_to_fsync_after_fetch 0 Minimal number of compressed bytes to do fsync for part after fetch (0 – disabled)
min_compressed_bytes_to_fsync_after_merge 0 Minimal number of compressed bytes to do fsync for part after merge (0 – disabled)
min_rows_to_fsync_after_merge 0 Minimal number of rows to do fsync for part after merge (0 – disabled)
remove_empty_parts 1 Remove empty parts after they were pruned by TTL, mutation, or collapsing merge algorithm
try_fetch_recompressed_part_timeout 7200 Recompression works slowly in most cases.  We don’t start a merge with recompression until this timeout and try to fetch the recompressed part from the replica which assigned this merge with recompression.
write_ahead_log_bytes_to_fsync 104857600 Amount of bytes, accumulated in WAL to do fsync.
write_ahead_log_interval_ms_to_fsync 100 Interval in milliseconds after which fsync for WAL is being done.

system.settings added/changed

Name Old value New value Description
aggregate_functions_null_for_empty 0 Rewrite all aggregate functions in a query, adding -OrNull suffix to them
allow_experimental_cross_to_join_conversion 1
allow_experimental_data_skipping_indices 1
allow_experimental_low_cardinality_type 1
allow_experimental_map_type 0 Allow data type Map
allow_experimental_multiple_joins_emulation 1
allow_experimental_window_functions 0 Allow experimental window functions
asterisk_include_alias_columns 0 Include ALIAS columns for wildcard query
asterisk_include_materialized_columns 0 Include MATERIALIZED columns for wildcard query
async_socket_for_remote 1 Asynchronously read from socket executing remote query
background_fetches_pool_size 3 Number of threads performing background fetches for replicated tables. Only has meaning at server startup.
background_message_broker_schedule_pool_size 16 Number of threads performing background tasks for message streaming. Only has meaning at server startup.
compile 0
database_atomic_wait_for_drop_and_detach_synchronously 0 When executing DROP or DETACH TABLE in Atomic database, wait for table data to be finally dropped or detached.
date_time_output_format simple Method to write DateTime to text output. Possible values: ‘simple’, ‘iso’, ‘unix_timestamp’.
default_database_engine Ordinary Atomic Default database engine.
do_not_merge_across_partitions_select_final 0 Merge parts only in one partition in select final
enable_global_with_statement 0 Propagate WITH statements to UNION queries and all subqueries
experimental_use_processors 1
force_data_skipping_indices Comma separated list of strings or literals with the name of the data skipping indices that should be used during query execution, otherwise an exception will be thrown.
force_optimize_skip_unused_shards_no_nested 0
format_regexp_escaping_rule Escaped Raw Field escaping rule (for Regexp format)
input_format_csv_arrays_as_nested_csv 0 When reading Array from CSV, expect that its elements were serialized in nested CSV and then put into string. Example: “[“”Hello””, “”world””, “”42″””” TV””]”. Braces around an array can be omitted.
input_format_csv_enum_as_number 0 Treat inserted enum values in CSV formats as enum indices
input_format_null_as_default 0 1 For text input formats initialize null fields with default values if data type of this field is not nullable
input_format_tsv_enum_as_number 0 Treat inserted enum values in TSV formats as enum indices
insert_distributed_one_random_shard 0 If setting is enabled, inserting into distributed table will choose a random shard to write when there is no sharding key
insert_quorum_parallel 1 For quorum INSERT queries – enable to make parallel inserts without linearizability
limit 0 Limit on read rows from the most ‘end’ result for select query, default 0 means no limit length
load_balancing_first_offset 0 Which replica to preferably send a query when FIRST_OR_RANDOM load balancing strategy is used.
log_queries_min_query_duration_ms 0 Minimal time for the query to run, to get to the query_log/query_thread_log.
mark_cache_min_lifetime 0
max_bytes_to_read_leaf 0 Limit on read bytes (after decompression) on the leaf nodes for distributed queries. Limit is applied for local reads only excluding the final merge stage on the root node.
max_concurrent_queries_for_all_users 0 The maximum number of concurrent requests for all users.
max_partitions_to_read -1 Limit the max number of partitions that can be accessed in one query. <= 0 means unlimited.
max_rows_to_read_leaf 0 Limit on read rows on the leaf nodes for distributed queries. Limit is applied for local reads only excluding the final merge stage on the root node.
merge_tree_uniform_read_distribution 1
min_count_to_compile 0
multiple_joins_rewriter_version 2 0 Obsolete setting, does nothing. Will be removed after 2021-03-31
mysql_datatypes_support_level Which MySQL types should be converted to corresponding ClickHouse types (rather than being represented as String). Can be empty or any combination of ‘decimal’ or ‘datetime64’. When empty MySQL’s DECIMAL and DATETIME/TIMESTAMP with non-zero precision are seen as String on ClickHouse’s side.
offset 0 Offset on read rows from the most ‘end’ result for select query
opentelemetry_start_trace_probability 0 Probability to start an OpenTelemetry trace for an incoming query.
optimize_move_functions_out_of_any 1 0 Move functions out of aggregate functions ‘any’, ‘anyLast’.
optimize_on_insert 1 Do the same transformation for inserted block of data as if merge was done on this block.
optimize_skip_merged_partitions 0 Skip partitions with one part with level > 0 in optimize final
output_format_json_array_of_rows 0 Output a JSON array of all rows in JSONEachRow(Compact) format.
output_format_json_named_tuples_as_objects 0 Serialize named tuple columns as JSON objects.
output_format_parallel_formatting 1 Enable parallel formatting for some data formats.
output_format_pretty_row_numbers 0 Add row numbers before each row for pretty output format
output_format_tsv_null_representation Custom NULL representation in TSV format
partial_merge_join 0
read_backoff_min_concurrency 1 Settings to try keeping the minimal number of threads in case of slow reads.
read_overflow_mode_leaf throw What to do when the leaf limit is exceeded.
remerge_sort_lowered_memory_bytes_ratio 2 If memory usage after remerge does not reduced by this ratio, remerge will be disabled.
s3_max_redirects 10 Max number of S3 redirects hops allowed.
s3_max_single_part_upload_size 67108864 The maximum size of object to upload using singlepart upload to S3.
special_sort not_specified
system_events_show_zero_values 0 Include all metrics, even with zero values
union_default_mode Set default Union Mode in SelectWithUnion query. Possible values: empty string, ‘ALL’, ‘DISTINCT’. If empty, query without Union Mode will throw exception.
use_antlr_parser 0 Parse incoming queries using ANTLR-generated experimental parser
use_compact_format_in_distributed_parts_names 0 1 Changes format of directories names for distributed table insert parts.

———-

Also, please refer to the release notes from the development team available at the following URLs:

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.