User Notice: Possible ClickHouse Data Corruption in Recent Releases

This advisory notifies users that ClickHouse community versions 22.10+, 22.9+, 22.8.6.71+, 22.7.6.74+, and 22.3.13.80+ contain a backward-incompatible change that may corrupt user data if you use certain aggregate functions. Altinity Stable builds are not affected.

Read this advisory in full before upgrading to, or downgrading from, any of the versions listed above.
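The version list above can be turned into a quick check for a community build. This is a minimal sketch, not an official tool: the names FIRST_AFFECTED, ALL_AFFECTED_FROM, parse_version, and is_listed_as_affected are made up for illustration, and it assumes that release lines not named in the list (for example 22.4 to 22.6) are unaffected, which this advisory does not state either way. It reflects only the list as published here and says nothing about later bugfix releases.

```python
# Release lines and their first affected build, copied from the version list above.
# Assumption: release lines that are not listed are treated as unaffected.
FIRST_AFFECTED = {
    (22, 3): (13, 80),
    (22, 7): (6, 74),
    (22, 8): (6, 71),
}
# "22.9+" and "22.10+": every build from the 22.9 line onward is listed.
ALL_AFFECTED_FROM = (22, 9)


def parse_version(text):
    """Turn '22.8.6.71' into (22, 8, 6, 71); missing parts count as 0."""
    parts = [int(p) for p in text.strip().split(".")[:4]]
    return tuple(parts + [0] * (4 - len(parts)))


def is_listed_as_affected(version_text):
    """Return True if a community version falls in the advisory's list."""
    major, minor, patch, build = parse_version(version_text)
    if (major, minor) >= ALL_AFFECTED_FROM:
        return True
    first = FIRST_AFFECTED.get((major, minor))
    return first is not None and (patch, build) >= first


if __name__ == "__main__":
    # Example input; on a real server the string would come from SELECT version().
    print(is_listed_as_affected("22.8.6.71"))
```

The check applies to community builds only, since Altinity Stable builds are not affected regardless of their version number.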

What happened

The serialization format of the internal state of aggregate functions that store a String state was changed in the new builds and is not compatible with the old format. Common examples of those functions are:

  • AggregateFunction(argMax, String)
  • AggregateFunction(argMin, String)
  • AggregateFunction(max, String)
  • AggregateFunction(min, String)
  • AggregateFunction(any, String)

If those functions are used in AggregatingMergeTree, the state will be incompatible after an upgrade to the versions listed above, and ClickHouse will not be able to read the data correctly from affected columns. Similarly, the downgrade from affected versions to the previous ones will result in corrupted data.

How to check if your cluster may be affected

Altinity.Cloud

Altinity will check Altinity.Cloud clusters for problems and will let you know if action is needed. You do not need to check yourself. Affected releases will also be marked in the Altinity.Cloud Manager. Do not upgrade to them without consulting with Altinity first.

All Other Community ClickHouse Installations

Run this query on your cluster to check whether you have affected data types:

select database, table, name, type from system.columns 
where match(type,'^AggregateFunction\(.*,.*String')

Inspect the results carefully. In addition to the examples given above, the following functions are affected:

  • Functions with -If combinator, e.g. AggregateFunction(argMaxIf, String) 
  • Functions over Array(String), tuples with string columns and similar, e.g. AggregateFunction(argMax, Array(String))
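To see which type strings the pattern in the query above selects, it can be tried on a few samples. This is a minimal sketch: the sample types and the helper name flagged_by_query are made up for illustration, and it assumes that Python's re module treats this simple pattern the same way as ClickHouse's match() function.

```python
import re

# Same pattern as in the system.columns query above.
# Assumption: Python's re engine handles this simple pattern like ClickHouse's match().
AFFECTED_TYPE = re.compile(r"^AggregateFunction\(.*,.*String")


def flagged_by_query(column_type):
    """Return True if the advisory's pattern would select this column type."""
    return AFFECTED_TYPE.search(column_type) is not None


# Hypothetical type strings, written the way system.columns reports them.
sample_types = [
    "AggregateFunction(argMax, String, DateTime)",
    "AggregateFunction(argMaxIf, String, DateTime, UInt8)",
    "AggregateFunction(argMax, Array(String), DateTime)",
    "AggregateFunction(sum, UInt64)",
    "SimpleAggregateFunction(max, String)",
]

for column_type in sample_types:
    print(flagged_by_query(column_type), column_type)
```

The first three samples are selected and the last two are not. The pattern is a coarse filter: it selects any AggregateFunction type in which String appears somewhere after the first comma, which is why the results still need to be inspected by hand.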

If you are not sure whether your cluster is affected, reach out to us for more information. You can contact Altinity at info@altinity.com or via normal support channels (Zendesk or Slack).

What to do now

If you are using affected data types, DO NOT upgrade.

If you are using affected data types, and you have already upgraded:

  • Downgrade if you have not inserted any data into the affected table columns
  • Otherwise rebuild or recreate affected tables and columns if possible

In both cases wait for a bugfix release for a safe upgrade.

How it is going to be fixed

ClickHouse developers are working on a fix that will be compatible with serialization both in old and new versions of ClickHouse. The fix will be deployed to the minor releases of affected versions. We will post another message as soon as this update is released.

Altinity Stable builds will be updated with the fix in the next patch release. Current Altinity Stable builds are not affected. You may continue to use them as before.

Once again, you can contact us at info@altinity.com or via normal support channels (Zendesk or Slack) for more information. We welcome questions about this or any other topic related to ClickHouse.
