Ultra-Fast AWS Graviton Instances in Altinity.Cloud

“My dear, here we must run as fast as we can, just to stay in place. And if you wish to go anywhere you must run twice as fast as that.”

Lewis Carroll, Through the Looking-Glass

Last month, AWS introduced new instance type families, powered by Graviton3 ARM processors: m7g and r7g. Since our go-to instance types for Altinity.Cloud are m5, m6i and m6g, I could not resist testing the performance of m7g. TLDR – it is outstanding!

The Scene Setup

One of the standard benchmarks we use to test performance is the 600M-row Star Schema Benchmark (SSB). We will follow the same procedure as described in the articles “Ultra-fast Data Loading and Testing in Altinity.Cloud” and “Altinity.Cloud Extends Managed Service to ARM”.

Our next generation of Altinity.Cloud infrastructure allows us to add new node types with a single click, so I added a new m7g.xlarge node type.

After 5 minutes, the new node type was available to start a cluster. (Note: adding node types dynamically is a new feature that is not yet released to our clients; please contact Altinity support if you need extra node types in your account).

In order to compare apples-to-apples, I started 3 single node ClickHouse instances of the same ‘xlarge’ size, i.e. 4 vCPUs and 16GB of RAM, and also 200GB of gp3 EBS storage:

M6i.xlarge

This is the fastest instance for ClickHouse so far in “M”-family, introduced in August 2021. It is powered by 3rd generation Intel Xeon Scalable processors (code named Ice Lake) with an all-core turbo frequency of 3.5 GHz.

M6g.xlarge

This Graviton2 instance was introduced in May 2020, so it is already fairly old. It showed good results in our previous benchmark, landing between the m5 and m6i instances at a lower price.

M7g.xlarge

This is our new guest from February 2023, powered by AWS Graviton3 processors. AWS claims it delivers up to 25% better performance than Graviton2-based instances. We will soon find out whether that holds for ClickHouse.

For benchmarks, we always use the latest available ClickHouse version, which is 23.2.3.17. Once started, the 3 instances look great next to each other on my screen, somewhat resembling the famous “Three Knights” painting by Victor Vasnetsov.

For every instance, I created a ‘lineorder_wide’ table using this script. Data was loaded from an S3 bucket. This time it is even simpler than before: thanks to schema inference, there is no need to specify the schema as part of the s3 function call:

INSERT INTO lineorder_wide
SELECT * FROM
s3('https://s3.us-east-1.amazonaws.com/altinity-clickhouse-data/ssb/data/lineorder_wide_bin/*.bin.zstd', 'Native')
SETTINGS max_threads=8, max_insert_threads=8;

I did not measure loading time, but checked it afterwards using the Altinity.Cloud monitoring dashboard: it took about 12 minutes to load all the data with peak memory usage at 9GB.

The Battle Begins and Ends

Once the data is loaded, we can run the test queries. We use the same benchmark script as in the previous articles. It performs one warmup run for every query, then three test runs, and extracts the average query time from the system.query_log table. Queries are tagged with SQL comments so that they can be found in the log, and using server-side timings excludes network latency. The test run command for the ‘m6i’ cluster is the following:

$ TRIES=3 CH_CLIENT=clickhouse-client CH_HOST=m6i.us-east1.dev.altinity.cloud CH_USER=admin CH_PASS=*** CH_DB=default QUERIES_DIR=flattened/queries ./bench.sh

flattened/queries: ef751cb1a6226399a5e0c9ebdd012e4b  -
Q1.1.sql ... 0.281
Q1.2.sql ... 0.028
Q1.3.sql ... 0.057
Q2.1.sql ... 0.927
Q2.2.sql ... 0.449
Q2.3.sql ... 0.384
Q3.1.sql ... 0.521
Q3.2.sql ... 0.891
Q3.3.sql ... 0.514
Q3.4.sql ... 0.012
Q4.1.sql ... 0.261
Q4.2.sql ... 0.094
Q4.3.sql ... 0.132
Total :	4.551
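
The query-comment technique can be sketched roughly as follows. This is not the actual bench.sh, which is not reproduced in this article: the tag format and the helper names tag_query and avg_time_sql are assumptions made for illustration. The idea is to prepend a unique SQL comment to each benchmark query, then find the runs in system.query_log by that comment and average query_duration_ms, which is measured on the server.

```python
import uuid


def tag_query(sql: str, tag: str) -> str:
    """Prepend a SQL comment so the run can be located in system.query_log later."""
    return f"/* bench:{tag} */ {sql.strip()}"


def avg_time_sql(tag: str, tries: int = 3) -> str:
    """Build a query that averages the server-side duration of the last `tries` tagged runs.

    Taking only the most recent `tries` rows skips the earlier warmup run.
    The LIKE pattern is anchored at the start of the query text, so this
    lookup query does not match itself in the log.
    """
    return (
        "SELECT avg(query_duration_ms) / 1000 AS avg_sec FROM ("
        " SELECT query_duration_ms FROM system.query_log"
        f" WHERE type = 'QueryFinish' AND query LIKE '/* bench:{tag} */%'"
        f" ORDER BY event_time DESC LIMIT {tries})"
    )


# Hypothetical usage: the query text here is only a placeholder.
tag = uuid.uuid4().hex
tagged = tag_query("SELECT count() FROM lineorder_wide", tag)
print(tagged)             # send once as warmup, then `tries` more times
print(avg_time_sql(tag))  # run after SYSTEM FLUSH LOGS so the log is up to date
```

Both strings would be passed to clickhouse-client; the sketch only builds them, so it runs without a server.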

The same benchmark was run on m6g and m7g. The results are summarized below:

Total query time for all queries (sec), ClickHouse 23.2:

  • m6i.xlarge: 4.551
  • m6g.xlarge: 5.868
  • m7g.xlarge: 3.865

As you can see, the results are even better than advertised by AWS!

  • With the ClickHouse SSB workload, m7g needs about 34% less total query time than its older brother m6g!
  • It also needs about 15% less time than an Intel m6i instance!
  • On Q2.1 the difference in performance was 44%.
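
The headline comparisons can be cross-checked from the three totals reported above. A minimal sketch: the helper names are illustrative, and since "faster" can mean either time saved or throughput gained, it prints both readings.

```python
# Total SSB query times in seconds, copied from the results above.
totals = {"m6i.xlarge": 4.551, "m6g.xlarge": 5.868, "m7g.xlarge": 3.865}


def time_saved_pct(baseline: float, candidate: float) -> float:
    """Percent of the baseline's total time that the candidate saves."""
    return (baseline - candidate) / baseline * 100


def speedup(baseline: float, candidate: float) -> float:
    """How many times more work the candidate does in the same wall-clock time."""
    return baseline / candidate


m7g = totals["m7g.xlarge"]
for name in ("m6g.xlarge", "m6i.xlarge"):
    base = totals[name]
    print(
        f"m7g.xlarge vs {name}: {time_saved_pct(base, m7g):.1f}% less time, "
        f"{speedup(base, m7g):.2f}x throughput"
    )
# prints:
# m7g.xlarge vs m6g.xlarge: 34.1% less time, 1.52x throughput
# m7g.xlarge vs m6i.xlarge: 15.1% less time, 1.18x throughput
```

Read as throughput, the gain over m6g is about 52%, well above the "up to 25%" that AWS advertises.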

Note that the full experiment from starting ClickHouse to finishing the tests took me about 30 minutes. The speed of testing different hardware configurations in Altinity.Cloud is unbeatable.

Who Pays for the Party?

It is easy to get stellar performance in the cloud nowadays, but it may sometimes cost you a fortune. How does m7g compare to the rest of the m-family? In fact, it does very well. Let’s take the table with the total query time, add the price per hour, and normalize both price and query time to the lowest number:

| | 23.2 m6i.xlarge | 23.2 m6g.xlarge | 23.2 m7g.xlarge |
|---|---|---|---|
| SSB benchmark time (sec) | 4.551 | 5.868 | 3.865 |
| Normalized time | 1.177 | 1.518 | 1 |
| Cost per hour in us-east-1 (USD) | 0.1920 | 0.1540 | 0.1632 |
| Normalized cost | 1.2468 | 1 | 1.0597 |
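
The normalization can be reproduced with a few lines of Python. This is a sketch using the figures from the table; the combined "time x cost" column is an added assumption rather than something measured in this article, and recomputed values may differ from the table in the last digit because of rounding.

```python
# Figures copied from the table: total SSB time (sec) and on-demand price (USD per hour).
instances = {
    "m6i.xlarge": {"time_s": 4.551, "usd_per_hour": 0.1920},
    "m6g.xlarge": {"time_s": 5.868, "usd_per_hour": 0.1540},
    "m7g.xlarge": {"time_s": 3.865, "usd_per_hour": 0.1632},
}


def normalize(values: dict[str, float]) -> dict[str, float]:
    """Divide every value by the smallest one, so the best instance scores 1.0."""
    best = min(values.values())
    return {name: value / best for name, value in values.items()}


norm_time = normalize({n: v["time_s"] for n, v in instances.items()})
norm_cost = normalize({n: v["usd_per_hour"] for n, v in instances.items()})

# Assumed combined metric: normalized time x normalized cost, which is
# proportional to the compute cost of one full benchmark run.
combined = {n: norm_time[n] * norm_cost[n] for n in instances}

for name in instances:
    print(
        f"{name}: time {norm_time[name]:.3f}, "
        f"cost {norm_cost[name]:.4f}, time x cost {combined[name]:.3f}"
    )
```

Under this metric, lower is better, and m7g.xlarge comes out lowest of the three.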

You can draw your own conclusions, but it looks like a gift from AWS to me. There are two caveats though:

  1. The availability of m7g instances is still limited. It should increase over time.
  2. Some ClickHouse features still do not work on ARM, namely HDFS integration and gRPC. We do not use those features in Altinity.Cloud though.

Conclusion

AWS constantly delivers new features in its cloud, and powerful new instance types are a good example. M7g performance is outstanding, and we will be recommending it to Altinity.Cloud users. Please contact us if you want to try it out. The flexibility and agility of Altinity.Cloud make it possible to run ClickHouse with any instance type in any AWS or GCP region, and, with Altinity.Cloud Anywhere, in any customer’s environment.
