What’s a Data Lake and What Does It Mean For My Open Source ClickHouse® Stack?

Recorded: January 22 @ 08:00 am PT
Presenter: Robert Hodges

Data lakes built on open table formats are emerging as the go-to storage for large datasets used in analytics, data science, and AI. This talk explains how data lakes work and how ClickHouse® is integrating with them.

We’ll introduce the key components of data lakes using a concrete example based on Parquet, the Iceberg open table format, and the Iceberg REST catalog. Next, we’ll look at how new ClickHouse® features are being adapted to data lakes, exploring specific issues such as event stream ingest, compaction, and queries.

Finally, we’ll illustrate how to combine ClickHouse® with Apache Spark and Kafka to deliver fast analytics on massive, shared tables. The real-time data lake is arriving, and ClickHouse® is going to be a big part of it.

ClickHouse® is a registered trademark of ClickHouse, Inc.; Altinity is not affiliated with or associated with ClickHouse, Inc.
