Petabyte-Scale Data in Real-Time: ClickHouse, S3 Object Storage, and Data Lakes

These days, many new ClickHouse applications start with petabyte-sized datasets and scale up from there. Fortunately, ClickHouse gives you open-source tools for real-time analytics on big data: MergeTree tables backed by object storage, as well as direct reads from data lakes. We'll start by showing you popular design patterns for ingest, aggregation, and queries on source data. We'll then dig into specific best practices for defining S3 storage policies, reading Parquet data, backing up, monitoring, and setting up high-performance clusters in the cloud. It's all open source and works in any cloud. Join us!