ClickHouse Data Management Internals — Understanding MergeTree Storage, Merges, and Replication
ClickHouse manages huge amounts of data in MergeTree tables. But how is MergeTree organized in storage? What’s a merge and how does it work? How does ClickHouse replicate data and commands across clusters? And what’s a mutation? Our talk explains these features and more. We’ll even discuss mysteries like hard links and the DDL replication queue. Our answers focus on the knowledge necessary to run ClickHouse efficiently and avoid problems. Bring your curiosity and questions!