ClickHouse is famously fast, but a small amount of extra work makes it much faster. Join us for the latest version of our popular talk on single-node ClickHouse performance. We start by examining the system log to see what ClickHouse queries are doing. Then we introduce standard tricks to increase speed: adding CPUs, reducing I/O with filters, restructuring joins, adding indexes, and using materialized views, plus many more. In each case we show how to measure the results of your work. As usual, there will be time for questions at the end. Watch the replay to polish your ClickHouse performance skills!
ClickHouse clusters apply the power of dozens or even hundreds of nodes to vast datasets. In this webinar, we’ll show you how to use the basic tools of replication and sharding to create high performance ClickHouse clusters. We’ll study the plumbing of inserts into sharded datasets and how to determine the correct number of shards for your desired writes. We’ll similarly look at distributed queries and show how to scale read capacity to desired levels using replicas. Finally, we’ll look at techniques for scaling up both shards and replicas to accommodate growth in your dataset.
You are about to deploy ClickHouse into production. Congratulations! But what about monitoring? In this webinar we will introduce how to track the health of individual ClickHouse nodes as well as clusters. We’ll describe the available monitoring data, how to collect and store measurements, and how to display them graphically using Grafana.
Materialized views are the killer feature of ClickHouse, and the Altinity 2019 webinar on how they work was very popular. Watch this replay to learn how to use materialized views to speed up queries hundreds of times. We’ll cover basic design, last point queries, using TTLs to drop source data, counting unique values, and other useful tricks. Finally, we’ll cover recent improvements that make materialized views more useful than ever.
Apache Kafka is a popular way to load large data volumes quickly to ClickHouse. In this webinar we will cover best practices for integrating Kafka and ClickHouse including setup of Kafka clusters, defining materialized views to pull data into ClickHouse, and organization of target tables.
Log messages are one of the most important types of application data. ClickHouse is very good at storing log data; many SaaS applications use it under the covers. In this webinar we will show examples of different application logs and how to design tables to store them. Options include using typed columns, strings, JSON, or key-value pair arrays. We’ll also discuss how to use materialized columns to improve filter speed, as well as techniques to tune index granularity for wide rows.
New users of ClickHouse love the speed but may run into a few surprises when designing applications. Column storage turns classic SQL design precepts on their heads. This talk shares our favorite tricks for building great applications. We’ll talk about fact tables and dimensions, materialized views, codecs, arrays, and skip indexes, to name a few of our favorites. We’ll show examples of each and also reserve time to handle questions. Join us to take your next step to ClickHouse guruhood!
Successful MySQL applications have a very common scale problem: how to provide analytics when data grows too large for MySQL to handle. Fortunately, there is a good answer. ClickHouse is a popular, open source column store you can use to add fast analytics to MySQL applications. We’ll start with an introduction to ClickHouse for MySQL users showing familiar SQL syntax, take you through important features like support for the MySQL wire protocol, and finish with design approaches that allow you to extend MySQL applications with analytics that can handle billions of rows.
ClickHouse is famous for speed. That said, you can almost always make it faster! This webinar uses examples to teach you how to deduce what queries are actually doing by reading the system log and system tables. We’ll then explore standard ways to increase query speed: data types and encodings, filtering, join reordering, skip indexes, materialized views, and session parameters, to name just a few. We hope you’ll enjoy the first step to becoming a ClickHouse performance guru!
Built-in replication is a powerful ClickHouse feature that helps scale data warehouse performance as well as ensure high availability. This webinar will introduce how replication works internally, explain configuration of clusters with replicas, and show you how to set up and manage ZooKeeper, which is necessary for replication to function. We’ll finish off by showing useful replication tricks, such as utilizing replication to migrate data between hosts.
Materialized views are a killer feature of ClickHouse that can speed up queries 200X or more. Our webinar will teach you how to use this potent tool, starting with how to create materialized views and load data. We’ll then walk through cookbook examples to solve practical problems like deriving aggregates that outlive base data, answering last point queries, and using AggregateFunctions, a special ClickHouse feature, to handle problems like counting unique values.
This talk shows how to get a sub-second response from datasets containing a billion rows or more. We’ll start with defining the schema and loading data quickly in parallel. We will then introduce tricks like the LowCardinality data type, ASOF joins, and materialized views that can reduce query response to thousandths of a second. Finally, we’ll show you metrics and logging to analyze query performance. After this talk you’ll be ready for your first billion rows and many more afterwards.