
Build a Low-Cost, High-Performance Analytic Platform with Kubernetes and Open Source

Tired of big bills from Snowflake and BigQuery? Want to keep data in-house? Trying to avoid vendor lock-in? Solve these problems and more by building your own cloud-native analytic service. We start with the architecture of closed-source cloud analytic databases like Snowflake. We then introduce an equally capable design for real-time analytics built entirely on robust open source. Next, we stand up an example using Kubernetes for the run-time, ClickHouse as the query engine, and infrastructure-as-code to deploy apps. Ingest, visualization, and system services are all included. The talk ends with cost numbers to prove that you can operate the service at a fraction of the cost of your current cloud database.

(Code used in the platform demo is open source and available on GitHub.)

Using S3 Storage and ClickHouse: Basic and Advanced Wizardry

S3-compatible object storage can reduce cost and improve flexibility in ClickHouse analytics. This talk helps develop the legerdemain to use it well. We'll start with standard use patterns like tiered storage and archiving to Parquet. Next, we'll dig into the details of using object storage in ClickHouse tables: storage policies, authentication, caching, and zero-copy replication. Finally, we'll present expert tips on topics like S3 cost optimization, designing for mutable data, backup, and others. Join our expert presenters and bring your questions!

Snowflake, BigQuery, or ClickHouse? Pro Tricks to Build Cost-Efficient Analytics for Any Business

Do you ever look at your bill for Snowflake or BigQuery and just sigh? This talk is for you. We'll explain how pricing works for popular analytic databases and how to get the best deal. Then we'll look at how to build an alternative using open-source ClickHouse data warehouses. As the pros say, open source may be free but it ain't cheap! We'll teach you the tricks to build your own ClickHouse analytic stack that's less expensive and faster than Snowflake. Join us to become a wizard of cloud cost management.

Keeping Your Cloud Native Data Safe: A Common-Sense Guide to Kubernetes, ClickHouse, and Security

Kubernetes is now the default environment for ClickHouse-based analytic platforms. But how do you keep your data safe? Our talk is a practical walkthrough of security for ClickHouse on Kubernetes. We'll show you how to divide the security problem into easily understood pieces. Then we'll walk through practical ways you can protect data using the Altinity Operator for ClickHouse, Kubernetes Secrets, managed Kubernetes, and more. You don't have to be a security wizard to protect your ClickHouse data. Common sense and a little organization will do. Join us to find out how!

Safety First! Using clickhouse-backup for ClickHouse Backup and Restore

Backups protect ClickHouse data in cases ranging from accidentally dropping a database to data centers burning down. In this webinar, we'll compare available backup options for ClickHouse, then zero in on clickhouse-backup, a popular open-source project that Altinity maintains. The presentation will discuss how clickhouse-backup works, standard backup and restore commands, and tips for reliable operation in production systems. We'll end with Q&A. Join us if you love your ClickHouse data and want to ensure it sticks around!

ClickHouse Data Management Internals — Understanding MergeTree Storage, Merges, and Replication

ClickHouse manages huge amounts of data in MergeTree tables. But how is MergeTree organized in storage? What's a merge and how does it work? How does ClickHouse replicate data and commands across clusters? And what's a mutation? Our talk explains these features and more. We'll even discuss mysteries like hard links and the DDL replication queue. Our answers focus on the knowledge necessary to run ClickHouse efficiently and avoid problems. Bring your curiosity and questions!

ClickHouse and Apache Parquet: Past, Present, and Future

Apache Parquet, a popular columnar storage format, has become a linchpin for data-driven organizations seeking efficient and scalable solutions.
ClickHouse can read and write Parquet data at very high speeds, which opens up interesting integration possibilities.
In this webinar, we will walk through the existing ClickHouse functionality, give an overview of performance improvements over the last year, and discuss how ClickHouse can do even better.
Presenter: Alexander Zaitsev, Altinity co-founder and CTO. ClickHouse expert since 2016.

Eureka! 8 developer tricks for running ClickHouse on Kubernetes

Kubernetes is a great platform for ClickHouse with many success stories. What are the tricks that make it possible? This webinar shows 8 practices to help ClickHouse developers build faster and more cost-efficient analytics on Kubernetes. We'll start with using the Altinity operator, work our way through various techniques to scale up or shut off compute, and end with advice on zero-downtime upgrades. Along the way we'll review key Kubernetes features like node autoscaling. Join us to learn more about data on Kubernetes and impress your friends!

Deep Dive on ClickHouse Sharding and Replication Webinar

ClickHouse works out of the box on a single machine, but it gets complicated in a cluster. ClickHouse provides two scaling-out dimensions -- sharding and replication -- and understanding when and how those should be applied is vital for production and high-load applications. In this webinar we will focus on ClickHouse cluster setup and configuration, cluster operation in public clouds, executing distributed queries, hedged requests, and more.

ClickHouse Performance Master Class – Tools and Techniques to Speed up any ClickHouse App Webinar

ClickHouse gives impressive performance out of the box. Our webinar will show you how to make it amazing. We start by providing a framework for performance that includes basic drivers like I/O and compute, as well as ClickHouse structures like compression and indexes. We'll also discuss tools to evaluate performance including ClickHouse system tables and EXPLAIN. We'll then demonstrate how to evaluate and improve performance for common query use cases ranging from MergeTree data on block storage to Parquet files in data lakes. Join our webinar to become a master at diagnosing query bottlenecks and curing them quickly.

Petabyte-Scale Data in Real-Time: ClickHouse, S3 Object Storage, and Data Lakes

These days new ClickHouse applications start with petabyte-sized datasets and scale up from there. Fortunately, ClickHouse gives you open-source tools for real-time analytics on big data: MergeTree backed by object storage as well as reading from data lakes. We'll start by showing you popular design patterns for ingest, aggregation, and queries on source data. We'll then dig into specific best practices for defining S3 storage policies, reading from Parquet data, backing up, monitoring, and setting up high-performance clusters in the cloud. It's all open source and works in any cloud. Join us!

Building a Kubernetes Platform for Trillion Row Tables: CN Data at Scale

Kubernetes is an amazing platform for data, but it requires focus to deliver production apps, especially if they are large. In this talk I'll share lessons from operating a database SaaS for ClickHouse as well as helping customers do it themselves. We'll describe some of the attributes of large analytic applications, then discuss common approaches to building platforms to operate them. Our talk will cover use of standard tools like Terraform, Helm, and Argo CD, which work together to bring up Kubernetes and the apps that run inside it. Along the way we'll describe various "aha!" experiences, tools, and even team organization to support large database applications on Kubernetes.