SQL databases like ClickHouse love rectangular data. But what if you can't define your table schema in advance?
Olga Silyutina's talk shows how Sumsub solved this classic problem for product analytics using a schema-agnostic design. The Sumsub platform ingests a wide range of event types for delivery to analytics. ClickHouse materialized views transform the different event types into a flattened form that's convenient for analysis. Apache Airflow brings the resulting data to Superset for visualization.
Olga's talk provides detailed design and code examples for each step of this innovative solution.
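To give a rough feel for what "flattened form" can mean, here is a minimal Python sketch that turns a nested event payload into one (event type, path, value) row per leaf field. It is not Sumsub's code: the talk describes doing this transformation with ClickHouse materialized views, and the function names `flatten` and `event_to_rows`, the dotted path format, and the sample event are all assumptions made for illustration.

```python
from typing import Any, Iterator


def flatten(value: Any, prefix: str = "") -> Iterator[tuple[str, str]]:
    """Yield (path, value) pairs for every leaf of a nested JSON-like value."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Join nested keys with dots, e.g. "user.id".
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # Mark list positions with an index, e.g. "user.tags[0]".
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        # Leaves are stored as strings so every event type fits the same columns.
        yield prefix, str(value)


def event_to_rows(event_type: str, payload: dict) -> list[tuple[str, str, str]]:
    """Return rectangular rows (event_type, path, value) for one event."""
    return [(event_type, path, val) for path, val in flatten(payload)]


if __name__ == "__main__":
    # Hypothetical event, for illustration only.
    sample = {"user": {"id": 7, "tags": ["a", "b"]}, "ok": True}
    for row in event_to_rows("signup", sample):
        print(row)
```

Because every event collapses to the same three columns, new event types do not require a schema change; the trade-off in this sketch is that all values become strings and need casting at query time.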
Tired of big bills from Snowflake and BigQuery? Want to keep data in-house? Trying to avoid vendor lock-in? Solve these problems and more by building your own cloud-native analytic service. We start with the architecture of closed-source cloud analytic databases like Snowflake. We then introduce an equally capable design for real-time analytics built entirely on robust open source software. Next, we stand up an example using Kubernetes for the runtime, ClickHouse as the query engine, and infrastructure-as-code to deploy apps. Ingest, visualization, and system services are all included. The talk ends with cost numbers to show that you can operate the service at a fraction of the cost of your current cloud database.
(Code used in the platform demo is open source and available on GitHub.)