ClickHouse and ProxySQL queries rewrite

ProxySQL is a popular open-source, high-performance, protocol-aware proxy server for MySQL and its forks. Since September 2017, ProxySQL has supported ClickHouse as a backend, so clients can connect to ClickHouse via the MySQL protocol. In practice, this lets MySQL-aware applications start using ClickHouse without changes to the client library.

To work around some limitations of this approach, ProxySQL creator René Cannaò added query rewrite functionality. With his permission, we cross-post his article describing the new functionality on our blog.

Archiving MySQL Tables in ClickHouse

Why Archive? Hard drives are cheap nowadays, but storing lots of data in MySQL is not practical and can cause all sorts of performance bottlenecks. 

In this article, Percona blogger Alexander Rubin discusses archiving MySQL tables in ClickHouse for storage and analytics.

Aggregate MySQL data at high speed with ClickHouse

Feb 12, 2018
There are multiple ways ClickHouse and MySQL can work together: External Dictionaries, ProxySQL support, or realtime streaming of MySQL binary logs into ClickHouse. A few weeks ago, the ClickHouse team released the mysql() table function, which allows accessing MySQL data directly from ClickHouse. This opens up a number of interesting capabilities. We came across a blog article in Japanese by Mikage Sawatari that tests this new way of integration, and translated it for our blog with some minor edits.

Big Data Analysis in Digital Marketing Research

Dec 6, 2017
Christian Hotz-Behofsits, Teaching & Research Associate at Vienna University of Business and Economics, is one of the creators of the RClickhouse package for R, which we recently introduced on our blog. In this article, he describes the data analysis challenges his group is facing and how ClickHouse helps in their research.

ClickHouse Primary Keys

This is a cross-post from: (https://medium.com/@f1yegor/clickhouse-primary-keys-2cf2a45d7324)

Special thanks to Alexey Milovidov, ClickHouse developer, for providing material for this article.

Recently I dived deep into ClickHouse. ClickHouse is a column-store database by Yandex with great performance for analytical queries. For example, see the benchmark and the post by Mark Litwintschik.

Big Dataset: All Reddit Comments – Analyzing with ClickHouse

Oct 5, 2017   
In this blog, I’ll use ClickHouse and Tabix to look at a new, very large dataset for research.

It is an interesting dataset I found recently that has been available since 2015. This is Reddit’s comments and submissions dataset, made possible thanks to Reddit’s generous API. 

Massive Parallel Log Processing with ClickHouse

In this blog, I’ll look at how to use ClickHouse for parallel log processing and show how it can perform this task efficiently. ClickHouse is attractive because it has multi-core parallel query processing.

Nested Data Structures in ClickHouse

In this blog post, we’ll look at nested data structures in ClickHouse for MySQL and how this can be used with PMM to look at queries.