  • Migration to ClickHouse

    Oct 23, 2017   
    ClickHouse is an excellent analytics database choice, not just for startups but also for companies that have already invested a significant amount of resources in their analytics solutions but are not completely satisfied with the results. In this article we will discuss how and when companies consider a ClickHouse migration project, and what challenges they may expect. We do not disclose any names, but every example has a real-world prototype.

  • ClickHouse release 1.1.54292

      The new deb packages are available from Yandex repo and RPMs from Altinity repo. New features: Added the pointInPolygon function for working with coordinates on a coordinate plane. Added the sumMap aggregate function for calculating the sum of arrays, similar to SummingMergeTree. Added the trunc function. Improved performance of the rounding functions (round, floor,[…]

  • Big Dataset: All Reddit Comments – Analyzing with ClickHouse

    Oct 5, 2017   
    In this blog, I’ll use ClickHouse and Tabix to look at a new very large dataset for research.

    It is an interesting dataset I found recently that has been available since 2015. This is Reddit’s comments and submissions dataset, made possible thanks to Reddit’s generous API. 

  • ClickHouse release 1.1.54289

      The new deb packages are available from Yandex repo and RPMs from Altinity repo. New features: SYSTEM queries for server administration: SYSTEM RELOAD DICTIONARY, SYSTEM RELOAD DICTIONARIES, SYSTEM DROP DNS CACHE, SYSTEM SHUTDOWN, SYSTEM KILL. Added functions for working with arrays: concat, arraySlice, arrayPushBack, arrayPushFront, arrayPopBack, arrayPopFront. Added the root and identity parameters for[…]

  • Massive Parallel Log Processing with ClickHouse

    In this blog post, I’ll look at how to use ClickHouse for parallel log processing and show how it can perform this task efficiently. ClickHouse is attractive because it has multi-core parallel query processing.

  • ClickHouse New Releases and ChangeLog

    It took a while for Yandex to respond to the numerous community requests for release notes, but they are finally here. On August 21, Yandex created a changelog file covering releases starting from 1.1.54245, and it has been kept up to date since.

  • Nested Data Structures in ClickHouse

    In this blog post, we’ll look at nested data structures in ClickHouse for MySQL and how this can be used with PMM to look at queries.

  • Who and Why is Using ClickHouse

    In this article we discover several companies that are using ClickHouse in production for different use cases. 

  • ClickHouse 1.1.54245 release

      A few weeks ago Yandex released ClickHouse 1.1.54245, which included a number of features requested by the community. The new deb packages are available from Yandex repo and RPMs from Altinity repo. So let’s see what’s new in this release. New features: Distributed DDLs (see CREATE TABLE ON CLUSTER). New table engine Dictionary that allows[…]

  • ClickHouse AggregateFunctions and Aggregate State

    Jul 10, 2017   
    A guest article from ClickHouse evangelist Yegor Andreenko (@f1yegor) about an interesting extension of aggregate functions: aggregate states, which can be used to pre-aggregate data that usually has to be kept raw, such as uniques.

  • ClickHouse vs Amazon RedShift Benchmark #2: STAR2002 dataset

    Jul 3, 2017   
    We continue to benchmark ClickHouse against other analytic DBMSs. We were inspired by the benchmark with the star2002 experiment dataset described here, and decided to replicate it using ClickHouse. That gives another interesting comparison with Amazon RedShift.

  • ClickHouse in a general analytical workload (based on Star Schema Benchmark)

    Jun 26, 2017   
    In this blog post, we’ll look at how ClickHouse performs in a general analytical workload using the star schema benchmark test.

  • ClickHouse vs Amazon RedShift Benchmark

    Jun 19, 2017   
    We continue benchmarking ClickHouse. In this article we discuss a benchmark against Amazon RedShift.

  • ClickHouse Data Distribution

    Jun 5, 2017   
    ClickHouse data distribution and replication are fundamental techniques for building reliable and scalable ClickHouse systems. In this article we will explain how they work from the user's perspective.

  • ClickHouse Release Notes

    ClickHouse is an excellent DBMS with very smart people working on it. Unfortunately, it still lacks some important communication procedures, and arguably the most wanted one is release notes.

    Altinity's mission is to make ClickHouse easy to use for everybody, and we will try to fill the gap between Yandex and end users.

    Below are the release notes that the Yandex team presented at the recent meetup in Ekaterinburg. They include changes and new features that appeared in the latest 1.1.54231 release, as well as some others that were not yet documented or publicly explained.
     

  • Altinity ClickHouse Repository

    We are pleased to announce that the Altinity repository hosting ClickHouse RHEL/CentOS RPMs, as well as packages for Fedora, is available at packagecloud.

  • ClickHouse Dictionaries Benchmarking

    Apr 26, 2017   
    There are already a few ClickHouse benchmarks on the web. Most of them use a denormalized database schema. However, denormalization is not always possible or desirable. In this article we will compare query performance between denormalized and normalized schemas, where normalization is modelled using the unique ClickHouse external dictionaries feature.

  • ClickHouse Dictionaries Explained

    Apr 12, 2017   
    One of the most useful ClickHouse features is external dictionaries. They are extremely powerful and, if used efficiently, may lead to quite elegant designs. I will lead you through the dictionaries using a few examples that highlight basic and advanced usage scenarios. So let’s begin.