Tableau with ClickHouse

By Andrey Monakhov on July 19th, 2018

Tableau with ClickHouse

Andrey Monakhov ClickHouseTableauTutorial

Preface

ClickHouse users often require data to be accessed in a user-friendly way. There is a number of tools that can display big data using visualization effects, charts, filters, etc. Tableau is one of the popular ones. In this article, we will explain how Tableau can be used with ClickHouse data source.

Installation

For the purpose of this article, we will use a free trial distribution of the desktop version for Windows platform (https://www.tableau.com/products/desktop).

So download it by specifying your email address and install it once a download is complete. It will require to fill up the form before you will start using it.

Driver Configuration

In order to fetch the data from ClickHouse server, it is required to install and configure the proper driver for that particular connection. This driver could be found on Yandex github repo here:

https://github.com/yandex/clickhouse-odbc/releases/tag/2018-05-15

Pick a correct version of it and install it.

Once it’s installed you can configure it using Windows ODBC Admin Tool called “ODBC Data Sources (XX-bit)”. Important Note: Instead of XX use the same number as you have selected for connection driver.

Click Add… button

Pick the correct driver

After these steps restart Tableau application if it was running.

Here is another article on how to do it, made by MySQL Team: https://dev.mysql.com/doc/connector-odbc/en/connector-odbc-configuration-dsn-windows-5-2.html

Application Configuration

Add data source using ODBC driver. On the main screen of Tableau application, there is a list of supported types of Databases. We pick the one that says “Other Databases (ODBC)” and select a previously configured ClickHouse server connection from the list of DSNs.

The next step is to setup data source using a selected connection.

It is as simple as it gets. Just click the Search button (a magnifier icon) and the application will connect to the database and fetch the list of available tables. From that list, we can either drag and drop one table to the work area or double-click on it to get the same results.

We can add more tables and configure joins right away to prepare a complicated data source but for now, we leave only one table there.

Once data source setup is complete we can start doing some stuff with the actual data.

Custom SQL Definition

In the example above we used a simple table pick from the list of available tables. What if we have a more complicated use case with extra fields or joins? In that case, we could use a custom query definition. So here is a simple definition of a query with an additional field picked from an external dictionary.

Here we define a new field with alias ‘t’ along with other fields in the table ‘yellow_tripdata’ which will contain values of ‘max_temperature’ from an external dictionary called ‘weather’.

And now we can pick this extra field from dictionary right to our Measures and into the chart. We’re going to take a closer look at setting up Dimensions, Measures, Filters in the next section.

Dataset Description

We are going to use one of the public datasets based on flights activity in the US. This dataset is also used by Yandex in their ClickHouse documentation.

Basic info about the dataset: it has one table with 30+ columns representing flight departure and arrival date time, distance between origin and destination spots, travel duration, etc.

Use Case

As for our use case, let’s state that we need to measure dynamics of the number of flights, complete from California to New York in a period of time. Also, let’s check whether the average distance was the same for all flights.

We have configured a data source, so we can fetch data from only one table. Let’s now make a chart that displays the number of flights over a period of time.

In order to do so in Tableau, one should go to a “Sheet” window of the current Workbook first.

On a brand new sheet you can see page areas for columns and rows definitions.
Add a Dimension to the Column area (drag and drop) and Tableau will divide fetched data by the values from that Dimension.
By adding a Measure to the Rows area you will get different metric values grouped by added Dimension.

In our case, we picked WEEK(FlightDate) as Column Dimension and we picked two Measures: Distance and COUNT(*) – which is basically Number of Flights for that period.

Now we have automatically generated and live chart that fetches data from our ClickHouse data source and displays all that in a visual way.

Everything is set and we can add some Filters. As we said before we need to filter out only those flights that have flown from California to New York state. To do so we can simply drag and drop Dimensions “Origin State” and “Dest State” respectively to the Filters area of the Sheet.
When we pull one of our filters, Tableau will show a dialog where we can setup values for that particular filter.

And here we set up both filters:

So here it is! Just a few clicks and the data from ClickHouse is displayed in a user-friendly way in Tableau.

 
  1. Does this solution work on Tableau server? I’ve published workbook on server but constantly getting this error HTTP status code: 500Received error:Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 39 (line 2, col 30):The table "table name" does not exist.
    In ODBC.ini I assigned default database. And queries in isql works fine with out specifying database before table name

    1. Hello Roman,
      What version of Tableau Server are you using? We have published our data source described in this article as ODBC Data Source for ClickHouse using version 2018.2.0 (20182.18.0812) of Tableau Server and it seems to work just fine.

      1. Hi Roman and Andrey,Thanks for inspiring me to try again, after last year it was absolutely impossible to get anywhere with it.I also have a problem with the server (2018.3.2 (20183.18.1214.0808) 64-bit Linux)I’ve followed your instructions and created a workbook with a live data source. Published it to the server, but when I try to access a view, I get the following error message, although my server administrator already installed the Clickhouse ODBC drivers to the Server.[unixODBC][Driver Manager]Can't open lib 'ClickHouse Unicode' : file not found
        Generic ODBC requires additional configuration. The driver and DSN (data source name) must be installed and configured to match the connection.
        Unable to connect using the DSN named "Clickhouse_w". Check that the DSN exists and is a valid connection.

  2. Hi guys!I’m facing some difficulties with extracting and publishing datasources for large queries.I’m using Custom SQL which results in let’s say 10 mln rows with 10 columns. After ~3 mln rows are imported the extract creation is being interrupted by error like this:
    An error occurred while communicating with the Other Databases (ODBC) data source ‘Custom_SQL_Query (stats)’.Bad Connection: Tableau could not connect to the data source.Incomplete result received. Want size=4.
    Did anyone encountered same problem?


Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: