Access control is one of the essential features of database management. Starting in late 2019 ClickHouse contributor Vitaly Baranov began to introduce robust, full-featured Role Based Access Control (RBAC). As a result of this work–which included a huge number of tests implemented by the Altinity QA team–ClickHouse can now rightfully boast enterprise level access control. Best of all, the commands are all in SQL.
User management is the front gate of RBAC. It controls access to ClickHouse itself. This article digs into new commands like CREATE USER that allow you to create, change, and delete users conveniently. We’ll focus on ways to control authentication for single ClickHouse servers.
Along the way the article packs in tips to get the most out of SQL user management without opening up new security holes along the way. Let’s go!
Old-School ClickHouse User Management
ClickHouse users were originally defined using XML files. This still works today. All you have to do is put the following definition in a file called root.xml and place the file in /etc/clickhouse-server/users.d.
<yandex> <users> <root> <password_sha256_hex>2bb80d537b1da3e38bd30361aa855686bde0eacd7162fef6a25fe97bf527a25b</password_sha256_hex> <networks> <ip>127.0.0.1</ip> </networks> <profile>default</profile> <quota>default</quota> <access_management>1</access_management> </root> </users> </yandex>
ClickHouse will recognize the file automatically without restarting and you can login using a command like the following:
clickhouse-client --user=root --password=secret
The root user definition we created has three interesting features.
- It stores the password as a SHA256 hash, so the actual password value secret is not immediately evident. On the other hand, it’s resident on the file system and subject to dictionary attack if it falls into the hands of an evildoer.
- It uses a network mask to restrict logins to the local host.
- It enables access management by putting the value 1 in the <access_management> tag.
The last is the most important for the rest of the article. It means that this login, root, can issue SQL commands related to user management. Let’s login using root as shown above and start trying it out.
SQL User Management
ClickHouse releases after 20.5 have SQL commands that manage the full life cycle of operations on users. For instance, we can create a user as follows.
ch-1 :) CREATE USER IF NOT EXISTS example IDENTIFIED WITH SHA256_PASSWORD BY 'secret'; CREATE USER IF NOT EXISTS example IDENTIFIED WITH sha256_hash BY '2BB80D537B1DA3E38BD30361AA855686BDE0EACD7162FEF6A25FE97BF527A25B' 0 rows in set. Elapsed: 0.001 sec.
ClickHouse generates the SHA-256 hash automatically, which is convenient. We can change the password just as easily using the ALTER USER command. This will generate another SHA-256 hash and store it properly.
ALTER USER example IDENTIFIED WITH SHA256_PASSWORD BY 'topsecret';
If you login now you’ll need to use topsecret as the password. We can easily see all the users on the system as follows.
SHOW USERS ┌─name─────┐ │ default │ │ example │ │ root │ └──────────┘
We can also see how a particular user was created.
SHOW CREATE USER example ┌─CREATE USER example─────────────────────────────────┐ │ CREATE USER example IDENTIFIED WITH sha256_password │ └─────────────────────────────────────────────────────┘
Finally, we can discard the user when it is no longer required.
DROP USER example
MySQL users will notice that user management commands have a familiar ring to them. MySQL syntax has been a strong influence on ClickHouse. The syntax is quite friendly in my opinion.
A more realistic example
We have shown very simple commands so far, but CREATE USER and related commands are quite powerful and cover the complete range of account management including user profiles, quotas, and roles. The following commands create a user settings profile, a role for read-only queries, a quota for the role, and a user that has all of these.
CREATE SETTINGS PROFILE IF NOT EXISTS ro_profile SETTINGS max_threads = 2 MIN 1 MAX 4, max_memory_usage = 10000000 MIN 1000000 MAX 20000000 READONLY CREATE ROLE IF NOT EXISTS ro_role SETTINGS PROFILE 'ro_profile' CREATE QUOTA IF NOT EXISTS batch_quota FOR INTERVAL 3600 second MAX queries 60, MAX result_rows 1000000 TO ro_role CREATE USER ro_user IDENTIFIED WITH SHA256_PASSWORD BY 'top_secret' DEFAULT ROLE ro_role
In case it’s not obvious how the entities are related, here’s a picture:
You can now use login to this user and check settings and privileges assigned to the user via the role.
clickhouse-client --user=ro_user --password=top_secret
We can run a quick query on the settings to ensure the profile is set correctly.
SELECT name, value, min, max FROM system.settings WHERE name IN ('max_threads', 'max_memory_usage') ┌─name─────────────┬─value────┬─min─────┬─max──────┐ │ max_threads │ 2 │ 1 │ 4 │ │ max_memory_usage │ 10000000 │ 1000000 │ 20000000 │ └──────────────────┴──────────┴─────────┴──────────┘
We can also try to create a table, to prove the account lacks privileges to do so.
CREATE TABLE foo (`id` UInt32) ENGINE = tinylog Received exception from server (version 21.1.2): Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: ro_user: Not enough privileges. To execute this query it's necessary to have grant CREATE TABLE ON default.foo.
Here’s some homework: figure out how to confirm that ro_user has the quota. I did it by granting SHOW QUOTAS ON *.* to ro_user. That way ro_user can run SHOW QUOTA and see the limits. There’s probably another way to do it.
The introduction of RBAC authorization opens opportunities for controlling access to data that did not previously exist. The most obvious improvement is that you can now manage user access and privileges over the network without access to the local file system at all. As the foregoing example shows it also opens up the possibility of multi-tenant operation on a single ClickHouse. Exploring this will have to wait for a future article.
The clickhouse.tech website has excellent reference documentation on RBAC commands, so feel free to explore. We’ll meanwhile return to our theme of managing user access to ClickHouse.
Security policies around user passwords
Any system that relies on local passwords tends to create headaches related to security management. Old-school XML user definition files store passwords on disk, which tends to trigger corporate infosec teams that hear about it. Does SQL user management help? Let’s look!
To start, we’ll rerun the command to create an example user.
CREATE USER IF NOT EXISTS example IDENTIFIED WITH SHA256_PASSWORD BY 'secret';
ClickHouse stores the resulting user definition in /var/lib/clickhouse/access. For instance, after running the previous command I issue the following command to see what ClickHouse did.
ATTACH USER example IDENTIFIED WITH sha256_hash BY '2BB80D537B1DA3E38BD30361AA855686BDE0EACD7162FEF6A25FE97BF527A25B';
So the short answer is that passwords are still on disk, but in a new location. ClickHouse ensures that this directory has permissions such that only root or the clickhouse user can access it. However, if you backup /var/lib/clickhouse using a file system snapshot (for example), your snapshot will contain passwords either in plain text or hashed form.
You can move user management files to another location by updating the <local_directory> tag in /etc/clickhouse-server/config.xml. Be sure to check file permissions in the new location to avoid leaking passwords.
Speaking of passwords, there are a number of options for passwords besides SHA-256 hashing. Here are a couple examples that you are unlikely to use in most use cases.
-- Password stored in clear text; DON’T DO THIS! CREATE USER IF NOT EXISTS example_text IDENTIFIED WITH PLAINTEXT_PASSWORD BY 'secret'; -- Double SHA-1 for MySQL compatibility. CREATE USER IF NOT EXISTS example_mysql IDENTIFIED WITH DOUBLE_SHA1_PASSWORD BY 'secret';
The first example stores passwords on disk in clear text where anyone with file read permissions can see them. There’s no reason to do this. The last example shows how to create an account that will work with MySQL clients. It is there to support MySQL clients that use the mysql_native_password protocol.
Here’s a final example: no password at all!
-- No password required to login. CREATE USER IF NOT EXISTS example_no_pw IDENTIFIED WITH NO_PASSWORD;
If you use this approach you must limit access some other way or the data should be very low value. We’ll show how to limit access using network filters in the next section.
Overall SHA-256 passwords are the most secure option for local accounts. If you are still not comfortable with having passwords stored locally on disk, you are in luck. ClickHouse now offers another choice. You can store passwords and even entire user definitions in a central LDAP server. We’ll discuss that in a future blog article.
Protecting accounts with network masks
In our last password example we showed an account with no password. This might seem useless from a security point of view, but it’s actually quite practical in many cases. ClickHouse allows you to limit access using network masks.
Here’s a simple example: a user that can only login from localhost.
CREATE USER example_hostlocal IDENTIFIED WITH NO_PASSWORD HOST LOCAL
Login from other hosts will fail with a suitably ambiguous error message:
clickhouse-client --user=example_hostlocal --host=ch-1 ClickHouse client version 22.214.171.124 (official build). Connecting to ch-1:9000 as user example_hostlocal. Code: 516. DB::Exception: Received from ch-1:9000. DB::Exception: example_hostlocal: Authentication failed: password is incorrect or there is no user with such name.
ClickHouse network filters have a number of useful variations. Here’s a user that is limited to logins from a specific subnet.
CREATE USER example_subnet IDENTIFIED WITH NO_PASSWORD HOST IP '10.2.0.0/24'
Next, here’s a user that can login from IP addresses that correspond to specific hosts. Network masks can be combined by separating different filters with commas. You can add as many filters as you want.
CREATE USER example_host IDENTIFIED WITH NO_PASSWORD HOST NAME 'ch-1', NAME 'ch-2', NAME 'ch-3'
The last example uses a regular expression to match logins from hosts ch-1, ch-2, and ch-3. It also has an extra HOST LOCAL mask to permit login from the local host. This defines a user that may login from the three named hosts, plus locally from the host where the server itself runs.
CREATE USER example_regexp IDENTIFIED WITH NO_PASSWORD HOST LOCAL, REGEXP '^ch-$'
One final note: beware of proxies! If you put an NGINX reverse proxy in front of the ClickHouse server and connect through that to HTTP port 8123 or 8443, ClickHouse will see the network address of the proxy host, not the original client. The same applies if you do TLS termination before the connection reaches ClickHouse. You will see the termination host, not the client program host.
In summary, network filters are useful but do not offer complete access protection. It’s better to supplement them with passwords unless you have complete control over the network path.
SQL-based user management and role-based access control represent a tremendous step forward for ClickHouse security. This article focused on access control to single ClickHouse servers. In a future blog article we will discuss user management on ClickHouse clusters.
We have barely touched the many mechanisms ClickHouse now offers to protect data. There is much more. My colleague Vasily Nemkov just published an article on AES encryption functions. We’ll be publishing further articles about LDAP, RBAC, Kerberos support (it’s happening!), and many other topics. For a complete overview that also includes privacy issues, check out our recent webinar ClickHouse Defense against the Dark Arts–Intro to Security and Privacy.
2021 looks like a great year for ClickHouse security. Contact us at [email protected] if you have questions about how to solve problems related to your use cases. We also have a skilled engineering team ready to implement new features if something is missing. Either way, we will be delighted to talk!