Skip to main content

Can you PIVOT in ClickHouse?

ClickHouse doesn't have a PIVOT clause, but we can get close to this functionality using aggregate function combinators. Let's see how to do this using the UK housing prices dataset.

Introduction

ClickHouse has no pivot operator, but we can achieve similar behavior using aggregate function combinators and, in particular, the ones with the -Map suffix.

In this article, we're going to learn how to do that. There is also a video covering the same material, which you can see below:

Understanding aggregate function combinators

Let's start with a simple example. We're going to use clickhouse-local, which you can launch by running the following:

The following query calls the sumMap function, which takes in a map and sums the values of each key:

This isn't a particularly interesting example as it returns the same map that we passed in. Let's now call sumMap over multiple rows of maps;

The key ClickHouse appeared on both rows and its values have been summed. The key ClickBench was only present on one line, so it sums a single value, which returns that value!

We can also use maxMap to find the maximum values per key:

Or we can use avgMap to find the average value per key:

Hopefully that's given you an idea of how these function combinators work.

Real-World Application: UK housing prices dataset

Now we're going to use them on a bigger dataset in the ClickHouse SQL playground.

We can connect to the playground using clickhouse-client:

We're going to query the uk_price_paid table, so let's explore the data in that table:

We can see above that the table contains various fields related to property sales in the UK.

Grouping and aggregating by decade

Let's work out the median prices grouped by county for each decade in the dataset:

Filtering results

We can filter the results to only include data from 2010 and on:

Combining multiple aggregations

And if we want to find the maximum price per decade we can do that using the maxMap function that we saw earlier:

Applying functions to map values

Alternatively, we can compute the average price using avgMap. Those values have a lot of decimal points, which we can clean up by using the mapApply function to call the floor function on each value in the map:

Flexible grouping: counties, districts, and postcodes

Let's try to group by some different fields. This time we're going to compute the median price per decade grouped by county and district:

We could also choose to group by year and then concatenate postcode1 and postcode2 in the map:

· 7 min read