BFS zero-to-hero, part 7: Crosstab
Implementing the logic of a tiny crosstab statistics engine
Part 1 | Part 2 | Parts 3 & 4 | Part 5 | Part 6
Challenge 7: Crosstabs everywhere
This is the final installment of the BFS zero-to-hero series, hope you enjoyed it! (And learned a bit, too.)
It’s based on an exercise I was given a very long time ago, which really opened my mind to the power of abstractions.
In a nutshell: you’re going to implement a flexible engine for generating SQL queries to get data for crosstab statistics.
Say what? OK, let’s break that down…
What’s a crosstab, anyway?
Imagine you’re a software engineer at InspireNote Inc.; among other things, you’re intimately familiar with the following tables in the database:
Customer
:
CustomerID | Name |
---|---|
5312 | J. S. Bach |
5313 | Edvard Grieg |
Product
:
ProductID | Name |
---|---|
479 | Norwegian landscape vibes |
480 | Polyphony4Ever |
Orders
:
OrderID | DateTime | CustomerID | ProductID | Amount |
---|---|---|---|---|
101 | 1685-03-21 | 5312 | 480 | 12 |
102 | 1874-08-31 | 5313 | 479 | 1 |
157 | 1704-11-14 | 5312 | 480 | 48 |
One day your manager, wanting to better understand the company’s customers, asks you to create the following view:
↓ Customer / Product → | Norwegian landscape vibes | Polyphony4Ever |
---|---|---|
J. S. Bach | 0 | 1,128 |
Edvard Grieg | 140 | 0 |
“No worries”, you utter, and already the necessary SQL springs to your mind:
SELECT Customer.Name, Product.Name, SUM(Orders.Amount)
FROM Customer
JOIN Orders ON Customer.CustomerID = Orders.CustomerID
JOIN Product ON Product.ProductID = Orders.ProductID
GROUP BY Customer.Name, Product.Name
A pivot quickly follows, sprinkle some UI, deploy, and wham! Your manager has her crosstab.
The view is well-received – so much, in fact, that a week later your manager clears all your JIRA tickets, hangs an ERD poster of the company’s sprawling database next to your desk, and hands you the specs for 73 crosstab requests from various execs in the organization.
Other engineers would have been overwhelmed, perhaps. Not you; you won’t let a bit of work get in the way of being lazy.
The challenge
You need to write a function that, given a crosstab view specification (and with knowledge of the database schema), generates an SQL for querying the data to construct the requested crosstab.
To do that, we observe that such queries have a common structure:
- The
SELECT
clause contains the two category columns, and some aggregation of the value column - The
FROM
clause contains a chain ofJOIN
s linking the two tables containing the category columns GROUP BY
the two category columns
The JOIN
chain is where you’ll need BFS and the database schema.
The Chinook database
The challenge uses the Chinook toy database. Download the sqlite3 version and place it in the same directory as the challenge’s code. Here’s a reference ERD from the homepage:
Go ahead and try the challenge. Good luck!