A few months ago I attended the PostgreSQL conference in Portland, OR. There were a lot of talks ranging from hard-core stuff like Neil Conway‘s talk about the internals of query execution, to random fun stuff like David Fetter‘s discussion of procedural languages, including LOLCODE.
During their talks, a few people mentioned a handy function called generate_series. It took me a while to discover how useful this function really is. I thought I’d post an example. Here goes…
Let’s say that you have a table with sales information:
postgres=# select * from sales order by date; date | sales_person | part_number ------------+--------------+------------- 2008-05-05 | Glenn | 1 2008-05-05 | Shahaf | 1 2008-05-06 | Mike | 1 2008-05-06 | Mike | 2 2008-05-08 | Glenn | 1 2008-05-08 | Shahaf | 1 2008-05-08 | Mike | 2 2008-05-09 | Mike | 1 2008-05-09 | Glenn | 1 (9 rows)
You might want to get an idea of how many sales happened on each day. You could try to do it with a query like this:
postgres=# select date, count(*) from sales group by date order by date; date | count ------------+------- 2008-05-05 | 2 2008-05-06 | 2 2008-05-08 | 3 2008-05-09 | 2 (4 rows)
This basically works, but it hides one important fact — no sales happened on May 7.
To fix this, we can use generate_series. When you run this function normally, it just generates a series of numbers:
postgres=# select generate_series(0,4); generate_series ----------------- 0 1 2 3 4 (5 rows)
However, you can easily change it to generate a series of dates:
postgres=# select generate_series(0,4) + date '2008-05-05' as date; date ------------ 2008-05-05 2008-05-06 2008-05-07 2008-05-08 2008-05-09 (5 rows)
Once you have this, you can now join against the sales table to generate the report:
postgres=# select series.date, count(sales.date)
from (select generate_series(0,4) + date '2008-05-05' as date) as series
left outer join sales on series.date=sales.date group by series.date
order by series.date; date | count ------------+------- 2008-05-05 | 2 2008-05-06 | 2 2008-05-07 | 0 2008-05-08 | 3 2008-05-09 | 2 (5 rows)
The trick here is to do a left-outer join betwee the date sequence and the sales, and to count the rows that have a non-null sales date.
This is by no means the only use of generate_series, it’s just the most recent use I found. If you know of other ways to use this function, or if you know of other handy functions, drop a note below.
Image credits: Electric Vehicle Guide