Hi, I am using generate_series + a join to group time-series data into buckets, which works well as long as I do this only for one aggregation hierarchy: The index on the timestamp of the table with the actual time-series data is used for a nested loop index join.
However with more aggregation levels (one aggregation stage consuming the output of the previous one) chained together using CTEs, there is no index available and postgresql falls back to (no-index) nested loop joins. Example, with the actual access to the data-table replaced, as there is no index it triggers the nested loop join immediatly (despite MATERIALIZED + ORDER BY): WITH series1h AS MATERIALIZED (SELECT generate_series AS ts FROM generate_series('1990-01-01 00:00', '1990-12-31 23:59', INTERVAL '1 hour') ORDER BY ts), series15m AS MATERIALIZED (SELECT generate_series AS ts FROM generate_series('1990-01-01 00:00', '1990-12-31 23:59', INTERVAL '15 minutes') ORDER BY ts) SELECT count(*) FROM (SELECT h1.ts, count(*) FROM series1h h1 JOIN series15m m15 ON (m15.ts > (h1.ts - INTERVAL '1 hour') AND m15.ts <= h1.ts ) GROUP BY h1.ts ORDER BY h1.ts); date_bin would allow to join on equality, however according to the docs it doesn't support months/years: The stride interval must be greater than zero and cannot contain units of month or larger. For now I am using temporary tables which can be indexed, are there ways to avoid them? Thanks, Clemens