Recently Markus Winand pointed out to me that the PG15 changes made in [1] to teach the query planner about monotonic window functions improved the situation for PostgreSQL on his feature/optimization timeline for PostgreSQL. These can be seen in [2].
Unfortunately, if you look at the timeline in [2], we're not quite on green just yet per Markus's "Not with partition by clause (see below)" caveat. This is because nodeWindowAgg.c's use_pass_through code must be enabled when the WindowClause has a PARTITION BY clause. The reason for this is that we can't just stop spitting out rows from the WindowAgg when one partition is done as we still need to deal with rows from any subsequent partitions and we can only get to those by continuing to read rows until we find rows belonging to the next partition. There is however a missed optimisation here when there is a PARTITION BY clause, but also some qual exists for the column(s) mentioned in the partition by clause that makes it so only one partition can exist. A simple example of that is in the following: EXPLAIN SELECT * FROM (SELECT relkind, pg_relation_size(oid) size, rank() OVER (PARTITION BY relkind ORDER BY pg_relation_size(oid) DESC ) rank FROM pg_class) WHERE relkind = 'r' AND rank <= 10; (the subquery may be better imagined as a view) Here, because of the relkind='r' qual being pushed down into the subquery, effectively that renders the PARTITION BY relkind clause redundant. What the attached patch does is process each WindowClause and removes any items from the PARTITION BY clause that are columns or expressions relating to redundant PathKeys. Effectively, this allows the nodeWindowAgg.c code which stops processing WindowAgg rows when the run condition is met to work as the PARTITION BY clause is completely removed in the case of the above query. Removing the redundant PARTITION BY items also has the added benefit of not having to needlessly check if the next row belongs to the same partition as the last row. For the above, that check is a waste of time as all rows have relkind = 'r' I passed the patch along to Markus and he kindly confirmed that we're now green for this particular optimisation. I'll add this patch to the July commitfest. David [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9d9c02ccd [2] https://use-the-index-luke.com/sql/partial-results/window-functions
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 1e4dd27dba..deb22e9404 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -5999,6 +5999,9 @@ make_window_input_target(PlannerInfo *root, * Create a pathkeys list describing the required input ordering * for the given WindowClause. * + * Modifies wc's partitionClause to remove any clauses which are deemed + * redundant by the pathkey logic. + * * The required ordering is first the PARTITION keys, then the ORDER keys. * In the future we might try to implement windowing using hashing, in which * case the ordering could be relaxed, but for now we always sort. @@ -6007,8 +6010,7 @@ static List * make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc, List *tlist) { - List *window_pathkeys; - List *window_sortclauses; + List *window_pathkeys = NIL; /* Throw error if can't sort */ if (!grouping_is_sortable(wc->partitionClause)) @@ -6022,12 +6024,43 @@ make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc, errmsg("could not implement window ORDER BY"), errdetail("Window ordering columns must be of sortable datatypes."))); - /* Okay, make the combined pathkeys */ - window_sortclauses = list_concat_copy(wc->partitionClause, wc->orderClause); - window_pathkeys = make_pathkeys_for_sortclauses(root, - window_sortclauses, - tlist); - list_free(window_sortclauses); + /* + * First fetch the pathkeys for the PARTITION BY clause. We can safely + * remove any clauses from the wc->partitionClause for redundant pathkeys. + */ + if (wc->partitionClause != NIL) + { + bool sortable; + + window_pathkeys = make_pathkeys_for_sortclauses_extended(root, + &wc->partitionClause, + tlist, + true, + &sortable); + + Assert(sortable); + } + + /* + * And fetch the pathkeys for the ORDER BY clause. We must keep any + * redundant pathkeys. Removing these would change the semantics of peer + * rows during execution. + */ + if (wc->orderClause != NIL) + { + List *orderby_pathkeys; + + orderby_pathkeys = make_pathkeys_for_sortclauses(root, + wc->orderClause, + tlist); + + /* Okay, make the combined pathkeys */ + if (window_pathkeys != NIL) + window_pathkeys = append_pathkeys(window_pathkeys, orderby_pathkeys); + else + window_pathkeys = orderby_pathkeys; + } + return window_pathkeys; } diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 0ca298f5a1..94520103a1 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -1473,6 +1473,8 @@ typedef struct GroupingSet * if the clause originally came from WINDOW, and is NULL if it originally * was an OVER clause (but note that we collapse out duplicate OVERs). * partitionClause and orderClause are lists of SortGroupClause structs. + * partitionClause is sanitized by the query planner to remove any columns or + * expressions belonging to redundant PathKeys. * If we have RANGE with offset PRECEDING/FOLLOWING, the semantics of that are * specified by startInRangeFunc/inRangeColl/inRangeAsc/inRangeNullsFirst * for the start offset, or endInRangeFunc/inRange* for the end offset.