On Thu, Jun 8, 2023 at 7:37 AM David Rowley <dgrowle...@gmail.com> wrote:
> What the attached patch does is process each WindowClause and removes > any items from the PARTITION BY clause that are columns or expressions > relating to redundant PathKeys. > > Effectively, this allows the nodeWindowAgg.c code which stops > processing WindowAgg rows when the run condition is met to work as the > PARTITION BY clause is completely removed in the case of the above > query. Removing the redundant PARTITION BY items also has the added > benefit of not having to needlessly check if the next row belongs to > the same partition as the last row. For the above, that check is a > waste of time as all rows have relkind = 'r' This is a nice optimization. I reviewed it and here are my findings. In create_windowagg_plan there is such comment that says * ... Note: in principle, it's possible * to drop some of the sort columns, if they were proved redundant by * pathkey logic. However, it doesn't seem worth going out of our way to * optimize such cases. Since this patch removes any clauses from the wc->partitionClause for redundant pathkeys, this comment seems outdated, at least for the sort columns in partitionClause. Also I'm wondering if we can do the same optimization to wc->orderClause. I tested it with the query below and saw performance gains. create table t (a int, b int); insert into t select 1,2 from generate_series(1,100000)i; analyze t; explain analyze select * from (select a, b, rank() over (PARTITION BY a order by b) rank from t where b = 2) where a = 1 and rank <= 10; With and without this optimization to wc->orderClause the execution time is 67.279 ms VS. 119.120 ms (both best of 3). I notice you comment in the patch that doing this is unsafe because it would change the semantics of peer rows during execution. Would you please elaborate on that? Thanks Richard