On Mon, 26 Dec 2022 at 02:04, Ankit Kumar Pandey <itsanki...@gmail.com> wrote: > Point #1 > > In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform 3 > sorts. We also perform 3.
This shouldn't be too hard to do. See the code in select_active_windows(). You'll likely want to pay attention to the DISTINCT pathkeys if they exist and just use the ORDER BY pathkeys if the query has no DISTINCT clause. DISTINCT is evaluated after Window and before ORDER BY. One idea to implement this would be to adjust the loop in select_active_windows() so that we record any WindowClauses which have the pathkeys contained in the ORDER BY / DISTINCT pathkeys then record those separately and append those onto the end of the actives array after the sort. I do think you'll likely want to put any WindowClauses which have pathkeys which are a true subset or true superset of the ORDER BY / DISTINCT pathkeys last. If they're a superset then we won't need to perform any additional ordering for the DISTINCT / ORDER BY clause. If they're a subset then we might be able to perform an Incremental Sort, which is likely much cheaper than a full sort. The existing code should handle that part. You just need to make select_active_windows() more intelligent. You might also think that we could perform additional optimisations and also adjust the ORDER BY clause of a WindowClause if it contains the pathkeys of the DISTINCT / ORDER BY clause. For example: SELECT *,row_number() over (order by a,b) from tab order by a,b,c; However, if you were to adjust the WindowClauses ORDER BY to become a,b,c then you could produce incorrect results for window functions that change their result based on peer rows. Note the difference in results from: create table ab(a int, b int); insert into ab select x,y from generate_series(1,5) x, generate_Series(1,5)y; select a,b,count(*) over (order by a) from ab order by a,b; select a,b,count(*) over (order by a,b) from ab order by a,b; > and Point #2 > > Teach planner to decide which window to evaluate first based on costs. > Currently the first window in the query is evaluated first, there may be no > index to help sort the first window, but perhaps there are for other windows > in the query. This may allow an index scan instead of a seqscan -> sort. What Tom wrote about that in the first paragraph of [1] still applies. The problem is that if the query contains many joins that to properly find the cheapest way of executing the query we'd have to perform the join search once for each unique sort order of each WindowClause. That's just not practical to do from a performance standpoint. The join search can be very expensive. There may be something that could be done to better determine the most likely candidate for the first WindowClause using some heuristics, but I've no idea what those would be. You should look into point #1 first. Point #2 is significantly more difficult to solve in a way that would be acceptable to the project. David [1] https://www.postgresql.org/message-id/11535.1230501658%40sss.pgh.pa.us