Hello

Would it be possible and make sense to use notation of explicit WINDOW
clauses, for cases where multiple window functions invoke identical
window definitions?  I'm thinking of something like

explain verbose SELECT
    empno,
    depname,
    row_number() OVER testwin rn,
    rank() OVER testwin rnk,
    count(*) OVER testwin cnt
FROM empsalary
window testwin as
  (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
   UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

for which, with the patch, we'd get this

                                                                       QUERY 
PLAN                                                                            
                                                                                
                                  
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 WindowAgg  (cost=74.64..101.29 rows=1070 width=68)
   Output: empno, depname, row_number() OVER (PARTITION BY depname ORDER BY 
enroll_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), rank() 
OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN UNBOUNDED 
PRECEDING AND UNBOUNDED FOLLOWING), count(*) OVER (PARTITION BY depname ORDER 
BY enroll_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), 
enroll_date
   ->  Sort  (cost=74.54..77.21 rows=1070 width=44)
         Output: depname, enroll_date, empno
         Sort Key: empsalary.depname, empsalary.enroll_date
         ->  Seq Scan on pg_temp.empsalary  (cost=0.00..20.70 rows=1070 
width=44)
               Output: depname, enroll_date, empno
(7 filas)

which is pretty ugly to read and requires careful tracking to verify
that they're all defined on the same window.  Previously, we just get

                                            QUERY PLAN                          
                  
──────────────────────────────────────────────────────────────────────────────────────────────────
 WindowAgg  (cost=74.64..101.29 rows=1070 width=68)
   Output: empno, depname, row_number() OVER (?), rank() OVER (?), count(*) 
OVER (?), enroll_date
   ->  Sort  (cost=74.54..77.21 rows=1070 width=44)
         Output: depname, enroll_date, empno
         Sort Key: empsalary.depname, empsalary.enroll_date
         ->  Seq Scan on pg_temp.empsalary  (cost=0.00..20.70 rows=1070 
width=44)
               Output: depname, enroll_date, empno
(7 filas)

so it didn't matter.

I'd imagine something like

                                                                       QUERY 
PLAN                                                                            
                                                                                
                                  
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 Window testwin AS (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN 
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
 WindowAgg  (cost=74.64..101.29 rows=1070 width=68)
   Output: empno, depname, row_number() OVER testwin, rank() OVER testwin, 
count(*) OVER testwin, enroll_date
   ->  Sort  (cost=74.54..77.21 rows=1070 width=44)
         Output: depname, enroll_date, empno
         Sort Key: empsalary.depname, empsalary.enroll_date
         ->  Seq Scan on pg_temp.empsalary  (cost=0.00..20.70 rows=1070 
width=44)
               Output: depname, enroll_date, empno
(7 filas)


I imagine this working even if the user doesn't explicitly use a WINDOW
clause, if only because it makes the explain easier to read, and it's
much clearer if the user specifies two different window definitions.
So with David Johnston's example, something like

 Window window1 AS (PARTITION BY depname ORDER BY enroll_date ROWS UNBOUNDED 
PRECEDING)
 Window window2 AS (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN 
CURRENT ROW AND CURRENT ROW)
 WindowAgg
   Output: empno, depname, (row_number() OVER window1), rank() OVER window1, 
count(*) OVER window2, enroll_date
   ->  WindowAgg
         Output: depname, enroll_date, empno, row_number() OVER window1, rank() 
OVER window1
         ->  Sort
               Output: depname, enroll_date, empno
               Sort Key: empsalary.depname, empsalary.enroll_date
               ->  Seq Scan on pg_temp.empsalary
                     Output: depname, enroll_date, empno

(Hmm, not sure if the Window clauses would be top-level or attached to
each WindowAgg in its own level.)

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)


Reply via email to