On Thu, 13 Apr 2023 at 10:09, David Rowley <dgrowle...@gmail.com> wrote:
> I also see I might need to do a bit more work on this as the following
> is not handled correctly:
>
> select count(*) over(rows between unbounded preceding and 10
> following) from tenk1;
>
> it's assuming all rows due to lack of ORDER BY, but it seems like it
> should be 10 rows due to the 10 FOLLOWING end bound.

Well, as it turned out, it was quite a bit more work. The frame
options have had quite a few additions since I last looked in detail.

I've attached v2 of the patch.  I've included a DEBUG1 message which
is useful to check what the estimate comes out as without having to
have a debugger attached all the time.

Here are a few samples of the estimator getting things right:

# select count(*) over (order by four range between unbounded
preceding and 2 following exclude current row) from tenk1 limit 1;
DEBUG:  startup_tuples = 7499
 count
-------
  7499

# select count(*) over (order by four rows between unbounded preceding
and 4000 following) from tenk1 limit 1;
DEBUG:  startup_tuples = 4001
 count
-------
  4001

# select count(*) over (order by four rows between unbounded preceding
and 4000 following exclude group) from tenk1 limit 1;
DEBUG:  startup_tuples = 1501
 count
-------
  1501

You can see in each case, startup_tuples was estimated correctly as
confirmed by count(*) during execution.

I've attached some more of these in sample_tests.txt, which all are
correct with the caveat of get_windowclause_startup_tuples() never
returning 0 due to it using clamp_row_est().  In practice, that's a
non-issue due to the way the startup_tuples value is used to calculate
the startup costs.

David
# set client_min_messages = 'debug1';
# select count(*) over (order by four range between unbounded preceding and 2 
preceding) from tenk1 limit 1;
DEBUG:  startup_tuples = 1
 count
-------
     0

# select count(*) over (order by four range between unbounded preceding and 2 
following) from tenk1 limit 1;
DEBUG:  startup_tuples = 7500
 count
-------
  7500

# select count(*) over (order by four range between unbounded preceding and 2 
following exclude current row) from tenk1 limit 1;
DEBUG:  startup_tuples = 7499
 count
-------
  7499

# select count(*) over (order by four range between unbounded preceding and 2 
following exclude ties) from tenk1 limit 1;
DEBUG:  startup_tuples = 5001
 count
-------
  5001

# select count(*) over (order by four range between unbounded preceding and 
20000 following exclude ties) from tenk1 limit 1;
DEBUG:  startup_tuples = 7501
 count
-------
  7501

# select count(*) over (order by four range between unbounded preceding and 
20000 following exclude group) from tenk1 limit 1;
DEBUG:  startup_tuples = 7500
 count
-------
  7500

# select count(*) over (order by four range between unbounded preceding and 
20000 following exclude current row) from tenk1 limit 1;
DEBUG:  startup_tuples = 9999
 count
-------
  9999

# select count(*) over (order by four range between unbounded preceding and 
20000 following) from tenk1 limit 1;
DEBUG:  startup_tuples = 10000
 count
-------
 10000

# select count(*) over (order by four range between unbounded preceding and 0 
following) from tenk1 limit 1;
DEBUG:  startup_tuples = 2500
 count
-------
  2500

# select count(*) over (order by four range between unbounded preceding and 0 
following exclude ties) from tenk1 limit 1;
DEBUG:  startup_tuples = 1
 count
-------
     1

# select count(*) over (order by four range between unbounded preceding and 0 
following exclude group) from tenk1 limit 1;
DEBUG:  startup_tuples = 1
 count
-------
     0

# select count(*) over (order by four range between unbounded preceding and 0 
following exclude current row) from tenk1 limit 1;
DEBUG:  startup_tuples = 2499
 count
-------
  2499

# select count(*) over (order by four rows between unbounded preceding and 4000 
following) from tenk1 limit 1;
DEBUG:  startup_tuples = 4001
 count
-------
  4001

# select count(*) over (order by four rows between unbounded preceding and 4000 
following exclude group) from tenk1 limit 1;
DEBUG:  startup_tuples = 1501
 count
-------
  1501

# select count(*) over (order by four rows between unbounded preceding and 4000 
following exclude ties) from tenk1 limit 1;
DEBUG:  startup_tuples = 1502
 count
-------
  1502

Attachment: fix_bug_17862_v2.patch
Description: Binary data

Reply via email to