On Mon, 4 Feb 2019 at 18:37, Edmund Horner <ejr...@gmail.com> wrote:
> 1. v6-0001-Add-selectivity-estimate-for-CTID-system-variables.patch
I think 0001 is good to go.  It's a clear improvement over what we do
today.

(t1 = 1 million row table with a single int column.)

Patched:

# explain (analyze, timing off) select * from t1 where ctid < '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=315 width=4) (actual rows=315 loops=1)

# explain (analyze, timing off) select * from t1 where ctid <= '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=316 width=4) (actual rows=316 loops=1)

Master:

# explain (analyze, timing off) select * from t1 where ctid < '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=333333 width=4) (actual rows=315 loops=1)

# explain (analyze, timing off) select * from t1 where ctid <= '(1, 90)';
 Seq Scan on t1  (cost=0.00..16925.00 rows=333333 width=4) (actual rows=316 loops=1)

The only possible risk I can foresee is that we may now be more likely
to underestimate the selectivity, and that could cause something like a
nested loop join due to the estimate being, say, 1 row.  It could happen
in a case like:

SELECT * FROM bloated_table WHERE ctid >= <last ctid that would exist without bloat>

but I don't think we should keep using DEFAULT_INEQ_SEL just in case
this happens.  We could probably fix 90% of those cases by returning 2
rows instead of 1.

-- 
David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
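As an aside, the kind of estimate the patch produces can be sketched in a few lines: convert the ctid's (block, offset) into an approximate row position assuming tuples are spread evenly across heap pages, then divide by the total row count. This is a hypothetical illustration, not the patch's actual C code; the function name, the 4425-page figure (roughly 1M four-byte-int rows at ~226 tuples/page), and the 2-row clamp are illustrative assumptions.

```python
def ctid_selectivity(block, offset, total_blocks, total_tuples):
    """Rough fraction of rows with ctid < (block, offset).

    Assumes tuples are evenly distributed across blocks, the same
    sort of assumption a planner heuristic would have to make.
    """
    if total_tuples == 0 or total_blocks == 0:
        return 0.0
    tuples_per_block = total_tuples / total_blocks
    # Rows in all the blocks before this one, plus the offset into
    # this block (capped at the per-block average).
    rows_before = block * tuples_per_block + min(offset, tuples_per_block)
    sel = rows_before / total_tuples
    # Clamp away from zero so the row estimate never falls below ~2
    # rows, softening the nested-loop risk discussed above (this clamp
    # is the suggestion from the email, not something in the patch).
    return min(max(sel, 2.0 / total_tuples), 1.0)

# For the t1 example: ctid < '(1, 90)' over 1M rows in ~4425 pages
# lands very close to the patched planner's rows=315 estimate.
print(round(ctid_selectivity(1, 90, 4425, 1_000_000) * 1_000_000))  # → 316
```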