Hi Hackers.

I have a feeling that GIN is "cheating" on the visibillity checks:

test=# set enable_seqscan = off;
SET
Time: 0.129 ms
test=# select count(id) from fts_test where fts @@ to_tsquery('core');
 count
--------
 158827
(1 row)

Time: 95.530 ms
test=# explain select count(id) from fts_test where fts @@ to_tsquery('core');
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 Aggregate  (cost=211571.52..211571.53 rows=1 width=4)
-> Bitmap Heap Scan on fts_test (cost=134925.95..211174.01 rows=159004 width=4)
         Recheck Cond: (fts @@ to_tsquery('core'::text))
-> Bitmap Index Scan on fts_idx (cost=0.00..134886.20 rows=159004 width=0)
               Index Cond: (fts @@ to_tsquery('core'::text))
(5 rows)

Time: 0.609 ms

test=# select count(id) from fts_test;
 count
--------
 168556
(1 row)

Time: 164.655 ms

test=# explain select count(id) from fts_test;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Aggregate  (cost=10000075969.95..10000075969.96 rows=1 width=4)
-> Seq Scan on fts_test (cost=10000000000.00..10000075548.56 rows=168556 width=4)
(2 rows)

Time: 0.338 ms

This is run multiple times for both queries and the seqscan of the table
is consistently about 1.8 times more expensive than the fts-scan.
This is all on a fully memory cached dataset.

The first query should have the cost of the GIN-search + visibillity-test of 158K tuples, the latter should have the cost of visibillity-testing 168K tuples. If we set the cost of actually searching GIN to 0 then the gin-search - visibillity costs: 95/158000 0.000373ms/tuple where the seq-scan case costs close to 0.001ms/tuple (close to 3 times as much).

So I have a strong feeling that GIN is cheating on the visibillity tests
otherwise I have problems imagining how it ever can become faster to execute
than the seq_scan of the table.

Or is a Bitmap Heap Scan simply 3 times faster than a Seq-scan for visibillity-testing?

What have I missed in the logic?

Thanks.

--
Jesper


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to