hello everybody,

we are seriously fighting with some planner issue which seems to be slightly obscure to us.
we have a table which is nicely indexed (several GB in size).
i am using btree_gist operator classes to use a combined index including an FTI expression along with a number:

db=# \d product.t_product
                                      Table "product.t_product"
Column | Type | Modifiers -----------------------+---------------+---------------------------------------------------------------- id | bigint | not null default nextval('product.t_product_id_seq'::regclass)
shop_id               | integer       |
art_number            | text          |
title                 | text          |
description           | text          |
display_price         | numeric(10,4) |

Indexes:
   "t_product_pkey" PRIMARY KEY, btree (id)
"idx_test" gist (display_price, to_tsvector('german'::regconfig, (title || ' '::text) || description)) * "idx_test2" gist (to_tsvector('german'::regconfig, (title || ' '::text) || description), display_price)*


what we basically expected here is that Postgres will scan the table using the index to give us the cheapest products containing the words we are looking for. i am totally surprised to see that we have to fetch all products given the words, sort and then do the limit. this totally kills performance because some words simply show up millions of times. this totally kills everything.

the plans look like this:

db=#  explain analyze SELECT art_number, title
   FROM product.t_product
WHERE to_tsvector('german'::regconfig, (title || ' '::text) || description) @@ plainto_tsquery('harddisk')
   ORDER BY display_price
   LIMIT 10;
QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=108340.08..108340.10 rows=10 width=54) (actual time=1328.900..1328.909 rows=10 loops=1) -> Sort (cost=108340.08..108422.48 rows=32961 width=54) (actual time=1328.899..1328.905 rows=10 loops=1)
        Sort Key: display_price
        Sort Method:  top-N heapsort  Memory: 18kB
-> Bitmap Heap Scan on t_product (cost=2716.62..107627.80 rows=32961 width=54) (actual time=1052.706..1328.772 rows=55 loops=1) Recheck Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text)) -> Bitmap Index Scan on idx_test2 (cost=0.00..2708.38 rows=32961 width=0) (actual time=1052.576..1052.576 rows=55 loops=1) Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))
Total runtime: 1328.942 ms
(9 rows)


runtime increases badly if words start to be more likely ...


db=#  explain analyze SELECT art_number, title
       FROM product.t_product
WHERE to_tsvector('german'::regconfig, (title || ' '::text) || description) @@ plainto_tsquery('spiel')
       ORDER BY display_price
       LIMIT 10;
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=108340.08..108340.10 rows=10 width=54) (actual time=33489.675..33489.682 rows=10 loops=1) -> Sort (cost=108340.08..108422.48 rows=32961 width=54) (actual time=33489.675..33489.675 rows=10 loops=1)
        Sort Key: display_price
        Sort Method:  top-N heapsort  Memory: 18kB
-> Bitmap Heap Scan on t_product (cost=2716.62..107627.80 rows=32961 width=54) (actual time=774.923..33408.522 rows=56047 loops=1) Recheck Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text)) -> Bitmap Index Scan on idx_test2 (cost=0.00..2708.38 rows=32961 width=0) (actual time=759.078..759.078 rows=56047 loops=1) Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))
Total runtime: 33489.906 ms
(9 rows)

i am wondering why postgres is not able to use a combined index here?
is this some obscure thing related to gist, a logical problem or a planner deficiency?

ideas are welcome.

   many thanks,

      hans



--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to