[BUGS] query very slow when enable_seqscan=on

Tomasz Ostrowski Mon, 03 Jul 2006 14:03:49 -0700

I have a very slow query when enable_seqscan=on and very fast when
enable_seqscan=off. My schema looks like this (relevant columns
only):


create table organizations (
        organization_id serial primary key,
        organization varchar(200) not null,
        organization_location varchar(55) not null
        -- and several irrelevant columns
); -- about 9000 records
create table persons (
        person_id serial primary key,
        surname varchar(50) not null,
        forename varchar(35) not null,
        organization_id int references organizations,
        -- and several irrelevant columns
); -- about 6500 records
create index persons_surname_forename_person_id on persons (
        surname,
        forename,
        lpad(person_id,10,'0')
); -- I was hoping this would speed up array comparisions


The query looking for a position of a person of given person_id in a
list sorted by surname, forename and person_id and filtered by some
criteria. In this example person_id=1, forename~*'to' (about 400
people) and organization_location~*'warszawa' (about 2000
organizations):

select count(*) as position from (select
                person_id, surname, forename
                from persons
                natural left join organizations
                where forename~*'to' and organization_location~*'warszawa'
        ) as person_filter
                where array[surname, forename, lpad(person_id,10,'0')]
                <
                (select array[surname, forename, lpad(person_id,10,'0')]
                        from persons where person_id=1);

This query take about 30 seconds when enable_seqscan=on and 65
milliseconds when off.

When enable_seqscan=on:
 Aggregate  (cost=785.72..785.73 rows=1 width=0) (actual 
time=27948.955..27948.956 rows=1 loops=1)
   InitPlan
     ->  Index Scan using persons_pkey on persons  (cost=0.00..3.11 rows=1 
width=26) (actual time=0.019..0.019 rows=0 loops=1)
           Index Cond: (person_id = 1)
   ->  Nested Loop  (cost=0.00..782.60 rows=1 width=0) (actual 
time=27948.939..27948.939 rows=0 loops=1)
         Join Filter: ("inner".organization_id = "outer".organization_id)
         ->  Seq Scan on organization  (cost=0.00..480.95 rows=1 width=4) 
(actual time=0.071..69.702 rows=1892 loops=1)
               Filter: ((organization_location)::text ~* 'warszawa'::text)
         ->  Seq Scan on persons  (cost=0.00..296.10 rows=444 width=4) (actual 
time=14.720..14.720 rows=0 loops=1892)
               Filter: (((forename)::text ~* 'to'::text) AND (ARRAY[surname, 
forename, (lpad((person_id)::text, 10, '0'::text))::character varying] < $0))
 Total runtime: 27949.106 ms

When enable_seqscan=off:
 Aggregate  (cost=100001710.26..100001710.27 rows=1 width=0) (actual 
time=66.788..66.789 rows=1 loops=1)
   InitPlan
     ->  Index Scan using persons_pkey on persons  (cost=0.00..3.11 rows=1 
width=26) (actual time=0.019..0.019 rows=0 loops=1)
           Index Cond: (person_id = 1)
   ->  Hash Join  (cost=100001408.81..100001707.14 rows=1 width=0) (actual 
time=66.756..66.756 rows=0 loops=1)
         Hash Cond: ("outer".organization_id = "inner".organization_id)
         ->  Seq Scan on persons  (cost=100000000.00..100000296.10 rows=444 
width=4) (actual time=14.972..14.972 rows=0 loops=1)
               Filter: (((forename)::text ~* 'to'::text) AND (ARRAY[surname, 
forename, (lpad((person_id)::text, 10, '0'::text))::character varying] < $0))
         ->  Hash  (cost=1408.81..1408.81 rows=1 width=4) (actual 
time=51.763..51.763 rows=1892 loops=1)
               ->  Index Scan using organizations_pkey on organizations  
(cost=0.00..1408.81 rows=1 width=4) (actual time=0.049..48.233 rows=1892 
loops=1)
                     Filter: ((organization_location)::text ~* 'warszawa'::text)
 Total runtime: 66.933 ms


Database is properly analyzed. postgresql-8.1.4 on Fedora Core 4.

Regards
Tometzky

PS. Actual table and column names are different (they're in Polish)
but I've translated them for better readability for english-speaking.

PS. I wonder if it makes sense to "enable_seqscan=off" for every client
if a database is small enough to fit in OS cache.
-- 
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

[BUGS] query very slow when enable_seqscan=on

Reply via email to