I have encountered what seems to be a bad decision made by the query
planner, on a quite simple query. I have attached a shell script which
creates a large (180MB) sql script, to demonstrate the problem. To
reproduce, execute the bash script, and pipe the stdout into psql.
Redirect the stdout of psql somewhere, as it isn't very interesting. psql
produces several EXPLAIN ANALYSE results on its stderr. I have also
attached a copy of the results I get.


The test creates three tables:
  jointest3_1 (keya int, infoa int)
  jointest3_2 (keya int, keyb int)
  jointest3_3 (keyb int, infob int)

It then fills tables jointest3_1 and jointest3_3 each with 1 million rows,
where the key value is sequential (0 to 999999), and the info value is a
random zero or positive integer below 10000.

It then fills table jointest3_2 with 2 million rows, each with random
6-digit integers in both of its columns.

Therefore, table jointest3_2 represents a many-to-many mapping between
jointest3_1 and jointest3_3, and the natural join of all three tables
should have 2 million rows.

The test then creates lots of indexes, clusters on the indexes, and does a
vacuum analyse.

The test does six EXPLAIN ANALYSE commands. There are actually three
distinct commands, each of which is performed twice, to get a
before-cached and after-cached performance indication. The first command
does a natural join of all three tables, and restricts the results on
jointest3_1.infoa. This is performed quickly, as there is an index on
infoa which returns approximately 100 rows.

The second distinct command also does a natural join of all three tables,
but restricts the results by jointest3_3.infob. The query planner appears
to make a bad decision, and the query takes a long time to return.

The third distinct command does the same query as the second distinct
command, but it forces the query planner to join tables jointest3_2 and
jointest3_3 together before joining with jointest3_1 by doing a subselect.
This performs as quickly as the first distinct command.

I would have expected the query planner to find the fast query plan, given
there are only three tables being joined together.


I am using PostgreSQL version 7.2.1, running on Debian unstable/testing,
reasonably up to date. If you need more details, please email back.
select version(); gives the following:
PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.4
PostgreSQL was installed as a Debian package.

The machine is a Athlon XP1900, with 512MB of RAM, a 40GB hard drive.
Kernel: Linux 2.4.19-686
libc6:  2.2.5-14.1

Matthew Wakeling
#!/bin/bash

echo "create table jointest3_1 (keya int, infoa int);"
echo "create table jointest3_2 (keya int, keyb int);"
echo "create table jointest3_3 (keyb int, infob int);"
echo "BEGIN;"
hexdump -d </dev/urandom \
        | sed -e "s/^[0123456789abcdefABCDEF]\+ \+[^ ]\([0123456789]\+\) \+[^ 
]\([0123456789]\+\) \+[^ ]\([0123456789]\+\) \+[^ ]\([0123456789]\+\) \+[^ 
]\([0123456789]\+\) \+[^ ]\([0123456789]\+\) \+[^ ]\([0123456789]\+\) \+[^ 
]\([0123456789]\+\).*$/\1\2\3\4\5\6\7\8/" \
        | sed -e "=" | sed -e 
"/^[0123456789]\{1,7\}$/h;/^[0123456789]\{1,7\}$/d;G;s/^\(.*\)\n\(.*\)/\2 
\1/;s/^1000000/0/" \
        | sed -e "s/^\([0123456789]*\)  
\([0123456789]\{4\}\)\([0123456789]\{4\}\)\([0123456789]\{6\}\)\([0123456789]\{6\}\)\([0123456789]\{6\}\)\([0123456789]\{6\}\).*$/insert
 into jointest3_1 values (\1, \2); insert into jointest3_3 values (\1, \3);insert into 
jointest3_2 values (\4, \5);insert into jointest3_2 values (\6, \7);/" \
        | head -n 1000000
echo "COMMIT;"
echo "create index indexjointest3_1_keya on jointest3_1 (keya);"
echo "cluster indexjointest3_1_keya on jointest3_1;"
echo "create index indexjointest3_1_infoa on jointest3_1 (infoa);"
echo "create index indexjointest3_2_keya on jointest3_2 (keya);"
echo "cluster indexjointest3_2_keya on jointest3_2;"
echo "create index indexjointest3_2_keyb on jointest3_2 (keyb);"
echo "create index indexjointest3_3_keyb on jointest3_3 (keyb);"
echo "cluster indexjointest3_3_keyb on jointest3_3;"
echo "create index indexjointest3_3_infob on jointest3_3 (infob);"
echo "vacuum analyse;"
echo "explain analyse select infoa, keya, keyb, infob from jointest3_1 natural join 
jointest3_2 natural join jointest3_3 where infoa = 1234;"
echo "explain analyse select infoa, keya, keyb, infob from jointest3_1 natural join 
jointest3_2 natural join jointest3_3 where infoa = 1234;"

echo "explain analyse select infoa, keya, keyb, infob from jointest3_1 natural join 
jointest3_2 natural join jointest3_3 where infob = 1234;"
echo "explain analyse select infoa, keya, keyb, infob from jointest3_1 natural join 
jointest3_2 natural join jointest3_3 where infob = 1234;"

echo "explain analyse select infoa, keya, keyb, infob from jointest3_1 natural join 
(select * from jointest3_2 natural join jointest3_3 where infob = 1234) as s;"
echo "explain analyse select infoa, keya, keyb, infob from jointest3_1 natural join 
(select * from jointest3_2 natural join jointest3_3 where infob = 1234) as s;"

NOTICE:  Skipping "pg_group" --- only table or database owner can VACUUM it
NOTICE:  Skipping "pg_database" --- only table or database owner can VACUUM it
NOTICE:  Skipping "pg_shadow" --- only table or database owner can VACUUM it
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..1343.81 rows=199 width=24) (actual time=101.51..3583.62 
rows=181 loops=1)
  ->  Nested Loop  (cost=0.00..741.80 rows=199 width=16) (actual time=71.32..1990.54 
rows=181 loops=1)
        ->  Index Scan using indexjointest3_1_infoa on jointest3_1  (cost=0.00..420.65 
rows=105 width=8) (actual time=25.48..856.21 rows=92 loops=1)
        ->  Index Scan using indexjointest3_2_keya on jointest3_2  (cost=0.00..3.03 
rows=2 width=8) (actual time=12.31..12.32 rows=2 loops=92)
  ->  Index Scan using indexjointest3_3_keyb on jointest3_3  (cost=0.00..3.01 rows=1 
width=8) (actual time=8.79..8.80 rows=1 loops=181)
Total runtime: 3583.96 msec

NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..1343.81 rows=199 width=24) (actual time=0.32..23.74 rows=181 
loops=1)
  ->  Nested Loop  (cost=0.00..741.80 rows=199 width=16) (actual time=0.25..10.50 
rows=181 loops=1)
        ->  Index Scan using indexjointest3_1_infoa on jointest3_1  (cost=0.00..420.65 
rows=105 width=8) (actual time=0.16..3.49 rows=92 loops=1)
        ->  Index Scan using indexjointest3_2_keya on jointest3_2  (cost=0.00..3.03 
rows=2 width=8) (actual time=0.07..0.07 rows=2 loops=92)
  ->  Index Scan using indexjointest3_3_keyb on jointest3_3  (cost=0.00..3.01 rows=1 
width=8) (actual time=0.07..0.07 rows=1 loops=181)
Total runtime: 23.96 msec

NOTICE:  QUERY PLAN:

Hash Join  (cost=426.08..101497.21 rows=201 width=24) (actual time=319.68..15399.50 
rows=224 loops=1)
  ->  Merge Join  (cost=0.00..86902.39 rows=1888831 width=16) (actual 
time=28.48..13852.19 rows=2000000 loops=1)
        ->  Index Scan using indexjointest3_1_keya on jointest3_1  
(cost=0.00..18599.00 rows=1000000 width=8) (actual time=28.30..2474.90 rows=1000000 
loops=1)
        ->  Index Scan using indexjointest3_2_keya on jointest3_2  
(cost=0.00..37193.00 rows=2000000 width=8) (actual time=0.15..5663.64 rows=2000000 
loops=1)
  ->  Hash  (cost=425.81..425.81 rows=107 width=8) (actual time=217.25..217.25 rows=0 
loops=1)
        ->  Index Scan using indexjointest3_3_infob on jointest3_3  (cost=0.00..425.81 
rows=107 width=8) (actual time=24.98..217.09 rows=112 loops=1)
Total runtime: 15420.55 msec

NOTICE:  QUERY PLAN:

Hash Join  (cost=426.08..101497.21 rows=201 width=24) (actual time=363.25..14638.78 
rows=224 loops=1)
  ->  Merge Join  (cost=0.00..86902.39 rows=1888831 width=16) (actual 
time=36.47..13099.69 rows=2000000 loops=1)
        ->  Index Scan using indexjointest3_1_keya on jointest3_1  
(cost=0.00..18599.00 rows=1000000 width=8) (actual time=28.30..2486.89 rows=1000000 
loops=1)
        ->  Index Scan using indexjointest3_2_keya on jointest3_2  
(cost=0.00..37193.00 rows=2000000 width=8) (actual time=8.14..4948.04 rows=2000000 
loops=1)
  ->  Hash  (cost=425.81..425.81 rows=107 width=8) (actual time=246.15..246.15 rows=0 
loops=1)
        ->  Index Scan using indexjointest3_3_infob on jointest3_3  (cost=0.00..425.81 
rows=107 width=8) (actual time=53.91..245.99 rows=112 loops=1)
Total runtime: 14639.31 msec

NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..4010.50 rows=201 width=24) (actual time=56.30..893.31 
rows=224 loops=1)
  ->  Nested Loop  (cost=0.00..3365.04 rows=213 width=16) (actual time=56.13..876.68 
rows=224 loops=1)
        ->  Index Scan using indexjointest3_3_infob on jointest3_3  (cost=0.00..425.81 
rows=107 width=8) (actual time=0.18..4.18 rows=112 loops=1)
        ->  Index Scan using indexjointest3_2_keyb on jointest3_2  (cost=0.00..27.48 
rows=6 width=8) (actual time=7.75..7.78 rows=2 loops=112)
  ->  Index Scan using indexjointest3_1_keya on jointest3_1  (cost=0.00..3.01 rows=1 
width=8) (actual time=0.07..0.07 rows=1 loops=224)
Total runtime: 893.62 msec

NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..4010.50 rows=201 width=24) (actual time=0.39..40.16 rows=224 
loops=1)
  ->  Nested Loop  (cost=0.00..3365.04 rows=213 width=16) (actual time=0.32..16.79 
rows=224 loops=1)
        ->  Index Scan using indexjointest3_3_infob on jointest3_3  (cost=0.00..425.81 
rows=107 width=8) (actual time=0.14..4.05 rows=112 loops=1)
        ->  Index Scan using indexjointest3_2_keyb on jointest3_2  (cost=0.00..27.48 
rows=6 width=8) (actual time=0.07..0.11 rows=2 loops=112)
  ->  Index Scan using indexjointest3_1_keya on jointest3_1  (cost=0.00..3.01 rows=1 
width=8) (actual time=0.10..0.10 rows=1 loops=224)
Total runtime: 40.39 msec


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Reply via email to