On 10/12/2015 12:06 PM, Lele Gaifax wrote:
Hi all,
I'm doing some experiments to find the better layout for reimplementing
an existing db (MySQL cough!) with PostgreSQL 9.4+.
I noticed a strange plan coming out from a simple query joining two tables,
both containing 10Mrecs (and both ANALYZEd):
l10ntest=# \d master;
Table "public.master"
Column | Type | Modifiers
--------+---------+------------------------------------------------------
num | integer | not null default nextval('master_num_seq'::regclass)
Indexes:
"master_pkey" PRIMARY KEY, btree (num)
l10ntest=# \d master_l10n;
Table "public.master_l10n"
Column | Type | Modifiers
--------+----------------------+-----------------------------------------------------------
num | integer | not null default
nextval('master_l10n_num_seq'::regclass)
lang | character varying(2) | not null
text | text |
Indexes:
"master_l10n_pkey" PRIMARY KEY, btree (num, lang)
"l10n_text_index" btree (lower(text) text_pattern_ops)
l10ntest=# EXPLAIN SELECT count(l.num) AS count_1 FROM master_l10n l WHERE
l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
QUERY PLAN
------------------------------------------------------------------------------------------------------
Aggregate (cost=309315.38..309315.39 rows=1 width=4)
-> Bitmap Heap Scan on master_l10n l (cost=64700.56..307801.65
rows=605492 width=4)
Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~
'quattro%'::text))
-> Bitmap Index Scan on l10n_text_index (cost=0.00..64549.19
rows=999662 width=0)
Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND
(lower(text) ~<~ 'quattrp'::text))
(5 rows)
Time: 1.665 ms
l10ntest=# EXPLAIN SELECT count(m.num) AS count_1 FROM master_l10n l JOIN
master m ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Aggregate (cost=676558.14..676558.15 rows=1 width=4)
-> Hash Join (cost=373011.02..675044.41 rows=605492 width=4)
Hash Cond: (l.num = m.num)
-> Bitmap Heap Scan on master_l10n l (cost=64700.56..307801.65
rows=605492 width=4)
Filter: (((lang)::text = 'it'::text) AND (lower(text) ~~
'quattro%'::text))
-> Bitmap Index Scan on l10n_text_index
(cost=0.00..64549.19 rows=999662 width=0)
Index Cond: ((lower(text) ~>=~ 'quattro'::text) AND
(lower(text) ~<~ 'quattrp'::text))
-> Hash (cost=144247.76..144247.76 rows=9999976 width=4)
-> Seq Scan on master m (cost=0.00..144247.76
rows=9999976 width=4)
(9 rows)
Time: 1.244 ms
l10ntest=# SELECT count(l.num) AS count_1 FROM master_l10n l WHERE l.lang
= 'it' AND lower(l.text) LIKE 'quattro%';
count_1
---------
1101101
(1 row)
Time: 1221.941 ms
l10ntest=# SELECT count(m.num) AS count_1 FROM master_l10n l JOIN master m
ON m.num = l.num WHERE l.lang = 'it' AND lower(l.text) LIKE 'quattro%';
count_1
---------
1101101
(1 row)
Time: 3541.852 ms
Why does the join on the master table require a "Seq Scan on master"? I tried
different kinds of "JOIN", but the resulting explanation remains the same.
Am I missing something, or should I stop worrying about that sequential scan?
Off hand I would say it is because of this --> count(m.num). Try
count(l.num) instead and see what happens. As your queries above show
they are the same number.
Thanks in advance,
bye, lele.
--
Adrian Klaver
adrian.kla...@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general