On Wed, May 2, 2012 at 2:50 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Maxim Boguk <maxim.bo...@gmail.com> writes:
> > I got very inefficient plan for a simple query.
>
> It looks like the problem is with the estimate of the antijoin size:
>
> >          ->  Nested Loop Anti Join  (cost=0.00..24576.82 rows=1
> width=206)
> > (actual time=0.043..436.386 rows=20761 loops=1)
>
> that is, only about 20% of the rows in sb_messages are eliminated by the
> NOT EXISTS condition, but the planner thinks that nearly all of them
> will be (and that causes it to not think that the LIMIT is going to
> affect anything, so it doesn't prefer a fast-start plan).
>
> Since you've not told us anything about the statistics of these tables,
> it's hard to speculate as to why the estimate is off.
>
>                        regards, tom lane
>

Most interesting part that NOT EXISTS estimates way off, when LEFT JOIN
WHERE ... IS NULL esimated correctly:

good esitmate (estimated rows=20504  vs real rows=20760):
Game2=# EXPLAIN ANALYZE
SELECT
*
FROM sb_messages messages_tbl
LEFT JOIN users users_tbl ON users_tbl.id = messages_tbl.from_user
WHERE
    messages_tbl.type IN (0, 9) AND
    messages_tbl.visibility_status = 0 AND
    messages_tbl.not_show_on_air = 'f' AND
    messages_tbl.clan_id IS NULL AND
    users_tbl.blocked IS DISTINCT FROM 't';

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..24577.74 rows=20504 width=1037) (actual
time=0.045..532.012 rows=20760 loops=1)
   Filter: (users_tbl.blocked IS DISTINCT FROM true)
   ->  Index Scan using sb_messages_special3_key on sb_messages
messages_tbl  (cost=0.00..3793.75 rows=35784 width=208) (actual
time=0.019..67.746 rows=24937 loops=1)
   ->  Index Scan using sb_users_pkey on users users_tbl  (cost=0.00..0.53
rows=1 width=829) (actual time=0.007..0.009 rows=1 loops=24937)
         Index Cond: (users_tbl.id = messages_tbl.from_user)
 Total runtime: 563.944 ms


bad estimate (estimated 1 vs real rows=20760):
Game2=# EXPLAIN (ANALYZE, COSTS) SELECT * FROM sb_messages messages_tbl
WHERE
   (messages_tbl.type IN (0, 9) AND messages_tbl.visibility_status=0 AND
messages_tbl.not_show_on_air='f' AND messages_tbl.clan_id IS NULL)
   AND  NOT EXISTS (SELECT 1 FROM users users_tbl WHERE blocked='t' and
users_tbl.id = messages_tbl.from_user);

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Anti Join  (cost=0.00..24488.28 rows=1 width=208) (actual
time=0.044..430.645 rows=20760 loops=1)
   ->  Index Scan using sb_messages_special3_key on sb_messages
messages_tbl  (cost=0.00..3793.75 rows=35784 width=208) (actual
time=0.020..67.810 rows=24937 loops=1)
   ->  Index Scan using sb_users_pkey on users users_tbl  (cost=0.00..0.53
rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=24937)
         Index Cond: (users_tbl.id = messages_tbl.from_user)
         Filter: users_tbl.blocked
 Total runtime: 461.296 ms


What is curious that not exists always perform 20% faster (I performed both
explains like 10 times each and each time not exits is close to 20% faster).


-- 
Maxim Boguk
Senior Postgresql DBA.

Reply via email to