On Wed, May 2, 2012 at 2:50 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Maxim Boguk <maxim.bo...@gmail.com> writes: > > I got very inefficient plan for a simple query. > > It looks like the problem is with the estimate of the antijoin size: > > > -> Nested Loop Anti Join (cost=0.00..24576.82 rows=1 > width=206) > > (actual time=0.043..436.386 rows=20761 loops=1) > > that is, only about 20% of the rows in sb_messages are eliminated by the > NOT EXISTS condition, but the planner thinks that nearly all of them > will be (and that causes it to not think that the LIMIT is going to > affect anything, so it doesn't prefer a fast-start plan). > > Since you've not told us anything about the statistics of these tables, > it's hard to speculate as to why the estimate is off. > > regards, tom lane >
Most interesting part that NOT EXISTS estimates way off, when LEFT JOIN WHERE ... IS NULL esimated correctly: good esitmate (estimated rows=20504 vs real rows=20760): Game2=# EXPLAIN ANALYZE SELECT * FROM sb_messages messages_tbl LEFT JOIN users users_tbl ON users_tbl.id = messages_tbl.from_user WHERE messages_tbl.type IN (0, 9) AND messages_tbl.visibility_status = 0 AND messages_tbl.not_show_on_air = 'f' AND messages_tbl.clan_id IS NULL AND users_tbl.blocked IS DISTINCT FROM 't'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..24577.74 rows=20504 width=1037) (actual time=0.045..532.012 rows=20760 loops=1) Filter: (users_tbl.blocked IS DISTINCT FROM true) -> Index Scan using sb_messages_special3_key on sb_messages messages_tbl (cost=0.00..3793.75 rows=35784 width=208) (actual time=0.019..67.746 rows=24937 loops=1) -> Index Scan using sb_users_pkey on users users_tbl (cost=0.00..0.53 rows=1 width=829) (actual time=0.007..0.009 rows=1 loops=24937) Index Cond: (users_tbl.id = messages_tbl.from_user) Total runtime: 563.944 ms bad estimate (estimated 1 vs real rows=20760): Game2=# EXPLAIN (ANALYZE, COSTS) SELECT * FROM sb_messages messages_tbl WHERE (messages_tbl.type IN (0, 9) AND messages_tbl.visibility_status=0 AND messages_tbl.not_show_on_air='f' AND messages_tbl.clan_id IS NULL) AND NOT EXISTS (SELECT 1 FROM users users_tbl WHERE blocked='t' and users_tbl.id = messages_tbl.from_user); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Anti Join (cost=0.00..24488.28 rows=1 width=208) (actual time=0.044..430.645 rows=20760 loops=1) -> Index Scan using sb_messages_special3_key on sb_messages messages_tbl (cost=0.00..3793.75 rows=35784 width=208) (actual time=0.020..67.810 rows=24937 loops=1) -> Index Scan using sb_users_pkey on users users_tbl (cost=0.00..0.53 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=24937) Index Cond: (users_tbl.id = messages_tbl.from_user) Filter: users_tbl.blocked Total runtime: 461.296 ms What is curious that not exists always perform 20% faster (I performed both explains like 10 times each and each time not exits is close to 20% faster). -- Maxim Boguk Senior Postgresql DBA.