Thanks Robert, this is what I was looking for. I will try these suggestions and follow up if any of them are the silver bullet.
On Fri, Jan 14, 2011 at 7:11 AM, Robert Haas wrote: > On Thu, Jan 6, 2011 at 4:36 PM, Mike Broers <mbro...@gmail.com> wrote: > > Thanks for the assistance. > > Here is an explain analyze of the query with the problem limit: > > production=# explain analyze select * from landing_page.messages where > > ((messages.topic = E'x') AND (messages.processed = 'f')) ORDER BY > > messages.created_at ASC limit 10; > > > > QUERY PLAN > > > > > -------------------------------------------------------------------------------------------------------------- > > ------------------------------------------------------------ > > Limit (cost=0.00..2891.06 rows=10 width=1340) (actual > > time=207922.586..207922.586 rows=0 loops=1) > > -> Index Scan using idx_landing_page_messages_created_at on messages > > (cost=0.00..449560.48 rows=1555 widt > > h=1340) (actual time=207922.581..207922.581 rows=0 loops=1) > > Filter: ((NOT processed) AND ((topic)::text = 'x'::text)) > > Total runtime: 207949.413 ms > > (4 rows) > > You're not the first person to have been bitten by this. The > optimizer thinks that rows WHERE NOT processed and topic = 'x' are > reasonably common, so it figures that it can just index scan until it > finds 10 of them. But when it turns out that there are none at all, > it ends up having to scan the entire index, which stinks big-time. > > The alternative plan is to use a different index to find ALL the > relevant rows, sort them, and then take the top 10. That would suck > if there actually were tons of rows like this, but there aren't. > > So the root of the problem, in some sense, is that the planner's > estimate of the selectivity of "NOT processed and topic = 'x'" is not > very good. Some things to try: > > - increase the statistics target for the "processed" and "topic" > columns even higher > - put the processed rows in one table and the not processed rows in > another table > - do something like SELECT * FROM (SELECT .. LIMIT 200 OFFSET 0) LIMIT > 10 to try to fool the planner into planning based on the higher, inner > limit > - create a partial index on messages (topic) WHERE NOT processed and > see if the planner will use it > > ...Robert >