On Wed, May 27, 2009 at 8:54 PM, Jeff Davis <pg...@j-davis.com> wrote:
> If you're putting a LIMIT on it, why does it return millions of results?

It *doesn't* return millions of results with the LIMIT on it. It just does a
sequential scan of the table and doesn't find any results until it gets to
the last quarter of the table. Sequentially scanning through 3/4 of the huge
table before it gets a single match takes a very long time.

As I said in my original post, Postgres's approach would be completely
reasonable in this case *if* the rows it was looking for were sprinkled
randomly throughout the table. But they're *not* in this case -- they're all
at the end.

> Can you pick out an interesting query and give some specifics, like:
> * the query
> * the EXPLAIN ANALYZE output (or EXPLAIN without ANALYZE if it takes too
>   long to even run once)
> * EXPLAIN ANALYZE output if you force the index scan
> * the statistics for the relevant columns, such as histogram and
>   correlation

As I mentioned, the situation is very simple, and it's easy to understand
what is going on. There's absolutely no mystery about why Postgres is doing
what it's doing: 25% of the table matches the query, Postgres *knows* this
from the column statistics, and so it decides on a sequential scan.
Unfortunately, in this case, that's a poor approach.

|>ouglas
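
P.S. For anyone who wants to reproduce the shape of the problem, here's a
minimal sketch. The table and column names are made up, and it's scaled
down to 4M rows so it runs in reasonable time:

    -- 4M rows; the 25% that match the query are all at the physical end
    -- of the heap, exactly the layout described above
    CREATE TABLE events (id bigserial PRIMARY KEY, status text);
    INSERT INTO events (status)
    SELECT CASE WHEN g <= 3000000 THEN 'done' ELSE 'pending' END
    FROM generate_series(1, 4000000) AS g;
    CREATE INDEX ON events (status);
    ANALYZE events;

    -- the planner estimates ~25% selectivity, assumes the matches are
    -- spread uniformly, and so expects a seq scan to satisfy the LIMIT
    -- almost immediately -- even though here it must read 3M rows first
    EXPLAIN SELECT * FROM events WHERE status = 'pending' LIMIT 10;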
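
The estimate the planner is working from is visible in pg_stats. (For a
two-value column like this hypothetical one, the 25% shows up in the
most-common-values list rather than the histogram, but it's the same
mechanism either way.)

    SELECT attname, n_distinct, correlation,
           most_common_vals, most_common_freqs
    FROM pg_stats
    WHERE tablename = 'events' AND attname = 'status';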