Ah, that did make a big difference! It went from taking 10x as long to
taking only 1.5x as long (about what I would have expected, if not
better.) Thank you!

On Fri, Feb 17, 2012 at 9:29 PM, Ants Aasma <ants.aa...@eesti.ee> wrote:

> On Feb 17, 2012 8:35 PM, "Alessandro Gagliardi" <alessan...@path.com>
> wrote:
> > Here is the EXPLAIN: http://explain.depesz.com/s/ley
> >
> > I'm using PostgreSQL 9.0.6 on i486-pc-linux-gnu, compiled by GCC
> gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 32-bit
> >
> > My random_page_cost is 2 and yet it still insists on using Seq Scan on
> blocks.
>
> As could be inferred from the row counts, it's slow because its joining
> and then aggregating a quarter of the blocks table. The hash join with its
> sequential scan is probably the correct choice for that type of join, it's
> the join itself that should be optimized out. The optimizer doesn't figure
> out that the join can be turned into a semi join if the output is
> aggregated with distinct and is from only one of the tables (in this case,
> because the output is the join key, it can be from either table).
>
> To make the optimizers job easier you can rewrite it as a semi-join
> explicitly:
> SELECT DISTINCT(user_id) FROM seen_its WHERE EXISTS (SELECT 1 FROM blocks
> WHERE blocks.user_id = seen_its.user_id) AND seen_its.created BETWEEN
> (now()::date - interval '8 days')::timestamp AND now()::date::timestamp
>
> --
> Ants Aasma
>

Reply via email to