Ah, that did make a big difference! It went from taking 10x as long to taking only 1.5x as long (about what I would have expected, if not better.) Thank you!
On Fri, Feb 17, 2012 at 9:29 PM, Ants Aasma <ants.aa...@eesti.ee> wrote: > On Feb 17, 2012 8:35 PM, "Alessandro Gagliardi" <alessan...@path.com> > wrote: > > Here is the EXPLAIN: http://explain.depesz.com/s/ley > > > > I'm using PostgreSQL 9.0.6 on i486-pc-linux-gnu, compiled by GCC > gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 32-bit > > > > My random_page_cost is 2 and yet it still insists on using Seq Scan on > blocks. > > As could be inferred from the row counts, it's slow because its joining > and then aggregating a quarter of the blocks table. The hash join with its > sequential scan is probably the correct choice for that type of join, it's > the join itself that should be optimized out. The optimizer doesn't figure > out that the join can be turned into a semi join if the output is > aggregated with distinct and is from only one of the tables (in this case, > because the output is the join key, it can be from either table). > > To make the optimizers job easier you can rewrite it as a semi-join > explicitly: > SELECT DISTINCT(user_id) FROM seen_its WHERE EXISTS (SELECT 1 FROM blocks > WHERE blocks.user_id = seen_its.user_id) AND seen_its.created BETWEEN > (now()::date - interval '8 days')::timestamp AND now()::date::timestamp > > -- > Ants Aasma >