On Mon, Mar 26, 2012 at 5:43 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Hm.  This illustrates that it's not too prudent to rely on a default
> numdistinct estimate to decide that a hash aggregation is safe :-(.
> We had probably better tweak the cost estimation rules to not trust
> that.  Maybe, if we have a default estimate, we should take the worst
> case estimate that the column might be unique?  That could still burn
> us if the rowcount estimate was horribly wrong, but those are not nearly
> as shaky as numdistinct estimates ...
Perhaps we should have two work_mem settings -- one for the target to aim
for, and one for a hard(er) limit that we should ensure the worst case
falls under?

I have a sketch for how to handle spilling hash aggregates to disk in my
head (roughly along the lines of the toy sketch below). I'm not sure
whether it's worth the amount of complexity it would require, but I'll
poke around a bit and see if it works out well.
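Very roughly, and only to illustrate the shape of the idea rather than
propose actual executor code: aggregate in memory until the table reaches
the softer target, route tuples for overflow keys to on-disk partitions by
hash, then re-aggregate each partition in a later pass. Names like
soft_limit and num_partitions below are made up for the example, and the
harder limit isn't modeled at all.

    # Toy sketch only -- not PostgreSQL code.  soft_limit plays the role of
    # the softer work_mem target; tuples whose key doesn't fit in the
    # in-memory table are spilled to disk partitions and re-aggregated later.
    import pickle
    import tempfile
    from collections import defaultdict

    def hash_aggregate(rows, soft_limit=1000, num_partitions=4):
        """Sum values per key, spilling overflow keys to disk partitions."""
        table = defaultdict(int)
        partitions = [tempfile.TemporaryFile() for _ in range(num_partitions)]
        spilled = False

        for key, value in rows:
            if key in table or len(table) < soft_limit:
                table[key] += value                  # aggregate in memory
            else:
                spilled = True                       # overflow: spill the tuple
                pickle.dump((key, value),
                            partitions[hash(key) % num_partitions])

        yield from table.items()                     # emit in-memory groups

        if spilled:
            for part in partitions:
                part.seek(0)
                batch = []
                while True:
                    try:
                        batch.append(pickle.load(part))
                    except EOFError:
                        break
                # Each pass retires up to soft_limit distinct keys, so the
                # spilled partitions shrink and the recursion terminates.
                yield from hash_aggregate(batch, soft_limit, num_partitions)
        for part in partitions:
            part.close()

    # 5000 distinct keys with soft_limit=1000 forces several spill passes.
    rows = [(i % 5000, 1) for i in range(20000)]
    print(sum(v for _, v in hash_aggregate(rows)))   # prints 20000

The policy questions are when to start spilling and how to size the
partitions, which is where the target vs. hard-limit split above would
come in.

--
greg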