Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-05-09 Thread Robert Haas
On Sat, May 9, 2009 at 7:00 PM, Tom Lane wrote: > I wrote: >> ... So it appears to me that instead of taking an average-case correction >> as is done in this patch and the old coding, we have to explicitly model >> the matched-tuple and unmatched-tuple cases separately. > > I've applied the attach

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-05-09 Thread Tom Lane
I wrote: > ... So it appears to me that instead of taking an average-case correction > as is done in this patch and the old coding, we have to explicitly model > the matched-tuple and unmatched-tuple cases separately. I've applied the attached patch that does things this way. I did not do anythin

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-25 Thread Robert Haas
On Fri, Apr 24, 2009 at 10:49 PM, Tom Lane wrote: > Robert Haas writes: >> As far as I can tell, the focus on trying to estimate the number of >> tuples per bucket is entirely misguided.  Supposing the relation is >> mostly unique so that the values don't cluster too much, the right >> answer is

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-25 Thread Robert Haas
On Sat, Apr 25, 2009 at 6:42 AM, Grzegorz Jaskiewicz wrote: > On 25 Apr 2009, at 04:52, Robert Haas wrote: >> blow the hash-join plan out of the water anyway... but Stephen Frost >> was telling me at JDcon East that he sometimes sets it to something >> like 8GB when he's the only user on his appar

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-25 Thread Grzegorz Jaskiewicz
On 25 Apr 2009, at 04:52, Robert Haas wrote: blow the hash-join plan out of the water anyway... but Stephen Frost was telling me at JDcon East that he sometimes sets it to something like 8GB when he's the only user on his apparently-quite-awesome hardware...) For the record, because most querie

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-24 Thread Robert Haas
On Fri, Apr 24, 2009 at 10:49 PM, Tom Lane wrote: > Robert Haas writes: >> As far as I can tell, the focus on trying to estimate the number of >> tuples per bucket is entirely misguided.  Supposing the relation is >> mostly unique so that the values don't cluster too much, the right >> answer is

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-24 Thread Tom Lane
Robert Haas writes: > As far as I can tell, the focus on trying to estimate the number of > tuples per bucket is entirely misguided. Supposing the relation is > mostly unique so that the values don't cluster too much, the right > answer is (of course) NTUP_PER_BUCKET. But the entire point of tha

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-24 Thread Robert Haas
On Fri, Apr 24, 2009 at 8:09 PM, Tom Lane wrote: > So it appears to me that instead of taking an average-case correction > as is done in this patch and the old coding, we have to explicitly model > the matched-tuple and unmatched-tuple cases separately.  For hashjoins, > what I think we should do

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-24 Thread Tom Lane
Robert Haas writes: > Upon further review, it appears that a big part of this problem is > that cost_hashjoin() doesn't understand that it needs cost semi-joins > differently from inner or left joins. > ... > The planner costs the semi-join as two orders of magnitude more > expensive than the hash

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-16 Thread Robert Haas
> If HashAggregate is faster, then the question is can you make it better > by avoiding building the hash structure twice.  I haven't considered all > the possibilities, but the situation you have used as an example, an IN > query, seems workable.  Instead of translating to a hash > aggregate/hash/

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-16 Thread Robert Haas
On Thu, Apr 16, 2009 at 7:26 PM, Tom Lane wrote: > Robert Haas writes: >> Upon further review, it appears that a big part of this problem is >> that cost_hashjoin() doesn't understand that it needs cost semi-joins >> differently from inner or left joins. > > Yeah, I have a note to look into that

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-16 Thread Tom Lane
Robert Haas writes: > Upon further review, it appears that a big part of this problem is > that cost_hashjoin() doesn't understand that it needs cost semi-joins > differently from inner or left joins. Yeah, I have a note to look into that before 8.4 final. The same is true for nestloops: stoppin

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-16 Thread Lawrence, Ramon
> Upon further review, it appears that a big part of this problem is > that cost_hashjoin() doesn't understand that it needs cost semi-joins > differently from inner or left joins. The bogus logic looks to be > right here: > startup_cost += hash_qual_cost.startup; > run_cost += hash_qual_c

Re: [HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-15 Thread Robert Haas
On Sun, Apr 12, 2009 at 12:00 AM, Robert Haas wrote: > On Thu, Feb 19, 2009 at 5:53 PM, Tom Lane wrote: >> Greg Stark writes: >>> It's tempting to have Hash cheat and just peek at the node beneath it >>> to see if it's a HashAggregate, in which case it could call a special >>> method to request

[HACKERS] HashJoin w/option to unique-ify inner rel

2009-04-11 Thread Robert Haas
On Thu, Feb 19, 2009 at 5:53 PM, Tom Lane wrote: > Greg Stark writes: >> It's tempting to have Hash cheat and just peek at the node beneath it >> to see if it's a HashAggregate, in which case it could call a special >> method to request the whole hash. But it would have to know that it's >> just