Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread panam
I'd like to thank you all for getting this analyzed, especially Tom! Your rigor is pretty impressive. Seems like otherwise it'd impossible to maintain a DBS, though. In the end, I know a lot more of postgres internals and that this idiosyncrasy (from a user perspective) could happen again. I guess

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Ross J. Reedstrom
On Wed, Jun 01, 2011 at 04:58:36PM -0400, Robert Haas wrote: > On Wed, Jun 1, 2011 at 4:47 PM, Tom Lane wrote: > > Robert Haas writes: > >> I guess the real issue here is that m1.id < m2.id has to be evaluated > >> as a filter condition rather than a join qual. > > > > Well, if you can invent an

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Robert Haas
On Wed, Jun 1, 2011 at 4:47 PM, Tom Lane wrote: > Robert Haas writes: >> I guess the real issue here is that m1.id < m2.id has to be evaluated >> as a filter condition rather than a join qual. > > Well, if you can invent an optimized join technique that works for > inequalities, go for it ... but

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Tom Lane
=?ISO-8859-1?Q?C=E9dric_Villemain?= writes: > exact, thanks to your last email I read more the code and get the same > conclusion and put it in a more appropriate place : before > ExecScanHashBucket. > I was about sending it, so it is attached. Applied with cosmetic adjustments.

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Tom Lane
Robert Haas writes: > I guess the real issue here is that m1.id < m2.id has to be evaluated > as a filter condition rather than a join qual. Well, if you can invent an optimized join technique that works for inequalities, go for it ... but I think you should get at least a PhD thesis out of that.

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Robert Haas
On Wed, Jun 1, 2011 at 4:35 PM, Tom Lane wrote: > Robert Haas writes: >> On Wed, Jun 1, 2011 at 4:25 PM, Tom Lane wrote: >>> Because of the way that a bitmap heap scan works, the rows are >>> guaranteed to be loaded into the hash table in physical order, which >>> means (in the fast case) that t

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Tom Lane
Robert Haas writes: > On Wed, Jun 1, 2011 at 4:25 PM, Tom Lane wrote: >> Because of the way that a bitmap heap scan works, the rows are >> guaranteed to be loaded into the hash table in physical order, which >> means (in the fast case) that the row with the largest "id" value gets >> loaded last.

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Cédric Villemain
2011/6/1 Tom Lane : > We do need to look into putting a CHECK_FOR_INTERRUPTS call in here > somewhere, though.  I'm inclined to think that right before the > ExecScanHashBucket is the best place.  The reason that nest and merge > joins don't show a comparable non-responsiveness to cancels is that t

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Robert Haas
On Wed, Jun 1, 2011 at 4:25 PM, Tom Lane wrote: > Because of the way that a bitmap heap scan works, the rows are > guaranteed to be loaded into the hash table in physical order, which > means (in the fast case) that the row with the largest "id" value gets > loaded last.  And because ExecHashTable

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Tom Lane
Robert Haas writes: > On Tue, May 31, 2011 at 11:47 PM, Tom Lane wrote: >> I'd just write it off as being a particularly stupid way to find the >> max(), except I'm not sure why deleting just a few thousand rows >> improves things so much.  It looks like it ought to be an O(N^2) >> situation, so

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread panam
Tom Lane-2 wrote: > > It looks like it ought to be an O(N^2) > situation, so the improvement should be noticeable but not amazing. > Hm, the performance was reasonable again when doing a cluster... So I believe this should be more a technical than an algorithmical/complexity issue. Maybe it is

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-06-01 Thread Robert Haas
On Tue, May 31, 2011 at 11:47 PM, Tom Lane wrote: > Robert Haas writes: >> With respect to the root of the issue (why does the anti-join take so >> long?), my first thought was that perhaps the OP was very unlucky and >> had a lot of values that hashed to the same bucket.  But that doesn't >> app

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Tom Lane
Robert Haas writes: > With respect to the root of the issue (why does the anti-join take so > long?), my first thought was that perhaps the OP was very unlucky and > had a lot of values that hashed to the same bucket. But that doesn't > appear to be the case. Well, yes it is. Notice what the su

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Robert Haas
On Tue, May 31, 2011 at 9:11 PM, Cédric Villemain wrote: > Oh damned, I am currently with an eeepc, I'll need 2 days to bench that :-D > I'll see tomorow. LOL. With respect to the root of the issue (why does the anti-join take so long?), my first thought was that perhaps the OP was very unlucky

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Cédric Villemain
2011/6/1 Robert Haas : > On Tue, May 31, 2011 at 8:43 PM, Cédric Villemain > wrote: >> Yes, while here I noticed that the query was long to be killed. >> I added a CHECK_FOR_INTERRUPT() in the for(;;) loop in nodeHashjoin.c. >> It fixes the delay when trying to kill but I don't know about >> perfo

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Robert Haas
On Tue, May 31, 2011 at 8:43 PM, Cédric Villemain wrote: > Yes, while here I noticed that the query was long to be killed. > I added a CHECK_FOR_INTERRUPT() in the for(;;) loop in nodeHashjoin.c. > It fixes the delay when trying to kill but I don't know about > performance impact this can have in

Re: [HACKERS] [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Cédric Villemain
2011/5/31 Robert Haas : > On Thu, May 26, 2011 at 8:33 AM, panam wrote: >> Any third party confirmation? > > Yeah, it definitely looks like there is some kind of bug here.  Or if > not a bug, then a very surprising feature.  EXPLAIN ANALYZE outputs > from your proposed test attached.  Here's a uni