On Sat, Jan 9, 2016 at 4:08 PM, Peter Geoghegan <p...@heroku.com> wrote:
> Also, have you considered Hash join conditions with multiple
> attributes as a special case? I'm thinking of cases like this:
Sorry, accidentally fat-fingered my enter key before I was finished drafting that mail.

That example isn't useful, because there is no early/cheap elimination of tuples while scanning the outer relation. My point about bloom filters on only one attribute (or perhaps multiple bloom filters on multiple attributes) is that you might be able to make the bloom filter more effective, particularly relative to the amount of memory available, by only building it for one attribute where that distinction (one attribute vs. multiple) happens to exist.

Simon said: "So the objective must be to get a Bloom Filter that is small enough that it lives in a higher/faster level of cache than the main Hash table". I agree that that's really important here, although I did note that Tomas wasn't so sure, emphasizing the importance of avoiding serialization and deserialization overhead -- and certainly, that's why the patch helps some cases a lot. But Simon's point applies more to the worst case than to the best or average cases, and evidently we have plenty of worrying about the worst case left to do here.

So, one attribute could have relatively low cardinality, and thus fewer distinct items in the bloom filter's set, making it far slower to degrade (i.e. to have an increased probability of false positives) than a composite of two or more attributes, and yet perhaps not significantly less useful in terms of its ability to eliminate (de)serialization overhead. Or, with two bloom filters (one per attribute), one bloom filter could degrade far faster than the other (due to having more distinct values), often very unpredictably, and yet it wouldn't matter, because we'd discard the bad one (maybe really early, during the scan of the inner relation, using HLL).

I notice that all these example queries involve less-than operators (inner relation tuples are thereby prevented from being entered into the hash table).
It might not be a terrible idea to hash abbreviated keys (or even truncated abbreviated keys) for the bloom filter in certain cases -- in particular, when there is a correlation in the inner relation attributes (the equi-joined attribute, and the attribute that is the other part of the qual). There might not be a correlation, of course, but with some care hashing abbreviated keys may have virtually no additional overhead, making it worth doing despite the general uncertainty about any correlation existing. If you're going to accept that the bloom filter can't be very large, which I think you must, then hashing an entire value may be no better than hashing an abbreviated key (those 8 bytes tend to have a lot of entropy). This point is rather hand-wavy, though.

I suspect that BLOOM_ERROR_RATE isn't the most useful constraint here. Is the degree to which it helps in sympathetic cases at all commensurate with the degree to which it hurts in unsympathetic cases? In your "low selectivity" cases (the sympathetic cases), there are relatively few values represented in the main hash table, and therefore in the bloom filter, and so a smaller bloom filter results automatically anyway. But in the bad cases, the BLOOM_ERROR_RATE constraint just wastes memory bandwidth for no gain, I think. If the number of elements is so large that a reasonable fixed-size bloom filter has a poor false positive rate anyway, we may have already lost. My sense is that we should try to make the bloom filter as cheap as possible, more so than trying to make sure it's useful before much work has been done.

Is it okay that you don't treat MCV/skew values as special?

Actually, looking at this code, I think I notice something:

@@ -238,6 +238,15 @@ ExecHashJoin(HashJoinState *node)
 							  hashvalue);
 				node->hj_CurTuple = NULL;

+				/* If still in the first batch, we check the bloom filter. */
+				if ((hashtable->curbatch == 0) &&
+					(!ExecHashBloomCheckValue(hashtable, hashvalue)))
+				{
+					/* no matches; check for possible outer-join fill */
+					node->hj_JoinState = HJ_FILL_OUTER_TUPLE;
+					continue;
+				}

Haven't we already established by now that the value of "node->hj_CurSkewBucketNo" may not be INVALID_SKEW_BUCKET_NO, and so the value certainly was found in the inner relation scan? Why bother doing anything with the bloom filter in that common case? (Note that I'm not suggesting that we don't establish "node->hj_CurSkewBucketNo" first.)

Also, are you aware of this?

http://www.nus.edu.sg/nurop/2010/Proceedings/SoC/NUROP_Congress_Cheng%20Bin.pdf

It talks about bloom filters for hash joins in PostgreSQL specifically. Interestingly, they talk about specific TPC-H queries.

BTW, I think code like this needs better comments:

+static BloomFilter
+BloomFilterInit(double nrows, double error)
+{
+	/* perhaps we should round nbits to multiples of 8 ? */
+	int nbits = ceil((nrows * log(error)) / log(1.0 / (pow(2.0, log(2.0)))));
+	int nhashes = round(log(2.0) * nbits / nrows);
+
+	BloomFilter filter = palloc0(offsetof(BloomFilterData, data) + ((nbits + 7) / 8));
+
+	filter->nbits = nbits;
+	filter->nhashes = nhashes;

Have you experimentally verified that you get the expected probability of false positives in practice?

--
Peter Geoghegan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers