I wrote: > Hmm, I came across that paper while doing background reading. Okay, > now I get that "% (filter->nbits - 1)" is the second hash function in > that scheme. But now I wonder if that second function should actually > act on the passed "value" (the original hash), so that they are > actually independent, as required. In the language of that paper, the > patch seems to have > > g(x) = h1(x) + i*h2(h1(x)) + f(i) > > instead of > > g(x) = h1(x) + i*h2(x) + f(i) > > Concretely, I'm wondering if it should be: > > big_h = DatumGetUint32(hash_uint32(value)); > h = big_h % filter->nbits; > -d = big_h % (filter->nbits - 1); > +d = value % (filter->nbits - 1); > > But I could be wrong.
I'm wrong -- if we use different operands to the moduli, we throw away the assumption of co-primeness. But I'm still left wondering why we have to re-hash the hash for this to work. In any case, there should be some more documentation around the core algorithm, so that future readers are not left scratching their heads. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services