On Mon, Apr 19, 2010 at 5:18 PM, Benson Margulies <[email protected]> wrote:

> > > It's the conceptual model I'd like to understand here. In my
> > > 'understanding', bloom filters work because each hash function grabs a
> > > different picture of the total information content of the original key.
> >
> > A good hash does this if you have different seeds.
>
> What bothered me about the code I was reading was that there seemed to me
> to be no different seeds in the relevant sense. The code calculated a hash
> using a Rabin fingerprint. That's a 32 bit number. Then it rehashed the 32
> bit number using different seeds.
>
> Maybe the point is that the initial 32-bit hash has enough 'stuff' in it to
> generate multiple independent hashes if different seeds are in that second
> hash pass.

Hmm.... that doesn't sound so good. 32 bits isn't much for a Bloom filter
where you are hashing 10 times and your table can be big as well.
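
To make the concern concrete, here is a rough Python sketch. It is not the code under discussion, only an illustration under stated assumptions: CRC32 stands in for the Rabin fingerprint, BLAKE2b with a salt stands in for the seeded second-pass hash, and every function name is made up for this example. The fingerprint is narrowed to 16 bits purely so that a colliding pair turns up quickly (the pigeonhole principle guarantees one within 2^16 + 1 keys); at 32 bits the same thing happens, only less often.

```python
import hashlib
import zlib


def fingerprint(key: bytes, bits: int = 32) -> int:
    # Assumption: CRC32 masked to `bits` stands in for the Rabin fingerprint.
    return zlib.crc32(key) & ((1 << bits) - 1)


def seeded_hash(data: bytes, seed: int) -> int:
    # Assumption: a seeded 64-bit hash built from BLAKE2b's salt parameter.
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def probes_via_fingerprint(key: bytes, k: int, m: int, bits: int = 32) -> list:
    # Two-pass scheme: fingerprint first, then rehash the fingerprint k times.
    fp = fingerprint(key, bits).to_bytes(4, "little")
    return [seeded_hash(fp, seed) % m for seed in range(k)]


def probes_direct(key: bytes, k: int, m: int) -> list:
    # One-pass scheme: each seeded hash sees the whole original key.
    return [seeded_hash(key, seed) % m for seed in range(k)]


def find_fingerprint_collision(bits: int):
    # Pigeonhole: 2**bits + 1 distinct keys must contain a colliding pair.
    seen = {}
    for i in range((1 << bits) + 1):
        key = b"key-%d" % i
        fp = fingerprint(key, bits)
        if fp in seen:
            return seen[fp], key
        seen[fp] = key
    raise AssertionError("unreachable: pigeonhole guarantees a collision")


k, m, bits = 10, 1 << 20, 16
a, b = find_fingerprint_collision(bits)

# Same fingerprint means the same k probe positions, whatever k and m are.
same_two_pass = probes_via_fingerprint(a, k, m, bits) == probes_via_fingerprint(b, k, m, bits)
# Hashing the full key keeps the two keys apart.
same_direct = probes_direct(a, k, m) == probes_direct(b, k, m)
print(a, b, same_two_pass, same_direct)

# Rough count of key pairs that share a 32-bit fingerprint among n keys.
n = 1_000_000
print(n * (n - 1) / 2 / 2**32)
```

Once two keys share a fingerprint, the second pass cannot tell them apart, so adding seeds or growing the table does not help. All k hashes together carry at most the 32 bits the fingerprint kept.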
