Re: Bloom filtering

Ted Dunning Mon, 19 Apr 2010 17:57:02 -0700

On Mon, Apr 19, 2010 at 5:18 PM, Benson Margulies <[email protected]> wrote:


> >
> >
> > >
> > > It's the conceptual model I'd like to understand here. In my
> > > 'understanding', bloom filters work because each hash function grabs a
> > > different picture of the total information content of the original key.
> > >
> >
> > A good hash does this if you have different seeds.
> >
> >
> What bothered me about the code I was reading was that there seemed to me to
> be no different seeds in the relevant sense. The code calculated a hash
> using a Rabin fingerprint. That's a 32 bit number. Then it rehashed the 32
> bit number using different seeds.
>
> Maybe the point is that the initial 32-bit hash has enough 'stuff' in it to
> generate multiple independent hashes if different seeds are in that second
> hash pass.
>

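Hmm... that doesn't sound so good. 32 bits isn't much for a Bloom filter
where you are hashing 10 times and your table can be big as well. Any two
keys that share the 32-bit fingerprint will also share all 10 bit positions,
no matter how the second pass is seeded.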
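
To make the concern concrete, here is a rough Python sketch, not the code
Benson was reading. The names are made up for illustration, truncated SHA-256
stands in for the Rabin fingerprint, and a seed-prefixed SHA-256 stands in for
the seeded second pass. It uses a 16-bit fingerprint only so that a collision
turns up quickly; the same thing happens at 32 bits, just less often.

```python
import hashlib
import math


def fingerprint(key: bytes, bits: int = 32) -> int:
    # Assumption: truncated SHA-256 as a stand-in for the Rabin fingerprint.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << bits) - 1)


def seeded_hash(data: bytes, seed: int) -> int:
    # Assumption: a generic seeded hash, built by prefixing the seed.
    digest = hashlib.sha256(seed.to_bytes(4, "big") + data).digest()
    return int.from_bytes(digest[:8], "big")


def probes_via_fingerprint(key: bytes, k: int, m: int, bits: int = 32) -> list:
    # Two-pass scheme from the thread: fingerprint first, then rehash the
    # fingerprint with k different seeds.
    fp = fingerprint(key, bits).to_bytes(8, "big")
    return [seeded_hash(fp, seed) % m for seed in range(k)]


def probes_direct(key: bytes, k: int, m: int) -> list:
    # Alternative: apply each seeded hash to the full key.
    return [seeded_hash(key, seed) % m for seed in range(k)]


def find_fingerprint_collision(bits: int):
    # Brute-force search for two distinct keys with the same fingerprint.
    seen = {}
    i = 0
    while True:
        key = b"key-%d" % i
        fp = fingerprint(key, bits)
        if fp in seen:
            return seen[fp], key
        seen[fp] = key
        i += 1


def fingerprint_collision_floor(n_keys: int, bits: int = 32) -> float:
    # Chance that a key not in the set matches the fingerprint of at least
    # one of n_keys stored keys, assuming uniformly distributed fingerprints.
    return -math.expm1(n_keys * math.log1p(-(2.0 ** -bits)))


if __name__ == "__main__":
    k, m = 10, 1 << 20
    a, b = find_fingerprint_collision(bits=16)
    print(probes_via_fingerprint(a, k, m, 16) == probes_via_fingerprint(b, k, m, 16))
    print(probes_direct(a, k, m) == probes_direct(b, k, m))
    print(fingerprint_collision_floor(10_000_000, 32))
```

Under those assumptions, the two colliding keys get identical probe lists in
the two-pass scheme, so the filter cannot tell them apart however many seeds
are used. With 10 million keys and a 32-bit fingerprint, the collision floor
comes out at roughly 0.2%, which is already above the roughly 0.1% false
positive rate you would otherwise expect from a well-sized filter with 10
hashes.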
