That doesn't bother my intuition so much, since the two hashes are *different*. Or maybe I'm not following the implications of the linear conbination.
It's the conceptual model I'd like to understand here. In my 'understanding', bloom filters work because each hash function grabs a different picture of the total information content of the original key. Generally, I am indeed feeling like a bear of rapidly shrinking brain, since that page is at pains to insist on two independent has functions. On Mon, Apr 19, 2010 at 6:10 PM, Ted Dunning <[email protected]> wrote: > This gets worse before it gets better. > > You can actually use a linear combination of two hashes (for Bloom filters, > not in general). > > See http://en.wikipedia.org/wiki/Double_hashing > > > > On Mon, Apr 19, 2010 at 2:48 PM, Benson Margulies <[email protected] > >wrote: > > > I just had to set up a Bloom filter. I found a seemingly cogent > > implementation by one Bruno Martins, but it has a design decision that I > > found difficult to swallow. > > > > To get the multiple hashes, he: > > > > 1: used Rabin fingerprints to get an hash. > > 2: rehashed that hash with varying seeds to get the multiple hashes. > > > > In bear-of-little-brain mode, this struck me as approximately nuts. The > > initial hash already threw out a ton of information. So how can messing > > with > > it further give truly distinctive slices? > > >
