Guava library's approach is pretty naive:
Hashing.combineUnordered

It just adds each byte together, I don't think it is better than the one you 
proposed.

BTW, I don't understand why we need c? Isn't a*17+b good enough to avoid the 
corner case?

On 2020/07/15 20:33:55, Haisheng Yuan <hy...@apache.org> wrote: 
> > a) compute xor of the hashes
> > b) compute sum of the hashes
> > c) compute product of the hashes (with 0 replaced by 1 to avoid 0*n=0)
> > Then combine the three values somehow. For instance: (a*17+b)*17+c
> 
> That is really good one. Better than pure XOR.
> I also found this in scala:
> https://github.com/scala/scala/blob/2.11.x/src/library/scala/util/hashing/MurmurHash3.scala#L88
> 
> 
> On 2020/07/15 20:16:13, Vladimir Sitnikov <sitnikov.vladi...@gmail.com> 
> wrote: 
> > > things might be bad if we want do dedup RexNode children using
> > Set<RexNode>
> > 
> > I guess the following might work better than XOR:
> > 
> > a) compute xor of the hashes
> > b) compute sum of the hashes
> > c) compute product of the hashes (with 0 replaced by 1 to avoid 0*n=0)
> > Then combine the three values somehow. For instance: (a*17+b)*17+c
> > 
> > That would result in a hash code that tolerates item reordering, it is
> > better than XOR, and it should be fast.
> > 
> > Unfortunately, we can't use the improved function in Pair since Map.Entry
> > specifies the exact computation formula :-/
> > 
> > Vladimir
> > 
> 

Reply via email to