On 12/09/2016 09:45 AM, Ben Coman wrote:

The suggested fix depended on being willing to let go of byte ordering (which we might not want, since it could give different results on different platforms):
    Float>>hash
        ^ByteArray
            hashBytes: self
            startingWith: self species hash
This fix contains a good idea, but has at least two problems:
1) The check for integers is important; that's what makes sure that 1.0 has the same hash as 1, which it must, since they compare equal.
2) #hashBytes:startingWith: is designed to hash *bytes*. A BoxedFloat64 is two *words*, so when you hashMultiply, the high-order bits of each word get discarded.
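
Both points can be tried in a workspace. This is only a rough sketch: the block named step is hypothetical, and it assumes that one round of the byte hash adds the element and then multiplies by 1664525 modulo 2^28, which is my reading of hashMultiply rather than something verified against the primitive. The two constants are the high-order words of 0.5 and -0.5.

```smalltalk
"Point 1: objects that compare equal must answer equal hashes."
1.0 = 1.            "true"
1.0 hash = 1 hash.  "must also hold; the integer check is what guarantees it"

"Point 2. Assumption: one hash round is add, then multiply by 1664525
 modulo 2^28. The block name step is made up for this sketch."
| step plusHalf minusHalf |
step := [:hash :element | ((hash + element) * 1664525) bitAnd: 16r0FFFFFFF].

"High-order words of 0.5 and -0.5; they differ only in the sign bit (bit 31)."
plusHalf := step value: 0 value: 16r3FE00000.
minusHalf := step value: 0 value: 16rBFE00000.

"The sign bit lies above bit 27, so the mask removes it and the results collide."
plusHalf = minusHalf.  "true under this model"
```

Under that assumption the collision does not depend on the starting value, since the two inputs differ by 2^31 and any multiple of that vanishes modulo 2^28.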

I tried implementing a hash that actually uses the bytes of the float (code at bottom of this message). I didn't do any real analysis of it, but at least it gives different answers for 0.5 and -0.5. I think it matches the *intent* of the code above (though would still need the integer check, which I didn't bother with).


but anyway that doesn't actually help this case...
    -0.5 hash "==>120484352"
    0.5 hash "==>120484352"

I see Squeak5's float hash has changed...
    Float>>hash "Squeak5"
        (self isFinite and: [self fractionPart = 0.0]) ifTrue: [^self truncated hash].
        ^ ((self basicAt: 1) bitShift: -4) +
          ((self basicAt: 2) bitShift: -4)

This is slightly better: it discards only eight bits out of 64 (the low four of each word) instead of 32. But it would be better to mix all the bits into the hash.
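
To see which bits the Squeak5 version drops, here is a small workspace sketch that uses plain integers in place of the two 32-bit words. The block name squeak5Mix is hypothetical; it only mirrors the non-integer branch of the method quoted above.

```smalltalk
"Plain integers stand in for the two words of a float.
 squeak5Mix is a made-up name for the non-integer branch of the Squeak5 hash."
| squeak5Mix |
squeak5Mix := [:word1 :word2 | (word1 bitShift: -4) + (word2 bitShift: -4)].

"The low four bits of each word never reach the result."
(squeak5Mix value: 16r3FE00000 value: 16r00000000) =
	(squeak5Mix value: 16r3FE0000F value: 16r0000000F).  "true"

"The sign bit now takes part, so the words of 0.5 and -0.5 no longer collide."
(squeak5Mix value: 16r3FE00000 value: 0) =
	(squeak5Mix value: 16rBFE00000 value: 0).  "false"
```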

Note that all of these hash implementations end up manipulating at least one large integer most of the time, since each word can be a large integer. It's possible that a primitive to copy a word object into the equivalent bytes might speed things up somewhat.

Regards,

-Martin


"Strictly experimental method on BoxedFloat64, could probably be made faster even without a primitive."

bytesHash
    | bytes word1 word2 |
    bytes := ByteArray new: 8.
    word1 := self basicAt: 1.
    word2 := self basicAt: 2.
    bytes at: 1 put: (word1 bitShift: -24).
    bytes at: 2 put: ((word1 bitShift: -16) bitAnd: 16rFF).
    bytes at: 3 put: ((word1 bitShift: -8) bitAnd: 16rFF).
    bytes at: 4 put: (word1  bitAnd: 16rFF).
    bytes at: 5 put: (word2 bitShift: -24).
    bytes at: 6 put: ((word2 bitShift: -16) bitAnd: 16rFF).
    bytes at: 7 put: ((word2 bitShift: -8) bitAnd: 16rFF).
    bytes at: 8 put: (word2  bitAnd: 16rFF).
    ^ ByteArray hashBytes: bytes startingWith: self class hash
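
A quick workspace check, assuming the experimental method above has been compiled on BoxedFloat64. It does not include the integer check, so it says nothing about 1.0 versus 1.

```smalltalk
"Assumes bytesHash from this message is installed on BoxedFloat64."
0.5 bytesHash = -0.5 bytesHash.  "should answer false, per the observation earlier in this message"
```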
