On 12/19/2016 01:27 AM, Andres Valloud wrote:
> At first glance, that the failure code only sees "two" things when it
> should see "eight" seems to be problematic.
> Perhaps the primitive insists on hashing byte objects, and there is a
> distinction between "byte" objects and "word" objects (whatever "word"
> means, presumably a constant width integer across all platforms). I
> haven't looked at the code.
> From my perspective... back in the day that primitive used to hash
> bytes, and from what I saw here the failure code is hashing multi-byte
> things. If all of these observations are correct, then I'd say the
> failure code isn't doing what the primitive is doing, and in doing so
> it's introducing a lot of collisions that I'd like to believe the
> intended hash function wouldn't produce.
Ah, I see your concern.
As far as I can see, all classes that are using the StringHash primitive
are actually byte objects, so things are, I believe, working as designed.
The only problem is that Ben did an experiment to see whether Float's
hashing would be improved by using the StringHash primitive. It was
not, because Float is not a byte object.
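To make the size mismatch concrete, here is a rough sketch in Python rather than Smalltalk. The helper names are made up for this example, and it assumes a 64-bit IEEE 754 double stored as two 32-bit words. Code that iterates over the elements of such a word object sees two things, where a byte-oriented hash would see eight.

```python
import struct

def float_words(x):
    """View a 64-bit double as two 32-bit words (big-endian order assumed)."""
    return list(struct.unpack(">2I", struct.pack(">d", x)))

def float_bytes(x):
    """View the same 64-bit double as eight individual bytes."""
    return list(struct.pack(">d", x))

print(len(float_words(1.5)))  # 2 elements when viewed as words
print(len(float_bytes(1.5)))  # 8 elements when viewed as bytes
```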
We could use an equivalent primitive to hash word objects, but I haven't
found one.
We could also use a primitive to retrieve the bytes of a word object,
and I haven't found one of those either. Some existing code converts
the words to large integers and then hashes those, which would work
for Floats.
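The integer route could look roughly like the following, again as a Python sketch and not image code. Everything here is an assumption for illustration: the function names are hypothetical, and FNV-1a stands in for a byte-wise hash; it is not the algorithm the StringHash primitive uses.

```python
import struct

def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here only as a stand-in byte-wise hash."""
    h = 2166136261                       # FNV offset basis
    for b in data:
        h ^= b
        h = (h * 16777619) & 0xFFFFFFFF  # FNV prime, truncated to 32 bits
    return h

def float_hash_via_integer(x: float) -> int:
    """Combine the two 32-bit words into one integer, then hash its bytes."""
    high, low = struct.unpack(">2I", struct.pack(">d", x))
    as_int = (high << 32) | low          # the "large integer" form of the float
    return fnv1a_32(as_int.to_bytes(8, "big"))

print(hex(float_hash_via_integer(1.5)))
```

Since the integer carries all eight bytes, this gives the same result as hashing the float's bytes directly, so no information is lost on the way to the hash function.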
-Martin