On 12/19/2016 01:27 AM, Andres Valloud wrote:
At first glance, it seems problematic that the failure code sees only "two" things when it should see "eight".

Perhaps the primitive insists on hashing byte objects, and there is a distinction between "byte" objects and "word" objects (whatever "word" means, presumably a constant width integer across all platforms). I haven't looked at the code.

From my perspective... back in the day that primitive used to hash bytes, and from what I saw here the failure code is hashing multi-byte things. If all of these observations are correct, then I'd say the failure code isn't doing what the primitive is doing, and in doing so it's introducing a lot of collisions that I'd like to believe the intended hash function wouldn't produce.

Ah, I see your concern.

As far as I can see, all classes that are using the StringHash primitive are actually byte objects, so things are, I believe, working as designed.

The only problem is that Ben ran an experiment to see whether Float's hashing would be improved by using the StringHash primitive. It was not, because Float is not a byte object.

We could use an equivalent primitive to hash word objects, but I haven't found one.

We could also use a primitive to retrieve the bytes of a word object, but I haven't found one of those either. Some places convert the words to large integers and then hash those, which would work for Floats.
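
To make the "two" versus "eight" point concrete, here is a rough sketch in Python rather than Smalltalk. It assumes a Float is a 64-bit IEEE 754 double stored as two 32-bit words, high word first. The helper names are made up for illustration, and FNV-1a is only a stand-in byte hash; it is not what the StringHash primitive computes.

```python
import struct

def float_words(x):
    """Split a double into two 32-bit words, high word first (assumed layout)."""
    return struct.unpack(">2I", struct.pack(">d", x))

def float_bytes(x):
    """Return the same double as eight bytes, most significant byte first."""
    return struct.pack(">d", x)

def fnv1a_32(data):
    """Stand-in byte hash (32-bit FNV-1a); NOT the StringHash primitive."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h

def hash_float_via_bytes(x):
    """Hash all eight bytes, as a byte-oriented hash would expect."""
    return fnv1a_32(float_bytes(x))

def hash_float_via_integer(x):
    """Combine the two words into one large integer, then hash its bytes."""
    hi, lo = float_words(x)
    n = (hi << 32) | lo
    return fnv1a_32(n.to_bytes(8, "big"))
```

Iterating over float_words(x) visits two elements, while iterating over float_bytes(x) visits eight. Going through the large integer reaches the same eight bytes, so in this sketch both routes give the same hash.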

-Martin
