Bob Rogers wrote:

Below please find an additional test case for t/pmc/hash.t that defines
50K keys, while checking that earlier entries are still present.  This
takes about 0.8 sec on my 1.8GHz AMD box, tunable by increasing I22.  Is
this the sort of thing you had in mind?

Yeah, thanks. OTOH it's failing here; I'll have to look at it, then it'll go in. Another test case which additionally deletes keys would be great.
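
For readers without the patch at hand, here is a rough C analogue of the test's shape - a minimal sketch using POSIX hsearch() purely for illustration, not the actual t/pmc/hash.t code (which drives the Hash PMC from PASM): insert a large number of keys and re-check an earlier entry on every insertion. The N constant plays the role of the I22 knob mentioned above; the deleting variant would interleave deletions with the same re-checks (not shown here, since hsearch has no deletion).

    #define _XOPEN_SOURCE 600
    #include <assert.h>
    #include <search.h>
    #include <stdio.h>
    #include <string.h>

    enum { N = 50000 };          /* the "I22" knob: raise it to stress the hash harder */

    int main(void)
    {
        char buf[32];

        assert(hcreate(2 * N)); /* fixed-capacity table is enough for this sketch */

        for (long i = 0; i < N; i++) {
            ENTRY e, probe, *found;

            /* insert key "key<i>" with its index as the value */
            sprintf(buf, "key%ld", i);
            e.key  = strdup(buf);
            e.data = (void *)i;
            assert(hsearch(e, ENTER) != NULL);

            /* re-check an earlier entry on every insertion */
            sprintf(buf, "key%ld", i / 2);
            probe.key  = buf;
            probe.data = NULL;
            found = hsearch(probe, FIND);
            assert(found != NULL && (long)found->data == i / 2);
        }
        puts("ok - earlier entries still present after 50000 inserts");
        return 0;
    }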

One question:  Why is there a space to store computed hash codes in
struct parrot_string_t but not in HashBucket?

Having it in the bucket is a space penalty for other objects that can provide a hash value easily, e.g. Integer PMCs. More complicated objects like strings can therefore cache their hash value inside their own structure.
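
To make the trade-off concrete, here is a hypothetical sketch in C - the field names are illustrative, not Parrot's actual declarations: a cached hash in every bucket costs one extra word per entry regardless of key type, while a slot in the string header puts that cost only on keys that are expensive to rehash.

    typedef unsigned long hashval_t;

    /* Bucket without a cached hash: three pointers per entry. */
    typedef struct bucket_lean {
        struct bucket_lean *next;
        void *key;
        void *value;
    } bucket_lean;

    /* Bucket with a cached hash: one extra word for *every* entry,
     * even when the key (e.g. an Integer PMC) can rehash itself cheaply. */
    typedef struct bucket_cached {
        struct bucket_cached *next;
        void *key;
        void *value;
        hashval_t hash;      /* cached result of the key's hash computation */
    } bucket_cached;

    /* Caching inside the string header instead puts the cost only on
     * objects whose hash is expensive to recompute. */
    typedef struct string_like {
        char     *strstart;
        unsigned  bufused;
        hashval_t hashval;   /* 0 (or a flag) means "not computed yet" */
    } string_like;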

...  Part of the reason I am
asking is because I have started thinking about how to implement Common
Lisp hashes, for which (a) keys can be arbitrary objects, (b) object
equivalence is not defined in terms of stringification, and (c) hash
computation is not necessarily cheap.

It's like in Python. The Py PMCs already implement a hash vtable - albeit a slightly buggy one for double and complex.

...  The standard technique is
therefore to store each key's computed hash in the bucket, which seems
like it would be a win for Parrot as well.

We can do that too, but it's more or less an optimization, which needs some existing real-life code to benchmark.
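
For reference, the payoff of such a per-bucket cache would be a cheap word comparison on lookup before the (possibly expensive) key equality test ever runs. A minimal sketch, with bucket_cached (same layout as above), hash_key() and keys_equal() as hypothetical stand-ins rather than Parrot functions:

    #include <stddef.h>

    typedef unsigned long hashval_t;

    typedef struct bucket_cached {
        struct bucket_cached *next;
        void *key;
        void *value;
        hashval_t hash;
    } bucket_cached;

    void *
    lookup(bucket_cached *chain, void *key,
           hashval_t (*hash_key)(void *),
           int (*keys_equal)(void *, void *))
    {
        const hashval_t h = hash_key(key);  /* computed once per lookup */

        for (; chain; chain = chain->next) {
            /* Cheap word compare first; the possibly expensive equality
             * test runs only when the cached hashes match. */
            if (chain->hash == h && keys_equal(chain->key, key))
                return chain->value;
        }
        return NULL;
    }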

I don't see the difference WRT shareable,
It sounds like Uri meant that it could be mapped at different addresses
for different processes . . . but maybe not, because then the key and
value pointers would be useless.

A shared PMC exists just once; there is no address mapping. Different processes of course need distinct PMC data - we can't share data, just code.

This may be too obvious to be worth pointing out, but if hash internals
are made persistent, then the hash code computation needs to be
invariant from one interpreter instantiation to another.  For this
reason, the "seed" member of struct _hash is probably counterproductive.

Or the seed is frozen too.
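
The point being that the bucket index of a key depends on the seed, so a frozen table is only usable if either the seed travels with it or every key is rehashed on thaw. A small illustration in C, with seed_hash() as an arbitrary seeded string hash (one-at-a-time style), not Parrot's:

    #include <stddef.h>

    typedef unsigned long hashval_t;

    static hashval_t
    seed_hash(const char *s, size_t len, hashval_t seed)
    {
        hashval_t h = seed;          /* the seed perturbs every result */
        while (len--) {
            h += (unsigned char)*s++;
            h += h << 10;
            h ^= h >> 6;
        }
        h += h << 3;
        h ^= h >> 11;
        h += h << 15;
        return h;
    }

    /* Same key, different seed => (almost always) a different bucket,
     * so a table frozen under one seed is unusable under another. */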

Not to mention the fact that initializing it randomly would make it
harder to test hash iteration, since the key enumeration order would
then be nondeterministic.

Well, the reason *is* exactly to make it nondeterministic. The perl5 archives should have a lot of discussion WRT that; keyword: DoS attacks.

   FWIW, Common Lisp defines an SXHASH function [1] that must exhibit
just this invariance characteristic, and must also compute a definite
answer for circular structures.  Time to revive the "hash" opcode?

It depends on how we define the equality of objects or aggregates, i.e. if or how deeply that might recurse. If it does, then the hash *vtable* needs a change and has to cooperate with the visit vtable, like freeze or thaw do. But that would need a change to is_equal too.
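
One way such a recursive hash can still terminate on circular structures - in the spirit of what SXHASH implementations typically do - is to bound the recursion depth, so the result stays well defined (and invariant) even for cycles. A sketch under that assumption, with node as an illustrative aggregate type rather than anything from the visit vtable:

    typedef unsigned long hashval_t;

    typedef struct node {
        hashval_t    scalar;       /* hash contribution of this node's own data */
        struct node *children[2];
    } node;

    static hashval_t
    sxhash_like(const node *n, int depth)
    {
        hashval_t h;

        if (n == NULL || depth <= 0)
            return 0;              /* cut-off keeps cycles finite */

        h = n->scalar;
        /* Mix in children only to a fixed depth; structurally equal
         * objects still hash alike, which is what equality requires. */
        h = h * 31 + sxhash_like(n->children[0], depth - 1);
        h = h * 31 + sxhash_like(n->children[1], depth - 1);
        return h;
    }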

[1]  http://www.lispworks.com/documentation/HyperSpec/Body/f_sxhash.htm

Yep.

leo
