Jonathan Worthington wrote:
Hi,
Earlier tonight I tried to iterate a hash. I have somewhat thoughtlessly
been doing things like:
PMC *iter = VTABLE_get_iter(interp, some_hash);
while (VTABLE_get_bool(iter)) {
PMC *key_pmc = VTABLE_shift_pmc(interp, iter);
STRING *key = VTABLE_get_string(interp, key_pmc);
...
}
Rather than:
PMC *iter = VTABLE_get_iter(interp, some_hash);
while (VTABLE_get_bool(iter)) {
STRING *key = VTABLE_shift_string(interp, iter);
...
}
So now you're thinking "so what, Jonathan did something stupid in his
code yet again, big deal". What is perhaps surprising (well, I found it
surprising) is that if you do the first of these you only ever get the
first key in the hash - and then no more! If you do the second you'll
get all of the keys.
> Does anyone have any thoughts on why these two code snippets do
> different things, or have any reasons why they should? Or a ruling that
> they shouldn't so we can decide if it's a bug...
The significant differences between the two are that Iterator's
shift_pmc throws an exception if the iteration key is -1 while
shift_string doesn't bother to check (it probably should), and
shift_string calls VTABLE_get_string_keyed on the aggregate it contains,
while shift_pmc calls VTABLE_get_pmc_keyed. And the only significant
differences between Hash's get_string_keyed and get_pmc_keyed are to be
expected for returning a string instead of a PMC. So, I can't explain
why they're giving different results. I can say they shouldn't be giving
different results. And note that in PIR, you can do either:
iter = new Iterator, some_hash
iter_loop:
unless iter, iter_end
shift $S2, iter
say $S2
goto iter_loop
iter_end:
Or:
iter = new Iterator, myhash
iter_loop:
unless iter, iter_end
shift $P2, iter
$S2 = $P2
say $S2
goto iter_loop
iter_end:
And get the same result.
The whole load of hash/iterator code and the way they are so closely
tied looks kinda evil. If I'm reading it right, hash iteration ain't
threadsafe either (iterating a hash in two threads simultaneously will
run into issues); I'm not sure if that's a design goal, but anyways...
Yes, the code for the two seems to be excessively intertwined, and needs
a review. As long as all the state for the iteration is stored in the
iterator object, and the iterator is only doing ordinary keyed lookups
on the hash, it should be threadsafe. (If that's not the case, it's a bug.)
Allison