Jonathan Worthington wrote:
Hi,

Earlier tonight I tried to iterate a hash. I have somewhat thoughtlessly been doing things like:

PMC *iter = VTABLE_get_iter(interp, some_hash);
while (VTABLE_get_bool(iter)) {
   PMC *key_pmc = VTABLE_shift_pmc(interp, iter);
   STRING *key = VTABLE_get_string(interp, key_pmc);
   ...
}

Rather than:

PMC *iter = VTABLE_get_iter(interp, some_hash);
while (VTABLE_get_bool(iter)) {
   STRING *key = VTABLE_shift_string(interp, iter);
   ...
}

So now you're thinking "so what, Jonathan did something stupid in his code yet again, big deal". What is perhaps surprising (well, I found it surprising) is that if you do the first of these you only ever get the first key in the hash - and then no more! If you do the second you'll get all of the keys.

> Does anyone have any thoughts on why these two code snippets do
> different things, or have any reasons why they should? Or a ruling that
> they shouldn't so we can decide if it's a bug...

The significant differences between the two are that Iterator's shift_pmc throws an exception if the iteration key is -1 while shift_string doesn't bother to check (it probably should), and shift_string calls VTABLE_get_string_keyed on the aggregate it contains, while shift_pmc calls VTABLE_get_pmc_keyed. And the only significant differences between Hash's get_string_keyed and get_pmc_keyed are to be expected for returning a string instead of a PMC. So, I can't explain why they're giving different results. I can say they shouldn't be giving different results. And note that in PIR, you can do either:

    iter = new Iterator, some_hash
  iter_loop:
    unless iter, iter_end
      shift $S2, iter
      say $S2
      goto iter_loop
  iter_end:

Or:

    iter = new Iterator, myhash
  iter_loop:
    unless iter, iter_end
      shift $P2, iter
      $S2 = $P2
      say $S2
      goto iter_loop
  iter_end:

And get the same result.

The whole load of hash/iterator code and the way they are so closely tied looks kinda evil. If I'm reading it right, hash iteration ain't threadsafe either (iterating a hash in two threads simultaneously will run into issues); I'm not sure if that's a design goal, but anyways...

Yes, the code for the two seems to be excessively intertwined, and needs a review. As long as all the state for the iteration is stored in the iterator object, and the iterator is only doing ordinary keyed lookups on the hash, it should be threadsafe. (If that's not the case, it's a bug.)

Allison

Reply via email to