On Sep-08, Leopold Toetsch wrote: > Nicholas Clark <[EMAIL PROTECTED]> wrote: > > On Fri, Sep 05, 2003 at 09:34:04AM -0400, Dan Sugalski wrote: > >> On Fri, 5 Sep 2003, Peter Haworth wrote: > > >> > With the seen hash approach, I wouldn't expect the hash itself to take > >> > nearly as much space as the frozen structure; > >> > >> My experience tracing variable graphs in Perl 5 indicates otherwise. > > > I think it may be worth listening to Dan on this one. > > I don't know why Storable.xs is using a general HV for mapping an > address to an integer (more isn't needed to serialize all IMHO - Dan's > next_for_GC approach doesn't provide more info). Perl5 does convert the > address to a string and uses all weirdness of hv.c code. This isn't > necessary for this special kind of problem. > > Parrot has some DOD counters for possibly active PMCs. A simple and > probably fix-sized[1] hash based on such a count with just a mapping of > {&pmc => id} takes 12 bytes per entry on 32-bit architecures and its > really fast.
It's late so I'm probably being an idiot, but is all that is needed (1) a unique id for every pmc address, and (2) whether or not the PMC has been visited yet? If so, why not use the address itself for the id (or some variant, like index of the pmc in its arena, added to/or-ed with some base id for the arena itself)? And for the flag, have arenas preallocate a bit vector of seen flags for all their PMCs? (Or if you have the arena base ids, then you can have a single bit vector for all arenas.) Or is there no simple way of going from a PMC address to its arena? Perhaps it's no win to pollute the cache with chunks of the bit vector instead of the actual referenced PMC flagpoles. But if that's an issue then I can't help but think that most of the time the seen bit vector would be quite sparse, so we could go to a multi-level scheme: use a conservative vector that can answer definitely not seen vs maybe seen, and then look at the actual PMC flagpole (or, again, a separate bit vector if it somehow helps multithreading) for the true answer. An eighth of a bit per entry ought to be small enough, no? I suppose I ought to pay more attention to the conversation (and the current code) before butting in with suggestions. I've a feeling I'm solving the wrong problem.