At 07:39 PM 4/19/2001 +0000, [EMAIL PROTECTED] wrote:
>Dan Sugalski <[EMAIL PROTECTED]> writes:
> >Okay, I've been pondering complex data structures, garbage collection, and
> >cache coherency for the past few days. (Between this, Unicode, the regex
> >engine, and backwards compatibility, I'll be easy to spot at TPC 5.0. Just
> >look for the tall guy wearing the wraparound canvas sweater...) Because of
> >that, I'm wondering whether it'd be in our best interests to have some sort
> >of split data structure for PMCs.
> >
> >We're going to have the advantage of *knowing* that all our PMCs will be
> >allocated out of arenas, which means we can safely partition the arenas
> >into pieces that correspond to pieces of the PMC. For an example, it would
> >mean we could do:
> >
> > struct arena {
> >     struct base_PMC base_PMC[4096];
> >     long PMC_GC_data[4096];
> > };
>
>Neat. (probe for page size?)
Maybe. Page sizes are small enough relative to the size of this that I'm
not sure it's worth it. Even the Alpha's 8K page size would net us only 512
PMCs if they weighed in at 16 bytes each.
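
For anyone who wants to check that arithmetic or try the probe, here is a minimal sketch. It assumes a POSIX system (sysconf with _SC_PAGESIZE); the helper names pmcs_per_page and probe_page_size are made up for illustration, and the 16-byte PMC size is just the figure quoted above.

```c
#include <stddef.h>
#include <unistd.h>   /* sysconf, _SC_PAGESIZE (POSIX) */

/* How many fixed-size PMCs fit in one page. Hypothetical helper. */
static size_t pmcs_per_page(size_t page_size, size_t pmc_size)
{
    return pmc_size ? page_size / pmc_size : 0;
}

/* Ask the OS for its page size; returns 0 if the query fails. */
static size_t probe_page_size(void)
{
    long n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (size_t)n : 0;
}
```

With an 8K page and 16-byte PMCs, pmcs_per_page(8192, 16) gives 512, far fewer than the 4096 entries in the arena sketched above.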
> >and know that arena.PMC_GC_data[12] corresponded to arena.base_PMC[12].
> >
> >This makes sense for pieces of a structure that are reasonably little used,
> >like the GC info. (Which is used only by the garbage collector and should,
> >I'd hope, be accessed significantly less than the rest of the PMC data)
> >
> >This works out well for the garbage collector, since it will be dealing
> >with arenas as arrays of PMCs. What I'm not sure of is whether this would
> >benefit us with other pieces of a PMC.
>
>Depends what they are. The scheme effectively makes the part "mandatory"
>as we will have allocated space whether used or not.
Well, we were talking about all PMCs having an int, float, and pointer
part, so it's not like we'd be adding anything. Segregating them out might
make things faster for those cases where we don't actually care about the
data. OTOH that might be a trivially small percentage of the times the
PMC's accessed, so...
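
To make the segregated layout concrete, here is a rough sketch of an arena that keeps the int, float, and pointer parts in parallel arrays. The names split_arena, int_part, float_part, ptr_part, and sum_int_parts are all hypothetical, and the field types are guesses, not a settled PMC layout.

```c
#include <stddef.h>

#define PMCS_PER_ARENA 4096

/* Hypothetical split layout: slot i of each array belongs to PMC i. */
struct split_arena {
    long   int_part[PMCS_PER_ARENA];
    double float_part[PMCS_PER_ARENA];
    void  *ptr_part[PMCS_PER_ARENA];
};

/* Walks only the int parts of the first n PMCs. The float and pointer
 * arrays are never read, so they are not pulled into the cache. */
static long sum_int_parts(const struct split_arena *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n && i < PMCS_PER_ARENA; i++)
        total += a->int_part[i];
    return total;
}
```

The flip side is that code touching all three parts of one PMC now hits three separate cache lines instead of one, which is the tradeoff in question.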
>So it depends if access pattern means that the part is seldom used,
>or used in a different way.
>As you say, it works well for GC of PMCs, and possibly also for compile-time
>or debug parts of ops, but it is not obviously useful otherwise.
That's what I was thinking, but my intuition's rather dodgy at this level.
The cache win might outweigh other losses.
> >I'm thinking that passing around an
> >arena address and offset and going in as a set of arrays is probably
> >suboptimal in general,
>
>You don't, you pass PMC * and have offset embedded within the PMC
>then arena base is (pmc - pmc->offset) iff you need it.
I was trying to avoid embedding the offset in the PMC itself. Since it was
calculable, it seemed a waste of space.

If we made sure the arenas were on some power-of-two boundary we could just
mask the low bits off the pointer for the base arena address. Evil, but
potentially worth it at this low a level.
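
A minimal sketch of the masking trick, under several assumptions: C11 aligned_alloc is available, the arena is aligned to 128K (chosen only because it is the next power of two above the struct size here), and the two-pointer base_PMC body is a stand-in. The helper names arena_new, arena_of, and pmc_gc_data are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PMCS_PER_ARENA 4096
#define ARENA_ALIGN    131072u   /* assumed: power of two >= sizeof(struct arena) */

struct base_PMC {                /* stand-in body, not the real PMC layout */
    void *vtable;
    void *data;
};

struct arena {
    struct base_PMC base_PMC[PMCS_PER_ARENA];
    long PMC_GC_data[PMCS_PER_ARENA];
};

/* Allocate one arena on an ARENA_ALIGN boundary. */
static struct arena *arena_new(void)
{
    assert(sizeof(struct arena) <= ARENA_ALIGN);
    struct arena *a = aligned_alloc(ARENA_ALIGN, ARENA_ALIGN);
    assert(a != NULL);
    assert(((uintptr_t)a & ((uintptr_t)ARENA_ALIGN - 1)) == 0);
    return a;
}

/* Recover the arena base by masking the low bits off the PMC pointer. */
static struct arena *arena_of(const struct base_PMC *pmc)
{
    return (struct arena *)((uintptr_t)pmc & ~((uintptr_t)ARENA_ALIGN - 1));
}

/* Find the GC word for a PMC without storing any offset in the PMC. */
static long *pmc_gc_data(const struct base_PMC *pmc)
{
    struct arena *a = arena_of(pmc);
    return &a->PMC_GC_data[pmc - a->base_PMC];
}
```

Given a pointer to arena->base_PMC[12], pmc_gc_data returns the address of arena->PMC_GC_data[12]. The cost is that every arena has to be allocated at the full alignment, so any slack between sizeof(struct arena) and ARENA_ALIGN is wasted.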
Dan
--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk