Re: Split PMCs

2001-04-19 Thread nick

Dan Sugalski <[EMAIL PROTECTED]> writes:
>Okay, I've been pondering complex data structures, garbage collection, and 
>cache coherency for the past few days. (Between this, Unicode, the regex 
>engine, and backwards compatibility, I'll be easy to spot at TPC 5.0. Just 
>look for the tall guy wearing the wraparound canvas sweater...) Because of 
>that, I'm wondering whether it'd be in our best interests to have some sort 
>of split data structure for PMCs.
>
>We're going to have the advantage of *knowing* that all our PMCs will be 
>allocated out of arenas, which means we can safely partition the arenas 
>into pieces that correspond to pieces of the PMC. For an example, it would 
>mean we could do:
>
>struct arena {
>   struct base_PMC base_PMC[4096];
>   long PMC_GC_data[4096];
>};

Neat. (probe for page size?)

>
>and know that arena.PMC_GC_data[12] corresponded to arena.base_PMC[12].
>
>This makes sense for pieces of a structure that are reasonably little used, 
>like the GC info. (Which is used only by the garbage collector and should, 
>I'd hope, be accessed significantly less than the rest of the PMC data)
>
>This works out well for the garbage collector, since it will be dealing 
>with arenas as arrays of PMCs. What I'm not sure of is whether this would 
>benefit us with other pieces of a PMC. 

Depends what they are. The scheme effectively makes the part "mandatory",
as we will have allocated the space whether it is used or not.
So it depends on whether the access pattern means that the part is seldom
used, or used in a different way.
As you say, it works well for GC of PMCs, and possibly also for compile-time
or debug parts of ops, but it is not obviously useful otherwise.


>I'm thinking that passing around an 
>arena address and offset and going in as a set of arrays is probably 
>suboptimal in general, 

You don't; you pass a PMC * and have the offset embedded within the PMC.
Then the arena base is (pmc - pmc->offset), iff you need it.
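
[A minimal C sketch of the split arena with the embedded-offset idea, for
illustration only. The fields of base_PMC other than offset, and the helper
names arena_new, arena_of and gc_data_of, are assumptions that do not appear
in the thread. It also assumes base_PMC is the first member of the arena, so
that slot 0 and the arena share an address.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_SLOTS 4096

/* Hypothetical PMC layout; only the offset field matters for this sketch. */
struct base_PMC {
    size_t offset;    /* index of this PMC within its arena */
    long   int_val;
    double num_val;
    void  *ptr_val;
};

/* Split arena: PMC bodies first, GC words in a parallel array. */
struct arena {
    struct base_PMC base_PMC[ARENA_SLOTS];
    long            PMC_GC_data[ARENA_SLOTS];
};

/* Allocate a zeroed arena and record each slot's index in the PMC. */
struct arena *arena_new(void)
{
    struct arena *a = calloc(1, sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < ARENA_SLOTS; i++)
        a->base_PMC[i].offset = i;
    return a;
}

/* Step back `offset` PMCs to slot 0, which is where the arena starts. */
struct arena *arena_of(struct base_PMC *pmc)
{
    struct base_PMC *first = pmc - pmc->offset;
    assert(first->offset == 0);
    return (struct arena *)first;
}

/* The GC word that corresponds to a given PMC. */
long *gc_data_of(struct base_PMC *pmc)
{
    return &arena_of(pmc)->PMC_GC_data[pmc->offset];
}
```

[With this layout only a PMC * is passed around; the arena is recovered on
demand, at the cost of one size_t per PMC.]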

>but I'm curious as to whether anyone has any hard 
>experience with this sort of thing before I delve in any deeper.
>
>   Dan
>
>--"it's like this"---
>Dan Sugalski  even samurai
>[EMAIL PROTECTED] have teddy bears and even
>  teddy bears get drunk
-- 
Nick Ing-Simmons




Re: Split PMCs

2001-04-19 Thread Uri Guttman

> "NIS" ==   <[EMAIL PROTECTED]> writes:

  NIS> Dan Sugalski <[EMAIL PROTECTED]> writes:
  >> 
  >> struct arena {
  >> struct base_PMC base_PMC[4096];
  >> long PMC_GC_data[4096];
  >> };

  NIS> Neat. (probe for page size?)

wouldn't that best be determined at configure/build time? and made into
a compile time constant?

uri

-- 
Uri Guttman  -  [EMAIL PROTECTED]  --  http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html



Re: Larry's Apocalypse 1 \}

2001-04-19 Thread David L. Nicol

Simon Cozens wrote:
> 
> On Thu, Apr 12, 2001 at 05:39:12PM -0400, Dan Sugalski wrote:
> > [We have FOO:BAR]
> > While this is reasonably true (and reasonably reasonable) it's not entirely
> > to the point. If we're going to provide a mechanism to define the syntax of
> > a mini-language (or a maxi one, I suppose, though there are probably better
> > ways to do it) then the details of colons and constants and what-have-yous
> > are pretty close to irrelevant.
> 
> No, I don't think so. The whole thing rests on the fact that class FOO knows
> how to parse string BAR. This, from the tokener's point of view, means that
> class FOO has to tell us when string BAR actually *ends*. 

No it doesn't.  There are well-defined rules for when string BAR actually
*ends* which are followed before FOO ever sees it.


> For complex BAR (and
> complex FOO) this could be, uh, complex. It means that our parser would have
> to call out to other routines - which can presumably be defined in Perl - to
> assist in parsing Perl code. And hey, if BAR can be defined in Perl, it can be
> defined on-the-fly. Oh dear.
> 
> Not impossible by any means, but *by no means* irrelevant.


No. 

Recursive parsing is not needed.  We have the HERE string, which can
 include anything in with the rest of the code, by looking for the
 end-token.  The perl5 Inline module works that way.

Perl5 can be parsed by treating everything as a token, whitespace, or
 a literal. Literals have to end the way they start, but it is not
 recursive: interpolation is applied to a quoted literal, it does not
 affect what is in and what is out of the literal.

To me the simplest way to proceed, with maximum flexibility, would be
 to offer two types of rewriting systems:

1:  your system operates from scratch on a string literal,
like Inline does now.  Any syntax is allowed, as long as
there is some indicator you can remember to escape when
it appears within your string.  This is how all 
8-bit-safe transfer protocols work, except for the ones
that know the length of their payloads at the beginning. 
Prefixing literals with character counts would be
nightmarish and I am NOT suggesting it.

2:  your system operates on tokenized (but not yet
interpreted) perl symbols. The only restriction is, your
curlies have to match.  So we introduce two new tokens,
the literal curlies -- \{ and \} -- which are equivalent
to \" within a string -- in case your special token would
like to accept pre-tokened (token, whitespace, literal)
code and agrees with perl's idea of how blocking and
quoting works.  To parse python using this system we'd
need to keep the details of whitespace around instead of
instantly dismissing it.  Or insist that language
extensions must maintain curlie balance.  It's really a 
very minor demand, esp. since there is method 1 
(inline-style operation on a quoted literal string ) to 
fall back on.
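
[A rough C sketch of the curly-balance rule in system 2, for illustration
only. The function name match_curly is an assumption, and the sketch scans
raw characters rather than a real token stream: it ignores quoted literals
and simply treats a backslash as escaping the next character, so \{ and \}
do not count toward the balance.]

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given src[open] == '{', return the index of the matching '}',
 * or -1 if the curlies never balance. A backslash escapes the
 * character after it, so \{ and \} are treated as literals.
 */
long match_curly(const char *src, size_t open)
{
    assert(src[open] == '{');
    long depth = 0;
    for (size_t i = open; src[i] != '\0'; i++) {
        if (src[i] == '\\') {
            if (src[i + 1] == '\0')
                break;      /* trailing backslash: nothing left to escape */
            i++;            /* skip the escaped character */
        } else if (src[i] == '{') {
            depth++;
        } else if (src[i] == '}') {
            if (--depth == 0)
                return (long)i;
        }
    }
    return -1;
}
```

[A language extension handed the text between the two indices would see
nested blocks intact, which is all that system 2 asks of it.]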




-- 
  David Nicol 816.235.1187 [EMAIL PROTECTED]
Home of the V-90 modern




Re: Split PMCs

2001-04-19 Thread Dan Sugalski

At 07:39 PM 4/19/2001 +, [EMAIL PROTECTED] wrote:
>Dan Sugalski <[EMAIL PROTECTED]> writes:
> >Okay, I've been pondering complex data structures, garbage collection, and
> >cache coherency for the past few days. (Between this, Unicode, the regex
> >engine, and backwards compatibility, I'll be easy to spot at TPC 5.0. Just
> >look for the tall guy wearing the wraparound canvas sweater...) Because of
> >that, I'm wondering whether it'd be in our best interests to have some sort
> >of split data structure for PMCs.
> >
> >We're going to have the advantage of *knowing* that all our PMCs will be
> >allocated out of arenas, which means we can safely partition the arenas
> >into pieces that correspond to pieces of the PMC. For an example, it would
> >mean we could do:
> >
> >struct arena {
> >   struct base_PMC base_PMC[4096];
> >   long PMC_GC_data[4096];
> >};
>
>Neat. (probe for page size?)

Maybe. Page sizes are small enough relative to the size of this that I'm 
not sure it's worth it. Even the Alpha's 8K page size would net us only 512 
PMCs if they weighed in at 16 bytes each.
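
[A small C sketch of the arithmetic above, plus one way to probe the page
size at run time. The function names are assumptions; sysconf with
_SC_PAGESIZE is the POSIX call, so this presumes a POSIX system.]

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* How many whole PMCs of pmc_size bytes fit in one page. */
size_t pmcs_per_page(size_t page_size, size_t pmc_size)
{
    assert(pmc_size > 0);
    return page_size / pmc_size;
}

/* Run-time probe; returns 0 if the system will not report a page size. */
size_t probe_page_size(void)
{
    long n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (size_t)n : 0;
}
```

[For the figures quoted, pmcs_per_page(8192, 16) gives 512, well under the
4096 slots in the example arena.]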

> >and know that arena.PMC_GC_data[12] corresponded to arena.base_PMC[12].
> >
> >This makes sense for pieces of a structure that are reasonably little used,
> >like the GC info. (Which is used only by the garbage collector and should,
> >I'd hope, be accessed significantly less than the rest of the PMC data)
> >
> >This works out well for the garbage collector, since it will be dealing
> >with arenas as arrays of PMCs. What I'm not sure of is whether this would
> >benefit us with other pieces of a PMC.
>
>Depends what they are. The scheme effectively makes the part "mandatory"
>as we will have allocated space whether used or not.

Well, we were talking about all PMCs having an int, float, and pointer 
part, so it's not like we'd be adding anything. Segregating them out might 
make things faster for those cases where we don't actually care about the 
data. OTOH that might be a trivially small percentage of the times the 
PMC's accessed, so...

>So it depends if access pattern means that the part is seldom used,
>or used in a different way.
>As you say works well for GC of PMCs - and also possibly for compile-time
>or debug parts of ops but is not obviously useful otherwise.

That's what I was thinking, but my intuition's rather dodgy at this level. 
The cache win might outweigh other losses.

> >I'm thinking that passing around an
> >arena address and offset and going in as a set of arrays is probably
> >suboptimal in general,
>
>You don't, you pass PMC * and have offset embedded within the PMC
>then arena base is (pmc - pmc->offset) iff you need it.

I was trying to avoid embedding the offset in the PMC itself. Since it was 
calculable, it seemed a waste of space.

If we made sure the arenas were on some power-of-two boundary we could just 
mask the low bits off the pointer for the base arena address. Evil, but 
potentially worth it at this low a level.
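
[A minimal C sketch of the masking trick, for illustration only. The 64 KiB
alignment, the 1024-slot arena, the PMC fields and the helper names are all
assumptions. It uses C11 aligned_alloc, and it relies on the whole arena
fitting inside one aligned block, which the static assertion checks.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_ALIGN ((size_t)1 << 16)   /* assumed power-of-two boundary */
#define ARENA_SLOTS 1024

/* Hypothetical PMC layout: int, float and pointer parts, no offset field. */
struct base_PMC { long int_val; double num_val; void *ptr_val; };

struct arena {
    struct base_PMC base_PMC[ARENA_SLOTS];
    long            PMC_GC_data[ARENA_SLOTS];
};

/* The mask only finds the base if an arena never spans two aligned blocks. */
_Static_assert(sizeof(struct arena) <= ARENA_ALIGN,
               "arena must fit in one aligned block");

struct arena *arena_new(void)
{
    /* aligned_alloc requires the size to be a multiple of the alignment. */
    return aligned_alloc(ARENA_ALIGN, ARENA_ALIGN);
}

/* Mask the low bits off the PMC pointer to get the arena base. */
struct arena *arena_of(const struct base_PMC *pmc)
{
    uintptr_t addr = (uintptr_t)pmc;
    return (struct arena *)(addr & ~(uintptr_t)(ARENA_ALIGN - 1));
}

/* The slot index falls out of pointer subtraction; nothing is stored. */
long *gc_data_of(const struct base_PMC *pmc)
{
    struct arena *a = arena_of(pmc);
    ptrdiff_t idx = pmc - a->base_PMC;
    assert(idx >= 0 && idx < ARENA_SLOTS);
    return &a->PMC_GC_data[idx];
}
```

[The trade is one AND per lookup against a word saved in every PMC, and the
arena size becomes tied to the chosen boundary.]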


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Split PMCs

2001-04-19 Thread Dan Sugalski

At 04:06 PM 4/19/2001 -0400, Uri Guttman wrote:
> > "NIS" ==   <[EMAIL PROTECTED]> writes:
>
>   NIS> Dan Sugalski <[EMAIL PROTECTED]> writes:
>   >>
>   >> struct arena {
>   >> struct base_PMC base_PMC[4096];
>   >> long PMC_GC_data[4096];
>   >> };
>
>   NIS> Neat. (probe for page size?)
>
>wouldn't that best be determined at configure/build time? and made into
>a compile time constant?

I'm thinking of making it dynamic, but I'm not sure yet. On the one hand 
I'm thinking a fixed size would give the C compiler more to work with. On 
the other, if we know we're allocating a million scalars at some spot, it 
seems silly to not stick them in a single arena.
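
[A rough C sketch of the dynamic option, for illustration only. The field
and function names are assumptions. The slot count is chosen at run time, so
the PMC array becomes a C99 flexible array member and the GC words live in a
separately allocated parallel array.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical PMC layout: int, float and pointer parts. */
struct base_PMC { long int_val; double num_val; void *ptr_val; };

struct arena {
    size_t           count;        /* slots in this arena, set at run time */
    long            *PMC_GC_data;  /* parallel array of count GC words */
    struct base_PMC  base_PMC[];   /* flexible array member, count PMCs */
};

/* Allocate one arena big enough for `count` PMCs, or NULL on failure. */
struct arena *arena_new(size_t count)
{
    if (count == 0 ||
        count > (SIZE_MAX - sizeof(struct arena)) / sizeof(struct base_PMC))
        return NULL;

    struct arena *a = calloc(1, sizeof *a + count * sizeof(struct base_PMC));
    if (a == NULL)
        return NULL;

    a->PMC_GC_data = calloc(count, sizeof *a->PMC_GC_data);
    if (a->PMC_GC_data == NULL) {
        free(a);
        return NULL;
    }
    a->count = count;
    return a;
}

void arena_free(struct arena *a)
{
    if (a != NULL) {
        free(a->PMC_GC_data);
        free(a);
    }
}
```

[Here arena_new(1000000) puts a million scalars in one arena, but the
compiler no longer sees a constant array bound to work with.]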

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




debugging PDD: request for suggestions

2001-04-19 Thread Dave Storrs

Hey folks,

Ok, so I've picked up Dan's gauntlet on the debugging PDD.  I've
got a pretty long list of things to put in it already, but if anyone would
care to send me their suggestions, I can get them integrated into the
initial draft.  I probably won't have it ready by tomorrow, and I only
have Internet from work (I'm in the process of moving), so it will
probably be coming out on Monday.

Feel free to send suggestions direct to me so as not to clog the
list (or send them to the list, so as to spark other people's imagination;
I'm relaxed).

Dave





RE: debugging PDD: request for suggestions

2001-04-19 Thread Soyoung Park

Good Practice Dave,

Ahem! (clears throat)
I, rather intimidated, volunteered for the "Unicode string handling" PDD.
:)

Please feel free to send suggestions and directions!

Thanks in advance,

Soyoung

-----Original Message-----
From: Dave Storrs [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 19, 2001 3:37 PM
To: Perl 6 Internals list
Subject: debugging PDD: request for suggestions


Hey folks,

Ok, so I've picked up Dan's gauntlet on the debugging PDD.  I've
got a pretty long list of things to put in it already, but if anyone would
care to send me their suggestions, I can get them integrated into the
initial draft.  I probably won't have it ready by tomorrow, and I only
have Internet from work (I'm in the process of moving), so it will
probably be coming out on Monday.

Feel free to send suggestions direct to me so as not to clog the
list (or send them to the list, so as to spark other people's imagination;
I'm relaxed).

Dave