Re: Pipeline Performance
At Tue, 31 Aug 2004 13:23:04 -0400,
[EMAIL PROTECTED] (Aaron Sherman) wrote:
> I would think you actually want to be able to define grep, map, et al.
> in terms of the mechanism for unraveling, and just let the optimizer
> collapse the entire pipeline down to a single map.

Even for map and grep this is a bit trickier, since map can produce
zero or more values for each input value, and calls its body in list
context, whereas grep produces zero or one value, and gets called in
scalar context.  So you'd need something like a full call and return
prototype for each mapping function, e.g.:

    Function              Return context       Argument context
    --------              --------------       ----------------
    -> $a { $a + 2 }      ($y)                 ($x)
    grep(&block)          ($y is optional)     ($x)
    map(&block)           ([EMAIL PROTECTED])  ($arg)

Then your loop merging macro could deconstruct these into the
appropriate kind of loop (using foreach and pushing single items only
to make intention clear):

    @a ==> map &b ==> @c
  ==>
    foreach $a (@a) { foreach $b (map_item(&b, $a)) { push @c, $b } }

    @a ==> (&b = -> $a { $a + 2 }) ==> @c
  ==>
    foreach $a (@a) { push @c, b($a) }

    @a ==> grep \&b ==> @c
  ==>
    foreach $a (@a) { foreach $b (grep_item(&b, $a)) { push @c, $b } }

where "map_item" and "grep_item" are the single-element mapper
functions defining map and grep.  I think that both the context and
the number of items consumed/produced could be gathered from
prototypes, so the only restrictions for mapping functions would be
(1) having a prototype available at definition time, and (2) being
side-effect-free.

/s
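
The loop merging described above can be sketched in Python. This is only an illustration under stated assumptions, not how any Perl 6 optimizer works: the "map"/"grep" tags stand in for the prototype information, `map_item` and `grep_item` follow the names used in the message, and `run_fused` and `ITEM_FUNCS` are hypothetical names. It also assumes every stage function is side-effect-free, as the message requires.

```python
# Each stage is a (kind, function) pair. The kind tag ("map" or "grep") is a
# stand-in for the call/return prototype the optimizer would inspect.

def map_item(f, x):
    """Single-element mapper for map: f may return zero or more values."""
    return list(f(x))

def grep_item(pred, x):
    """Single-element mapper for grep: passes x through zero or one time."""
    return [x] if pred(x) else []

ITEM_FUNCS = {"map": map_item, "grep": grep_item}

def run_fused(source, stages):
    """Run every stage in a single pass over source.

    No full-length intermediate list is built between stages; only the
    small per-item lists produced by the single-element mappers exist.
    """
    out = []
    for item in source:
        pending = [item]
        for kind, f in stages:
            item_func = ITEM_FUNCS[kind]
            pending = [y for x in pending for y in item_func(f, x)]
            if not pending:
                break  # this input produced nothing; skip remaining stages
        out.extend(pending)
    return out

stages = [
    ("map", lambda x: [x, x + 10]),    # one input, two outputs
    ("grep", lambda x: x % 2 == 0),    # keep even values only
]
result = run_fused([1, 2, 3], stages)
```

With these stages the result is the same as running map over the whole list and then grep over the whole result, which is the property a merging optimizer would have to preserve.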
Re: Compile op with return values
Steve Fink <[EMAIL PROTECTED]> wrote:
> ... Leo's @ANON implementation of
> your scheme works great for me (I have no problem wrapping that around
> my code.) All this does raise the question of garbage collection for
> packfile objects; is there any?

Not yet. We basically have two kinds of dynamically compiled code:

1) loaded modules - persistent code used until end of program
2) evaled "statements" - volatile code, maybe used once only

But the current implementation doesn't know about that difference. The
compiled code is always appended to the list of code segments. There
is no interface yet to manipulate packfile segments.

Eventually we need a packfile PMC that is the owner of packfile
segments. If that PMC goes out of scope the compiled code structures
can be freed.

This packfile PMC would also remove most of the difference between 1)
and 2), all the more once there is some interface to append the newly
compiled code to existing code segments, so that you can e.g. dump the
combined code to disc.

But it would still be useful to differentiate between 1) and 2). For
1) we could do global constant folding (if a constant already exists
in the main constant table just use it, or, if not, append to the main
constant table). For 2) a distinct constant table is needed.

leo
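
The constant-table distinction at the end of this message can be sketched as follows. This is a toy model in Python, not Parrot's packfile code: `ConstantTable`, `intern` and `add_segment` are hypothetical names, and constants are assumed to be hashable values.

```python
class ConstantTable:
    """Toy constant table: a list of values plus an index for lookups."""

    def __init__(self):
        self.constants = []
        self._index = {}

    def intern(self, value):
        """Return the slot of value, appending it only if it is not present."""
        key = (type(value), value)  # keep 1, 1.0 and True apart
        if key not in self._index:
            self._index[key] = len(self.constants)
            self.constants.append(value)
        return self._index[key]


def add_segment(main_table, segment_constants, persistent):
    """Place the constants of a newly compiled segment.

    persistent=True models case 1) (loaded module): fold into the main table.
    persistent=False models case 2) (evaled code): use a distinct table that
    can be dropped together with the segment.
    """
    table = main_table if persistent else ConstantTable()
    return table, [table.intern(c) for c in segment_constants]


main = ConstantTable()
main.intern("hello")
main.intern(42)
module_table, module_slots = add_segment(main, ["hello", 3.14], persistent=True)
eval_table, eval_slots = add_segment(main, ["hello"], persistent=False)
```

Here the module reuses slot 0 for "hello" and appends 3.14 to the main table, while the evaled code gets its own table and leaves the main one untouched.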
Cross Compiling parrot?
Hi,

Did anybody try to crosscompile parrot? It doesn't seem to work. I
tried it with parrot_2004-08-26_23 by setting

    --cc=arm-softfloat-linux-gnu-gcc
    --ld=arm-softfloat-linux-gnu-gcc

on configure, but that fails with:

--8<--
[EMAIL PROTECTED]:~/tmp/parrot> perl Configure.pl --cc=arm-softfloat-linux-gnu-gcc --ld=arm-softfloat-linux-gnu-gcc
Parrot Version 0.1.0 Configure 2.0
Copyright (C) 2001-2003 The Perl Foundation.  All Rights Reserved.

Hello, I'm Configure.  My job is to poke and prod your system to figure out
how to build Parrot.  The process is completely automated, unless you passed in
the `--ask' flag on the command line, in which case it'll prompt you for a few
pieces of info.

Since you're running this script, you obviously have Perl 5--I'll be pulling
some defaults from its configuration.

Checking MANIFEST...done.
Setting up Configure's data structures...done.
Tweaking settings for miniparrot...done.
Loading platform and local hints files...done.
Enabling optimization...done.
Determining nongenerated header files...done.
Determining what C compiler and linker to use...done.
Determining what types Parrot should use...done.
Determining what opcode files should be compiled in...done.
Setting up experimental systems...done.
Determining what pmc files should be compiled in...done.
Determining your minimum pointer alignment...
C compiler failed (see test.cco) at lib/Parrot/Configure/Step.pm line 332
        Parrot::Configure::Step::cc_build() called at config/auto/alignptrs.pl line 37
        Configure::Step::runstep('undef', 'undef') called at lib/Parrot/Configure/RunSteps.pm line 110
        Parrot::Configure::RunSteps::runsteps('Parrot::Configure::RunSteps', 'cc', 'arm-softfloat-linux-gnu-gcc', 'ld', 'arm-softfloat-linux-gnu-gcc', 'debugging', 1) called at Configure.pl line 376
[EMAIL PROTECTED]:~/tmp/parrot>
--8<--

Robert
--
Dipl.-Ing. Robert Schwebel | http://www.pengutronix.de
Pengutronix - Linux Solutions for Science and Industry
Handelsregister: Amtsgericht Hildesheim, HRA 2686
Hornemannstraße 12, 31137 Hildesheim, Germany
Phone: +49-5121-28619-0 | Fax: +49-5121-28619-4
Re: NCI test 2 failing - but I know why
--- Bernhard Schmalhofer <[EMAIL PROTECTED]> wrote:
> Printing a initialised ParrotLibrary currently gives you the
> '_filename' property. This is highly platform dependent, and
> therefore hard to test.
>
> I could rewrite the test and check only, that the stringified
> ParrotLibrary contains the substring 'nci'. My guess is, that this
> should work on all platforms so far.
>
> CU, Bernhard

Well, I seem to be the only one noticing it, and the good thing is
that it is the test itself, and not what is being tested, that is
b0rk. I would think portability is a good thing, but don't go changing
things on my account yet. When I get the time, I will investigate.

Cheers
Joshua Gatcomb
a.k.a. Limbic~Region
Re: Cross Compiling parrot?
At 7:32 PM +0200 9/1/04, Robert Schwebel wrote:
> Hi,
>
> Did anybody try to crosscompile parrot? It doesn't seem to work.

That doesn't surprise me. We still pull information out of the local
perl install (which'll be wrong, of course, in a cross-compilation
environment) and I'm pretty sure we don't pass in the right flags in
the right places to cross-compile properly.

Part of the problem with this is that we just don't have people with a
need or experience doing cross-compilation, though I'd be thrilled if
we can find someone. (Any and all patches to make things cross-compile
friendly, or even less cross-compile unfriendly, will be greatly
appreciated)
--
Dan

--it's like this---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
Re: Pipeline Performance
On Tue, 2004-08-31 at 14:11, Sean O'Rourke wrote:
> At Tue, 31 Aug 2004 13:23:04 -0400,
> [EMAIL PROTECTED] (Aaron Sherman) wrote:
> > I would think you actually want to be able to define grep, map, et al.
> > in terms of the mechanism for unraveling, and just let the optimizer
> > collapse the entire pipeline down to a single map.
>
> Even for map and grep this is a bit trickier, since map can produce
> zero or more values for each input value, and calls its body in list
> context, whereas grep produces zero or one value, and gets called in
> scalar context.

You're confusing two stages of grep's operation. grep's body is called
in boolean context, but that has NOTHING to do with the return value,
which is a list of zero or one elements like so:

    sub grep (&code, [EMAIL PROTECTED]) {
        map { if code($_) { $_ } else { () } } @list;
    }

That said, I skipped over a MAJOR point in my original reply, and it
must be said that in the case of map, you need iterators in order to
do the job correctly. map actually needs to return an object which,
while it can be treated as a list, is actually just an iterator object
which remembers the input list and transform closure.

Once you do that, you get cheap pipelining, as the major cost isn't
the construction of the temporary lists in Perl 5, it's POPULATING the
temporary lists. Constructing an iterator in Perl 6 means no
population.

--
781-324-3772  [EMAIL PROTECTED]  http://www.ajs.com/~ajs
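
Both points in this message, grep defined in terms of map and map returning an iterator rather than a populated list, can be sketched with Python generators. This is an analogy only, not Perl 6 semantics; `flat_map`, `grep` and `naturals` are hypothetical names.

```python
from itertools import islice

def flat_map(f, items):
    """Lazy map in the Perl sense: f returns zero or more values per input.

    Nothing is computed until the caller asks for the next value, so no
    temporary list is ever populated.
    """
    for x in items:
        yield from f(x)

def grep(pred, items):
    """grep defined via map: each input yields a list of zero or one elements."""
    return flat_map(lambda x: [x] if pred(x) else [], items)

def naturals():
    """An endless source, to show that the pipeline never materializes it."""
    n = 0
    while True:
        yield n
        n += 1

# Squares of the odd numbers, built over an infinite input.
pipeline = flat_map(lambda x: [x * x], grep(lambda x: x % 2 == 1, naturals()))
first_three = list(islice(pipeline, 3))
```

Because each stage only remembers its input iterator and its closure, the pipeline can sit on top of an unbounded source and still hand back the first few results.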
Re: Last bits of the basic math semantics
Hi,

> fixed sizes of integer, so I'd aim some ops at low-level types of
> known size and leave it at that.

Quite a while back, I did add a few opcodes for fixed size integer
operations for Parrot ... But they were added for a totally different
HLL :)

> matter what you do with the high bits. I suppose another way to
> look at it is that they'll just want ops that'll JIT well, which
> usually means to make the ops work on the natural datatype sizes
> of the machine. But that fights against the fact that most crypto
> algorithms do their commutations based on a known number of bits.

Maybe the dotgnu.ops needs to be renamed as fixedsize.ops and add a
few more fixed size int operations? int, uint, long and ulong should
suffice for most crypto folks, but those ops are not JIT'd AFAIK.

Gopal
A question about attribute functions
How do you declare attribute functions? Specifically, I was thinking
about map and what kind of object it would return, and I stumbled on a
confusing point:

    class mapper does iterator {
        has &.transform;
        ...
    }

Ok, that's fine, but what kind of accessor does it get?

    my mapper $x .= new(transform => ->{(1,2,3)});
    $x.transform()

would imply that you're calling method transform, not invoking the
accessor function which does not have a method signature. Would you
have to do this:

    class mapper does iterator {
        has Code $transform;
        ...
    }
    ...
    $x.transform.();

?

--
781-324-3772  [EMAIL PROTECTED]  http://www.ajs.com/~ajs
Re: perl6 garbage collector?
On Mon, 2004-08-30 at 14:40, Ozgun Erdogan wrote:
> > > Currently, we're using perl-5.6.1 and are having problems with memory
> > > leaks - thanks to reference counting.
> >
> > You'll have to break reference loops explicitely.
>
> If only I had known where those circular references are. I have a
> circular ref. detector tool, but it still doesn't get them. The thing
> is, you could do an SvREFCNT_inc, and boom you have a memory leak.

Ok, you're no longer talking about Perl (the language) but rather
about Perl 5's internals. Different beast.

This is not the right list for debugging that kind of thing, so I
won't go into it, but suffice it to say that if you have trouble
managing your references through XS, incorporating Parrot's GC into
Perl 5 would be near impossible. That's not intended as a slight,
believe me, I put myself in the same category (reference counting in
Perl 5 is very difficult to grok from the docs, as the docs make some
assumptions about how much you know about how Perl constructs scopes).

All that aside, Ponie is your friend. As Ponie matures, it will
provide what you need, and your XS could be transitioned over into
Parrot bytecode.

For now, if I were you I would upgrade to 5.8.x and try to make sure
that every value that you move between your XS and Perl is properly
mortal (see the perlapi, perlguts and perlxs man pages).

--
781-324-3772  [EMAIL PROTECTED]  http://www.ajs.com/~ajs
Re: NCI test 2 failing - but I know why
On Tue, 31 Aug 2004 05:56:27 -0700 (PDT), Joshua Gatcomb
<[EMAIL PROTECTED]> wrote:
> Obviously the test is passing, but the expected result
> is different:
> loaded runtime/parrot/dynext/libnci.so
> vs
> loaded libnci.so

I'm getting the same thing on Solaris 8 using GCC 3.4.1 with solaris
binutils:

t/pmc/nci..NOK 2
# Failed test (t/pmc/nci.t at line 59)
#          got: 'loaded libnci.so
# 8.00
# '
#     expected: 'loaded runtime/parrot/dynext/libnci.so
# 8.00
# '
t/pmc/nci..ok 35/35
# Looks like you failed 1 tests of 35.
t/pmc/nci..dubious
        Test returned status 1 (wstat 256, 0x100)
DIED. FAILED test 2
        Failed 1/35 tests, 97.14% okay
Re: Library loading
On Sat, 2004-08-28 at 16:17, Dan Sugalski wrote:
> Time to finish this one and ensconce the API into the embedding interface.

That reminds me, I was reading P6&PE yesterday, and I came across a
scary bit on loading of shared libraries. The statement was made that
Parrot would search the current directory first. Perhaps this was an
over-simplification, but if not, PLEASE, re-consider.

Security implications aside (and they're huge), Parrot should probably
be searching its installation area (possibly overridden by an
environment variable) followed by whatever system path (e.g.
LD_LIBRARY_PATH, ldconfig or whatever your OS uses) is given to Parrot
externally, so as not to modify the behavior of a program based on the
current directory of the user running it.

--
781-324-3772  [EMAIL PROTECTED]  http://www.ajs.com/~ajs
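
The search order suggested here can be sketched in Python. Everything below is an assumption for illustration and not Parrot's loader: the override variable name `PARROT_LIBRARY_PATH` is hypothetical, only `LD_LIBRARY_PATH` is consulted for the system path, and dropping relative entries is one possible way to keep the current directory out.

```python
import os

def library_search_path(install_dir, environ, override_var="PARROT_LIBRARY_PATH"):
    """Return the ordered list of directories to search for a library.

    Order: the installation area (or the override from the environment),
    then the externally supplied system path. The current directory is
    never added implicitly.
    """
    override = environ.get(override_var, "")
    first = [d for d in override.split(os.pathsep) if d] or [install_dir]
    system = [d for d in environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    # Relative entries such as "." would make the result depend on where the
    # user happens to be standing, so they are dropped in this sketch.
    return [d for d in first + system if os.path.isabs(d)]

def find_library(name, search_path):
    """Return the first existing file called name on search_path, else None."""
    for directory in search_path:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None

env = {"LD_LIBRARY_PATH": os.pathsep.join(["/usr/lib", ".", "/opt/lib"])}
default_path = library_search_path("/usr/local/parrot/lib", env)
```

The point of the design is that the same program resolves the same library no matter which directory it is started from.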
Re: Library loading
At 11:00 AM -0400 9/1/04, Aaron Sherman wrote:
> On Sat, 2004-08-28 at 16:17, Dan Sugalski wrote:
> > Time to finish this one and ensconce the API into the embedding
> > interface.
>
> That reminds me, I was reading P6&PE yesterday, and I came across a
> scary bit on loading of shared libraries. The statement was made that
> Parrot would search the current directory first.

It does? Urk. No, not by default. We need to work out some library
loading stuff, but this is *definitely* not going to be the default.
--
Dan

--it's like this---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
Re: Synopsis 2 draft 1
On Sat, 14 Aug 2004, Smylers wrote:
> > could reparse the result. XXX .repr is what Python calls it, I think.
> > Is there a better name?
>
> Yes; I've no suggestions as to what it might be, but surely there's
> _got_ to be a better name than C<.repr>.

.repr is fine for me. An alternative that springs to mind could be
.dump

> > XXX We could yet replace <$foo> with $foo.more or $foo.iter or
> > $foo.shift or some such (but not $foo.next or $foo.readline),
>
> That sounds good to me -- C<< while (<$file>) >> is one of the
> least-intuitive bits of syntax to get across to people learning Perl;

Gawd, no! It's so deeply perlish... could hardly do without it! Of
course the above mentioned alternatives would be ok too, but only *as
alternatives*...

Michele
--
Have you noticed that people whose parents did not have children,
also tend not to have children.
- Robert J. Kolker in sci.math, "Re: Genetics and Math-Ability"
Proposal for a new PMC layout and more
Below is a pod document describing some IMHO worthwhile changes. I
hope I didn't miss some issues that could inhibit the implementation.

Comments welcome,
leo

=head1 1. Proposal for a new PMC layout and more

=head2 1.1. Current state - PMC size and structure

PMCs are using too much memory (5 words for an Integer PMC, 11 + n
words plus two indirections for an object with n attributes). The
reduction of IIRC 9 words to the current 5 words almost doubled
execution speed for not too small amounts of allocated PMCs.

OTOH PMCs are rather rigid structures. There is only a fixed amount of
data fields available. If the data the PMC should hold won't fit,
custom malloced extensions (or Buffers) have to be added to the PMC,
which take up more memory and (at least) one more indirection to get
at these data.

=head2 1.2. Current state - STRING buffer headers

STRING buffer headers are kept in separate registers, have their own
opcodes and PMCs have vtable variants dealing with STRINGs. This is
using up a lot of memory and resources. There are currently 480
opcodes dealing with STRINGs and 46 vtable/MMD methods that have
STRING* arguments. But a lot of PMC opcodes and vtables dealing with
strings are still missing.

But a STRING itself is already a rather fat structure (and it'll still
need charset and encoding or such). Thus there isn't really an
advantage to have a distinct STRING type, the more that all HLLs
except Perl6 don't have a notion for such a type - they'll just have
objects aka PMCs.

Further STRINGs will need a vtable to deal with unicode or rather with
different access levels (e.g. length in bytes, codepoints, or chars).

And finally: we currently have STRING* hash keys only, a scheme that
doesn't work for hashing arbitrary objects. In Python a hash key is
just an object that provides a I vtable method.

=head2 1.3. Current state - STRING memory

In a multi-threaded parrot string memory could move beyond the
interpreter at any time due to the copying collection of variable
sized memory. E.g.:

  Thread 1                              Thread 2

  cursor = s->strstart
                                        collect string memory
                                        s->strstart moved
  end = s->strstart + s->bufused
  while (cursor < end)  // boom

Such code snippets are used all over the place in F. This would either
need a lock for reading too or, better, a non-copying collection of
string memory.

The same problems currently arise with hashes and list-based arrays,
which both are using buffer headers and movable memory.

=head1 2. Proposed changes

=over 4

=item PMCs are variable sized.

We just allocate as much as needed to accommodate the object. This
reduces memory usage vastly for small types, simplifies more complex
PMCs and eliminates the additional indirections to access the object's
data.

=item STRINGs become PMCs

This will reduce opcode count by almost one third and simplify the
whole interpreter.

=item Buffers become PMCs

The final unification of Buffers and PMCs.

=back

=head2 2.1 The new PMC layout

A simple PMC consists of a vtable pointer and a data portion.

=head2 2.2 Example: Integer, Float, Object with 2 attributes, Ref

        +--------------+         +--------------+
  pmc-> | vtable       |   pmc-> | vtable       |
        +--------------+         +--------------+
        | INTVAL       |         | FLOATVAL     |
        +--------------+         |              |
                                 +--------------+

        +--------------+         +--------------+
  pmc-> | vtable       |   pmc-> | vtable       |
        +--------------+         +--------------+
        | attrib_count |         | pmc_pointer  |
        +--------------+         +--------------+
        | attribute #1 |
        +--------------+
        | attribute #2 |
        +--------------+

A String PMC is a similar structure with all the needed data items
just like the current STRING structure. So eliminating the STRING type
doesn't impose any overhead at all, except the vtable access - but
that will be needed anyway.

=head2 2.3. Where are the flags?

Flags currently take one word per PMC. This is a lot of overhead. But
we basically only need two types of flags:

=over 4

=item What kind is that PMC

This information is just the vtable or additional information (flags)
in the vtable. All PMCs of one kind share one vtable anyway, so it's
much cheaper to use the vtable for that information than to provide
one additional word per PMC. Some flags are also eliminated by getting
rid of the distinction between PMCs and Buffers.

=item Flags used during garbage collection

GC flags are garbage collector-specific. A stop-the-world allocator
can e.g. use the 2 low bits of the vtable pointer for the I and I
flags. With I a nibble in a separate memory region is used. An
implicit reclamation GC scheme has additional pointers to manage the g
Re: A question about attribute functions
On Wed, Sep 01, 2004 at 10:41:37AM -0400, Aaron Sherman wrote:
: How do you declare attribute functions? Specifically, I was thinking
: about map and what kind of object it would return, and I stumbled on a
: confusing point:
:
:     class mapper does iterator {
:         has &.transform;
:         ...
:     }
:
: Ok, that's fine, but what kind of accessor does it get?
:
:     my mapper $x .= new(transform => ->{(1,2,3)});
:     $x.transform()
:
: would imply that you're calling method transform, not invoking the
: accessor function which does not have a method signature. Would you have
: to do this:
:
:     class mapper does iterator {
:         has Code $transform;
:         ...
:     }
:     ...
:     $x.transform.();
:
: ?

That might not work either.  This will, though:

    ($x.transform)();

Larry
Minor makefile fix
Hi,

Here's a small fix to the root.in makefile; this fix is needed to get
Parrot building again on Win32 and probably in some other places too.

Jonathan

makefile.diff
Description: Binary data
Re: Proposal for a new PMC layout and more
At 5:17 PM +0200 9/1/04, Leopold Toetsch wrote:
> Below is a pod document describing some IMHO worthwhile changes. I
> hope I didn't miss some issues that could inhibit the implementation.

Interesting. But... no.

Things are the way they are on purpose -- a lot of thought, a
not-inconsiderable amount of pain, and a lot of harsh experience went
into precursor designs, the current design, and the current
implementation. We're going to leave it as-is.
--
Dan

--it's like this---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
Re: Proposal for a new PMC layout and more
On Wed, 2004-09-01 at 11:17, Leopold Toetsch wrote:
> Comments welcome,

Honestly, much of this goes beyond my meager understanding of Parrot
internals, but I've read it, and most of it seems reasonable. Just one
point where you may not have considered a logical alternative:

> =head2 2.6. Morphing Undefs
>
> Currently all binary (and other) opcodes need an existing destination
> PMC. The normal sequence a compiler emits is something like this:
>
>   $P0 = new Undef
>   $P0 = a + b

Since you've lopped a lot of space off of PMCs, Undefs could be made
large enough to fit a basic buffer PMC (3 words). In that case, they
could always be upgraded in-place to integer PMCs, float PMCs, very
simple objects, references and buffers. Everything else would need to
go through a copy-upgrade.

The trade-off is that all PMCs would be 3 words unless special code
was emitted that avoided this for smaller (integer, float, reference)
PMCs.

I'm not saying that this is a BETTER plan, just an idea to think about
and a different set of trade-offs.

--
781-324-3772  [EMAIL PROTECTED]  http://www.ajs.com/~ajs
Re: Proposal for a new PMC layout and more
On Wed, Sep 01, 2004 at 05:17:55PM +0200, Leopold Toetsch wrote:
> PMCs are using too much memory (5 words for an Integer PMC, 11 + n
> words plus two indirection for an object with n attributes). The
> reduction of IIRC 9 words to the current 5 words almost doubled
> execution speed for not too small amounts of allocated PMCs.

I would be much happier if we got to a functionally complete
implementation of parrot with stable, useful APIs first. And then put
effort into optimising the implementation behind the scenes. Based on
more complete knowledge of how things performed on code generated from
real language compilers.

It may well turn out that your proposals make sense then, as well as
now. But I feel what's holding things up is not lack of speed, but
lack of completeness.

Nicholas Clark
Re: A question about attribute functions
On Wed, Sep 01, 2004 at 08:02:33AM -0700, Larry Wall wrote:
: That might not work either.  This will, though:
:
:     ($x.transform)();

So will

    $x.transform()();

for that matter...

Larry
Re: A question about attribute functions
Larry Wall skribis 2004-09-01  8:02 (-0700):
> :     $x.transform.();
> That might not work either.  This will, though:
>     ($x.transform)();

This is surprising. Can you please explain why .() won't work? I have
methods return subs quite often, and like that I can just attach ->()
to it to make them work for me.

I dislike parens. If $object.method.() will really not work, is there
a way to call it without adding parens? Adding parens for someone who
doesn't plan an entire line of code before typing it, means going back
(for me, this is the most important reason for using statement
modifiers; it's not just linguistically pleasing).

Juerd
Re: Proposal for a new PMC layout and more
On Sep-01, Leopold Toetsch wrote:
> Below is a pod document describing some IMHO worthwhile changes. I hope
> I didn't miss some issues that could inhibit the implementation.

Overall, I like it, although I'm sure I haven't thought of all of the
repercussions.

The one part that concerns me is the loss of the flags -- flags just
seem generally useful for a number of things. In the limit, each flag
could be replaced by an equivalent vtable entry that just returned
true or false, but that will only work for rarely-used flags due to
the extra levels of indirection. I suppose we could also have a large
class of PMCs that contained a flag word, and only the primitive PMCs
would lack it, but then the flags cannot be used without knowing the
type of PMC. It all comes down to the specific current and future uses
of flags. You've dealt with the GC flags; what about the rest?

The proposal would also expand the size of the vtable by a bit due to
the string vtable stuff. I don't know how much that is,
percentage-wise. And I suppose that increase is dwarfed by the
decrease due to eliminating the S variants. (Although that's another
part of the proposal that makes me nervous -- will MMD really take
care of all of the places where we care that we're going to a string,
specifically, rather than any other random PMC type? Strings are a
pretty widespread concept throughout the code base, and this is the
only highly user-visible part of the change.)

I also view the proposal as being comprised of several fairly
independent pieces. Something like:

 * Merging PMCs and Buffers
 * Merging STRINGs and PMCs
 * Removing GC-related flags and moving them to GC implementations
 * Removing the rest of the flags
 * Using Null instead of Undef
 * Moving "extra" stuff to before the PMC pointer
 * Using Refs to expand PMCs
 * Using DOD to remove the Ref indirection
 * Shrinking the base PMC size

...and whatever else I forgot. Not all of these are dependent on each
other, and could be implemented separately. And some are only
dependent in the sense that you'll make space or time performance
worse until you make the rest of the related changes. You could call
those design-dependent, rather than implementation-dependent.
Re: Proposal for a new PMC layout and more
At 5:17 PM +0200 9/1/04, Leopold Toetsch wrote:
> Below is a pod document describing some IMHO worthwhile changes. I
> hope I didn't miss some issues that could inhibit the implementation.

Okay, the "No" warrants more explanation.

First off, the current structure of PMCs, Buffers, and Strings is
definitely a mess, what with the multiple nested structs, semi-shared
data, and weird smallobject overlap. A lot of stuff that is, in
retrospect, crap has been layered on, so if this gets beaten up and
cleaned out I won't mind in the least.

The PMC scheme -- where PMCs are an immovable header with a vtable
slot, cache slot, and flag slot -- stays. It's this way on purpose,
and matches normal usage patterns (nicely efficiently) for perl 5 as
well as (oddly) most python and ruby usage. (Where there's a
preponderance of low-level types)

Buffers and strings are special-purpose constructs, or at least they
*should* be. They're segregated off for GC purposes. While they could
be unified with PMCs, I don't want them to be. They've specific,
special purposes, and as such they're staying the way they are.

Strings, FWIW, are *not* a perl 6 specific thing. The current string
design is sufficient, and *will* be used, for perl 5, python, and
ruby, as well as any other language that wants to live on parrot and
handle string data. While there's stuff to be added still, there's no
reason that I can see to mess with them.

Finally, Nicholas is right -- this is messing around with stuff that
already works. We're better off working on things that don't exist
yet, and leave this to later. If you want, we can hash out the changes
to sub calling (with the swapping interpreter structs we've been
arguing over), moving the return continuation/calling object/called
sub into the interp structure, and fixing up the JIT and exception
handling stuff to deal with it. That, at least, will be visible to
bytecode programs and worth getting done.
--
Dan

--it's like this---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
[perl #31419] PATCH: Fix for solaris platform asctime_r argument number mismatch
# New Ticket Created by  [EMAIL PROTECTED]
# Please include the string:  [perl #31419]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=31419 >

On Solaris 8, asctime_r takes 3 parameters instead of 2. The prototype
looks like this:

    char *asctime_r(const struct tm *tm, char *buf, int buflen);

This patch adds a time.c file to the solaris platform directory. The
only change in this time.c file and the stock generic/time.c is that
the call to asctime_r passes 26 as the buflen. The Solaris man page
for asctime_r says that the result will always be 26 characters
exactly. I'm guessing the POSIX asctime assumes that the buffer is at
least 26 characters, so assuming the buffer that Parrot_asctime_r gets
is at least 26 characters isn't a new risk.

Without this patch, the latest version of parrot from CVS will not
compile on Solaris 8.

parrot.solaris-asctime_r.patch
Description: Binary data
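
The figure of 26 can be checked with a small Python sketch. It relies on Python's `time.asctime`, which produces the same 24-character text as C's asctime but without the trailing newline; `asctime_c_buffer_size` is a hypothetical helper, and the check only holds for four-digit years.

```python
import time

def asctime_c_buffer_size(t):
    """Bytes a C caller needs for asctime's output of struct_time t.

    Python's time.asctime() returns the 24 visible characters
    ("Www Mmm dd hh:mm:ss yyyy"); C's asctime appends a newline and a
    terminating NUL, hence the + 2.
    """
    return len(time.asctime(t)) + 2

epoch = time.gmtime(0)
epoch_text = time.asctime(epoch)
needed = asctime_c_buffer_size(epoch)
```

For the epoch this gives a 24-character string and a required buffer of 26 bytes, matching the buflen the patch passes.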
Semantics for regexes
I promised Patrick this a while back but never got it, so here it is.

This is a list of the semantics that I see as needed for a regex
engine. When we have 'em, we'll map them to string ops, and may well
add in some special-case code for faster access.

*) extract substring
*) exact string compare
*) find string in string
*) find first character of class X in string
*) find first character not of class X in string
*) find boundary between X and not-X
*) Find boundary defined by arbitrary code (mainly for word breaks)
*) create new class X
*) add or subtract character to class X
*) create union|intersection|difference of two classes

I think this about does it, and we do some of this already. Are there
semantics people see as missing, or need more explanation? If so, pipe
up, we'll nail them down, then get the op mapping (with
implementation) and go from there.
--
Dan

--it's like this---
Dan Sugalski          even samurai
[EMAIL PROTECTED]     have teddy bears and even
                      teddy bears get drunk
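
The character-class items in the list above can be sketched in Python. This is a toy model, not a proposed Parrot op set: `CharClass` and the `find_*` functions are hypothetical names, a class is a plain set, and a "character" is a single code point, which is a simplification.

```python
class CharClass:
    """Toy character class: an immutable set of single code points."""

    def __init__(self, chars=""):
        self.chars = frozenset(chars)

    def __contains__(self, ch):
        return ch in self.chars

    def add(self, ch):
        return CharClass(self.chars | {ch})

    def remove(self, ch):
        return CharClass(self.chars - {ch})

    def union(self, other):
        return CharClass(self.chars | other.chars)

    def intersection(self, other):
        return CharClass(self.chars & other.chars)

    def difference(self, other):
        return CharClass(self.chars - other.chars)


def find_first_in(s, cls, start=0):
    """Index of the first character of class cls at or after start, else -1."""
    for i in range(start, len(s)):
        if s[i] in cls:
            return i
    return -1

def find_first_not_in(s, cls, start=0):
    """Index of the first character not in cls at or after start, else -1."""
    for i in range(start, len(s)):
        if s[i] not in cls:
            return i
    return -1

def find_boundary(s, cls, start=0):
    """First index i >= max(start, 1) where s[i-1] and s[i] differ in membership."""
    for i in range(max(start, 1), len(s)):
        if (s[i - 1] in cls) != (s[i] in cls):
            return i
    return -1


digits = CharClass("0123456789")
subject = "abc123def"
first_digit = find_first_in(subject, digits)
after_digits = find_first_not_in(subject, digits, first_digit)
```

On "abc123def" the first digit is at offset 3 and the first non-digit after it at offset 6; the set operations cover the create, add/subtract and union|intersection|difference items.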
Re: A question about attribute functions
On Wed, Sep 01, 2004 at 07:08:57PM +0200, Juerd wrote:
: Larry Wall skribis 2004-09-01  8:02 (-0700):
: > :     $x.transform.();
: > That might not work either.  This will, though:
: >     ($x.transform)();
:
: This is surprising. Can you please explain why .() won't work? I have
: methods return subs quite often, and like that I can just attach ->() to
: it to make them work for me.

Because in Perl 5, we haven't defined ->() to be the long form of
a "subscripty" ().  But in Perl 6, we have

    $a[$x]      $a .[$x]        # same thing
    $a{$x}      $a .{$x}        # same thing
    $a($x)      $a .($x)        # same thing

That says to me that we also have

    $a.b($x)    $a.b .($x)      # same thing

In the particular case of an attribute, there are no arguments, so the
parens are optional.  But the fact that they're optional means that if
you do put parens, they belong to the method call, not the returned
value.  (And we have to make them optional rather than mandatorily
missing, since we require the parens to interpolate an attribute in
double-quote context.)

: I dislike parens. If $object.method.() will really not work, is there a
: way to call it without adding parens? Adding parens for someone who
: doesn't plan an entire line of code before typing it, means going back
: (for me, this is the most important reason for using statement
: modifiers; it's not just linguistically pleasing).

Well, it's still extra parens, but my other solution:

    $object.method()()

has the benefit of not forcing you to (horrors!) plan in advance.

I suppose you're one of those clever people who prefer to rewrite the
OS by typing

    cat >/dev/kmem
    [EMAIL PROTECTED]@[EMAIL PROTECTED]@7¥&$^@ [EMAIL PROTECTED]@...
    ^D

:-)

Larry
[perl #31423] [PATCH] two tests for NCI
# New Ticket Created by  Bernhard Schmalhofer
# Please include the string:  [perl #31423]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=31423 >

Hi,

this patch adds two tests to t/pmc/nci.t.

The first new test should be a platform independent check of
get_string() of the ParrotLibrary PMC.

The second new test is a callback test ported from PASM to PIR.

CU, Bernhard

--
/* [EMAIL PROTECTED] */

nci_tests_20040901.patch
Description: Binary data
Re: Semantics for regexes
On Wed, Sep 01, 2004 at 01:57:32PM -0400, Dan Sugalski wrote: : I promised Patrick this a while back but never got it, so here it is. : : This is a list of the semantics that I see as needed for a regex : engine. When we have 'em, we'll map them to string ops, and may well : add in some special-case code for faster access. : : *) extract substring Mostly you don't want to go to the trouble of extracting substrings until you're forced to, because you're continually creating and destroying backrefs into your string, and you don't want to be copying characters around for that. As long as there's some kind of COWish semantics to keep around an original copy of the string being searched, that's probably all the regex engine itself wants. It's generally the outer code that wants to make a copy of $1 et al. : *) exact string compare : *) find string in string And maybe case insensitive variants of these, unless it turns out to be better to combine it with other required composition/decomposition. But then matching against the contents of a variable repeatedly is likely to induce repeated canonicalizations. : *) find first character of class X in string : *) find first character not of class X in string We're gonna run into the "what's a character?" issue here, especially at higher Unicode levels where what the user things of as a single character is really a sequence of codepoints. From perspective of a positive match, the character lengths can be largely self-defining. >From the perspective of a negative match, the engine has to know what "." means so you can skip one when the class doesn't match. (And the length of "." doesn't necessarily map to the length of all the entries in the class...) In particular, \n in Perl 6 has to match a set of weird sequences. Which arguably could be matched by a routine if character classes aren't up to it. 
(Also, the Perl 6 parser may be based on the notion of a set of hash keys being treated a bit like a lex grammar, and if we can use character class lookup for that, it might need to have some "longest token first" semantics. Which you might need for ordinary classes anyway as soon as you admit sequences.) (Also, minor nit, I'm not sure "find" is the right verb here and elsewhere. Mostly the regex engine just wants to check the "current" location. The rest is control flow.) : *) find boundary between X and not-X : *) Find boundary defined by arbitrary code (mainly for word breaks) We might have to use arbitrary code to match arrays and hashes as well, if the opcodes support only scalar string matches. : *) create new class X : *) add or subtract character to class X : *) create union|intersection|difference of two classes Not sure you really need opcodes for those if character classes are just real objects with a particular interface. : I think this about does it, and we do some of this already. Are there : semantics people see as missing, or need more explanation? If so, : pipe up, we'll nail them down, then get the op mapping (with : implementation) and go from there. I think that most of the other issues revolve around control flow and remembering your current state, and being able to backtrack out of that state, all of which Parrot can presumably handle with existing ops, though perhaps not as efficiently as we might like. I see one other potential gotcha with respect to backtracking and closures. In P6, a closure can declare a hypothetical variable that is restored only if the closure exits "unsuccessfully". Within a rule, an embedded closure is unsuccessful if it is backtracked over. But that implies that you can't know whether you have a successful return until the entire regex is matched, all the way down, and all the way back out the top, or at least out far enough that you know you can't backtrack into this closure. 
Abstractly, the closure doesn't return until the entire rest of the match is decided. Internally, of course, the closure probably returns as soon as you run into the end of it. So we may have to jimmy the meaning of hypotheticality in that context to defer undoing such variables until we hit a failure continuation of some sort. That's *probably* doable with the current opcodes, but maybe not optimally. In any event, we have to do all that anyway for $1, $2, et al. whether they're inside or outside of closures. Larry
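The "what's a character?" point earlier in this message can be made concrete with a rough Python sketch. It is not Parrot code. It assumes, as a deliberate simplification, that a "character" is one base codepoint plus any combining marks that follow it (real grapheme segmentation has more rules), and the names `char_end` and `first_not_in_class` are hypothetical. It shows why a negative match has to know how long "." is before it can skip anything:

```python
import unicodedata


def char_end(text, pos):
    """Return the index just past the 'character' starting at pos.

    Simplifying assumption: a character is one base codepoint followed
    by zero or more combining marks.
    """
    end = pos + 1
    while end < len(text) and unicodedata.combining(text[end]):
        end += 1
    return end


def first_not_in_class(text, cls):
    """Return the offset of the first character not in cls, or None.

    cls is a set of strings; an entry may span several codepoints.
    """
    pos = 0
    while pos < len(text):
        end = char_end(text, pos)
        if text[pos:end] not in cls:
            return pos
        pos = end          # skip one whole character, not one codepoint
    return None
```

With the class {"a", "e"} and the text "ae" followed by U+0301 (combining acute), this reports offset 1, because "e" plus the accent is one character that is not in the class. A codepoint-by-codepoint scan would instead accept the bare "e" and stop at the accent.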
Re: Semantics for regexes
On Wed, Sep 01, 2004 at 01:07:49PM -0700, Larry Wall wrote: : We might have to use arbitrary code to match arrays and hashes as well, : if the opcodes support only scalar string matches. I really wasn't being very clear about this. For efficiency we may need "trie" support (or something like it) to match various strings in parallel. My point is that it could very well be the case that character classes are just a specific application of this. Larry
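As a rough illustration of the trie idea above (Python, not Parrot ops), the sketch below matches several strings in parallel at the current position and prefers the longest one, which is the "longest token first" behaviour mentioned earlier in the thread. The `Trie` class and its `longest_match` method are hypothetical names invented here. A character class then becomes a trie whose entries happen to be one "character" long, where a character may be several codepoints:

```python
class Trie:
    """A minimal trie over strings; each node is itself a Trie."""

    def __init__(self, words=()):
        self.children = {}
        self.terminal = False   # True if a word ends at this node
        for word in words:
            self.add(word)

    def add(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.terminal = True

    def longest_match(self, text, pos=0):
        """Return the end offset of the longest entry found at pos, or None."""
        node = self
        best = None
        i = pos
        while i < len(text) and text[i] in node.children:
            node = node.children[text[i]]
            i += 1
            if node.terminal:
                best = i        # remember the longest complete entry so far
        return best


# A newline "class" whose entries are not all the same length.
newline = Trie(["\n", "\r", "\r\n"])
```

Here `newline.longest_match("a\r\nb", 1)` consumes both codepoints of the CR LF pair rather than stopping after the CR, and only the current location is checked; scanning forward is left to the caller's control flow.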
Re: Semantics for regexes
On Wed, 2004-09-01 at 16:07, Larry Wall wrote:

> I see one other potential gotcha with respect to backtracking and
> closures. In P6, a closure can declare a hypothetical variable
> that is restored only if the closure exits "unsuccessfully". Within
> a rule, an embedded closure is unsuccessful if it is backtracked over.
> But that implies that you can't know whether you have a successful
> return until the entire regex is matched, all the way down, and all the
> way back out the top, or at least out far enough that you know you
> can't backtrack into this closure. Abstractly, the closure doesn't
> return until the entire rest of the match is decided. Internally,
> of course, the closure probably returns as soon as you run into the
> end of it.

Let's get concrete:

    rule foo { a $x:=(b*) c }
    "abbabc"

So, if I understand Parrot and Perl 6 correctly (heh, fat chance), a slight modification to the calling convention of the closure that represents a rule (possibly even a raw .Closure) could add a pad that the callee is expected to fill in with any hypotheticals defined during execution.

The following would happen in the example above:

    store_lex "bb" into hypopad("$x") after "abb"
    find "a" and fail the rule, backtracking (clear hypopad("$x"))
    store_lex "b" into hypopad("$x") after backtracking over one "b"
    find "b" next and fail the rule, backtracking again (clear)
    store_lex "b" into hypopad("$x") after second "ab"
    find "c" and succeed rule foo, return hypopad

Essentially every close-paren triggers binding, and every back-track over a close-paren triggers clearing.

Because this is all part of the calling convention for a rule, there's no difference between a rule "passing" back hypotheticals to its caller and a sub-rule doing so to the rule which called IT.

Is that workable? Does it address your concern, Larry, or did I miss your point?

-- 
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
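The trace above can be replayed with a small Python sketch. This is not how Parrot or any rules engine is implemented; it hard-codes the one rule, reads the quantifier as b+ (which is what the trace assumes), and uses an ordinary dict as a stand-in for the "hypopad". The function name `match_foo` and the log format are made up for illustration:

```python
def match_foo(text):
    """Search text for  a $x:=(b+) c  and log each bind and clear.

    Returns (hypopad, log) on success and (None, log) on failure.
    """
    log = []
    hypopad = {}
    for start in range(len(text)):
        if not text.startswith("a", start):
            continue
        # Greedy b+: find the end of the longest run of "b" after the "a".
        end = start + 1
        while end < len(text) and text[end] == "b":
            end += 1
        # Try the longest run first, then give back one "b" at a time.
        # The range stops before start + 1, so at least one "b" is kept.
        for stop in range(end, start + 1, -1):
            hypopad["$x"] = text[start + 1:stop]    # close-paren: bind
            log.append("bind " + hypopad["$x"])
            if text.startswith("c", stop):
                return hypopad, log                 # rule succeeded
            del hypopad["$x"]                       # backtracked over ")": clear
            log.append("clear")
    return None, log
```

Running it on "abbabc" binds "bb", clears, binds "b", clears, then binds "b" again at the second "a" and succeeds, which is the sequence described in the message.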
Re: Semantics for regexes
On Wed, 2004-09-01 at 16:33, Aaron Sherman wrote:

> rule foo { a $x:=(b*) c }

In the rest of my message I acted as if that read:

    rule foo { a $x:=(b+) c }

so, we may as well pretend that that's what I meant to say ;-)

-- 
781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~ajs
Re: Semantics for regexes
On Wed, Sep 01, 2004 at 01:07:49PM -0700, Larry Wall wrote: > On Wed, Sep 01, 2004 at 01:57:32PM -0400, Dan Sugalski wrote: > : I promised Patrick this a while back but never got it, so here it is. > : > : This is a list of the semantics that I see as needed for a regex > : engine. When we have 'em, we'll map them to string ops, and may well > : add in some special-case code for faster access. > : > : *) extract substring > > Mostly you don't want to go to the trouble of extracting substrings > until you're forced to, because you're continually creating and > destroying backrefs into your string, and you don't want to be > copying characters around for that. As long as there's some kind > of COWish semantics to keep around an original copy of the string > being searched, that's probably all the regex engine itself wants. Indeed, what I've been working towards in my head is that a rule would consist of a set of "components" which can nest and can refer to other rules, and each component would simply keep track of its current start and end positions of what it matches in the string, as well as what to do if that component needs to backtrack/continue because a successive component is unable to achieve its part of the match (or if we're looking for an C<:exhaustive> set of matches). As Larry mentioned in a previous conversation, the rules engine will need to remember its state without keeping a large stack (fortunately there appear to be a number of possible optimizations here). So, what a rule component really wants to do is to be able to perform comparisons and tests of substrings in-place, as opposed to extracting them to perform the match. Opcodes to support that would be most helpful (they may already be there--I just haven't looked at that part yet). 
And yes, my current state of thinking is that each rule component is a Parrot object of some sort that knows where it fits in the rule and can pass information and control flow to/from its neighbors in trying to get the rule(s) to match the target string. > : *) find first character of class X in string > : *) find first character not of class X in string > We're gonna run into the "what's a character?" issue here, especially > at higher Unicode levels where what the user thinks of as a single > character is really a sequence of codepoints. > : [...] > : *) create new class X > : *) add or subtract character to class X > : *) create union|intersection|difference of two classes > Not sure you really need opcodes for those if character classes are > just real objects with a particular interface. Indeed, I was thinking that character classes would just be more instances of rule component objects, and these would have methods for building unions/intersections/differences. Or, the rules compiler itself can do much of this when it constructs the character class components, rather than try to build them dynamically in Parrot. > (Also, the Perl 6 parser may be based on the notion of a set of hash keys > being treated a bit like a lex grammar, Yes, I think this is likely. > (Also, minor nit, I'm not sure "find" is the right verb here and > elsewhere. Mostly the regex engine just wants to check the "current" > location. The rest is control flow.) Yes, I'm (pardon the pun) finding that advanced "string find" operations may not be of great importance until we start looking at some special-case optimizations. I'd say to wait for those opcodes until we find (sorry!) we need them. > I see one other potential gotcha with respect to backtracking and > closures. In P6, a closure can declare a hypothetical variable > that is restored only if the closure exits "unsuccessfully". [...] Indeed, but I'm gonna cross that bridge when I get to it. Pm
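The idea from earlier in this message, that a component records only start and end offsets and compares in place, might look roughly like this in Python. It is a sketch of the bookkeeping only, not of a Parrot object: the `Literal` class and its methods are hypothetical, and backtracking and neighbour links are left out. The comparison uses `str.startswith` with an offset, so no slice is built in the Python code unless a caller asks for the matched text:

```python
from dataclasses import dataclass


@dataclass
class Literal:
    """A rule component that matches a fixed string at a given offset."""

    text: str
    start: int = -1   # offsets into the target; -1 means "no match recorded"
    end: int = -1

    def match(self, target, pos):
        """Test for self.text at pos and record offsets only."""
        if target.startswith(self.text, pos):
            self.start, self.end = pos, pos + len(self.text)
            return True
        self.start = self.end = -1
        return False

    def extract(self, target):
        """Copy the matched substring out; only the outer code needs this."""
        return target[self.start:self.end]
```

A component such as `Literal("bc")` matched against "abcd" at offset 1 would record (1, 3) and only produce "bc" if `extract` is called, which mirrors the point that copying $1 is the outer code's business.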
Re: Semantics for regexes
On Wed, Sep 01, 2004 at 04:33:24PM -0400, Aaron Sherman wrote: : On Wed, 2004-09-01 at 16:07, Larry Wall wrote: : : > I see one other potential gotcha with respect to backtracking and : > closures. In P6, a closure can declare a hypothetical variable : > that is restored only if the closure exits "unsuccessfully". Within : > a rule, an embedded closure is unsuccessful if it is backtracked over. : > But that implies that you can't know whether you have a successful : > return until the entire regex is matched, all the way down, and all the : > way back out the top, or at least out far enough that you know you : > can't backtrack into this closure. Abstractly, the closure doesn't : > return until the entire rest of the match is decided. Internally, : > of course, the closure probably returns as soon as you run into the : > end of it. : : Let's get concrete: : : rule foo { a $x:=(b*) c } : "abbabc" : : So, if I understand Parrot and Perl 6 correctly (heh, fat chance), a : slight modification to the calling convention of the closure that : represents a rule (possibly even a raw .Closure) could add a pad that : the callee is expected to fill in with any hypotheticals defined during : execution. Okay, except that hypotheticality is an attribute of a variable's value, not of the pad it's in. As you wrote it above, $x would refer to an external variable, which might well be in the outer lexical pad. You can write $?x instead, which makes it automatically scoped to the current rule (that is, it lives in the $0 object). But again, that's largely independent of whether it's hypothetical. The binding you did implies hypotheticality, but within an embedded closure it wouldn't be hypothetical unless you said "let". 
That is,

    my $x;
    rule foo { a $x:=(b+) c }

is shorthand for something like

    my $x;
    rule foo { a (b+) { let $x := $1 } c }

: The following would happen in the example above:
:
: store_lex "bb" into hypopad("$x") after "abb"
: find "a" and fail the rule, backtracking (clear hypopad("$x"))
: store_lex "b" into hypopad("$x") after backtracking over one "b"
: find "b" next and fail the rule, backtracking again (clear)
: store_lex "b" into hypopad("$x") after second "ab"
: find "c" and succeed rule foo, return hypopad
:
: Essentially every close-paren triggers binding, and every back-track
: over a close-paren triggers clearing.

Yes, that's essentially correct. My quibble was simply that it may be hard to keep track of what to clear out in the case of calling a failure continuation.

: Because this is all part of the calling convention for a rule, there's
: no difference between a rule "passing" back hypotheticals to its caller
: and a sub-rule doing so to the rule which called IT.

Again, hypotheticality is (these days) independent of scope, though a variable scoped to a rule certainly cannot live longer than its $0.

: Is that workable? Does it address your concern, Larry, or did I miss
: your point?

Well, kind of, but it's the "how" that gets interesting...

Larry
[perl #31424] PATCH: Fix for parrot linking issue on Solaris 8
# New Ticket Created by [EMAIL PROTECTED] # Please include the string: [perl #31424] # in the subject line of all future correspondence about this issue. # http://rt.perl.org:80/rt3/Ticket/Display.html?id=31424 > The attached patch fixes the solaris hints file to force the use of 'c++' for linking if Configure.pl finds gcc. Without this patch, it links with gcc which fails since it apparently can't find some of the c++ symbols from icu. This patch also changes the order of the Configure.pl tests so that the gcc detection step is run before the hints step. This has the side effect of fixing the test that was already in the solaris hints file. As near as I can tell, this doesn't break anything, at least on solaris. I'm not sure this is the correct fix for this issue, but it fixes it for me. If anyone has suggestions for a better way to handle this, let me know and I'll redo it. parrot.solaris-c++-link.patch Description: Binary data
Re: Minor makefile fix
At 5:03 PM +0100 9/1/04, Jonathan Worthington wrote: Hi, Here's a small fix to the root.in makefile; this fix is needed to get Parrot building again on Win32 and probably in some other places too. Applied, thanks. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [perl #31419] PATCH: Fix for solaris platform asctime_r argument number mismatch
At 8:37 AM -0700 9/1/04, [EMAIL PROTECTED] (via RT) wrote: On Solaris 8, asctime_r takes 3 parameters instead of 2. The prototype looks like this: char *asctime_r(const struct tm *tm, char *buf, int buflen); This patch adds a time.c file to the solaris platform directory. The only change in this time.c file and the stock generic/time.c is that the call to asctime_r passes 26 as the buflen. The Solaris man page for asctime_r says that the result will always be 26 characters exactly. I'm guessing the POSIX asctime assumes that the buffer is at least 26 characters, so assuming the buffer that Parrot_asctime_r gets is at least 26 characters isn't a new risk. Without this patch, the latest version of parrot from CVS will not compile on Solaris 8. Applied, thanks -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [perl #31424] PATCH: Fix for parrot linking issue on Solaris 8
At 4:16 PM -0700 9/1/04, [EMAIL PROTECTED] (via RT) wrote: The attached patch fixes the solaris hints file to force the use of 'c++' for linking if Configure.pl finds gcc. Without this patch, it links with gcc which fails since it apparently can't find some of the c++ symbols from icu. This patch also changes the order of the Configure.pl tests so that the gcc detection step is run before the hints step. This has the side effect of fixing the test that was already in the solaris hints file. As near as I can tell, this doesn't break anything, at least on solaris. I'm not sure this is the correct fix for this issue, but it fixes it for me. If anyone has suggestions for a better way to handle this, let me know and I'll redo it. Good enough -- applied, thanks! -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [perl #31423] [PATCH] two tests for NCI
At 12:20 PM -0700 9/1/04, Bernhard Schmalhofer (via RT) wrote: this patch adds two tests to t/pmc/nci.t. The first new test should be a platform independent check of get_string() of the ParrotLibrary PMC. The second new test is a callback test ported from PASM to PIR. Applied, thanks. -- Dan --it's like this--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Test::Harness/prove: printing the test name when a test fails
On Tue, Aug 31, 2004 at 06:24:55PM +1000, Andrew Savige ([EMAIL PROTECTED]) wrote: > I told him to use verbose mode (prove -v) but he still complained. > Actually, I agree with him that when a test fails (even when not > in verbose mode) it makes sense to print out as much useful > information as possible (including the test name). I can't see changing it. What if there are 1000 failed tests? xoa -- Andy Lester => [EMAIL PROTECTED] => www.petdance.com => AIM:petdance
Re: Semantics for regexes
On Sep-01, Dan Sugalski wrote: > > This is a list of the semantics that I see as needed for a regex > engine. When we have 'em, we'll map them to string ops, and may well > add in some special-case code for faster access. > > *) extract substring > *) exact string compare > *) find string in string > *) find first character of class X in string > *) find first character not of class X in string > *) find boundary between X and not-X > *) Find boundary defined by arbitrary code (mainly for word breaks) Huh? What do you mean by "semantics"? The only semantics needed are the minimum necessary to answer the question "is the fred at offset i equal to the fred X?" (Sorry, not sure if fred is actually character or codepoint or whatever, and is probably all of them at different levels.) We also almost certainly need to be able to do character class comparisons, although if you assume that you can always transcode to what the regex was compiled with, then you don't even need that -- instead, you need to be able to convert to something like a difference list of numbered freds. But if we're talking about semantics, then yes you need the character class manipulation. Everything else in this list sounds like optimizations to me, and probably not the right optimizations (I don't think it's possible to predict what will be useful yet.) For other things that parrot will be used for, I suspect that the first 3 will be needed. I'm curious as to how you came up with that list; it seems to imply a particular way of implementing the grammar engine. I would expect all of that, barring certain optimizations, to be done directly with existing pasm instructions. There will be a need for saving a stack of former values of hypothetical variables, which can also be done with pasm ops but might interact with overloaded assignment or something wacky like that.
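The "stack of former values of hypothetical variables" mentioned above could be kept as an undo trail, a technique borrowed here from Prolog-style implementations rather than anything the thread or Parrot specifies. In the Python sketch below, the `Trail` class and its `mark`, `let` and `undo_to` methods are hypothetical names. Each hypothetical assignment pushes the previous value, and a failure unwinds to a saved mark, which also answers the question of "what to clear out" when a failure continuation is taken:

```python
_MISSING = object()   # sentinel: the variable had no value before the let


class Trail:
    """Variables plus an undo log of their former values."""

    def __init__(self):
        self.vars = {}
        self._trail = []          # stack of (name, former value)

    def mark(self):
        """Return a token for the current state (a choice point)."""
        return len(self._trail)

    def let(self, name, value):
        """Assign hypothetically, remembering the former value."""
        self._trail.append((name, self.vars.get(name, _MISSING)))
        self.vars[name] = value

    def undo_to(self, mark):
        """Backtrack: restore everything assigned since mark, newest first."""
        while len(self._trail) > mark:
            name, old = self._trail.pop()
            if old is _MISSING:
                del self.vars[name]
            else:
                self.vars[name] = old
```

On success nothing needs to happen: the current values simply stay in place, and the trail can be discarded once no choice point can be re-entered. Overloaded assignment, which the message flags as a possible complication, is not modelled here.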
Re: Proposal for a new PMC layout and more
Dan Sugalski <[EMAIL PROTECTED]> wrote: > At 5:17 PM +0200 9/1/04, Leopold Toetsch wrote: >>Below is a pod document describing some IMHO worthwhile changes. I >>hope I didn't miss some issues that could inhibit the implementation. > Okay, the "No" warrants more explanation. Thanks. > First off, the current structure of PMCs, Buffers, and Strings is > definitely a mess, what with the multiple nested structs, semi-shared > data, and weird smallobject overlap. A lot of stuff that is, in > retrospect, crap has been layered on, so if this gets beaten up and > cleaned out I won't mind in the least. Well, that's what the proposal is for. Cleaning up and unifying existing mess. > The PMC scheme -- where PMCs are an immovable header with a vtable > slot, cache slot, and flag slot -- stays. It's this way on purpose, > and matches normal usage patterns (nicely efficiently) for perl 5 as > well as (oddly) most python and ruby usage. (Where there's a > preponderance of low-level types) I don't see that "matches normal usage patterns". Just the opposite of it. The current PMC structure doesn't easily allow creating e.g. Python's "all is an object" POV. An Integer just needs 2 words of information and not more. The rest (3 words) is just wasted. No interpreter I've looked at has fixed sized objects. OTOH aggregates have artificial helper structures to store needed information. A variable sized PMC covers that all and eliminates all indirections totally to access these data. I don't see efficiency either, neither in execution time nor in memory usage in the current scheme. > Buffers and strings are special-purpose constructs, or at least they > *should* be. They're segregated off for GC purposes. Buffers and strings are different because the current PMC structure doesn't allow or support an arbitrary object layout. This lack of functionality creates the need for Buffers. Which leads to more indirection in accessing a PMC's data and more overhead during GC. 
The unification into one coherent object model just simplifies all that stuff. > ... While they could > be unified with PMCs, I don't want them to be. They've specific, > special purposes, and as such they're staying the way they are. A PMC is specific enough. The vtable makes it special. The vtable defines the functionality of that very object. There isn't any real difference between a Buffer structure or an array-ish PMC. Both hold some amount of data. But we currently treat these two totally differently for no good reason. > Strings, FWIW, are *not* a perl 6 specific thing. The current string > design is sufficient, and *will* be used, for perl 5, python, and > ruby, as well as any other language that wants to live on parrot and > handle string data. While there's stuff to be added still, there's no > reason that I can see to mess with them. Well, the current need for a distinct STRING type arises just because of a lack in PMCs to deal with strings. E.g.

    $P0 = a_func_returning_a_string()
    $S0 = $P0
    $I0 = length $S0

That's the way to get at the length of a string. Python doesn't have a notion of a STRING, I'm not aware of anything like that in Perl5 either. So functions are returning objects aka PMCs. Mostly all operations are dealing with PMCs only. This is my experience coming from the Pie-thon quest, but not alone. The need for STRING opcodes and vtable/MMD functions just comes from a lack of functionality in PMCs. Unifying or just having String PMCs eliminates this lack and almost one third of opcodes. STRING operations aren't the fastest anyway. I don't see any reason not to just provide all these in PMCs (which we need anyway) and eliminate the duplication with a distinct type. Native integers and numbers do warrant the specialization. Processors and JIT can support these types natively. Nothing can be done with STRINGs*. These are just overhead and code duplication currently. 
> Finally, Nicholas is right -- this is messing around with stuff that > already works. We're better off working on things that don't exist > yet, and leave this to later. That's of course true. There is a lot of stuff that needs to be done and should be done before reworking internals deeply. OTOH a lot of currently todo stuff could immediately be done much more easily. We need some more PMCs e.g. to manage packfiles and code segments. The rigid structure of the fixed-sized PMCs is always a PITA when implementing new objects. Unicode string vtables is another issue, albeit I don't know, if some/all vtable slots are usable for string operations. But we got already e.g. concatenate or bitwise string vtables. > If you want, we can hash out the changes to sub calling (with the > swapping interpreter structs we've been arguing over), moving the > return continuation/calling object/called sub into the interp > structure, Of course, yes. That thread is BTW lacking another answer: what's the difference between your derived proposal and mine. > ... and fixing up the JIT
Re: Proposal for a new PMC layout and more
Steve Fink <[EMAIL PROTECTED]> wrote: > On Sep-01, Leopold Toetsch wrote: >> Below is a pod document describing some IMHO worthwhile changes. I hope >> I didn't miss some issues that could inhibit the implementation. > Overall, I like it, although I'm sure I haven't thought of all of the > repercussions. > The one part that concerns me is the loss of the flags -- flags just > seem generally useful for a number of things. In the limit, each flag > could be replaced by an equivalent vtable entry that just returned true > or false, I'm not thinking about vtable entries returning a flag bit. E.g. the presence of PObj_custom_mark_FLAG could as well be tested as:

    if (pmc->vtable->mark) // != NULL

Generally speaking the vtable mostly holds the information that is needed for one kind of PMC. More specialized PMCs can have their private flags (for example a Key PMC). But they are normally not needed. An Integer or Float PMC doesn't need any flags to perform its operation. The proposed scheme doesn't of course forbid private flags in the PMC's data section. But a lot of PMCs just don't need any flags. > The proposal would also expand the size of the vtable by a bit due to > the string vtable stuff. No. The vtable would very likely shrink. 46 vtable (or MMD) entries are currently used by STRING* operations. These would be just PMC operations, which we have anyway. > ... will MMD really take care of all of > the places where we care that we're going to a string, specifically, > rather than any other random PMC type? MMDs have to deal with that anyway. We have String PMCs. The vtables or MMD functions that currently take STRING* ought to be optimized shortcuts for STRING* arguments. But if a String PMC is passed, still The Right Thing has to happen. > I also view the proposal as being comprised of several fairly > independent pieces. Something like: * allow/allocate variable sized PMCs Then --yes. 
> * Merging PMCs and Buffers > * Merging STRINGs and PMCs That's the same thing, mostly. > * Removing GC-related flags and moving them to GC implementations We already have that. But it's not hidden or encapsulated. > * Removing the rest of the flags Yep. > * Using Null instead of Undef No. Undef is a totally different thing. There is no change here. The Null PMC catches program errors (like using a C NULL pointer). The Undef is just a placeholder that morphs to any other type. > * Moving "extra" stuff to before the PMC pointer > * Using Refs to expand PMCs > * Using DOD to remove the Ref indirection > * Shrinking the base PMC size Yep. That is related, though. > ... And some are only dependent > in the sense that you'll make space or time performance worse until you > make the rest of the related changes. You could call those > design-dependent, rather than implementation-dependent. Yes. leo