A serious stab at regexes

2001-11-04 Thread Brent Dax

Okay, this bunch of ops is a serious attempt at regular expressions.  I
had a discussion with japhy on this in the Monastery
(http://www.perlmonks.org/index.pl?node_id=122784), and I've come up
with something flexible enough to actually (maybe) work.  Attached is a
patch to modify core.ops and add re.h (defines data structures and such)
and t/op/re.t (six tests).  All tests, including the new ones, pass.

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I’m not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
--Dubya


--- core.ops.old	Sun Nov  4 03:07:31 2001
+++ core.ops	Sun Nov  4 03:05:42 2001
@@ -2,6 +2,8 @@
 ** core.ops
 */
 
+#include 
+
 =head1 NAME
 
 core.ops
@@ -1959,6 +1961,357 @@
 
 ###
 
+=head2 Regular expression operations
+
+These operations are used by the regular expression engine.  Unless 
+otherwise noted, any regexp opcode which takes an integer constant as its 
+last argument branches to that address if the op fails to match.
+
+=over 4
+
+=cut
+
+
+
+=item B<reMatch>(ic, s)
+
+=item B<reMatch>(ic, sc)
+
+Sets the string in $2 as the string to match against and branches to the 
+regular expression at the address specified as $1.
+
+=item B<reMatch>(i, s)
+
+=item B<reMatch>(i, sc)
+
+Same as the ic variant, but jumps to $1 instead of branching.
+
+=cut
+
+AUTO_OP reMatch(ic, s|sc) {
+   cur_re=mem_sys_allocate(sizeof(re_info));
+   
+   cur_re->string=$2;
+   cur_re->flags=0;
+   cur_re->index=0;
+   cur_re->minlength=0;
+   
+   /*
+   ** Allocate a stack
+   ** XXX There ought to be a function to do this, like
+   **  stack_make(interpreter, &cur_re->stack_base, &cur_re->stack_top) or something
+   */
+   cur_re->stack_base = mem_allocate_aligned(sizeof(struct StackChunk));
+   cur_re->stack_top = &cur_re->stack_base->entry[0];
+   cur_re->stack_base->used = 0;
+   cur_re->stack_base->free = STACK_CHUNK_DEPTH;
+   cur_re->stack_base->next = NULL;
+   cur_re->stack_base->prev = NULL;
+
+   /* push the current location onto the call stack--we're doing the equivalent of a sub call */
+   push_generic_entry(interpreter, &interpreter->control_stack_top, cur_opcode + 3, STACK_ENTRY_DESTINATION, NULL);
+
+   RETREL($1);
+}
+
+AUTO_OP reMatch(i, s|sc) {
+   cur_re=mem_sys_allocate(sizeof(re_info));
+
+   cur_re->string=$2;
+   cur_re->flags=0;
+   cur_re->index=0;
+   cur_re->minlength=0;
+   
+   /*
+   ** Allocate a stack
+   ** XXX There ought to be a function to do this, like
+   **  stack_make(interpreter, &cur_re->stack_base, &cur_re->stack_top) or something
+   */
+   cur_re->stack_base = mem_allocate_aligned(sizeof(struct StackChunk));
+   cur_re->stack_top = &cur_re->stack_base->entry[0];
+   cur_re->stack_base->used = 0;
+   cur_re->stack_base->free = STACK_CHUNK_DEPTH;
+   cur_re->stack_base->next = NULL;
+   cur_re->stack_base->prev = NULL;
+
+   push_generic_entry(interpreter, &interpreter->control_stack_top, cur_opcode + 3, STACK_ENTRY_DESTINATION, NULL);
+
+   /* jump to the first argument */
+   RETABS((opcode_t *)$1);
+}
+
+
+
+=item B<reFlags>(s)
+
+=item B<reFlags>(sc)
+
+Sets the regular expression's flags.  'i' sets the flag 
+RE_case_insensitive_FLAG, 's' sets the flag
+RE_single_line_FLAG, and 'm' sets the flag 
+RE_multiline_FLAG.  Currently only 's' is implemented.
+
+=cut
+
+AUTO_OP reFlags(s|sc) {
+   int i;
+   char ch;
+   
+   for(i=0; i < string_length($1); i++) {
+      /*
+      ** XXX this is a REALLY naughty thing to do--I 
+      **  shouldn't poke around inside the string like this 
+      */
+      ch=((char *)$1->bufstart)[i];
+
+      switch(ch) {
+         case 'i':
+            fprintf(stderr, "Warning: RE option /i not yet implemented");
+            RE_case_insensitive_SET(cur_re); break;
+         case 's':
+            RE_single_line_SET(cur_re); break;
+         case 'm':
+            fprintf(stderr, "Warning: RE option /m not yet implemented");
+            RE_multiline_SET(cur_re); break;
+         default:
+            fprintf(stderr, "Warning: unrecognized RE option /%c", ch);
+      }
+   }
+}
+
+
+
+=item B(s)
+
+=item B(sc)
+
+Sets the minimum number of characters that must be left in the 
+string for a match to be possible.  For example, the expression 
+/fo*bar/ must have at least 4 characters in the string left to
+match; the expression /fo+bar/ requires five characters.  This
+information 

Re: Yet another switch/goto implementation

2001-11-04 Thread Michael Fischer

On Nov 04, Daniel Grunblatt <[EMAIL PROTECTED]> took up a keyboard and banged out
> First of all you mistyped:
> -if ($c{do_opt_t} eq 'goto' and $c{cc} !~ /gcc/i ) {
> +if ($c{do_op_t} eq 'goto' and $c{cc} !~ /cc/i ) {

Hmm. That's not what my diff has. Point is, if you chose
'goto' and $c{cc} /isn't/ gcc, there's a problem. $c{cc} is
just whatever you compiled perl5 with, I believe.

> 
> 
> On Sat, 3 Nov 2001, Michael Fischer wrote:
> >
> > 2) replaces interp_guts.h with do_op.h
> 
> No, it doesn't, it's still using DO_OP from interp_guts.h
> 

D'oh! good catch. I only hit interpreter.c, which of course
didn't do much...really belongs in runops_cores.c.

> I really suggest that you do a do_op.c and a do_op.h and that you call
> goto_op_dispatch directly from runops_core.c (from runops_t0p0b0_core),
> because if I'm not wrong you are breaking -t ,-p and -b options.

Erm, I'm not sure how, as each of them does run in some form of
while(pc). Please enlighten me here.

> >
> > 5) Not the cleanest implementation perhaps, but largely
> >limited to ops2c.pl, and things should be fairly easy
> >to track down.
> >
> 
> I think your approach is much better and cleaner than mine, my brain was
> limited to unix :) so I never worried about anything besides gcc.
> It would also be nice if you can decide which dispatch method to use
> instead of asking.

Well, the point is to let developers have an /easy/ way to play around
with it, and see what happens on different arches, compilers, optimization
settings, etc.

I'm going to post a revised patch in a few minutes, with a number of 
caveats... namely that now the goto branch is compiling fine, but 
blowing up badly when run. I assume yours does not :-)

Why not have a look and see if you can't merge your goto system into
mine for getting it to be workable from Configure?

Win win.

Cheers.

Michael
-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 



RE: [Patch] Win32 Parrot_floatval_time Improved (1/1)

2001-11-04 Thread Richard J Cox

In article <[EMAIL PROTECTED]>, 
[EMAIL PROTECTED] (Brent Dax) wrote:
> Richard J Cox:
> # Firstly, 8am code this morning builds on Win32 without
> # problem, other than
> # configure.pl not knowing that link is the linker (which
> # appears to be down
> # to ActiveState not knowing).
> 
> Does it have to know?  If so, set it in the hints file
> (hints/mswin32.pl).

It seems it doesn't (however, once we get into creating DLLs etc. it will, 
since cl and link take different options).

Richard

BTW, just tested with VC++ 7 (part of VisualStudio.Net Beta 2) and it all 
compiles OK there too and the tests pass (well, with the same set of 
warnings about no return values in classes\intclass.c).

-- 
[EMAIL PROTECTED]



Win32 build and WINVER

2001-11-04 Thread Richard J Cox

Currently WINVER is not being set for a Win32 build, which leads to it 
being set in Windef.h (included by Windows.h) to 0x0500, or "build for 
Windows 2000".

This is OK until, for whatever reason, a Win2k-only API is called, at 
which point the built exe will not run on earlier versions of Windows. 
Thus I think this should be set somewhat lower (along with a couple of 
other defines that go with it).

If version-dependent code is used, the newer Windows-only APIs 
need to be dynamically loaded (and thus checked for at run time).
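
A minimal sketch of that run-time check, assuming a build that pins WINVER to 0x0400 (NT4/Win95). The choice of CreateHardLinkA as the example of a call that first shipped with Windows 2000, and the helper name have_create_hard_link, are illustrative assumptions rather than anything in the Parrot tree. On a non-Win32 build the helper simply reports the API as absent.

```c
#ifdef _WIN32
/* Pin the target version before windows.h is included. */
#define WINVER 0x0400
#define _WIN32_WINNT 0x0400
#include <windows.h>

/* Pointer type matching CreateHardLinkA, which older kernels do not export. */
typedef BOOL (WINAPI *create_hard_link_fn)(LPCSTR, LPCSTR,
                                           LPSECURITY_ATTRIBUTES);
#endif

/* Returns 1 if the newer API can be resolved at run time, 0 otherwise.
 * Nothing links against the API directly, so the executable still loads
 * on Windows versions that lack it. (Hypothetical helper name.) */
int have_create_hard_link(void)
{
#ifdef _WIN32
    HMODULE k32 = GetModuleHandleA("kernel32.dll");
    create_hard_link_fn fn;

    if (k32 == NULL)
        return 0;
    fn = (create_hard_link_fn)GetProcAddress(k32, "CreateHardLinkA");
    return fn != NULL;
#else
    return 0;   /* not a Win32 build: treat the API as absent */
#endif
}
```

Code that wants the newer call would test this once and fall back to an older code path when it returns 0.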

This of course leads to the question of what is the earliest Win32 version 
that Perl6 will support?

-- 
[EMAIL PROTECTED]



Revised yet another goto/switch...etc.

2001-11-04 Thread Michael Fischer

Ok, thanks to Daniel Grunblatt for pointing out
the obvious mis-adjustment of #including...
That bit is fixed, and lo, a whopping less-than-10%
improvement in speed with the switch() version of 
DO_OP. Hmm.

OTOH, my implementation of goto, based on Paolo's post
back when, is clearly broken in some fashion.
It compiles cleanly, and runs _very_ badly.
One of the tests nearly clobbered my machine.
Caution is advised. All the code for the goto sub
ends up in include/parrot/do_op.h, so you can 
certainly examine it before taking the risk. Better
wizards than I should have a gander at that bit
of black magic.

Suggestion: those who are handier at getting that
goto thing to work right might want to merge their
ideas against my patch (which has the advantage
of being almost totally localized to ops2c.pl and
being selectable at Configure time)

Ooof.

Michael
-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 



FW: Compiler Warnings? (1/1)

2001-11-04 Thread Brent Dax

The patch attached is courtesy of Richard J. Cox.  It fixes the VC++
warnings about functions declared to return a value but not actually
returning one.  I'll apply it in a couple hours if there aren't any
objections.

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I'm not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
--Dubya

-----Original Message-----
From: Richard J Cox [mailto:[EMAIL PROTECTED]]
Sent: Sunday, November 04, 2001 09:26
To: [EMAIL PROTECTED]
Subject: Compiler Warnings? (1/1)


I assume we are trying to create a warning-free build... thus this patch
stops VC++ complaining about functions that return values, but lack a
return statement in classes\intclass.c.

(Alternatively we can switch off warnings, but I don't think that is a
good idea at all.)

--
[EMAIL PROTECTED]



intclass_c.diff
Description: Binary data


Re: [PATCH] Computed goto, super-fast dispatching.

2001-11-04 Thread Daniel Grunblatt

Yes, you are right on that, but that is only on linux, not on *BSD (where
I tried it). I still don't know why that is. Can you try using gcc 3.0.2?

For the compiled version, please read both mops.c files; you will see there
is no difference except for the definition of the array which, if I'm not
missing something, doesn't have anything to do with the _benchmark_.

Daniel Grunblatt.

On Sun, 4 Nov 2001, Tom Hughes wrote:

> In message <[EMAIL PROTECTED]>
>   Daniel Grunblatt <[EMAIL PROTECTED]> wrote:
>
> > All:
> > Here's a list of the things I've been doing:
> >
> > * Added ops2cgc.pl which generates core_cg_ops.c and core_cg_ops.h from
> > core.ops, and modified Makefile.in to use it. In core_cg_ops.c resides
> > cg_core which has an array with the addresses of the label of each opcode
> > and starts the execution "jumping" to the address in array[*cur_opcode].
> >
> > * Modified interpreter.c to include core_cg_ops.h
> >
> > * Modified runcore_ops.c to discard the actual dispatching method and call
> > cg_core, but left everything else untouched so that -b,-p and -t keep
> > working.
> >
> > * Modified pbc2c.pl to use computed goto when handling jump or ret. Maybe
> > I can modify this once again not to define the array with the addresses
> > if it's not going to be used, but I don't think that in real life a program
> > won't use jump or ret, am I right?
> >
> > Hope someone finds this useful.
>
> I just tried it but I don't seem to be seeing anything like the speedups
> you are. All the times which follow are for a K6-200 running RedHat 7.2 and
> compiled -O6 with gcc 2.96.
>
> Without patch:
>
>   gosford [~/src/parrot] % ./test_prog examples/assembly/mops.pbc
>   Iterations:1
>   Estimated ops: 3
>   Elapsed time:  37.387179
>   M op/s:8.024141
>
>   gosford [~/src/parrot] % ./examples/assembly/mops
>   Iterations:1
>   Estimated ops: 3
>   Elapsed time:  3.503482
>   M op/s:85.629098
>
> With patch:
>
>   gosford [~/src/parrot-cg] % ./test_prog examples/assembly/mops.pbc
>   Iterations:1
>   Estimated ops: 3
>   Elapsed time:  29.850361
>   M op/s:10.050130
>
>   gosford [~/src/parrot-cg] % ./examples/assembly/mops
>   Iterations:1
>   Estimated ops: 3
>   Elapsed time:  4.515596
>   M op/s:66.436413
>
> So there is a small speed up for the interpreted version, but nothing
> like the three times speedup you had. The compiled version has actually
> managed to get slower...
>
> Tom
>
> --
> Tom Hughes ([EMAIL PROTECTED])
> http://www.compton.nu/
>
>




Re: vmem memory manager

2001-11-04 Thread Benoit Cerrina

>
> dan at his recent talk at boston.pm's tech meeting said he was leaning
> towards a copying GC scheme. this would be the split ram in half design
> and copy all objects to the other half at GC time. the old half is
> reclaimed (not even reclaimed, just ignored!) in one big chunk.
>
These schemes require double the necessary memory. If you have what is needed
for this (if you are able to move objects around), maybe a mark-compact algo
would be better.
In any case I didn't hear the talk, but previously I think I read him talking
about more elaborate generational schemes.  In this case a copying scheme
could be used for the lower generation and a mark-sweep for the upper.
Benoit




Re: Opcode numbers

2001-11-04 Thread Dan Sugalski

At 09:57 PM 11/3/2001 -0500, Brian Wheeler wrote:
>On Sat, 2001-11-03 at 21:40, Gregor N. Purdy wrote:
> >
> > None of these are issues with the approach I've been working on /
> > advocating. I'm hoping we can avoid these altogether.
> >
>
>
>I think this is a cool concept, but it seems like a lot of overhead with
>the string lookups.

For the core opcode set, the one shipped with perl, we absolutely must keep 
those numbers constant. There's no real good reason for them to change, and 
if they're constant we don't have to worry about fixing things up at 
runtime. That's a cost we then don't have to pay every time we load a 
compiled program up. (We have to pay it at compilation time regardless, of 
course, but no reason to pay it more than once)

Libraries loaded from disk are a separate matter entirely. I'm tempted to 
require that the exported opcodes stay in the same order for the lifetime 
of the library, but that pushes good version and export control off to the 
developer community as a whole. Building an export list that gets packaged 
up with modules isn't that tough, but I worry that people won't do it.

So I suppose we have three sorts of opcode libraries:

1) The core set. These must have constant numbers that don't change over 
the life of perl.
2) The core library set. These will have reserved sets of opcode numbers 
and shouldn't change either. (Stuff like the socket library, for example) 
We load 'em at runtime, though
3) User libraries. These we should have some method of cherry-picking the 
functions we want, and we won't count on the numbers for the opcodes in the 
library. Also loaded at runtime.

I really, *really* want to require that the export list for a module be 
constant, but I'm not sure we can get away with that safely.
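
To make the runtime fix-up cost concrete, here is a rough sketch of how a loader might map a library's exported opcode names to whatever numbers the running interpreter assigned. Every name in it (op_entry, resolve_op, fixup_ops, the sample ops) is hypothetical, and the linear search is only for illustration. Core ops would skip this step entirely because their numbers never change.

```c
#include <stddef.h>
#include <string.h>

/* One row of the interpreter's table of loaded ops (hypothetical layout). */
typedef struct {
    const char *name;
    int number;
} op_entry;

/* Look up the runtime number for an exported op name; -1 if not loaded. */
int resolve_op(const op_entry *table, size_t count, const char *name)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].number;
    }
    return -1;
}

/* Translate the op names a compiled program recorded into the numbers
 * valid in this interpreter.  names[i] is the op the program meant and
 * fixed[i] receives its runtime number.  Returns how many names could
 * not be resolved. */
size_t fixup_ops(const op_entry *table, size_t count,
                 const char *const *names, size_t n, int *fixed)
{
    size_t i, missing = 0;

    for (i = 0; i < n; i++) {
        fixed[i] = resolve_op(table, count, names[i]);
        if (fixed[i] < 0)
            missing++;
    }
    return missing;
}
```

This is the per-load cost that constant numbers avoid: one name lookup per library op the program uses, every time the program is loaded.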

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Dan Sugalski

At 12:19 PM 11/4/2001 -0500, Michael Fischer wrote:
>On Nov 04, Daniel Grunblatt <[EMAIL PROTECTED]> took up a keyboard and banged out
> > First of all you mistyped:
> > -if ($c{do_opt_t} eq 'goto' and $c{cc} !~ /gcc/i ) {
> > +if ($c{do_op_t} eq 'goto' and $c{cc} !~ /cc/i ) {
>
>Hmm. That's not what my diff has. Point is, if you chose
>'goto' and $c{cc} /isn't/ gcc, there's a problem. $c{cc} is
>just whatever you compiled perl5 with, I believe.

Perhaps we need a configure probe to see if a compiler does computed gotos. 
Just because it's always gcc now doesn't mean it won't be in other 
compilers later.
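
One way such a probe could look, offered only as a sketch: Configure would try to compile a tiny C file that uses the labels-as-values extension and treat a clean compile plus a zero result as "computed goto supported". The function name and the idea of wrapping it in a one-line main are assumptions, not existing Configure code.

```c
/* Probe body: compiles only if the compiler accepts &&label and
 * goto *pointer (the GCC "labels as values" extension). */
int computed_goto_probe(void)
{
    static void *targets[] = { &&fail, &&ok };
    int which = 1;

    goto *targets[which];
fail:
    return 1;   /* reached only if the indirect jump went to the wrong label */
ok:
    return 0;   /* the jump landed where expected */
}
```

A wrapper such as `int main(void) { return computed_goto_probe(); }` would give Configure an exit status to test, so the check depends on what the compiler can do rather than on what it is called.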

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: [PATCH] Computed goto, super-fast dispatching.

2001-11-04 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
  Daniel Grunblatt <[EMAIL PROTECTED]> wrote:

> All:
>   Here's a list of the things I've been doing:
> 
> * Added ops2cgc.pl which generates core_cg_ops.c and core_cg_ops.h from
> core.ops, and modified Makefile.in to use it. In core_cg_ops.c resides
> cg_core which has an array with the addresses of the label of each opcode
> and starts the execution "jumping" to the address in array[*cur_opcode].
> 
> * Modified interpreter.c to include core_cg_ops.h
> 
> * Modified runcore_ops.c to discard the actual dispatching method and call
> cg_core, but left everything else untouched so that -b,-p and -t keep
> working.
> 
> * Modified pbc2c.pl to use computed goto when handling jump or ret. Maybe
> I can modify this once again not to define the array with the addresses
> if it's not going to be used, but I don't think that in real life a program
> won't use jump or ret, am I right?
> 
> Hope someone finds this useful.

I just tried it but I don't seem to be seeing anything like the speedups
you are. All the times which follow are for a K6-200 running RedHat 7.2 and
compiled -O6 with gcc 2.96.

Without patch:

  gosford [~/src/parrot] % ./test_prog examples/assembly/mops.pbc
  Iterations:1
  Estimated ops: 3
  Elapsed time:  37.387179
  M op/s:8.024141

  gosford [~/src/parrot] % ./examples/assembly/mops
  Iterations:1
  Estimated ops: 3
  Elapsed time:  3.503482
  M op/s:85.629098

With patch:

  gosford [~/src/parrot-cg] % ./test_prog examples/assembly/mops.pbc
  Iterations:1
  Estimated ops: 3
  Elapsed time:  29.850361
  M op/s:10.050130

  gosford [~/src/parrot-cg] % ./examples/assembly/mops
  Iterations:1
  Estimated ops: 3
  Elapsed time:  4.515596
  M op/s:66.436413

So there is a small speed up for the interpreted version, but nothing
like the three times speedup you had. The compiled version has actually
managed to get slower...
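
The figures above can be checked by hand. M op/s is the estimated op count divided by the elapsed seconds, in millions, so the elapsed time and rate on each run imply roughly 300 million ops (8.024141 x 37.387179 is about 300). The small sketch below, with made-up helper names, works out the ratios: the interpreted run comes out about 1.25 times faster with the patch, and the compiled run at about 0.78 times its unpatched speed.

```c
/* Millions of ops per second from an op count and elapsed seconds. */
double mops(double ops, double seconds)
{
    return ops / seconds / 1e6;
}

/* Speed ratio of a patched run over an unpatched run of the same work.
 * Values above 1.0 mean the patched run is faster. */
double speedup(double seconds_before, double seconds_after)
{
    return seconds_before / seconds_after;
}
```

For example, speedup(37.387179, 29.850361) covers the interpreted case and speedup(3.503482, 4.515596) the compiled one.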

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/




RE: [perl6]RE: Helping with configure

2001-11-04 Thread Dan Sugalski

At 10:36 PM 11/3/2001 -0800, Brent Dax wrote:
>Well, for now we're using Perl for Configure, but that won't be possible
>in the final version.  Nasty bootstrapping issues with that.  :^)

You'd be surprised... :)

Seriously, miniparrot, enough to do simple file ops, spawn external 
programs and check their statuses, and implement all the 'internal' stuff 
(regexes and suchlike things) is pretty easily doable. Full and fancy 
Parrot's a bigger task, but I think you'll find that we can provide a 
static config.h/platform.c/platform.h set that'll get things built enough 
to run a full probing configure, even on the odder platforms (Win32, VMS, 
MVS, and WinCE).

Don't forget that we *also* will have to provide a shell-specific base 
build script (DCL for VMS, batch for Windows, JCL (shudder) for MVS), so it 
can do platform-specific copying of files for the initial build.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Regex helper opcodes

2001-11-04 Thread Dan Sugalski

While I'm not going to dive too deep into regexes (I like what little 
sanity I have left, thanks :), here are a few opcodes I've been thinking of 
for making REs faster:

=begin proposed_opcodes

=item makebitlist sx, sy

Makes the string in X a bitmap, with one bit set in it for each character 
in Y. (So if Y was "AB" bits 65 and 66 would be set, assuming I remember my 
ASCII)

=item ifin sx, iy, DEST

If bit Y of bitlist X is set branch to DEST

=item ifnotin sx, iy, DEST

If bit Y of bitlist X is not set branch to DEST

=end proposed_opcodes

I think we already have ops to put the integer value of a single character 
(taken from an offset from the beginning of a string) into an integer 
register, but if we don't we should.
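
A rough C sketch of what these three ops might do internally, assuming 8-bit characters and a fixed 256-bit map. The type and function names are made up for illustration and are not part of any posted patch. In ASCII 'A' and 'B' are 65 and 66, so those are the two bits set for Y = "AB".

```c
#include <string.h>

/* 256 bits, one per possible 8-bit character value. */
typedef struct {
    unsigned char bits[32];
} bitlist;

/* makebitlist: clear the map, then set one bit per character in chars. */
void make_bitlist(bitlist *bl, const char *chars)
{
    memset(bl->bits, 0, sizeof bl->bits);
    for (; *chars != '\0'; chars++) {
        unsigned char c = (unsigned char)*chars;
        bl->bits[c >> 3] |= (unsigned char)(1u << (c & 7));
    }
}

/* ifin and ifnotin reduce to this test followed by a conditional branch.
 * Values outside 0..255 are reported as not set. */
int bit_is_set(const bitlist *bl, int value)
{
    if (value < 0 || value > 255)
        return 0;
    return (bl->bits[value >> 3] >> (value & 7)) & 1;
}
```

A character-class test then costs one shift and one mask, however many characters the class contains.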

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Win32 build and WINVER

2001-11-04 Thread Dan Sugalski

At 05:26 PM 11/4/2001 +, Richard J Cox wrote:
>This of course leads to the question of what is the earliest Win32 version
>that Perl6 will support?

Currently, I don't want to promise back before Win98, though if Win95 is no 
different from a programming standpoint (I have no idea if it is) then 
that's fine too. Win 3.1 and DOS are *not* target platforms, though if 
someone gets it going I'm fine with it.

I'd love to say the furthest back is NT4/WinMe but I don't think that's 
feasible.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Daniel Grunblatt



On Sun, 4 Nov 2001, Michael Fischer wrote:

> On Nov 04, Daniel Grunblatt <[EMAIL PROTECTED]> took up a keyboard and banged out
> > First of all you mistyped:
> > -if ($c{do_opt_t} eq 'goto' and $c{cc} !~ /gcc/i ) {
> > +if ($c{do_op_t} eq 'goto' and $c{cc} !~ /cc/i ) {
>
> Hmm. That's not what my diff has. Point is, if you chose
> 'goto' and $c{cc} /isn't/ gcc, there's a problem. $c{cc} is
> just whatever you compiled perl5 with, I believe.
>
> >
> >
> > On Sat, 3 Nov 2001, Michael Fischer wrote:
> > >
> > > 2) replaces interp_guts.h with do_op.h
> >
> > No, it doesn't, it's still using DO_OP from interp_guts.h
> >
>
> D'oh! good catch. I only hit interpreter.c, which of course
> didn't do much...really belongs in runops_cores.c.
>
> > I really suggest that you do a do_op.c and a do_op.h and that you call
> > goto_op_dispatch directly from runops_core.c (from runops_t0p0b0_core),
> > because if I'm not wrong you are breaking -t ,-p and -b options.
>
> Erm, I'm not sure how, as each of them does run in some form of
> while(pc). Please enlighten me here.

Yes, but everyone calls DO_OP and expects it to run just one opcode, not
enter a loop, right?

>
> > >
> > > 5) Not the cleanest implementation perhaps, but largely
> > >limited to ops2c.pl, and things should be fairly easy
> > >to track down.
> > >
> >
> > I think your approach is much better and cleaner than mine, my brain was
> > limited to unix :) so I never worried about anything besides gcc.
> > It would also be nice if you can decide which dispatch method to use
> > instead of asking.
>
> Well, the point is to let developers have an /easy/ way to play around
> with it, and see what happens on different arches, compilers, optimization
> settings, etc.
>
> I'm going to post a revised patch in a few minutes, with a number of
> caveats... namely that now the goto branch is compiling fine, but
> blowing up badly when run. I assume yours does not :-)
>
> Why not have a look and see if you can't merge your goto system into
> mine for getting it to be workable from Configure?
>

Sure, yes, but let's first decide if we really want my goto system, because
I don't know if the speedups are what I thought they were. Can anyone try
it on Windows or other platforms/operating systems?


> Win win.
>
> Cheers.
>
> Michael
> --
> Michael Fischer 7.5 million years to run
> [EMAIL PROTECTED]printf "%d", 0x2a;
> -- deep thought
>


Daniel Grunblatt.




Re: vmem memory manager

2001-11-04 Thread Dan Sugalski

At 07:34 PM 11/4/2001 +0100, Benoit Cerrina wrote:
> >
> > dan at his recent talk at boston.pm's tech meeting said he was leaning
> > towards a copying GC scheme. this would be the split ram in half design
> > and copy all objects to the other half at GC time. the old half is
> > reclaimed (not even reclaimed, just ignored!) in one big chunk.
> >
>These schemes require double the necessary memory. If you have what is needed
>for this (if you are able to move objects around), maybe a mark-compact algo
>would be better.
>In any case I didn't hear the talk, but previously I think I read him talking
>about more elaborate generational schemes.  In this case a copying scheme
>could be used for the lower generation and a mark-sweep for the upper.

I've not made any promises as to what type of GC system we'll use. I'm 
gearing things towards a copying collector, but I'm also trying to make 
sure we don't lock ourselves out of a generational scheme. (I really don't 
want to have to snag a huge ToSpace if I can avoid it) We'll probably have 
a reasonably naive single FromSpace and single ToSpace implementation to 
start, just for simplicity, but I'm not counting on it being permanent.

I know things are a little fuzzy in the GC arena, but that's on purpose for 
the moment.
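
For readers who have not met the terms, a toy version of the naive single FromSpace / single ToSpace scheme might look like the sketch below. It assumes fixed-size objects with two reference slots and a caller-supplied root array. All of the names are hypothetical and none of this is Parrot's actual design.

```c
#include <stddef.h>

#define SPACE_OBJECTS 64

/* A toy object: one payload word and up to two references. */
typedef struct object {
    struct object *forward;        /* new address once the object is copied */
    struct object *left, *right;
    int value;
} object;

typedef struct {
    object space_a[SPACE_OBJECTS];
    object space_b[SPACE_OBJECTS];
    object *from, *to;             /* current FromSpace and ToSpace */
    size_t used;                   /* objects allocated in FromSpace */
} heap;

void heap_init(heap *h)
{
    h->from = h->space_a;
    h->to = h->space_b;
    h->used = 0;
}

/* Bump allocation; NULL means FromSpace is full and a collection is due. */
object *heap_alloc(heap *h, int value)
{
    object *o;

    if (h->used == SPACE_OBJECTS)
        return NULL;
    o = &h->from[h->used++];
    o->forward = NULL;
    o->left = o->right = NULL;
    o->value = value;
    return o;
}

/* Copy one object into ToSpace unless that was already done. */
static object *copy_object(heap *h, object *o, size_t *next)
{
    if (o == NULL)
        return NULL;
    if (o->forward == NULL) {
        object *n = &h->to[(*next)++];
        *n = *o;
        o->forward = n;            /* leave a forwarding address behind */
    }
    return o->forward;
}

/* Copy everything reachable from the roots, then swap the two spaces.
 * Unreachable objects are never touched; the old half is just ignored. */
void heap_collect(heap *h, object **roots, size_t nroots)
{
    size_t next = 0, scan = 0, i;
    object *tmp;

    for (i = 0; i < nroots; i++)
        roots[i] = copy_object(h, roots[i], &next);
    while (scan < next) {          /* scan copied objects, copying children */
        object *o = &h->to[scan++];
        o->left = copy_object(h, o->left, &next);
        o->right = copy_object(h, o->right, &next);
    }
    tmp = h->from;
    h->from = h->to;
    h->to = tmp;
    h->used = next;
}
```

The cost being worried about is visible in the struct: two full-size spaces are reserved, and only one of them holds live data at any time.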

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Revised yet another goto/switch...etc.

2001-11-04 Thread Michael Fischer

And now, with the patch


Michael
-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 


diff -ur parrot/Configure.pl dispatcher-11-04/Configure.pl
--- parrot/Configure.pl Fri Nov  2 07:11:15 2001
+++ dispatcher-11-04/Configure.pl   Sun Nov  4 12:26:21 2001
@@ -91,6 +91,8 @@
numlow =>   '(~0xfff)',
strlow =>   '(~0xfff)',
pmclow =>   '(~0xfff)',
+
+do_op_t =>  'func',

platform => 'linux',
cp =>   'cp',
@@ -118,6 +120,16 @@
 prompt("How big would you like integers to be?", 'iv');
 prompt("And your floats?", 'nv');
 prompt("What is your native opcode type?", 'opcode_t');
+prompt("Opcode dispatch by switch or function ('switch' or 'goto' or 'func')",
+'do_op_t');
+
+if ($c{do_op_t} eq 'goto' and $c{cc} !~ /gcc/i ) {
+my $not_portable =  "
+'goto' opcode dispatch available only with gcc (for now).
+Please rerun and select either 'func' or 'switch'. Sorry\n";
+die $not_portable;
+}
+
 
 unless( $c{debugging} ) {
$c{ld_debug} = ' ';
diff -ur parrot/Makefile.in dispatcher-11-04/Makefile.in
--- parrot/Makefile.in  Fri Nov  2 07:11:15 2001
+++ dispatcher-11-04/Makefile.in	Sun Nov  4 12:26:15 2001
@@ -29,6 +29,7 @@
 PERL = ${perl}
 TEST_PROG = test_prog${exe}
 PDUMP = pdump${exe}
+DO_OP_T = ${do_op_t}
 
 .c$(O):
$(CC) $(CFLAGS) ${ld_out}$@ -c $<
@@ -104,13 +105,13 @@
 core_ops$(O): $(H_FILES) core_ops.c
 
 core_ops.c $(INC)/oplib/core_ops.h: core.ops ops2c.pl
-   $(PERL) ops2c.pl core.ops
+   $(PERL) ops2c.pl -t $(DO_OP_T) core.ops
 
 vtable.ops: make_vtable_ops.pl
$(PERL) make_vtable_ops.pl > vtable.ops
 
 vtable_ops.c $(INC)/oplib/vtable_ops.h: vtable.ops ops2c.pl
-   $(PERL) ops2c.pl vtable.ops
+   $(PERL) ops2c.pl -t $(DO_OP_T) vtable.ops
 
 $(INC)/config.h: Configure.pl config_h.in
$(PERL) Configure.pl
diff -ur parrot/interpreter.c dispatcher-11-04/interpreter.c
--- parrot/interpreter.c	Fri Oct 26 14:58:02 2001
+++ dispatcher-11-04/interpreter.c  Sun Nov  4 12:26:32 2001
@@ -11,7 +11,6 @@
  */
 
 #include "parrot/parrot.h"
-#include "parrot/interp_guts.h"
 #include "parrot/oplib/core_ops.h"
 #include "parrot/runops_cores.h"
 
diff -ur parrot/ops2c.pl dispatcher-11-04/ops2c.pl
--- parrot/ops2c.pl Wed Oct 17 20:21:03 2001
+++ dispatcher-11-04/ops2c.pl   Sun Nov  4 12:26:39 2001
@@ -1,15 +1,26 @@
 #! /usr/bin/perl -w
+
+#  vim: expandtab shiftwidth=4 ts=4:
 #
 # ops2c.pl
 #
-# Generate a C header and source file from the operation definitions in
+# Generate a C header and source file from the operation definitions in,
 # an .ops file.
 #
 
 use strict;
 use Parrot::OpsFile;
+use Getopt::Std;
+
+use vars qw($opt_t);
+getopts('t:');
 
+die "You didn't specify how you want DO_OP written!\n
+Use the -t ['func' | 'switch' | 'goto' ] flag, please\n" 
+unless $opt_t eq 'func' or $opt_t eq 'switch' or $opt_t eq 'goto';
 
+my $dispatch = $opt_t;
+  
 #
 # Process command-line argument:
 #
@@ -106,19 +117,49 @@
 #
 
 my @op_funcs;
+
+my %switch;
+my %goto;
+
 my $index = 0;
 
+my @switch_source_subs = (
+\&map_ret_abs_switch,
+\&map_ret_rel_switch,
+\&map_arg_switch,
+\&map_res_abs_switch,
+\&map_res_rel_switch
+);
+my @goto_source_subs = (
+\&map_ret_abs_goto,   
+\&map_ret_rel_goto,   
+\&map_arg_switch,
+\&map_res_abs_goto,
+\&map_res_rel_goto
+);
+my @func_source_subs = (
+\&map_ret_abs,
+\&map_ret_rel,
+\&map_arg,
+\&map_res_abs,
+\&map_res_rel);
+
 foreach my $op ($ops->ops) {
 my $func_name  = $op->func_name;
 my $arg_types  = "opcode_t *, struct Parrot_Interp *";
 my $prototype  = "opcode_t * $func_name ($arg_types)";
 my $args   = "opcode_t cur_opcode[], struct Parrot_Interp * interpreter";
 my $definition = "opcode_t *\n$func_name ($args)";
-my $source = $op->source(\&map_ret_abs, \&map_ret_rel, \&map_arg, \&map_res_abs, \&map_res_rel);
-
+my $source = $op->source(@func_source_subs);
+my $sw_source  = $op->source(@switch_source_subs);
+my $gt_source  = $op->source(@goto_source_subs);
 print HEADER "$prototype;\n";
 print SOURCE sprintf("  %-22s /* %6ld */\n", "$func_name,", $index++);
 
+my $idx = $op->{CODE};
+$switch{$idx} = "{\n$sw_source}\n";
+$goto{$func_name}   = "{\n$gt_source\ngoto *goto_map[*pc];\n}\n\n";
+
 push @op_funcs, "$definition {\n$source}\n\n";
 }
 
@@ -139,6 +180,82 @@
 
 print SOURCE @op_funcs;
 
+#
+# if we are working on core.ops, worry about switch/goto
+# if we are working on vtable, skip it...
+#
+
+if ( $file !~ /vtable/ ) {
+
+#my $do_op_header = "include/parrot/do_op.h";
+my $do_op_header = "include/parrot/do_op.h";
+
+open DO_OP_H, ">$do_op_header"
+or die "Could

RE: Regex helper opcodes

2001-11-04 Thread Brent Dax

Dan Sugalski:
# While I'm not going to dive too deep into regexes (I like what little
# sanity I have left, thanks :), here are a few opcodes I've

Oh, c'mon, they're not that bad.  It's basically just "if this works, do
the next thing, otherwise go back and do some stuff over".  "Do some
stuff over" is just popping a position off a stack and branching back to
the op that should be done over.  Even lookaheads aren't really that
bad--you just push the current position onto the RE stack and make sure
you return to it when the lookahead is finished.  (Unless I'm missing
something, which is certainly possible...)
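
That description can be sketched as a very small backtracking loop. The op set below (OP_CHAR, OP_SPLIT, OP_JUMP, OP_MATCH) is invented for the example and is not the op set from the regexp patch. It only shows the "push a position, pop it on failure, branch back" idea, with the match anchored at the start of the string.

```c
#include <stddef.h>

enum { OP_CHAR, OP_SPLIT, OP_JUMP, OP_MATCH };

typedef struct { int op; int arg; } re_op;     /* arg: a char or a target */
typedef struct { int pc; size_t pos; } saved;  /* one backtrack point */

#define MAX_SAVED 256

/* Returns 1 on a match at the start of s, 0 on no match, and -1 if the
 * backtrack stack overflows or an unknown op is seen. */
int re_run(const re_op *prog, const char *s)
{
    saved stack[MAX_SAVED];
    int top = 0, pc = 0;
    size_t pos = 0;

    for (;;) {
        const re_op *op = &prog[pc];
        int ok = 1;

        switch (op->op) {
        case OP_CHAR:                  /* match one literal character */
            if (s[pos] != '\0' && s[pos] == (char)op->arg) {
                pos++;
                pc++;
            } else {
                ok = 0;
            }
            break;
        case OP_SPLIT:                 /* remember the other branch */
            if (top == MAX_SAVED)
                return -1;
            stack[top].pc = op->arg;
            stack[top].pos = pos;
            top++;
            pc++;
            break;
        case OP_JUMP:
            pc = op->arg;
            break;
        case OP_MATCH:
            return 1;
        default:
            return -1;
        }
        if (!ok) {
            /* "do some stuff over": pop a saved position and branch back */
            if (top == 0)
                return 0;
            top--;
            pc = stack[top].pc;
            pos = stack[top].pos;
        }
    }
}
```

Under this encoding /fo*bar/ is eight ops: CHAR f, SPLIT 4, CHAR o, JUMP 1, CHAR b, CHAR a, CHAR r, MATCH. The SPLIT tries another 'o' first and keeps "skip to the b" on the stack as the fallback.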

# been thinking of
# for making REs faster:
#
# =begin proposed_opcodes
#
# =item makebitlist sx, sy
#
# Makes the string in X a bitmap, with one bit set in it for
# each character
# in Y. (So if Y was "AB" bits 65 and 66 would be set, assuming
# I remember my
# ASCII)
#
# =item ifin sx, iy, DEST
#
# If bit Y of bitlist X is set branch to DEST
#
# =item ifnotin sx, iy, DEST
#
# If bit Y of bitlist X is not set branch to DEST
#
# =end proposed_opcodes
#
# I think we already have ops to put the integer value of a
# single character
# (taken from an offset from the beginning of a string) into an integer
# register, but if we don't we should.

Have you looked at the regexp patch I posted last night?  It's pretty
much functional, including reOneof.  Still, these could be useful
internal functions... *ponder*

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I'm not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
--Dubya




Re: Yet another switch/goto implementation

2001-11-04 Thread Michael Fischer

On Nov 04, Daniel Grunblatt <[EMAIL PROTECTED]> took up a keyboard and banged out
> 
> 
> On Sun, 4 Nov 2001, Michael Fischer wrote:
> 
> > On Nov 04, Daniel Grunblatt <[EMAIL PROTECTED]> took up a keyboard and banged out

> > > I really suggest that you do a do_op.c and a do_op.h and that you call
> > > goto_op_dispatch directly from runops_core.c (from runops_t0p0b0_core),
> > > because if I'm not wrong you are breaking -t ,-p and -b options.
> >
> > Erm, I'm not sure how, as each of them does run in some form of
> > while(pc). Please enlighten me here.
> 
> Yes, but everyone calls DO_OP and expects it to run just one opcode, not
> enter a loop, right?

Um, that would be the case in the function and switch versions, where
the breaks cause us to fall out of the switch. 

In the goto case, we spin. And perhaps I am broken there. End
really wants to return, not just set the pc, but I hadn't thought
of a clever way to do that corner case, and wanted to see what
the behavior would be without it. I suspect I need it.
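
To illustrate the corner case, here is a stripped-down goto core with three toy ops (the opcode numbers and names are invented, and it relies on GCC's labels-as-values extension). Every op body ends by jumping straight to the next op's label, so there is no outer loop to fall out of. The end op is the one body that returns instead, which is the behaviour that seems to be missing.

```c
enum { OP_END, OP_INC, OP_ADD };   /* toy opcode numbers, not Parrot's */

/* Run code until OP_END and return the accumulator. */
long run_goto_core(const long *pc)
{
    static void *goto_map[] = { &&op_end, &&op_inc, &&op_add };
    long acc = 0;

    goto *goto_map[*pc];           /* dispatch the first op */

op_inc:
    acc += 1;
    pc += 1;
    goto *goto_map[*pc];           /* spin: straight on to the next op */

op_add:
    acc += pc[1];                  /* one inline operand */
    pc += 2;
    goto *goto_map[*pc];

op_end:
    return acc;                    /* the only way out of the core */
}
```

This also shows why a spinning core cannot stand in for a DO_OP that callers expect to run a single op: once entered, control only comes back when end is reached.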

Hmm, hand hacking that didn't help...

> > Why not have a look and see if you can't merge your goto system into
> > mine for getting it to be workable from Configure?
> >
> 
> Sure, yes, but let's first decide if we really want my goto system, because
> I don't know if the speedups are what I thought they were. Can anyone try
> it on Windows or other platforms/operating systems?

Indeed. Much experimenting to be done all around.


Michael
-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 



RE: [perl6]RE: Helping with configure

2001-11-04 Thread Brent Dax

Dan Sugalski:
# At 10:36 PM 11/3/2001 -0800, Brent Dax wrote:
# >Well, for now we're using Perl for Configure, but that won't be possible
# >in the final version.  Nasty bootstrapping issues with that.  :^)
#
# You'd be surprised... :)
#
# Seriously, miniparrot, enough to do simple file ops, spawn external
# programs and check their statuses, and implement all the 'internal' stuff
# (regexes and suchlike things) is pretty easily doable. Full and fancy
# Parrot's a bigger task, but I think you'll find that we can provide a
# static config.h/platform.c/platform.h set that'll get things built enough
# to run a full probing configure, even on the odder platforms (Win32, VMS,
# MVS, and WinCE).

I know.  My guess is that we'd "only" need file I/O, system(),
backticks, string and numeric operations, stat()ish things (to see if
this file is older than its dependencies), hashes, Data::Dumper, and a
kitchen sink.  Not to mention knowledge of how to correctly run various
programs on every platform we support.

# Don't forget that we *also* will have to provide a shell-specific base
# build script (DCL for VMS, batch for Windows, JCL (shudder) for MVS), so it
# can do platform-specific copying of files for the initial build.

I'm not looking forward to that either.  I do suggest that we have
Configure.pl5 and Configure.pl6 (both of which skip miniparrot) so
people can use them if they have the software for it, as well as
grabbing defaults and things from their Config.pm.

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I'm not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
--Dubya




RE: Yet another switch/goto implementation

2001-11-04 Thread Brent Dax

Michael Fischer:
# In the goto case, we spin. And perhaps I am broken there. End
# really wants to return, not just set the pc, but I hadn't thought
# of a clever way to do that corner case, and wanted to see what
# the behavior would be without it. I suspect I need it.

Can't you just break()?

--Brent Dax
[EMAIL PROTECTED]
Configure pumpking for Perl 6

When I take action, I'm not going to fire a $2 million missile at a $10
empty tent and hit a camel in the butt.
--Dubya




Re: Revised yet another goto/switch...etc.

2001-11-04 Thread Daniel Grunblatt

You mistyped yet again :) :

+prompt("Opcode dispatch by switch or function ('switch' or 'goto' or
'func')",
+'do_op_t');
+
+if ($c{do_opt_t} eq 'goto' and $c{cc} !~ /gcc/i ) {

do_op_t

+my $not_portable =  "
+'goto' opcode dispatch available only with gcc (for now).
+Please rerun and select either 'func' or 'switch'. Sorry\n";
+die $not_portable;
+}
+

But here's one idea: if I'm not wrong, the number of opcodes is going to
grow, so I think the big switch dispatch method is going to become
useless. My suggestion is to discard it right now.

Another idea is to write another test_c2.in which tests whether we can
compile using computed goto and, if we can (and again, if it's REALLY faster
on every OS/platform), use it as I proposed in my patch: that is, use the
func dispatch method when -b, -p or -t is given and computed goto in normal
situations, because after my workaround that is the best way.

On Sun, 4 Nov 2001, Michael Fischer wrote:

> And now, with the patch
>
>
> Michael
> --
> Michael Fischer 7.5 million years to run
> [EMAIL PROTECTED]printf "%d", 0x2a;
> -- deep thought
>

Daniel Grunblatt.





Re: A serious stab at regexes

2001-11-04 Thread Angel Faus

Brent Dax :

> Okay, this bunch of ops is a serious attempt at regular expressions.  I
> had a discussion with japhy on this in the Monastery
> (http://www.perlmonks.org/index.pl?node_id=122784), and I've come up
> with something flexible enough to actually (maybe) work.  Attached is a
> patch to modify core.ops and add re.h (defines data structures and such)
> and t/op/re.t (six tests).  All tests, including the new ones, pass.

Hi Brent,

Since your ops are much more complete and better documented than the ones I
sent, I was trying to adapt my previous regex compiler to your ops, but I
found what I think might be a limitation of your model.

It looks to me that compiling regexps down to the usual opcodes requires a
generic backtrack, instead of a $backtrack label for each case.

I have been unable to express nested groups or alternation with your model,
and I would say that this is because the engine needs some way to save not
only the index into the string, but also the point in the regex where it
can branch on a backtrack.

You solve this in your examples by having a "$backtrack" address for each
case, but this looks to me like a bad solution. In particular, I would say
that it cannot be applied to complex regular expressions.

In my previous experimental patch, there was a way to save the string index
_plus_ the "regex index". Written with your syntax, this would mean adding
a parameter to rePushindex that saves the "regex index".

Your example:

RE:
reFlags ""
reMinlength 4
$advance:
rePopindex
reAdvance $fail
$start:
rePushindex
reLiteral "f", $advance
$findo:
reLiteral "o", $findbar
rePushindex
branch $findo
$findbar:
reLiteral "bar", $backtrack
set I0, 1   #true
reFinished
$backtrack:
rePopindex $advance
branch $findbar <<< backtrack needs to know where to branch
$fail:
set I0, 0   #false
reFinished

Your example tweaked by me:

RE:
reFlags ""
reOnFail $fail
reMinlength 4
$start:
rePushindex $advance
reLiteral "f"
$findo:
rePushindex $findbar
reLiteral "o"
branch $findo
$findbar:
reLiteral "bar"
set I0, 1   #true
reFinished
$fail:
set I0, 0   #false
reFinished
$advance:
reAdvance
branch $start

So it is not the reLiteral, reAdvance, etc. ops that need to know where they
have to branch on failing; instead, when failing they always:

  -pop the last index on the stack and then branch to the last saved
destination.
  -or branch to the address previously set by the reOnFail op if there are no
pending indexes.

There is no $backtrack label, but the backtracking action is called each time
a submatch fails.
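
The failure rule just described can be sketched in C as follows. This is only an illustration under assumed names (re_state, re_push and re_fail are hypothetical, and regex addresses are modelled as plain ints); it is not code from either patch.

```c
#include <stddef.h>

#define BT_MAX 64

/* One backtrack point: a string position plus a "regex index". */
typedef struct {
    size_t index;   /* saved position in the string */
    int    resume;  /* address to continue at after a failure */
} bt_entry;

/* Hypothetical per-match state. */
typedef struct {
    bt_entry stack[BT_MAX];
    size_t   depth;
    size_t   index;    /* current position in the string */
    int      on_fail;  /* address set by reOnFail */
} re_state;

/* rePushindex DEST: save the current position and a resume address.
   Returns 0 on success, -1 if the stack is full. */
int re_push(re_state *re, int resume) {
    if (re->depth == BT_MAX)
        return -1;
    re->stack[re->depth].index = re->index;
    re->stack[re->depth].resume = resume;
    re->depth++;
    return 0;
}

/* Called by any op that fails to match. Restores the last saved
   position and returns the address to branch to, or the reOnFail
   address when nothing is pending. */
int re_fail(re_state *re) {
    if (re->depth == 0)
        return re->on_fail;
    re->depth--;
    re->index = re->stack[re->depth].index;
    return re->stack[re->depth].resume;
}
```

Because the resume address travels with the saved index, the matching ops themselves need no failure operand, which is the point of the tweaked example.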

I am not sure that this is the only solution, but it is the one that came to
my mind on seeing your proposal, and I find it quite elegant.

It is quite possible that nested groups and alternation can be implemented
with your model. If that is the case, could you please post an example so I
can understand?

What do you think about it?

-angel




Re: Rules for memory allocation and pointing

2001-11-04 Thread Benoit Cerrina


> There will be a mechanism to register PMCs with the interpreter to note
> they're pointed to by something that the interpreter can't reach. (For
> example, a structure in your extension code, or via a pointer stashed in
> the depths of a buffer object, or referenced by another interpreter) This
> "foreign access" registry is considered part of an interpreter's root set.
If this is the case, how do you plan to move the PMC? I thought you wanted a
copying collector.
Benoit




RE: [perl6]RE: Helping with configure

2001-11-04 Thread Dan Sugalski

At 11:12 AM 11/4/2001 -0800, Brent Dax wrote:
>Dan Sugalski:
># At 10:36 PM 11/3/2001 -0800, Brent Dax wrote:
># >Well, for now we're using Perl for Configure, but that won't be possible
># >in the final version.  Nasty bootstrapping issues with that.  :^)
>#
># You'd be surprised... :)
>#
># Seriously, miniparrot, enough to do simple file ops, spawn external
># programs and check their statuses, and implement all the
># 'internal' stuff
># (regexes and suchlike things) is pretty easily doable. Full and fancy
># Parrot's a bigger task, but I think you'll find that we can provide a
># static config.h/platform.c/platform.h set that'll get things
># built enough
># to run a full probing configure, even on the odder platforms
># (Win32, VMS,
># MVS, and WinCE).
>
>I know.  My guess is that we'd "only" need file I/O, system(),
>backticks, string and numeric operations, stat()ish things (to see if
>this file is older than its dependencies), hashes, Data::Dumper, and a
>kitchen sink.  Not to mention knowledge of how to correctly run various
>programs on every platform we support.

Luckily we can feel free to disappoint people. :)

open, close, write, read, system, and the things that don't require any 
off-program support. (We can reasonably assume we won't need any sort of 
platform-specific code to iterate through arrays we create ourselves...) 
Simon's already done it with perl 5.

We only need this to run configure.pl, at which point we can just go 
rebuild ourselves with full support.

># Don't forget that we *also* will have to provide a
># shell-specific base
># build script (DCL for VMS, batch for Windows, JCL (shudder)
># for MVS), so it
># can do platform-specific copying of files for the initial build.
>
>I'm not looking forward to that either.

We have it now, and that's not going to change. If we're careful we can put 
perl 6 in a much better position than perl 5 is in this regard.


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Michael Fischer

On Nov 04, Brent Dax <[EMAIL PROTECTED]> took up a keyboard and banged out
> Michael Fischer:
> # In the goto case, we spin. And perhaps I am broken there. End
> # really wants to return, not just set the pc, but I hadn't thought
> # of a clever way to do that corner case, and wanted to see what
> # the behavior would be without it. I suspect I need it.
> 
> Can't you just break()?

Out of a function?

In the goto case, I write a function containing the array of
&&label_foo, and do a lot of gotos inside a while(1) loop.
Neither a 'break' nor a 'return' in the end op seems to be helping.
The function is declared to return void, so a 'return' at the
bottom of the function doesn't matter, really (yes, I tried it).

all this followed by

#define DO_OP(pc,interpreter)  goto_op_dispatch((pc),(interpreter))

Sigh.

What we _really_ want anyway, IMHO, is a not-compiler-specific
way to write the gotos. I have not the expertise at this time,
as I discovered to my chagrin after several hours of experimentation
yesterday. Can't use something like '5' as the goto label. Damn.
Enums didn't help matters. As mjd says, ingenuity is always in
short supply. More eyes?
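
For readers following along, here is a small sketch of the shape being discussed: a dispatch function holding an array of &&label addresses, where the end op leaves the function with a return that sits directly under its own label. It relies on GCC's labels-as-values extension, and the three opcodes and the name run_ops are made up for illustration; they are not Parrot's.

```c
/* Requires GCC's "labels as values" extension (&&label, goto *ptr). */

/* Hypothetical opcodes; the order must match the dispatch table. */
enum { OP_END, OP_INC, OP_DEC };

/* Runs ops starting at pc until OP_END; returns the accumulator. */
long run_ops(const int *pc) {
    static void *const dispatch[] = { &&op_end, &&op_inc, &&op_dec };
    long acc = 0;

    goto *dispatch[*pc];

op_inc:
    acc++;
    pc++;
    goto *dispatch[*pc];

op_dec:
    acc--;
    pc++;
    goto *dispatch[*pc];

op_end:
    /* The end op does not dispatch again: it returns to the caller. */
    return acc;
}
```

Note that each op body dispatches the next op itself, so no enclosing while(1) is needed, and a return placed at the bottom of the function is only reached if some label jumps to it.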

Michael
-- 
Michael Fischer 7.5 million years to run
[EMAIL PROTECTED]printf "%d", 0x2a;
-- deep thought 



RE: Regex helper opcodes

2001-11-04 Thread Dan Sugalski

At 11:06 AM 11/4/2001 -0800, Brent Dax wrote:
>Dan Sugalski:
># While I'm not going to dive too deep into regexes (I like what little
># sanity I have left, thanks :), here are a few opcodes I've
>
>Oh, c'mon, they're not that bad.  It's basically just "if this works, do
>the next thing, otherwise go back and do some stuff over".  "Do some
>stuff over" is just popping a position off a stack and branching back to
>the op that should be done over.  Even lookaheads aren't really that
>bad--you just push the current position onto the RE stack and make sure
>you return to it when the lookahead is finished.  (Unless I'm missing
>something, which is certainly possible...)

I'm just scarred (for life) from digging into perl 5's RE engine, I think. ;)

>Have you looked at the regexp patch I posted last night?

Not really. That's partially for lack of time, and partially because I've 
no opinion to speak of on 'em, so I figure it's best to let the people 
who're interested go for it and see what they come up with.

>It's pretty
>much functional, including reOneof.  Still, these could be useful
>internal functions... *ponder*

I was thinking that the places they could come in really handy for were 
character classes. \w, \s, and \d are potentially a lot faster this way, 
'specially if you throw in Unicode support. (The sets get rather a bit 
larger...) It also may make some character-set independence easier.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Benoit Cerrina



> I think your approach is much better and cleaner than mine, my brain was
> limited to unix :) so I never worried about anything besides gcc.
> It would also be nice if you can decide which dispatch method to use instead
> of asking.
Hmm, I think you mean Linux, maybe BSD, but the other Unixes come with cc and
don't always have gcc installed.
Benoit




Re: Rules for memory allocation and pointing

2001-11-04 Thread Dan Sugalski

At 08:32 PM 11/4/2001 +0100, Benoit Cerrina wrote:

> > There will be a mechanism to register PMCs with the interpreter to note
> > they're pointed to by something that the interpreter can't reach. (For
> > example, a structure in your extension code, or via a pointer stashed in
> > the depths of a buffer object, or referenced by another interpreter) This
> > "foreign access" registry is considered part of an interpreter's root set.
>If this is the case, how do you want to move the PMC, I thought you wanted a
>copying collector?

While the PMC structures themselves don't move (no real need--they're of
fixed size so you can't fragment your allocation pool, though moving them
would make generational collection easier to some extent) the data pointed
to by the PMC can. That's the bit that moves.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Daniel Grunblatt

Did you take a look at my implementation? What's the point in using
computed goto when tracing, checking bounds or profiling?

Daniel Grunblatt.

On Sun, 4 Nov 2001, Michael Fischer wrote:

> On Nov 04, Brent Dax <[EMAIL PROTECTED]> took up a keyboard and banged out
> > Michael Fischer:
> > # In the goto case, we spin. And perhaps I am broken there. End
> > # really wants to return, not just set the pc, but I hadn't thought
> > # of a clever way to do that corner case, and wanted to see what
> > # the behavior would be without it. I suspect I need it.
> >
> > Can't you just break()?
>
> Out of a function?
>
> In the goto case, I write a function conatining the array of
> &&label_foo, and do a lot of gotos inside a while(1) loop.
> Neither a 'break' nor a 'return' in the end op seems to be helping.
> The function is declared to return void, so a 'return' at the
> bottom of the function doesn't matter, really (yes, I tried it).
>
> all this followed by
>
> #define DO_OP(pc,interpreter)  goto_op_dispatch((pc),(interpreter))
>
> Sigh.
>
> What we _really_ want anyway, IMHO, is a not-compiler-specific
> way to write the gotos. I have not the expertise at this time,
> as I discovered to my chagrin after several hours of experimentation
> yesterday. Cant use something like '5' as the goto label. Damn.
> Enums didn't help matters. As mjd says, ingenuity is always in
> short supply. More eyes?
>
> Michael
> --
> Michael Fischer 7.5 million years to run
> [EMAIL PROTECTED]printf "%d", 0x2a;
> -- deep thought
>




Re: Yet another switch/goto implementation

2001-11-04 Thread Dan Sugalski

At 02:33 PM 11/4/2001 -0300, Daniel Grunblatt wrote:
>Did you put an eye on my implementation? what's the point in using
>computed goto when tracing, checking bounds or profiling?

There's not a huge amount of win over a switch, but there is a benefit over 
the function dispatch method.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Daniel Grunblatt

So, on those other unixes that come with cc we can't use computed goto?

Daniel Grunblatt.

On Sun, 4 Nov 2001, Benoit Cerrina wrote:

>
>
> > I think your approuch is much better and cleaner than mine, my brain was
> > limited to unix :) so I never worried about anything besides gcc.
> > It would also be nice if you can decide which dispatch method use instead
> > of asking.
> Hum, I think you mean linux, maybe BSD, but the other unixes come with cc
> and
> don't always have gcc installed.
> Benoit
>
>




Re: [perl6]RE: Helping with configure

2001-11-04 Thread Andy Dougherty

On Sat, 3 Nov 2001, Zach Lipton wrote:

> parrot. The idea is to have the Configure.pl script itself run .cm files
> located in Config/, these .cm files (configuremodule) would do the actual
> work of configuration. The Conf.pm module would contain a set of API's for
> the .cm files to call. (this is all portable of course...) An example .cm
> file would be like:

> So anyway, what do you think about this? Again, I have real code which I can
> supply, but am I way offbase here? To me, the main advantage is getting
> everything out of a "monster-script" and breaking it up into packages.

Yes, this is how metaconfig works to generate the monster Configure script
used in Configuring perl.  I happen to think that's a pretty good plan:-).  
(One superficial difference is that all the little .cm files (called .U
units in metaconfig) get smashed together into a big Configure script in
perl5, but that's just a cosmetic issue at the end of the assembly
process.  There's no compelling reason for it.)

If I were designing units for Parrot's Configure, I'd probably set up the
different sections of a metaconfig unit as POD sections:

=for config.h   # Stuff to go in config.h
=for Configure  # Stuff to get run as part of Configure
=for make       # Dependency info so metaconfig knows how to
                # assemble all the different parts
=for metalint   # For running metalint to check the units.

There are probably others too that I forget offhand, but you get the idea.

Also, in the part that goes into Configure, I'd include pre- and post-
hooks (a better version of the "call-back units" that are in Perl5's
Configure.)  Those hooks would normally be empty, but could be overridden
by hints files.

-- 
Andy Dougherty  [EMAIL PROTECTED]
Dept. of Physics
Lafayette College, Easton PA 18042




Re: Yet another switch/goto implementation

2001-11-04 Thread Dan Sugalski

At 02:37 PM 11/4/2001 -0300, Daniel Grunblatt wrote:
>So, on those other unixes that come with cc we can't use computed goto?

Computed goto is, at the moment, a GCC-specific feature. It's not OS 
specific, just compiler-specific.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Rules for memory allocation and pointing

2001-11-04 Thread Benoit Cerrina


> While the PMC structures themselves don't move (no real need--there of
> fixed size so you can't fragment your allocation pool, though it makes
Sorry, can you expand on this?  I don't see the relation between the data
being fixed size and the memory not becoming fragmented.
> generational collection easier to some extent) the data pointed to by the
> PMC can. That's the bit that moves.
>
> Dan
>
Well, one of the reasons for moving the data is so you can use a trivial
stack-like allocator.  If you don't move the PMC you won't be able to.
Benoit




Re: Yet another switch/goto implementation

2001-11-04 Thread Daniel Grunblatt

Sure, I already knew that; maybe I'm just having a hard time making myself
clear.

What I meant was:

On those unixes with cc (NOT GCC) that Benoit Cerrina pointed out, can we
use computed goto?

or in other words:

Is there any other compiler besides gcc that implements computed goto?

Daniel Grunblatt.

On Sun, 4 Nov 2001, Dan Sugalski wrote:

> At 02:37 PM 11/4/2001 -0300, Daniel Grunblatt wrote:
> >So, on those other unixes that come with cc we can't use computed goto?
>
> Computed goto is, at the moment, a GCC-specific feature. It's not OS
> specific, just compiler-specific.
>
>   Dan
>
> --"it's like this"---
> Dan Sugalski  even samurai
> [EMAIL PROTECTED] have teddy bears and even
>   teddy bears get drunk
>
>





Re: Rules for memory allocation and pointing

2001-11-04 Thread Dan Sugalski

At 09:36 PM 11/4/2001 +0100, Benoit Cerrina wrote:

> > While the PMC structures themselves don't move (no real need--there of
> > fixed size so you can't fragment your allocation pool, though it makes
>Sorry can you expand on this.  I don't see the relation between the data
>being fixed size and the memory not becomming fragmented.
> > generational collection easier to some extent) the data pointed to by the
> > PMC can. That's the bit that moves.
> >
> > Dan
> >
>Well one of the reasons of moving the data is so you can use a trivial stack
>like
>allocator.  If you don't move the PMC you won't be able to.

Sure you will. Allocation and deallocation of fixed-sized structures is 
very simple--in some ways as simple as using a stack allocator, as you're 
just putting things on the head of a list or taking them off it. It's 
allocation and deallocation of variable-sized structures that's tricky, and 
tends to fragment your heap all to heck.
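
A rough sketch of the point about fixed-size structures, assuming a small static pool. The names pool, slot, pool_alloc and pool_free are hypothetical and the 32-byte payload is only a stand-in for a PMC header; nothing here reflects Parrot's real allocator.

```c
#include <stddef.h>

#define POOL_SLOTS 4

/* A free slot holds the free-list link; a live slot holds the payload. */
typedef union slot {
    union slot *next_free;
    char payload[32];   /* stand-in for a fixed-size PMC header */
} slot;

typedef struct {
    slot  slots[POOL_SLOTS];
    slot *free_head;
} pool;

/* Thread every slot onto the free list. */
void pool_init(pool *p) {
    p->free_head = NULL;
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        p->slots[i].next_free = p->free_head;
        p->free_head = &p->slots[i];
    }
}

/* Allocation: take the slot at the head of the list. */
void *pool_alloc(pool *p) {
    slot *s = p->free_head;
    if (s == NULL)
        return NULL;    /* pool exhausted */
    p->free_head = s->next_free;
    return s;
}

/* Deallocation: put the slot back on the head of the list. */
void pool_free(pool *p, void *ptr) {
    slot *s = ptr;
    s->next_free = p->free_head;
    p->free_head = s;
}
```

Every slot is the same size, so any freed slot can satisfy any later request and the pool cannot fragment, whatever the order of allocations and frees.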


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Dan Sugalski

At 02:45 PM 11/4/2001 -0300, Daniel Grunblatt wrote:
>Sure, I alredy knew that, may be I'm just having a hard time to make my
>self clear.
>
>What I mean was:
>
>On those unixes, with cc (NOT GCC), that Benoit Cerrina pointed, Can we
>use computed goto?

No. And Unix generally doesn't enter into it at all. GCC on Win32 or VMS
supports computed gotos. (I'm not sure whether GCC in C++ mode supports it,
though.)

>or in other words:
>
>Is there any other compiler besides gcc that implements computed goto?

At the moment, no. Assume at some point there will be, though, which means 
we need a feature probe in configure.pl for general goto-label functionality.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: Yet another switch/goto implementation

2001-11-04 Thread Daniel Grunblatt

Yes, and thanks to Michael Fischer I'm already working on that, as I
described in a previous mail. I hope to post it in a few hours.

Daniel Grunblatt.

On Sun, 4 Nov 2001, Dan Sugalski wrote:

> At 02:45 PM 11/4/2001 -0300, Daniel Grunblatt wrote:
> >Sure, I alredy knew that, may be I'm just having a hard time to make my
> >self clear.
> >
> >What I mean was:
> >
> >On those unixes, with cc (NOT GCC), that Benoit Cerrina pointed, Can we
> >use computed goto?
>
> No. And Unix generally doesn't enter into it at all. GCC on Win32 or VMS
> support computed gotos. (I'm not sure if GCC in C++ mode supports it either)
>
> >or in other words:
> >
> >Is there any other compiler besides gcc that implements computed goto?
>
> At the moment, no. Assume at some point there will be, though, which means
> we need a feature probe in configure.pl for general goto-label functionality.
>
>   Dan
>
> --"it's like this"---
> Dan Sugalski  even samurai
> [EMAIL PROTECTED] have teddy bears and even
>   teddy bears get drunk
>
>




RE: A serious stab at regexes

2001-11-04 Thread Brent Dax

Angel Faus:
# Since your ops are much more complete and better documented than the ones
# I sent, I was trying to adapt my previous regex compiler to your ops, but
# I found what I think might be a limitation of your model.
#
# It looks to me that compiling regexps down to the usual opcodes requires
# a generic backtrack, instead of a $backtrack label for each case.
#
# I have been unable to express nested groups or alternation with your
# model, and I would say that this is because the engine needs some way to
# save not only the index into the string, but also the point in the regex
# where it can branch on a backtrack.

I've been a bit worried that this might be the case.  The best solution
I've been able to think of is to push a "mark" onto the stack, like what
Perl 5 does with its call stack to indicate the end of the current
function's arguments.  If a call to rePopindex popped a mark, it would
be considered to have failed, so it would branch to $1 if it had a
parameter.
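
A minimal sketch of the mark idea, assuming the RE stack holds plain string indexes and that one reserved value can serve as the mark. All names (re_stack, RE_MARK, re_push_mark, re_pop_index) are hypothetical, not taken from re.h.

```c
#include <stddef.h>

#define RE_STACK_MAX 64
#define RE_MARK ((size_t)-1)   /* sentinel, assumed never a valid index */

typedef struct {
    size_t entries[RE_STACK_MAX];
    size_t depth;
} re_stack;

/* rePushindex: save a string position. Returns 0, or -1 on error. */
int re_push_index(re_stack *s, size_t index) {
    if (s->depth == RE_STACK_MAX || index == RE_MARK)
        return -1;
    s->entries[s->depth++] = index;
    return 0;
}

/* rePushmark: push the sentinel that bounds the current group. */
int re_push_mark(re_stack *s) {
    if (s->depth == RE_STACK_MAX)
        return -1;
    s->entries[s->depth++] = RE_MARK;
    return 0;
}

/* rePopindex: returns 1 and stores the popped index on success.
   Returns 0 when a mark was popped or the stack is empty, which is
   the "failed" case where the op would branch to its $1 operand. */
int re_pop_index(re_stack *s, size_t *index) {
    if (s->depth == 0)
        return 0;
    size_t top = s->entries[--s->depth];
    if (top == RE_MARK)
        return 0;
    *index = top;
    return 1;
}
```

Popping consumes the mark, so after one failed pop the entries pushed before the mark become reachable again.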

# You solve this in your examples by having a "$backtrack" address for each
# case, but this looks to me like a bad solution. In particular, I would say
# that it cannot be applied to complex regular expressions.
#
# In my previous experimental patch, there was a way to save the string
# index _plus_ the "regex index". Written with your syntax, this would mean
# adding a parameter to rePushindex that saves the "regex index".
#
# Your example:
#
# RE:
# reFlags ""
# reMinlength 4
# $advance:
# rePopindex
# reAdvance $fail
# $start:
# rePushindex
# reLiteral "f", $advance
# $findo:
# literal "o", $findbar
# rePushindex
# branch $findo
# $findbar:
# reLiteral "bar", $backtrack
# set I0, 1 #true
# reFinished
# $backtrack:
# rePopindex $advance
# branch $findbar <<< backtrack needs to know where to branch
# $fail:
# set I0, 0 #false
# reFinished
#
# Your example tweaked by me:
#
# RE:
# reFlags ""
# reOnFail $fail
# reMinlength 4
# $start:
# rePushindex $advance
# reLiteral "f"
# $findo:
# rePushindex $findbar
# literal "o"
# branch $findo
# $findbar:
# reLiteral "bar"
# set I0, 1 #true
# reFinished
# $fail:
# set I0, 0 #false
# reFinished
# $advance:
# reAdvance
# branch $start
#
# So it is not the reLiteral, reAdvance, etc. ops that need to know where
# they have to branch on failing; instead, when failing they always:
#
#   -pop the last index on the stack and then branch to the last saved
# destination.
#   -or branch to the address previously set by the reOnFail op if there
# are no pending indexes.
#
# There is no $backtrack label, but the backtracking action is called each
# time a submatch fails.
#
# I am not sure that this is the only solution, but it is the one that came
# to my mind on seeing your proposal, and I find it quite elegant.

Actually, on further examination that model does appear quite elegant.
It also has its problems:

/a(?:foo)?b/

With your model:

RE:
reOnFail $fail
reFlags ""
reMinlength 2
$start:
rePushindex $advance
reLiteral "a"
rePushindex $continue
reLiteral "foo"
$continue:
reLiteral "b"   #and what if this fails?
set I0, 1
reFinished
$advance:
reAdvance
branch $start
$fail:
set I0, 0
reFinished


Mine:

RE:
reFlags ""
reMinlength 2
$start:
rePushindex
reLiteral "a", $advance
reLiteral "foo", $continue  #i may implement zero-argument versions of this
$continue:
reLiteral "b", $advance
set I0, 1
reFinished
$advance:
rePopindex
reAdvance
branch $start
$fail:
set I0, 0
reFinished

Hmm, I expected to see it be much shorter.  Perhaps your idea has even
more merit than I thought...

# It is quite possible that nested groups and alternation can be implemented
# with your model. If that is the case, could you please post an example so
# I can understand?
#
# What do you think about it?

I think the mark solution may be more flexible:

RE:
reFlags ""
reMinlength 4
$advance:
rePopindex
reAdvance $fail
$start:
rePushindex
reLiteral "f", $advance
rePushmark
$findo:
reLiteral "o", $findbar
rePushindex
branch $findo
$findbar:
reLiteral "bar", $backtrack
set I0, 1  #true
reFinished
$backtrack:
rePopindex $advance
branch $findbar
$fail:
set I0, 0  #false
reFinished

However, this may not be a good example, as I'm seriously looking at the
possibility of making reAdvance independent of the stack
(cur_re->startindex or something) to ease implementation of reSubst
(substitution) and related nonsense.  H

Re: Rules for memory allocation and pointing

2001-11-04 Thread Michael L Maraist

On Sunday 04 November 2001 02:39 pm, Dan Sugalski wrote:
> At 08:32 PM 11/4/2001 +0100, Benoit Cerrina wrote:
> > > There will be a mechanism to register PMCs with the interpreter to note
> > > they're pointed to by something that the interpreter can't reach. (For
> > > example, a structure in your extension code, or via a pointer stashed
> > > in the depths of a buffer object, or referenced by another interpreter)
> > > This "foreign access" registry is considered part of an interpreter's
> > > root set.
> >
> >If this is the case, how do you want to move the PMC, I thought you wanted
> > a copying collector?
>
> While the PMC structures themselves don't move (no real need--there of
> fixed size so you can't fragment your allocation pool, though it makes
> generational collection easier to some extent) the data pointed to by the
> PMC can. That's the bit that moves.
>

Ok, so far, here's what I see:

There are two main memory segments.  The first is a traditional alloc/free
region.  Within this area are arenas (for fixed-size memory objects).  This
region must be efficient under MT, and hopefully not too wasteful or
fragmenting.  This region is mostly for core operation, and the non-arena
allocations are to be minimized; memory leaks are critical as usual. The
utilization patterns should be analyzable, well known and compensated for
(with respect to fragmentation, thread-contention, etc).  My vmem-derivative
might still be valuable here, but I suppose we need to let the core's
needs/characteristics flesh out further.

Then there's a GC region which is primarily the allocation space of the
interpreted app, which obviously can't be trusted with respect to memory
leaks or usage (fragmentation) patterns.  PMCs and strings are handles into
this GC region, though the handles are to be stored in core memory (above).

Question: Are there other types of references?  I can't think of any.

The GC algorithm (being a separate thread or called incrementally when memory
is low (but not yet starved)) needs to quickly access this GC heap's values,
which I believe is Dan's reason for requiring a maximum of two levels of
indirection.  I suppose it just runs down the known PMC lists including the
PMC and string register sets, the stacks and the stashes for each
interpreter.  You do, however, refer to a "foreign access" region, and my
first impression is that it's the wrong way of going about it.

First of all, how are arrays of arrays of arrays handled?

// create a buffer object of PMCs
PMC_t p1 = new Array(50)

for i in 0 .. 49:
  p1->data[ i ] = new Array(50)

In order to comply with the max-depth, the array creation will have to 
register each sub-array-entry in the foreign access region, or am I missing 
something?

First of all, this will mean that the foreign access data-structure will grow
VERY large when PMC arrays/hashes are prevalent.  What's worse, this
data-structure is stored within the core, which means that there is
additional burden on the core memory fragmentation / contention.

Additionally, what happens when an array is shared by two threads (and thus
two interpreters)?  Whose foreign access region is it stored in?  My guess is
that to avoid premature freeing, BOTH.  So now a work-q used by a 30-thread
producer/consumer app is allocating and freeing LOTS of core memory on each
enqueue / dispatch.  Again, with the details this fuzzy, I'm probably going
off half-cocked; but I did qualify that this was my initial impression.

My suggestion is to not use a foreign references section; or if we do, not 
to utilize it for deep data-structure nesting, and instead to incorporate a 
doubly linked list within PMCs and strings.  Thus whenever you allocate a PMC 
or string, you attach it to the chain of allocated handles.  Whenever the PMC 
is freed, you detach it.  The GC then has the laughably simple task of 
navigating this linked list, which spans all threads.  This can incorporate 
mark-and-sweep or copying or whatever.  By adding 8 or 16 bytes to the size 
of a PMC / string, you avoid many memory-related problems.  Not to mention 
the fact that we are free of the concern of depth away from the 
interpreter-root.
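
A minimal sketch of the chain of allocated handles described above, assuming
a handle header with two link pointers (which is where the extra 8 or 16
bytes would come from on 32-bit or 64-bit machines).  The names Handle,
HandleChain, chain_attach, chain_detach and chain_sweep are hypothetical and
are not taken from the Parrot source; locking for the multi-threaded case is
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle header; not Parrot's actual PMC layout. */
typedef struct Handle {
    struct Handle *prev;   /* previous allocated handle, or NULL */
    struct Handle *next;   /* next allocated handle, or NULL */
    int marked;            /* set by a mark phase when the handle is reachable */
} Handle;

/* One chain for every live handle.  A threaded version would need a lock. */
typedef struct HandleChain {
    Handle *head;
    size_t count;
} HandleChain;

/* Attach a freshly allocated handle to the front of the chain: O(1). */
static void chain_attach(HandleChain *chain, Handle *h) {
    assert(chain != NULL && h != NULL);
    h->prev = NULL;
    h->next = chain->head;
    h->marked = 0;
    if (chain->head != NULL)
        chain->head->prev = h;
    chain->head = h;
    chain->count++;
}

/* Detach a handle that is being freed: O(1), no search needed. */
static void chain_detach(HandleChain *chain, Handle *h) {
    assert(chain != NULL && h != NULL);
    if (h->prev != NULL)
        h->prev->next = h->next;
    else
        chain->head = h->next;
    if (h->next != NULL)
        h->next->prev = h->prev;
    h->prev = h->next = NULL;
    chain->count--;
}

/* Sweep: walk the chain once, detach unmarked handles and clear the mark
   on the survivors.  Returns the number of handles detached. */
static size_t chain_sweep(HandleChain *chain) {
    size_t freed = 0;
    Handle *h = chain->head;
    while (h != NULL) {
        Handle *next = h->next;   /* save the link before detaching */
        if (!h->marked) {
            chain_detach(chain, h);
            freed++;
        } else {
            h->marked = 0;
        }
        h = next;
    }
    return freed;
}
```

The sweep never needs to know how deeply a handle is nested inside arrays of
arrays; it only needs the mark bit, which is the point of the suggestion.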

Beyond this, I think I see some problems with not having PMCs relocatable.  
While compacting the object-stores that are readily resized can be very 
valuable, the only type of memory leak this avoids is fragmentation-related.  
The PMCs themselves still need to be tested against memory leaks.  Now I'm 
still in favor of some form of reference counting; I think that in the most 
common case, only one data-structure will reference a PMC and thus when it 
goes away, it should immediately cause the deallocation of the associated 
object-space (sacrificing a pittance of run-time CPU so that the GC and free 
memory are relaxed).  But I hear that we're not relying on an integer for 
reference counting (as with perl5), and instead are mostly dependent on the 
GC.   Well, if we use a copying GC, but nev

Re: vmem memory manager

2001-11-04 Thread James Mastros

On Sun, Nov 04, 2001 at 01:47:44PM -0500, Dan Sugalski wrote:
> I've not made any promises as to what type of GC system we'll use. I'm 
> gearing things towards a copying collector, but I'm also trying to make 
> sure we don't lock ourselves out of a generational scheme.
I'd really like to hear that you were planning on not locking us out of
/any/ scheme.  I'd like to see a lot of pluggability here, so we can get
custom solutions for those needing multiprocessor, huge memory optimized
schemes, and with tiny machines with poor processors, or on a handheld with
tiny memory.  Hell, even segmented memory, if they're really brave.

> I know things are a little fuzzy in the GC arena, but that's on purpose for 
> the moment.
Hell.  I've got very, very little knowledge about gc.  But I'd love to see
the GC pluggable to the point where different modules can have different
GCs... but I don't think it's reasonably possible.

Without doubt, there should be a way for parrot code to modify the
properties of the GC, like the frequency of calling, and to specify "run the
GC now".

   -=- James Mastros



Re: Win32 build and WINVER

2001-11-04 Thread James Mastros

On Sun, Nov 04, 2001 at 01:38:58PM -0500, Dan Sugalski wrote:
> Currently, I don't want to promise back before Win98, though if Win95 is no 
> different from a programming standpoint (I have no idea if it is) then 
> that's fine too. Win 3.1 and DOS are *not* target platforms, though if 
> someone gets it going I'm fine with it.
I'd tend to say that we should support back to win95 (original, not sp2).
AFAIK, there's nothing that changed that should affect core perl/parrot.
The one big exception is Unicode support; NT-based systems have much better
Unicode.  Specifically, you can output unicode to the console.  However, only
targeting NT machines is absolutely not-an-option, for obvious reasons.

It might be that we end up with an NT binary with support for printing
Unicode to the console, and a generic binary without.  (Come to think of it,
the only thing that should care is the opcode library that implements
print(s|sc).)  There are a lot of other differences, of course, but for
everything else the win95 versions should be sufficient.  (For example, if we want
to set security properties on open, we need to use APIs that won't work on
95,98, or Me.  But so long as we don't care, the security descriptor
parameter can be NULL, and it will work fine on both.)

I should note, BTW, that I don't write windows programs when I can manage
not to, and I don't run NT.

  -=- James Mastros



Re: [PATCH] Computed goto, super-fast dispatching.

2001-11-04 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
  Daniel Grunblatt <[EMAIL PROTECTED]> wrote:

> Yeap, I was right, using gcc 3.0.2 you can see the difference:

I've just tried it with 3.0.1 and see much the same results as I did
with 2.96 I'm afraid. I don't have 3.0.2 to hand without building it
from source so I haven't tried that as yet.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/




Re: [PATCH] Computed goto, super-fast dispatching.

2001-11-04 Thread Daniel Grunblatt

Do you want me to give you an account on my linux machine, where I have
installed gcc 3.0.2, so that you can see it?

Daniel Grunblatt.

On Mon, 5 Nov 2001, Tom Hughes wrote:

> In message <[EMAIL PROTECTED]>
>   Daniel Grunblatt <[EMAIL PROTECTED]> wrote:
>
> > Yeap, I was right, using gcc 3.0.2 you can see the difference:
>
> I've just tried it with 3.0.1 and see much the same results as I did
> with 2.96 I'm afraid. I don't have 3.0.2 to hand without building it
> from source so I haven't tried that as yet.
>
> Tom
>
> --
> Tom Hughes ([EMAIL PROTECTED])
> http://www.compton.nu/
>
>




Re: Rules for memory allocation and pointing

2001-11-04 Thread Michael L Maraist

On Sunday 04 November 2001 03:36 pm, Benoit Cerrina wrote:
> > While the PMC structures themselves don't move (no real need--they're of
> > fixed size so you can't fragment your allocation pool, though it makes
>
> Sorry can you expand on this.  I don't see the relation between the data
> being fixed size and the memory not becoming fragmented.

Fixed sized memory objects can be "records" of an array.  Thus you can use an 
"arena" scheme (also known as a slab, or indexed chunks).  You store a linked 
list of free'd elements as with most memory allocators, but:
a) You're guaranteed a "best fit"; every request is for exactly the same 
size, so any free element fits.
b) When you free, you don't have to consolidate, since subsequent memory 
requests will also need this exact same size.
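
A minimal sketch of such an arena, assuming a fixed number of fixed-size
slots and a free list threaded through the unused slots themselves.  The
names Slot, Arena, arena_init, arena_alloc and arena_free, the 32-byte
payload and the slot count are all made up for illustration; a real
allocator would chain further arenas when one fills up.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SLOTS 4   /* tiny on purpose, for illustration */

/* A free slot stores the link to the next free slot in its own storage. */
typedef union Slot {
    union Slot *next_free;
    char payload[32];    /* stand-in for a fixed-size record */
} Slot;

typedef struct Arena {
    Slot slots[ARENA_SLOTS];
    Slot *free_list;     /* head of the list of unused slots */
} Arena;

/* Thread every slot onto the free list, lowest index first. */
static void arena_init(Arena *a) {
    size_t i;
    assert(a != NULL);
    a->free_list = NULL;
    for (i = ARENA_SLOTS; i > 0; i--) {
        a->slots[i - 1].next_free = a->free_list;
        a->free_list = &a->slots[i - 1];
    }
}

/* Pop the head of the free list; returns NULL when the arena is full. */
static void *arena_alloc(Arena *a) {
    Slot *s = a->free_list;
    if (s != NULL)
        a->free_list = s->next_free;
    return s;
}

/* Push the slot back on the free list; no consolidation is needed because
   every later request is for the same size. */
static void arena_free(Arena *a, void *p) {
    Slot *s = (Slot *)p;
    assert(a != NULL && p != NULL);
    s->next_free = a->free_list;
    a->free_list = s;
}
```

Both allocation and release are a single pointer swap at the head of the
list, which is why points a) and b) hold.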

Fragmentation happens when you mix big and small allocations while the 
life-times of the objects are random.  Say you alloc something big, then 
another big thing, then free the first, then alloc something small (which 
splits the original big chunk into a small chunk and a remainder).  Now you 
can't allocate another big chunk without growing the heap, even though 99% of 
what you need is available; the fragmentation makes this region useless.  And 
this is with just two memory sizes.  Real memory systems see an effectively 
unbounded number of request sizes.  Theoretically you reuse the medium-sized 
chunk, but then you develop non-optimal fitting strategies which are O(n) (and 
you regularly reach millions of objects).  In general, good algorithms don't 
allow arbitrarily sized allocations, and do at least SOME rounding up to 
minimize fragmentation.  Most additionally use some sort of arena, as 
above.
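
One common way to do the rounding up mentioned above is to map every request
onto a power-of-two size class.  This is only a sketch of that general
technique, not something proposed in this thread; the function name
size_class and the 16-byte minimum are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_CLASS 16u   /* smallest block handed out; an arbitrary choice */

/* Round a request up to the next power-of-two size class.
   Returns 0 if the request is too large to round without overflow. */
static size_t size_class(size_t n) {
    size_t c = MIN_CLASS;
    while (c < n) {
        if (c > SIZE_MAX / 2)
            return 0;        /* doubling would overflow */
        c *= 2;
    }
    assert(c >= n);          /* the class always covers the request */
    return c;
}
```

With only a handful of classes, a freed block can always be reused by a later
request of the same class.  The price is internal waste: above the minimum
class, a rounded block is less than twice the size of the request.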

>
> > generational collection easier to some extent) the data pointed to by the
> > PMC can. That's the bit that moves.
> >
> > Dan
>
> Well one of the reasons of moving the data is so you can use a trivial
> stack-like allocator.  If you don't move the PMC you won't be able to.
> Benoit

Yeah, I'm writing something up on that.

-Michael




Re: vmem memory manager

2001-11-04 Thread Benoit Cerrina


- Original Message -
From: "James Mastros" <[EMAIL PROTECTED]>
To: "Dan Sugalski" <[EMAIL PROTECTED]>
Cc: "Benoit Cerrina" <[EMAIL PROTECTED]>; "Uri Guttman"
<[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>;
<[EMAIL PROTECTED]>
Sent: Monday, November 05, 2001 12:03 AM
Subject: Re: vmem memory manager


> On Sun, Nov 04, 2001 at 01:47:44PM -0500, Dan Sugalski wrote:
> > I've not made any promises as to what type of GC system we'll use. I'm
> > gearing things towards a copying collector, but I'm also trying to make
> > sure we don't lock ourselves out of a generational scheme.
> I'd really like to hear that you were planning on not locking us out of
> /any/ scheme.  I'd like to see a lot of pluggability here, so we can get
> custom solutions for those needing multiprocessor, huge memory optimized
> schemes, and with tiny machines with poor processors, or on a handheld
> with tiny memory.  Hell, even segmented memory, if they're really brave.
>
> > I know things are a little fuzzy in the GC arena, but that's on purpose
> > for the moment.
> Hell.  I've got very, very little knowledge about gc.  But I'd love to see
> the GC pluggable to the point where different modules can have different
> GCs... but I don't think it's reasonably possible.
This is very important since in the GC world one size does not fit all.  You
can have a very efficient algorithm which will 'stop the world' and
collect everything, but it may appear as if your program stops from time to
time.  This is bad for all interactive or real-time apps but good for
batches.  You may also have an 'incremental' or 'real-time' collector, which
only stops the mutator (the program) for a small and/or bounded time; those
are better for the above apps but they lose in efficiency (they take more
space and time).  You also have concurrent collectors, which are ideal for
multiprocessors but carry a penalty which is unacceptable for single
processor systems.  I think that the only good solution is one where one may
choose the collector at the time the app is launched.  This is also what
Java does: with Sun's JDK (and probably with others), launch-time command
line args let you choose between an efficient stop-the-world-and-collect
algorithm, an incremental algorithm (the train algorithm,
http://www.daimi.au.dk/~beta/Papers/Train/train.html) or a concurrent
algorithm (I don't know which one is used).
Benoit
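
A rough sketch of choosing a collector by name at launch time, assuming a
table of function pointers.  Everything here (GcState, GcPlugin, gc_select,
the collector names and the "units of pending work" model) is hypothetical
and only illustrates the difference between an unbounded stop-the-world pause
and a bounded incremental step; it does not describe Parrot or any JDK.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: outstanding collection work measured in abstract units. */
typedef struct GcState {
    size_t pending;
} GcState;

/* A collector step does some work and returns how many units it did. */
typedef size_t (*gc_step_fn)(GcState *gc, size_t budget);

/* Stop-the-world: ignores the budget and finishes everything in one pause. */
static size_t gc_stop_the_world(GcState *gc, size_t budget) {
    size_t done = gc->pending;
    (void)budget;
    gc->pending = 0;
    return done;
}

/* Incremental: does at most `budget` units, so each pause is bounded. */
static size_t gc_incremental(GcState *gc, size_t budget) {
    size_t done = gc->pending < budget ? gc->pending : budget;
    gc->pending -= done;
    return done;
}

typedef struct GcPlugin {
    const char *name;
    gc_step_fn step;
} GcPlugin;

static const GcPlugin gc_plugins[] = {
    { "stop-the-world", gc_stop_the_world },
    { "incremental",    gc_incremental },
};

/* Look up a collector by the name given at launch; NULL if unknown. */
static const GcPlugin *gc_select(const char *name) {
    size_t i;
    assert(name != NULL);
    for (i = 0; i < sizeof gc_plugins / sizeof gc_plugins[0]; i++) {
        if (strcmp(gc_plugins[i].name, name) == 0)
            return &gc_plugins[i];
    }
    return NULL;
}
```

The rest of the program only ever calls through the selected step pointer,
so adding a third collector means adding one row to the table.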

>
> Without doubt, there should be a way for parrot code to modify the
> properties of the GC, like the frequency of calling, and to specify "run
> the GC now".
>
>-=- James Mastros




[PATCH] Computed goto detected at Configure.pl

2001-11-04 Thread Daniel Grunblatt

- Message Text -
All.-

Now I'm sending:

* A modification to Configure.pl and Makefile.in to detect if the compiler
accepts computed gotos, also added testcomputedgoto_c.in.

* A modification to runcore_ops.c and interpreter.c adding an ifdef.

* The same ops2cgc.pl and the same modification to pbc2c.pl, but I will deal
with this one tomorrow (I have to go now).  It's still emitting computed
goto C; we'll have to decide how to handle jump and ret when we don't have
computed goto, without big speed consequences.
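
For readers unfamiliar with the technique, here is a small sketch of a
dispatch loop that uses computed goto when available and falls back to a
switch otherwise.  It is not the code from the patch: the three-op bytecode,
the run_ops function and the use of __GNUC__ to set HAVE_COMPUTED_GOTO are
assumptions made so the example is self-contained (in the patch the flag
comes from Configure.pl).  The `&&label` and `goto *ptr` forms are GNU C
extensions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-op bytecode, for illustration only. */
enum { OP_INC = 0, OP_DEC = 1, OP_END = 2 };

/* Stand-in for the Configure probe: assume GNU-compatible compilers
   accept computed goto. */
#if !defined(HAVE_COMPUTED_GOTO) && defined(__GNUC__)
#define HAVE_COMPUTED_GOTO
#endif

/* Run a program terminated by OP_END and return the final accumulator.
   Opcodes are assumed to be valid; there is no bounds check. */
static long run_ops(const int *pc) {
    long acc = 0;
    assert(pc != NULL);
#ifdef HAVE_COMPUTED_GOTO
    {
        /* One label address per opcode, indexed by opcode number. */
        static void *labels[] = { &&do_inc, &&do_dec, &&do_end };
        goto *labels[*pc];
    do_inc:
        acc++; pc++;
        goto *labels[*pc];   /* jump straight to the next op's code */
    do_dec:
        acc--; pc++;
        goto *labels[*pc];
    do_end:
        return acc;
    }
#else
    for (;;) {
        switch (*pc) {
        case OP_INC: acc++; pc++; break;
        case OP_DEC: acc--; pc++; break;
        default:     return acc;
        }
    }
#endif
}
```

Each op body ends with its own indirect jump instead of returning to a
central loop, which is where the dispatch saving is expected to come from;
how much it helps clearly depends on the compiler, as the timings in this
thread show.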

Hope someone finds these useful.

Daniel Grunblatt.




/*
 * testcomputedgoto.c - figure out if we can use computed goto
 *
 * This file is automatically generated by Configure
 * from testcomputedgoto_c.in.
 */

int main(int argc, char **argv) {
    static void *ptr = &&LABEL;
    int a;

    goto *ptr;

LABEL: {
        a = 1;
    }

    return 0;
}



#! /usr/bin/perl -w
#
# ops2cgc.pl
#
# Generate a C header and source file from the operation definitions in
# an .ops file.
#

use strict;
use Parrot::OpsFile;


#
# Process command-line argument:
#

if (@ARGV != 1) {
  die "ops2cgc.pl: usage: perl ops2cgc.pl input.ops\n";
}

my $file = $ARGV[0];

my $base = $file;
$base =~ s/\.ops$//;

my $incdir  = "include/parrot/oplib";
my $include = "parrot/oplib/${base}_cg_ops.h";
my $header  = "include/$include";
my $source  = "${base}_cg_ops.c";


#
# Read the input file:
#

my $ops = new Parrot::OpsFile $file;

die "ops2cgc.pl: Could not read ops file '$file'!\n" unless $ops;

my $num_ops = scalar $ops->ops;
my $num_entries = $num_ops + 1; # For trailing NULL


#
# Open the output files:
#

if (! -d $incdir) {
mkdir($incdir, 0755) or die "ops2cgc.pl: Could not mkdir $incdir $!!\n";
}

open HEADER, ">$header"
  or die "ops2cgc.pl: Could not open header file '$header' for writing: $!!\n";

open SOURCE, ">$source"
  or die "ops2cgc.pl: Could not open source file '$source' for writing: $!!\n";


#
# Print the preamble for the HEADER and SOURCE files:
#

my $preamble = source(\&map_cg_abs, \&map_cg_rel, \&map_arg, \&map_res_abs, 
\&map_res_rel);

print SOURCE "&&PC_" . $index++ . ",\n";

push @op_source, "$definition /* " . $op->func_name . " */\n{\n$source}\n\n";
}

#
# Finish the array and start the execution:
#

print SOURCE < "cur_opcode[%ld]",

'i'  => "interpreter->int_reg->registers[cur_opcode[%ld]]",
'n'  => "interpreter->num_reg->registers[cur_opcode[%ld]]",
'p'  => "interpreter->pmc_reg->registers[cur_opcode[%ld]]",
's'  => "interpreter->string_reg->registers[cur_opcode[%ld]]",
  
'ic' => "cur_opcode[%ld]",
'nc' => "interpreter->code->const_table->constants[cur_opcode[%ld]]->number",
'pc' => "%ld /* ERROR: Don't know how to handle PMC constants yet! */",
'sc' => "interpreter->code->const_table->constants[cur_opcode[%ld]]->string",
  );

  die "Unrecognized type '$type' for num '$num'" unless exists $arg_maps{$type};

  return sprintf($arg_maps{$type}, $num);
}


#
# map_res_rel()
#

sub map_res_rel
{
  my ($offset) = @_;
  return "interpreter->resume_addr = cur_opcode + $offset";
}


#
# map_res_abs()
#

sub map_res_abs
{
  my ($addr) = @_;
  return "interpreter->resume_addr = $addr";
}


Index: Configure.pl
===
RCS file: /home/perlcvs/parrot/Configure.pl,v
retrieving revision 1.31
diff -u -r1.31 Configure.pl
--- Configure.pl2001/11/02 12:11:15 1.31
+++ Configure.pl2001/11/05 00:28:50
@@ -95,6 +95,14 @@
platform => 'linux',
cp =>   'cp',
slash =>'/',
+   cg_h => '$(INC)/oplib/core_cg_ops.h',
+   cg_c => 'core_cg_ops$(O): $(H_FILES) core_ops.c
+
+core_cg_ops.c $(INC)/oplib/core_cg_ops.h: core.ops ops2cgc.pl
+   $(PERL) ops2cgc.pl core.ops',
+   cg_o => 'core_cg_ops$(O)',
+   cg_r => '$(RM_F) $(INC)/oplib/core_cg_ops.h core_cg_ops.c',
+   cg_flag =>  '-DHAVE_COMPUTED_GOTO',
 );
 
 #copy the things from --define foo=bar
@@ -224,6 +232,33 @@
 
 # rewrite the config file with the updated info
 buildfile("config_h", "include/parrot");
+
+
+# and now test if we can use computed goto
+print <<"END";
+
+Still everything ok, let's check if we can use computed goto,
+don't worry if you see some errors, it will be all right,
+This could take a bit...
+END
+
+{
+   buildfile("testcomputedgoto_c");
+   my $test = system("$c{cc} $c{ccflags} -o testcomputedgoto$c{exe} testcomputedgoto.c");
+   
+   if ($test != 0) {
+   $c{"cg_h"}='';
+   $c{"cg_c"}='';
+   $c{"cg_o"}='';
+   $c{"cg_r"}='';
+   $c{"cg_flag"}='';
+  

Re: [PATCH] Computed goto detected at Configure.pl

2001-11-04 Thread James Mastros

On Sun, Nov 04, 2001 at 07:27:01PM -0300, Daniel Grunblatt wrote:
> * A modification to Configure.pl and Makefile.in to detect if the compiler
> accepts computed gotos, also added testcomputedgoto_c.in.
Is there some reason that this is an _c.in file?  I've noticed that both
this and testparrotsizes_c.in have no substitutions (AFAICS), so could just as
easily be .c files (in a different directory, I'd tend to say).

For that matter, why are we avoiding filenames with more than one dot?  It'd
be easy to teach a Makefile to get core.ops.c from core.ops; much harder to
tell it how to get core_ops.c.  (Note that in the current Makefile, we
special-case it.)

 -=- James Mastros



Rounding?

2001-11-04 Thread Zach Lipton

I'm working on learning some parrot asm, but if I write something like this:

set N0,2
set N1,2
add N3, N0, N1
print N3


I get:

4.00

Is there any way to round this, or at least chop the 0's off the end?

Zach




Re: Rounding?

2001-11-04 Thread Michael L Maraist

On Sunday 04 November 2001 10:59 pm, Zach Lipton wrote:
> I'm working on learning some parrot asm, but if I write something like
> this:
>
> set N0,2
> set N1,2
> add N3, N0, N1
> print N3
>
>
> I get:
>
> 4.00
>
> Is there any way to round this, or at least chop the 0's off the end?

Since print is for debugging purposes, that's doubtful.  You could create a 
throw-away patch that adds:
printf s, i|n|s
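
If such an op were built on C's printf family (an assumption; no such op
exists in the tree discussed here), the "%g" conversion already drops
trailing zeros.  The helper name format_num below is made up for
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Format a double with "%g", which removes trailing zeros, so 4.0 is
   written as "4".  Returns the number of characters written, or -1 if the
   buffer is too small. */
static int format_num(char *buf, size_t size, double value) {
    int n;
    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%g", value);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Note that "%g" defaults to six significant digits and switches to exponent
notation for very large or very small values, so it chops zeros but is not a
general rounding facility.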

-Michael



[PATCHES] Multiple operation tables

2001-11-04 Thread Jeff

This (rather large) set of patches adds the ability for parrot to use
multiple operation libraries. It's currently adding 'obscure.ops' and
'vtable.ops' to the list of operations the interpreter can perform,
though vtable.ops is not tested for obvious reasons.

While this likely should be done in the future with some sort of
standalone compilation tool that builds libobscure.so &c, this will at
least provide some sort of basic mechanism to let us test vtable
operations once they're fully implemented. I've added a test for
'covers' from obscure.ops in t/op/trans.t to prove that the new optable
is being included.

Modified files:
Makefile.in - Modified to link obscure_ops.o and vtable_ops.o in.
Parrot/Assembler.pm - Altered patch that was previously posted to unify
opcode entries
disassemble.pl - Add Parrot::OpLib::{obscure,vtable} and add the new
opcode lists.
interpreter.h - Add 'destroy_interpreter(struct Parrot_Interp*)' to do
final GC on the data structure that's been allocated inside the
interpreter.
interpreter.c - Change interpreter->opfunc and interpreter->opinfo to be
dynamically allocated, so that more than one op table can be added. Also
add code to do the copying.
pbc2c.pl - Add the other opfiles here.
t/op/trans.t - Add a test for 'covers', which is in the obscure.ops file

test_main.c - Make sure the interpreter is destroyed
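
The interpreter.c change can be pictured with the following sketch of a
dynamically grown function table to which several op libraries are appended.
This is not the patch's code: OpTable, optable_append, optable_destroy, the
op signature and the three sample ops are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*op_func_t)(int);   /* hypothetical op signature */

typedef struct OpTable {
    op_func_t *funcs;   /* combined table, owned by this struct */
    size_t count;       /* number of ops currently in the table */
} OpTable;

/* Append the `n` ops of one library to the combined table by copying them.
   Returns the opcode number assigned to lib[0], or -1 if allocation fails
   (in which case the existing table is left untouched). */
static long optable_append(OpTable *t, const op_func_t *lib, size_t n) {
    op_func_t *grown;
    long base;
    assert(t != NULL && lib != NULL && n > 0);
    grown = realloc(t->funcs, (t->count + n) * sizeof *grown);
    if (grown == NULL)
        return -1;
    memcpy(grown + t->count, lib, n * sizeof *grown);
    base = (long)t->count;
    t->funcs = grown;
    t->count += n;
    return base;
}

/* Release the table; plays the role of the cleanup done at shutdown. */
static void optable_destroy(OpTable *t) {
    assert(t != NULL);
    free(t->funcs);
    t->funcs = NULL;
    t->count = 0;
}

/* Sample ops standing in for two separate op libraries. */
static int op_double(int x) { return 2 * x; }
static int op_negate(int x) { return -x; }
static int op_square(int x) { return x * x; }

static const op_func_t core_lib[]    = { op_double, op_negate };
static const op_func_t obscure_lib[] = { op_square };
```

Because each library is assigned the next free range of opcode numbers, the
order in which libraries are appended has to match the order the assembler
assumed when it numbered the ops.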

*** WARNINGS
Makefile.in - It's somewhat silly to have to modify three different
points in the makefile to add the necessary dependencies, and this
should probably be set up dynamically by the Configure.pl script or the
like.
Parrot/Assembler.pm - The patch might cause a problem should
Parrot::OpLib::*::ops be used elsewhere.
interpreter.c - The memory allocation should likely be done elsewhere.
Currently it's done inside make_interpreter, and it should be factored
out into a different function.
pbc2c.pl - The patch breaks encapsulation by altering the {CODE} hash
keys inside the OpsFile object

--
--Jeff
<[EMAIL PROTECTED]>



diff -ru parrot_orig/Makefile.in parrot/Makefile.in
--- parrot_orig/Makefile.in Fri Nov  2 07:11:15 2001
+++ parrot/Makefile.in  Sat Nov  3 23:59:22 2001
@@ -3,16 +3,22 @@
 
 INC=include/parrot
 
+OP_H_FILES = $(INC)/oplib/core_ops.h $(INC)/oplib/obscure_ops.h $(INC)/oplib/vtable_ops.h
+OP_O_FILES = core_ops$(O) obscure_ops$(O) vtable_ops$(O)
+OP_PM_FILES = Parrot/OpLib/core.pm Parrot/OpLib/obscure.pm Parrot/OpLib/vtable.pm
+
 H_FILES = $(INC)/config.h $(INC)/exceptions.h $(INC)/io.h $(INC)/op.h \
 $(INC)/register.h $(INC)/string.h $(INC)/events.h $(INC)/interpreter.h \
 $(INC)/memory.h $(INC)/parrot.h $(INC)/stacks.h $(INC)/packfile.h \
-$(INC)/global_setup.h $(INC)/vtable.h $(INC)/oplib/core_ops.h \
-$(INC)/runops_cores.h $(INC)/trace.h $(INC)/oplib/vtable_ops.h \
+$(INC)/global_setup.h $(INC)/vtable.h \
+$(OP_H_FILES) \
+$(INC)/runops_cores.h $(INC)/trace.h \
 $(INC)/pmc.h $(INC)/resources.h $(INC)/platform.h
 
 O_FILES = global_setup$(O) interpreter$(O) parrot$(O) register$(O) \
-core_ops$(O) memory$(O) packfile$(O) stacks$(O) string$(O) encoding$(O) \
-chartype$(O) runops_cores$(O) trace$(O) vtable_ops$(O) classes/intclass$(O) \
+memory$(O) packfile$(O) stacks$(O) string$(O) encoding$(O) \
+$(OP_O_FILES) \
+chartype$(O) runops_cores$(O) trace$(O) classes/intclass$(O) \
 encodings/singlebyte$(O) encodings/utf8$(O) encodings/utf16$(O) \
 encodings/utf32$(O) chartypes/unicode$(O) chartypes/usascii$(O) resources$(O) \
 platform$(O)
@@ -42,7 +48,7 @@
 libparrot.so: $(O_FILES)
$(CC) -shared $(C_LIBS) -o $@ $(O_FILES)
 
-$(TEST_PROG): test_main$(O) $(O_FILES) Parrot/OpLib/core.pm
+$(TEST_PROG): test_main$(O) $(O_FILES) $(OP_PM_FILES)
$(CC) $(CFLAGS) -o $(TEST_PROG) $(O_FILES) test_main$(O) $(C_LIBS)
 
 $(PDUMP): pdump$(O) $(O_FILES)
@@ -56,8 +62,11 @@
 Parrot/OpLib/core.pm: core.ops ops2pm.pl
$(PERL) ops2pm.pl core.ops
 
+Parrot/OpLib/obscure.pm: obscure.ops ops2pm.pl
+   $(PERL) ops2pm.pl obscure.ops
+
 Parrot/OpLib/vtable.pm: vtable.ops ops2pm.pl
-   $(PERL) ops2pm.pl vtabls.ops
+   $(PERL) ops2pm.pl vtable.ops
 
 examples/assembly/mops.c: examples/assembly/mops.pbc pbc2c.pl
$(PERL) pbc2c.pl examples/assembly/mops.pbc > examples/assembly/mops.c
@@ -103,9 +112,14 @@
 
 core_ops$(O): $(H_FILES) core_ops.c
 
+obscure_ops$(O): $(H_FILES) obscure_ops.c
+
 core_ops.c $(INC)/oplib/core_ops.h: core.ops ops2c.pl
$(PERL) ops2c.pl core.ops
 
+obscure_ops.c $(INC)/oplib/obscure_ops.h: obscure.ops ops2c.pl
+   $(PERL) ops2c.pl obscure.ops
+
 vtable.ops: make_vtable_ops.pl
$(PERL) make_vtable_ops.pl > vtable.ops
 
@@ -130,14 +144,14 @@
cd docs; make
 
 clean:
-   $(RM_F) *$(O) *.s core_ops.c $(TEST_PROG) $(PDISASM) $(PDUMP)
+   $(RM_F) *$(O) *.s $(OP_C_FILES) $(TEST_PROG) $(PDISASM) $(PDUMP)
$(RM_F) $(INC)/vtable.h
-   $(RM_F) $(INC)/oplib/core_ops.h
-   $(RM_F) $(INC)/oplib/vtable_ops.h vtable_ops.c vtable.ops
+   $(RM_F) $(OP_H_FILES)
+   $(RM_F) vtab

Re: Rules for memory allocation and pointing

2001-11-04 Thread Benoit Cerrina

>
>- Original Message -
>From: "Michael L Maraist" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Sunday, November 04, 2001 10:10 PM
>Subject: Re: Rules for memory allocation and pointing
>
>
>On Sunday 04 November 2001 03:36 pm, Benoit Cerrina wrote:
>> > While the PMC structures themselves don't move (no real need--they're of
>> > fixed size so you can't fragment your allocation pool, though it makes
>>
>> Sorry can you expand on this.  I don't see the relation between the data
>> being fixed size and the memory not becoming fragmented.
>
>Fixed sized memory objects can be "records" of an array.  Thus you can use
>an "arena" scheme (also known as a slab, or indexed chunks).  You store a
>linked
 interesting stuff 
>
OK
Benoit





Re: Rules for memory allocation and pointing

2001-11-04 Thread Benoit Cerrina

>At 09:36 PM 11/4/2001 +0100, Benoit Cerrina wrote:
>
>> > While the PMC structures themselves don't move (no real need--they're of
>> > fixed size so you can't fragment your allocation pool, though it makes
>>Sorry can you expand on this.  I don't see the relation between the data
>>being fixed size and the memory not becoming fragmented.
>> > generational collection easier to some extent) the data pointed to by
>> > the PMC can. That's the bit that moves.
>> >
>> > Dan
>> >
>>Well one of the reasons of moving the data is so you can use a trivial
>>stack-like allocator.  If you don't move the PMC you won't be able to.
>
>Sure you will. Allocation and deallocation of fixed-sized structures is
>very simple--in some ways as simple as using a stack allocator, as you're
>just putting things on the head of a list or taking them off it. It's
>allocation and deallocation of variable-sized structures that's tricky, and
>tends to fragment your heap all to heck.
>
>
>Dan
OK
Benoit




Re: [PATCH] Computed goto, super-fast dispatching.

2001-11-04 Thread Daniel Grunblatt

Yeap, I was right, using gcc 3.0.2 you can see the difference:

Without my patch:

linux# ./test_prog examples/assembly/mops.pbc
Iterations:1
Estimated ops: 3
Elapsed time:  20.972973
M op/s:14.304124

With the patch:

linux# ./test_prog examples/assembly/mops.pbc
Iterations:1
Estimated ops: 3
Elapsed time:  8.983514
M op/s:33.394505

Now I'm really happy :)

So, you could say that we can't ask everyone to have gcc 3.0.2, right?  But
we can let them download the binaries.

Daniel Grunblatt.

On Sun, 4 Nov 2001, Daniel Grunblatt wrote:

> Yes, you are right on that, but that is only on linux, not on *BSD (where
> I tried it). I still don't know why this is. Can you try using gcc 3.0.2?
>
> For the compiled version, please read both mops.c files; you will see there
> is no difference except for the definition of the array which, if I'm not
> missing something, doesn't have anything to do with the _benchmark_.
>
> Daniel Grunblatt.
>
> On Sun, 4 Nov 2001, Tom Hughes wrote:
>
> > In message <[EMAIL PROTECTED]>
> >   Daniel Grunblatt <[EMAIL PROTECTED]> wrote:
> >
> > > All:
> > >   Here's a list of the things I've been doing:
> > >
> > > * Added ops2cgc.pl which generates core_cg_ops.c and core_cg_ops.h from
> > > core.ops, and modified Makefile.in to use it. In core_cg_ops.c resides
> > > cg_core which has an array with the addresses of the label of each opcode
> > > and starts the execution "jumping" to the address in array[*cur_opcode].
> > >
> > > * Modified interpreter.c to include core_cg_ops.h
> > >
> > > * Modified runcore_ops.c to discard the actual dispatching method and call
> > > cg_core, but left everything else untouched so that -b,-p and -t keep
> > > working.
> > >
> > > * Modified pbc2c.pl to use computed goto when handling jump or ret. Maybe
> > > I can modify this once again not to define the array with the addresses
> > > if it's not going to be used, but I don't think that in real life a program
> > > won't use jump or ret, am I right?
> > >
> > > Hope someone finds this useful.
> >
> > I just tried it but I don't seem to be seeing anything like the speedups
> > you are. All the times which follow are for a K6-200 running RedHat 7.2 and
> > compiled -O6 with gcc 2.96.
> >
> > Without patch:
> >
> >   gosford [~/src/parrot] % ./test_prog examples/assembly/mops.pbc
> >   Iterations:1
> >   Estimated ops: 3
> >   Elapsed time:  37.387179
> >   M op/s:8.024141
> >
> >   gosford [~/src/parrot] % ./examples/assembly/mops
> >   Iterations:1
> >   Estimated ops: 3
> >   Elapsed time:  3.503482
> >   M op/s:85.629098
> >
> > With patch:
> >
> >   gosford [~/src/parrot-cg] % ./test_prog examples/assembly/mops.pbc
> >   Iterations:1
> >   Estimated ops: 3
> >   Elapsed time:  29.850361
> >   M op/s:10.050130
> >
> >   gosford [~/src/parrot-cg] % ./examples/assembly/mops
> >   Iterations:1
> >   Estimated ops: 3
> >   Elapsed time:  4.515596
> >   M op/s:66.436413
> >
> > So there is a small speed up for the interpreted version, but nothing
> > like the three times speedup you had. The compiled version has actually
> > managed to get slower...
> >
> > Tom
> >
> > --
> > Tom Hughes ([EMAIL PROTECTED])
> > http://www.compton.nu/
> >
> >
>
>