Re: Regex helper opcodes

2001-11-06 Thread Dan Sugalski

At 04:34 PM 11/5/2001 -0800, Steve Fink wrote:
>Quoting Dan Sugalski ([EMAIL PROTECTED]):
> > At 11:54 AM 11/5/2001 -0800, Steve Fink wrote:
> > > > >It's pretty
> > > > >much functional, including reOneof.  Still, these could be useful
> > > > >internal functions... *ponder*
> > > >
> > > > I was thinking that the places they could come in really handy for were
> > > > character classes. \w, \s, and \d are potentially a lot faster this way,
> > > > 'specially if you throw in Unicode support. (The sets get rather a bit
> > > > larger...) It also may make some character-set independence easier.
> > >
> > >But why would you be generating character classes at runtime?
> >
> > Because someone does:
> >
> >while (<>) {
> >  next unless /[aeiou]/;
> >}
> >
> > and we want that character class to be reasonably fast?
>
>? So don't generate it at runtime. When you generate the opcode
>sequence for the regex, emit a bit vector into the constant table and
>refer to it by address in the matchCharClass op's arguments. Be fancy
>and check that you haven't already emitted that bit vector. Am I
>missing something?

Just me being rather amazingly dense.

No, there's no requirement for there to be a way to create or change at 
runtime the contents of one of these bit vectors, at least not for the 
regexes. That can be done entirely by the compiler or loader, depending on 
where the code's ultimately coming from.
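
To make the compile-time approach concrete, here is a minimal C sketch, assuming 8-bit characters and a 256-bit vector per class. The names CharClass, charclass_init, charclass_contains and charclass_match_any are made up for illustration and are not Parrot API; the constant-table bookkeeping Steve describes is left out.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical layout: one bit per 8-bit character code (assumes CHAR_BIT == 8). */
#define CLASS_BITS  256
#define CLASS_BYTES (CLASS_BITS / CHAR_BIT)

typedef struct {
    unsigned char bits[CLASS_BYTES];
} CharClass;

/* Build the class once, at compile or load time, from its member characters. */
static void charclass_init(CharClass *cc, const char *members)
{
    assert(cc != NULL && members != NULL);
    memset(cc->bits, 0, sizeof cc->bits);
    for (; *members != '\0'; members++) {
        unsigned char c = (unsigned char)*members;
        cc->bits[c / CHAR_BIT] |= (unsigned char)(1u << (c % CHAR_BIT));
    }
}

/* The runtime test a matchCharClass-style op would do: one load, one mask. */
static int charclass_contains(const CharClass *cc, unsigned char c)
{
    return (cc->bits[c / CHAR_BIT] >> (c % CHAR_BIT)) & 1;
}

/* Returns 1 if any character of line is in the class, like /[aeiou]/. */
static int charclass_match_any(const CharClass *cc, const char *line)
{
    for (; *line != '\0'; line++) {
        if (charclass_contains(cc, (unsigned char)*line))
            return 1;
    }
    return 0;
}
```

For /[aeiou]/ the vector would be built once with charclass_init(&cc, "aeiou") and then only read inside the loop, so nothing about the class is computed per line.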

> > Ah, point. A bitmap won't work too well with the full UTF-32 set.
> >
> > Having a good set of set operations would be useful for the core, though.
>
>No argument there.

Which would be a good argument for allowing these things to be created or 
modified at runtime, but that's a separate argument entirely.

> > >You aren't thinking that the regular expression _compiler_ needs to be
> > >written in Parrot opcodes, are you? I assumed you'd reach it through
> > >some callout mechanism in the same way that eval"" will be handled.
> >
> > The core of the parser's still a bit up in the air. Larry's leaning towards
> > it being in perl.
>
>When you say "parser", do you mean parser + bytecode generator +
>optimizer + syntax analyzer? (Of which only the bytecode generator is
>relevant to [:classes:], I suppose.)

No, really just the parser bit, which is where I was assuming most of the 
work would get done for this. Silly assumption--too much blood in my 
caffeine stream.

The bytecode generator and optimizer may well be (probably will be) in C, 
though depending on how it works out the bytecode generator itself may well 
be doable in perl. In many ways it's just a fancy set of rules to transform 
the syntax tree to bytecode, and you could look at it as either a really 
beefy template system or a fancy regex machine.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




useful GC and memory reference

2001-11-06 Thread Sean O'Rourke

As someone interested in both Perl and garbage collection, I've
been following the discussion here with interest.  Since I
haven't seen it mentioned, I thought I would point out Paul
Wilson's work in this area at UT Austin, which I found useful
back when I was doing GC stuff.  He has a survey of GC options
and their tradeoffs, some experiments measuring typical
programs' memory locality, and a discussion of how these things
relate.  This is some random course page, but it links to the
relevant papers:

http://www.cs.utexas.edu/users/wilson/cs395t/

Hope that's useful, or amusing, or something.
Sean
(relurks)




Re: Yet another switch/goto implementation

2001-11-06 Thread Simon Cozens

On Mon, Nov 05, 2001 at 11:35:53PM -0500, Ken Fox wrote:
> IMHO Perl is getting 

Interesting construction. :)

> some static typing ability, so it should be able
> to emit bytecode that doesn't go through the PMC vtable.

Yes, but that's fundamentally different from inlining vtable methods
in the runops loop, which is what you were originally suggesting. 
I'm now unsure what you're actually getting at.

-- 
 I detest people who get in their cars before turning off the 
alarm, fiddle around a bit, and then turn it off
 maybe they're afraid someone might steal the car in the short time 
before they turn off the alarm and actually get in
 it's a race condition, you know



Re: How far back do we go?

2001-11-06 Thread Dan Sugalski

At 11:29 AM 11/6/2001 -0500, Bryan C. Warnock wrote:
>On Tuesday 06 November 2001 11:32 am, Dan Sugalski wrote:
> > Getting parrot building on Solaris brought up another interesting
> > portability issue. Turns out the default GCC switches are inappropriate
> > for GCC 2.8.1, which is the default on my Sun box. Works OK with gcc
> > 2.95.somethingorother.
> >
> > How far back with GCC should we support?
>
>At least 2.7.x.

Cool. I think I even have a machine with that rev.

>What switches broke?

  -fno-strict-aliasing


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: How far back do we go?

2001-11-06 Thread Bryan C . Warnock

On Tuesday 06 November 2001 11:38 am, Dan Sugalski wrote:
> >What switches broke?
>
>   -fno-strict-aliasing

Ah.  Well, steal a page from Perl 5's configure then, 'cause they check for 
it.

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: Portability update

2001-11-06 Thread Alex Gough

On Tue, 6 Nov 2001, Dan Sugalski wrote:

> Okay, I just tweaked things some, and now parrot builds and tests OK on 
> Solaris, Linux, and Cygwin. It's theoretically possible that this'll get 
> things building OK on all the recent-vintage Unices, but I can't promise 
> that. :)
> 
> Could folks on Tru64/Irix/HP-UX/AIX check this out and give it a whirl?

Irix happy. (w/ MIPSPro and gcc).

Alex Gough




Re: Portability update

2001-11-06 Thread Dan Sugalski

At 05:24 PM 11/6/2001 +, Alex Gough wrote:
>On Tue, 6 Nov 2001, Dan Sugalski wrote:
>
> > Okay, I just tweaked things some, and now parrot builds and tests OK on
> > Solaris, Linux, and Cygwin. It's theoretically possible that this'll get
> > things building OK on all the recent-vintage Unices, but I can't promise
> > that. :)
> >
> > Could folks on Tru64/Irix/HP-UX/AIX check this out and give it a whirl?
>
>Irix happy. (w/ MIPSPro and gcc).

Yay! Cool, thanks.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




RE: Portability update

2001-11-06 Thread Angel Faus


A fresh checkout (both after and before Dan's portability patch) gives me a
lot of failed tests on cygwin.

Is it me or is it really a problem? Dan said it's tested on cygwin...

$ make test

Failed Test    Status Wstat Total Fail  Failed  List of Failed
t/op/basic.t        1   256     5    1  20.00%  5
t/op/bitwise.t      4  1024     4    4 100.00%  1-4
t/op/integer.t     27  6912    27   27 100.00%  1-27
t/op/number.t      27  6912    27   27 100.00%  1-27
t/op/stacks.t       4  1024    10    4  40.00%  1, 3, 5, 7
t/op/string.t      21  5376    24   21  87.50%  1, 3-4, 6, 8-24
t/op/time.t         2   512     2    2 100.00%  1-2
3 subtests skipped.
Failed 7/8 test scripts, 12.50% okay. 86/117 subtests failed, 26.50% okay.

$ uname -a
CYGWIN_98-4.10 BCN-AFAUS 1.3.3(0.46/3/2) 2001-09-12 23:54 i686 unknown

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i686-pc-cygwin/2.95.3-5/specs
gcc version 2.95.3-5 (cygwin special)





Re: [PATCH] Computed goto, super-fast dispatching.

2001-11-06 Thread Dan Sugalski

At 11:12 PM 11/5/2001 +, Alex Gough wrote:
>On Mon, 5 Nov 2001, Dan Sugalski wrote:
> > At 10:24 AM 11/5/2001 -0300, Daniel Grunblatt wrote:
> > >Right, now, what about the audience with an operating system with gcc
> > >3.0.2?
> >
> > What about 'em? They build the same way everyone else does.
> >
> > Gearing code specifically towards the quirks of a specific compiler
> > version's usually a good way to get really disappointed when the next point
> > release comes out and tosses away the win you found.
> >
>
>Hurrah that man!  For information, Irix isn't working any more, because
>Configure thinks it's linux.  I'll try to sort it out when I'm further
>away from the evil modem monster.

That's bizarre. A hints file will probably fix that up, but I think we need 
to wean parrot off of the GNU view of the world.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: How far back do we go?

2001-11-06 Thread Andy Dougherty

On Tue, 6 Nov 2001, Dan Sugalski wrote:

> Getting parrot building on Solaris brought up another interesting 
> portability issue. Turns out the default GCC switches are inappropriate for 
> GCC 2.8.1, which is the default on my Sun box. Works OK with gcc 
> 2.95.somethingorother.
> 
> How far back with GCC should we support?

[a little bit of guessing here . . . ]

I think your question is actually more generic and will be self-answering
in time.  

Configure.pl still grabs most of its defaults from the previous perl5
build. I'll bet you used a pre-built perl5 on your system.  That pre-built
perl5 was apparently not built with the same version of gcc that you have
on your system.  You'd probably run into the same problem trying to build
any perl5 extensions that require a compiler.  You'd also run into the
same problem anytime you had a mismatch between a prebuilt perl5 and the
rest of your system.  Such mismatches are very easy to come by.

Ultimately, Configure.pl ought to, in principle, check everything for
itself.  Once it starts doing more of that, problems such as you ran into
should be automatically handled.

-- 
Andy Dougherty  [EMAIL PROTECTED]
Dept. of Physics
Lafayette College, Easton PA 18042




RE: Portability update

2001-11-06 Thread Dan Sugalski

At 07:08 PM 11/6/2001 +0100, Angel Faus wrote:

>A fresh checkout (both after and before Dan's portability patch) gives me a
>lot of failed tests on cygwin.
>
>Is it me or is it really a problem? Dan said it's tested on cygwin...

The tests fail for spurious line-ending problems, which is kinda annoying 
but I've not had time to track it down yet. If you look at the output it's 
correct, but Test::Harness doesn't think so.
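
As a rough illustration of that kind of mismatch, here is a small C sketch that compares two outputs after folding "\r\n" into "\n". It assumes the difference is CRLF versus LF, which hasn't been confirmed, and the helper names normalize_newlines and same_output are invented for the example; this is not how Test::Harness compares anything.

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst, turning each "\r\n" pair into "\n".
 * dst must have room for strlen(src) + 1 bytes. Returns the length written. */
static size_t normalize_newlines(char *dst, const char *src)
{
    size_t n = 0;
    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; src++) {
        if (src[0] == '\r' && src[1] == '\n')
            continue;               /* drop the CR; the LF is copied next */
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}

/* Compare two short outputs while ignoring CRLF-versus-LF differences. */
static int same_output(const char *got, const char *expected)
{
    char a[256], b[256];            /* arbitrary demo limit */
    assert(strlen(got) < sizeof a && strlen(expected) < sizeof b);
    normalize_newlines(a, got);
    normalize_newlines(b, expected);
    return strcmp(a, b) == 0;
}
```

A plain strcmp of "ok 1\r\n" against "ok 1\n" reports a difference even though the visible text is identical, which matches the symptom of correct-looking output that still fails.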

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: How far back do we go?

2001-11-06 Thread Dan Sugalski

At 01:10 PM 11/6/2001 -0500, Andy Dougherty wrote:
>On Tue, 6 Nov 2001, Dan Sugalski wrote:
>
> > Getting parrot building on Solaris brought up another interesting
> > portability issue. Turns out the default GCC switches are inappropriate for
> > GCC 2.8.1, which is the default on my Sun box. Works OK with gcc
> > 2.95.somethingorother.
> >
> > How far back with GCC should we support?
>
>[a little bit of guessing here . . . ]
>
>I think your question is actually more generic and will be self-answering
>in time.
>
>Configure.pl still grabs most of its defaults from the previous perl5
>build. I'll bet you used a pre-built perl5 on your system.  That pre-built
>perl5 was apparently not built with the same version of gcc that you have
>on your system.

D'oh! Got it in one--thanks Andy. The default GCC wasn't the one that built 
my perl, hence the problem.

One more thing for the portability list, I guess. (Re-figure C switches, 
right next to a relocatable perl...)


Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: How far back do we go?

2001-11-06 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
  Dan Sugalski <[EMAIL PROTECTED]> wrote:

> >What switches broke?
> 
>   -fno-strict-aliasing

But parrot doesn't need that anyway as far as I know. It is needed
in perl5 because of the horrible things it does with casts and such
like but if we can avoid having to have it on we should as it will
impact performance.
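
For anyone unsure what the switch is about, here is a hedged C sketch of the sort of cast it exists to tolerate, next to a form that needs no special flag. It is not taken from perl5 or parrot; float_bits is an invented name, and the code assumes a 32-bit float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * The pattern that strict aliasing breaks looks like this:
 *
 *     uint32_t u = *(uint32_t *)&f;
 *
 * Reading a float object through a uint32_t pointer is undefined, so an
 * optimizer that assumes strict aliasing may reorder or drop the access.
 * -fno-strict-aliasing tells GCC not to make that assumption.
 */

/* Reads a float's bit pattern without violating the aliasing rules. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);   /* assumption: float is 32 bits wide */
    memcpy(&u, &f, sizeof u);       /* byte copy is always permitted */
    return u;
}
```

If the code base sticks to forms like the memcpy one, the flag can stay off and the compiler keeps its freedom to optimize.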

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/




Re: make clean

2001-11-06 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
  Daniel Grunblatt <[EMAIL PROTECTED]> wrote:

> Index: Makefile.in
> ===
> RCS file: /home/perlcvs/parrot/Makefile.in,v
> retrieving revision 1.43
> diff -u -r1.43 Makefile.in

I enhanced this to remove object files from all the current
subdirectories rather than just one, and have committed it.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/




Portability update

2001-11-06 Thread Dan Sugalski

Okay, I just tweaked things some, and now parrot builds and tests OK on 
Solaris, Linux, and Cygwin. It's theoretically possible that this'll get 
things building OK on all the recent-vintage Unices, but I can't promise 
that. :)

Could folks on Tru64/Irix/HP-UX/AIX check this out and give it a whirl?

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




How far back do we go?

2001-11-06 Thread Dan Sugalski

Getting parrot building on Solaris brought up another interesting 
portability issue. Turns out the default GCC switches are inappropriate for 
GCC 2.8.1, which is the default on my Sun box. Works OK with gcc 
2.95.somethingorother.

How far back with GCC should we support?

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: How far back do we go?

2001-11-06 Thread Bryan C . Warnock

On Tuesday 06 November 2001 11:32 am, Dan Sugalski wrote:
> Getting parrot building on Solaris brought up another interesting
> portability issue. Turns out the default GCC switches are inappropriate
> for GCC 2.8.1, which is the default on my Sun box. Works OK with gcc
> 2.95.somethingorother.
>
> How far back with GCC should we support?

At least 2.7.x.  

What switches broke?

-- 
Bryan C. Warnock
[EMAIL PROTECTED]



Re: Vtables fixed, scalar started

2001-11-06 Thread Jason Gloudon


After looking at the internal data types PDD and the vtable PDD (which by the
way is truncated on dev.perl.org in the pdd and HTML form), I can't make sense
of the separate float_type and num_type declared in the vtable structure.

struct _vtable {
  struct PACKAGE *package;
  INTVAL base_type;
  INTVAL int_type;
  INTVAL float_type;  
  INTVAL num_type;
  INTVAL string_type;

Also, the NUM, INT and STR types are being implemented as the concrete
FLOATVAL, INTVAL, STRING types. This is not the design goal, correct?

For example, the current get_integer vtable functions generated for the scalar
class will only return INTVALs but my impression is that they could return
INTVALs or BigInt values, so that the current scalar class function prototypes
will eventually change.

-- 
Jason



Re: Vtables fixed, scalar started

2001-11-06 Thread Dan Sugalski

At 05:05 PM 11/6/2001 -0500, Jason Gloudon wrote:

>After looking at the internal data types PDD and the vtable PDD (which by the
>way is truncated on dev.perl.org in the pdd and HTML form), I can't make sense
>of the separate float_type and num_type declared in the vtable structure.

A variable with a numeric value can be taken in one of three ways:

*) As an integer. Which means either platform-native or bigint
*) As a float. Which means either platform-native or bigfloat
*) As a generic number. Which means platform native int or float, bigint or 
bigfloat, or (possibly) a complex number.

It might end up that the distinction's just not worth it. We'll see.
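
A small C sketch of the idea, covering only the platform-native cases: the same value can be asked for as an integer or as a generic number through separate vtable entries. INTVAL, FLOATVAL and get_integer are names from this thread; get_number, the struct layouts and the scalar_* helpers are assumptions made up for the example, and the bigint/bigfloat variants are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

typedef long   INTVAL;    /* stand-in for the platform-native integer */
typedef double FLOATVAL;  /* stand-in for the platform-native float */

typedef struct PMC PMC;

/* Hypothetical two-entry vtable: one slot per way of taking the value. */
typedef struct {
    INTVAL   (*get_integer)(const PMC *self);
    FLOATVAL (*get_number)(const PMC *self);
} VTable;

struct PMC {
    const VTable *vtable;
    int is_float;                         /* which union member is valid */
    union { INTVAL i; FLOATVAL f; } value;
};

/* Taken as an integer: a stored float is truncated toward zero. */
static INTVAL scalar_get_integer(const PMC *self)
{
    assert(self != NULL);
    return self->is_float ? (INTVAL)self->value.f : self->value.i;
}

/* Taken as a generic number: a stored integer is widened to a float. */
static FLOATVAL scalar_get_number(const PMC *self)
{
    assert(self != NULL);
    return self->is_float ? self->value.f : (FLOATVAL)self->value.i;
}

static const VTable scalar_vtable = { scalar_get_integer, scalar_get_number };

static PMC scalar_from_int(INTVAL i)
{
    PMC p;
    p.vtable = &scalar_vtable;
    p.is_float = 0;
    p.value.i = i;
    return p;
}

static PMC scalar_from_float(FLOATVAL f)
{
    PMC p;
    p.vtable = &scalar_vtable;
    p.is_float = 1;
    p.value.f = f;
    return p;
}
```

Whether a third, float-only entry earns its keep next to these two is exactly the open question.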

>Also, the NUM, INT and STR types are being implemented as the concrete
>FLOATVAL, INTVAL, STRING types. This is not the design goal, correct?
>
>For example, the current get_integer vtable functions generated for the 
>scalar class will only return INTVALs but my impression is that they could 
>return INTVALs or BigInt values, so that the current scalar class function 
>prototypes will eventually change.

Right, if you called the bigint version of get_integer you'd get a bigint.

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Parrot memory/GC/DOD primer

2001-11-06 Thread Dan Sugalski

'Kay, here's a quick overview of how memory, garbage collection, and dead 
object detection are going to work in Parrot. (And I appreciate this 
getting raised now, BTW--both because it's about time and because it makes 
me think about it in time for next Saturday.)

There are three sets of memory pools for each interpreter. They are:

*) The PMC pool
*) The Buffer header pool
*) The honking great slab of memory pool

The PMC and buffer pools each consist of one or more arenas, while the 
HGSOM pool has one or more slabs of memory allocated to it.

Each arena is thread-private, and no other interpreter can allocate out of it.

When the interpreter needs to do a garbage run, it runs through all the 
PMCs in the PMC pool. Any that look like they might be live and that point 
to buffer objects are assumed to be alive, and the memory their buffer 
objects point to is collected. It's essentially a barely-conditional 
sweep, with the assumption that dead objects are marked as such and thus 
won't be collected.

The dead object phase is essentially the mark phase of a GC run. We mark 
all the PMCs in the pool as unused, start with the root set and then mark 
all the used ones. When we run out, any PMCs that were used but aren't 
pointed to are then marked as unused so garbage runs will ignore them. If 
any of these PMCs require active destruction then that'll be done here.
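
Here's a minimal C sketch of a DOD run along those lines, under heavy simplifying assumptions: a flat array stands in for the PMC pool, each PMC has at most one outgoing reference, and the field names live, in_use and child are invented for the example. None of this is real Parrot code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced PMC: just enough state for a mark pass. */
typedef struct PMC {
    int live;            /* set during the DOD run if reachable */
    int in_use;          /* slot has been handed out by the pool */
    struct PMC *child;   /* at most one outgoing reference, for brevity */
} PMC;

/*
 * Dead object detection: clear every mark, mark whatever is reachable
 * from the root set, then flag the in-use PMCs that were not reached.
 * Returns the number of PMCs found dead in this run.
 */
static size_t dod_run(PMC pool[], size_t pool_len, PMC *roots[], size_t nroots)
{
    size_t i, dead = 0;

    assert(pool != NULL);

    for (i = 0; i < pool_len; i++)
        pool[i].live = 0;

    for (i = 0; i < nroots; i++) {
        PMC *p = roots[i];
        /* Stopping at already-marked PMCs also makes cycles terminate. */
        while (p != NULL && !p->live) {
            p->live = 1;
            p = p->child;
        }
    }

    for (i = 0; i < pool_len; i++) {
        if (pool[i].in_use && !pool[i].live) {
            /* Active destruction, if required, would be triggered here. */
            pool[i].in_use = 0;
            dead++;
        }
    }
    return dead;
}
```

A later garbage run would then skip any slot with in_use cleared, which is the sense in which dead objects are "marked as such".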

If at any point an arena or memory slab is completely unused, it may be 
handed back. Whether that's to the system or a global allocator's up in the 
air--I can see it going either way.

Separating things this way means that a garbage run without an immediately 
preceding DOD run may collect on some PMCs that are really dead. That's OK.

Also, we may well use a vmem system to allocate buffer headers rather than 
using per-interpreter pools of arenas. I can see it going either way, and 
if we have several sizes of buffers it might be more space-efficient to do 
so, but on the other hand using arenas makes collecting the buffer headers, 
which can be done during DOD time, easier. (Basically we wouldn't need to 
explicitly free those either, and could trace the used and unused buffer 
objects at DOD time)

This help any?

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: useful GC and memory reference

2001-11-06 Thread Benoit Cerrina

Lot of reading, thanks.
Benoit
PS: 
I guess the actual page of interest is:
http://www.cs.utexas.edu/users/oops/papers.html
please correct me if wrong
- Original Message - 
From: "Sean O'Rourke" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, November 06, 2001 7:32 AM
Subject: useful GC and memory reference


> As someone interested in both Perl and garbage collection, I've
> been following the discussion here with interest.  Since I
> haven't seen it mentioned, I thought I would point out Paul
> Wilson's work in this area at UT Austin, which I found useful
> back when I was doing GC stuff.  He has a survey of GC options
> and their tradeoffs, some experiments measuring typical
> programs' memory locality, and a discussion of how these things
> relate.  This is some random course page, but it links to the
> relevant papers:
> 
> http://www.cs.utexas.edu/users/wilson/cs395t/
> 
> Hope that's useful, or amusing, or something.
> Sean
> (relurks)
> 




Re: Yet another switch/goto implementation

2001-11-06 Thread Ken Fox

Simon Cozens wrote:
> On Mon, Nov 05, 2001 at 11:35:53PM -0500, Ken Fox wrote:
> > IMHO Perl is getting 
> 
> Interesting construction. :)

Yeah, that should have been a disclaimer. I've heard static typing
proposed, but nothing appears finalized about anything yet. Something
like static typing might even be a bolt-on module if it doesn't make
the core.

> > some static typing ability, so it should be able
> > to emit bytecode that doesn't go through the PMC vtable.
> 
> Yes, but that's fundamentally different from inlining vtable methods
> in the runops loop, which is what you were originally suggesting. 
> I'm now unsure what you're actually getting at.

If the guts of a vtable implementation are ripped out and given an
op, isn't that inlining a PMC method? There doesn't seem much point
in replacing a dynamic vtable offset with a constant vtable offset.
The method really needs to be inlined -- either by copying the code
or by calling the implementation directly.

For example, if you have a length op, the generic one would
dispatch through a vtable: (hand waving pseudo code)

  op_length:
    integer[pc[1]] = pmc[pc[2]]->vtable.length(pmc[pc[2]]);
    pc += 3;
    goto *pc;

the inlined one should be:

  op_length_array:
    if (pmc[pc[2]]->type == PERL_ARRAY) {
      integer[pc[1]] = ((Perl_Array *)pmc[pc[2]])->length;
      pc += 3;
      goto *pc;
    }
    else {
      goto op_length;
    }

It's probably naive to think that "static" typing is going to be
reliable enough to eliminate the type check. Hopefully branch
prediction will be nearly 100% accurate though.
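
The same guard-plus-fallback shape can be written as plain compilable C, without the computed goto. This is only a sketch: the PMC and VTable layouts, the PERL_ARRAY and PERL_STRING tags, and the vtable_calls counter are all invented for the example and don't reflect Parrot's real structures.

```c
#include <assert.h>
#include <stddef.h>

typedef long INTVAL;

enum { PERL_ARRAY = 1, PERL_STRING = 2 };   /* hypothetical type tags */

typedef struct PMC PMC;

typedef struct {
    INTVAL (*length)(const PMC *self);
} VTable;

struct PMC {
    int type;
    const VTable *vtable;
    INTVAL length;
};

static int vtable_calls = 0;   /* counts slow-path dispatches, demo only */

static INTVAL generic_length(const PMC *self)
{
    vtable_calls++;
    return self->length;
}

static const VTable generic_vtable = { generic_length };

/* Generic op: always pays for the indirect call through the vtable. */
static INTVAL op_length(const PMC *p)
{
    assert(p != NULL && p->vtable != NULL);
    return p->vtable->length(p);
}

/* Specialized op: a type guard, then the method body inlined. */
static INTVAL op_length_array(const PMC *p)
{
    assert(p != NULL);
    if (p->type == PERL_ARRAY)
        return p->length;      /* fast path, no indirect call */
    return op_length(p);       /* guard failed: fall back to generic op */
}
```

If the type guess is right nearly every time, the guard is a well-predicted branch and the indirect call disappears from the hot path; when the guess is wrong the op still gives the correct answer, just more slowly.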

This code would be horrible to implement and maintain by hand though.
It would be very cool if there was an option to unroll PMC methods
into inlined ops based on profiling example code. Perl might even send
enough type information to Parrot that Parrot could do some strength
reduction on its own (and replace generic PMC ops with inlined PMC
ops).

- Ken