error in rx.ops example

2003-10-21 Thread Stéphane Payrard
rx_popindex signature is incorrect in the pre-patch example.


--- rx.ops.old  2003-06-06 18:27:00.0 +0200
+++ rx.ops  2003-10-20 23:08:24.0 +0200
@@ -108,7 +108,7 @@
rx_literal S0, I1, "b", $next
branch $top
$backtrack:
-   rx_popindex S0, I1, $advance
+   rx_popindex I1, $advance
$next:
rx_oneof S0, I1, "cd", $backtrack
branch $success




--
 stef


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> Since this has come up again, ...

[ FYI: I had started implementing this, based on a general traverse
   vtable with callback functions. Two patches got backed out by
   Dan after some discussion in PM ]

> ... and it's apparent that the last time around
> I wasn't sufficiently clear, it's time to go through this again, and for
> the final time.

I'd be really happy, if you could go through my concerns mentioned in
the summary in the thread:
  Subject: Re: [RfC] vtable->dump
  Date: Thu, 4 Sep 2003 12:31:08 +0200

> ... (I will beat this thing into the ground by the time we're
> done)

Sorry for the inconvenience and being ignorant ...

>   PMC *thaw(interpreter, STRING *)

This should IMHO be able to create constant PMCs out of metadata, e.g.
for subroutine objects. So there should be some means to tell thaw() to
create PMC(s) in the constant_pmc_pool.

> The chill and warm runtime methods take a PMC or a frozen representation
> of a PMC (respectively) and provide a human readable version of that PMC.

I dunno, why chill() is superior to dump() or pretty_print(), but the
name doesn't really matter.

>   1) Freezing at the destruction level may *not* use any additional memory
>  for object traversal

What is "Freezing at the destruction level"? Is this anyhow related to
destruction ordering?

> Note that I do *not* want to have multiple object traversal systems in
> parrot! We have one for DOD, and proposals have ranged upwards from there.
> No. That is *not* happening--the chance for error is significant, the
> side-effects of the error annoying and tough to track down for complex
> cases (akin to the trouble with tracking down GC issues), and just not
> necessary. (Perhaps desirable for speed/space reasons, but desirable
> isn't necessary)

DOD's mark() routine has different requirements than a general
traverse() for freeze(), chill(), clone(), and destruction ordering.
Using just mark() will have these side effects that you want to avoid.

A general traverse() can be depth first or breadth first; mark() isn't
required to have any specific ordering as long as it sets live bits
everywhere.

mark() is called constantly in a running interpreter that does
non-trivial things. There are shortcuts for scalars, and DOD is highly
optimized not to destroy cache coherency. Using mark() also implies
backing out my small PMC patches. All the advantages of smaller scalars
are gone then.

While freeze() and friends have to pull each PMC into the cache, just
setting the live bit on a PMC doesn't. Further: Luke's proposal for
speeding up timely destruction puts objects either in front or at the
end of the next_for_GC chain. This IMHO implies that mark() is unusable
as your general and sole iterator.

> ... This is something that's hidden under a number of layers
> of API, so regardless of the outcome it doesn't affect the assembly, PMC,
> or runtime API.

So when it's hidden, I really don't understand why you are insisting on
an (IMHO) suboptimal design.

> The thread-safety is an issue,

While all schemes aren't thread-safe from user level (e.g.
manually sorting an array containing shared PMCs, while it gets
frozen), your scheme isn't thread-safe at low-level, as the next_for_GC
pointer inside the PMC is used as a duplicate marker. But if a user
changes shared resources it's a user problem. We only guarantee atomic
updates per PMC (see P6E p. 86f by Dan).

>   Dan

Comments addressing all these issues are highly welcome,
leo


Re: error in rx.ops example

2003-10-21 Thread Leopold Toetsch
Stéphane Payrard <[EMAIL PROTECTED]> wrote:
> rx_popindex signature is incorrect in the pre-patch example.

> - rx_popindex S0, I1, $advance
> + rx_popindex I1, $advance

Thanks, changed.

leo


pcc: the parrot C compiler (just a wrapper, don't expect big things :-)

2003-10-21 Thread Aldo Calpini
hello,

I was trying to debug the t/src tests, and realized that doing by hand
what Parrot::Test::c_output_(is|like) does is not really easy.

I wanted to compile the source code embedded in t/src/sprintf.t (the
third test, in my case) to see exactly where and how it was failing, and
possibly have a chance to debug it.

so, I ended up looking in lib/Parrot/Test.pm and writing a simple
script which mimics what that module does (just a wrapper that calls
gcc with all the settings from Parrot::Config).

I've called it pcc, and it should reside in the root parrot directory.
the source is merely 17 lines long, so I attach it at the end of this
mail.

now, I'm able to pull out the source code from t/src/sprintf.t, save
it, let's say, in many_printfs.c and then do:

$ ./pcc many_printfs.c
$ ./many_printfs

hope this helps :-)

$ cat pcc

#!/usr/bin/perl -w

use lib 'lib';
use Parrot::Config;

my $libparrot = $PConfig{blib_lib_libparrot_a};
$libparrot =~ s/\$\(A\)/$PConfig{a}/;

my $source_f = $ARGV[0] || die "no source specified\n";
(my $obj_f = $source_f) =~ s/\.c/$PConfig{o}/ie;
(my $exe_f = $source_f) =~ s/\.c/$PConfig{exe}/ie;
my $cmd = "$PConfig{cc} $PConfig{ccflags} $PConfig{cc_debug} ".
  " -I./include -c $PConfig{cc_o_out}$obj_f $source_f";
system("$cmd") && die "compile failed with exit code ".($?>>8)."\n";
$cmd = "$PConfig{link} $PConfig{linkflags} $PConfig{ld_debug} $obj_f ".
   "$PConfig{ld_out}$exe_f $libparrot $PConfig{libs}";
system("$cmd") && die "link failed with exit code ".($?>>8)."\n";
__END__


cheers,
Aldo

__END__
$_=q,just perl,,s, , another ,,s,$, hacker,,print;



Re: Taint mode testing and project Phalanx

2003-10-21 Thread Nicholas Clark
On Mon, Oct 20, 2003 at 10:27:34PM -0700, Michael G Schwern wrote:
> On Tue, Oct 21, 2003 at 12:24:03AM -0500, Dave Rolsky wrote:

> > Not to mention that it's buggy as hell.  For example, in various versions
> > of Perl I've used there have been rather serious bugs in the regex engine
> > when taint mode is on, even when dealing with untainted variables!
> 
> I've never hit anything like this.  Do you have examples?

http://rt.perl.org/rt2/Ticket/Display.html?id=24248

variations on the theme of

#!perl -T
{
  local $ENV{PATH} = "/bin";

  my $r = "foo";

  $ARGV[0] =~ /($r)/;

  my $c = "echo $1";
  system $c;
}
__END__

http://rt.perl.org/rt2/Ticket/Display.html?id=22270

where I don't agree with any of the explanations (IIRC) and stand by the
bug. (But ran out of time to find a better explanation)

Nicholas Clark


Re: No more code coverage

2003-10-21 Thread Tim Bunce
On Mon, Oct 20, 2003 at 11:05:38PM +0200, Paul Johnson wrote:
> On Mon, Oct 20, 2003 at 09:34:38PM +0100, Tony Bowden wrote:
> > On Mon, Oct 20, 2003 at 10:16:40PM +0200, Paul Johnson wrote:
> > > I wrote "database" in quotes because currently we are talking about a
> > > flat file, written using Data::Dumper and eval'd in.  I have considered
> > > other options - specifically YAML and Storable.  I have found YAML to be
> > > even slower and too buggy, and Storable to be less reliable.  (I never
> > > tracked down specific problems.)  This is an area that needs to be
> > > addressed.
> > 
> > Have you considered SQLite?
> 
> Initially I wanted something with few, or better yet no dependencies.  I
> also wanted something that required little or no work when I changed the
> internal data structures.
> 
> I'll compromise on both of these, and especially the latter, for
> something that is efficient and reliable.
> 
> I'll look into SQLite.

I'd caution against rushing in any particular direction without some
profiling information to back it up.

Having said that, I'd strongly recommend switching to Storable first.
It did have problems but it's now very robust and far, far, faster
than Data::Dumper+eval. This small change would yield a big gain.

The next step would be to get some profile information. There's
little point in doing that first as Data::Dumper+eval will dwarf
time spent elsewhere.

Tim.

p.s. Could someone suggest a pure-perl module with lots of tests as
a suitable testbed for Devel::Cover?


Re: No more code coverage

2003-10-21 Thread Andrew Savige
Tim Bunce wrote:
> p.s. Could someone suggest a pure-perl module with lots of tests as
> a suitable testbed for Devel::Cover? 

http://search.cpan.org/dist/Acme-EyeDrops has 22 test programs,
769 tests and no dependencies.

/-\




Re: No more code coverage

2003-10-21 Thread Michael G Schwern
On Tue, Oct 21, 2003 at 10:38:48PM +1000, Andrew Savige wrote:
> Tim Bunce wrote:
> > p.s. Could someone suggest a pure-perl module with lots of tests as
> > a suitable testbed for Devel::Cover? 
> 
> http://search.cpan.org/dist/Acme-EyeDrops has 22 test programs,
> 769 tests and no dependencies.

Test-Simple's another good one.  44 programs, 267 tests, no dependencies,
backwards compatible to 5.4.0, does some complex stuff (evals, %SIG,
tied handles, system()...).


-- 
Michael G Schwern  [EMAIL PROTECTED]  http://www.pobox.com/~schwern/
Here's some scholarly-ass opinions...


Re: Object freezing

2003-10-21 Thread Juergen Boemmels
Leopold Toetsch <[EMAIL PROTECTED]> writes:

> >   1) Freezing at the destruction level may *not* use any additional memory
> >  for object traversal

This is a really hard problem. In some early experiments with
destruction ordering (one of the problems which need iteration) I
couldn't get around allocating new memory or recursing on the
stack. It may be that we can get by with a second pointer, but
I'm not sure.

> What is "Freezing at the destruction level"? Is this anyhow related to
> destruction ordering?
> 
> > Note that I do *not* want to have multiple object traversal systems in
> > parrot! We have one for DOD, and proposals have ranged upwards from there.
> > No. That is *not* happening--the chance for error is significant, the
> > side-effects of the error annoying and tough to track down for complex
> > cases (akin to the trouble with tracking down GC issues), and just not
> > necessary. (Perhaps desirable for speed/space reasons, but desirable
> > isn't necessary)

I did some benchmarking (to test our hash implementation, but that's a
different story). One thing I found out: We are completely dominated
by gc. I'm not sure if it was trace_systemareas or the mark method,
but don't put any load on mark.

mark should be as fast as possible. The other uses of traverse for
freeze, dump, destruction-ordering etc. are all more or less called on
user request, so the user needs to know its cost.

One other thing makes mark different: if we ever want to use a
copying collector (which is not reachable currently because of the
conservative stack-walking), the mark routine needs to know about the
moving of objects. The other traverse routines never get this problem.

> DOD's mark() routine has different requirements than a general
> traverse() for freeze(), chill(), clone(), and destruction ordering.
> Using just mark() will have these side effects that you want to avoid.

My words exactly. mark() is not traverse(), although they do similar things.

> A general traverse() can be depth first or breadth first; mark() isn't
> required to have any specific ordering as long as it sets live bits
> everywhere.
> 
> mark() is called constantly in a running interpreter that does
> non-trivial things. There are shortcuts for scalars, and DOD is highly
> optimized not to destroy cache coherency. Using mark() also implies
> backing out my small PMC patches. All the advantages of smaller scalars
> are gone then.

This is just one more point about mark() speed.

> While freeze() and friends have to pull each PMC into the cache, just
> setting the live bit on a PMC doesn't. Further: Luke's proposal for
> speeding up timely destruction puts objects either in front or at the
> end of the next_for_GC chain. This IMHO implies that mark() is unusable
> as your general and sole iterator.
> 
> > ... This is something that's hidden under a number of layers
> > of API, so regardless of the outcome it doesn't affect the assembly, PMC,
> > or runtime API.
> 
> So when it's hidden, I really don't understand why you are insisting on
> an (IMHO) suboptimal design.

We have at the moment 15 (in words: fifteen) vtable slots for
dividing/remainder, 5 for multiplication, 24 for bitwise ops. So
bloating the vtable is by design, but it is the end of the world if we
special-case the most often called function and have 2 (in words: two)
walking functions. Sorry, I think there are other places in the vtable
which need some cleanup.

> > The thread-safety is an issue,
> 
> While all schemes aren't thread-safe from user level (e.g.
> manually sorting an array containing shared PMCs, while it gets
> frozen), your scheme isn't thread-safe at low-level, as the next_for_GC
> pointer inside the PMC is used as a duplicate marker. But if a user
> changes shared resources it's a user problem. We only guarantee atomic
> updates per PMC (see P6E p. 86f by Dan).

Thread safety is less of a problem for marking. It only needs to make
sure that other threads don't munge the data it is walking. Write
barriers or mutexes might help here. But how to freeze an object of
another thread? This needs to freeze the whole thread.

> > Dan
> 
> Comments addressing all these issues are highly welcome,
> leo

I think we should address this issue like experimentalists: Create the
general traverse function. (No, don't call it mark.) Implement freeze,
dump, destruction ordering using this function. When this all is
running, port the mark function to use this new
functionality. Benchmark, and watch the speedup of the brand-new
design (or just find out that the slowdown is not bad enough to
justify two walking functions). When the benchmarking is done, let's
decide if we need only one walk function, and only then remove the
mark function.

bye
boe
-- 
Juergen Boemmels  [EMAIL PROTECTED]
Fachbereich Physik  Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906
PGP Key finger

Re: [RfC] and [PATCH]: Libraries

2003-10-21 Thread Juergen Boemmels
Last week I sent this:

> I spent the last day getting parrot running under Borland. The
> attached patch is what's needed to get linking and running make test on
> both Windows/Borland and Linux/gcc. I'm not sure if it's ready for
> inclusion in the tree, but I want some feedback on the approach.

No feedback is not very much.
Sure, I can just commit this thing right away, but this patch changes
one fundamental thing: There is no longer one single (static)
libparrot.

So comments, please.
boe
-- 
Juergen Boemmels  [EMAIL PROTECTED]
Fachbereich Physik  Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F  23 F6 C7 2F 85 93 DD 47


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Mon, 20 Oct 2003, Melvin Smith wrote:

> At 04:38 PM 10/20/2003 -0400, Dan Sugalski wrote:
> >The encoding methods for freezing (and corresponding decoding methods for
> >thawing) may be overridden to provide an alternate serialization format.
> >The only requirement of the serialization format is that it starts with a
> >minimally valid piece of XML that encodes the format and version of the
> >serialized format. The rest of the serialization format need not be XML.
> >This is done because the format and version of the serialized data are
> >required in the stream, and making it XML inconveniences nobody and makes
> >the XML folks happy. It's good enough, and not up for discussion.
>
> Does that mean all encodings will start with the "standard" markup header:
>
> 

Each serialized data stream, yeah. Just once, and probably a few extra
characters in there, to make it legit XML.

Definitely not once per object, just once per stream--if you do:

  freeze S3, P5

and P5 happens to have a PMC that points to your top level symbol table,
the string in S3 will be darned huge, and have the  thing in
there exactly once. (Well, unless you've chosen an XML encoding, in which
case I expect you've just blown memory... :)

Dan


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Mon, 20 Oct 2003, Gregor N. Purdy wrote:

>
> > the xml header is only for the top level thing in the serialized
> > tree. if it is nonstandard you have to mark the serialized string so you
> > can call the matching thaw methods. each object in the serialized tree
> > will have to support that method or some code has to be supplied to
> > handle all the freeze/thaw calls made by the tree traverser code. so the
> > xml header is just a way to mark which external class will be used for
> > the freeze/thaw and it will always be called for each object in the
> > tree. you can't mix/match different freeze/thaw techniques in one
> > operation (yes, you could but then you do have to mark each node with
> > its technique which is a lot of overhead and painful in other ways).
>
> I find the notion of an "XML header" a bit confusing, given Dan's
> statement to the effect that it was a throw to XML folks.
>
> I think anything "XML folks" will be interested in will entail
> *wrapping* stuff, not *prefixing* it.

Nah, I expect what they'll want is for the entire data stream of
serialized objects to be in XML format. Which is fine--they can have that.
(It's why I mentioned the serialization routines can be overridden)

For an XML stream the header might be  with the rest of the stream in XML. A YAML stream would start
 with the rest in YAML, and the
binary format as . Or something
like that, modulo actual correct XML.

This way we have a single, fixed-format type/version header, which makes
the initial identification easier and less error-prone. (Possibly even
fit for file and programs of its ilk to note) The binary format won't
care, and the YAML format shouldn't care (as long as the indenting's
right) but the XML format would, so it seems to make sense to use the XML
stuff for the initial header.
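
A minimal sketch of that idea, in Python rather than Parrot's C for
brevity. The header syntax used here and the names HEADER_RE,
sniff_header and THAWERS are invented for illustration; nothing in this
thread fixes any of them. The only point is that one fixed-format
header, read once per stream, is enough to pick the decoder for
everything that follows.

```python
import re

# Hypothetical header layout: not a format defined anywhere in this thread.
HEADER_RE = re.compile(
    rb'^<\?parrot-freeze format="(?P<format>[a-z]+)" '
    rb'version="(?P<version>[0-9.]+)"\?>\n'
)


def sniff_header(stream: bytes):
    """Return (format, version, payload) for a frozen stream.

    Only the header has a fixed layout; the payload is opaque here and
    is handed untouched to whichever decoder the header names.
    """
    m = HEADER_RE.match(stream)
    if m is None:
        raise ValueError("missing or malformed freeze header")
    fmt = m.group("format").decode()
    version = m.group("version").decode()
    return fmt, version, stream[m.end():]


# Stand-in decoders, one per encoding; real ones would rebuild PMCs.
THAWERS = {
    "xml": lambda payload: ("xml", payload),
    "yaml": lambda payload: ("yaml", payload),
    "binary": lambda payload: ("binary", payload),
}


def thaw(stream: bytes):
    """Dispatch on the header, then let the chosen decoder see the rest."""
    fmt, _version, payload = sniff_header(stream)
    if fmt not in THAWERS:
        raise ValueError(f"no decoder for format {fmt!r}")
    return THAWERS[fmt](payload)
```

The binary and YAML decoders never see the header, which is why they
need not care that it happens to look like XML.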

Dan


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Leopold Toetsch wrote:

> Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > Since this has come up again, ...
>
> [ FYI: I had started implementing this, based on a general traverse
>    vtable with callback functions. Two patches got backed out by
>    Dan after some discussion in PM ]

Right, because you'd implemented some stuff I'd specifically said we
weren't doing, and didn't back them out any of the times I asked...

> > ... and it's apparent that the last time around
> > I wasn't sufficiently clear, it's time to go through this again, and for
> > the final time.
>
> I'd be really happy, if you could go through my concerns mentioned in
> the summary in the thread:

That's why I did this, in part. It's the plan, until declared otherwise.

> >   PMC *thaw(interpreter, STRING *)
>
> This should IMHO be able to create constant PMCs out of metadata, e.g.
> for subroutine objects. So there should be some means to tell thaw() to
> create PMC(s) in the constant_pmc_pool.

There should be a way to put PMCs in the constant pool in general. I was
thinking a constant op would work--something like

   constant Ix, [SP]y

to make the string or PMC Y a constant at slot X in the constant pool.
Passing in the PMC header to be filled in also works, though both fail if
you want full PMC trees marked as constants since thawing out a PMC stream
may involve creating multiple PMCs. (In which case we might be better
temporarily switching allocation pools at constant creation time, rather
than passing in PMCs)

> > The chill and warm runtime methods take a PMC or a frozen representation
> > of a PMC (respectively) and provide a human readable version of that PMC.
>
> I dunno, why chill() is superior to dump() or pretty_print(), but the
> name doesn't really matter.

The important thing is that it's not a vtable method. It's a function that
belongs in the freeze/thaw API as it's just an alternate encoding or
decoding. (Arguably it ought not be a separate API entry at all and just
another encoding scheme, but that requires transcoding serialization
forms, and I'd rather not get into that)

> >   1) Freezing at the destruction level may *not* use any additional memory
> >  for object traversal
>
> What is "Freezing at the destruction level"? Is this anyhow related to
> destruction ordering?

No. There are some valid cases where an object, after having been declared
dead by the DOD, wants to serialize itself. Persistent object stores
apparently do this, and it makes a certain amount of sense--when the
object goes out of scope the current state is flushed to disk.

It puts a number of unpleasant constraints on the core freeze routines.
User code can violate them and take the consequences, but we can't.

> > Note that I do *not* want to have multiple object traversal systems in
> > parrot! We have one for DOD, and proposals have ranged upwards from there.
> > No. That is *not* happening--the chance for error is significant, the
> > side-effects of the error annoying and tough to track down for complex
> > cases (akin to the trouble with tracking down GC issues), and just not
> > necessary. (Perhaps desirable for speed/space reasons, but desirable
> > isn't necessary)
>
> DOD's mark() routine has different requirements than a general
> traverse() for freeze(), chill(), clone(), and destruction ordering.
> Using just mark() will have these side effects that you want to avoid.

The only thing that mark does that the general traversal doesn't, in the
abstract, is flip the object's live flag. Everything else is an
optimization of code which we can, if we need, discard.

> A general traverse() can be depth first or breadth first; mark() isn't
> required to have any specific ordering as long as it sets live bits
> everywhere.

I'm pretty sure that with a singly linked list we can get a generally
properly-ordered flattened tree without having to do an insane number of
passes across the dead object store. I may be incorrect in this, but I
don't think so, and for our purposes the live bit can be safely ignored if
we end up setting it, though potentially with another pass over the dead
store, which may end up prohibitively expensive. We'll see.

> mark() is called constantly in a running interpreter that does
> non-trivial things. There are shortcuts for scalars, and DOD is highly
> optimized not to destroy cache coherency. Using mark() also implies
> backing out my small PMC patches. All the advantages of smaller scalars
> are gone then.

All of this stuff for freezing is going to end up killing the small PMC
patch anyway, unfortunately, since we're going to have to be able to
traverse PMCs in the destruction phase, which means we have to have the
means of traversal at hand as we can't guarantee that we can allocate more
PMCs or resize the PMCs ext data.

> While freeze() and friends have to pull each PMC into the cache, just
> setting the live bit on a PMC doesn't. Further: Luke's proposal for
> speeding up timel

Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:

> At 08:21 -0400 10/21/03, Dan Sugalski wrote:
> >  > I find the notion of an "XML header" a bit confusing, given Dan's
> >>  statement to the effect that it was a throw to XML folks.
> >>
> >>  I think anything "XML folks" will be interested in will entail
> >>  *wrapping* stuff, not *prefixing* it.
> >
> >Nah, I expect what they'll want is for the entire data stream of
> >serialized objects to be in XML format. Which is fine--they can have that.
> >(It's why I mentioned the serialization routines can be overridden)
> >
> >For an XML stream the header might be  >version=1.0> with the rest of the stream in XML. A YAML stream would start
> > with the rest in YAML, and the
> >binary format as . Or something
> >like that, modulo actual correct XML.
>
> If you want that to be looking like valid XML, it would have to be different:
>
> error: Specification mandate value for attribute parrot
> 
>^
> Better in my opinion would be something like:
>
> data yadda yadda yadda

I'm not an XML guy, and I'm making all this up as I go along. If that's
better, fine with me. :)

> >This way we have a single, fixed-format type/version header, which makes
> >the initial identification easier and less error-prone. (Possibly even
> >fit for file and programs of its ilk to note) The binary format won't
> >care, and the YAML format shouldn't care (as long as the indenting's
> >right) but the XML format would, so it seems to make sense to use the XML
> >stuff for the initial header.
>
> So are we talking about a header or a wrapper?  If it is really a
> header, it's not XML and then it's pretty useless from an XML point
> of view.

We're talking about the first thing in a file (or stream, or whatever). I
was under the impression that XML files should be entirely composed of
valid XML, hence the need for the stream type marker being valid XML. YAML
doesn't care as much, so far as I understand, and for our own internal
binary format we can do whatever we want. If that's not true, then we can
go for a more compact header.

Note that the serialized stream will be different depending on the encoder
chosen. If you have the structure:

  $bar = 1;
  @foo[0] = \$bar;
  @foo[1] = "Baz";

The XML stream serializing @foo might look like:

  
  
 
   PerlArray
 
 
bar
Baz
 
  
  

  PerlInt


  1

  

Only not inevitably horribly broken, invalid, and poorly done. :) The YAML
form might look like

  
  PMC: foo
    type: PerlArray
    values:
      pmc: bar
      string: Baz
  PMC: bar
    type: PerlInt
    values:
      integer: 1

Once again, modulo my limited and inevitably incorrect YAML knowledge. So
if the header says it's XML the whole thing is valid XML, while if it
doesn't the rest of the stream doesn't have to be. (Just enough of the
header so that an XML processing program can examine the stream and decide
that the valid XML chunk at the beginning says that the rest of the
stream's not XML)

Basically we want some nice, fixed (mostly) thing at the head of the
stream that doesn't vary regardless of the way the stream is encoded, and
XML seemed to be the most restrictive of the forms I know people will
clamor for. (I know, it means the stream can't be valid Lisp-style sexprs,
but XML's more widespread :)

Dan


Object instantiation

2003-10-21 Thread Dan Sugalski
After thinking about this a bit, it became glaringly obvious that the
right way to instantiate an object for class "Foo" is to do:

  new P5, .Foo

Or whatever the constant value assigned to the Foo class upon its creation
is. When a class is created, it should be assigned a number, and for most
things PMC-only classes or full-on HLL classes should behave identically.
Duh.
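
A minimal sketch of the numbering idea, in Python rather than Parrot's
C. ClassRegistry, register and new are invented names, and handing out
numbers in creation order is an assumption; the only things taken from
the above are that every class gets a number when it is created and
that objects are instantiated through that number, whether the class
is PMC-only or a full HLL class.

```python
class ClassRegistry:
    """Hands out one integer per class, in creation order (an assumption)."""

    def __init__(self):
        self._by_number = []   # class number -> constructor
        self._by_name = {}     # class name   -> class number

    def register(self, name, constructor):
        """Assign the next free number to a newly created class."""
        if name in self._by_name:
            raise ValueError(f"class {name!r} already registered")
        number = len(self._by_number)
        self._by_number.append(constructor)
        self._by_name[name] = number
        return number

    def number_of(self, name):
        return self._by_name[name]

    def new(self, number):
        """Instantiate by number; the same path serves every kind of class."""
        return self._by_number[number]()


registry = ClassRegistry()

# Stands in for a PMC-only class.
PERL_INT = registry.register("PerlInt", int)


class Foo:
    """Stands in for a full HLL class."""


FOO = registry.register("Foo", Foo)
```

With that in place, registry.new(FOO) plays the role of "new P5, .Foo".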

One more thing down--now to actually make it work out...

Dan


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> On Tue, 21 Oct 2003, Leopold Toetsch wrote:

[ thaw ]

>> This should IMHO be able to create constant PMCs out of metadata, e.g.
>> for subroutine objects. So there should be some means to tell thaw() to
>> create PMC(s) in the constant_pmc_pool.

> There should be a way to put PMCs in the constant pool in general. I was
> thinking a constant op would work--something like

>constant Ix, [SP]y

> to make the string or PMC Y a constant at slot X in the constant pool.

You can append items to the constant table. You can't declare existing
items as constant, because you can't change the underlying object pool,
where the object was allocated. This would change the object's address.

> Passing in the PMC header to be filled in also works, though both fail if
> you want full PMC trees marked as constants since thawing out a PMC stream
> may involve creating multiple PMCs. (In which case we might be better
> temporarily switching allocation pools at constant creation time, rather
> than passing in PMCs)

These are either serious shortcomings or unneeded workarounds. An extra
parameter to relevant vtables can take care of such special cases.

>> I dunno, why chill() is superior to dump() or pretty_print(), but the
>> name doesn't really matter.

> The important thing is that it's not a vtable method.

Ah, that's the difference. How shall the system pretty-print dynamically
loaded PMCs then, when only a bytecode-stream is available? IMHO only a
vtable in the class can perform that job.

>> >   1) Freezing at the destruction level may *not* use any additional memory
>> >  for object traversal

> It puts a number of unpleasant constraints on the core freeze routines.

Constructing the frozen stream definitely needs memory. I don't see how
that differs from the memory consumed by a seen hash. Can you please
elaborate a bit more on this?

> The only thing that mark does that the general traversal doesn't, in the
> abstract, is flip the object's live flag. Everything else is an
> optimization of code which we can, if we need, discard.

Yes, mark() can be written in terms of a general traverse, which gets a
vtable function (and a data pointer). mark is basically traverse(mark,
0). But this isn't true the other way round. You can't do freeze based
on the mark iterator. How do you pass the desired output format?
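
A minimal sketch of that relationship, in Python rather than Parrot's C.
The PMC class, the visit callbacks and the record format are all
invented for illustration. Note that this sketch allocates a seen set
while walking, which is exactly the cost under dispute in this thread;
it shows the shape of the API, not a memory-neutral implementation.

```python
class PMC:
    """Toy stand-in for a PMC: a name, child references and a live bit."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.live = False


def traverse(root, visit, data):
    """Depth-first walk calling visit(pmc, data) once per reachable PMC."""
    seen = set()
    stack = [root]
    while stack:
        pmc = stack.pop()
        if id(pmc) in seen:
            continue
        seen.add(id(pmc))
        visit(pmc, data)
        # Push in reverse so children are visited left to right.
        stack.extend(reversed(pmc.children))


def mark_visit(pmc, _data):
    pmc.live = True


def freeze_visit(pmc, out):
    # The data pointer is how the desired output format gets through.
    out["records"].append(f"{out['format']}:{pmc.name}")


def mark(root):
    """mark is traverse with the mark callback and no data."""
    traverse(root, mark_visit, None)


def freeze(root, fmt):
    out = {"format": fmt, "records": []}
    traverse(root, freeze_visit, out)
    return out["records"]
```

The reverse direction has no such hook: a mark() that only flips live
bits offers nowhere to hang the output format.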

>> mark() is called constantly in a running interpreter that does
>> non-trivial things. There are shortcuts for scalars, and DOD is highly
>> optimized not to destroy cache coherency. Using mark() also implies
>> backing out my small PMC patches. All the advantages of smaller scalars
>> are gone then.

> All of this stuff for freezing is going to end up killing the small PMC
> patch anyway, unfortunately, since we're going to have to be able to
> traverse PMCs in the destruction phase, which means we have to have the
> means of traversal at hand as we can't guarantee that we can allocate more
> PMCs or resize the PMCs ext data.

A scalar can't contain or reference other PMCs, so it can't be a
potential source of freeze loops. Whether I spit out (PMC: Int, ID=xy,
value=5) twice or (PMC: ID=other) doesn't really matter. thaw() can take
care of duplicates, if needed. Other PMCs have the next_for_GC pointer.

Still, I'm not convinced that we can't have a seen hash.
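
A minimal sketch of the seen-hash variant, in Python rather than
Parrot's C. Python lists stand in for aggregate PMCs and ints/strings
for scalars; the record layout ("array", "ref", "value") is invented
for illustration. Aggregates get an ID on first visit and a
back-reference afterwards; scalars never enter the seen table and are
simply written by value each time.

```python
def freeze(root):
    """Flatten a graph of lists and scalars into a list of records.

    records[i] describes the aggregate with ID i. Loops and shared
    aggregates become ("ref", id) entries instead of being re-walked.
    """
    seen = {}       # id(aggregate) -> assigned ID
    records = []

    def walk(obj):
        if not isinstance(obj, list):
            # Scalars cannot reference anything: emit by value.
            return ("value", obj)
        if id(obj) in seen:
            return ("ref", seen[id(obj)])
        ident = seen[id(obj)] = len(seen)
        records.append(None)              # reserve the slot for this ID
        records[ident] = ("array", [walk(item) for item in obj])
        return ("ref", ident)

    walk(root)
    return records
```

The seen table grows with the number of aggregates only, which is the
memory whose necessity is being questioned here.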

> YHO, in this case, turns out to not consider all the issues involved.

That might very well be true, yes. So it would be fine, if you could fill
the gaps.

>   Dan

leo


Re: Object instantiation

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> After thinking about this a bit, it became glaringly obvious that the
> right way to instantiate an object for class "Foo" is to do:

>   new P5, .Foo

> Or whatever the constant value assigned to the Foo class upon its creation
> is. When a class is created, it should be assigned a number, and for most
> things PMC-only classes or full-on HLL classes should behave identically.

Yep. The question does arise which range the class enums are in.
Intermixed with the enum_class_ numbers?

And - what about:

  typeof S0, P0  <=> classname S0, P0

(IMHO the HLL compiler can't always know, which op to use)

And the classname of objects vs the classname of classes (the classname
PMC is in different array slots).

>   Dan

leo


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Leopold Toetsch wrote:

> Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > On Tue, 21 Oct 2003, Leopold Toetsch wrote:
>
> [ thaw ]
>
> >> This should IMHO be able to create constant PMCs out of metadata, e.g.
> >> for subroutine objects. So there should be some means to tell thaw() to
> >> create PMC(s) in the constant_pmc_pool.
>
> > There should be a way to put PMCs in the constant pool in general. I was
> > thinking a constant op would work--something like
>
> >constant Ix, [SP]y
>
> > to make the string or PMC Y a constant at slot X in the constant pool.
>
> You can append items to the constant table. You can't declare existing
> items as constant, because you can't change the underlying object pool,
> where the object was allocated. This would change the object's address.

The object's address should be irrelevant for the constant table. PMCs are
referenced in the opstream by table offset. This offset can be into a PMC
pool, or into a pointer table. While the pointer table has an extra level
of indirection to it, it adds flexibility and takes some pressure off of
the ordering of PMCs for instantiated constants.
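
A minimal C sketch of the pointer-table layout described above. All names
here (PMC as a one-field struct, ConstTable, const_append, const_lookup)
are hypothetical stand-ins and are not taken from the Parrot source. The
point it illustrates is only that the opstream can carry a slot number
while the table carries a pointer, so giving an existing PMC a constant
slot does not have to move it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PMC header; the real struct is larger. */
typedef struct PMC { int value; } PMC;

#define CONST_SLOTS 16

/* Pointer-table layout: each slot points at a PMC that lives elsewhere. */
typedef struct ConstTable {
    PMC   *slot[CONST_SLOTS];
    size_t used;
} ConstTable;

/* Give an already-allocated PMC a slot. The PMC is neither copied nor
 * moved, so its address stays valid for anything else that refers to it.
 * Returns the slot number, or -1 when the table is full. */
static int const_append(ConstTable *t, PMC *p)
{
    assert(t != NULL && p != NULL);
    if (t->used >= CONST_SLOTS)
        return -1;
    t->slot[t->used] = p;
    return (int)t->used++;
}

/* What an op would do with the slot number found in the opstream:
 * one extra indirection, then the PMC itself. */
static PMC *const_lookup(const ConstTable *t, int idx)
{
    assert(t != NULL);
    if (idx < 0 || (size_t)idx >= t->used)
        return NULL;
    return t->slot[idx];
}
```

The cost is the extra load per lookup that the paragraph above mentions;
the sketch says nothing about which arena the PMC itself should be
allocated from.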

> > Passing in the PMC header to be filled in also works, though both fail if
> > you want full PMC trees marked as constants since thawing out a PMC stream
> > may involve creating multiple PMCs. (In which case we might be better
> > temporarily switching allocation pools at constant creation time, rather
> > than passing in PMCs)
>
> These are either serious shortcomings or unneeded workarounds. An extra
> parameter to relevant vtables can take care of such special cases.

Not necessarily, no. The number of PMCs that are reconstituted for a set
of constant frozen PMCs is indeterminate. If we're instantiating bytecode
with constant PMCs in it, it's possible the class that backs those PMCs has
changed and things instantiate differently than they might otherwise do.

If we've frozen 20 PMCs, all we can guarantee is that when we thaw them
we've got at least 20 PMCs, though we may have more, and the extras
arguably should be allocated from the constant PMC arena (though not given
slots in the constant table) so we can skip scanning the constant arenas
for dead objects needing cleanup.

> >> I dunno, why chill() is superior to dump() or pretty_print(), but the
> >> name doesn't really matter.
>
> > The important thing is that it's not a vtable method.
>
> Ah, that's the difference. How shall the system pretty-print dynamically
> loaded PMCs then, when only a bytecode-stream is available? IMHO only a
> vtable in the class can perform that job.

If the dynamically loaded PMC class doesn't have a backing Parrot class,
you can't, and you get the default, relatively primitive dump.

> >> >   1) Freezing at the destruction level may *not* use any additional memory
> >> >  for object traversal
>
> > It puts a number of unpleasant constraints on the core freeze routines.
>
> Constructing the frozen stream definitely needs memory. I don't see the
> difference, to memory consumed by a seen hash. Can you please elaborate
> a bit more on this.

Constructing the frozen stream will need some memory, yes. At the moment
all it needs is a chunk of random memory and that's it, so we may well
fail because we're out of memory. We may, however, have general pool
memory handy. We can't guarantee that we have *any* headers, however,
since we can legitimately be called from within the destruct phase of a
DOD run, which may have been triggered by an out-of-headers condition.

Depending on how we flesh things out freezing may also not require any
additional memory--if we relax the requirement for freezing to allow the
output to be a PMC, we may be backed directly to a file or other storage
that doesn't involve RAM allocation.

> > The only thing that mark does that the general traversal doesn't, in the
> > abstract, is flip the object's live flag. Everything else is an
> > optimization of code which we can, if we need, discard.
>
> Yes, mark() can be written in terms of a general traverse, which gets a
> vtable function (and a data pointer). mark is basically traverse(mark,
> 0). But this isn't true the other way round. You can't do freeze based
> on the mark iterator. How do you pass the desired output format?

What does the desired output format have to do with any of this? All
marking does is put things on the list of PMCs to be visited if it hasn't
already been visited, so we get to that PMC at some point as we walk the
visited list. In the context of the DOD sweep it also sets the live flag,
but we could, if we chose, skip that and use the presence of a non-NULL
value in the mark chain address for a PMC as an indicator of liveness.
(Though yes, I realize that this means potentially skipping some of the
optimizations, so I'm not proposing it as a requirement for the DOD sweep
implementation)

> >> mark() is called permanently in a running interpreter, that does non
> >> trivial things. There ar

Re: Object instantiation

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Leopold Toetsch wrote:

> Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > After thinking about this a bit, it became glaringly obvious that the
> > right way to instantiate an object for class "Foo" is to do:
>
> >   new P5, .Foo
>
> > Or whatever the constant value assigned to the Foo class upon its creation
> > is. When a class is created, it should be assigned a number, and for most
> > things PMC-only classes or full-on HLL classes should behave identically.
>
> Yep. The question does arise: in which range are the class enums? Intermixed
> with enum_class_ numbers?

Yes, intermixed. I added support a while back to pass in the class number
to a PMC class being initialized for this very reason. The compiled-in
PMCs get fixed numbers at the beginning because it's easiest, and things
get referenced symbolically from there.

> And - what about:
>
>   typeof S0, P0  <=> classname S0, P0
>
> (IMHO the HLL compiler can't always know, which op to use)

At this point they're the same thing, I think. I'll need to think on it a
bit.

> And the classname of objects vs the classname of classes (the classname
> PMC is in different array slots).

Last I knew there was something of a fight over what class a class is in.
At the moment I'm going to ignore the heck out of things and let the
language folks fight over it some more.

Dan


Re: Object instantiation

2003-10-21 Thread Melvin Smith
Try:

new P0, 'std::array'  # PMC
new P1, 'Perl::PerlArray'# PMC (or class)
new P2, 'Package::SomeClass'   # Class

At compile time the string can be converted to an integer enumerator.

-Melvin






Leopold Toetsch <[EMAIL PROTECTED]>
10/21/2003 10:24 AM
Please respond to lt

 
To: [EMAIL PROTECTED] (Dan Sugalski)
cc: [EMAIL PROTECTED]
Subject:Re: Object instantiation



Dan Sugalski <[EMAIL PROTECTED]> wrote:
> After thinking about this a bit, it became glaringly obvious that the
> right way to instantiate an object for class "Foo" is to do:

>   new P5, .Foo

> Or whatever the constant value assigned to the Foo class upon its creation
> is. When a class is created, it should be assigned a number, and for most
> things PMC-only classes or full-on HLL classes should behave identically.

Yep. The question does arise: in which range are the class enums? Intermixed
with enum_class_ numbers?

And - what about:

  typeof S0, P0  <=> classname S0, P0

(IMHO the HLL compiler can't always know, which op to use)

And the classname of objects vs the classname of classes (the classname
PMC is in different array slots).

>  Dan

leo




Re: Object freezing

2003-10-21 Thread Melvin Smith
On Tue, 21 Oct 2003, Leopold Toetsch wrote:

> Albeit I'm not convinced, that we can't have a seen hash.

A seen hash most likely would:

1) Kill GC performance especially in pathological cases. The GC
   should be quiet and invisible.
2) Cause memory usage to double upon a mark run.


-Melvin




A less controversial API addition

2003-10-21 Thread Dan Sugalski
While we're fighting^Wdiscussing the freezing system, there's a simpler
thing we need to have added in. We need an API entry point that allows C
code to invoke a sub/method PMC. This needs to be done both for the
embedding API (we'll wrap it) where the embedding app will call in, but
also for things like vtable functions where the actual function is parrot
bytecode.

Calling straight into runops looks a little too simplistic, but this'd be
a good place to poke around and see what you can come up with.

Dan


Re: Object instantiation

2003-10-21 Thread Jeff Clites
On Oct 21, 2003, at 7:14 AM, Dan Sugalski wrote:

> After thinking about this a bit, it became glaringly obvious that the
> right way to instantiate an object for class "Foo" is to do:
>
>   new P5, .Foo
>
> Or whatever the constant value assigned to the Foo class upon its creation
> is. When a class is created, it should be assigned a number, and for most
> things PMC-only classes or full-on HLL classes should behave identically.
> Duh.

That makes sense. What I keep wondering is what about things with the 
semantics of Perl5, in which new objects aren't instantiated 
directly--already-allocated things later become associated with a 
class. This doesn't seem quite like a case of morphing, since for 
instance a Perl array can be blessed into a class, but it's still a 
Perl array.

JEff



Re: Object freezing

2003-10-21 Thread Juergen Boemmels
Dan Sugalski <[EMAIL PROTECTED]> writes:

[...]

> > > The chill and warm runtime methods take a PMC or a frozen representation
> > > of a PMC (respectively) and provide a human readable version of that PMC.
> >
> > I dunno, why chill() is superior to dump() or pretty_print(), but the
> > name doesn't really matter.
> 
> The important thing is that it's not a vtable method. It's a function that
> belongs in the freeze/thaw API as it's just an alternate encoding or
> decoding. (Arguably it ought not be a separate API entry at all and just
> another encoding scheme, but that requires transcoding serialization
> forms, and I'd rather not get into that)

This is really just a naming problem. Dan wants to call the
vtable-function freeze and have different encodings for all kinds of
dumping/pretty_printing/marking. Leo calls the function traverse and
controls it by callbacks.

My personal opinion on this naming problem is: traverse describes more
generally what the function does. Marking live objects by freezing them
in an encoding that does return nothing just sounds plain wrong.

Freeze should be just a user of the general traverse function. (And
this does mean it is also no vtable function)

STRING *freeze(PMC *pmc, whatever *encoding)
{
   return (STRING *)pmc->vtable->traverse(pmc, freeze_callbacks, encoding);
}

or even the freeze_encodings are callback_sets: freeze_xml,
freeze_yaml, freeze_binary, whatever.

>>>   1) Freezing at the destruction level may *not* use any additional memory
>>>  for object traversal
>>
>> What is "Freezing at the destruction level"? Is this anyhow related to
>> destruction ordering?
> 
> No. There are some valid cases where an object, after having been declared
> dead by the DOD, wants to serialize itself. Persistent object stores
> apparently do this, and it makes a certain amount of sense--when the
> object goes out of scope the current state is flushed to disk.

This is a question of what is allowed at destruction time. You don't
want to allow memory allocation, but allow freezing. That gets hard,
because you need to at least allocate the STRING where you want to put
your frozen stream.

> It puts a number of unpleasant constraints on the core freeze routines.
> User code can violate them and take the consequences, but we can't.

We can call (hopefully) arbitrary user code in destruction routines. So
this argument does not count.

>>> Note that I do *not* want to have multiple object traversal systems in
>>> parrot! We have one for DOD, and proposals have ranged upwards from there.
>>> No. That is *not* happening--the chance for error is significant, the
>>> side-effects of the error annoying and tough to track down for complex
>>> cases (akin to the trouble with tracking down GC issues), and just not
>>> necessary. (Perhaps desirable for speed/space reasons, but desirable
>>> isn't necessary)

Freeze is just another traversal method. Just calling it freeze
instead of traverse does not change this fact. You can limit the power
of encodings, but this does not change the fact that you need to walk
all children

>> DOD's mark() routine has different requirements than a general
>> traverse() for freeze(), chill(), clone(), and destruction ordering.
>> Using just mark() will have these side effects that you want to avoid.
> 
> The only thing that mark does that the general traversal doesn't, in the
> abstract, is flip the object's live flag. Everything else is an
> optimization of code which we can, if we need, discard.

mark() may be implemented in form of a general traverse. Let the
profiler decide if a special purpose mark() or a general traverse is
better.

>> A general traverse() can be depth first or breadth first, mark() isn't
>> required to have any specific ordering as long as it sets live bits
>> everywhere.
> 
> I'm pretty sure that with a singly linked list we can get a generally
> properly-ordered flattened tree without having to do an insane number of
> passes across the dead object store. I may be incorrect in this, but I
> don't think so, and for our purposes the live bit can be safely ignored if
> we end up setting it, though potentially with another pass over the dead
> store, which may end up prohibitively expensive. We'll see.

I'm pretty sure that a singly linked list is not enough. I had done
some experiments with this. One pass may be enough, but you need to
keep track of the tree-traversal and of the partially ordered
list. These two things don't play well together. Maybe this can be cut
down to two lists, or one list and one bit per pmc.
 
>> mark() is called permanently in a running interpreter, that does non
>> trivial things. There are shortcuts for scalars, DOD is highly optimized
>> not to destroy cache coherency. Using mark() also implies to back out
>> my small PMC patches. All the advantages of smaller scalars are gone
>> then.
> 
> All of this stuff for freezing is going to end up killing the small PMC
> patch anyway, unfortunately, since we're going

Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Juergen Boemmels wrote:

> Dan Sugalski <[EMAIL PROTECTED]> writes:
>
> [...]
>
> > > > The chill and warm runtime methods take a PMC or a frozen representation
> > > > of a PMC (respectively) and provide a human readable version of that PMC.
> > >
> > > I dunno, why chill() is superior to dump() or pretty_print(), but the
> > > name doesn't really matter.
> >
> > The important thing is that it's not a vtable method. It's a function that
> > belongs in the freeze/thaw API as it's just an alternate encoding or
> > decoding. (Arguably it ought not be a separate API entry at all and just
> > another encoding scheme, but that requires transcoding serialization
> > forms, and I'd rather not get into that)
>
> This is really just a naming problem. Dan wants to call the
> vtable-function freeze and have different encodings for all kinds of
> dumping/pretty_printing/marking. Leo calls the function traverse and
> controls it by callbacks.

It's more than just a naming issue (or if it is, then traverse is the
wrong name). The traversal must be done externally, since we can't be
recursive.

Mark puts a PMC on the list of PMCs to be frozen. Freeze dumps the
PMC being frozen (and *only* that PMC) to the stream. The freeze routine
for a PMC must mark (generally indirectly by calling the "add this pmc to
the stream" api function) any PMCs that it needs to be in the stream.

The external function that traverses this list of PMCs to be dumped is
responsible for making sure there are no duplicates--the easiest way is to
do what the DOD sweep does and note that a PMC has already been put on the
list and thus not mark it.

Mark and freeze are separate, though related by the subsystems that use
them.
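
A minimal C sketch of the scheme just described, under stated
assumptions: Node is a hypothetical, stripped-down PMC, next_todo plays
the role the thread gives to the next_for_GC pointer, and queued is the
"already on the list" note. None of these names come from the Parrot
source. The traversal is an external loop over an intrusive list, so it
neither recurses nor allocates, and a PMC is written to the stream
exactly once even when the graph has loops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_KIDS 4

typedef struct Node {
    int          id;
    int          value;
    struct Node *kid[MAX_KIDS];   /* PMCs this one refers to */
    size_t       n_kids;
    struct Node *next_todo;       /* intrusive "to be frozen" list link */
    int          queued;          /* set once the node is on the list */
} Node;

typedef struct Freezer {
    Node  *tail;                  /* last node on the list */
    char  *out;                   /* caller-provided stream buffer */
    size_t cap;
    size_t len;
} Freezer;

/* Append text to the stream, truncating silently if the buffer is full. */
static void emit(Freezer *f, const char *s)
{
    size_t room = f->cap - 1 - f->len;    /* keep one byte for the NUL */
    size_t n = strlen(s);
    if (n > room)
        n = room;
    memcpy(f->out + f->len, s, n);
    f->len += n;
    f->out[f->len] = '\0';
}

/* "Add this PMC to the stream": queue it unless it is already queued. */
static void freeze_mark(Freezer *f, Node *n)
{
    if (n == NULL || n->queued)
        return;
    n->queued = 1;
    n->next_todo = NULL;
    f->tail->next_todo = n;
    f->tail = n;
}

/* Dump this node, and only this node; children are referenced by id. */
static void freeze_one(Freezer *f, Node *n)
{
    char buf[32];
    size_t i;
    snprintf(buf, sizeof buf, "[%d=%d", n->id, n->value);
    emit(f, buf);
    for (i = 0; i < n->n_kids; i++) {
        snprintf(buf, sizeof buf, " >%d", n->kid[i]->id);
        emit(f, buf);
        freeze_mark(f, n->kid[i]);
    }
    emit(f, "]");
}

/* External driver: walks the list while freeze_one() extends it.
 * Returns the number of nodes written. The queued flags are left set;
 * a real implementation would have to clear them afterwards. */
static size_t freeze(Node *root, char *out, size_t cap)
{
    Freezer f;
    Node *cur;
    size_t count = 0;
    assert(root != NULL && out != NULL && cap > 0);
    f.out = out; f.cap = cap; f.len = 0;
    out[0] = '\0';
    root->queued = 1;
    root->next_todo = NULL;
    f.tail = root;
    for (cur = root; cur != NULL; cur = cur->next_todo) {
        freeze_one(&f, cur);
        count++;
    }
    return count;
}
```

For three nodes where 1 refers to 2 and 3, and 2 refers back to 1 and to
3, this produces "[1=10 >2 >3][2=20 >1 >3][3=30]" given values 10, 20,
30: the loop between 1 and 2 shows up as an id reference, not as a second
copy. The sketch does not address the ordering question raised elsewhere
in the thread.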

> This is a question of what is allowed at destruction time. You don't
> want to allow memory allocation, but allow freezing. That gets hard,
> because you need to at least allocate the STRING where you want to put
> your frozen stream.

It's more a question of what we require the engine to do, vs what user
code is allowed to do. A user program is allowed to write code that can
fail at destroy time, however the infrastructure we provide (including, in
this case, freezing--while I don't like it there's no choice) can't fail
that way. It's the reason the DOD and GC systems don't allocate memory (or
didn't--they shouldn't) when they run. The engine's not allowed to have
failure modes in critical sections.

Basically the engine may fail because of user code, but user code can't
fail because of the engine. It makes some things annoyingly restrictive,
but some problems are inherently annoyingly restrictive.

> > It puts a number of unpleasant constraints on the core freeze routines.
> > User code can violate them and take the consequences, but we can't.
>
> We can call (hopefully) arbitrary user code in destruction routines. So
> this argument does not count.

See above. User code can fail, we can't.

> >> A general traverse() can be depth first or breadth first, mark() isn't
> >> required to have any specific ordering as long as it sets live bits
> >> everywhere.
> >
> > I'm pretty sure that with a singly linked list we can get a generally
> > properly-ordered flattened tree without having to do an insane number of
> > passes across the dead object store. I may be incorrect in this, but I
> > don't think so, and for our purposes the live bit can be safely ignored if
> > we end up setting it, though potentially with another pass over the dead
> > store, which may end up prohibitively expensive. We'll see.
>
> I'm pretty sure that a singly linked list is not enough. I had done
> some experiments with this. One pass may be enough, but you need to
> keep track of the tree-traversal and of the partially ordered
> list. These two things don't play well together. Maybe this can be cut
> down to two lists, or one list and one bit per pmc.

There may be a little more infrastructure--I've not dug out the algorithms
books and gone hunting. The common algorithms tend to cheat by just
dodging the whole problem. :)

> Destruction ordering just enforces that small PMCs can't have
> destructors. If they have destructors they must be big, big enough to
> construct the ordered list of objects without allocating any memory.

Can't have destructors *or* refer to PMCs that may either have a
destructor or (indirectly) refer to a PMC that has a destructor.

If we have 2 PMCs with destructors they may be connected by a chain of 100
PMCs that don't, but we still need to walk that chain.

> If you think about it: The call to the destructors is done after
> free_unused_pobjects completed. The memory of the objects without
> destructors is already freed.

Then we reorder. This can't happen, and it didn't used to happen--if
that's how it works now then there's a bug in the DOD system. *All*
destructors *must* be called before any headers are collected.

> >> While freeze() and friends have to pull in each PMC into the cache, just
> >> setting the 

Re: Object freezing

2003-10-21 Thread Jeff Clites
On Oct 21, 2003, at 5:53 AM, Dan Sugalski wrote:

>>> Note that I do *not* want to have multiple object traversal systems in
>>> parrot! We have one for DOD, and proposals have ranged upwards from there.
>>> No. That is *not* happening--the chance for error is significant, the
>>> side-effects of the error annoying and tough to track down for complex
>>> cases (akin to the trouble with tracking down GC issues), and just not
>>> necessary. (Perhaps desirable for speed/space reasons, but desirable
>>> isn't necessary)
>>
>> DOD's mark() routine has different requirements than a general
>> traverse() for freeze(), chill(), clone(), and destruction ordering.
>> Using just mark() will have these side effects that you want to avoid.
>
> The only thing that mark does that the general traversal doesn't, in the
> abstract, is flip the object's live flag. Everything else is an
> optimization of code which we can, if we need, discard.

I don't believe that is quite true. There are a couple of important 
differences between traversal-for-GC and traversal-for-serialization, 
which will be a challenge to reconcile in the one-true-traversal:

1) Serialization traversals need to "take note" of logical int and 
float slots (e.g., as used in perlint.pmc and perlnum.pmc) so that they 
can be serialized, but for GC you only need to worry about GC-able 
objects. It's difficult to come up with a reasonable callback which can 
take either int, float, or PObj arguments.

2) It's reasonable for an object to have a pointer to some sort of 
cache object, which is not logically part of the object, and shouldn't 
be serialized along with it. This needs to be traversed for GC 
purposes, but needs to not be traversed for serialization. (Situations 
such as this--physical but not logical membership--are the origin of 
the "mutable" keyword in C++.)

3) Traversal for GC needs to do loop detection, but can just stop going 
down a particular branch of the object graph once it encounters an 
object it's seen before. Serialization traversals would need to have a 
way, upon encountering an object seen before, to include in the 
serialization stream an indication that the current object has already 
been serialized, and enough information to enable deserialization code 
to go find it and recreate the loop. The only options I see here are 
either for serialization to involve the allocation of unbounded 
additional memory, or to expand the PObj structure to include a slot 
for a UUID which can be used as a back-reference in a stream, or to 
have serialization break loops (so that deserialized structures never 
have loops).

I'm not 100% convinced that a single approach can't handle both 
applications, but it's looking as though their requirements are 
different enough that it may not work well.
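
A minimal C sketch of what points 1 and 2 would ask of a shared
traversal, assuming a hypothetical tagged Slot type; none of these names
exist in Parrot. One callback signature covers int, float, and object
slots through a tagged union, and a per-slot "logical" flag lets the
serialization walk skip a cache pointer that the GC walk must still
follow.

```c
#include <assert.h>
#include <stddef.h>

struct Obj;

typedef enum { SLOT_INT, SLOT_FLOAT, SLOT_OBJ } SlotKind;

/* One member of an object. 'logical' is 0 for things like caches that
 * are physically reachable but not part of the object's value. */
typedef struct Slot {
    SlotKind kind;
    int      logical;
    union {
        long        i;
        double      f;
        struct Obj *o;
    } u;
} Slot;

#define MAX_SLOTS 4

typedef struct Obj {
    Slot   slot[MAX_SLOTS];
    size_t n_slots;
} Obj;

typedef enum { WALK_GC, WALK_SERIALIZE } WalkMode;

/* A single callback type for all three kinds of slot. */
typedef void (*visit_fn)(const Slot *s, void *ctx);

/* Visit the immediate slots of one object.
 * GC mode: every object pointer, logical or not; numbers are skipped.
 * Serialize mode: every logical slot, numbers included. */
static void visit_slots(const Obj *o, WalkMode mode, visit_fn fn, void *ctx)
{
    size_t i;
    assert(o != NULL && fn != NULL);
    for (i = 0; i < o->n_slots; i++) {
        const Slot *s = &o->slot[i];
        if (mode == WALK_GC && s->kind != SLOT_OBJ)
            continue;
        if (mode == WALK_SERIALIZE && !s->logical)
            continue;
        fn(s, ctx);
    }
}

/* Example callback: count what each walk gets to see. */
typedef struct Counts { int ints, floats, objs; } Counts;

static void count_slot(const Slot *s, void *ctx)
{
    Counts *c = (Counts *)ctx;
    switch (s->kind) {
    case SLOT_INT:   c->ints++;   break;
    case SLOT_FLOAT: c->floats++; break;
    case SLOT_OBJ:   c->objs++;   break;
    }
}
```

Whether the extra tag and flag are cheap enough for the DOD mark path is
exactly the open question above; the sketch only shows that the two
walks can share a callback shape, not that they should.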

Three other questions/concerns/comments/issues:

1) I assume that ultimately a user-space iterator would end up calling 
the traversal code, right? If so, you can't reasonably mandate that 
only one traversal be in progress at one time. That would be the 
canonical way to compare two ordered collections--get an iterator for 
each, and compare element-by-element.

2) I don't see it as a huge problem that serialization code could end 
up creating additional objects if called from a destroy() method. 
(Though yes, it would be a problem for GC infrastructure code to.) I 
say that for two reasons: (a) destroy() methods can really do anything 
they want, and if that task involves allocating additional memory, that 
just makes it a risk to perform that task in a destroy() method--it may 
fail due to out-of-memory conditions. I think that Java design experts 
tend to argue against doing things like serialization in finalization 
methods. It sounds elegant, but it's problematic. One reason for this 
is that you tend to want to serialize structures as a whole, not 
piece-by-piece as they are garbage-collected. The second reason it is 
not always a problem in practice is that (b) a DOD run may be triggered
by an out-of-headers condition, but that doesn't mean that an
additional chunk of memory for headers can't be allocated. If it can't
be, then this is no more problematic than it would be in other user
code--think of the case where I have some big tree of objects I want to 
make some sort of copy of, with the intention of then letting go of the 
original when I'm done. I'll be freeing up headers at the end of that 
process, but if I run out of memory part-way-through, then I'm just 
stuck.

3) I assume that not every object is assumed to be serializable? For 
instance, an object representing a filehandle can't really be 
serialized in a useful way. So I'm not sure of what sort of "fidelity" 
is required of a generic serialization method--that is, how similar a 
deserialized structure is guaranteed to be to the original.

JEff



Re: A less controversial API addition

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> While we're fighting^Wdiscussing the freezing system, there's a simpler
> thing we need to have added in. We need an API entry point that allows C
> code to invoke a sub/method PMC.

What about params? I already thought about that a bit, and when looking
at extent.c:Parrot_call(), it seems that this needs another thunk (the
reverse of NCI), that sets up needed registers depending on a function
signature.

>   Dan

leo


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> On Tue, 21 Oct 2003, Leopold Toetsch wrote:

>> You can append items to the constant table. You can't declare existing
>> items as constant, because you can't change the underlying object pool,
>> where the object was allocated. This would change the object's address.

> The object's address should be irrelevant for the constant table. PMCs are
> referenced in the opstream by table offset.

Only in the opstream, but not when such PMCs are used later, i.e. when a
constant Sub PMC is referred to in the global stash.

>> Ah, that's the difference. How shall the system pretty-print dynamically
>> loaded PMCs then, when only a bytecode-stream is available? IMHO only a
>> vtable in the class can perform that job.

> If the dynamically loaded PMC class doesn't have a backing Parrot class,
> you can't, and you get the default, relatively primitive dump.

I was thinking of plain PMCs that were loaded to provide some special
functionality. Parrot doesn't know anything about these, so will be
unable to pretty print the opstream. Loaded classes OTOH are based on
ParrotClass and should be printable.

>> Constructing the frozen stream definitely needs memory. I don't see the
>> difference, to memory consumed by a seen hash. Can you please elaborate
>> a bit more on this.

> Constructing the frozen stream will need some memory, yes. At the moment
> all it needs is a chunk of random memory and that's it, so we may well
> fail because we're out of memory.

So, with the same argument I can say, (destructor level) freezing will
need *system* memory for the stream plus the hash. So we may well fail.
I don't see any difference. The hash doesn't have to be a "fat" PerlHash.

If we don't want a hash, one bit inside the object's arena flags should be
able to serve the same purpose: this PMC already got serialized.

Anyway - what does/would freezing at destructor level look like from the HLL
POV?  Shortly before, there ought to be a full DOD run (or all possible
garbage would be frozen). At this time, the amount of still active and
then to be serialized PMCs is known (an upper boundary is always known).
So it should be possible to work around such constraints.

> ... We may, however, have general pool
> memory handy. We can't guarantee that we have *any* headers, however,
> since we can legitimately be called from within the destruct phase of a
> DOD run, which may have been triggered by an out-of-headers condition.

I really doubt, that thawing a program (or some data of it), that died
in middle of some non trivial operation, because it ran out of headers,
will be of any use.

>> A scalar can't contain or reference other PMCs, so it can't be a
>> potential source of freeze loops. If I now spit out (PMC: Int, ID=xy,
>> value=5) twice or (PMC: ID=other) doesn't really matter. thaw() can take
>> care of duplicates, if needed. Other PMCs have the next_for_GC pointer.

> Thaw can only properly take care of duplicates if the duplicates are
> correctly indicated in the serialization stream. Identical end-values are
> *not* sufficient to note multiple references to the same PMC.

Sorry I thought of PMC IDs, which are the address of the frozen PMCs.

>> Albeit I'm not convinced, that we can't have a seen hash.

> It takes an insane amount of memory and requires header allocation.

A PerlHash takes more memory, and yes. But we just need a hash of PMC
addresses, or a bit inside the objects arena.
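
A minimal C sketch of the "hash of PMC addresses" idea, only to make the
suggestion concrete; it does not settle the memory argument either way.
SeenSet, seen_hash, and seen_add are hypothetical names. The table is a
flat open-addressing set whose storage is handed in by the caller, which
assumes that an upper bound on the number of PMCs is known in advance, so
the set itself never allocates headers or memory while the freeze runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SeenSet {
    const void **slot;   /* caller-provided storage, all NULL initially */
    size_t       cap;    /* power of two */
} SeenSet;

/* Spread addresses over the table; the low bits of aligned pointers are
 * mostly zero, so shift them away before multiplying. */
static size_t seen_hash(const void *p, size_t cap)
{
    uintptr_t x = (uintptr_t)p >> 3;
    x *= (uintptr_t)2654435761u;
    return (size_t)x & (cap - 1);
}

/* Returns 1 if p was newly added, 0 if it was already in the set,
 * and -1 if the table is full and p is not in it. */
static int seen_add(SeenSet *s, const void *p)
{
    size_t i, h;
    assert(s != NULL && p != NULL);
    assert(s->cap > 0 && (s->cap & (s->cap - 1)) == 0);
    h = seen_hash(p, s->cap);
    for (i = 0; i < s->cap; i++) {
        size_t k = (h + i) & (s->cap - 1);   /* linear probing */
        if (s->slot[k] == p)
            return 0;
        if (s->slot[k] == NULL) {
            s->slot[k] = p;
            return 1;
        }
    }
    return -1;
}
```

The storage is one pointer per possible PMC, which is the cost the other
side of the argument objects to; the one-bit-per-object alternative
avoids it but needs a free bit in the arena flags.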

We have several different traverse-like functions:

* mark (DOD): called frequently, should get all possible speed
* freeze (destruction): no speed issues, can't take Parrot resources
* freeze (user): rarely used, can take resources
* destruction ordering: only active objects to be visited
* clone: can take resources thaw(freeze()), or separate vtable
* dump/pretty-print: no vtable?
* thaw: special class method, is different anyway

The first 2 critical items have diametrically opposed usage patterns. This
does not really imply that they should be implemented based on the same scheme.

> ... We
> can't allocate headers, and the memory requirements are extreme. Been
> there, done that, it was a bad idea. Consider this arbitrarily and
> unconditionally ruled out if you're unwilling to believe the stats that
> were previously posted about this.

You are speaking of Storable.pm? I'm not aware of any stats regarding
that. But I'm not thinking of using a full fledged hash for such a
special case.

>   Dan

leo


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Melvin Smith <[EMAIL PROTECTED]> wrote:
>> Albeit I'm not convinced, that we can't have a seen hash.

> A seen hash most likely would:

> 1) Kill GC performance especially in pathological cases. The GC
>should be quiet and invisible.
> 2) Cause memory usage to double upon a mark run.

GC isn't involved. A mark() run sets the live bit in the PMCs arena. No
hash is needed for both cases. I have stated several times, that I
don't like to mix mark() and the other traverse functions.

> -Melvin

leo


Re: A less controversial API addition

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Leopold Toetsch wrote:

> Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > While we're fighting^Wdiscussing the freezing system, there's a simpler
> > thing we need to have added in. We need an API entry point that allows C
> > code to invoke a sub/method PMC.
>
> What about params? I already thought about that a bit, and when looking
> at extent.c:Parrot_call(), it seems that this needs another thunk (the
> reverse of NCI), that sets up needed registers depending on a function
> signature.

We probably need two API entries. One, a vararg version, that just takes a
bunch of PMC pointers (or some sort of (ick) parameter signature), and a
second that assumes you've set the registers up properly already.
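
A minimal C sketch of the two entry points, with everything in it an
assumption made for illustration: the Interp and Sub structs, the names
call_sub_regs and call_sub_args, the use of a C function pointer in
place of running bytecode, and the choice of P5 as the first argument
register. The vararg version is just a thin layer that loads registers
and then falls through to the registers-already-set version.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

typedef struct PMC { long value; } PMC;

#define NUM_PREGS     32
#define FIRST_ARG_REG 5    /* assumption for this sketch only */

typedef struct Interp {
    PMC *preg[NUM_PREGS];  /* PMC registers */
    int  n_args;           /* how many argument registers are in use */
} Interp;

/* Stand-in for a sub: a real one would be bytecode run by the runloop. */
typedef PMC *(*sub_body)(Interp *);
typedef struct Sub { sub_body body; } Sub;

/* Entry 2: the caller has already set up the registers. */
static PMC *call_sub_regs(Interp *in, Sub *sub)
{
    assert(in != NULL && sub != NULL && sub->body != NULL);
    return sub->body(in);
}

/* Entry 1: takes n PMC pointers, loads them into the argument
 * registers, then calls entry 2. Returns NULL if n does not fit. */
static PMC *call_sub_args(Interp *in, Sub *sub, int n, ...)
{
    va_list ap;
    int i;
    assert(in != NULL);
    if (n < 0 || n > NUM_PREGS - FIRST_ARG_REG)
        return NULL;
    va_start(ap, n);
    for (i = 0; i < n; i++)
        in->preg[FIRST_ARG_REG + i] = va_arg(ap, PMC *);
    va_end(ap);
    in->n_args = n;
    return call_sub_regs(in, sub);
}

/* Example sub: returns the argument with the largest value. */
static PMC *max_sub(Interp *in)
{
    PMC *best = NULL;
    int i;
    for (i = 0; i < in->n_args; i++) {
        PMC *p = in->preg[FIRST_ARG_REG + i];
        if (p != NULL && (best == NULL || p->value > best->value))
            best = p;
    }
    return best;
}
```

A signature-string variant would replace the plain count with a
description of the argument types; that is left out here because the
thread does not say what such a signature would look like.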

Dan


Re: Object freezing

2003-10-21 Thread Jeff Clites
On Oct 21, 2003, at 6:12 AM, Dan Sugalski wrote:

On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:

At 08:21 -0400 10/21/03, Dan Sugalski wrote:
I find the notion of an "XML header" a bit confusing, given Dan's
 statement to the effect that it was a throw to XML folks.
 I think anything "XML folks" will be interested in will entail
 *wrapping* stuff, not *prefixing* it.
Nah, I expect what they'll want is for the entire data stream of
serialized objects to be in XML format. Which is fine--they can have 
that.
(It's why I mentioned the serialization routines can be overridden)

For an XML stream the header might be 
version=1.0> with the rest of the stream in XML. A YAML stream would 
start
 with the rest in YAML, and the
binary format as . Or 
something
like that, modulo actual correct XML.
If you want that to be looking like valid XML, it would have to be 
different:

error: Specification mandate value for attribute parrot

   ^
Better in my opinion would be something like:
data yadda yadda yadda
I'm not an XML guy, and I'm making all this up as I go along. If that's
better, fine with me. :)
Yeah, you can't put extra things in the "

So are we talking about a header or a wrapper?  If it is really a
header, it's not XML and then it's pretty useless from an XML point
of view.
We're talking about the first thing in a file (or stream, or 
whatever). I
was under the impression that XML files should be entirely composed of
valid XML, hence the need for the stream type marker being valid XML.
No, XML _documents_ must be XML, but that doesn't mean that document == 
file. (For another example where this comes up, consider an XML 
document transmitted over HTTP. There are headers and other textual 
things in the stream along with the xml, and it's the HTTP protocol 
which determines where the document begins and ends, not xml's.) You 
can certainly have more than one XML document in a single file, but 
something needs to decide where an xml document begins and ends, and 
hand only that data to the xml parser.

YAML doesn't care as much, so far as I understand, and for our own internal
binary format we can do whatever we want. If that's not true, then we can
go for a more compact header.
Yes, if you want the whole serialized stream to count as a well-formed 
xml document, then you can't put arbitrary binary data in the middle. 
See my previous post for why.

Once again, modulo my limited and inevitably incorrect YAML knowledge. So
if the header says it's XML the whole thing is valid XML, while if it
doesn't the rest of the stream doesn't have to be. (Just enough of the
header so that an XML processing program can examine the stream and decide
that the valid XML chunk at the beginning says that the rest of the
stream's not XML)
Most XML parsers aren't expecting to handle this. That is, there's no 
such thing as a valid half-of-an-xml document, from the perspective of 
the xml spec, and in many cases you'd have trouble getting a parser to 
stop before hitting something problematic and blowing up. In other 
words, you can't rely on an xml parser to process something which 
starts out looking like xml, but isn't.

Basically we want some nice, fixed (mostly) thing at the head of the
stream that doesn't vary regardless of the way the stream is encoded, and
XML seemed to be the most restrictive of the forms I know people will
clamor for. (I know, it means the stream can't be valid Lisp-style sexprs,
but XML's more widespread :)
Yeah, if you're just needing to tag the stream with a label to indicate 
the type plus a version number, then xml's on the one hand overkill and 
on the other hand not necessarily a big help to xml proponents.

JEff



Re: Object freezing

2003-10-21 Thread Dan Sugalski
> Yeah, if you're just needing to tag the stream with a label to indicate
> the type plus a version number, then xml's on the one hand overkill and
> on the other hand not necessarily a big help to xml proponents.

So, in a nutshell, throwing an XML format type tag at the beginning buys
us nothing regardless of whether it's an XML stream or not?

In that case, nuts to that. It's already terribly obvious I'm going to
mess it up if I try, so we'll just skip it and move on to the next
headache. :)

(FWIW, with respect to binary data in the output stream--if an encoded
format doesn't allow binary data then the encoder is responsible for
changing it to a non-binary format. So for XML and YAML (and any other
text encoding format, I expect) that'll likely be a base64 encoding or
something)
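
A minimal sketch of that idea, assuming base64 is the chosen text-safe encoding (the function names are hypothetical and not part of any Parrot API):

```python
import base64

def encode_for_text_stream(payload: bytes) -> str:
    """Turn arbitrary bytes into plain ASCII that is safe inside XML or YAML."""
    return base64.b64encode(payload).decode("ascii")

def decode_from_text_stream(text: str) -> bytes:
    """Recover the original bytes from the base64 text."""
    return base64.b64decode(text)
```

A binary encoder would skip this step entirely, so only the text formats pay the size overhead.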

Dan


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Jeff Clites wrote:

> I don't believe that is quite true. There are a couple of important
> differences between traversal-for-GC and traversal-for-serialization,
> which will be a challenge to reconcile in the one-true-traversal:
>
> 1) Serialization traversals need to "take note" of logical int and
> float slots (e.g., as used in perlint.pmc and perlnum.pmc) so that they
> can be serialized, but for GC you only need to worry about GC-able
> objects. It's difficult to come up with a reasonable callback which can
> take either int, float, or PObj arguments.

That's not an issue for us. A PMC is responsible for serializing itself,
so if it's got a string, float, or int component then it must take
responsibility for dumping those components to the serialization stream.
Basically PMCs *must* dump themselves out completely, but the engine
provides support to defer dumping of PMCs so that we don't get into
recursive dumping and blow the stack, as well as to make sure that we properly
maintain multiple references to the same PMC.

> 2) It's reasonable for an object to have a pointer to some sort of
> cache object, which is not logically part of the object, and shouldn't
> be serialized along with it. This needs to be traversed for GC
> purposes, but needs to not be traversed for serialization. (Situations
> such as this--physical but not logical membership--are the origin of
> the "mutable" keyword in C++.)

That's what custom mark routines are for, though it does argue that we
should have a separate mark for freezing.

> 3) Traversal for GC needs to do loop detection, but can just stop going
> down a particular branch of the object graph once it encounters an
> object it's seen before. Serialization traversals would need to have a
> way, upon encountering an object seen before, to include in the
> serialization stream an indication that the current object has already
> been serialized, and enough information to enable deserialization code
> to go find it and recreate the loop. The only options I see here are
> either for serialization to involve the allocation of unbounded
> additional memory, or to expand the PObj structure to include a slot
> for a UUID which can be used as a back-reference in a stream, or to
> have serialization break loops (so that deserialized structures never
> have loops).

The loop breaking needs for freezing are the same as for DOD sweeps,
though with freezing we're at an advantage as we know where the tree
starts.

In all cases (I made sure this was in the example, but it might not have
been clear) we only include a marker for child PMCs in the parent PMC's
serialized data, and serialize the child PMCs later on in the stream. So
if PMC1 has a pointer to PMC2, the stream has PMC1 dumped to it but in the
place of PMC2's data is just a marker saying "refer to PMC2 here" and then
after the end of PMC1's data in the stream we dump out PMC2's data.
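
The marker-then-defer scheme described above could be sketched roughly as follows. This is an illustration only: Node is a hypothetical stand-in for a PMC, the record layout is made up, and the sketch uses a dictionary to track seen objects, which ignores the no-extra-memory constraint discussed for destruction-time freezing.

```python
from collections import deque

class Node:
    """Hypothetical stand-in for a PMC: a plain value plus child references."""
    def __init__(self, value):
        self.value = value
        self.children = []

def freeze(root):
    """Return records (id, value, [child ids]) in the order they hit the stream."""
    ids = {id(root): 0}            # object identity -> stream ID
    todo = deque([root])           # objects seen but not yet dumped
    stream = []
    while todo:
        node = todo.popleft()
        refs = []
        for child in node.children:
            key = id(child)
            if key not in ids:     # first sighting: assign an ID, defer the dump
                ids[key] = len(ids)
                todo.append(child)
            refs.append(ids[key])  # the parent only stores a marker
        stream.append((ids[id(node)], node.value, refs))
    return stream

def thaw(stream):
    """Rebuild the graph, including loops and shared references."""
    nodes = {rid: Node(value) for rid, value, _ in stream}
    for rid, _, refs in stream:
        nodes[rid].children = [nodes[r] for r in refs]
    return nodes[0]
```

Because children are queued rather than dumped recursively, the depth of the object graph never translates into stack depth, and a PMC referenced twice is written once.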

> 1) I assume that ultimately a user-space iterator would end up calling
> the traversal code, right? If so, you can't reasonably mandate that
> only one traversal be in progress at one time. That would be the
> canonical way to compare two ordered collections--get an iterator for
> each, and compare element-by-element.

While it could, I think it's infeasible to use the serialization iterator
for normal user-space iteration, if only because the limits that have to
be on the serialization iterator for use in restricted circumstances are a
bit onerous for general use.

I'm not entirely sure that parrot's going to provide this form of
iteration as it stands anyway--it's not necessary for the core language
support and while it'd be really useful there's a limit to the number of
Big Problems I'm up to solving. (Having said that there may, probably
will, be enough introspective capabilities to do this without engine
support)

> 2) I don't see it as a huge problem that serialization code could end
> up creating additional objects if called from a destroy() method.

User code may, parrot may not. The reasons are twofold--while parrot will
let you shoot yourself in the foot, it provides the gun, not the foot.
It should also be possible for carefully written destroy methods to
serialize but not eat any headers or memory. (I can see this being the
case in some embedded applications or systems) If we make it so freezing
is not a guaranteed possibility at destroy time then this can't happen and
it lessens the utility of the system some.

We can, if we choose, loosen the restriction later if sufficient reason is
presented. Can't really tighten it, though, so for now...

> 3) I assume that not every object is assumed to be serializable? For
> instance, an object representing a filehandle can't really be
> serialized in a useful way. So I'm not sure of what sort of "fidelity"
> is required of a generic serialization method--that is, how similar a
> deserialized structure is guaranteed to be to the original.

No fidelity is required at the moment, 

Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:

> At 12:53 -0400 10/21/03, Dan Sugalski wrote:
> >  > Yeah, if you're just needing to tag the stream with a label to indicate
> >>  the type plus a version number, then xml's on the one hand overkill and
> >  > on the other hand not necessarily a big help to xml proponents.
> >So, in a nutshell, throwing an XML format type tag at the beginning buys
> >us nothing regardless of whether it's an XML stream or not?
>
> Yep.  But mainly I think because you'll need to encode binary data to
> make it valid XML.  That's an overhead you don't want to suffer for those
> serializations that don't need it.

I had it in mind that the XML parsers were all event driven so they'd read
the header and stop until prodded, and wouldn't be prodded on if it wasn't
a real parrot XML serialization stream, so binary data wouldn't matter.

> If you ask me, you could easily do with a simple header line like:
>
>parrot xml 1.0
>\0
>
> basically magic word ('parrot')
>   followed by a space
>   followed by the type
>   followed by a space
>   followed by version
>   followed by a CRLF (not sure about this one, but could be nice)
>   followed by a null byte

That works for me, including the crlf. Congrats, you just defined the
parrot serialization header tag! :-)
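
As a rough sketch of the header just agreed on (magic word, space, type, space, version, CRLF, null byte), with hypothetical helper names that are not taken from the Parrot source:

```python
MAGIC = b"parrot"
TERMINATOR = b"\r\n\0"   # CRLF followed by a null byte, as proposed above

def make_header(stream_type: str, version: str) -> bytes:
    """Build 'parrot <type> <version>' plus the CRLF and null terminator."""
    if " " in stream_type or " " in version:
        raise ValueError("type and version must not contain spaces")
    fields = [MAGIC, stream_type.encode("ascii"), version.encode("ascii")]
    return b" ".join(fields) + TERMINATOR

def parse_header(data: bytes):
    """Return (type, version, rest_of_stream) or raise ValueError."""
    end = data.find(TERMINATOR)
    if end < 0:
        raise ValueError("header terminator not found")
    fields = data[:end].split(b" ")
    if len(fields) != 3 or fields[0] != MAGIC:
        raise ValueError("not a parrot serialization stream")
    rest = data[end + len(TERMINATOR):]
    return fields[1].decode("ascii"), fields[2].decode("ascii"), rest
```

Everything after the null byte is handed untouched to whichever decoder the type field selects.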

> I'm not clear if you would know beforehand how many bytes of data you
> would receive.  If that is possible to know at all time, then I would
> suggest having the length as an extra part of the header.

Since we're going to potentially be serializing to an on-the-fly
unseekable device (i.e. dumping to a socket), no length.

> >In that case, nuts to that. It's already terribly obvious I'm going to
> >mess it up if I try, so we'll just skip it and move on to the next
> >headache. :)
>
> Which means I'll be going back to lurking mode again...  ;-)

Waiting to pounce, huh? :)

Dan


Re: Object freezing

2003-10-21 Thread Jeff Clites
On Oct 21, 2003, at 10:41 AM, Elizabeth Mattijsen wrote:

> At 12:53 -0400 10/21/03, Dan Sugalski wrote:
>>> Yeah, if you're just needing to tag the stream with a label to indicate
>>> the type plus a version number, then xml's on the one hand overkill and
>>> on the other hand not necessarily a big help to xml proponents.
>> So, in a nutshell, throwing an XML format type tag at the beginning buys
>> us nothing regardless of whether it's an XML stream or not?
>
> Yep.  But mainly I think because you'll need to encode binary data to
> make it valid XML.  That's an overhead you don't want to suffer for those
> serializations that don't need it.
>
> If you ask me, you could easily do with a simple header line like:
>
>   parrot xml 1.0
>   \0
>
> basically magic word ('parrot')
>  followed by a space
>  followed by the type
>  followed by a space
>  followed by version
>  followed by a CRLF (not sure about this one, but could be nice)
>  followed by a null byte

Yep, that's the sort of thing that I was thinking, though I'd actually
leave the CRLF (or just an LF or CR, whatever), and take out the null
byte. My reason for that is that this way, if your serialization format
always spits out vanilla ASCII w/o control characters, suitable for
consumption by some foreign C program, then the header won't change
this. (That's one of the nice features of the tar format--a tar archive
of ASCII text files is itself an ASCII text file, if I recall
correctly.)

It could also be handy to allow additional "comment" text after the
version (ignored by the deserialization, restricted to be ASCII w/o any
CR or LF), because that would let you put in some human-readable
comment to help out people trying to figure out what this file is. Some
other formats do this, which is nice. Just another thought.
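
A sketch of the text-only variant suggested here: no null byte, a line terminator, and an optional free-form comment after the version. The function name and the exact field layout are assumptions made for illustration.

```python
def parse_text_header(stream: bytes):
    """Split 'parrot <type> <version> [comment]' from the payload that follows."""
    line, sep, rest = stream.partition(b"\n")
    if not sep:
        raise ValueError("no line terminator after header")
    line = line.rstrip(b"\r")               # accept CRLF as well as a bare LF
    parts = line.decode("ascii").split(" ", 3)
    if len(parts) < 3 or parts[0] != "parrot":
        raise ValueError("not a parrot serialization stream")
    comment = parts[3] if len(parts) > 3 else ""
    return parts[1], parts[2], comment, rest
```

With this layout an all-ASCII payload stays all-ASCII once the header is prepended, which is the property argued for above.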

JEff



Old Big problems before New Big problems

2003-10-21 Thread Melvin Smith
*sigh*

I'm long overdue for a rant.

I'm very happy with the progress Parrot has made, but that is because I
took a year off. Otherwise, it would have been like watching a pot
waiting for it to boil.

However, some things have not changed, like us.

We try to tackle too many NEW large problems and spend time
arguing about them while there are many things that we needed a year ago
and still need today.

This is my list, not Dan's so he may disagree:

1) We should have a complete bytecode spec, with metadata and
symbol tables and classes.

2) We should have support for classes and methods in the core

3) We should have class/method syntax in IMCC (working on this)
   so people can define these in a notation that makes sense.

4) We should have a complete IO system

5) We should have a regex core and functional regex compiler. We had
started on this but disagreement on approach and developer
bandwidth seems to have stopped this.

6) We should have at least ONE semi-stable high level language
that allows us to write REAL software on Parrot and use all of the
above features.

Number 6 is where we really look bad compared to other VM efforts
out there. We have a huge directory of partially working languages but
nothing works besides trivial hello world style samples.

Until we get there, I won't give a flip about whether we serialize in XML or
morse code.

There are also tons of academic material available on all the issues
of late. Garbage collection, cyclic issues, serialization, finalization and
destruction. Not to even MENTION compiler development. We should
be referencing books and papers in these discussions more than we
have been, because these problems have been solved in
various ways, many times.

Working, bad implementations (straw men as some of us call it) are
better than nothing at all.


-Melvin


Re: Object freezing

2003-10-21 Thread Jeff Clites
On Oct 21, 2003, at 10:49 AM, Dan Sugalski wrote:

> On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:
>
>> At 12:53 -0400 10/21/03, Dan Sugalski wrote:
>>>> Yeah, if you're just needing to tag the stream with a label to indicate
>>>> the type plus a version number, then xml's on the one hand overkill and
>>>> on the other hand not necessarily a big help to xml proponents.
>>> So, in a nutshell, throwing an XML format type tag at the beginning buys
>>> us nothing regardless of whether it's an XML stream or not?
>>
>> Yep.  But mainly I think because you'll need to encode binary data to
>> make it valid XML.  That's an overhead you don't want to suffer for those
>> serializations that don't need it.
>
> I had it in mind that the XML parsers were all event driven so they'd read
> the header and stop until prodded, and wouldn't be prodded on if it wasn't
> a real parrot XML serialization stream, so binary data wouldn't matter.

The event-based parsers (such as expat and other SAX parsers) tend to 
be push instead of pull, so you hand them your bytes and they invoke 
your callbacks (as opposed to pull-style in which you'd ask for the 
next event). Sometimes you can hand them your bytes in chunks (and 
they'll process what they can and save up the rest to include with your 
next chunk), so with expat (for example) you could probably do what you 
wanted, but you'd have to hand it data one byte at a time until your 
first callback was invoked, then stop. So it could probably be done 
with some parsers, but it would be an unusual usage, and more overhead 
than it's worth to just parse out a couple of strings. :)

JEff



Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:

> At 13:49 -0400 10/21/03, Dan Sugalski wrote:
> >On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:
> Hmmm... maybe as an optimization, something that would fit in 4 or
> 8 bytes would be better for the magic string (so a single or double
> integer check would be sufficient?).
>
>prrt (4 byte)
>
>ParrotDS   (8 byte)
>
> (DS for Data Stream, rather than what you think, Dan  ;-)

Yeah, that's a better option, I think. (And no, I didn't figure it stood
for anything else--that thing in the build file is embarrassing enough
:-P)

> >  > I'm not clear if you would know beforehand how many bytes of data you
> >  > would receive.  If that is possible to know at all time, then I would
> >  > suggest having the length as an extra part of the header.
> >Since we're going to potentially be serializing to an on-the-fly
> >unseekable device (i.e. dumping to a socket) so no length.
>
> Ok, so how is the encoder to know that no more data will come?

Good point. We'll have to have an end encoding entry in the encoding API,
the same way we'll need a begin encoding. The encoding format itself
defines the end of encoding marker and will know when it's hit the end of
the data stream.
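
One way such begin/end entries could look for a binary format is sketched below. Everything here is an assumption for illustration: the class and method names, the "binary" type string, and the framing (length-prefixed chunks, with a zero-length chunk as the end marker). The thread leaves the actual end marker up to each encoding format.

```python
import struct

class ChunkedEncoder:
    """Hypothetical encoder: length-prefixed chunks, zero-length chunk = end."""

    def __init__(self, out):
        self.out = out                      # any object with a write() method

    def begin(self):
        self.out.write(b"parrot binary 1.0\r\n\0")

    def write_chunk(self, payload: bytes):
        if not payload:
            raise ValueError("an empty chunk is reserved for the end marker")
        self.out.write(struct.pack(">I", len(payload)) + payload)

    def end(self):
        self.out.write(struct.pack(">I", 0))

def read_chunks(inp):
    """Yield chunks until the end marker; assumes the header is already consumed."""
    while True:
        prefix = inp.read(4)
        if len(prefix) < 4:
            raise EOFError("stream ended before the end marker")
        (size,) = struct.unpack(">I", prefix)
        if size == 0:
            return                          # end-of-encoding marker reached
        data = inp.read(size)
        if len(data) < size:
            raise EOFError("truncated chunk")
        yield data
```

No total length is needed up front, so this works on a socket, and the reader stops exactly at the marker without consuming whatever follows.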

Dan


Re: Object freezing

2003-10-21 Thread Elizabeth Mattijsen
At 08:21 -0400 10/21/03, Dan Sugalski wrote:
 > I find the notion of an "XML header" a bit confusing, given Dan's
 statement to the effect that it was a throw to XML folks.

 I think anything "XML folks" will be interested in will entail
 *wrapping* stuff, not *prefixing* it.
Nah, I expect what they'll want is for the entire data stream of
serialized objects to be in XML format. Which is fine--they can have that.
(It's why I mentioned the serialization routines can be overridden)
For an XML stream the header might be  with the rest of the stream in XML. A YAML stream would start
 with the rest in YAML, and the
binary format as . Or something
like that, modulo actual correct XML.
If you want that to be looking like valid XML, it would have to be different:

error: Specification mandate value for attribute parrot

  ^
Better in my opinion would be something like:
data yadda yadda yadda

At least this would be a valid stand-alone XML container.  And
possibly parsers out there can be coerced into leaving the rest of
the stream for other processes to read.


This way we have a single, fixed-format type/version header, which makes
the initial identification easier and less error-prone. (Possibly even
fit for file and programs of its ilk to note) The binary format won't
care, and the YAML format shouldn't care (as long as the indenting's
right) but the XML format would, so it seems to make sense to use the XML
stuff for the initial header.
So are we talking about a header or a wrapper?  If it is really a 
header, it's not XML and then it's pretty useless from an XML point
of view.

Liz


[perl #24260] [PATCH] to build under win32

2003-10-21 Thread Nick Kostirya
# New Ticket Created by  "Nick Kostirya" 
# Please include the string:  [perl #24260]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt2/Ticket/Display.html?id=24260 >


[PATCH] to build under win32

1. The MS compiler does not support structs with an empty body.
2. Remove unistd.h

patch -p0 < win.patch

Tested on
WinNT (MSVC 6.0) and
Linux with gcc version 2.95.4 20011002 (Debian prerelease)







-- attachment  1 --
url: http://rt.perl.org/rt2/attach/66327/49559/636663/win.patch





[perl #24261] [PATCH] for t\harness under win32

2003-10-21 Thread Nick Kostirya
# New Ticket Created by  "Nick Kostirya" 
# Please include the string:  [perl #24261]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt2/Ticket/Display.html?id=24261 >


cmd.exe on WinNT does not expand t\src\*.t into a list of files.
 
D:\CvsProjects\parrot>nmake test
D:\Programs\Perl\bin\perl.exe t\harness t\src\*.t
t\src\*t\src\*.t does not exist
FAILED--1 test script could be run, alas--no output ever seen
NMAKE : fatal error U1077: 'D:\Programs\Perl\bin\perl.exe' : return code
'0x2'
Stop.

Apply the patch in the "t" directory:
patch  < harness.patch
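
The patch itself is not reproduced in this message. As a language-neutral illustration of the usual fix (the script expands wildcards itself when the shell does not), here is a sketch with a hypothetical helper; it is not the code from harness.patch:

```python
import glob

def expand_args(args):
    """Expand wildcard patterns the shell left untouched; keep other args as-is."""
    expanded = []
    for arg in args:
        if any(ch in arg for ch in "*?["):
            matches = sorted(glob.glob(arg))
            expanded.extend(matches or [arg])   # no match: pass the pattern through
        else:
            expanded.append(arg)
    return expanded
```

On a Unix shell the arguments arrive already expanded, so the helper simply passes them through.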



-- attachment  1 --
url: http://rt.perl.org/rt2/attach/66334/49567/4ca3a6/harness.patch





[perl #24262] [PATCH] for t\harness under win32

2003-10-21 Thread Nick Kostirya
# New Ticket Created by  "Nick Kostirya" 
# Please include the string:  [perl #24262]
# in the subject line of all future correspondence about this issue. 
# http://rt.perl.org/rt2/Ticket/Display.html?id=24262 >


cmd.exe on WinNT does not expand t\src\*.t into a list of files.

D:\CvsProjects\parrot>nmake test
D:\Programs\Perl\bin\perl.exe t\harness t\src\*.t
t\src\*t\src\*.t does not exist
FAILED--1 test script could be run, alas--no output ever seen
NMAKE : fatal error U1077: 'D:\Programs\Perl\bin\perl.exe' : return code
'0x2'
Stop.


Apply the patch in the "t" directory:
patch  < harness.patch







Re: Object freezing

2003-10-21 Thread Elizabeth Mattijsen
At 12:53 -0400 10/21/03, Dan Sugalski wrote:
>> Yeah, if you're just needing to tag the stream with a label to indicate
>> the type plus a version number, then xml's on the one hand overkill and
>> on the other hand not necessarily a big help to xml proponents.
> So, in a nutshell, throwing an XML format type tag at the beginning buys
> us nothing regardless of whether it's an XML stream or not?

Yep.  But mainly I think because you'll need to encode binary data to
make it valid XML.  That's an overhead you don't want to suffer for those
serializations that don't need it.

If you ask me, you could easily do with a simple header line like:

  parrot xml 1.0
  \0

basically magic word ('parrot')
 followed by a space
 followed by the type
 followed by a space
 followed by version
 followed by a CRLF (not sure about this one, but could be nice)
 followed by a null byte

I'm not clear if you would know beforehand how many bytes of data you
would receive.  If that is possible to know at all times, then I would
suggest having the length as an extra part of the header.

> In that case, nuts to that. It's already terribly obvious I'm going to
> mess it up if I try, so we'll just skip it and move on to the next
> headache. :)

Which means I'll be going back to lurking mode again...  ;-)

Liz


Re: Object freezing

2003-10-21 Thread Elizabeth Mattijsen
At 13:49 -0400 10/21/03, Dan Sugalski wrote:
> On Tue, 21 Oct 2003, Elizabeth Mattijsen wrote:
>> Yep.  But mainly I think because you'll need to encode binary data to
>> make it valid XML.  That's an overhead you don't want to suffer for those
>> serializations that don't need it.
> I had it in mind that the XML parsers were all event driven so they'd read
> the header and stop until prodded, and wouldn't be prodded on if it wasn't
> a real parrot XML serialization stream, so binary data wouldn't matter.
>
>> If you ask me, you could easily do with a simple header line like:
>>
>>   parrot xml 1.0
>>   \0
>>
>> basically magic word ('parrot')
>>   followed by a space
>>   followed by the type
>>   followed by a space
>>   followed by version
>>   followed by a CRLF (not sure about this one, but could be nice)
>>   followed by a null byte
> That works for me, including the crlf. Congrats, you just defined the
> parrot serialization header tag! :-)

Hmmm... maybe as an optimization, something that would fit in 4 or
8 bytes would be better for the magic string (so a single or double
integer check would be sufficient?).

  prrt (4 byte)

  ParrotDS   (8 byte)

(DS for Data Stream, rather than what you think, Dan  ;-)
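
To illustrate the single-integer check: the 8-byte magic can be read as one 64-bit word and compared in one step. The sketch below fixes big-endian byte order so the result does not depend on the host; the names are hypothetical, and in C the same idea would be a single 64-bit load and compare.

```python
import struct

MAGIC8 = b"ParrotDS"
MAGIC8_AS_INT = struct.unpack(">Q", MAGIC8)[0]   # the magic as one 64-bit integer

def has_magic(stream: bytes) -> bool:
    """Check the first 8 bytes with a single integer comparison."""
    if len(stream) < 8:
        return False
    (word,) = struct.unpack_from(">Q", stream)
    return word == MAGIC8_AS_INT
```

The 4-byte variant ('prrt') would work the same way with the ">I" format.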


>> I'm not clear if you would know beforehand how many bytes of data you
>> would receive.  If that is possible to know at all times, then I would
>> suggest having the length as an extra part of the header.
> Since we're going to potentially be serializing to an on-the-fly
> unseekable device (i.e. dumping to a socket), no length.

Ok, so how is the encoder to know that no more data will come?


>>> In that case, nuts to that. It's already terribly obvious I'm going to
>>> mess it up if I try, so we'll just skip it and move on to the next
>>> headache. :)
>> Which means I'll be going back to lurking mode again...  ;-)
> Waiting to pounce, huh? :)

I wish.  ;-)   For now, I'm more the hidden dragon rather than the 
crouching tiger...  ;-(

Liz


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> On Tue, 21 Oct 2003, Juergen Boemmels wrote:

>> You know we already have two versions of pobject_lives lying around.

> Then we need to fix that, too.

One is with ARENA_DOD_FLAGS one w/o. If you are trying to implement your
universal mark() for everything, one is obsolete anyway.

>   Dan

leo


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:

> So, in a nutshell, throwing an XML format type tag at the beginning buys
> us nothing regardless of whether it's an XML stream or not?

Yes. That's what people say :)

What about a well known format called PBC. (Parrot bortable^Wbyte code :)
It knows about en/decoding basic types. A PMC doesn't need a lot more.

>   Dan

leo


Re: Object freezing

2003-10-21 Thread Leopold Toetsch
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> On Tue, 21 Oct 2003, Jeff Clites wrote:

>> 1) Serialization traversals need to "take note" of logical int and
>> float slots

> That's not an issue for us. A PMC is responsible for serializing itself,
> so if it's got a string, float, or int component then it must take
> responsibility for dumping those components to the serialization stream.
> Basically PMCs *must* dump themselves out completely, but the engine
> provides support to defer dumping of PMCs so that we don't get into
> recursive dumping

That's what my general traversal routine was intended for. A PerlHash
may have native datatypes as well as PMCs as data members, plus the
STRING keys, which are references. The hash itself and plain data
members can get serialized/frozen/dumped whatever. The callback takes
care of the desired action.

PMCs inside (especially aggregates of any kind) would get postponed
(only the ID or address needs to be serialized). Whether that is now done via
the next_for_GC pointer, a seen hash, a bitmap or whatever is debatable
and seems (when destructor level freezing comes in) to be not too simple.
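
A rough sketch of such a general traversal with a pluggable callback follows. The PMC class, the set used to remember visited objects, and the callback signatures are all assumptions made for illustration; they are not the backed-out patch.

```python
from collections import deque

class PMC:
    """Hypothetical PMC: plain data, child PMC references, and a live bit."""
    def __init__(self, name, data=None):
        self.name = name
        self.data = data
        self.children = []
        self.live = False

def traverse(root, callback):
    """Visit every reachable PMC exactly once; the callback decides the action."""
    seen = {id(root)}
    todo = deque([root])
    while todo:
        pmc = todo.popleft()
        callback(pmc)
        for child in pmc.children:
            if id(child) not in seen:      # postpone children, never recurse
                seen.add(id(child))
                todo.append(child)

def mark(pmc):
    """DOD-style action: just set the live bit."""
    pmc.live = True

def make_freezer(out):
    """Freeze-style action: record plain data plus markers for child PMCs."""
    def freeze_one(pmc):
        out.append((pmc.name, pmc.data, [c.name for c in pmc.children]))
    return freeze_one
```

The same walk serves both purposes; only the callback differs, which is the distinction from reusing mark() itself.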

But using mark() for it doesn't meet the goal. It's a different thing. It
sets the live bit on objects.

> That's what custom mark routines are for, though it does argue that we
> should have a separate mark for freezing.

Which can be traverse or visit or whatever, but different.

>> 3) Traversal for GC needs to do loop detection,

It stops when the live bit is set, or sets the live bit and places
aggregates on a todo list. It's by far simpler than freezing. It's an
optimization - yes.

> The loop breaking needs for freezing are the same as for DOD sweeps,

s/are/can be/

> In all cases (I made sure this was in the example, but it might not have
> been clear) we only include a marker for child PMCs in the parent PMC's
> serialized data, and serialize the child PMCs later on in the stream. So
> if PMC1 has a pointer to PMC2, the stream has PMC1 dumped to it but in the
> place of PMC2's data is just a marker saying "refer to PMC2 here" and then
> after the end of PMC1's data in the stream we dump out PMC2's data.

That's clear. But plain scalars don't have child PMCs. Freeze 'em and be
done with them. There is no need to put these on a next_for_GC list,
mark() doesn't do it (anymore) and freeze doesn't have to do it. I don't
see the small PMCs approach dying because of that.

>   Dan

leo


Re: Old Big problems before New Big problems

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Melvin Smith wrote:

> This is my list, not Dan's so he may disagree:

The list is a valid one, and the complaint is real. We've been taking
things out of order in part because it works for me, but I've got a bigger
picture than anyone else and that's not necessarily a great way to work.
(Better be careful, we'll offer you the pumpkin once Leo's done with it...
:)

FWIW, the cranking about the freeze mechanisms (though not really the
format) is directly applicable to some of the points--it's necessary for
PMC constants, which we need in part to finalize the bytecode format. We
also need object stuff, though, so it is time to stop grumbling about it
and get it done.

Dan


Re: [perl #24261] [PATCH] for t\harness under win32

2003-10-21 Thread Juergen Boemmels
"Nick Kostirya" (via RT) <[EMAIL PROTECTED]> writes:

> cmd.exe on WinNT does not expand t\src\*.t into a list of files.
>  
> D:\CvsProjects\parrot>nmake test
> D:\Programs\Perl\bin\perl.exe t\harness t\src\*.t
> t\src\*t\src\*.t does not exist
> FAILED--1 test script could be run, alas--no output ever seen
> NMAKE : fatal error U1077: 'D:\Programs\Perl\bin\perl.exe' : return code
> '0x2'
> Stop.

Applied (except white-space) identical patch which I had already in my
tree.

boe


Class metadata for PIR/assembly files

2003-10-21 Thread Dan Sugalski
Here's the scoop:

Metadata for classes is simple. In PIR/assembly, they're noted with
.things:

  .class Foo
.is bar
.is baz
.does some_thing
.member x
.member y
.member z
  .ssalc

That is, unless someone tells me that ssalc is horribly obscene in some
relatively common language, and we may still use it if the translation
amuses me sufficiently.

Keywords are simple for the metadata. .class starts the declaration and
takes a single parameter, the name. Class declarations end with .ssalc. Each
.is defines a parent class, each .does defines an interface the class
supports, and each .member defines a PMC member slot that each object has.
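
To make the keyword rules concrete, here is a small sketch of a reader for one such block. It is an illustration only: the function name and the dictionary layout are hypothetical, and it is not how IMCC or the assembler handles the directives.

```python
def parse_class_block(text):
    """Parse one '.class Name ... .ssalc' block into a dict of metadata."""
    meta = None
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue
        directive, args = words[0], words[1:]
        if directive == ".class":
            if meta is not None or len(args) != 1:
                raise ValueError("bad .class line")
            meta = {"name": args[0], "is": [], "does": [], "member": []}
        elif directive == ".ssalc":
            if meta is None:
                raise ValueError(".ssalc without .class")
            return meta
        elif directive in (".is", ".does", ".member"):
            if meta is None or len(args) != 1:
                raise ValueError("bad %s line" % directive)
            meta[directive[1:]].append(args[0])   # keep declaration order
        else:
            raise ValueError("unknown directive: " + directive)
    raise ValueError("missing .ssalc")
```

Fed the Foo example above, this yields the name, two parents (bar, baz), one interface (some_thing) and three member slots (x, y, z).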

If a class is defined in the bytecode, it gets instantiated when the
bytecode is created. (It's a constant class, though like any other class
is mutable at runtime so it's not that constant) There is no difference
between a class created with metadata and one created by executable code
piecemeal.

Classes, when instantiated, have a backing namespace that's identical to
the class name.

We will be adding version metadata to the classes, but that's going to be
deferred.

It's OK for the code that handles PIR and assembly to ignore this for the
moment, at least until the metadata segment is better defined. Which will
be soon, though I'd rather someone else do the bytecode modification as
it's been a long time since I've had my hand in there.

This would be a good time to comment on the metadata, as I'm about to go
finish defining the ops to create classes dynamically and actually finish
the fscking object.c code to do it.

Dan


Re: Taint mode testing and project Phalanx

2003-10-21 Thread Dave Rolsky
On Mon, 20 Oct 2003, Michael G Schwern wrote:

> On Tue, Oct 21, 2003 at 12:24:03AM -0500, Dave Rolsky wrote:
> > On Mon, 20 Oct 2003, Andrew Savige wrote:
> > > I noticed in Test::Tutorial:
> > > "Taint mode is a funny thing. It's the globalest of all global features.
> > > Once you turn it on it effects all code in your program and all modules
> > > used (and all the modules they use). If a single piece of code isn't
> > > taint clean, the whole thing explodes. With that in mind, it's very
> > > important to ensure your module works under taint mode."
> >
> > Not to mention that it's buggy as hell.  For example, in various versions
> > of Perl I've used there have been rather serious bugs in the regex engine
> > when taint mode is on, even when dealing with untainted variables!
>
> I've never hit anything like this.  Do you have examples?

Well, one example comes from my Params::Validate module, where I have this
little bit of XS:

  while (he = hv_iternext(p)) {
  /* This may be related to bug #7387 on bugs.perl.org */
  #if (PERL_VERSION == 5)
  if (! PL_tainting)
  #endif
  SvGETMAGIC(HeVAL(he));

Whee, a random taint related bug.

Then there was the time I found that pos() didn't get updated inside
s/\G...//gc matches when taint mode was on, for certain versions of
Perl working with some strings (but not others).  I don't think this
bug exists in the current version any more.

I could never reproduce this in a concise example, unfortunately.

Anyway, my taint mode experience has been that random things break in very
weird ways when using it.


-dave

/*===
House Absolute Consulting
www.houseabsolute.com
===*/


Re: Taint mode testing and project Phalanx

2003-10-21 Thread Tim Bunce
On Tue, Oct 21, 2003 at 12:34:44PM -0500, Dave Rolsky wrote:
> 
> Anyway, my taint mode experience has been that random things break in very
> weird ways when using it.

I'd guess that many extensions don't handle magic properly.

Extension authors rarely add the extra logic, even if they know
what logic needs to be added. The same possibly applies to more
obscure parts of perl.

Proof of concept, for anyone that has the time: modify perl with an
#ifdef so that all values are tainted but disable the tainted
expression exception so that they're harmless. See what tests fail.

Tim.


Re: No more code coverage

2003-10-21 Thread Ovid
--- Tim Bunce <[EMAIL PROTECTED]> wrote:
> > I'll look into SQLite.
> 
> I'd caution against rushing in any particular direction without some
> profiling information to back it up.
> 
> Having said that, I'd strongly recommend switching to Storable first.
> It did have problems but it's now very robust and far, far, faster
> than Data::Dumper+eval. This small change would yield a big gain.
> 
> The next step would be to get some profile information. There's
> little point in doing that first as Data::Dumper+eval will dwarf
> time spent elsewhere.

It's not performance that's killing Devel::Cover when we run tests.  It's
that the data structure for the coverage data appears to be built in-memory
and it's so huge that I run out of memory (and this is on a machine with a
couple of gigs of RAM).

If it's not the data structure being built but instead is the conversion to
Data::Dumper format, then ignore what I say :)

Cheers,
Ovid

=
Silence is Evil            http://users.easystreet.com/ovid/philosophy/indexdecency.htm
Ovid                       http://www.perlmonks.org/index.pl?node_id=17000
Web Programming with Perl  http://users.easystreet.com/ovid/cgi_course/



Re: [RfC] and [PATCH]: Libraries

2003-10-21 Thread Jeff Clites
On Oct 15, 2003, at 4:52 AM, Juergen Boemmels wrote:

> I spent the last day getting parrot running under Borland. The
> attached patch is what's needed to get linking and running make test on
> both Windows/Borland and Linux/gcc. I'm not sure if it's ready for
> inclusion in the tree, but I want some feedback on the approach.
>
> The main problem is that Borland can't build a single static library
> (at least I did not find out) with two files of the same name. But
> there are some name clashes: intlist.o and classes/intlist.o or
> stacks.o and languages/imcc/stacks.o. I solved this by separating
> libparrot in three partial libs: classes/classes.a containing all
> object-files of classes/ ; languages/imcc/imcc.a containing all
> object-files of imcc and blib/lib/libparrot.a for all the rest. (These
> names need cleanup; shouldn't they all go to blib/lib?). classes/ is
> still built by its own Makefile, this should be integrated in the
> root-Makefile, but that's another story.
>
> Next problem is library interdependence. classes.a depends on
> libparrot.a and libparrot.a depends on classes.a. This complicates
> linking a bit. The gnu linker does not revisit previous files so the
> link line has to contain something like
> libparrot.a classes.a libparrot.a
> A new configure variable parrot_libs takes care of this.

Since no one else commented, I'll give you my two cents. I think there 
are 4 other options in addition to your proposal:

1) Find out for sure if Borland has a way to build this into a single 
library, if you're not 100% certain. Don't know if there are any 
Borland experts on the list.

2) Build separate libs, but only under Borland.

3) Rename the files with duplicate names so that they don't 
conflict--this might be worth doing anyway.

4) When building under Borland, make copies of the offending files 
under different names, and build those. (e.g., make a copy of 
languages/imcc/stacks.c called imcc-stacks.c, and build that instead)

I'd say that (4) is the cleanest (it only affects the one problematic 
environment, and you get to keep a single library everywhere), but as I 
said I think (3) might be useful anyway (just to make it easier to 
refer to files in conversation). Option (3) is also dead-easy, but we'd 
lose cvs history for the renamed files.

Having multiple libs isn't the end of the world, but it would be a 
shame to have to do it because of a particular compiler/linker quirk.

JEff



Re: Object freezing

2003-10-21 Thread Clark C. Evans
On Tue, Oct 21, 2003 at 09:12:27AM -0400, Dan Sugalski wrote:
| We're talking about the first thing in a file (or stream, or whatever). I
| was under the impression that XML files should be entirely composed of
| valid XML, hence the need for the stream type marker being valid XML. YAML
| doesn't care as much, so far as I understand, and for our own internal
| binary format we can do whatever we want.

As for 'autodetecting' XML vs YAML, an earlier version of the YAML
spec restricted plain-style mapping keys so that they could not 
start with the '<' character.  In this way, a processor could 
auto-detect if the incoming stream was XML or YAML, and use the
appropriate parser.   With the restricted schema described below, 
a small-footprint XML parser could even be shipped with the 
core libyaml allowing us (the YAML team) to handle this XML 
compatibility requirement; especially with regard to SOAP, the
de facto XML object 'serialization' schema.
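
For illustration, the first-character check described above might look
like this in Python. This is only a sketch: the function name
guess_stream_format is made up here, and it assumes the restriction
from that earlier draft of the spec still holds (no plain-style mapping
key starts with '<'). Skipping a leading byte-order mark and whitespace
is also an assumption, not something the spec draft is quoted as saying.

```python
def guess_stream_format(text: str) -> str:
    """Guess 'xml' or 'yaml' from the first significant character.

    Assumption: a YAML stream never starts with '<', so a leading '<'
    can only mean XML. Anything else is handed to the YAML parser.
    """
    # Skip an optional byte-order mark and leading whitespace.
    stripped = text.lstrip("\ufeff \t\r\n")
    if stripped.startswith("<"):
        return "xml"
    return "yaml"


if __name__ == "__main__":
    for sample in ("<?xml version='1.0'?><invoice/>", "invoice: 34843\n"):
        print(repr(sample), "->", guess_stream_format(sample))
```

The check only picks a parser; whether the stream is actually
well-formed is still up to that parser.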

I wrote up a brief 'sketch' as to one option for interoperability
between XML and YAML, although there are many such options, and soon 
the yaml-core list will be forced to discuss icky things like this.
My thoughts (which are _not_ consensus in the YAML community) are
found at   http://www.yaml.org/xml.html ; this page gives the 
usual "invoice" example in XML, and an imperfect XSLT stylesheet 
for converting XML in this schema to YAML.   Clearly more work is
needed here; I would very much like to hear your requirements. 

Kind Regards,

Clark

P.S. I try to follow this list, but I often miss items, so if you put
'YAML' in the title and cc me on it, it will surely get my attention.


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Clark C. Evans wrote:

> On Tue, Oct 21, 2003 at 09:12:27AM -0400, Dan Sugalski wrote:
> | We're talking about the first thing in a file (or stream, or whatever). I
> | was under the impression that XML files should be entirely composed of
> | valid XML, hence the need for the stream type marker being valid XML. YAML
> | doesn't care as much, so far as I understand, and for our own internal
> | binary format we can do whatever we want.
>
> As for 'autodetecting' XML vs YAML

We don't have to! Woohoo! :)

This is one problem I didn't want to go into, so the encoding would be
explicit in the header. Since we've now dodged even the pretense of
guaranteed minimally valid anything in the stream header, the point's
moot, which is nice.

For YAML encoding, like for XML (and the default native encoding, which'll
probably be parrot bytecode) everything after the end-of-header will be
proper, well-formed whatever. Unless the encoder's messed up, of course.
;)

Dan


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Clark C. Evans wrote:

> If you are going to go this far (including content-length) may I
> just suggest using a MIME envelope?  This has several advantages:

This is a very good idea, but not this time, as it's too easy to get stuck
in the endless churn of very good ideas and alternatives.

The simple header format, without the null byte (taking it out is a good
idea, since we may have the possibility of an all-text file in that case),
is the way we're going to go. Maybe for version 2.0, but we've hit the
Good Enough point here.

Dan


Class creation in bytecode

2003-10-21 Thread Dan Sugalski
Okay, since nobody took advantage of the, oh, at least 2 or 3 minutes
since the metadata spec post, here's the equivalent for assembly. I'll
stub in and commit the stubbed object.ops ops in a bit.

We've already got ops to create a class standalone, and to subclass an
existing class. We're also going to add the following ops:

  addparent Px, Py
  removeparent Px, Py

To add and remove class Y as a parent of class X

  addattrib Ix, Py, Sz
  removeattrib Px, [IS]y

To add attribute Z to class Y. X gets the attribute offset. removeattrib
removes attribute #y, or the attribute named y (depending on whether y is
an int or a string), from class X.

To add or remove an implemented interface:

  adddoes Px, Sy
  removedoes Px, Sy

Instantiate, as implemented, is dead. I'm going to nuke it, then use it
for instantiating classes via metadata chunks. That's next message.

Dan


Re: Object freezing

2003-10-21 Thread Elizabeth Mattijsen
At 15:18 -0400 10/21/03, Dan Sugalski wrote:
>On Tue, 21 Oct 2003, Clark C. Evans wrote:
> > If you are going to go this far (including content-length) may I
> > just suggest using a MIME envelope?  This has several advantages:
>
>This is a very good idea, but not this time, as it's too easy to get stuck
>in the endless churn of very good ideas and alternatives.

I would think the MIME-envelope would have to be part of the data, 
rather than in the header.  Or am I missing anything?  It's 
encode/decoder determined, is it not?

>The simple header format, without the null byte (taking it out is a good
>idea, since we may have the possibility of an all-text file in that case),

I have no particular feeling about the null byte.  It would just be a 
convenience when debugging as it would allow you to just print the 
string, as it would be null-delimited.  The CRLF (or just the CR or 
just the LF) could serve as an end of header marker just as well.

Liz


Re: Object freezing

2003-10-21 Thread Clark C. Evans
On Tue, Oct 21, 2003 at 07:41:08PM +0200, Elizabeth Mattijsen wrote:
| If you ask me, you could do easy with a simple header line like:
| 
|   parrot xml 1.0
|   \0
| 
| basically magic word ('parrot')
|  followed by a space
|  followed by the type
|  followed by a space
|  followed by version
|  followed by a CRLF (not sure about this one, but could be nice)
|  followed by a null byte
| 
| I'm not clear if you would know beforehand how many bytes of data you 
| would receive.  If that is possible to know at all time, then I would 
| suggest having the length as an extra part of the header.

If you are going to go this far (including content-length) may I 
just suggest using a MIME envelope?  This has several advantages:

 - there are already readers for the format
 - it allows you to specify the 'Content-Type' as, say binary/parrot
   or text/yaml or text/xml
 - it gives you a place to put 'Content-Length'
 - it is extensible, allowing for other headers
 - it allows you to include other 'binary' blobs in the same file

Best,

Clark


Re: Object freezing

2003-10-21 Thread Dan Sugalski
On Tue, 21 Oct 2003, Clark C. Evans wrote:

> Back to the YAML list... sorry for interloping!

Ah, you weren't interloping--it is a good idea. You just managed to come
in on the other side of Good Enough today. :)

Dan


Re: Object freezing

2003-10-21 Thread Clark C. Evans
Dan/Elizabeth,

Thank you for considering my response, let me rephrase and then
I'll go back to my own list (*grins*).

On Tue, Oct 21, 2003 at 09:25:48PM +0200, Elizabeth Mattijsen wrote:
| At 15:18 -0400 10/21/03, Dan Sugalski wrote:
| >On Tue, 21 Oct 2003, Clark C. Evans wrote:
| > > If you are going to go this far (including content-length) may I
| > > just suggest using a MIME envelope?  This has several advantages:
| >This is a very good idea, but not this time, as it's too easy to get stuck
| >in the endless churn of very good ideas and alternatives.

I should have just suggested RFC822 like headers (used by E-Mail 
and HTTP) and not imply that MIME, with its multi-part and encoding
glory need be supported.   In other words, the header could simply be:

   Parrot-version: 0.3
   Content-type: binary/parrot     <- or text/yaml, text/xml
   Content-length: 49384
                                   <- blank line
   (binary payload)

This has the advantages of:
  (a) satisfies mentioned requirements: version, type, and size
  (b) easy to parse, well known syntax
  (c) fits in well with Intranet infrastructure
  (d) easy to extend down the road, ie, more headers can be added

| I would think the MIME-envelope would have to be part of the data, 
| rather than in the header.  Or am I missing anything?  It's 
| encode/decoder determined, is it not?

Well, it seemed you were making an 'envelope', and this is
exactly what RFC822 is all about.  In particular, all of the
items you wanted to put in your header could be done easily
with RFC822. 

| >The simple header format, without the null byte (taking it out is a good
| >idea, since we may have the possibility of an all-text file in that case),
| 
| I have no particular feeling about the null byte.  It would just be a 
| convenience when debugging as it would allow you to just print the 
| string, as it would be null-delimited.  The CRLF (or just the CR or 
| just the LF) could serve as an end of header marker just as well.

RFC822 uses a "blank line", that is two adjacent "CRLF" items to 
mark the end of the header.
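
A rough sketch of reading such an envelope, in Python. The function
name parse_frozen_stream is hypothetical, as is the choice to treat
Content-length as mandatory; header folding and other RFC822 details
are deliberately ignored. It only shows the split at the blank line
and the use of the length field.

```python
def parse_frozen_stream(data: bytes):
    """Split an RFC822-style envelope into (headers, payload).

    Assumptions for this sketch: lines end in CRLF, there are no folded
    (continued) header lines, and Content-length is always present.
    """
    # The header ends at the first blank line, i.e. two adjacent CRLFs.
    head, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line terminating the header")

    headers = {}
    for line in head.split(b"\r\n"):
        name, colon, value = line.partition(b":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are case-insensitive, so store them lower-cased.
        headers[name.strip().lower().decode("ascii")] = value.strip().decode("ascii")

    length = int(headers["content-length"])
    payload = rest[:length]
    if len(payload) != length:
        raise ValueError("payload shorter than Content-length")
    return headers, payload


if __name__ == "__main__":
    stream = (b"Parrot-version: 0.3\r\n"
              b"Content-type: text/yaml\r\n"
              b"Content-length: 5\r\n"
              b"\r\n"
              b"hello")
    print(parse_frozen_stream(stream))
```

Because the payload is cut by Content-length rather than by a
terminator, it can hold arbitrary binary data, including CRLF pairs.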

Back to the YAML list... sorry for interloping!

Clark


Re: Object freezing

2003-10-21 Thread Elizabeth Mattijsen
At 12:56 -0700 10/21/03, Clark C. Evans wrote:
>On Tue, Oct 21, 2003 at 09:25:48PM +0200, Elizabeth Mattijsen wrote:
>| At 15:18 -0400 10/21/03, Dan Sugalski wrote:
>| >On Tue, 21 Oct 2003, Clark C. Evans wrote:
>| > > If you are going to go this far (including content-length) may I
>| > > just suggest using a MIME envelope?  This has several advantages:
>| >This is a very good idea, but not this time, as it's too easy to get stuck
>| >in the endless churn of very good ideas and alternatives.
>
>I should have just suggested RFC822 like headers (used by E-Mail
>and HTTP) and not imply that MIME, with its multi-part and encoding
>glory need be supported.   In other words, the header could simply be:
>
>   Parrot-version: 0.3
>   Content-type: binary/parrot     <- or text/yaml, text/xml
>   Content-length: 49384
>                                   <- blank line
>   (binary payload)

But do we always need that?  In my idea it would be something like:

prrt 1.0 yaml    # prrt = magic word, 1.0 = parrot header version, yaml = encode ID
Parrot-version: 0.3
Content-type: binary/parrot
Content-length: 49384

(binary payload)

The Parrot header line would just be enough to get the right decoder, 
whatever the decoder does with the rest of the stream, is up to the 
decoder.  Another example with XML:

prrt 1.0 xml  # assume XML


 xmlns:parrot="http://www.parrotcode.org/0.3";  # implies MIME-encode binary data
(mime encoded binary data)


And another one with an oldy but goody?

prrt 1.0 storable   # assume storable
(whatever Storable.pm puts in its magic)
Hope this made sense.
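
To make the idea concrete, here is a small Python sketch of picking a
decoder from the first line. Everything named here (read_magic_line,
DECODERS, thaw as a Python function) is hypothetical, and the decoders
are stand-ins that just tag the remaining bytes. The "# ..." remarks in
the examples above are taken to be annotations, not part of the line.

```python
def read_magic_line(stream: bytes):
    """Return (header_version, encode_id, rest) from a 'prrt' first line."""
    line, sep, rest = stream.partition(b"\n")
    if not sep:
        raise ValueError("no end-of-line after the magic line")
    # Accept either LF or CRLF as the line ending.
    fields = line.rstrip(b"\r").decode("ascii").split()
    if len(fields) != 3 or fields[0] != "prrt":
        raise ValueError("not a frozen parrot stream")
    _, version, encode_id = fields
    return version, encode_id, rest


# Hypothetical decoder table: each entry receives the rest of the stream
# and is free to interpret it however its format requires.
DECODERS = {
    "yaml": lambda body: ("yaml", body),
    "xml": lambda body: ("xml", body),
    "storable": lambda body: ("storable", body),
}


def thaw(stream: bytes):
    """Look at the first line only, then hand the rest to the decoder."""
    version, encode_id, rest = read_magic_line(stream)
    try:
        decoder = DECODERS[encode_id]
    except KeyError:
        raise ValueError(f"no decoder registered for {encode_id!r}") from None
    return decoder(rest)


if __name__ == "__main__":
    print(thaw(b"prrt 1.0 yaml\nParrot-version: 0.3\n"))
```

The dispatcher never looks past the first line, so any further headers
(or none at all) remain the decoder's business.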

Liz


Re: Class creation in bytecode

2003-10-21 Thread Matt Fowles
All~

Dan Sugalski wrote:
> To add or remove an implemented interface:
>
>   adddoes Px, Sy
>   removedoes Px, Sy
>
> Instantiate, as implemented, is dead. I'm going to nuke it, then use it
> for instantiating classes via metadata chunks. That's next message.

Just a thought, but (add/remove)interface seems a little more 
understandable...

Matt

PS-Dan, what happened to your sig?  I rather liked it.



Re: Taint mode testing and project Phalanx

2003-10-21 Thread Michael G Schwern
On Tue, Oct 21, 2003 at 12:34:44PM -0500, Dave Rolsky wrote:
> Anyway, my taint mode experience has been that random things break in very
> weird ways when using it.

All the more reason to test with it on. :)


-- 
Michael G Schwern        [EMAIL PROTECTED]  http://www.pobox.com/~schwern/
Do not try comedy at home!  Milk & Cheese are advanced experts!  Attempts at
comedy can be dangerously unfunny!


Re: Taint mode testing and project Phalanx

2003-10-21 Thread Andrew Savige
Michael G Schwern wrote:
> On Tue, Oct 21, 2003 at 12:34:44PM -0500, Dave Rolsky wrote:
>> Anyway, my taint mode experience has been that random things break in very
>> weird ways when using it.
> 
> All the more reason to test with it on. :)

Given the differences in behaviour with taint mode, it seems to me
that for a "taint mode test" (i.e. one with -wT in its first line)
Test::Harness should run the test twice -- once with taint mode and
once without. Though I suppose there might be a case where you want
to run the test in taint mode only, so maybe Test::Harness needs
some options to control this.

/-\




Re: Class metadata for PIR/assembly files

2003-10-21 Thread Joseph Ryan
Dan Sugalski wrote:

> Here's the scoop:
>
> Metadata for classes is simple. In PIR/assembly, they're noted with
> .things:
>
>  .class Foo
>    .is bar
>    .is baz
>    .does some_thing
>    .member x
>    .member y
>    .member z
>  .ssalc
>
> Unless someone tells me that ssalc is horribly obscene in some relatively
> common language, and we may still if the translation amuses me
> sufficiently.
>
> Keywords are simple for the metadata. .class starts the declaration, has a
> single parameter the name. Class declarations end with .ssalc. Each .is
> defines a parent class, each .does defines an interface the class
> supports, and each .member defines a PMC member slot that each object.
>
> If a class is defined in the bytecode, it gets instantiated when the
> bytecode is created. (It's a constant class, though like any other class
> is mutable at runtime so it's not that constant) There is no difference
> between a class created with metadata and one created by executable code
> piecemeal.
>
> Classes, when instantiated, have a backing namespace that's identical to
> the class name.
>
> We will be adding version metadata to the classes, but that's going to be
> deferred.
>
> It's OK for the code that handles PIR and assembly to ignore this for the
> moment, at least until the metadata segment is better defined. Which will
> be soon, though I'd rather someone else do the bytecode modification as
> it's been a long time since I've had my hand in there.
>
> This would be a good time to comment on the metadata, as I'm about to go
> finish defining the ops to create classes dynamically and actually finish
> the fscking object.c. code to do it.

Will there be a way to specify which methods belong to the class in the
metadata?  Or will Method namespaces just have to match class names so
that a lookup can be done?

-Joe




Re: Class metadata for PIR/assembly files

2003-10-21 Thread Melvin Smith
At 07:44 PM 10/21/2003 -0400, Joseph Ryan wrote:
>Dan Sugalski wrote:
>
>>Here's the scoop:
>>
>>Metadata for classes is simple. In PIR/assembly, they're noted with
>>.things:
>>
>> .class Foo
>>   .is bar
>>   .is baz
>>   .does some_thing
>>   .member x
>>   .member y
>>   .member z
>> .ssalc
>
>Will there be a way to specify which methods belong to the class in the
>metadata?  Or will Method namespaces just have to match class names so
>that a lookup can be done?

I was planning a .method directive. I like the feel of separate .field and
.method directives. I like supporting 2 variations like C++, however this
is only an intermediate language so it really doesn't matter.

.class Foo
  .method InlineMeth
  (code)
  .endmeth
  .method NotInline ...
.endclass

.method Foo.NotInLine
(code)
.endmethod

Using out of line definitions with inline declarations means the compiler
can be single pass and simpler, however most decent compilers
will do a separate semantic pass so forward declarations are easy.
On the other hand, inline method definitions makes code emitting
a little simpler.

I see no real big technical problem with supporting both syntaxes; I think
it is more proof-of-concept than anything since eventually we will pass
a syntax tree form to the compiler instead.

-Melvin




Re: Class metadata for PIR/assembly files

2003-10-21 Thread Melvin Smith
At 02:55 PM 10/21/2003 -0400, Dan Sugalski wrote:
>Here's the scoop:
>
>Metadata for classes is simple. In PIR/assembly, they're noted with
>.things:
>
>  .class Foo
>    .is bar
>    .is baz
>    .does some_thing
>    .member x
>    .member y
>    .member z
>  .ssalc
>
>Unless someone tells me that ssalc is horribly obscene in some relatively
>common language, and we may still if the translation amuses me
>sufficiently.

I'm sure ssalc must mean something bad somewhere. Technically
nothing is stopping us from using .end for everything since we
are using an LALR parser and don't need fancy error reporting.

>Classes, when instantiated, have a backing namespace that's identical to
>the class name.

Good.

So do we support :: or . for scope resolution? Or both?

>It's OK for the code that handles PIR and assembly to ignore this for the
>moment, at least until the metadata segment is better defined. Which will
>be soon, though I'd rather someone else do the bytecode modification as
>it's been a long time since I've had my hand in there.

Well we can hide this under PIR. Once PIR is set, we can
start by implementing on the fly class creation, then change
IMCC to emit metadata when the rest is in. That way
HL languages don't have to change later. For now we just have
IMCC emit newclass, etc. and manually construct the classes.

-Melvin




Re: Taint mode testing and project Phalanx

2003-10-21 Thread Dave Rolsky
On Tue, 21 Oct 2003, Michael G Schwern wrote:

> On Tue, Oct 21, 2003 at 12:34:44PM -0500, Dave Rolsky wrote:
> > Anyway, my taint mode experience has been that random things break in very
> > weird ways when using it.
>
> All the more reason to test with it on. :)

At this point I've become rather disgusted with it.  When taint mode
breaks pos(), and as a result your regex-based parser blows up in weird
ways, and you spend many, many hours figuring out what exactly is
happening, and then can't reduce it to a simple test case, you tend to get
a little peeved.

Tim's #ifdef idea for testing taint mode seems like a really good idea.
Once I know it's well tested in the core, I'll be happy to test my own
modules with it.


-dave

/*===
House Absolute Consulting
www.houseabsolute.com
===*/