Re: Compiling to Parrot

2003-01-22 Thread Dan Sugalski
At 8:54 AM -0500 1/21/03, Christopher Armstrong wrote:

On Tue, Jan 21, 2003 at 08:41:47AM +, Simon Wistow wrote:

 Speaking of games, it would be interesting to see Parrot be used in that
 direction. A lot of games currently are pretty much developed along the
 lines of 'custom scripting language interfaced to custom game engine'


One of the reasons I'm interested in Parrot -- I'm hoping that it's
going to have some secure execution facilities built-in from the
ground up (to facilitate user-code on virtual world servers) :-)


Yep. I've not spec'd them out as I've been trying to deal with other 
things, but secure execution is something I've been thinking of since 
the beginning. Perl 5's model has some rather significant flaws, as 
does Java's sandboxing, albeit fewer of them.
--
Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Flex & IMCC

2003-01-22 Thread Dan Sugalski
At 6:10 PM +0100 1/21/03, Leopold Toetsch wrote:

Dan Sugalski wrote:


Okay, I can be a bit slow, but I finally figured out what's going 
on with IMCC and OS X. imclexer.c is autogenerated (duh!) and flex, 
or whatever's being used to do it, spits out bad code. Could the 
IMCC folks upgrade to the latest version of flex to see if that 
fixes things and, if not, I'll hack up a post-processing program to 
make the output buildable.

Could you be more precise in your error descriptions?
Error message, file line # ... ;-)


:-P

It's a link-time error--yyin is multiply defined.


If you are talking about the yyin, yyout thingy, this should be 
fixed. This was a double define.

Cool. I didn't see where in the source it was an issue, but I'm not 
intimately familiar with the quirks of flex.
--
Dan



Re: Compiling to Parrot

2003-01-22 Thread Dan Sugalski
At 5:48 PM +0100 1/21/03, K Stol wrote:

Is it possible for parrot-code to call functions in other parrot files?
(which implies there is some program which consists of multiple files)


Oh, absolutely.

What one generally does is load in other bytecode files. Those files, 
on loading, will install a variety of symbols into the interpreter's 
global pool, at which point you can use them as you need to.

This is, generally speaking, a dynamic linking scheme, rather than 
the more traditional static link that resolves symbols and such, as 
we're dealing with a much more dynamic environment.
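
A minimal sketch of the load-time symbol installation described above, written in Python rather than Parrot bytecode. The names `Interpreter`, `load_bytecode` and `call` are hypothetical and are not Parrot's API; the point is only that a loaded file registers names in one shared pool and that callers resolve those names when they run, not when they are linked.

```python
# Toy model of dynamic linking through a global symbol pool.
# All names here are illustrative assumptions, not Parrot's real interface.
class Interpreter:
    def __init__(self):
        # One global pool shared by every loaded file.
        self.globals = {}

    def load_bytecode(self, symbols):
        # "Loading" a file installs each symbol it defines into the pool.
        self.globals.update(symbols)

    def call(self, name, *args):
        # Names are resolved at call time, not at link time.
        return self.globals[name](*args)


interp = Interpreter()
interp.load_bytecode({"double": lambda x: 2 * x})
# The second "file" refers to "double" by name only.
interp.load_bytecode(
    {"quadruple": lambda x: interp.call("double", interp.call("double", x))}
)
result = interp.call("quadruple", 3)
```

Because lookup happens at call time, the second file could have been loaded before the first without any change to either.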

- Original Message -
From: "Dan Sugalski" <[EMAIL PROTECTED]>
To: "K Stol" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Tuesday, January 21, 2003 5:20 PM
Subject: Re: Compiling to Parrot



 At 5:13 PM +0100 1/21/03, K Stol wrote:
 >Only thing I need to know before I can start is: what would the purpose be
 >of a Lua to Parrot compiler? Lua is originally an embedded language for
 >easy-scripting, as far as I understand. How could it be used when targeted
 >to parrot? Would it be possible to call functions written in Lua (and which
 >are then compiled to parrot) from (for example) a python script? (So: python
 >script calls function written in Lua and compiled to parrot).

 If you follow the calling conventions, then yes you'll be able to
 call python/ruby/perl/befunge routines from Lua code, and vice versa.

 >From: "Dan Sugalski" <[EMAIL PROTECTED]>
 >  > At 5:01 PM +0100 1/21/03, K Stol wrote:
 >>  >well, I think not, then I can't help it. What do you think about compiling
 >>  >Lua to parrot (IMCC)?
 >>
 >>  I like the idea, and I don't think you'll see anyone else tackle it
 >>  for a while. (And if that falls through, there's always LISP... :)
 >>
 >>  >From: "Dan Sugalski" <[EMAIL PROTECTED]>
 >>  >  > At 4:46 PM +0100 1/21/03, K Stol wrote:
 >>  >>  >Well, I'd do it as a project for my Bachelor's, so I won't get permission to
 >>  >>  >do such a project, if it already exists.
 >>  >>
 >>  >>  Ah, that could be a problem. Will it be a problem if you start a
 >>  >>  project that someone else later also starts?
 >>  >>
 >>  >>  >From: "Dan Sugalski" <[EMAIL PROTECTED]>
 >>  >>  >  > At 9:17 AM +0100 1/21/03, K Stol wrote:
 >>  >>  >>  >Hi there,
 >>  >>  >>  >
 >>  >>  >>  >A few weeks ago I posted something about a Tcl->parrot compiler, but
 >>  >>  >>  >Will Coleda already was working on such a project. It would be as
 >>  >>  >>  >a final project for my bachelor's. But because such already exists,
 >>  >>  >>  >I'm looking for something else.
 >>  >>  >>
 >>  >>  >>  If you're interested in doing a Tcl compiler, by all means, go
 >>  >>  >>  ahead--I wouldn't let the fact that someone else is doing it stop
 >>  >>  >>  you. The point of doing it is for the experience, which you'll get
 >>  >>  >>  regardless of any other implementation. It's also distinctly possible
 >>  >>  >>  that neither you nor Will will finish a full implementation (as it's
 >>  >>  >>  likely a rather large undertaking for one person) but you'll each
 >>  >>  >>  have part of it that can be merged together.


--
Dan




Re: Objects, finally (try 1)

2003-01-22 Thread Dan Sugalski
At 9:24 PM + 1/21/03, Piers Cawley wrote:

Dan Sugalski <[EMAIL PROTECTED]> writes:
 > Hrm, interesting. Single symbol table for methods and attributes,

 though that's not too surprising all things considered. That may make
 interoperability interesting, but I was already expecting that to some
 extent.


Isn't that, essentially what Perl 6 will have too?


Nope. Attributes and methods will be very separate. Attributes are 
class-private, more or less, so they won't be in the symbol table. 
Methods, OTOH, will be, as they aren't really private at all.

I've been thinking about how to handle methods, as we need a 
mechanism everyone can share--you need a method cache for good 
performance, and the last thing I want to have to deal with is a 
dozen method caches for a dozen different language implementations, 
especially as everyone's guaranteed to get the first version wrong. 
(Plus, of course, there's MM dispatch to deal with, which needs to be 
global as well)
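
A minimal sketch of what a single shared method cache could look like, using Python classes as stand-ins for the classes of any hosted language. `MethodCache`, `find_method` and `invalidate` are hypothetical names, and the lookup simply walks Python's own MRO; nothing here is taken from Parrot's source.

```python
# One cache for every language: keyed on (class, method name).
# Illustrative only; the names and the MRO-based lookup are assumptions.
class MethodCache:
    def __init__(self):
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def find_method(self, cls, name):
        key = (cls, name)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        # Slow path: walk the inheritance chain once, then remember the result.
        for klass in cls.__mro__:
            if name in vars(klass):
                method = vars(klass)[name]
                self.cache[key] = method
                return method
        raise AttributeError(name)

    def invalidate(self):
        # Any change to a class's methods must flush stale entries.
        self.cache.clear()


class Animal:
    def speak(self):
        return "..."


class Dog(Animal):
    pass


cache = MethodCache()
first = cache.find_method(Dog, "speak")    # miss: walks Dog, then Animal
second = cache.find_method(Dog, "speak")   # hit: served from the cache
```

Keeping invalidation in one place is the main argument for a shared cache: each language would otherwise have to get that part right on its own.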
--
Dan



RE: Compiling to Parrot

2003-01-22 Thread Dan Sugalski
At 11:43 PM -0800 1/21/03, Paul Du Bois wrote:

The advantage of Lua (at least for my project, which is a game) is that it
is quite easy to embed, and quite easy to customize.  The C API is small and
easily understandable (at the expense of being a little bit of a pain to
use), and the internals are simple and quite malleable.  The language itself
is pretty ugly IMHO.

So... I can't think of a good purpose of Lua/Parrot myself.  I'm not trying
to discourage you by any means!  If I were to embed a Parrot interpreter for
our next game (!) I'd happily leave Lua behind.


Well, then, let's see what we can do to make parrot suitable for your needs. :)


 > Only thing I need to know before I can start is: what would the purpose be
 > of a Lua to Parrot compiler? Lua is originally an embedded language for
 > easy-scripting, as far as I understand. How could it be used when targeted
 > to parrot? Would it be possible to call functions written in Lua (and which
 > are then compiled to parrot) from (for example) a python script? (So: python
 > script calls function written in Lua and compiled to parrot).

 Klaas-Jan



--
Dan




Re: [perl #20315] [PATCH] eval - inter code segment branches

2003-01-22 Thread Leopold Toetsch
Jason Gloudon wrote:


On Tue, Jan 21, 2003 at 08:21:42PM +0100, Leopold Toetsch wrote:



# #!/usr/bin/perl -w
# my $i= 5;
# LAB:
#$i++;
#eval("goto LAB if ($i==6)");



Ok. Having inter_cs call DO_OP just seems more involved than it has to be.



Yep.



How about a single self-contained inter-segment jump instruction?

Since the compiler knows when a branch is non-local it can always break a
non-local conditional branch into a conditional local branch to a non-local
branch instruction.



This would mean rewriting the branch target to point to a location 
after the end of the current sub (or end of program).

  if i, non_local1

would become

  if i, taken1
...

  end/ret # whatever
taken1: inter_cs_jump non_local1
...

Yep. Seems really much simpler. I'll try this approach.
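
The rewrite discussed above can be sketched as a small pass over an instruction list. This is a Python model, not IMCC code: instructions are plain tuples, `split_nonlocal_branches` is a hypothetical helper, and it assumes the input already ends with `end` or a return so that the appended stubs are never reached by falling through.

```python
# Turn each conditional branch to a non-local label into a local conditional
# branch plus a stub that performs the inter-segment jump.
# The tuple encoding and the function name are assumptions for illustration.
def split_nonlocal_branches(code, local_labels):
    body, stubs = [], []
    for instr in code:
        if instr[0] == "if" and instr[2] not in local_labels:
            stub = "taken%d" % (len(stubs) // 2 + 1)
            # The conditional branch now stays inside the segment.
            body.append(("if", instr[1], stub))
            # The stub, placed after the end of the sub, leaves the segment.
            stubs.append(("label", stub))
            stubs.append(("inter_cs_jump", instr[2]))
        else:
            body.append(instr)
    return body + stubs


code = [("if", "i", "non_local1"), ("print", "i"), ("end",)]
rewritten = split_nonlocal_branches(code, local_labels=set())
unchanged = split_nonlocal_branches([("if", "i", "loop"), ("end",)], {"loop"})
```

Only one opcode then needs to know how to leave a segment; every conditional keeps its ordinary local form.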

Thanks for your input,
leo



Re: [perl #18056] [PATCH] Extending the Packfile (Part 1.)

2003-01-22 Thread James Mastros
On 01/21/2003 5:24 AM, Leopold Toetsch wrote:

Jürgen Bömmels (via RT) wrote:
PS orig description again:

This patch is the beginning of an effort to make the PackFile format
extensible. At the moment it's compatible with the old bytecode
format.

Ok, to the details:
It appends a 4th segment behind the 3 already defined segments (FIXUP,
CONSTANT and BYTECODE) named DIRECTORY. After this directory one or
more uniquely named segments may follow.

Please, before 1.0 is the time for incompatible changes to the bytecode 
format, rather than ugly hacks that bend over backwards to achieve 
backwards compatibility.

If we care about reading old packfiles on newer parrots, that's easy 
enough to do without this.  The only thing being hacky buys us is the 
ability to run new packfiles on old parrots.

I'd be much happier seeing a packfile format that began with DIRECTORY, 
and then had the other major sections located dynamically.
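
A minimal sketch of a directory-first layout, assuming a made-up binary format: a header with a magic number and a segment count, then fixed-size directory entries (name, offset, size), then the segment data. The magic value, field widths and function names are all hypothetical and are not Parrot's packfile format.

```python
import struct

MAGIC = 0x50415243                   # hypothetical magic number, not Parrot's
HEADER = struct.Struct("<II")        # magic, number of segments
ENTRY = struct.Struct("<16sII")      # segment name, offset, size


def pack(segments):
    """Serialize {name: bytes} with the directory placed first."""
    offset = HEADER.size + ENTRY.size * len(segments)
    directory, payload = b"", b""
    for name, data in segments.items():
        directory += ENTRY.pack(name.encode("ascii"), offset, len(data))
        payload += data
        offset += len(data)
    return HEADER.pack(MAGIC, len(segments)) + directory + payload


def unpack(blob):
    """Locate every segment through the directory, in any order."""
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic number; refusing to load")
    segments = {}
    for i in range(count):
        raw_name, offset, size = ENTRY.unpack_from(
            blob, HEADER.size + i * ENTRY.size
        )
        name = raw_name.rstrip(b"\0").decode("ascii")
        segments[name] = blob[offset:offset + size]
    return segments


original = {"FIXUP": b"\x01\x02", "CONSTANT": b"abc", "BYTECODE": b"\x00" * 5}
restored = unpack(pack(original))
```

With the directory up front, a reader needs no knowledge of segment order, and a new segment type only adds one more directory entry.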

(Then again, making me happy shouldn't be anybody's priority.)

	-=- James Mastros



RE: Compiling to Parrot

2003-01-22 Thread Chad Fowler
I lost the original mail asking for suggestions, so there is no quoted 
text here, but have you looked at Joy 
(http://www.latrobe.edu.au/philosophy/phimvt/joy.html).  Looks to be quite 
clean and simple.  I haven't had the time to delve into it, but when I was 
reminded of it on the Ruby list, I thought I would suggest it here.

Chad




[CVS ci] branch_cs - intersegment branch

2003-01-22 Thread Leopold Toetsch
This patch adds a new opcode for intersegment branches. I named it
"branch_cs". It takes one $I param, which is the entry in the
fixuptable.
Thanks to Jason Gloudon for the hint on how to handle this beast.
(See thread "[perl #20315] [PATCH] eval - inter code segment branches".)

The fixuptable may hold items of different types. I defined item
type "0", holding a segment number and a branch offset.

Finally, to get a non-local return from the evaled code, I split
the runops loop into two parts. The inner part handles resumes after
e.g. the trace opcode; the outer loop handles intersegment jumps,
marked by the resume_flag being 2.

To make this play with JIT too, I had to make the "invoke" opcode a
restartable opcode, which might be too expensive for the normal case.

So we might want an additional "invoke_cs" or "eval" opcode
that may leave the current code segment.
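
A rough Python model of the two-level run loop described above. It is a sketch under heavy simplification: there is no `invoke` and no resume_flag, every segment switch (including the call into the evaled code and the return from it) goes through `branch_cs`, and the op names other than `branch_cs` are invented for the example.

```python
# Inner loop: runs ops inside one segment.
# Outer loop: picks up a (segment, offset) pair from the fixup table and
# restarts the inner loop there. Encoding and names are assumptions.
def run(segments, fixups, regs):
    seg, pc = 0, 0
    while True:
        code = segments[seg]
        resume = None
        while resume is None:
            op = code[pc]
            if op[0] == "inc":
                regs[op[1]] += 1
                pc += 1
            elif op[0] == "if_eq":
                _, reg, value, target = op
                pc = target if regs[reg] == value else pc + 1
            elif op[0] == "branch_cs":
                resume = fixups[op[1]]      # leave the inner loop
            elif op[0] == "end":
                return regs
            else:
                raise ValueError("unknown op: %r" % (op,))
        seg, pc = resume


segments = [
    # segment 0: LAB: inc I1; enter the evaled code; end
    [("inc", "I1"), ("branch_cs", 1), ("end",)],
    # segment 1: if I1 == 6 goto LAB (in segment 0), else return
    [("if_eq", "I1", 6, 2), ("branch_cs", 2), ("branch_cs", 0)],
]
fixups = [(0, 0), (1, 0), (0, 2)]   # LAB, eval entry, return point
final = run(segments, fixups, {"I1": 5})
```

The register ends at 7, matching the expected output of the eval test: the first pass takes the non-local branch back to LAB, the second falls through and returns.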

Here is one test with trace, to better see what's going on:

$ imcc -t t/syn/eval_3.imc
PC=0; OP=73 (set_i_ic); ARGS=(I1=0, 5)
PC=3; OP=87 (set_s_sc); ARGS=(S0=(null), ".sub _e\nif I1 == 6 g")
PC=6; OP=841 (compreg_p_sc); ARGS=(P2, "PIR")
PC=9; OP=838 (compile_p_p_s); ARGS=(P0, P2=Compiler=PMC(0x8183328), \
	S0=".sub _e\nif I1 == 6 g")
PC=13; OP=322 (inc_i); ARGS=(I1=5)
PC=15; OP=837 (invoke)
*** invoking EVAL_1
PC=0; OP=158 (eq_i_ic_ic); ARGS=(I1=6, 6, 5)
PC=5; OP=739 (branch_cs_ic); ARGS=(0)
*** back from EVAL_1
*** Resume at seg 0 ofs 13
PC=13; OP=322 (inc_i); ARGS=(I1=6)
PC=15; OP=837 (invoke)
*** invoking EVAL_1
PC=0; OP=158 (eq_i_ic_ic); ARGS=(I1=7, 6, 5)
PC=4; OP=0 (end)
*** back from EVAL_1
PC=16; OP=21 (print_i); ARGS=(I1=7)
7PC=18; OP=26 (print_sc); ARGS=("\n")

PC=20; OP=0 (end)

Have fun,
leo




Re: [perl #18056] [PATCH] Extending the Packfile (Part 1.)

2003-01-22 Thread Leopold Toetsch
James Mastros wrote:


I'd be much happier seeing a packfile format that began with DIRECTORY, 
and then had the other major sections located dynamically.


Yep. The simple reason for keeping the old format a while longer is 
assemble.pl. Once the switch to imcc is done, there is no need to keep 
the old format.


(Then again, making me happy shouldn't be anybody's priority.)



I'm unhappy too, with the ugliness of both formats in packfile.c ;-)



-=- James Mastros


leo




Re: [CVS ci] branch_cs - intersegment branch

2003-01-22 Thread Dan Sugalski
At 4:53 PM +0100 1/22/03, Leopold Toetsch wrote:

This patch adds a new opcode for intersegment branches. I named it
"branch_cs". It takes one $I param, which is the entry in the
fixuptable.
Thanks to Jason Gloudon for the hint on how to handle this beast.
(See thread "[perl #20315] [PATCH] eval - inter code segment branches".)


No, this isn't how we're doing intersegment branches. Plain jump will 
work just fine for this, the only thing we'd potentially need is the 
address of the routine to jump through, and that only if we're not 
going to require sub PMCs for everything. (And, honestly, I am and 
have been leaning this way)

--
Dan



Re: [CVS ci] branch_cs - intersegment branch

2003-01-22 Thread Leopold Toetsch
Dan Sugalski wrote:


At 4:53 PM +0100 1/22/03, Leopold Toetsch wrote:


This patch adds a new opcode for intersegment branches. I named it
"branch_cs". It takes one $I param, which is the entry in the
fixuptable.
Thanks to Jason Gloudon for the hint on how to handle this beast.
(See thread "[perl #20315] [PATCH] eval - inter code segment branches".)



No, this isn't how we're doing intersegment branches. Plain jump will 
work just fine for this, the only thing we'd potentially need is the 
address of the routine to jump through, and that only if we're not going 
to require sub PMCs for everything. (And, honestly, I am and have been 
leaning this way)


IMHO plain jumps do not work:
- How to get out of JIT code?
- How to jump into not-yet-JITted code?
- How to set up new bounds for the current code segment?
- How to run a plain jump with CGoto or Prederef?

leo







Re: [CVS ci] branch_cs - intersegment branch

2003-01-22 Thread Dan Sugalski
At 6:13 PM +0100 1/22/03, Leopold Toetsch wrote:

Dan Sugalski wrote:


At 4:53 PM +0100 1/22/03, Leopold Toetsch wrote:


This patch adds a new opcode for intersegment branches. I named it
"branch_cs". It takes one $I param, which is the entry in the
fixuptable.
Thanks to Jason Gloudon for the hint on how to handle this beast.
(See thread "[perl #20315] [PATCH] eval - inter code segment branches".)



No, this isn't how we're doing intersegment branches. Plain jump 
will work just fine for this, the only thing we'd potentially need 
is the address of the routine to jump through, and that only if 
we're not going to require sub PMCs for everything. (And, honestly, 
I am and have been leaning this way)


IMHO plain jumps do not work:


Sure they do. They work as well as jumps within code, which also has 
a not-insignificant potential for problems.

But the issues you raised are some of the reasons I'd prefer 
inter-segment jumping to be done via sub dispatch.
--
Dan



Re: [CVS ci] branch_cs - intersegment branch

2003-01-22 Thread Leopold Toetsch
Dan Sugalski wrote:


At 6:13 PM +0100 1/22/03, Leopold Toetsch wrote:

IMHO plain jumps do not work:



Sure they do. They work as well as jumps within code, which also has a 
not-insignificant potential for problems.

But the issues you raised are some of the reasons I'd prefer 
inter-segment jumping to be done via sub dispatch.


Ok then:


output_is(<<'CODE', <<'OUT', "intersegment branch");
# #!/usr/bin/perl -w
# my $i= 5;
# LAB:
#$i++;
#eval("goto LAB if ($i==6)");
#print "$i\n";
#
# 7
#

.sub _test
   I1 = 5
   $S0 = ".sub _e\nif I1 == 6 goto LAB\nend\n.end\n"
   compreg P2, "PIR"
   compile P0, P2, $S0
LAB:
   inc I1
   invoke
   print I1
   print "\n"
   end
.end
CODE
7
OUT






RE: Objects, finally (try 1)

2003-01-22 Thread Matt Fowles
All~

Regarding MM dispatch, I implemented a version of this for a class of mine
once.  The version I have is in C++, and heavily uses templating to provide
compile time type checks where appropriate.  Despite these differences,
however, I think that the system of caching methods and the system of
finding the appropriate method could be easily adapted.

http://www.cs.swarthmore.edu/~bulnes/PL/lab1/index.html

If anyone finds that interesting or has corrections for me, please send
them.

Boots
-
"Computer Science is merely the post-Turing decline of Formal Systems
Theory."
-???


> -Original Message-
> From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, January 22, 2003 2:55 AM
> To: Piers Cawley
> Cc: Christopher Armstrong; [EMAIL PROTECTED]
> Subject: Re: Objects, finally (try 1)
>
>
> At 9:24 PM + 1/21/03, Piers Cawley wrote:
> >Dan Sugalski <[EMAIL PROTECTED]> writes:
> >  > Hrm, interesting. Single symbol table for methods and attributes,
> >>  though that's not too surprising all things considered. That may make
> >>  interoperability interesting, but I was already expecting that to some
> >>  extent.
> >
> >Isn't that, essentially what Perl 6 will have too?
>
> Nope. Attributes and methods will be very separate. Attributes are
> class-private, more or less, so they won't be in the symbol table.
> Methods, OTOH, will be, as they aren't really private at all.
>
> I've been thinking about how to handle methods, as we need a
> mechanism everyone can share--you need a method cache for good
> performance, and the last thing I want to have to deal with is a
> dozen method caches for a dozen different language implementations,
> especially as everyone's guaranteed to get the first version wrong.
> (Plus, of course, there's MM dispatch to deal with, which needs to be
> global as well)
> --
>  Dan
>
> --"it's like this"---
> Dan Sugalski  even samurai
> [EMAIL PROTECTED] have teddy bears and even
>teddy bears get drunk




Bytecode metadata

2003-01-22 Thread Dan Sugalski
Since it looks like it's time to extend the packfile format and the 
in-memory bytecode layout, this would be the time to start discussing 
metadata. What sorts of metadata do people think are useful to have 
in either the packfile (on disk) or in the bytecode (in memory)?

Keep in mind that parrot may be in the position where it has to 
ignore or mistrust the metadata, so be really cautious with things 
you propose as required.
--
Dan



Transferring control between code segments, eval, and suchlike things

2003-01-22 Thread Dan Sugalski
Okay, since this has all come up, here's the scoop from a design perspective.

First, the branch opcodes (branch, bsr, and the conditionals) are all 
meant for movement within a segment of bytecode. They are *not* 
supposed to leave a segment. To do so was arguably a bad idea, now 
it's officially an error. If you need to do so, branch to an op that 
can transfer across boundaries.

Design Edict #1: Branches, which is any transfer of control that 
takes an offset, may *not* escape the current bytecode segment.

Next, jumps. Jumps take absolute addresses, so either need fixup at 
load time (blech), are only valid in dynamically generated code 
(okay, but limiting), or can only jump to values in registers (that's 
fine). Jumps aren't a problem in general.

Design Edict #2: Jumps may go anywhere.

Destinations. These are a pain, since if we can go anywhere then the 
JIT has to do all sorts of nasty and unpleasant things to compensate, 
and to make every op a valid destination. Yuck.

Design Edict #3: All destinations *must* be marked as such in the 
bytecode metadata segment. (I am officially nervous about this, as I 
can see a number of ways to subvert this for evil)
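
Edicts #1 to #3 can be read as a check applied to every control transfer. The following Python sketch models that reading; addresses are `(segment, offset)` pairs, `check_transfer` is a hypothetical function, and the set of marked destinations stands in for whatever the metadata segment would actually record.

```python
# Branches take a relative offset and must stay inside their segment.
# Jumps take an absolute address and may go anywhere.
# Either way the target must be a marked destination.
# The encoding and all names are assumptions for illustration.
def check_transfer(kind, seg, pc, operand, seg_lengths, destinations):
    if kind == "branch":
        target = (seg, pc + operand)
        if not 0 <= target[1] < seg_lengths[seg]:
            raise ValueError("branch escapes the current segment")
    elif kind == "jump":
        target = operand
    else:
        raise ValueError("unknown transfer kind: %r" % kind)
    if target not in destinations:
        raise ValueError("target is not a marked destination")
    return target


seg_lengths = {0: 10, 1: 4}
destinations = {(0, 3), (1, 0)}
local = check_transfer("branch", 0, 1, 2, seg_lengths, destinations)
far = check_transfer("jump", 0, 5, (1, 0), seg_lengths, destinations)
```

A check like this is only as trustworthy as the destination set it is given, which is the source of the nervousness expressed in Edict #3.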

I'm only keeping jumps (and their corresponding jsr) around for 
nostalgic reasons, and with the vague hope they may be useful. I'm 
not sure about this.

Design Edict #4: Dan is officially iffy on jumps, but can see them as 
useful for lower-level statically bound languages such as forth, 
Scheme, or C.

That leads us to

Design Edict #5: Dan will accommodate semantics for languages outside 
the core set (perl, python, ruby) only if they don't compromise 
performance for the core set.

Calling actual routines--subs, methods, functions, whatever--at the 
high level isn't done with branches or jumps. It is, instead, done 
with the call series of ops. (call, callmeth, callcc, tailcall, 
tailcallmeth, tailcallcc (though that one makes my head hurt), 
invoke) These are specifically for calling code that's potentially in 
other segments, and to call into them at fixed points. I think these 
need to be hashed out a bit to make them more JIT-friendly, but 
they're the primary transfer destination point.

Design Edict #6: The first op in a sub is always a valid 
jump/branch/control transfer destination

Now. Eval. The compile opcode going in is phenomenally cool (thanks, 
Leo!) but has pointed out some holes in the semantics. I got 
handwavey and, well, it shows. No cookie for me.

The compreg op should compile the passed code in the language that is 
indicated and should load that bytecode into the current interpreter. 
That means that if there are any symbols that get installed because 
someone's defined a sub then, well, they should get installed into 
the interpreter's symbol tables.

Compiled code is an interesting thing. In some cases it should return 
a sub PMC, in some cases it should execute and return a value, and in 
some cases  it should install a bunch of stuff in a symbol table and 
then return a value. These correspond to:


   eval "print 12";

   $foo = eval "sub bar{return 1;}";

   require foo.pm;

respectively. It's sort of a mixed bag, and unfortunately we can't 
count on the code doing the compilation to properly handle the 
semantics of the language being compiled. So...

Design Edict #7: the compreg opcode will execute the compiled code, 
calling in with parrot's calling conventions. If it should return 
something, then it had darned well better build it and return it.

Oh, and:

Design Edict #8: compreg is prototyped. It takes a single string and 
must return a single PMC. The compiler may cheat as need be. (No need 
to check and see if it returned a string, or an int)

Yes, this does mean that for plain assembly that we want to compile 
and return a sub ref for we need to do extra in the assembly we pass 
in. Tough, we can deal. If it was dead-simple it wouldn't be 
assembly. :)
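
A small Python model of Edicts #7 and #8 as stated: the compiled code is executed in the interpreter's symbol table, and exactly one boxed value comes back. `PMC`, `run_compiled` and the `_result` convention are hypothetical, and Python's own `compile`/`exec` stand in for a registered compiler.

```python
# The caller always gets a single PMC back; it never has to ask whether the
# code produced a string, an int, or nothing at all.
# Names and the "_result" convention are assumptions for illustration.
class PMC:
    def __init__(self, value):
        self.value = value


def run_compiled(source, symbols):
    code = compile(source, "<eval>", "exec")
    # Executing installs any subs the code defines into the symbol table.
    exec(code, symbols)
    # If the code wants to return something, it must build it itself.
    return PMC(symbols.pop("_result", None))


symbols = {}
sub_ref = run_compiled("def bar():\n    return 1\n_result = bar", symbols)
nothing = run_compiled("x = 12", symbols)
```

The first call corresponds to the `$foo = eval "sub bar{return 1;}"` case: `bar` lands in the symbol table and a reference to it is returned because the code built one.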

I think that's it. Let's have at it and see where the edicts need fixing.
--
Dan



Parrot Developer Day(s)?

2003-01-22 Thread Dan Sugalski
Since face to face meetings are usually a lot more productive than 
e-mail exchanges when working design things out, I figure that maybe 
it'd be in our best interests to see if it's worth having a parrot 
developer day somewhere, where some set of us can get together and 
hash out things.

It's distinctly possible (though not yet definite) that I'll be in 
Frankfurt in March for the German Perl Workshop--if enough people 
could get together we could see about snagging a room and going at it.

I'm also, if we have enough people and can get space, close enough to 
New York City and Boston (as I live in between) to do this, and I'm 
certainly up for going anywhere else if someone else is in a position 
to spring for travel, if we can get enough folks together to make it 
worthwhile.

Also, I know that we do have people scattered all over the world, but 
if someone wants to try and get a list of who's where, we may find 
it's worth it to get groups of people together. (I don't, after all, 
have to be involved... :)
--
Dan



Re: [perl #18056] [PATCH] Extending the Packfile (Part 1.)

2003-01-22 Thread Dan Sugalski
At 12:28 AM -0500 1/22/03, James Mastros wrote:

If we care about reading old packfiles on newer parrots,


Until we reach 1.0, we don't. As long as we make sure the magic 
number in the header of the file is sufficient to make the execution 
fail, that's fine for now.
--
Dan



Re: Transferring control between code segments, eval, and suchlike things

2003-01-22 Thread Benjamin Stuhl
At 03:00 PM 1/22/2003 -0500, you wrote:

Okay, since this has all come up, here's the scoop from a design perspective.

First, the branch opcodes (branch, bsr, and the conditionals) are all 
meant for movement within a segment of bytecode. They are *not* supposed 
to leave a segment. To do so was arguably a bad idea, now it's officially 
an error. If you need to do so, branch to an op that can transfer across 
boundaries.

Design Edict #1: Branches, which is any transfer of control that takes an 
offset, may *not* escape the current bytecode segment.

Seems reasonable. Especially when the bytecode loader may not guarantee 
the relative placement of segments (think mmap()). Although, all this 
would seem to suggest that we'd need/want a special-purpose allocator 
for bytecode segments, since every sub has to fit within precisely one 
segment (and I know _I'd_ like to keep bytecode segments on their own 
memory pages, to e.g. maximize sharing on fork()).

Next, jumps. Jumps take absolute addresses, so either need fixup at load 
time (blech), are only valid in dynamically generated code (okay, but 
limiting), or can only jump to values in registers (that's fine). Jumps 
aren't a problem in general.

Fixups aren't so bad if we make the jump opcode itself take an index into a 
table of fixups (thus letting the bytecode stream stay read-only). Register 
jumps are dangerous, since parrot can't control what the user code loads 
into the register (while we can theoretically protect the fixup table from 
anything short of native code).
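
A minimal sketch of the indexed-fixup idea, in Python with made-up names. The bytecode is an immutable tuple whose jump operands are indexes; the absolute addresses live in a separate table built at load time, and an out-of-range index is rejected.

```python
# The bytecode never contains an address, so it can stay read-only.
# build_fixups and jump_target are hypothetical helpers.
def build_fixups(wanted, addresses):
    # Resolved once at load time from the names the code asks for.
    return tuple(addresses[name] for name in wanted)


def jump_target(instr, fixups):
    op, index = instr
    if op != "jump":
        raise ValueError("not a jump")
    if not 0 <= index < len(fixups):
        raise IndexError("fixup index out of range")
    return fixups[index]


bytecode = (("jump", 1), ("jump", 0))            # immutable, shareable
fixups = build_fixups(["main", "helper"], {"main": 0x40, "helper": 0x80})
targets = [jump_target(instr, fixups) for instr in bytecode]
```

User code can only choose among entries the loader put in the table, which is the protection a register jump lacks.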

Design Edict #2: Jumps may go anywhere.

Destinations. These are a pain, since if we can go anywhere then the JIT 
has to do all sorts of nasty and unpleasant things to compensate, and to 
make every op a valid destination. Yuck.

Design Edict #3: All destinations *must* be marked as such in the bytecode 
metadata segment. (I am officially nervous about this, as I can see a 
number of ways to subvert this for evil)

Marked destinations are very important; as for evil subversion, how about 
just saying "untrusted code only gets pure interpretation, and the 
untrusting interpreter bounds-checks everything"?

[snip]
Calling actual routines--subs, methods, functions, whatever--at the high 
level isn't done with branches or jumps. It is, instead, done with the 
call series of ops. (call, callmeth, callcc, tailcall, tailcallmeth, 
tailcallcc (though that one makes my head hurt), invoke) These are 
specifically for calling code that's potentially in other segments, and to 
call into them at fixed points. I think these need to be hashed out a bit 
to make them more JIT-friendly, but they're the primary transfer 
destination point

Design Edict #6: The first op in a sub is always a valid 
jump/branch/control transfer destination

Wouldn't make much sense if you had a sub but couldn't call it, now would 
it? :-D

Now. Eval. The compile opcode going in is phenomenally cool (thanks, Leo!) 
but has pointed out some holes in the semantics. I got handwavey and, 
well, it shows. No cookie for me.

The compreg op should compile the passed code in the language that is 
indicated and should load that bytecode into the current interpreter. That 
means that if there are any symbols that get installed because someone's 
defined a sub then, well, they should get installed into the interpreter's 
symbol tables.

Compiled code is an interesting thing. In some cases it should return a 
sub PMC, in some cases it should execute and return a value, and in some 
cases  it should install a bunch of stuff in a symbol table and then 
return a value. These correspond to:


   eval "print 12";

   $foo = eval "sub bar{return 1;}";

   require foo.pm;

respectively. It's sort of a mixed bag, and unfortunately we can't count 
on the code doing the compilation to properly handle the semantics of the 
language being compiled. So...

Design Edict #7: the compreg opcode will execute the compiled code, 
calling in with parrot's calling conventions. If it should return 
something, then it had darned well better build it and return it.

How does this play with

eval 'sub bar { change_foo(); } BEGIN { bar(); }  (...stuff that depends on foo...)';

? The semantics of BEGIN{} would seem to require that bar be installed into 
the symbol table immediately... but then how do we reproduce that if we're 
e.g. loading precompiled bytecode?

Oh, and:

Design Edict #8: compreg is prototyped. It takes a single string and must 
return a single PMC. The compiler may cheat as need be. (No need to check 
and see if it returned a string, or an int)

Yes, this does mean that for plain assembly that we want to compile and 
return a sub ref for we need to do extra in the assembly we pass in. 
Tough, we can deal. If it was dead-simple it wouldn't be assembly. :)

That makes sense.

-- BKS




Re: Objects, finally (try 1)

2003-01-22 Thread Christopher Armstrong
On Wed, Jan 15, 2003 at 01:57:28AM -0500, Dan Sugalski wrote:
> At 9:37 PM -0500 1/14/03, Christopher Armstrong wrote:
> >But who knows, maybe it could be made modular enough (i.e., more
> >interface-oriented?) to allow the best of both worlds -- I'm far too
> >novice wrt Parrot to figure out what it'd look like, unfortunately.
> 
> It'll actually look like what we have now. If you can come up with 
> something more abstract than:
> 
>   callmethod P1, "foo"
> 
> that delegates the calling of the foo method to the method dispatch 
> vtable entry for the object in P1, well... gimme, I want it. :)

Just curious. Exactly how overridable is that `callmethod'? I don't
really know anything about the vtable stuff in Parrot, but is it
possible to totally delegate the lookup/calling of "foo" to a function
that's bound somehow to P1? Or does the "foo" entry have to exist in
the vtable already? Sorry for the naive question :-) Oh, and if you
just want to point me at a source file, I guess I can try reading it
:-) Python basically requires that each step in the process be
overridable. (1. look up attribute 2. call attribute, at least in
`callmethod's case).
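
The two overridable steps asked about here can be sketched in Python as follows. `PObject`, `find_method`, `invoke` and `callmethod` are illustrative names only; they model the idea of delegating both lookup and call to the object and do not describe Parrot's actual vtable layout.

```python
# callmethod does nothing itself: it asks the object to look the method up,
# then asks the object to call it. Either step can be overridden.
# All names are assumptions for illustration.
class PObject:
    def __init__(self, methods):
        self.methods = methods

    def find_method(self, name):          # step 1: lookup
        return self.methods[name]

    def invoke(self, method, *args):      # step 2: call
        return method(self, *args)


def callmethod(obj, name, *args):
    return obj.invoke(obj.find_method(name), *args)


class Proxy(PObject):
    # Overrides lookup only: "foo" need not exist anywhere in advance.
    def find_method(self, name):
        try:
            return super().find_method(name)
        except KeyError:
            return lambda self, *args: "no method %s" % name


plain = callmethod(PObject({"foo": lambda self: "foo!"}), "foo")
proxied = callmethod(Proxy({}), "bar")
```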

Thanks.

-- 
 Twisted | Christopher Armstrong: International Man of Twistery
  Radix  |  Release Manager,  Twisted Project
-+ http://twistedmatrix.com/users/radix.twistd/



Re: Parrot Developer Day(s)?

2003-01-22 Thread gregor
Dan --

Cincinnati, Ohio.

And, I'll make my office available for the meeting, if there aren't so many
people that it would be impractical (unlikely, I expect, but CMA anyway).


-- Gregor





Dan Sugalski <[EMAIL PROTECTED]>
01/22/2003 03:22 PM

 
To: [EMAIL PROTECTED]
cc: 
Subject:Parrot Developer Day(s)?


Since face to face meetings are usually a lot more productive than 
e-mail exchanges when working design things out, I figure that maybe 
it'd be in our best interests to see if it's worth having a parrot 
developer day somewhere, where some set of us can get together and 
hash out things.

It's distinctly possible (though not yet definite) that I'll be in 
Frankfurt in March for the German Perl Workshop--if enough people 
could get together we could see about snagging a room and going at it.

I'm also, if we have enough people and can get space, close enough to 
New York City and Boston (as I live in between) to do this, and I'm 
certainly up for going anywhere else if someone else is in a position 
to spring for travel, if we can get enough folks together to make it 
worthwhile.

Also, I know that we do have people scattered all over the world, but 
if someone wants to try and get a list of who's where, we may find 
it's worth it to get groups of people together. (I don't, after all, 
have to be involved... :)
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk






[PATCH] nci.pmc mark routine

2003-01-22 Thread Steve Fink
I'm confused by nci.pmc's mark() routine. It calls pobject_lives() on
the ->cache.struct_val pointer. But in set_string_keyed(), that seems
to be set to a pointer to a function, which is definitely not a PObj*.
The ->data field, on the other hand, appears to be a PObj*. And
changing mark() to mark it instead of struct_val makes the imcc tests
go from failing to passing on my machine. But I don't really know
what's going on here. Is the below patch correct?

Index: classes/nci.pmc
===
RCS file: /cvs/public/parrot/classes/nci.pmc,v
retrieving revision 1.9
diff -u -r1.9 nci.pmc
--- classes/nci.pmc 21 Jan 2003 23:15:04 -  1.9
+++ classes/nci.pmc 23 Jan 2003 04:07:04 -
@@ -28,9 +28,8 @@
 }
 
 void mark () {
-   PMC *f = SELF->cache.struct_val;
-   if (f)
-   pobject_lives(INTERP, (PObj *)f);
+   if (SELF->data)
+   pobject_lives(INTERP, (PObj *)SELF->data);
 }
 
 STRING* name () {
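
For readers unfamiliar with the collector, here is a toy mark phase showing why the pointer handed to the marker has to be a real collectable object. This is only a sketch: `Obj`, `mark`, `native_func` and the field names are hypothetical, and it mirrors pobject_lives() only loosely.

```python
class Obj:
    """Toy collectable object; 'children' stands in for the ->data field."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.live = False


def mark(obj):
    """Mark obj and everything reachable from it, without recursion."""
    if not isinstance(obj, Obj):
        # Analogue of marking a raw function pointer: it has no object
        # header, so treating it as an object is an error.
        raise TypeError("cannot mark non-object %r" % (obj,))
    stack = [obj]
    while stack:
        cur = stack.pop()
        if cur.live:
            continue          # already visited
        cur.live = True
        stack.extend(cur.children)


def native_func():
    """Stand-in for the C function pointer kept in struct_val."""


data = Obj("signature-data")          # plays the role of SELF->data
nci = Obj("nci", children=[data])
mark(nci)                             # marks nci and data
```

In this toy model, marking `native_func` raises immediately; in C the same mistake silently scribbles on whatever memory sits at the function's address, which would fit the test failures described above.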




Re: Transferring control between code segments, eval, and suchlike things

2003-01-22 Thread Dan Sugalski
At 6:24 PM -0500 1/22/03, Benjamin Stuhl wrote:

At 03:00 PM 1/22/2003 -0500, you wrote:

Okay, since this has all come up, here's the scoop from a design perspective.

First, the branch opcodes (branch, bsr, and the conditionals) are 
all meant for movement within a segment of bytecode. They are *not* 
supposed to leave a segment. To do so was arguably a bad idea, now 
it's officially an error. If you need to do so, branch to an op 
that can transfer across boundaries.

Design Edict #1: Branches, which is any transfer of control that 
takes an offset, may *not* escape the current bytecode segment.
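
A minimal sketch of the check Edict #1 implies, assuming a segment is described by a start address and a length in ops. The function name and parameters are made up for illustration and are not Parrot's API.

```python
def branch_target(segment_start, segment_len, pc, offset):
    """Return the absolute target of a relative branch.

    Raises ValueError if the target falls outside the half-open range
    [segment_start, segment_start + segment_len).
    """
    target = pc + offset
    if not (segment_start <= target < segment_start + segment_len):
        raise ValueError("branch escapes bytecode segment")
    return target


# A backwards branch to the first op of the segment is fine.
first_op = branch_target(100, 50, 120, -20)
```

A branch from 120 with offset 30 would land on 150, one past the end of this 50-op segment, and is rejected.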

Seems reasonable. Especially when the bytecode loader may not
guarantee the relative placement of segments (think mmap()).
Although, all this would seem to suggest that we'd need/want a
special-purpose allocator for bytecode segments, since every sub has
to fit within precisely one segment (and I know _I'd_ like to keep
bytecode segments on their own memory pages, to e.g. maximize sharing
on fork()).

Every sub doesn't have to fit in a single segment, though. There may 
well be a half-zillion subs in any one segment. (Though one segment 
per sub does give us some interesting possibilities for GCing unused 
code)

Next, jumps. Jumps take absolute addresses, so either need fixup at 
load time (blech), are only valid in dynamically generated code 
(okay, but limiting), or can only jump to values in registers 
(that's fine). Jumps aren't a problem in general.

Fixups aren't so bad if we make the jump opcode itself take an index
into a table of fixups (thus letting the bytecode stream stay
read-only). Register jumps are dangerous, since parrot can't control
what the user code loads into the register (while we can
theoretically protect the fixup table from anything short of native
code).

Indirection. Ick. :)

Though, on the other hand, a jump with an integer constant 
destination is pretty pointless, so we could consider using that to 
index into a jump table. OTOH, it'd be the only thing using the jump 
table, so I'm not sure it's worth it. Might speed things up some. 
I'll think on that for a bit.
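
The jump-table idea can be sketched as follows. This is an illustration under assumptions, not a proposed packfile layout: `Segment`, `load` and `resolve_jump` are hypothetical names, and addresses are plain integers.

```python
class Segment:
    """Toy bytecode segment whose jump operands index a fixup table."""

    def __init__(self, code, jump_labels):
        self.code = tuple(code)               # bytecode stream, never patched
        self.jump_labels = list(jump_labels)  # table index -> symbolic label
        self.jump_table = []                  # table index -> absolute address

    def load(self, symbol_addresses):
        """Fill the table at load time; the code itself is left untouched."""
        self.jump_table = [symbol_addresses[label]
                           for label in self.jump_labels]

    def resolve_jump(self, index):
        """Map a jump operand to an absolute address, bounds-checked."""
        if not 0 <= index < len(self.jump_table):
            raise IndexError("jump table index out of range")
        return self.jump_table[index]


seg = Segment([("jump", 0), ("jump", 1)], ["main", "helper"])
seg.load({"main": 0x1000, "helper": 0x2000})
```

The extra indirection costs one table lookup per jump, but the only addresses a jump can reach are the ones the loader put in the table.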

Design Edict #2: Jumps may go anywhere.

Destinations. These are a pain, since if we can go anywhere then 
the JIT has to do all sorts of nasty and unpleasant things to 
compensate, and to make every op a valid destination. Yuck.

Design Edict #3: All destinations *must* be marked as such in the 
bytecode metadata segment. (I am officially nervous about this, as 
I can see a number of ways to subvert this for evil)

Marked destinations are very important; as for evil subversion, how 
about just saying "untrusted code only gets pure interpretation, and 
the untrusting interpreter bounds-checks everything"?

True, and we'll not be JITting safe-mode code, or likely not at least 
because of the resource constraint checking.
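
One way a loader could check marked destinations, sketched under simplifying assumptions: every op occupies one slot, `branch` takes a relative offset, and the metadata is just a set of marked slots. The encoding and the function name are hypothetical.

```python
def unmarked_targets(code, marked):
    """Return the sorted branch targets that are not marked as destinations.

    code   -- list of (opname, operand) tuples, one per slot
    marked -- set of slot numbers the metadata declares as destinations
    """
    missing = set()
    for pc, (op, arg) in enumerate(code):
        if op == "branch":
            target = pc + arg        # offsets are relative to the branch op
            if target not in marked:
                missing.add(target)
    return sorted(missing)


code = [
    ("noop", None),
    ("branch", 2),     # slot 1 -> slot 3
    ("noop", None),
    ("print", None),
    ("branch", -4),    # slot 4 -> slot 0
]
```

With slots 0 and 3 marked, the list comes back empty; drop slot 3 from the metadata and the scan reports it. A scan like this cannot cover jumps through registers, so those would still need a runtime check.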

[snip]

Calling actual routines--subs, methods, functions, whatever--at the 
high level isn't done with branches or jumps. It is, instead, done 
with the call series of ops. (call, callmeth, callcc, tailcall, 
tailcallmeth, tailcallcc (though that one makes my head hurt), 
invoke) These are specifically for calling code that's potentially 
in other segments, and to call into them at fixed points. I think 
these need to be hashed out a bit to make them more JIT-friendly, 
but they're the primary transfer destination point.

Design Edict #6: The first op in a sub is always a valid 
jump/branch/control transfer destination

Wouldn't make much sense if you had a sub but couldn't call it, now 
would it? :-D

Don't tempt the JAPHers!


Now. Eval. The compile opcode going in is phenomenally cool 
(thanks, Leo!) but has pointed out some holes in the semantics. I 
got handwavey and, well, it shows. No cookie for me.

The compreg op should compile the passed code in the language that 
is indicated and should load that bytecode into the current 
interpreter. That means that if there are any symbols that get 
installed because someone's defined a sub then, well, they should 
get installed into the interpreter's symbol tables.

Compiled code is an interesting thing. In some cases it should 
return a sub PMC, in some cases it should execute and return a 
value, and in some cases it should install a bunch of stuff in a 
symbol table and then return a value. These correspond to:


   eval "print 12";

   $foo = eval "sub bar{return 1;}";

   require foo.pm;

respectively. It's sort of a mixed bag, and unfortunately we can't 
count on the code doing the compilation to properly handle the 
semantics of the language being compiled. So...

Design Edict #7: the compreg opcode will execute the compiled code, 
calling in with parrot's calling conventions. If it should return 
something, then it had darned well better build it and return it.

How does this play with

eval 'sub bar { change_foo(); } BEGIN { bar(); }  (...stuff that 
depends on foo...)';

? The semantics of BEGIN{} would seem to require that bar be 
in

Re: Bytecode metadata

2003-01-22 Thread James Michael DuPont

--- Dan Sugalski <[EMAIL PROTECTED]> wrote:
> Since it looks like it's time to extend the packfile format and the 
> in-memory bytecode layout, this would be the time to start discussing
> 
> metadata. What sorts of metadata do people think are useful to have 
> in either the packfile (on disk) or in the bytecode (in memory).
> 
> Keep in mind that parrot may be in the position where it has to 
> ignore or mistrust the metadata, so be really cautious with things 
> you propose as required.

Dear Dan,

I would like to see a powerful meta-data system made possible,
even if it is not implemented immediately. The semantic web researchers
like David Beckett and Tim Berners-Lee have been working on powerful
systems to support meta-data in general; maybe, as the parrot meta-data
is just getting started, we can cut a bit of that off?

Take a look at the list here at Diffuse MetaData Interchange [4] at the
bottom of this mail, you will see an overview of metadata systems. 
Even if they are not specific to parrot, the goals are similar in many
cases.

Recently I have been making progress with RDF[1], specifically with
the Redland application framework[2]. With the simple concept of
triples of data, a triple being (subject, predicate, object) we are
able to capture the metadata of the gcc compiler, and I hope other
compilers and systems.

Redland is written in clean C, and supports meta-data storage in
memory and on disk in multiple formats: RDF/XML, RDF/N-Triples (even
in Berkeley DB). It would be possible to create a new storage model to
store a packfile as well.

The subjects are the items in the program, the nodes, each getting a
number inside the system. Predicates are important; they represent the
meat of the system. The objects are either literal data or other
subjects.

Via the redland api, you can add in new statements about things, and
find all the statements about a subject, about an object, all that meet
a predicate.
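
To make the triple model concrete, here is a tiny in-memory store. It is a sketch of the idea only and is not the Redland API; `TripleStore`, `add`, `find` and the `node:` identifiers are invented for illustration.

```python
class TripleStore:
    """Minimal (subject, predicate, object) store with wildcard lookup."""

    def __init__(self):
        self._triples = []

    def add(self, subject, predicate, obj):
        """Record one statement, ignoring exact duplicates."""
        triple = (subject, predicate, obj)
        if triple not in self._triples:
            self._triples.append(triple)

    def find(self, subject=None, predicate=None, obj=None):
        """Return matching triples in insertion order; None is a wildcard."""
        return [
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        ]


store = TripleStore()
store.add("node:42", "filename", "foo.c")
store.add("node:42", "line", 17)
store.add("node:42", "type", "node:7")   # object is another subject
store.add("node:7", "name", "int")
```

`store.find(subject="node:42")` returns everything known about that node, and `store.find(predicate="name")` returns every naming statement, which is the kind of query described above.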

I tell you this because maybe you want to provide this sort of
flexible meta-data API in parrot. For example, here are the predicates
that we extract that you might find interesting:

*  Filename of the node

*  Line number of the node (the Column Number is not supported yet)

*  Internal Type of the node (variable declaration, type, integer
const, etc), as opposed to the type of the 

*  Name of the node (the identifier)

*  Type of the node (if it is a variable, or constant) this is a
pointer to another node 

*  Unsigned Type of a type, if a type supports itself being unsigned,
here it is.

*  Comments are supported, but not used yet, but would be a good idea.


Now we get into more specific types of predicates

*  Parameters of an expression
*  Variables in a block
*  Size of a variable
*  Alignment of a variable
*  Constant flag
*  Volatile flag

Then we have:
*  Fields of a struct
*  Parameters of a function
*  Return type of a function
*  Body block of a function

So, with this idea of meta-data, by adding more predicates, 
you can support the capturing and storage of all the source code in an
abstract form, or just the basic function data. 

You will probably think that this is overkill for parrot, but I think
that it would give you an extensible system to add in new forms of
meta-data as languages are added. Via OWL[3] the users will be able to
define the meaning and the classes of metadata as well.

mike

[1] RDF   http://www.w3.org/RDF/
[2] Redland   http://www.redland.opensource.ac.uk/
[3] OWL   http://www.w3.org/TR/owl-absyn/
[4] Diffuse MetaData Interchange standards
  http://www.diffuse.org/meta.html

=
James Michael DuPont
http://introspector.sourceforge.net/




This week's Perl 6 summary

2003-01-22 Thread p6summarizer
The Perl 6 Summary for the week ending 20030119
Summary time again, damn but those tuits are hard to round up. Guess
what? perl6-internals comes first. 141 messages this week versus the
language list's 143.

  Objects (again)
Objects were still very much on everyone's mind as the discussions of
Dan's initial thoughts about objects in Parrot continued. Jonathan
Sillito put up a list of questions about Dan's original message which
Dan answered promptly. Down the thread a little, Dan mentioned that he
hoped Parrot's objects would serve, reasonably unmodified, for a bunch of
languages (i.e., he hoped that there wouldn't be a requirement for
PythonRef/Attr/Class/Object etc). Chris Armstrong thought that, given
what Dan had outlined so far, that wouldn't be straightforward. Dan
thanked him for throwing a spanner in the works, asking for more details,
which Chris provided.

Meanwhile Jonathan had some supplementary questions... Hmm... doing this
blow by blow will take forever. Suffice to say that details are being
thrashed out. At one point Dan's head started to spin as terminology
mismatches started to bite, leading Nicholas Clark to suggest an
entirely new set of terms involving houses and hotels (but with some
serious underpinnings).

http://makeashorterlink.com/?Y21952033 -- thread root, from last week.

http://makeashorterlink.com/?M52912033 -- Jonathan's questions

http://makeashorterlink.com/?A23912033 -- Chris throws a spanner

http://makeashorterlink.com/?Z44922033 -- Nicholas tries for a
  monopoly on silliness

  Optimizing and per file flags
Nicholas Clark wrote about requiring the ability to adjust compiler
optimization flags on a per file basis (brought up by Dan on IRC
apparently) and proposed a scheme. Quote of the thread (and quite
possibly the year so far): "When unpack is going into an infinite loop
on a Cray 6000 miles away that you don't have any access to, there isn't
much more you can do." Thanks for that one Nick.

http://makeashorterlink.com/?I15965033

  The draft todo/worklist
Dan posted his current todo/worklist, which he described as "reasonably
high level, and a bit terse". I particularly liked the last entry
"Working Perl 5 parser". Surprisingly, there was very little discussion,
maybe everyone liked it.

http://makeashorterlink.com/?L56921033

  Parrot Examples
Joe Yates asked if we could add a helloworld.pasm to
parrot/examples/assembly. Joseph Guhlin wondered what was so special
about

print "Hello, world\n"
end

that it would need a file of its own (though he did forget the "end" in
his post, and segfaults are not really what you want in sample code).

http://makeashorterlink.com/?E27912033

  Thoughts on infant mortality (continued)
Jason Gloudon posted a wonderfully clear exposition of the problems
facing anyone trying to implement a portable, incremental garbage
collector for Parrot which sparked a small amount of discussion and
muttering from Dan about the temptation to program down to the metal.

http://makeashorterlink.com/?X38931033

  Operators neg and abs in core.ops
Bernhard Schmalhofer posted an enormous patch adding "neg" and "abs"
operators to core.ops. There were a few issues with the patch so it
hasn't gone in yet and an issue with what underlying C functions are
available reared its head too.

http://makeashorterlink.com/?L29951033

  The "eval" patch
Leo Tötsch seems to have spent most of the week working on getting
"eval" working and he opened a ticket on rt.perl.org to track what's
happening with it. The response to this can be summarized as 'Wow!
Fabulous!'.

Once more, for Googlism, Leopold Toetsch is my hero.

http://makeashorterlink.com/?I5A922033

  Pretty Pictures
Mitchell N Charity posted some pretty pictures that he'd generated with
doxygen and graphviz. Most of the responses to this suggested he use
different tools. Ah well.

http://makeashorterlink.com/?B2B921033

  Solaris tinderbox failures
Andy Dougherty created an RT ticket for the Solaris tinderboxes, which
have been failing with the delightfully useful 'PANIC: Unknown signature
type', and wondered if things could be fixed up to be a little more
informative. Apparently it was an issue with Leo's recently checked in
eval patch. So Leo fixed it.

http://makeashorterlink.com/?D1C925033

  Parrot compilers
Cory Spencer wondered about how the current compilers that target parrot
work, noting that they seem to be duplicating a good deal of labour, and
wondered if anyone had worked on a gcc-like framework with a
standardized Abstract Syntax Tree (AST). Everyone pointed him at IMCC.
Gopal V also pointed out that, given the variety of implementation
languages (C, Pe

Re: Objects, finally (try 1)

2003-01-22 Thread Erik Bågfors
On Wed, 2003-01-22 at 19:46, Christopher Armstrong wrote:
> On Wed, Jan 15, 2003 at 01:57:28AM -0500, Dan Sugalski wrote:
> > At 9:37 PM -0500 1/14/03, Christopher Armstrong wrote:
> > >But who knows, maybe it could be made modular enough (i.e., more
> > >interface-oriented?) to allow the best of both worlds -- I'm far too
> > >novice wrt Parrot to figure out what it'd look like, unfortunately.
> > 
> > It'll actually look like what we have now. If you can come up with 
> > something more abstract than:
> > 
> >   callmethod P1, "foo"
> > 
> > that delegates the calling of the foo method to the method dispatch 
> > vtable entry for the object in P1, well... gimme, I want it. :)
> 
> Just curious. Exactly how overridable is that `callmethod'? I don't
> really know anything about the vtable stuff in Parrot, but is it
> possible to totally delegate the lookup/calling of "foo" to a function
> that's bound somehow to P1? Or does the "foo" entry have to exist in
> the vtable already? Sorry for the naive question :-) Oh, and if you
> just want to point me at a source file, I guess I can try reading it
> :-) Python basically requires that each step in the process be
> overridable. (1. look up attribute 2. call attribute, at least in
> `callmethod's case).


Ruby needs to call the method_missing method (if I remember correctly).
So if "foo" doesn't exist, it would be good to be able to override
callmethod's behavior and make it call method_missing.
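
A rough sketch of that fallback, using the name of Ruby's hook, `method_missing`. The method table is modelled as a plain dict and the `callmethod` function here is hypothetical, not Parrot's op.

```python
def callmethod(methods, name, *args):
    """Call methods[name], falling back to a 'method_missing' entry.

    The fallback receives the missing method's name followed by the
    original arguments.
    """
    method = methods.get(name)
    if method is not None:
        return method(*args)
    fallback = methods.get("method_missing")
    if fallback is None:
        raise AttributeError(name)
    return fallback(name, *args)


methods = {
    "greet": lambda: "hi",
    "method_missing": lambda name, *args: "no method " + name,
}
```

Calling `callmethod(methods, "foo")` never finds "foo" and so ends up in the fallback; a class without the hook still gets an ordinary lookup error.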

/Erik

-- 
Erik Bågfors   | [EMAIL PROTECTED]
Supporter of free software | GSM +46 733 279 273
fingerprint:  A85B 95D3 D26B 296B 6C60 4F32 2C0B 693D 6E32



RE: Objects, finally (try 1)

2003-01-22 Thread Brent Dax
Christopher Armstrong:
# On Wed, Jan 15, 2003 at 01:57:28AM -0500, Dan Sugalski wrote:
# > At 9:37 PM -0500 1/14/03, Christopher Armstrong wrote:
# > >But who knows, maybe it could be made modular enough (i.e., more
# > >interface-oriented?) to allow the best of both worlds -- I'm far too
# > >novice wrt Parrot to figure out what it'd look like, unfortunately.
# > 
# > It'll actually look like what we have now. If you can come up with
# > something more abstract than:
# > 
# >   callmethod P1, "foo"
# > 
# > that delegates the calling of the foo method to the method dispatch
# > vtable entry for the object in P1, well... gimme, I want it. :)
# 
# Just curious. Exactly how overridable is that `callmethod'?

Extremely.  callmethod maps to a function with a signature something
like

opcode_t * MyPMCClass_callmethod(Parrot_Interp interpreter,
                                 PMC* self, char* name)

Which returns a pointer to the method's entry point, so that we don't
have C-level recursion for every method call.  (It's also allowed to
just perform the call itself and return NULL, so you can call into C
efficiently.)  The trick is to override MyPMCClass's callmethod to
provide whatever semantics you want to have.  We're currently bickering
over whether you can cache pmc->vtable->callmethod's return value or
not, but it looks like either way it should be easily updatable.
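
The "return an entry point, or do the call and return NULL" convention can be sketched like this. It is an illustration only: `ToyClass`, `dispatch` and the integer offsets are hypothetical, and Python's None stands in for NULL.

```python
class ToyClass:
    """Toy class whose callmethod follows the entry-point-or-None rule."""

    def __init__(self, bytecode_methods, native_methods):
        self.bytecode_methods = bytecode_methods  # name -> entry offset
        self.native_methods = native_methods      # name -> Python callable

    def callmethod(self, obj, name):
        if name in self.bytecode_methods:
            # Hand the entry point back; the run loop continues there.
            return self.bytecode_methods[name]
        if name in self.native_methods:
            # Perform the call right here and report "already done".
            self.native_methods[name](obj)
            return None
        raise AttributeError(name)


def dispatch(callmethod, obj, name, next_pc):
    """Return the pc at which the run loop should continue."""
    entry = callmethod(obj, name)
    return next_pc if entry is None else entry


calls = []
cls = ToyClass({"foo": 200}, {"bar": calls.append})
```

Because `dispatch` only ever returns a pc, the interpreter's own loop does the jumping and the host stack does not grow with each bytecode-level method call.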

# Python basically requires that each step in the process be
# overridable. (1. look up attribute 2. call attribute, at least in
# `callmethod's case).

I'm not sure exactly how this would be implemented but...um...I'm sure
you *could* do it.  ;^)

Dan: with the various AUTOLOAD-esque features, can we reasonably expect
to be able to have One True Object PMC?

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

>How do you "test" this 'God' to "prove" it is who it says it is?
"If you're God, you know exactly what it would take to convince me. Do
that."
--Marc Fleury on alt.atheism