Q: pdd21 - namespaces.pod

2006-03-07 Thread Leopold Toetsch

I've started implementing pdd21, but I got some more questions:

   $P0 = find_global $P1, $S0
   $P0 = find_global $S0
   Find the variable $S0 in $P1 or the current namespace.

What about subroutines? Or should it return whatever is stored under the 
given name?



   Compile-time Creation
   ...
   .HLL "Perl5", "perl5_group"
   .namespace [ "Foo" ]
   ...
   store_global "$x", $P0
   .end

   In this case, the "main" sub would be tied to Perl5 by the 
".HLL" directive, so a Perl5 namespace would be created.


As this is storing into the current namespace, which I presume is 
'Perl5::Foo', would this be equivalent to '$Perl5::Foo::x'? That is, does 
'.HLL' automagically place everything in the 'Perl5' namespace here?



   Run-time Creation
   ...
   store_global $P2, $S0, $P3

This is the same variable as above, but the 'store_global' with an 
explicit namespace. As the namespace is now absolute, wouldn't that need


unshift $P2, 'Perl5'

before the 'store_global'? Or is the language's namespace prepended 
automatically? How can some HLL access a different HLL's namespace then?


Thanks for comments,
leo



Re: Q: pdd21 - namespaces.pod

2006-03-07 Thread Chip Salzenberg
On Tue, Mar 07, 2006 at 02:53:39PM +0100, Leopold Toetsch wrote:
> I've started implementing pdd21

"and there was much rejoicing" :-)  Great questions, I know exactly why
you're asking them, and I'm thinking I need to amend pdd21 to include some
of the answers.

> but I got some more questions:
> 
>    $P0 = find_global $P1, $S0
>    $P0 = find_global $S0
>    Find the variable $S0 in $P1 or the current namespace.
> 
> What about subroutines? Or should it return whatever is stored under the 
> given name?

The *_global opcodes should use the untyped interface.  (As they do now,
since now that's the only kind.  :-))

So, if the sample is Perl code, $S0 will (presumably) contain a name with a
sigil, and Parrot won't translate anything; it will just use the namespace's
hash interface.

This choice means that grafting cross-language sub-namespaces into each
others' namespaces won't work right in the general case.  (e.g. Don't try to
make a Python namespace 'foo' available in Perl as '%Foo::'.)

But that's OK; Parrot cross-HLL code isn't supposed to do that, it's
supposed to use item-by-item aliasing, either by hand or using the general
exporter.

It's hard enough to make explicit importation work across languages; we
don't need the *_global opcodes to handle cross-language too, we want them
to be fast & simple.

[The '$Perl5::Foo::x' example is elided as I think this answer obviates it.]

Moving on:

>    ...
>    .HLL "Perl5", "perl5_group"
>    .namespace [ "Foo" ]
>    ...
>    store_global "$x", $P0
>    .end
> 
>    In this case, the "main" sub would be tied to Perl5 by the 
> ".HLL" directive, so a Perl5 namespace would be created.

Right.

> As this is storing into the current namespace, which I presume is 
> 'Perl5::Foo'

Well, the leading 'perl5' is always lower case.  :-)  But yes.

>    Run-time Creation
>    ...
>    store_global $P2, $S0, $P3
> 
> This is the same variable as above, but the 'store_global' with an 
> explicit namespace. As the namespace is now absolute, wouldn't that need
> 
> unshift $P2, 'Perl5'
> 
> before the 'store_global'? Or is the languages namespace prepended 
> automatically? How can some HLL access a different HLL namespace then?

That's such a good question, it has a three-and-three-quarters-part answer:

 1. Effectively, from the user's point of view: Yes, the unshift is
    implied.

 2. Runtime efficiency and memory usage would suffer if we really did the
    unshift.  I suggest caching a pointer to the HLL base namespace in the
    HLL info.

    a. As with all optimizations, this cheats: It implies that users can't
       switch out an HLL namespace after it's created.  But in the specific
       cases of the genuine root namespace and the top-level namespace for
       entire HLLs, I'm OK with that.

 3. Cross-language work won't use the *_global opcodes but, rather,
    explicit namespace object manipulation.

    a. The typed interface exists to enable cross-language work, and that
       interface uses method names to specify type; none of this is a good
       fit for the *_global opcodes, which have no way to specify type.
       (We could make them know about types, but that'd be anti-huffman;
       intra-HLL work is overwhelmingly more common than inter-HLL work.)

    b. Users will need a way to get a pointer to the absolute root
       namespace.  An interpinfo code might work well for this.  Then users
       will do manual namespace navigation to reach the HLL they want.  This
       isn't exactly -automatic-, but though cross-HLL work should be easy
       and supported, it does require thought and is relatively rare, so we
       don't have to optimize it for brevity.

       i. No namespace name should be privileged to mean something special
          in the Parrot namespace code.  For example, using [''] or ["\0"]
          as a standard to mean "absolute root" is not OK.  On the other
          hand, weird_hll_namespace objects can do whatever they want.
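
The three points above can be modeled in a few lines. This is only a sketch in Python, not Parrot code: the `Interp` and `Namespace` classes and their method signatures are hypothetical stand-ins, and passing the HLL name explicitly replaces what Parrot would take from the current sub's `.HLL` setting. It shows the implied "unshift" through a cached HLL base namespace, and cross-HLL access done by hand from the absolute root.

```python
class Namespace(dict):
    """A namespace modeled as a plain nested hash (the untyped interface)."""

    def child(self, name):
        # Create the sub-namespace on first use, as a store would.
        return self.setdefault(name, Namespace())


class Interp:
    """Hypothetical interpreter model; not Parrot's real data structures."""

    def __init__(self):
        self.root = Namespace()  # absolute root; no name in it is privileged
        self.hll_base = {}       # cached pointers to HLL base namespaces

    def register_hll(self, hll):
        # The leading HLL name is always lower case.
        base = self.root.child(hll.lower())
        self.hll_base[hll.lower()] = base
        return base

    def store_global(self, hll, path, name, value):
        # The unshift of the HLL name is implied: start at the cached base.
        ns = self.hll_base[hll.lower()]
        for part in path:
            ns = ns.child(part)
        ns[name] = value

    def find_global(self, hll, path, name):
        ns = self.hll_base[hll.lower()]
        for part in path:
            ns = ns[part]
        # Returns whatever is stored under the name, variable or sub.
        return ns[name]


interp = Interp()
interp.register_hll("Perl5")
interp.register_hll("python")
interp.store_global("Perl5", ["Foo"], "$x", 42)

# Same-HLL access: the path is relative to the HLL base namespace.
same_hll = interp.find_global("perl5", ["Foo"], "$x")

# Cross-HLL access: manual navigation starting from the absolute root.
cross_hll = interp.root["perl5"]["Foo"]["$x"]
```

Because the base namespace is cached, swapping out `interp.root["perl5"]` afterwards would not be seen by `store_global`, which is the "cheat" described in 2(a).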


Thanks for the good questions, I hope this answers them.
-- 
Chip Salzenberg <[EMAIL PROTECTED]>


Re: Q: pdd21 - namespaces.pod

2006-03-07 Thread Leopold Toetsch

Chip Salzenberg wrote:


> Thanks for the good questions, I hope this answers them.


Thanks, yes. So I got a new one ;)

In which namespaces should Parrot store PMC methods and multi subs? 
Currently Parrot occupies these namespaces:


  __parrot_core  ... builtin multisubs like '__add'
  Integer        ... all PMCs with methods are in toplevel namespaces,
  String             with their class name
  ...


leo



Re: Patch for nested macro support

2006-03-07 Thread Chip Salzenberg
Neat: It's backward-compatible and makes macros more useful, so file it
under "improvement" and commit it.  Two and a half Qs:

It looks to me like this implementation is safe against "{" and "}" in
strings, right?

(Not a new issue, but since we're on the subject of macros:) If I define a
macro named eg. ".local", does it expand as a macro, i.e. is it recognized
before or after core Parrot keywords?  I think the example of Perl keywords
vs. user-defined functions teaches us it's a good idea for macros to win in
case of conflict, for backward compatibility when we introduce new keywords.
-- 
Chip Salzenberg <[EMAIL PROTECTED]>


Re: Q: pdd21 - namespaces.pod

2006-03-07 Thread Chip Salzenberg
On Tue, Mar 07, 2006 at 05:21:43PM +0100, Leopold Toetsch wrote:
> Chip Salzenberg wrote:
> 
> >Thanks for the good questions, I hope this answers them.
> 
> Thanks, yes. So I got a new one ;)

No rest for the wicked.  :-)

> In which namespaces should Parrot store PMC methods and multi subs? 
> Currently Parrot occupies these namespaces:
> 
>   __parrot_core  ... builtin multisubs like '__add'
>   Integer        ... all PMCs with methods are in toplevel namespaces,
>   String             with their class name
>   ...

Working from first principles, and making some progress, I write (and invite
analysis of) these points:


1. The default target for 'newclass' must be inside the current HLL, or else
   Parrot and Perl code can't both say 'newclass "Integer"'.

   Confidence: HIGH

   a. This implies that there should be a variant of newclass for global
      classes.  A 'newglobalclass' opcode is the most obvious solution, and
      simultaneously the least elegant, because it implies that there will
      be other *global* opcodes, and I don't like where that's going.
      Alternatively, newclass could accept an optional starting namespace
      parameter, which defaults to the current HLL namespace.

      Confidence: HIGH in need for solution, LOW in actual solution


2. Therefore, class lookup must be two-level: the current HLL first, and then
   the global list.

   Confidence: HIGH

   a. Does this imply that we should allow users to see and modify a list
      of namespaces to search for classes, a la $ENV{PATH}?

      Confidence: LOW, but intriguing


3. Under the assumption that PMC names are supposed to map to HLL class
   names (else what will?), it makes a lot of sense for newclass, the new
   operator, etc. all to allow multi-level names.

   Confidence: MEDIUM; it's possible that explicit namespace manipulation
   could take the place of full keyed names here, and perhaps
   also help solve the problem described in 1(a) above.


4. I'd like for global Parrot classes to live in a namespace other than the
   root, and that name should be something users can count on so they can
   do introspection (e.g. to detect name conflicts).
   Let's call it "class".

   Confidence: HIGH (cluttered roots are bad)

   a. Just like the HLL namespace, it makes sense to cache this namespace
      pointer, so we don't have to keep looking it up.


And a bonus policy point:

5. All root names (including root namespaces) that start with '_' shall be
   designated now and forevermore to be something that users are not allowed
   to count on (though of course exploration is always allowed).

   So it's OK for us to leave the builtins in '__parrot_core', because I
   don't have any better suggestion at the moment, and we can move them
   later.  Any users who depended on finding them there will have broken the
   basic rule I just made up.

   (Besides, the leading '_' should clue in anybody who's ever used C or
   C++ that it's private, which is presumably what was intended anyway.)
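
Points 1, 2 and 4 can be sketched as a small model. It is written in Python purely for illustration: `ClassRegistry`, `newclass` and `find_class` are hypothetical names, and using `hll=None` to mean "global" stands in for whichever mechanism 1(a) eventually settles on.

```python
class ClassRegistry:
    """Hypothetical model of per-HLL classes plus a global "class" namespace."""

    def __init__(self):
        self.global_classes = {}  # stands in for the proposed "class" namespace
        self.hll_classes = {}     # HLL name -> {class name -> class record}

    def newclass(self, name, hll=None):
        # Default target would be the current HLL; hll=None models the
        # "global" variant discussed in 1(a).
        if hll is None:
            table = self.global_classes
        else:
            table = self.hll_classes.setdefault(hll, {})
        if name in table:
            raise KeyError(f"class {name!r} already defined")
        table[name] = {"name": name, "hll": hll}
        return table[name]

    def find_class(self, name, current_hll):
        # Two-level lookup: the current HLL first, then the global list.
        for table in (self.hll_classes.get(current_hll, {}), self.global_classes):
            if name in table:
                return table[name]
        raise KeyError(name)


reg = ClassRegistry()
reg.newclass("Integer")               # global Parrot class
reg.newclass("Integer", hll="perl5")  # no conflict with the global one

perl_int = reg.find_class("Integer", "perl5")  # finds the perl5 class
py_int = reg.find_class("Integer", "python")   # falls back to the global class
```

The search-path idea in 2(a) would amount to replacing the fixed two-entry tuple in `find_class` with a user-visible list.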

-- 
Chip Salzenberg <[EMAIL PROTECTED]>


Integer types (was Re: early draft of I/O PDD)

2006-03-07 Thread Jonathan Worthington

"Leopold Toetsch" <[EMAIL PROTECTED]> wrote:
>>> Depending on the arch (32 vs 64 bits) one of these opcodes is 
>>> suboptimal. With a new "L" (Long) register type the functionality could 
>>> be handled transparently:
>>>
>>>   $L0 = pio.'tell'()
>>
>> Yes, but as you add more register types you get a combinatorial blow-up 
>> on various opcodes.
>
> This depends on the implementation of 'opcodes'. With the current scheme 
> any such extension isn't really implementable because of the combinatorial 
> opcode explosion. I've written a (still internal) proposal that would 
> prevent the combinatorial issue. It (or some similar thing) would be 
> indeed necessary to even think about more register types like int8, int16, 
> int32/int64 or float32.


Iff we want those register types.  .NET is interesting in that it recognises 
int8, int16 etc as fundamental types, but on the stack only recognizes 
int32, int64 and native int (out of the integer types anyway).  If you want 
to have an int8 then you just do the ops in an int32 and then use a conv.i1 
instruction.  I think the wisdom here is to ask where we actually need to 
support int8.  It's in arrays that it matters most, for size reasons.


>> My understanding was that "I" registers were native integers so you 
>> could get good performance, and you used a PMC if you wanted some 
>> guarantees about what size you were talking about.
>
> In the long run, we certainly don't want to use PMCs for e.g. 
> implementing bytes (int8) or some such for performance reasons. 'long 
> long' aka int64 is usually supported by compilers and surely has far 
> better performance than a BigInt PMC.
> Actually using 'int8' or 'float32' is usually only important if you have 
> huge arrays of these. That means that there's a very limited need for 
> opcodes using these types, just some basic math mainly and array 
> fetch/store. Or in other words: what is supported by the hardware CPU.


If it's just arrays, then we can provide a (Fixed|Resizable)ByteArray PMC, 
etc.  I don't think we need any instructions to specially handle doing 8-bit 
arithmetic.  Maybe we want something to truncate a 32-bit to an 8-bit etc, 
maybe throwing an exception on overflow.
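
A minimal sketch of that narrowing step, in Python for illustration only. The name `conv_i8` and its `checked` flag are hypothetical; they model "do the arithmetic wide, then truncate", with an optional exception on overflow, and are not an existing Parrot or .NET API.

```python
def conv_i8(value, checked=False):
    """Narrow an integer to the signed 8-bit range [-128, 127].

    Unchecked: wraps around in two's-complement fashion.
    Checked: raises OverflowError if the value does not fit.
    """
    wrapped = ((value + 128) % 256) - 128
    if checked and wrapped != value:
        raise OverflowError(f"{value} does not fit in int8")
    return wrapped


# Do the arithmetic at full width, then narrow the result once.
total = 100 + 100         # wide arithmetic gives 200
narrow = conv_i8(total)   # wraps to -56
```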


>>> The register allocator would map 'L0' either to a pair (I0, I1) on 32 
>>> bit arch or just to 'I0' on 64 bit arch.
>>
>> Yes, but surely it becomes somewhat more than just a mapping problem? 
>> For example, what do we do about:
>>
>> add L0, L1, L2
>> mul L0, L1, L2
>
> I don't see any problem with above code.
>
> The register mapping rules would be something like:
> - Lx occupies registers I(2x, 2x+1) - this is compile time,
>   that is 'L1' prevents 'I2' and 'I3' from being assigned by the register 
>   allocator
>
> - the runtime mapping isn't portable due to endianness and sizeof types
>   'L1' might be 'I1' on 64-bit arch or (I2,I3) or (I3,I2) on 32-bit arch

Yup, and I really, really don't like the idea of making our bytecode format 
non-portable.  Part of the point of having a VM is portability, right?

> - if you write PASM, overlapping Ix/Ly may cause warnings or errors, but 
>   could be used in a non-portable way, if you know what you are doing on a 
>   specific platform.



You still didn't address my question with these points, though.

mul L0, L1, L2

Isn't just a case of churning out something like:-

mul I0, I2, I4
mul I1, I3, I5

So it's not just so simple as a "map 1 L to 2 Is" problem.
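
To make the point concrete, here is a small Python sketch (illustrative only, unsigned arithmetic assumed, function names hypothetical) of a 64-bit multiply built from 32-bit halves next to the naive pairwise version. The cross terms and the carry are what two independent 32-bit `mul` ops would miss.

```python
MASK32 = 0xFFFFFFFF


def mul64_from_halves(a_lo, a_hi, b_lo, b_hi):
    """Low 64 bits of (a_hi:a_lo) * (b_hi:b_lo), returned as (lo, hi)."""
    low_product = a_lo * b_lo  # can be up to 64 bits wide
    lo = low_product & MASK32
    carry = low_product >> 32
    # Cross terms feed the high word; a_hi * b_hi lies entirely above bit 63.
    hi = (carry + a_lo * b_hi + a_hi * b_lo) & MASK32
    return lo, hi


def naive_pairwise(a_lo, a_hi, b_lo, b_hi):
    """What two independent 32-bit multiplies would compute."""
    return (a_lo * b_lo) & MASK32, (a_hi * b_hi) & MASK32


# a = 0x1_0000_0002, b = 3: the true product is 0x3_0000_0006.
correct = mul64_from_halves(2, 1, 3, 0)
wrong = naive_pairwise(2, 1, 3, 0)
```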

Jonathan 



Re: Integer types (was Re: early draft of I/O PDD)

2006-03-07 Thread Leopold Toetsch


On Mar 7, 2006, at 23:44, Jonathan Worthington wrote:



>> - if you write PASM, overlapping Ix/Ly may cause warnings or errors, 
>>   but could be used in a non-portable way, if you know what you are 
>>   doing on a specific platform.
>
> You still didn't address my question with these points, though.
>
> mul L0, L1, L2
>
> Isn't just a case of churning out something like:-
>
> mul I0, I2, I4
> mul I1, I3, I5
>
> So it's not just so simple as a "map 1 L to 2 Is" problem.


Well, there wasn't a question to me ;)

  mul L0, L1, L2

is of course a distinct opcode that does int64 arithmetic.

leo



Re: Integer types (was Re: early draft of I/O PDD)

2006-03-07 Thread Leopold Toetsch


On Mar 7, 2006, at 23:44, Jonathan Worthington wrote:



>> The register mapping rules would be something like:
>> - Lx occupies registers I(2x, 2x+1) - this is compile time,
>>   that is 'L1' prevents 'I2' and 'I3' from being assigned by the 
>>   register allocator
>>
>> - the runtime mapping isn't portable due to endianness and sizeof types
>>   'L1' might be 'I1' on 64-bit arch or (I2,I3) or (I3,I2) on 32-bit 
>>   arch
>
> Yup, and I really, really don't like the idea of making our bytecode 
> format non-portable.  Part of the point of having a VM is portability, 
> right?


The described mapping doesn't have any PBC portability issues AFAIK. 
Whether 'L' maps onto 'I' registers or not is chosen at runtime.
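
The split between compile-time reservation and runtime mapping might look like this. It is a Python sketch only: the function names are hypothetical, and which endianness yields which register order is an assumption, since the thread only says the pair may be (I2,I3) or (I3,I2).

```python
def reserved_at_compile_time(x):
    """I registers that Lx keeps the register allocator from assigning."""
    return (2 * x, 2 * x + 1)


def runtime_mapping(x, word_bits, big_endian=False):
    """I register numbers backing Lx on a given platform.

    The bytecode only ever names Lx; this choice is made when it is loaded.
    """
    if word_bits == 64:
        return (x,)  # a single native register holds the value
    if word_bits == 32:
        first, second = reserved_at_compile_time(x)
        # Assumed: the pair order flips with endianness.
        return (second, first) if big_endian else (first, second)
    raise ValueError("unsupported word size")


l1_on_64bit = runtime_mapping(1, 64)
l1_on_32bit = runtime_mapping(1, 32)
l1_on_32bit_be = runtime_mapping(1, 32, big_endian=True)
```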


leo



Re: Patch for nested macro support

2006-03-07 Thread Joshua Isom
I've committed it as of r11820.  Since it parses by tokens, braces 
inside of strings are allowed.
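
As an illustration of why token-based parsing gives that property, here is a small Python sketch. It is not IMCC's lexer: the `TOKEN` pattern and `braces_balanced` are hypothetical. A quoted string is consumed as one token, so braces inside it never reach the depth counter.

```python
import re

# A double-quoted string (with backslash escapes) is a single token;
# braces are their own tokens; everything else is a run of other characters.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[{}]|[^\s{}"]+')


def braces_balanced(source):
    """Check brace nesting over tokens, ignoring braces inside strings."""
    depth = 0
    for tok in TOKEN.findall(source):
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


ok_close_in_string = braces_balanced('{ print "}" }')
ok_open_in_string = braces_balanced('{ print "{" }')
unbalanced = braces_balanced('{ {')
```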


With regard to clashing, PIR specials take precedence over macros.  The 
complications that could arise from accidental recursion, etc., seem 
considerable.  As for your .local example, you can always use .my or .our 
instead.


.macro my(var, type)
    .sym pmc .var
    .var = new .type
.endm

.sub main
    .my(foo, .String)
    foo = "hello\n"
    print foo
.end

On Mar 7, 2006, at 11:22 AM, Chip Salzenberg wrote:


> Neat: It's backward-compatible and makes macros more useful, so file it
> under "improvement" and commit it.  Two and a half Qs:
>
> It looks to me like this implementation is safe against "{" and "}" in
> strings, right?
>
> (Not a new issue, but since we're on the subject of macros:) If I define a
> macro named eg. ".local", does it expand as a macro, i.e. is it recognized
> before or after core Parrot keywords?  I think the example of Perl keywords
> vs. user-defined functions teaches us it's a good idea for macros to win in
> case of conflict, for backward compatibility when we introduce new keywords.
>
> --
> Chip Salzenberg <[EMAIL PROTECTED]>





[perl #38691] OSX bus error in punie-clone

2006-03-07 Thread via RT
# New Ticket Created by  Will Coleda 
# Please include the string:  [perl #38691]
# in the subject line of all future correspondence about this issue. 
# https://rt.perl.org/rt3/Ticket/Display.html?id=38691


Got the following backtrace working with a very slightly modified  
snapshot of punie.

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x
fetch_buf_be_4 (rb=0xb048 "?", b=0x0) at src/byteorder.c:192
192 memcpy(rb, b, 4);
(gdb) bt
#0  fetch_buf_be_4 (rb=0xb048 "?", b=0x0) at src/byteorder.c:192
#1  0x000238a8 in fetch_op_be_4 (b=0x0) at src/packfile/pf_items.c:192
#2  0x0002399c in PF_fetch_opcode (pf=0xd124b0, stream=0xb144) at src/packfile/pf_items.c:288
#3  0xc9ac in PackFile_Constant_unpack (interpreter=0xd00240, constt=0xd126d0, self=0xd154e0, cursor=0x0) at src/packfile.c:3120
#4  0xd818 in PackFile_ConstTable_unpack (interpreter=0xd00240, seg=0xd126d0, cursor=0x0) at src/packfile.c:2936
#5  0xa770 in PackFile_Segment_unpack (interpreter=0xd00240, self=0xd126d0, cursor=0xd154e0) at src/packfile.c:1301
#6  0xb560 in directory_unpack (interpreter=0xd00240, segp=0xd124b0, cursor=0xfaf4e0) at src/packfile.c:1492
#7  0xa770 in PackFile_Segment_unpack (interpreter=0xd00240, self=0xd124b0, cursor=0xd154e0) at src/packfile.c:1301
#8  0xa98c in PackFile_unpack (interpreter=0xd00240, self=0xd124b0, packed=0xfa9000, packed_size=0) at src/packfile.c:642
#9  0xf410 in Parrot_readbc (interpreter=0xd00240, fullname=0xd12460 "/Users/wcoleda/research/parrot/./languages/apl/lib/PunieGrammar.pbc") at src/embed.c:390
#10 0xd348 in Parrot_load_bytecode (interpreter=0xd00240, file_str=0xf76e90) at src/packfile.c:3330
#11 0x00033634 in Parrot_load_bytecode_sc (cur_opcode=0xfa70f4, interpreter=0x0) at src/ops/core.ops:144
#12 0x00139d94 in runops_slow_core (interpreter=0xd00240, pc=0xfa70f4) at src/runops_cores.c:172
#13 0x0002f3dc in runops_int (interpreter=0xd00240, offset=3) at src/interpreter.c:775
#14 0x0002b438 in runops (interpreter=0xd00240, offs=0) at src/inter_run.c:81
#15 0x0002b638 in runops_args (interpreter=0xd00240, sub=0xd12340, obj=0x281b9d0, meth=0x0, sig=0x18b9f0 "vP", ap=0xb6c4 "") at src/inter_run.c:180
#16 0x0002b758 in Parrot_runops_fromc_args (interpreter=0xd00240, sub=0x0, sig=0x18b9f0 "vP") at src/inter_run.c:274
#17 0x3d40 in main (argc=3, argv=0x0) at compilers/imcc/main.c:686