On May 23, 2007, at 8:06 PM, Will Coleda wrote:
On May 23, 2007, at 1:58 AM, Joshua Isom wrote:
I confess to not grasping the point you claim is simple. As you
understand it, what is there about a register based machine, as
opposed to a stack based machine, that specifically improves the
performance of operating on dynamically typed data, without regard to
performance differences between the two architectures that are
independent of typing models?
I don't believe there's a benefit for a certain type, but for all types.
Less time is spent moving data around, and the same data stays in the
same places. The less you move data around, the more you improve
speed. Proven techniques for compiling to a register based machine
will also be easier to apply. And since speed is often due in large
part to the compiler, a better compiler will create better programs.
So speed will be a combination of parrot version and compiler version.
When it comes to which is better for a type of language, both Java and
Perl 5 use a stack based machine, but Python can be run as native code
or on a JVM, so it's hard to truly say what is best for a type of
language. To me personally, a virtual machine that simulates real
hardware as much as is practical is a great benefit.
It sounds like you are saying that languages are free to implement
their own semantics using their own code, and that they can choose not
to interoperate with predefined Parrot types or types from other
languages when that would negatively impact their goals, such as
performance. While that rings true, it seems that Parrot is not
providing that ability -- languages can already implement whatever
they want without Parrot. And if languages are free to ignore
predefined and foreign types, then what benefit will they actually get
from Parrot?
A language can use whatever semantics it so chooses. But
interoperability between languages will likely be broken, and they
won't be able to utilize future improvements in parrot. Part of the
whole point of parrot is to provide one portable virtual machine for
languages. Languages will be able to benefit from improvements in
parrot without needing to spend all their time on their own virtual
machine. If you have an idea for a new language, you can use parrot as
a base and deal solely with designing and compiling, instead of
running.
Moreover, this does not address my initial question. I am asking, to
rephrase it bluntly, "If Parrot makes dynamic typing faster, doesn't
that have to make static typing slower?" That is, is Parrot making a
tradeoff here? If it is, how large is the tradeoff and what is its
nature? If it is not, then why doesn't everyone else simply do what
you are doing and gain the same benefit?
What about static typing is so fast? Compiler optimizations help, but
those can still exist. I don't see any reason why parrot can't still
run statically typed code faster than dynamically typed code (as is
often the case anyway).
It would seem that Parrot either is different from the JVM and CLR,
due to design or implementation optimizations that favor a specific
typing model over others -- which is what it seems to claim -- or else
it is not, in either design or implementation. If it is not, then it
seems inappropriate for it to make the claim, which raises the
question of why Parrot should be considered a superior target for
dynamically or statically typed language compilers.
Parrot has the benefit of starting from scratch. We aren't trying to
compile java to perl 5's vm or perl 5 to java's vm; instead we're
building a vm that will work well for them all without bolting on slow
hacks late in the game. A language won't need to use all of parrot's
features. Most won't, but they won't need weird hacks to get the
functionality they need either. They can use a subset of parrot.
Personally, I rarely use some of the higher level features of parrot,
but parrot having the additional features doesn't seem to slow down my
code at all. Sometimes I want to mix regular expressions and jitable
assembly.
What tradeoffs could Parrot be making that will have a significant
benefit for dynamically typed languages -- significant enough to
justify the creation of Parrot itself -- without significant detriment
to statically typed languages? Again, if these tradeoffs are so
broadly beneficial, why would the JVM or CLR not simply implement them
themselves?
Most simply: What is being lost to gain whatever is being gained?
I'm not sure anything is lost. Perhaps when a Java to Parrot compiler
and a Perl 5 to Parrot compiler are both finished, we can see how they
compare to their "original" VMs.
I don't understand your answer. Allow me to rephrase and expand the
question.
If Parrot is designed to benefit dynamically typed languages, how will
Parrot handle statically typed code in those languages? Will Parrot
discourage the use of static typing features in languages like Perl by
making that code execute more slowly or less efficiently than
equivalent dynamically typed code?
There's no reason for parrot to hinder one language's performance. Can
you give an example of code that would help others understand your
question? Vague questions get vague answers, or no answers. Clarify
for us, please.
> Perhaps miniparrot can help take care of this. If miniparrot's a
> miniature parrot, and perhaps supporting only those features that
> that language needs, we might be able to get a parrot suited for
> embedded systems. PMC's not needed won't be compiled in, the
> runcores other than the default could be left out, and parrot's size
> could shrink dramatically.
While many things are perhaps true, this answer sounds like "There is
no definite plan for supporting this."
There is no immediate plan for embedded systems. But the groundwork is
already laid.
> > f. How will Parrot support direct access to "unmanaged" resources?
>
> Is this like UnmanagedStruct?
I mean supporting direct access to the underlying address space and
support for determining the sizes of data within that memory. For
example, direct access to a framebuffer.
> > g. How will Parrot facilitate distributed processing?
>
> With native threading support.
I think you misunderstood my question. By "distributed", I meant the
execution of code in multiple address spaces, or the non-concurrent
execution of code. What support will Parrot provide for migrating
data or code between environments with different byte orders? How will
Parrot support capturing execution state into a preservable or
transportable form?
All PIR is compiled to PBC, a portable format (endianness included) for
execution. Parrot can run it directly, or compile and run it in
memory. You can take a PBC file written on a PowerPC and run it on
IA64 (theoretically -- I don't believe it's been tested lately), and
vice versa.
Again, this does not seem to be clear, so I will provide an
example. If a Perl compiler is compiling Perl code, and that code is
written to increment the result of a call into some Python code that
returns a PythonString, how can the compiler ask the PythonString PMC
if it implements the "increment", so that it can detect at compile
time what the behavior of the statement will be?
More broadly, how can statically typed code determine if the values
produced by an operation will conform to the type requirements?
I believe there's some confusion in your understanding of parrot.
Perl doesn't increment a string. Python doesn't increment a string. A
PerlString increments itself, and a PythonString increments itself.
It's possible that you're integrating code for a language that you
don't even have installed, so asking at compile time may not be
possible. Cross language handling probably isn't the most explored
part of parrot, but most issues will be a matter of assumptions about
handling two languages at once rather than of parrot's handling. And
don't forget, most people who use parrot will only use one language at
a time.
What are "basic things"? What if a language inherently differs in how
it handles those things? For example, incrementing a scalar would
seem to be a basic operation in Perl, but Python will not implement
that basic thing in the same way. It would seem that one or both
sides of this cross-language exchange of very basic types of data will
be problematic.
Even if one doesn't implement increment on a string, it doesn't mean
that the "slot" for increment is at the same place.
You say "the best way for parrot" -- how can Parrot have a judgmental
reference point independent from the languages that target it and the
users of those languages?
If a language is being compiled for parrot, the compiler author is
obviously aware of parrot's abilities and potential. A language may be
implemented differently, but if it ignores the concepts that parrot is
built upon, it won't use parrot's capabilities and potential to the
fullest. Using Intel's ABI for a compiler may not be the best method
for a given language, but it helps provide interoperability with other
languages that use it, and lets you benefit from advances Intel makes
in improving that method.
> > d. Will each language have to provide its own support for
> > interacting with PMCs for other languages?
> >
>
> No, the PMC's will do that themselves. Getting the PMC's is another
> story. A language is responsible for its cross language semantics.
> But parrot is designed for the widest possible case. Many languages
> limit valid characters that a subroutine can use, but parrot does
> not. But as long as "common" cases are adhered to, most problems
> will not exist, e.g. no unicode whitespace in a subroutine name.
You say "No" initially, but then go on to say "yes" in substance. If
the PMCs are responsible for this, and if languages provide the PMCs,
then the languages are responsible for this.
I can write a program that would benefit greatly from cross language
communication. Say I wish to combine Perl with AppleScript to control
my computer. If AppleScript doesn't support it but Perl does, I'll
have to use Perl to send data to AppleScript and to retrieve it. If my
AppleScript program has weird subroutine names in it that Perl doesn't
like, I can't call those subroutines. But if I can call those
subroutines and send them a PerlScalar, I can then do things with that
variable in AppleScript as well as Perl. Parrot will handle the
PerlScalar, even in AppleScript, but I have to get it across the
boundary between languages. I will be able to join together two Perl
strings from within AppleScript (although it'll probably return a Perl
string instead of an AppleScript string).
This is only an example, and someone else can probably come up with a
more valid example.
To explicitly state what is implied by this question. If every
language must provide PMCs that understand how to interact with types
of other languages, then languages will only be able to interact with
each other to the degree that one or both of those languages provides
support. For Perl to use data returned from Python code, either Perl
will have to recognize Python types or Python will have to know to
produce Perl types. Then for Perl to call Tcl code, Perl and/or Tcl
will have to be taught about each other. And then for Python to call
Tcl, yet additional code will need to be created. Indeed, it could be
necessary for Python code to call Perl code that calls Tcl code,
because Perl might understand how to handle a Tcl type that Python
does not. And the more languages that are added, the more types each
language will be asked to implement code to interact with.
This seems like a scalability problem.
Vtables are the type. It all comes down to being a Parrot type, using
parrot's interface.
To create the scenario you are envisioning, parrot would need to be
extremely minimalistic, and all opcodes and PMC's would need to be
loaded via a dynpmc library and a dynop library, with no set definition
of how to implement things. Fortunately, "rules" exist! Rules are why
Windows NT was a POSIX system as well as all the other things it
is/was. They aid in the "it just works" idea that helps so many
programmers in their code.
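To make that concrete, here's a rough C sketch of the dispatch idea --
hypothetical code, not parrot's actual structures or API: every value
carries a table of function pointers in a fixed layout, unimplemented
slots fall through to a default that raises an error, and the
interpreter only ever goes through the table.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical sketch only -- NOT parrot's real vtable layout. */
    typedef struct PMC PMC;

    typedef struct {
        void (*increment)(PMC *self);
    } VTable;

    struct PMC {
        const VTable *vtable;   /* the "type" of the value */
        long          ivalue;
    };

    /* Default slot: behaves like throwing an exception. */
    static void default_increment(PMC *self) {
        (void)self;
        fprintf(stderr, "increment not implemented for this type\n");
        exit(EXIT_FAILURE);
    }

    static void integer_increment(PMC *self) {
        self->ivalue++;
    }

    static const VTable integer_vtable = { integer_increment };
    static const VTable string_vtable  = { default_increment };

    /* The interpreter's "inc" op: it never needs to know which
       language created the PMC, it just goes through the table. */
    static void op_inc(PMC *p) {
        p->vtable->increment(p);
    }

    int main(void) {
        PMC n = { &integer_vtable, 41 };
        PMC s = { &string_vtable,  0  };

        op_inc(&n);                 /* fine: slot is implemented    */
        printf("%ld\n", n.ivalue);  /* prints 42                    */

        op_inc(&s);                 /* "exception": the default slot */
        return 0;
    }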
One possible approach would be to tell every language that when they
wish to interact, they must produce Parrot-provided types, like String
or Number. Another possible approach would be for Parrot to forcibly
convert language-specific data to Parrot-provided types. Both of
these approaches have issues.
Incidentally, the JVM/CTS approach is to tell every language to use
the same primitive types all the time and to use the same object types
as close to all the time as possible. (I am only aware of one case of
this, being the separate 'String' type in Rhino, needed to provide
both Java and JavaScript String semantics. In that case, Java code
returns a Java String object and the caller must explicitly convert it
to a JavaScript string with an operation like 'string+""'.)
> > e. How will a PerlScalar interact with a PythonString?
>
> The best method would probably convert both down to a String, do
> whatever operation, and convert up to whatever is requested. But,
> for optimization, multimethod vtables could be used to provide
> custom behavior. I know src/pmc/complex.pmc has some examples of
> multimethod vtables.
See above. The intent of this question is not so much "What could
someone happen to do in this situation", but rather "What exactly will
Parrot enforce, require or provide?"
Crossing the language boundary is up to the language; handling the
language boundary is largely up to parrot and the programmer.
Don't increment a PythonString from Perl. If you write your code with
cross language use in mind, you won't have a problem. But if you try
to use cross language abilities with a function that doesn't expect
it, you might have trouble.
> > f. What will happen when a PythonString is incremented in Perl
> > code?
>
> Parrot calls PythonString's increment vtable. Perl doesn't have an
> increment, but PerlScalar does. Python doesn't have an increment,
> but PythonString does. Now, if the PMC doesn't implement that vtable
> function, an exception is thrown, but Parrot still tries to call it.
This would mean that any cross-language code could generate runtime
exceptions in operations that otherwise are generally considered not to
be able to fail. Indeed, it would seem that every possible operation
would possibly fail at runtime when handling foreign data.
Asking for the third element of a PerlString isn't possible, but it's
perfectly normal in C. There's the potential for a problem, but in
many instances you'll be passing an array to something expecting an
array rather than passing a string to something expecting an array.
This would seem to strongly discourage multi-language programming --
to the point of it never happening.
The reason programming languages exist is to aid development, namely
speed of development. The language used for a program is often chosen
for its features. PGE was written in PIR because it was easier than
writing it in C. This, at least at present, comes with a speed hit,
but it made it easier to implement.
Suppose you want to mix Fortran with Perl 6. Both have their
advantages for different aspects of the coding, so you choose to write
different parts of your program in each. You get quicker development
overall, and fewer lines of code to debug. Suppose you want Perl's
regexes combined with Java's IO libraries; with Parrot it becomes
possible.
What will Parrot do to make this acceptable? Will end-users be forced
to write their own test cases that attempt all valid combinations of
all data between all languages they wish to use?
I'll remember to require a string argument to every Perl subroutine I
write and increment it just to forbid Python from using it.
If you don't intend it for cross language use, then don't worry about
it. If you do intend it for cross language use, be more prudent about
your choices for a language, and work with users of other languages to
improve your library.
> > Comparing the vtable for a PMC to the JVM and CLR base Object
> > classes, the PMC is essentially an "abstract" class with dozens of
> > "unimplemented" methods, while Java's Object provides (and
> > implements) the following public methods:
> >
> > equals getClass hashCode notify notifyAll toString wait
> >
> > Discounting the methods related to Java's peculiar threading
> > implementation, that's:
> >
> > equals getClass hashCode toString
> >
> > Similarly, the CLR's CTS Object provides:
> >
> > Equals ReferenceEquals GetType GetHashCode ToString
> >
> > g. Why is it a good thing that PMCs are essentially non-contractual
> > abstract base classes that define a lot of functionality without
> > implementing it?
>
> In some instances, this is a benefit. Suppose you want an
> auto-iterating string array. For the most part, it's an array with
> normal array properties. But if you get its string value, it
> iterates over the next one. If you set its string value, maybe it
> splices that value into the array. Having both array and string
> properties is beneficial in this case.
I do not see the benefit. You could implement exactly that without
having an undefined, abstract base type. For example, with the
following code (which is clearly simplified):
[...]
Now, this was not the best of examples in the first place, because I
would not argue that 'ToString' is anything other than the kind of
really useful thing you want in a core data type -- the essential
meaning of the routine being "make something a human can read", and
humans are the people using the machines. But, as you can see, there
was no need for the core data type to provide me with an implemented
'addValue' -- it can simply be layered on using more primitive and
extensible runtime support for properties.
But you forget "zz"++ == "aaa" as in perl!
Anyway, wouldn't you much rather write i += "A"? I know I would.
There may be no need for the vtables, but without them you MUST know
how a language implements a given function. With vtables, you don't;
parrot does.
> But the downside is most things, such as an Integer, don't need many
> of the vtables provided. In fact, if you look at the C output of a
> pmc file, you'll see that every vtable is created. I imagine it's
> more for simplicity and speed than for memory (both executable and
> RAM).
I don't see the simplicity or the speed benefit. I do see the memory
cost. If anything, I suspect that these larger objects will fill a
CPU cache faster and be slower to load because of this increased size,
leading to slower runtime performance.
Perhaps some confusion was caused. Parrot has an Integer PMC, like
java's Integer object. Parrot also has an int type, which is quite
simply a native integer and has no more overhead than in C. If you're
worried about your speed and memory footprint in java, you'll use int
far more than Integer.
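As a rough analogy in C (hypothetical, not parrot's actual Integer
PMC), it's the difference between a plain machine integer and a
heap-allocated box around one:

    #include <stdlib.h>

    /* Hypothetical sketch of "native" vs. "boxed" integers -- an
       analogy for int vs. an Integer PMC, not the real thing. */
    typedef struct {
        long value;   /* a real PMC would also carry a vtable pointer, flags, ... */
    } BoxedInt;

    /* Native ints: no allocation, no indirection -- same cost as C. */
    long add_native(long a, long b) {
        return a + b;
    }

    /* Boxed ints: heap allocation and pointer chasing per operation. */
    BoxedInt *add_boxed(const BoxedInt *a, const BoxedInt *b) {
        BoxedInt *r = malloc(sizeof *r);
        if (!r) abort();
        r->value = a->value + b->value;
        return r;
    }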
No, I mean why is the type-specific functionality not pushed down into
the next tier where it is actually needed, like the JVM and CTS do,
leaving the base PMC with only the same four or five methods those
systems have?
We have a Default PMC, which the others utilize. Granted, parrot still
uses a "macro" approach, but it works. As far as I know this is more
implementation than design; parrot could have just one tiny base
class.
Without opening a can of bees, this sounds like Parrot's performance
will vary greatly depending on the number of variables in scope in a
subroutine. While it is generally true for most languages that a
large number of variables can trigger load/store operations when the
register capacity is exceeded, will Parrot switch from JIT code to
purely interpreted code? While most people don't worry about
incurring a few load/store operations, this kind of variation may
cause programmers to alter their programming style significantly in
order to avoid unacceptable performance.
As you say, i386 has fewer registers, but it is a very common
platform. Given that, many programmers may consider it necessary to
write code that will be JIT-able on that platform, leading to a rather
awkward programming style, encouraging the use of a larger number of
subroutines, thus more calling, and ultimately a lot of register
shuffling anyway.
Your computer behaves the same way, most likely. Currently I use
(although not as my "desktop") FreeBSD on AMD64, with 16 integer
registers and 16 floating point registers in total. If I look at the
disassembled output of a function with a large number of variables,
I'm amazed at how few registers it uses, instead opting to constantly
move data in and out of the stack (where local variables are stored).
In fact, this is how Parrot has been able to achieve "faster than C"
status before: because data is kept in registers.
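To see it yourself, compile something like this made-up C function
without optimization and look at the disassembly; the locals live in
the stack frame and get loaded and stored around nearly every use:

    /* A made-up function with more live locals than most ISAs have
       registers. Compile with -O0 and disassemble: the compiler keeps
       the locals in the stack frame and shuttles them through
       registers only long enough to do each operation. */
    long many_locals(long x) {
        long a = x + 1,  b = x + 2,  c = x + 3,  d = x + 4;
        long e = x + 5,  f = x + 6,  g = x + 7,  h = x + 8;
        long i = x + 9,  j = x + 10, k = x + 11, l = x + 12;
        long m = x + 13, n = x + 14, o = x + 15, p = x + 16;
        long q = x + 17, r = x + 18, s = x + 19, t = x + 20;
        return a*b + c*d + e*f + g*h + i*j + k*l + m*n + o*p + q*r + s*t;
    }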
If a programmer really wishes to ensure that the code is jitable, then
they'll need to look at the compiler output and the jit implementation.
But if you're looking at how to write your code to aid jitability, you
shouldn't forget how well your compiler optimizes certain features, and
how much certain assumptions about optimization can hurt your program,
such as arrays of arrays.
When I asked this question, I thought I was asking if the compiler
could suggest which variables should map to registers and which ones
should be loaded/stored. But it seems this is a question of which
subroutines will use registers at all. In that case, I wonder what
mechanisms Parrot will provide to inform a compiler how JIT-able a
subroutine is -- both on the current platform and on other
architectures -- to enable the compiler to know when it would make
sense to either automatically modify the code into JIT-able form, or
to warn the developer.
The tricky thing is, if it's compiled to PBC, parrot's "ELF" as it
were, you can't optimize for a particular platform. Other than being
open source, parrot doesn't provide any capacity for aiding a compiler
with this at the moment, and a design hasn't been implemented, other
than some odd (and probably unportable) use of NCI (the native call
interface) and ManagedStructs.
Frankly, this is not much of an answer. I am not asking if CISC
architectures exist, but rather I am asking why you are choosing to
create one.
Moreover, I am not questioning your choices in terms of design options
and tradeoffs. I am simply looking for the description of why what
you have was done the way you did it.
Why not ask AMD why their processors are RISC processors with a CISC
interpreter, as it were?
But consider a common function in cryptography: bitwise rotation.
Both PPC and x86 have a rotate opcode, but C does not. Simple code
such as "(a << b) | (a >> (sizeof(int)*8 - b))" will be compiled to
the shifts and OR that it specifies. With a CISC vm, this can perhaps
be JIT'ed to one opcode, because Parrot does have a rot opcode.
Instead of trying to match a sequence of ops to turn into one, parrot
provides that one opcode.
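For reference, here's that idiom written out as a self-contained C
function (for a 32-bit word, adjusted so a shift count of zero is
safe); a native compiler has to pattern-match the shift-and-or back
into a single rotate instruction, while a VM with a rot opcode can be
handed the intent directly:

    #include <stdint.h>

    /* The rotate-left idiom from above, for 32-bit words. The
       (32 - b) & 31 keeps the right shift in range when b == 0. */
    static inline uint32_t rotl32(uint32_t a, unsigned b) {
        b &= 31;
        return (a << b) | (a >> ((32 - b) & 31));
    }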
> > b. What is the basis for deciding what will be an operator?
> >
> > c. How can substantial quantities of additional functionality be
> > added to this design cleanly?
>
> New vtables can be added by editing vtable.tbl, new ops can be added
> by adding to src/ops/experimental.ops, new PMCs can just be added to
> src/pmc afaik. New charsets in src/charset, new jit architectures
> under src/jit (just add --jitcapable and it'll try to compile it in).
> I'd say it's a fairly clean layout for expanding things. There's
> even the capacity for adding a new garbage collector.
It is not sufficient to say that one can write the code. How will
Parrot inform an existing compiler that the new operation exists (or
does not exist, if the version of Parrot is older)? Will compilers
themselves have to be recompiled even if they do not use the new
operators?
Your question was about extending parrot's features, not seeing if they
exist. The best method would probably be finding the current version
of parrot and going from there. Once parrot reaches version 1.0, the
bytecode should remain relatively constant. New opcodes may be added,
but older files will work regardless. An older parrot may not run a
new bytecode file, but as long as a new opcode or feature isn't
expected, it should work fine. Given that parrot isn't at version 1.0
yet, these issues are theoretical.
Also, this seems, as a design, to simply be a bag of operations.
Finally, I would like to add some additional questions.
2.h. Will Parrot support inline assembly language?
Inline native assembly, doubtful. But a language could support inline
PIR, which could be jitted, which is somewhat close. With NCI, which
allows calling a C function, the power of parrot can become massive.
2.i. Will Parrot support primitive types?
It does, namely integers, floats, strings, and PMCs, which are much
like pointers.
4.c. How will registers benefit PMCs (e.g. PerlScalar), which are not
primitive types and cannot be stored in a hardware register?
The same way C does, as a pointer.
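In the same spirit as the earlier C sketches (hypothetical types, not
parrot's actual API), the PMC itself lives on the heap and only its
address sits in a register:

    #include <stddef.h>

    /* Hypothetical stand-in for a PerlScalar-like PMC. The struct
       lives on the heap; only its address is held in a register. */
    typedef struct {
        long        ivalue;
        const char *svalue;
    } Scalar;

    long sum_ivalues(Scalar **items, size_t n) {
        long total = 0;
        for (size_t i = 0; i < n; i++) {
            Scalar *s = items[i];   /* 's' is just a pointer: register-sized */
            total += s->ivalue;     /* the data it points at stays on the heap */
        }
        return total;
    }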