Re: Parrot Z-machine

Dan Sugalski Mon, 08 Sep 2003 20:43:35 -0700

At 10:33 AM -0700 9/8/03, Amir Karger wrote:

Before I start, a list question: is Google groups mailing list-aware,
such that posting to Google's perl.perl6.internals group will email
[EMAIL PROTECTED]? Might be more convenient for me than reading
stuff on Google & then logging in to my Yahoo account to post.

The NNTP gateway's pretty much one way. You might try using the news server at nntp.perl.org, though there may still be a delay. News, unfortunately, is rather badly prone to abuse and spamming.

--- Dan Sugalski <[EMAIL PROTECTED]> wrote:
On Sat, 6 Sep 2003, Amir Karger wrote:
 > I'll need to write Zmachine.ops, or some such. It will include all
 > the Z-machine operations, which the bytecode will call.
Yep.
OK. Although Luke Palmer seems to think differently.

That's OK--there are a number of different ways to go about this.

He said:

 I think a z-machine to parrot converter, making some of the
 more complex ops sub calls or something, would be best.  We need to
 work with the z-machine bytecode directly, though, because many games
 are distributed without source.


That sounds more like my "disassemble Zcode to Z assembly, then
translate that to PASM+, i.e., PASM that calls Zcode ops in addition to
the regular ops, but uses the Parrot registers et al., then compile and
run that."

Which you can do, and ultimately is probably the best thing to do. (It is, actually, what I'm suggesting you do, though I think I'm probably skipping a few bits.) That said, writing a z-machine interpreter in parrot assembly, IMCC, or one of the languages that targets parrot is definitely an option too.

Of course, between these two solutions, there would be some overlap in
the Z-specific opcodes, but there's the major question of whether we go
through PASM or not. But as Luke (and my-ticket-to-fame Piers Cawley)
mention, not using PASM loses you some fancy optimizations. (At least I
think that's what they said.) Are these optimizations unnecessary if we
write the whole bytecode implementer in C?

Nah, I think it's a matter of some interpretation. Piers was assuming that if you used a *loadable* opcode library that you couldn't be JITted. That's not the case, though I'm not sure it's a huge issue anyway--there's a limit to how fast Lurking Horror can run, after all. :)

I'm happy to do (or, maybe more likely, try and fail miserably to do)
either method, but The List should probably figure out a direction for
me to go in first. (Given my knowledge of Parrot, I'm probably the
least qualified person on this list right now to make this decision.
Unless you tell me to do whichever I think will be more fun, in which
case I'll write a Z-code to BASIC translator....)

Here's what *I* think would be cool.

1) We have a set of z-machine ops to implement whatever bits you need that are otherwise inconvenient to implement in what ops we have now. (There may not be any, which is fine)

2) We have whatever support parrot assembly routines you might need. (Library code for the interpreter and whatnot)

3) We have code, written either in C or parrot assembly, that can turn a z-machine file into parrot bytecode, parrot assembly, or IMCC source. (This code should rely only on ANSI C89, the C89 libraries, and/or the parrot library routines, and shouldn't do any I/O)

4) We have code that can definitively identify a file as z-code. (This may be the fact that there's a -b:zcode switch immediately before it)

What we do then is weld this all together, and it then gives us the ability to dynamically load up and run z-code files. The flow will be that parrot opens the bytecode file and tries to identify it. Using the code in #4, it identifies it as z-code, and then hands it to the code for #3 (which is a parrot bytecode loader). #3 translates it to parrot bytecode (directly or indirectly) including the support code from #2 (if there is some) and hands it to the interpreter to execute. The interpreter loads up the opcode library from #1 if it needs to, and goes.
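The flow above can be sketched roughly as follows. This is only an illustration in Python, not Parrot code: `LOADERS`, `looks_like_parrot`, `looks_like_zcode`, `translate_zcode`, `load_bytecode` and the `forced` parameter (standing in for something like the -b:zcode switch) are all hypothetical names, the Parrot magic check is a placeholder, and the "translation" just tags the bytes rather than producing real bytecode.

```python
from typing import Callable, List, Optional, Tuple


def looks_like_parrot(data: bytes) -> bool:
    # Placeholder check only; the real Parrot header is not modelled here.
    return data.startswith(b"PARROT")


def looks_like_zcode(data: bytes) -> bool:
    # Piece #4: a stand-in identification. Byte 0 of a story file holds
    # the Z-machine version number, and the header is 64 bytes long.
    return len(data) >= 64 and 1 <= data[0] <= 8


def load_parrot(data: bytes) -> dict:
    # Native bytecode needs no translation and no extra op library.
    return {"code": data, "oplibs": [], "support": []}


def translate_zcode(data: bytes) -> dict:
    # Piece #3: a real loader would emit parrot bytecode. Here we only
    # record which op library (#1) and support code (#2) would be needed.
    return {
        "code": b"<translated>" + data,
        "oplibs": ["zmachine_ops"],
        "support": ["zmachine_support"],
    }


# Hypothetical registry of (name, identify, load) triples, tried in order.
LOADERS: List[Tuple[str, Callable[[bytes], bool], Callable[[bytes], dict]]] = [
    ("parrot", looks_like_parrot, load_parrot),
    ("zcode", looks_like_zcode, translate_zcode),
]


def load_bytecode(data: bytes, forced: Optional[str] = None) -> dict:
    """Identify the file, then hand it to the matching loader."""
    for name, identify, load in LOADERS:
        if forced == name or (forced is None and identify(data)):
            return load(data)
    raise ValueError("unrecognised bytecode format")
```

The interpreter would then load whatever is listed in `oplibs` before executing `code`. Passing `forced="zcode"` skips identification entirely, which is the fallback if no reliable heuristic exists.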

 > > we want parrot to be able to directly load Z-code files, rather
 > > than having to first run an external program to convert them.
 >
 > Right. The problem that I see with this is that Z-code "story"
 > files have a very definite header format, which is almost but not
 > quite entirely unlike Parrot bytecode. Just for example, the first
 > few bytes are totally different, but are necessary for both
 > languages.

Right, which is good.

 What I want to have happen is that when parrot is handed a bytecode
 file to execute, it examines the header of the file to find out what
 type of bytecode it is, and dispatch to the right loader. So when
 you load up a story file as if it were native bytecode, the bytecode
 loading routines identify it as a zcode story file rather than
 parrot bytecode and dispatch it to the zcode loader rather than the
 parrot bytecode loader.


Er, I'll assume you have a magic (pun slightly intended) way to decide
which files are Zcode? I mean, sure, if the rule is "anything that
doesn't match a Parrot header", you're fine, but once you've included
Python bytecode, Java bytecode, and compiled Befunge bytecode (a man
can dream), how will you tell the Zcode from other bytecode noise?

Potentially we can't, in which case we have to punt and force it to be specified on the command line as a parameter indicating bytecode type. That'd be unfortunate, but possible. On the other hand, there may well be some good heuristic that can be applied to identify the code. I'd expect there probably is, though I may be wrong here--many bytecode formats don't embed type or version info into them. (Which, having dealt with *far* too many binary formats, I consider a design flaw. I like binary formats, but it's nice to be able to identify the darned things)

I don't see anything particularly obvious in the header contents:

http://www.inform-fiction.org/zmachine/standards/z1point0/sect11.html
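For what it's worth, one possible heuristic can be sketched from the header fields, assuming my reading of the layout in the standard linked above is right: byte 0 is the version (1 to 8), the word at 0x0E is the base of static memory, the word at 0x1A is the file length divided by a version-dependent constant, and the word at 0x1C is a checksum of everything after the 64-byte header. The function names are made up for illustration, and very early story files may leave the length and checksum unset, so this is a guess-narrowing test rather than proof.

```python
import struct

HEADER_SIZE = 64  # the Z-machine header occupies the first 64 bytes


def _word(data: bytes, offset: int) -> int:
    # Z-machine words are two bytes, most significant byte first.
    return struct.unpack_from(">H", data, offset)[0]


def length_scale(version: int) -> int:
    # The stored file length is divided by 2 (V1-3), 4 (V4-5) or 8 (V6+).
    if version <= 3:
        return 2
    if version <= 5:
        return 4
    return 8


def looks_like_zcode(data: bytes) -> bool:
    """Heuristic check that `data` is plausibly a Z-code story file."""
    if len(data) < HEADER_SIZE:
        return False
    version = data[0]
    if not 1 <= version <= 8:
        return False
    # Dynamic memory must at least cover the header itself.
    static_base = _word(data, 0x0E)
    if static_base < HEADER_SIZE or static_base > len(data):
        return False
    declared = _word(data, 0x1A) * length_scale(version)
    if declared == 0:
        # Length not recorded, so there is nothing further to verify.
        return True
    if declared < HEADER_SIZE or declared > len(data):
        return False
    # Checksum is the sum of all bytes after the header, modulo 0x10000.
    checksum = sum(data[HEADER_SIZE:declared]) & 0xFFFF
    return checksum == _word(data, 0x1C)
```

Requiring the checksum to match is deliberately strict. A more forgiving loader might accept a mismatch and only warn, since the point is to tell Z-code from other formats, not to validate it.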

I assume having a separate Z-code loader solves the "Z-code has
two-byte words" problem? OTOH, that means even more of Parrot
functionality we don't get to use.

Nope. Note that, while the loader gets handed the bytecode file from disk, it can hand anything it likes to Parrot. Sure, in the case of parrot bytecode of the right endianness and word size it gets mmapped in, but you're perfectly within the design to dynamically mash, mangle, munge, warp, bend, fold, spindle, or mutilate that bytecode to do whatever you want. This includes on-the-fly translation from whatever format it is in (say, z-machine bytecode) to parrot bytecode. (Note that it'd be fine for the bytecode loader to see the signature "#!\s*\S+\bperl\b" and decide that the bytecode's really perl source and hand it off to the perl compiler...)
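As a small illustration of that munging, here is a Python sketch (not Parrot's loader API; `is_perl_source` and `widen_words` are hypothetical names) that checks for the perl signature quoted above and widens Z-code's big-endian two-byte words into ordinary integers that a translator could then re-emit in whatever word size it likes.

```python
import re
import struct

# The signature from the text, applied to raw bytes at the start of the file.
PERL_SIGNATURE = re.compile(rb"#!\s*\S+\bperl\b")


def is_perl_source(data: bytes) -> bool:
    """True if the 'bytecode' file is really a perl script."""
    return PERL_SIGNATURE.match(data) is not None


def widen_words(data: bytes) -> list:
    """Decode big-endian 16-bit words into native Python ints.

    A trailing odd byte, if any, is ignored in this sketch.
    """
    count = len(data) // 2
    return list(struct.unpack_from(">%dH" % count, data))
```

For example, `widen_words(b"\x12\x34\x00\x01")` gives `[0x1234, 1]`, so the two-byte word size is handled once, in the loader, and the rest of the pipeline never sees it.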

 > From what I can understand, "native" means running directly from
 > the bytecode, but writing new ops. That is, not using PASM at all. In
 > that case, do I even have access to Parrot's stack? (Or do I just
 > need to access all of it through C?)

 PASM is parrot's assembly, but when you're writing actual op
 functions you're extending parrot's opcodes--C is essentially
 Parrot's microcode.  From within opcode functions you have full
 access to all of the underlying structures, so if you want to access
 Parrot's stack in z-machine ops, go for it.


Hm. Well, *do* I want to access Parrot's stack? Something to think about
if I ever start actually writing code for this project, instead of talking
about it.

Good question. Probably, yes. Pushing things on it is a useful thing, and may match what you need to do well. (IIRC z-code is stack based, but I might remember wrong) You could also have a custom PMC class that holds all your state, the data, or whatever else you want, which is only easily accessible from within translated z-code, too.

-- Dan
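To make the "one object holds all your state" idea concrete, here is a rough Python stand-in for such a custom PMC. The class name `ZState` and its methods are hypothetical. It assumes the usual Z-machine variable numbering, where variable 0 is the top of the stack, 1 to 15 are routine locals, and 16 to 255 are globals, and it simplifies by keeping a single set of locals instead of one per call frame.

```python
class ZState:
    """Illustrative container for Z-machine state (not a real PMC)."""

    def __init__(self, num_locals: int = 0):
        if not 0 <= num_locals <= 15:
            raise ValueError("a routine has at most 15 locals")
        self.stack = []                  # evaluation stack
        self.locals = [0] * num_locals   # variables 1..15
        self.globals = [0] * 240         # variables 16..255

    def read_var(self, number: int) -> int:
        # Reading variable 0 pops the stack.
        if number == 0:
            return self.stack.pop()
        if number < 16:
            return self.locals[number - 1]
        return self.globals[number - 16]

    def write_var(self, number: int, value: int) -> None:
        value &= 0xFFFF  # Z-machine values are 16-bit words
        # Writing variable 0 pushes onto the stack.
        if number == 0:
            self.stack.append(value)
        elif number < 16:
            self.locals[number - 1] = value
        else:
            self.globals[number - 16] = value
```

Translated z-code would only ever go through `read_var` and `write_var`, so whether the stack underneath is Parrot's own or a private one stays an implementation detail.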

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
