Jeff Clites <[EMAIL PROTECTED]> wrote:

> Yep, that was the core of the issue. There's no free lunch--if we use
> the nonvolatile registers, we need to preserve/restore them in
> begin/end, but if we use the volatile registers, we need to preserve
> them across function calls (incl. normal op calls).

Good point, and JIT/i386 does it wrong in core.ops. But normal non-JITted
code is not the problem: the framework preserves mapped registers, or
rather, it has to copy all mapped registers to Parrot registers so that C
code is able to see the actual values.

And the framework also knows not to restore non-volatile registers, at
least if the platform code defines PRESERVED_<type>_REGISTERS and
arranges the registers correctly.

> ... So I added code to
> do the appropriate save/restore, and use the non-volatile registers for
> mapping--that should be less asm than what we'd have to do to use the
> volatile registers. (The surprising thing was that we only got 2
> failures when using the volatile registers--I'll look into creating
> some tests that would detect problems with register preservation.)

Well, allocation strategy on PPC (or I386) is to first use the
non-volatile registers. PPC has 14 usable registers. To provoke a
failure you'd need e.g. 15 different I-registers and then a JITted
function call like C<set_s_sc>. But in normal cases you have more string
functions in that place and - as these are normally not JITted -
registers are saved and restored around the external function.

Oddly, JIT/i386 has that problem too, and there are now only 2
non-volatile registers. But although there are function calls, like
C<string_bool>, the whole test suite passes.

Anyway, I think we need a more general solution. We have basically two
problems:

1) JIT startup and end code size and memory usage

Given: a bunch of small overloaded vtable functions. All of these are
called through runops_fromc_*(). So for every function the PPC JIT would
now move ~ 2 * 300 bytes to and from the stack. That's too much and not
needed in that case.

Solution: The JIT compiler/optimizer already calculates the register
mapping. We have to use this information for JIT pro- and epilogs.

The allocation strategy should be adjusted and depend on code size and
register usage:
- big chunks of compiled code like whole modules should use the
  non-volatile registers first and then (if needed) the volatile
  registers.
- small chunks of code should use the volatile registers to reduce
  function startup and end sizes.
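
A rough sketch of how that size-dependent ordering could look, just to
make the idea concrete. Everything here is made up for illustration: the
register numbers, the SMALL_CHUNK_OPS cutoff, and the names alloc_order()
and prolog_saves() are not existing JIT code. The point is only that once
the order depends on chunk size, the prolog cost falls out of the mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register numbers; real platform code would list the
 * CPU's actual non-volatile (callee-saved) and volatile registers. */
enum { N_NONVOL = 4, N_VOL = 3, N_REGS = N_NONVOL + N_VOL };

static const int nonvol_regs[N_NONVOL] = { 31, 30, 29, 28 };
static const int vol_regs[N_VOL]       = { 12, 11, 10 };

/* Assumed cutoff below which a chunk of code counts as "small". */
#define SMALL_CHUNK_OPS 16

/* Fill order[] with the register allocation order for a chunk of
 * n_ops ops: volatile registers first for small chunks, non-volatile
 * registers first for big ones. Returns the number of entries. */
static size_t
alloc_order(size_t n_ops, int order[N_REGS])
{
    const int  small    = n_ops < SMALL_CHUNK_OPS;
    const int *first    = small ? vol_regs : nonvol_regs;
    const int *second   = small ? nonvol_regs : vol_regs;
    const size_t n_first  = small ? N_VOL : N_NONVOL;
    const size_t n_second = small ? N_NONVOL : N_VOL;
    size_t i, n = 0;

    for (i = 0; i < n_first; i++)
        order[n++] = first[i];
    for (i = 0; i < n_second; i++)
        order[n++] = second[i];
    return n;
}

/* Count the non-volatile registers among the first n_mapped entries
 * of order[]. Only these need a save in the prolog and a restore in
 * the epilog. */
static size_t
prolog_saves(const int order[], size_t n_mapped)
{
    size_t i, j, saves = 0;

    assert(n_mapped <= N_REGS);
    for (i = 0; i < n_mapped; i++)
        for (j = 0; j < N_NONVOL; j++)
            if (order[i] == nonvol_regs[j])
                saves++;
    return saves;
}
```

With these made-up numbers, a small chunk that maps 3 registers needs no
prolog saves at all, while a big chunk mapping the same 3 registers saves
3 non-volatile registers once and then needs no saves around calls.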

2) JITed functions with calls into Parrot

The jit2h JIT compiler needs a hint that we call external code, e.g.

   CALL_FUNCTION("string_copy")

This notion is needed anyway for the EXEC core, which has to generate
fixup code for the executable. So with that information available, the
JIT compiler can surround such code with:

   PRESERVE_REGS();
   ...
   RESTORE_REGS();

Depending on the register mapping, the framework can now save and
restore volatile registers if needed.
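
To illustrate the "if needed" part: a minimal sketch, assuming the
mapping is available as a bitmask of CPU registers. VOLATILE_MASK, the
r3..r12 range, and the function names are assumptions for the example,
not what PRESERVE_REGS() actually expands to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: bit i set means CPU register i is volatile
 * (caller-saved). Here r3..r12 are treated as volatile. */
#define VOLATILE_MASK UINT32_C(0x00001FF8)

/* Bit for a single CPU register number. */
static uint32_t
reg_bit(unsigned reg)
{
    assert(reg < 32);
    return UINT32_C(1) << reg;
}

/* Registers that must be saved before an external call and restored
 * afterwards: those that are both mapped and volatile. */
static uint32_t
regs_to_preserve(uint32_t mapped_mask)
{
    return mapped_mask & VOLATILE_MASK;
}

/* Number of save (or restore) instructions around the call,
 * assuming one instruction per register. */
static unsigned
preserve_insn_count(uint32_t mapped_mask)
{
    uint32_t m = regs_to_preserve(mapped_mask);
    unsigned n = 0;

    while (m) {
        m &= m - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

If only non-volatile registers are mapped, the mask is empty and the
save/restore pair around the call costs nothing.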

> The other tricky part was that saving/restoring the FP registers is one
> instruction per saved register, so saving all 18 was exceeding the asm
> size we allocate in src/jit.c (in some cases), since we emit Parrot_end
> for all restart ops.

Yep, that's suboptimal. I've done that on i386 because it was just easy.
But you are right, the Parrot_end() code should really be there only
once.
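
Something along these lines, as a toy model only: the Emitter struct,
the byte sizes, and emit_end() are invented for the example and do not
correspond to the real code in src/jit.c. The first restart op emits the
full end code, and every later one emits just a branch to it.

```c
#include <assert.h>
#include <stddef.h>

#define EPILOG_SIZE 80   /* assumed size of the full end code, in bytes */
#define BRANCH_SIZE 4    /* assumed size of one branch instruction */

/* Toy emitter state: only tracks offsets, emits no real code. */
typedef struct {
    size_t pos;          /* current offset in the code buffer */
    size_t epilog_at;    /* offset of the shared end code */
    int    have_epilog;  /* non-zero once the end code was emitted */
} Emitter;

/* Emit the end sequence for a restart op. The first call places the
 * full end code; later calls only account for a branch to it. */
static void
emit_end(Emitter *e)
{
    assert(e != NULL);
    if (!e->have_epilog) {
        e->epilog_at   = e->pos;
        e->have_epilog = 1;
        e->pos += EPILOG_SIZE;
    }
    else {
        e->pos += BRANCH_SIZE;   /* branch to e->epilog_at */
    }
}
```

With these numbers, three restart ops cost 80 + 4 + 4 bytes instead of
3 * 80.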

> The attached patch also contains some other small improvements I'd been
> working on, and a few more jitted ops to demonstrate calling a C
> function from a jitted op.

I'll apply it, because it's obviously correct, albeit suboptimal ;) But
we can always improve things later.

> ... , frustratingly, and I've moved to the
> habit of using something like "x/300i jit_code" as a workaround.

Ah, yes - forgot that.

> Clearly it can access the memory region, so it seems like a gdb bug.

Yep.

> JEff

Thanks,
leo
