Re: [Qemu-devel] qemu vs gcc4

Paul Brook Tue, 31 Oct 2006 11:02:50 -0800

On Tuesday 31 October 2006 16:53, Rob Landley wrote:
> On Monday 23 October 2006 2:37 pm, Paul Brook wrote:
> > > > Better to just teach qemu how to generate code.
> > > > In fact I've already done most of the infrastructure (and a fair
> > > > amount of the legwork) for this. The only major missing function is
> > > > code to do softmmu load/store ops.
> > > > https://nowt.dyndns.org/
>
> I looked at the big diff between that and mainline, and couldn't make heads
> nor tails of it in the half-hour I spent on it.  I also looked at the svn
> history, but there's apparently a year and change of it.
>
> I don't suppose there's a design document somewhere?  Or could you quickly
> explain "old one did this, new one does this, the code path diverges here,
> start reading at this point and expect this and this to happen, and if you
> go read this unrelated documentation to get up to speed it might help..."


Not really.

The basic principle is very similar. Guest code is decomposed into an 
intermediate form consisting of simple operations, then native code is 
generated from those operations.

In the existing dyngen implementation most operands to ops are implicit, with 
only a few ops taking explicit arguments. The principle with the new system 
is that all operands are explicit.

The intermediate representation used by the code generator resembles an 
imaginary machine. This machine has various different instructions (qops), 
and a nominally infinite register file (qregs). Each qop takes zero or more 
arguments, each of which may be an input or output.

In addition to dynamically allocated qregs there is a fixed set of qregs that 
map onto the guest CPU state. This is to simplify code generation.
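
To make the "imaginary machine" a bit more concrete, here is a minimal sketch 
of how qops and qregs could be represented. All of the names in it (QOp, 
emit_qop, new_qreg, QREG_FIRST_TEMP, the opcode list) are made up for 
illustration and are not taken from the actual tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: fixed qregs that map onto guest CPU state come
   first, dynamically allocated temporaries are numbered after them.  */
enum { QREG_R0, QREG_R1, QREG_R2, QREG_FIRST_TEMP };

typedef enum { QOP_MOV32, QOP_ADD32, QOP_SHL32 } QOpcode;

typedef struct {
    QOpcode opcode;
    int args[3];    /* qreg numbers; every operand is explicit */
} QOp;

#define MAX_QOPS 64
static QOp qop_buf[MAX_QOPS];
static size_t qop_count;
static int next_qreg = QREG_FIRST_TEMP;

/* Hand out a fresh temporary; the register file is nominally infinite.  */
static int new_qreg(void)
{
    return next_qreg++;
}

/* Append one qop to the buffer for the block being translated.  */
static void emit_qop(QOpcode opcode, int a0, int a1, int a2)
{
    assert(qop_count < MAX_QOPS);
    qop_buf[qop_count++] = (QOp){ opcode, { a0, a1, a2 } };
}
```

A translator would call emit_qop() once per simple operation and leave it to 
the backend to map the qreg numbers onto real host registers.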

Each qreg has a particular type (32/64 bit, integer or float). It's up to you 
to make sure the argument types match those expected by the qop. It's 
generally fairly obvious from the name, e.g. add32 adds I32 values, addf64 
adds F64 values, etc. The exception is that I64 values can be used in place 
of I32. The upper 32 bits of outputs are undefined in this case, and the value 
must be explicitly extended before the full 64 bits are used.
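
The type rule can be written down as a small check. This is only a sketch: 
QType, qtype_accepts, check_arg and zext32_to_64 are hypothetical names, and 
zero extension is picked arbitrarily as the "explicit extension" step (sign 
extension would be just as valid, depending on what the guest needs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { QTYPE_I32, QTYPE_I64, QTYPE_F32, QTYPE_F64 } QType;

/* An argument is acceptable if the types match exactly, or if an I64
   qreg is passed where the qop expects I32.  */
static bool qtype_accepts(QType expected, QType actual)
{
    if (expected == actual)
        return true;
    return expected == QTYPE_I32 && actual == QTYPE_I64;
}

/* Debug-time check a frontend could run on each qop argument.  */
static void check_arg(QType expected, QType actual)
{
    assert(qtype_accepts(expected, actual));
}

/* After a 32-bit op has written an I64 qreg the upper half is
   undefined, so discard it before using the full 64 bits.  */
static uint64_t zext32_to_64(uint64_t v)
{
    return (uint64_t)(uint32_t)v;
}
```

Note that the rule is one-way: an I32 qreg is not accepted where a qop 
expects I64.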

The old dyngen ops are actually implemented as special-case qops.

As an example take the ARM instruction

  add r0, r1, r2, lsl #2

This is equivalent to the C expression

 r0 = r1 + (r2 << 2)

The old dyngen translate.c would do:

  gen_op_movl_T1_r2();
  gen_op_shll_T1_im(2);
  gen_op_movl_T0_r1();
  gen_op_addl(); /* does T0 = T0 + T1 */
  gen_op_movl_r0_T0();

When fully converted to the new system this would become:

  int tmp = gen_new_qreg(); /* Allocate a temporary reg.  */
  /* gen_im32 is a helper that allocates a new qreg and
     initializes it to an immediate value.  */
  gen_op_shl32(tmp, QREG_R2, gen_im32(2)); /* tmp = r2 << 2 */
  gen_op_add32(QREG_R0, QREG_R1, tmp);     /* r0 = r1 + tmp */
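
To check that a sequence like this really computes r0 = r1 + (r2 << 2), here 
is a toy model that records qops and then interprets them. It is a sketch 
only: the QOp struct, the MOVI32 opcode, emit() and run_qops() are invented 
for this example, the shift op is assumed to be called shl32 by analogy with 
add32, and the gen_* functions are mock stand-ins rather than the real ones.

```c
#include <assert.h>
#include <stdint.h>

enum { QREG_R0, QREG_R1, QREG_R2, QREG_FIRST_TEMP, MAX_QREGS = 32 };
enum { MAX_OPS = 16 };

typedef enum { OP_MOVI32, OP_SHL32, OP_ADD32 } Opcode;
typedef struct { Opcode opc; int dest, a, b; uint32_t imm; } QOp;

static QOp ops[MAX_OPS];
static int nb_ops;
static int nb_qregs = QREG_FIRST_TEMP;

static void emit(QOp op)
{
    assert(nb_ops < MAX_OPS);
    ops[nb_ops++] = op;
}

static int gen_new_qreg(void)
{
    assert(nb_qregs < MAX_QREGS);
    return nb_qregs++;
}

/* Mock helper: allocate a qreg and load an immediate into it.  */
static int gen_im32(uint32_t value)
{
    int r = gen_new_qreg();
    emit((QOp){ OP_MOVI32, r, 0, 0, value });
    return r;
}

static void gen_op_shl32(int d, int a, int b) { emit((QOp){ OP_SHL32, d, a, b, 0 }); }
static void gen_op_add32(int d, int a, int b) { emit((QOp){ OP_ADD32, d, a, b, 0 }); }

/* Translate "add r0, r1, r2, lsl #2" into qops.  */
static void translate_add_lsl2(void)
{
    int tmp = gen_new_qreg();
    gen_op_shl32(tmp, QREG_R2, gen_im32(2));
    gen_op_add32(QREG_R0, QREG_R1, tmp);
}

/* Interpret the recorded qops against an array of qreg values.  */
static void run_qops(uint32_t *regs)
{
    for (int i = 0; i < nb_ops; i++) {
        const QOp *op = &ops[i];
        switch (op->opc) {
        case OP_MOVI32: regs[op->dest] = op->imm; break;
        case OP_SHL32:  regs[op->dest] = regs[op->a] << (regs[op->b] & 31); break;
        case OP_ADD32:  regs[op->dest] = regs[op->a] + regs[op->b]; break;
        }
    }
}
```

With r1 = 10 and r2 = 3, calling translate_add_lsl2() and then run_qops() 
leaves 22 in r0. The real backend of course generates host code from the 
qops instead of interpreting them.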

One of the changes I've made to target-arm/translate.c is to replace all uses 
of T2 with new pseudo-regs. In many cases I've left the code structure as it 
was (using the global T0/T1 temporaries), but replaced the dyngen ops with 
the equivalent qops, e.g. movl and andl now generate mov32 and and32 qops.

The standard qops are defined in qops.def. A target can also define additional 
qops in qop-target.def. The target-specific qops are there to simplify 
implementation of the i386 static flag propagation pass (see the expand_op_* 
routines).

For operations that are too complicated to be expressed as qops there is a 
mechanism for calling helper functions. The m68k target uses this for 
division and a couple of other things.

The implementation makes fairly heavy use of the C preprocessor to generate 
code from .def files. There's also a small shell script that pulls the 
definitions of the helper routines out of qop-helper.c.
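
For anyone who hasn't seen the .def trick before, the general technique looks 
roughly like this. To keep the sketch self-contained, a macro (QOP_LIST) 
stands in for the .def file that would normally be #included several times 
with different definitions of DEF; the op list, the DEF(name, nargs) format 
and qop_lookup are illustrative assumptions, not the actual contents of 
qops.def.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a .def file: one DEF(name, nargs) entry per op.  */
#define QOP_LIST(DEF) \
    DEF(mov32, 2)     \
    DEF(add32, 3)     \
    DEF(and32, 3)

/* Expansion 1: an enum with one value per op.  */
#define DEF_ENUM(name, nargs) QOP_##name,
enum { QOP_LIST(DEF_ENUM) NB_QOPS };

/* Expansion 2: a table of op names, e.g. for debug dumps.  */
#define DEF_NAME(name, nargs) #name,
static const char *const qop_names[] = { QOP_LIST(DEF_NAME) };

/* Expansion 3: a table of argument counts.  */
#define DEF_NARGS(name, nargs) nargs,
static const int qop_nargs[] = { QOP_LIST(DEF_NARGS) };

/* Map an op name back to its enum value, or -1 if unknown.  */
static int qop_lookup(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < NB_QOPS; i++) {
        if (strcmp(qop_names[i], name) == 0)
            return i;
    }
    return -1;
}
```

The point of the technique is that adding an op means adding one line to the 
list, and the enum and all the tables stay in sync automatically.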

The debug dumps can be quite useful. In particular -d in_asm,op will dump the 
input asm and the resulting OPs.

For converting targets you can probably ignore most of the translate-all and 
host-*/ changes. These implement generating code from the qops. This works by 
the host defining a set of "hard" qregs that correspond to host CPU 
registers, and constraints for the operands of each qop. Then we do register 
allocation and spilling to satisfy those constraints. The qops can then be 
assembled directly into binary code.

There are also mechanisms for implementing floating point and 64-bit arithmetic 
even if the host doesn't support this natively. The target code doesn't 
need to worry about this; it just generates 64-bit/fp qops and they will be 
decomposed as necessary.
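
As an illustration of what decomposing a 64-bit op means, here is the 
arithmetic for a 64-bit add done with 32-bit operations only. This shows the 
general idea under my own naming (Pair32, add64_via_32 and the other helpers 
are hypothetical); it is not the code the backend actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a 32-bit host.  */
typedef struct { uint32_t lo, hi; } Pair32;

static Pair32 u64_to_pair(uint64_t v)
{
    Pair32 p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static uint64_t pair_to_u64(Pair32 p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* add64 decomposed into 32-bit adds plus a carry.  */
static Pair32 add64_via_32(Pair32 a, Pair32 b)
{
    Pair32 r;
    r.lo = a.lo + b.lo;
    uint32_t carry = r.lo < a.lo;   /* low half wrapped around */
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* Compare the decomposed add against the native 64-bit add.  */
static void check_add64(uint64_t x, uint64_t y)
{
    Pair32 r = add64_via_32(u64_to_pair(x), u64_to_pair(y));
    assert(pair_to_u64(r) == x + y);
}
```

For example, 0x1ffffffff + 1 carries out of the low half and gives 
0x200000000, which check_add64() confirms against the native result.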

Paul


_______________________________________________
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
