[Qemu-devel] [RFC] RISC-V Decoder generator

Bastian Koppelmann Fri, 20 Oct 2017 06:49:15 -0700

Hi QEMU and RISC-V folks,

I asked you for feedback a while ago regarding a modular RISC-V QEMU
target (see discussion [1]). I tried getting it to work with the good
old C preprocessor and quickly realized that it is too limiting. Instead
I created a data-driven decoder generator written in Python (see the
code on GitHub [2]) using YAML as a description language.


I'd like to get some feedback on whether this is acceptable to be
upstreamed to QEMU, or whether you have any suggestions for improvements.
Right now only the RV32I instructions are implemented in this scheme.

Below I will roughly sketch how the data format for RISC-V
instructions works and then what the result looks like after processing.
You can find the full tree on GitHub [2] with all the relevant files in
'target-riscv/decode-gen/'.

The RISC-V data in YAML consists of:

1) prefixes
2) instruction formats
3) instruction specification using the data of 1) and 2)

1) prefixes have:
   - name
   - bitmask that identifies instructions belonging to it
   - length that specifies the length in bits for all instructions with
     this prefix

Example prefix description for 16 bit instructions:

16Bit: {
    bitmask: ~0b11,
    len: 16
    }
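
As a rough illustration of what a prefix expresses (this is not code from
the tree), the sketch below assumes that '~0b11' means "the two lowest bits
are anything other than 0b11", which is how the RISC-V base encoding marks
16-bit instructions. The function name insn_length is hypothetical, and
encodings longer than 32 bits are ignored.

```python
def insn_length(first_halfword):
    """Return the instruction length in bits, judged by the two lowest bits.

    Assumption: only 16-bit and 32-bit encodings exist; longer encodings
    are not handled in this sketch.
    """
    if first_halfword & 0b11 != 0b11:
        return 16   # matches the '16Bit' prefix
    return 32       # low bits are 0b11, i.e. a 32-bit instruction


print(insn_length(0x4501))  # low bits 0b01
print(insn_length(0x8463))  # low bits 0b11
```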

2) instruction formats have:
    - name
    - list of immediate fields
    - list of 'normal' fields (e.g. register identifier)

    The immediate fields are specified separately in the same file with:
        - name
        - data: in-order list of the individual fields + possible sign
                extension
        - fill: how is the immediate filled from the LSB (for now only
                '0' fills are allowed)

    Example for B-type immediate:
        B: {data: [31s, 7,  30..25, 11..8], fill: '0'}
        Note here that bit 31 is used for sign extension as indicated by
        the 's'.
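
    To make the semantics of 'data' and 'fill' concrete, here is a minimal
    sketch of how such a description could be evaluated. It is not the
    generator's actual code. The tuple representation (hi, lo, signed), the
    helper names and the explicit fill_bits parameter are assumptions made
    for illustration; the description above does not say how the number of
    fill bits is determined.

```python
def extract_bits(insn, hi, lo):
    """Return bits hi..lo (inclusive) of insn as an unsigned integer."""
    return (insn >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_immediate(insn, data, fill_bits):
    """Concatenate the fields in `data` (most significant first), append
    `fill_bits` zero bits at the LSB end, and sign-extend if requested.

    Each entry of `data` is (hi, lo, signed). Assumption: only the first
    entry may be signed, and its top bit is the sign of the immediate.
    """
    value = 0
    width = 0
    for hi, lo, _signed in data:
        n = hi - lo + 1
        value = (value << n) | extract_bits(insn, hi, lo)
        width += n
    value <<= fill_bits          # '0' fill from the LSB
    width += fill_bits
    if data[0][2] and value & (1 << (width - 1)):
        value -= 1 << width      # sign extension
    return value


# B: {data: [31s, 7, 30..25, 11..8], fill: '0'}
IMM_B = [(31, 31, True), (7, 7, False), (30, 25, False), (11, 8, False)]

print(decode_immediate(0x00208463, IMM_B, fill_bits=1))  # beq x1, x2, +8  -> 8
print(decode_immediate(0xFE000EE3, IMM_B, fill_bits=1))  # beq x0, x0, -4  -> -4
```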

    The 'normal' fields are similar to immediate fields but only have:
        - name
        - single field description

    Example for 'rs1' field:
        - rs1: 19..15

Example for a full instruction format 'SB':
    - SB: {
      fields: [rs1,rs2],
      immediates: [B]
      }
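
A format therefore just names which fields and immediates to extract. The
following sketch shows the 'normal' field part only, with the YAML data
written as Python dictionaries. The bit range for 'rs2' (24..20) comes from
the RISC-V specification and not from the example above, and the function
name decode_fields is hypothetical.

```python
# 'normal' field descriptions: name -> (high bit, low bit)
FIELDS = {"rs1": (19, 15), "rs2": (24, 20)}

# SB: {fields: [rs1, rs2], immediates: [B]}
SB_FORMAT = {"fields": ["rs1", "rs2"], "immediates": ["B"]}


def extract_bits(insn, hi, lo):
    """Return bits hi..lo (inclusive) of insn as an unsigned integer."""
    return (insn >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_fields(insn, fmt):
    """Extract every 'normal' field that the format lists."""
    return {name: extract_bits(insn, *FIELDS[name]) for name in fmt["fields"]}


print(decode_fields(0x00208463, SB_FORMAT))  # beq x1, x2, +8
```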

3) instruction specifications have:
    - name
    - instruction format from 2)
    - prefix from 1)
    - keymap of opcode fields and the value they take for this
      instruction
    - func for binary translation

    The opcode fields are specified separately in the same file in the
    same manner as the 'normal' fields.
    Example for FUNC3:
        - FUNC3_32: 14..12

Example for the 'BEQ' instruction:
    BEQ: {format: SB, prefix: 32Bit, opcodes: {OP1_32: 99, FUNC3_32: 0},
          func: gen_beq}

    'OP1_32: 99' means that the field OP1_32, defined by bits 6 down to 0,
    has to have the value 99.
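
    Put differently, an instruction word matches a specification when every
    opcode field in the keymap holds the listed value. A minimal sketch of
    that check follows; the dictionary layout and the function name matches
    are assumptions for illustration, while the field ranges and the BEQ
    entry are the ones given above.

```python
# opcode field descriptions: name -> (high bit, low bit)
OPCODE_FIELDS = {"OP1_32": (6, 0), "FUNC3_32": (14, 12)}

BEQ = {
    "format": "SB",
    "prefix": "32Bit",
    "opcodes": {"OP1_32": 99, "FUNC3_32": 0},
    "func": "gen_beq",
}


def extract_bits(insn, hi, lo):
    """Return bits hi..lo (inclusive) of insn as an unsigned integer."""
    return (insn >> lo) & ((1 << (hi - lo + 1)) - 1)


def matches(insn, spec):
    """True if every opcode field of spec has the required value in insn."""
    return all(
        extract_bits(insn, *OPCODE_FIELDS[field]) == value
        for field, value in spec["opcodes"].items()
    )


print(matches(0x00208463, BEQ))  # beq x1, x2, +8
print(matches(0x00209463, BEQ))  # same word with FUNC3_32 == 1
```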


The Python script then converts these instructions into a tree of
switch-case statements based on the specified opcodes. For each
instruction it generates two function calls in the corresponding case:

1) decode_format_* depending on the instruction format used by this
   instruction.
   The decode_format_* functions are also generated by the script and
   take care of decoding the 'normal' fields and the immediate fields.
   This data is stored in a data structure 'DisasFormats' which needs to
   be passed to any translation function used by instructions.

2) A call to the translation function defined by 'func' in 3)
   These need to be defined by the developer in translate.c. For example
   for LUI this looks like:

static void gen_lui(CPURISCVState *env, DisasContext *ctx)
{
    if (ctx->fmt.field_rd == 0) {
        return; /* NOP */
    }
    tcg_gen_movi_tl(cpu_gpr[ctx->fmt.field_rd], ctx->fmt.immediate_U);
}
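
To give an idea of the switch-case generation described above, here is a
much simplified sketch of such a generator. It is not the script from the
tree: the extract_* helpers, the decode_format_*(ctx) signature, the BNE
entry with gen_bne and the fixed field order are all assumptions made for
illustration.

```python
INSNS = {
    "LUI": {"format": "U", "opcodes": {"OP1_32": 55}, "func": "gen_lui"},
    "BEQ": {"format": "SB", "opcodes": {"OP1_32": 99, "FUNC3_32": 0},
            "func": "gen_beq"},
    "BNE": {"format": "SB", "opcodes": {"OP1_32": 99, "FUNC3_32": 1},
            "func": "gen_bne"},
}


def emit(insns, fields, indent=0):
    """Return C source lines for a switch tree over the given opcode fields."""
    pad = " " * indent
    field, rest = fields[0], fields[1:]
    lines = [f"{pad}switch (extract_{field}(insn)) {{"]
    # Group the instructions by the value they require for this field.
    groups = {}
    for name, spec in insns.items():
        groups.setdefault(spec["opcodes"][field], {})[name] = spec
    for value in sorted(groups):
        group = groups[value]
        lines.append(f"{pad}case {value}:")
        if rest and all(rest[0] in s["opcodes"] for s in group.values()):
            # More opcode fields to distinguish: emit a nested switch.
            lines += emit(group, rest, indent + 4)
        else:
            # Leaf: exactly one instruction is expected here.
            (spec,) = group.values()
            lines.append(f"{pad}    decode_format_{spec['format']}(ctx);")
            lines.append(f"{pad}    {spec['func']}(env, ctx);")
        lines.append(f"{pad}    break;")
    lines.append(f"{pad}}}")
    return lines


print("\n".join(emit(INSNS, ["OP1_32", "FUNC3_32"])))
```

The outer switch dispatches on OP1_32; only the case shared by BEQ and BNE
gets a nested switch on FUNC3_32, while LUI becomes a leaf directly.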

Thanks for any comments,
Bastian

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg07735.html
[2] https://github.com/bkoppelmann/riscv-qemu/tree/decoder
