Re: [PATCH v1 4/4] x86-disas: add x86-mini disassembler implementation

Michael Clark Sun, 26 Jan 2025 15:23:23 -0800

On 1/24/25 13:10, Michael Clark wrote:

+static x86_opc_prefix x86_table_make_prefix(const x86_opc_data *d,
+    const x86_opr_data *o, const x86_ord_data *p)
+{
+    x86_opc_prefix tp;
+    memset(&tp, 0, sizeof(tp));
+
+    /* extract prefix and synthesize width prefixes */
+    switch (x86_enc_type(d->enc)) {
+    case x86_enc_t_lex:
+    case x86_enc_t_vex:
+    case x86_enc_t_evex:
+        switch (d->enc & x86_enc_w_mask) {
+        case x86_enc_w_wig:
+        case x86_enc_w_wn:
+        case x86_enc_w_wb:
+        case x86_enc_w_w0: break;
+        case x86_enc_w_w1: tp.pfx = x86_enc_p_rexw; break;
+        case x86_enc_w_wx: tp.pfx_w = x86_enc_p_rexw; /* fallthrough */
+        case x86_enc_w_ww: tp.pfx_o = x86_enc_p_66; break;
+        }
+        break;
+     }
+
+    /* find register or memory operand mapping to modrm.rm field
+     * so that we can add mod=0b11 or mod!=0b11 to modrm mask */
+    tp.modfun = x86_enc_func(d->enc) == x86_enc_f_modrm_n;
+    for (size_t i = 0; i < array_size(o->opr) && o->opr[i]; i++) {
+        uint isreg = x86_opr_type_val(o->opr[i]) >= x86_opr_reg;
+        uint ismem = x86_opr_has_mem(o->opr[i]);
+        uint ismrm = x86_ord_type_val(p->ord[i]) == x86_ord_mrm;
+        if (ismrm) {
+            if (isreg && !ismem) {
+                tp.modreg = 1; /* mod == 0b11 */
+                break;
+            } else if (!isreg && ismem) {
+                tp.modmem = 1; /* mod != 0b11 */
+                break;
+            }
+        }
+    }


I am already relatively strictly following QEMU conventions for
single-line and multi-line comments. I don't remember seeing a
checkpatch.pl report for the above multi-line comment. I wonder
if the constraint is restricted to top-level comments? I left
this as-is due to symmetry with the other comments in this block
as they are single-line and it looks better. this is very pedantic.

so am curious about a determination on the commenting convention.

I have a disagreement about unbraced 'if' in switch because IMHO
the QEMU rules make combinatorial logic sparse and hard to read.
I also typically put the first brace of functions on their own
line so there is space between function signature and first line.
those are the two main rules I break. my coding-style has lots of
switch statements because I rely on switch constant analysis and
jump table conversion because it tends to translate to fast code.
also I linearize them explicitly because not all compilers detect
a constant stride for index canonicalization in switch constants.
that is where the bulk of the checkpatch.pl warnings show up.

+    /* explict second opcode byte has mod == 0b11 */
+    if (d->opm[1] == 0xff && (d->opc[1] & 0xc0) == 0xc0 &&
+        !tp.modreg && !tp.modmem)
+    {
+        tp.modreg = 1;
+    }
+
+    return tp;
+}


fyi I have made a x86-mini-v2 branch with the changes I
mentioned in an email yesterday but it is still in flux:

https://github.com/michaeljclark/qemu/commits/x86-mini-v2

I still don't expect this code to be merged any-time-soon.
maybe in the future if QEMU gains the ability to run TCG
full-system under hypervisor with accelerated page tables.
I'm also considering digging out the most recent QEMU or tcc
TCG that is fully MIT-licensed as a starting point for IR.

the patch set may eventually become compelling if there is
a complete TCG supporting all AVX-512 vector ops. maybe this
is already there but I am not reading the GPL code in QEMU.
I am unsure about that code because a lot of code is by-file
licensed. it is questionable when the provenance originates
inside QEMU as opposed to MIT-licensed or dual-licensed code
(like FreeType) that is clearly of external origin. as a
modern embedded TCG would be quite useful in Rui's chibicc
or the MIT-licensed version of tcc:

- https://github.com/rui314/chibicc
- https://github.com/absop/Tinycc/blob/master/RELICENSING

and I am confessing about a snippet of code. this may look
familiar. there is one enum and two functions that were
derived from TCG X86 but they are unused. so I should add
Fabrice's original TCG MIT-license header or remove them.

I am pretty sure I checked the provenance for the version
of TCG that I used but I need to add an explicit reference
or delete these definitions. they can't really be written
a different way because they are more like reference data
and invariant logic based on X86 ISA constraints:

+/*
+ * test conditions
+ */
+
+enum
+{
+    /* non-signed */
+    x86_never  = (0 | 0 | 0 | 0),
+    x86_always = (0 | 0 | 0 | 1),
+    x86_eq     = (8 | 0 | 0 | 0),
+    x86_ne     = (8 | 0 | 0 | 1),
+    /* signed */
+    x86_lt     = (0 | 0 | 2 | 0),
+    x86_ge     = (0 | 0 | 2 | 1),
+    x86_le     = (8 | 0 | 2 | 0),
+    x86_gt     = (8 | 0 | 2 | 1),
+    /* unsigned */
+    x86_ltu    = (0 | 4 | 0 | 0),
+    x86_geu    = (0 | 4 | 0 | 1),
+    x86_leu    = (8 | 4 | 0 | 0),
+    x86_gtu    = (8 | 4 | 0 | 1),
+};

+/*
+ * invert condition
+ */
+
+static inline uint x86_invert_cond(uint c) {
+    return c ^ 1;
+}
+
+/*
+ * swap condition operands
+ */
+
+static inline uint x86_swap_cond(uint c) {
+    return c & 6 ? c ^ 9 : c;
+}

Re: [PATCH v1 4/4] x86-disas: add x86-mini disassembler implementation

Reply via email to