a note to announce a port of the x86-mini disassembler to QEMU.

- https://github.com/michaeljclark/qemu/tree/x86-mini
- https://github.com/michaeljclark/x86/tree/x86-mini
# x86-mini

the x86-mini library is a lightweight x86 encoder, decoder, and disassembler that uses extensions to the Intel instruction set metadata format to encode modern VEX/EVEX instructions and legacy instructions using a parameterized LEX (legacy extension) format.

- metadata-driven disassembler with Intel format output.
- written in C11 for compatibility with projects written in C.
- low-level instruction encoder and decoder use <= 32 bytes.
- python tablegen program to generate C tables from CSV metadata.
- metadata table tool to inspect operand encode and decode tables.
- carefully checked machine-readable instruction set metadata.
- support for REX/VEX/EVEX and preliminary support for REX2.

the x86-mini x86 encoder and decoder library has been written from scratch to be modern and as simple as possible while also covering recent additions to the Intel and AMD 64-bit instruction sets, such as the EVEX encodings for recent AVX-512 extensions. REX2/EVEX encodings for Intel APX will follow soon, as the library was written with them in mind.

## interest to the QEMU community

- x86-mini is fast. raw decode performance is ~100-200 MiB/sec.
- x86-mini is small. 5 files, ~5 KLOC or ~13 KLOC including tables.
- x86-mini is complete and includes the latest AVX-512 extensions.
- x86-mini is easy to extend and uses extended Intel format metadata.
- x86-mini is documented with detailed info on the metadata format.
- x86-mini has CLI tools for searching x86 instruction set metadata.

## technical notes

- the decoder is table-based and uses a metadata interpreter.
- the decode table is ~66 KiB with a ~150 KiB acceleration trie.
- there are currently 3658 opcode entries active on x86-64, which expand to 4775 table entries due to parameterization.
- it could be made faster by vectorizing the prefix decoder and by generating decode templates from the metadata, so that metadata interpretation is constant-evaluated (consteval) ahead of time, eliminating some L1 D$ traffic.
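as a rough illustration of how a parameterized opcode entry can expand into several table entries and then be found through a trie, here is a minimal sketch in python. it is not the x86-mini tablegen or its LEX format: the row layout, the function names, and the use of the Intel SDM `+r` convention (register number in the low 3 bits of the opcode byte) are all assumptions made for the example.

```python
# Hypothetical metadata rows: (mnemonic, opcode template).
# "+r" follows the Intel SDM convention of adding a register number (0-7)
# to the opcode byte; it stands in for x86-mini's own parameterization here.
ROWS = [
    ("push", "50+r"),
    ("pop", "58+r"),
    ("nop", "90"),
    ("syscall", "0F 05"),
]

def expand(rows):
    """Expand each row into concrete (opcode bytes, mnemonic, reg) entries."""
    entries = []
    for mnemonic, template in rows:
        parts = template.split()
        prefix = bytes(int(p, 16) for p in parts[:-1])
        last = parts[-1]
        if last.endswith("+r"):
            base = int(last[:-2], 16)
            # one parameterized row becomes eight table entries
            for reg in range(8):
                entries.append((prefix + bytes([base + reg]), mnemonic, reg))
        else:
            entries.append((prefix + bytes([int(last, 16)]), mnemonic, None))
    return entries

def build_trie(entries):
    """Build a nested-dict trie keyed by successive opcode bytes."""
    root = {}
    for opcode, mnemonic, reg in entries:
        node = root
        for b in opcode[:-1]:
            node = node.setdefault(b, {})
        node[opcode[-1]] = (mnemonic, reg)
    return root

def lookup(trie, code):
    """Walk the trie over the input bytes; return (mnemonic, reg, length) or None."""
    node = trie
    for i, b in enumerate(code):
        node = node.get(b)
        if node is None:
            return None
        if isinstance(node, tuple):
            return node + (i + 1,)
    return None  # ran out of bytes before reaching a leaf

entries = expand(ROWS)
trie = build_trie(entries)
print(len(entries))                  # 4 rows expand to 18 entries
print(lookup(trie, bytes([0x53])))   # ('push', 3, 1)
print(lookup(trie, b"\x0f\x05"))     # ('syscall', None, 2)
```

the point of the sketch is only the shape of the pipeline: expansion happens once at table generation time, so the decoder itself only does byte-indexed lookups.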
after cherry-picking the commit, one can test host and target disassembly support. e.g. for an x86-64 target on an x86-64 host:

    $ echo aaa | qemu-x86_64 -d in_asm,out_asm /usr/bin/openssl sha256

## caveats and limitations

- supports 32-bit and 64-bit disassembly, and theoretically 16-bit.
- designed to support 16-bit but base index formats are not done yet.
- x86-64 is exhaustively fuzz-tested against the LLVM disassembler.
- but x86-mini is new and hasn't been battle-tested in production.

if you already link with capstone then it doesn't provide very many immediate benefits. however, I think it is potentially useful as a small embeddable disassembler to evaluate for potential inclusion.

## rationale

I worked on the QEMU disassembler while working on the QEMU RISC-V target back in 2017/2018 and I was curious about vector support. it seemed at the time that TCG vector support was piecemeal, plus the old x86 disassembler seemed messy and incomplete. I also needed an MIT-licensed disassembler to enable use in a commercial product.

basically, I was looking for a lightweight symmetric x86 instruction encoder and decoder library in pure C with simple build requirements. that is what prompted this initiative.

it would be nice to have an x86 disassembler building out-of-the-box, as I find QEMU's built-in tracing extremely useful, and given x86 is a popular target, a small embedded disassembler might be practical.

## summary and conclusion

at minimum, the metadata may be useful for x86 EVEX support. note I see `tests/tcg/i386/x86.csv` in the source tree. the x86-mini metadata is also based on x86-csv but has had numerous inaccuracies fixed, and legacy instructions have been converted to the new LEX format. in effect the metadata has been fuzz-tested against LLVM for x86-64, and ISA coverage is on the order of 99.7%.

the main branch of the linked repo has a procedural fuzzer for metadata-based instruction synthesis that could be useful for generating test cases for QEMU.
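for readers unfamiliar with checking one disassembler against another, the general idea can be sketched as a small differential test harness. this is not the x86-mini fuzzer, which synthesizes instructions from metadata; the sketch below swaps in plain random byte strings, and `reference` and `candidate` are hypothetical stand-ins for real disassembler bindings (for example LLVM and x86-mini).

```python
import random

def differential_fuzz(disasm_a, disasm_b, iterations=1000, max_len=15, seed=0):
    """Feed identical random byte strings to two disassemblers.

    Returns a list of (code, output_a, output_b) for every disagreement.
    max_len defaults to 15, the architectural limit on x86 instruction length.
    """
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    mismatches = []
    for _ in range(iterations):
        length = rng.randint(1, max_len)
        code = bytes(rng.randrange(256) for _ in range(length))
        out_a, out_b = disasm_a(code), disasm_b(code)
        if out_a != out_b:
            mismatches.append((code, out_a, out_b))
    return mismatches

# Hypothetical stand-ins: a real harness would call into two disassemblers
# and normalize their text output before comparing.
def reference(code):
    return code.hex()

def candidate(code):
    # deliberately disagrees on inputs starting with 0x90 to show a mismatch
    return "nop" if code[0] == 0x90 else code.hex()

for code, a, b in differential_fuzz(reference, candidate, iterations=5000):
    print(code.hex(), "reference:", a, "candidate:", b)
```

in practice most of the work is in normalizing the two output formats so that only real decoding differences are reported.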
I am kind of throwing this over the fence, although the code is quite self-contained and my stress and mental health are now under control. also, I have not yet run checkpatch.pl on this code. it is a preview.

x86-mini submaintainer.
Michael Clark.
--