https://llvm.org/bugs/show_bug.cgi?id=26279
Bug ID: 26279
Summary: X86 Decoder doesn't decode properly REP STOSW
Product: new-bugs
Version: 3.8
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedb...@nondot.org
Reporter: ya...@probud.homeip.net
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

In LLVM 3.7.0 and 3.7.1 (I couldn't get any newer version downloaded, but 3.7.1 is very new), if one passes REP STOSW (0x66 0xF3 0xAB) to the getInstruction() call of llvm::MCDisassembler in X86 32-bit mode, one gets back "stosd dword ptr es:[edi], eax" (instruction opcode STOSL, op 0 - register EDI). The correct outcome should be either a REP_PREFIX instruction followed by a STOSW instruction with op 0 - register DI, or a REP STOSW instruction with op 0 - register DI.

The 0x66 0xF3 0xAB encoding of REP STOSW was generated by the Microsoft Visual Studio 2010 compiler, so I expect this encoding to be widely used.

As far as I could ascertain from looking at the code, the problem stems from the use of insn->necessaryPrefixLocation in X86DisassemblerDecoder.cpp. Both prefixes of the instruction are detected correctly: the operand size override (0x66) and the rep prefix (0xF3). However, readPrefixes() sets necessaryPrefixLocation to the location of the rep prefix. Later, getID() contains the line "if( insn->mode != MODE_16BIT && isPrefixAtLocation(insn, 0x66, insn->necessaryPrefixLocation))", which looks for the operand size override prefix at the location of the rep prefix. It does not find it there, so the instruction is classified as STOSL rather than STOSW.

The assumption that an instruction has only one prefix, so that it is enough to note its location and look for it only there, seems wrong.
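To make the failing check concrete, here is a minimal model of it. This is only a sketch under stated assumptions: the `Insn` struct and this `isPrefixAtLocation` are simplified stand-ins that read the raw bytes directly, not the actual LLVM definitions, which keep their own bookkeeping that is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the decoder state (NOT the real LLVM struct).
struct Insn {
  const std::uint8_t *bytes;
  std::size_t startLocation;            // offset of the first prefix byte
  std::size_t necessaryPrefixLocation;  // offset recorded by readPrefixes()
};

// Models a check that inspects exactly one location and nothing else.
constexpr bool isPrefixAtLocation(const Insn &insn, std::uint8_t prefix,
                                  std::size_t location) {
  return insn.bytes[location] == prefix;
}

// REP STOSW as emitted by the compiler mentioned in the report.
constexpr std::uint8_t kRepStosw[] = {0x66, 0xF3, 0xAB};

// Per the report, necessaryPrefixLocation ends up at the REP byte (offset 1).
constexpr Insn kInsn{kRepStosw, 0, 1};

// The 0x66 override sits at offset 0, so a check at offset 1 misses it...
static_assert(!isPrefixAtLocation(kInsn, 0x66, kInsn.necessaryPrefixLocation),
              "operand size override is not seen at the REP location");
// ...because the only byte examined there is the REP prefix itself.
static_assert(isPrefixAtLocation(kInsn, 0xF3, kInsn.necessaryPrefixLocation),
              "the REP prefix is what occupies that location");
```

With the override missed, nothing selects the 16-bit operand size in 32-bit mode, which matches the STOSL result described above.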
The Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2, section 2.1.1 (Instruction Prefixes), states that an instruction may carry more than one prefix, each from a different prefix group, and that Groups 1 through 4 may be placed in any order relative to each other.

Possibly it would be better to treat necessaryPrefixLocation as only the highest possible address where the prefix byte may be located, and to have isPrefixAtLocation() search for the prefix byte in every byte between insn->startLocation and the location given as argument, not just at that location?
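The range search suggested above could look roughly like the following. This is a hedged sketch, not a patch: `hasPrefixInRange` is a hypothetical helper name, and it scans raw instruction bytes, whereas a real fix would have to fit the decoder's existing prefix bookkeeping.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM function): returns true if `prefix`
// occurs at any offset in [start, last], where `start` plays the role of
// insn->startLocation and `last` the role of necessaryPrefixLocation,
// treated as the highest offset at which a prefix byte may sit.
inline bool hasPrefixInRange(const std::uint8_t *bytes, std::size_t start,
                             std::size_t last, std::uint8_t prefix) {
  for (std::size_t i = start; i <= last; ++i) {
    if (bytes[i] == prefix)
      return true;
  }
  return false;
}
```

For the bytes 0x66 0xF3 0xAB with start 0 and last 1, the scan covers both prefix bytes, so 0x66 is found regardless of whether it comes before or after 0xF3.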