Hello,

There has been no solution to https://gcc.gnu.org/PR53929 for almost a dozen years, mostly because of the need for compatibility with MASM. I was told that the ambiguity of Intel syntax should be regarded as a limitation of the syntax itself, and as a reason to recommend against using it.

Nevertheless, I am proposing a permanent solution to this issue: banning the constructions that cause ambiguity. This is likely to cause incompatibility with other assemblers, but it should let GAS parse the output of GCC correctly.


PR53929 contains a known ambiguous construction

   lea  rax, bx[rip]

where `bx` is meant as a symbol but could also denote the BX register, which causes confusion. The Intel Software Developer's Manual also contains an ambiguous construction

   MOV EBX, RAM_START

which looks as if it loads the offset of `RAM_START`. My proposal is to treat these two constructions as ambiguous and reject them. The compiler should generate assembly only in the unambiguous subset, and we can then start implementing rejection of the ambiguous forms in the assembler.

They are formalized as

   lea rax, BYTE PTR bx[rip]
   mov EBX, DWORD PTR RAM_START

Roughly speaking, anything after `PTR`/`BCST` (and before `[` if any) is considered a symbol even if it matches a keyword; any identifier between `[` and `]` is a register and not a symbol.
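
To illustrate the rule above, here is a rough sketch in Python of how an operand's identifiers might be classified. It is not taken from GAS or from the wiki page; the function name `classify`, the tiny register and size-keyword lists, and the "unspecified" fallback for bare non-register identifiers are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately incomplete tables, for illustration only.
REGISTERS = {"rax", "rbx", "rip", "eax", "ebx", "ax", "bx"}
SIZES = {"BYTE", "WORD", "DWORD", "QWORD"}

# Identifiers (not preceded by a digit or word character) and brackets.
TOKEN = re.compile(r"(?<![\w.$])[A-Za-z_.$][\w.$]*|\[|\]")

def classify(operand):
    """Return (identifier, role) pairs for a single operand string."""
    roles = []
    in_brackets = False   # currently between `[` and `]`
    after_ptr = False     # seen `PTR`/`BCST`, not yet reached `[`
    for tok in TOKEN.findall(operand):
        if tok == "[":
            in_brackets, after_ptr = True, False
        elif tok == "]":
            in_brackets = False
        elif in_brackets:
            # Anything between `[` and `]` is a register, never a symbol.
            roles.append((tok, "register"))
        elif tok.upper() in ("PTR", "BCST"):
            after_ptr = True
        elif after_ptr:
            # After `PTR`/`BCST`, a name is a symbol even if it looks
            # like a register or another keyword.
            roles.append((tok, "symbol"))
        elif tok.upper() in SIZES:
            continue  # size keyword preceding `PTR`
        elif tok.lower() in REGISTERS:
            roles.append((tok, "register"))
        else:
            # Not covered by the rule quoted above; left open here.
            roles.append((tok, "unspecified"))
    return roles

print(classify("BYTE PTR bx[rip]"))
print(classify("DWORD PTR RAM_START"))
```

With these assumptions, `bx` in `BYTE PTR bx[rip]` is classified as a symbol and `rip` as a register, while a bare `bx` operand is still a register.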


My complete proposal can be found at <https://github.com/lhmouse/mcfgthread/wiki/Formalized-Intel-Syntax-for-x86>. Some ideas actually reflect the AT&T syntax. I hope it helps.


--
Best regards,
LIU Hao
