On 18.01.2024 17:40, LIU Hao wrote:
> On 2024-01-18 20:54, Jan Beulich wrote:
>> I'm sorry, but most of your proposal may even be considered for being
>> acceptable only if you would gain buy-off from the MASM guys. Anything
>> MASM treats as valid ought to be permitted by gas as well (within the
>> scope of certain divergence that cannot be changed in gas without
>> risking to break people's code). It could probably be considered to
>> introduce a "strict" mode of Intel syntax, following some / most of
>> what you propose; making this the default cannot be an option.
>
> Thanks for your reply.
>
> I have attached the Markdown source for that page, modified a few hours
> ago. I am planning to make some updates according to your advice tomorrow.

Just to mention it: Attaching is in no way better than providing a link,
commenting-wise.

> And yes, I am proposing a 'strict' mode, however not for humans, only for
> compilers.
>
> My first message references a GCC bug report, where the problematic symbol
> `bx` comes from C source. I have been aware of the `/APP` and `/NO_APP`
> markers in generated assembly, so I suspect that GAS should be able to
> tell which parts are generated from a compiler and which parts are
> composed by hand. The proposed strict mode may apply only to the output
> from GCC, which are much more likely to contain bad symbols, but are also
> more controllable on the GCC side.
>
> I believe that skillful people who write x86 assembly have known that
> `offset`, `shr`, `si` etc. are 'bad' names for symbols. Therefore, it's
> like an issue there.
>
>
>> Commenting on individual aspects of your proposal is a little difficult,
>> as you didn't provide the proposal inline (and hence it cannot be easily
>> used as context in a reply). But to mention the imo worst aspect:
>> Declaring
>>
>>     mov eax, [rcx]
>>
>> as invalid is a no-go.
>
> I agree. I am considering to declare the lack of a symbol as a special case.

Well, I took this as the simplest example. But clearly there should never
be a need for an assembly programmer to needlessly write "dword ptr" or
alike, when operand size is unambiguous. Limiting "strict mode" to
compiler output would take away concerns in this regard (as machine
generated assembly has no issue with uniformly adding such redundant
specifiers, much like in AT&T mode suffixes would typically be emitted
even when not needed).

But I see a severe issue with your aim at confining strict mode to
compiler generated code only: In inline assembly (see your mentioning of
APP / NO_APP above) you still potentially reference C symbols. So the
ambiguities don't disappear in APP / NO_APP regions.

>> I also don't see how this would be related to the
>> issue at hand.
>> What's in the square brackets may as well be a symbol
>> name, so requiring the "mode specifier" doesn't disambiguate things at
>> all.
>
> If someone declares a variable called `rcx` in C, it has to be translated to
>
>     mov eax, DWORD PTR rcx     # `movl rcx, %eax`
>
> instead of
>
>     mov eax, DWORD PTR [rcx]   # `movl (%rcx), %eax`

And an array happening to be indexed by rcx would then result in

    mov eax, DWORD PTR rcx[rcx]   # `movl rcx(%rcx), %eax`

? That's going to be confusing at best.

I think this whole issue needs taking care of differently, and iirc I did
already suggest an alternative in one of the bugzilla entries involved:
Potentially ambiguous names (which to a compiler may mean: all symbol
names) ought to simply be quoted, and it ought to be specified that
quoted symbols are never registers. Iirc this will require gas changes,
yes, but it'll address all ambiguities afaict.

Jan