Hello all,

There is a long-standing but undocumented GCC inline assembly feature that is part of the extended asm extension to C and C++: empty input constraints.
As far as I understand the semantics, this feature effectively creates a "fake" load and a "fake" ordering dependency. The distinguishing property of the empty constraint is that the compiler is free to optimize the operand's placement (this seems to work fine with gcc -O3), so no unnecessary additional code is generated. This makes the feature useful for testing, benchmarking or fuzzing C/C++ code. Note that it would almost always be used without any actual assembly instructions in the asm statement.

Luckily, the feature has been mentioned on the GCC mailing lists before. Here is a quote from David Brown, taken from
https://gcc.gnu.org/pipermail/gcc-help/2015-June/124410.html

> But the extra "asm volatile" here with a fake input tells the compiler
> that "val" is an input to the (empty) assembly, and must therefore be
> calculated before the statement is executed. The empty input constraint
> (no "r" or "m") gives the compiler complete freedom about where it wants
> to put this fake input - all we are saying is that the value "val" must
> be calculated before executing
>     asm volatile("" :: "" (val))
>
> Generating assembly from this (using gcc-4.5.1, which is the latest
> avr-gcc I have installed at the moment) shows the division being done
> before the cli() - the code is optimal and correct, with no unnecessary
> memory operations (as you would need by making "val" volatile).

There is also a Stack Overflow question where somebody is confused by the lack of documentation:
https://stackoverflow.com/questions/63305223/gcc-asm-with-empty-input-operand-constraint

This is why I am asking for this feature to be documented. The most appropriate place seems to be:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#InputOperands

I would have tried to contribute a doc fix myself, but I couldn't find the sources for the online docs in the GCC Git repository.
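
To make the idiom from the quoted message concrete, here is a minimal sketch of how it could be wrapped in a helper. The names keep_value and sum_squares are hypothetical and exist only for illustration. The fallback to the documented "g" constraint for non-GCC compilers is an assumption on my part, since the empty constraint is only known to me as GCC behavior.

```cpp
// Hypothetical helper, not part of GCC or any library: tells the compiler
// that `val` is an input to an empty asm statement, so the value has to be
// computed before this point.
inline void keep_value(int val) {
#if defined(__GNUC__) && !defined(__clang__)
    // Empty input constraint: GCC may keep the operand wherever it likes.
    asm volatile ("" :: "" (val));
#else
    // Assumption: other compilers may reject the empty constraint, so use
    // the documented "g" (register, memory or immediate) constraint instead.
    asm volatile ("" :: "g" (val));
#endif
}

// Hypothetical usage: each square is passed through keep_value before it is
// added to the running total. The arithmetic result is unaffected.
long sum_squares(int n) {
    long total = 0;
    for (int i = 0; i < n; i++) {
        int sq = i * i;
        keep_value(sq);
        total += sq;
    }
    return total;
}
```

The helper takes an int by value only to keep the sketch small. The asm statement contains no instructions, so the only intended effect is on what the optimizer is allowed to remove or reorder.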
Lastly, in case it's helpful, here's a short C++ program which demonstrates the behavior in more concrete terms:

> #include <vector>
>
> int
> main() {
>     // Greater than or equal to zero.
>     constexpr int asmV = ASM_V;
>
>     std::vector<char> v{7, 6, 9, 3, 2, 0};
>     for (int i{0}; i < (1 << 28); i++) {
>         for (int j{0}; j < 6; j++) {
>             v[j]++;
>
>             if constexpr (1 <= asmV) {
>                 asm volatile ("" :: ""(v.size()));
>                 for (auto x: v) {
>                     asm volatile ("" :: ""(x));
>                 }
>             }
>             if constexpr (2 <= asmV) {
>                 asm volatile ("" :: ""(v.size()));
>                 for (auto x: v) {
>                     asm volatile ("" :: ""(x));
>                 }
>             }
>             if constexpr (3 <= asmV) {
>                 asm volatile ("" :: ""(v.size()));
>                 for (auto x: v) {
>                     asm volatile ("" :: ""(x));
>                 }
>             }
>         }
>     }
>
>     return 0;
> }

Compile it with, e.g.:

    g++ -std=c++20 -O3 -flto -march=native -D ASM_V=XXX -o XXX asm.cc

where XXX is 0, 1, 2 or 3. For ASM_V=0 the loop gets optimized out, while for greater values the loop stays. For ASM_V=1, ASM_V=2 and ASM_V=3 the generated code is exactly the same.

Thanks,
Neven Sajko