GCC has a long-standing but undocumented inline assembly feature that is
part of its extended asm extension to C and C++: empty input
constraints.

Although I don't use extended asm much and have never contributed to
GCC before, I tried to document the feature as far as I understand it.
I ran "make html" to check that the changed Texinfo is well formed.

FTR, empty input constraints have been mentioned on the GCC mailing
lists, e.g.:
https://gcc.gnu.org/pipermail/gcc-help/2015-June/124410.html

I release this contribution into the public domain.

Neven Sajko

gcc/ChangeLog:

        * doc/md.texi: Document extended asm empty input constraints.

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index e3686dbfe..deccfd38a 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1131,7 +1131,102 @@ the addressing register.
 @subsection Simple Constraints
 @cindex simple constraints

-The simplest kind of constraint is a string full of letters, each of
+An input constraint is allowed to be an empty string, in which case it is
+called an empty input constraint. (When an empty input constraint is used,
+the assembler template will most probably also be empty. I.e., the @code{asm}
+declaration need not contain actual assembly code.) An empty input
+constraint can be used to create an artificial dependency on a C or C++
+variable (the variable that appears in the expression associated with the
+constraint) without incurring unnecessary costs to performance.
+
+An example of where such behavior may be useful is for preventing compiler
+optimizations like dead store elimination or hoisting code outside a loop for
+certain pieces of C or C++ code. Specific applications may include direct
+interaction with hardware features, as well as testing, fuzzing, and
+benchmarking.
+
+Here's a simple C++20 program that is not useful in practice but demonstrates
+the relevant behavior; store it as a file called @file{asm.cc}:
+
+@verbatim
+#include <vector>
+
+int
+main() {
+    // Greater than or equal to zero.
+    constexpr int asmV = ASM_V;
+
+    // The exact content of v is irrelevant for
+    // this example.
+    std::vector<char> v{7, 6, 9, 3, 2, 0};
+
+    for (int i{0}; i < (1 << 28); i++) {
+        for (int j{0}; j < 6; j++) {
+            // The exact operation on the contents
+            // of v is not relevant for this
+            // example.
+            v[j]++;
+
+            if constexpr (1 <= asmV) {
+                asm volatile ("" :: ""(v.size()));
+                for (auto x: v) {
+                    asm volatile ("" :: ""(x));
+                }
+            }
+            if constexpr (2 <= asmV) {
+                asm volatile ("" :: ""(v.size()));
+                for (auto x: v) {
+                    asm volatile ("" :: ""(x));
+                }
+            }
+            if constexpr (3 <= asmV) {
+                asm volatile ("" :: ""(v.size()));
+                for (auto x: v) {
+                    asm volatile ("" :: ""(x));
+                }
+            }
+        }
+    }
+
+    return 0;
+}
+@end verbatim
+
+Compile with, e.g., the following command (with @code{XXX} equal to @code{0},
+@code{1}, @code{2}, or @code{3}):
+
+@verbatim
+g++ -std=c++20 -O3 -flto -march=native -D ASM_V=XXX -o XXX asm.cc
+@end verbatim
+
+Firstly, for @code{XXX} equal to @code{0}, all of the @code{asm} declarations
+are dead code, so formally the contents of @code{v} are not observable and
+the program consists almost entirely of code that a (valid) compiler may
+eliminate. While this usually aligns with what the programmer wants,
+sometimes we might want to, e.g., measure how long it takes for some piece
+of code to execute, even if we aren't interested in its results (or already
+know what its results must be). Such is the case in, e.g.,
+benchmarking.
+
+Secondly, for @code{XXX} equal to @code{1}, only the first part with
+@code{asm} declarations (the body of the first @code{if} statement) is
+effective. Because of it, the preceding code cannot be eliminated:
+the @code{asm} declarations depend on @code{v} and its contents as
+input operands. The same effect would exist with a nonempty input constraint
+in place of the empty input constraints, but probably with additional
+unnecessary code generation and diminished performance. The innermost loop
+should not cause any code to be generated, because the input constraint is
+empty.
+
+Thirdly, for @code{XXX} equal to @code{2} or @code{3}, assuming the required
+compiler optimizations are successful, the generated code should be the same
+as for @code{XXX} equal to @code{1}. This is again because the empty input
+constraint prevents unnecessary code generation (a nonempty input constraint
+would probably require that the compiler store values into either registers
+or memory, even though the assembler template is empty).
+
+The simplest kind of constraint, apart from the empty constraint,
+is a string full of letters, each of
 which describes one kind of operand that is permitted.  Here are
 the letters that are allowed:
