https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90447
Bug ID: 90447
Summary: Missed opportunities to use adc (worse when -1 is involved)
Product: gcc
Version: 9.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: cassio.neri at gmail dot com
Target Milestone: ---

The following are three attempts to get gcc to generate adc instructions from C++:

    #include <x86intrin.h>

    unsigned constexpr X = 0;

    unsigned f1(unsigned a, unsigned b) {
        b += a;
        auto c = b < a;
        b += X + c;
        return b;
    }

    unsigned f2(unsigned a, unsigned b) {
        b += a;
        b += X + (b < a);
        return b;
    }

    unsigned f3(unsigned a, unsigned b) {
        b += a;
        unsigned char c = b < a;
        _addcarry_u32(c, b, X, &b);
        return b;
    }

All three functions above (-O3 -std=c++17) generate:

    addl %edi, %esi
    movl %esi, %eax
    adcl $0, %eax
    ret

This is great, and I would expect that changing X would only affect the immediate value and nothing more. I was wrong. Changing X to 1 makes f1 and f3 change as I expected, but f2 becomes:

    f2(unsigned int, unsigned int):
        xorl %eax, %eax
        addl %edi, %esi
        setc %al
        addl $1, %eax
        addl %esi, %eax
        ret

I thought I could blame "b += X + (b < a);" for being undefined behaviour. However, I believe that, at least in C++17, this is not the case, given the addition of the sentence "The right operand is sequenced before the left operand." to [expr.ass]. As far as Standard C++ is concerned, I expect f1 to be equivalent to f2.

Things got worse when X == -1:

    f1(unsigned int, unsigned int):
        xorl %eax, %eax
        addl %edi, %esi
        setc %al
        leal -1(%rax,%rsi), %eax
        ret

    f2(unsigned int, unsigned int):
        xorl %eax, %eax
        addl %edi, %esi
        setnc %al
        subl %eax, %esi
        movl %esi, %eax
        ret

    f3(unsigned int, unsigned int):
        addl %esi, %edi
        movl $-1, %eax
        setc %dl
        addb $-1, %dl
        adcl %edi, %eax
        ret

No adc whatsoever in f1 and f2, and f3 only reaches its adcl after a detour through %dl.
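As a sanity check on the expectation that the three functions compute the same value for every X, here is a minimal sketch that compares them at compile time. It makes several assumptions that are not in the report: X becomes a template parameter so that 0, 1 and -1 can be tested side by side, `unsigned` is modelled as `std::uint32_t`, and the intrinsic is replaced by a hypothetical portable helper, `addcarry_u32_model`, so that everything can be evaluated in a constant expression. It only checks results; it says nothing about the code gcc generates.

```cpp
#include <cstdint>

using u32 = std::uint32_t;

// Hypothetical portable stand-in for _addcarry_u32 (NOT the intrinsic itself):
// stores the low 32 bits of a + b + carry_in and returns the carry out.
constexpr unsigned char addcarry_u32_model(unsigned char carry_in, u32 a, u32 b, u32* out) {
    const std::uint64_t sum = std::uint64_t{a} + b + (carry_in ? 1u : 0u);
    *out = static_cast<u32>(sum);
    return static_cast<unsigned char>(sum >> 32);
}

template <u32 X>
constexpr u32 f1(u32 a, u32 b) {
    b += a;
    auto c = b < a;  // carry out of a + b
    b += X + c;
    return b;
}

template <u32 X>
constexpr u32 f2(u32 a, u32 b) {
    b += a;
    b += X + (b < a);
    return b;
}

template <u32 X>
constexpr u32 f3_model(u32 a, u32 b) {
    b += a;
    const unsigned char c = b < a;
    addcarry_u32_model(c, b, X, &b);  // carry out of this step is ignored, as in f3
    return b;
}

// True when all three variants return the same value for (a, b).
template <u32 X>
constexpr bool agree(u32 a, u32 b) {
    return f1<X>(a, b) == f2<X>(a, b) && f1<X>(a, b) == f3_model<X>(a, b);
}

constexpr u32 kMinusOne = 0xFFFFFFFFu;  // X == -1 as a 32-bit unsigned value

// Compare the variants on a few boundary inputs for X in {0, 1, -1}.
constexpr bool agree_on_samples() {
    const u32 samples[] = {0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u, kMinusOne};
    for (u32 a : samples)
        for (u32 b : samples)
            if (!(agree<0>(a, b) && agree<1>(a, b) && agree<kMinusOne>(a, b)))
                return false;
    return true;
}

static_assert(agree_on_samples(), "f1, f2 and the f3 model disagree on some input");
```

If the `static_assert` holds, differences in the generated code between the variants cannot be explained by the functions returning different values on these inputs.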
I'm not an assembly guy, but if I understand f3 correctly, "setc %dl / addb $-1, %dl" simply stores CF in %dl and then adds 0xff to %dl, which forces CF back to the value it already had before setc was executed. Basically, this is a convoluted, register-wasting nop.

I thought the problem could be related to issue [1], but that one has already been resolved in trunk, where this issue still happens, and -fno-split-paths doesn't seem to change anything.

The example on godbolt is https://godbolt.org/z/3GUyLj but if you play with the site's settings (particularly, lib.f), be aware of their issue [2].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88797
[2] https://github.com/mattgodbolt/compiler-explorer/issues/1377
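The reading of "setc %dl / addb $-1, %dl" given above can be checked with a small model of the 8-bit arithmetic. This is only a sketch: the function name `carry_after_setc_addb` is made up for illustration, and it models the carry flag alone, not the other flags or the final contents of %dl.

```cpp
#include <cstdint>

// Models "setc %dl; addb $-1, %dl" and returns the carry flag afterwards.
constexpr bool carry_after_setc_addb(bool cf) {
    const std::uint8_t dl = cf ? 1 : 0;          // setc %dl: dl = CF (0 or 1)
    const unsigned wide = unsigned{dl} + 0xFFu;  // addb $-1, %dl, without 8-bit truncation
    return wide > 0xFFu;                         // carry out of bit 7 becomes the new CF
}

// The carry flag comes out exactly as it went in.
static_assert(carry_after_setc_addb(false) == false, "CF == 0 stays 0");
static_assert(carry_after_setc_addb(true) == true, "CF == 1 stays 1");
```

With CF clear, 0 + 0xff fits in 8 bits and no carry is produced; with CF set, 1 + 0xff overflows and the carry is regenerated. The pair therefore leaves CF unchanged before the adcl and only clobbers %dl.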