https://llvm.org/bugs/show_bug.cgi?id=30821
Bug ID: 30821 Summary: Miscompile due to Stack Slot Coloring and partial load/stores Product: libraries Version: trunk Hardware: PC OS: All Status: NEW Severity: normal Priority: P Component: Register Allocator Assignee: unassignedb...@nondot.org Reporter: greg_bedw...@sn.scee.net CC: llvm-bugs@lists.llvm.org Classification: Unclassified One of our random test generators found the following miscompile bug. Here's a source-level repro-case (reduced and massaged a bit from the initial randomly generated case to make it a bit less transient, with the added __asm__ statement): // ======================================================================= extern "C" int printf(const char *, ...); typedef double __m128d __attribute__((__vector_size__(16))); __attribute__((noinline)) static void init(unsigned char pred, void *data, unsigned size) { unsigned char *bytes = (unsigned char *)data; for (unsigned i = 0; i != size; ++i) { bytes[i] = pred + i; } } int main() { volatile unsigned char alpha = 1; unsigned char bravo = alpha; __m128d foxtrot; for (unsigned char charlie = 0; charlie < bravo; ++charlie) { for (unsigned char golf = 0; golf < 2; ++golf){ __asm__ volatile ("nop":::"ebx", "r8", "r9", "r12", "r13", "r14", "r15"); init(0xff, &foxtrot, sizeof(foxtrot)); } } volatile __m128d india = {23.0, 24.0}; printf("%lf %lf -> ", india[0], india[1]); india[1] = 0.0; printf("%lf %lf\n", india[0], india[1]); } // ======================================================================= $ ./build/bin/clang 1.cpp --version clang version 4.0.0 (trunk 285395) (llvm/trunk 285397) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /home/greg/public_git/./build/bin $ ./build/bin/clang 1.cpp -O0 -target x86_64-unknown-unknown -mavx -o 1.elf && ./1.elf 23.000000 24.000000 -> 23.000000 0.000000 $ ./build/bin/clang 1.cpp -O2 -target x86_64-unknown-unknown -mavx -o 1.elf && ./1.elf 23.000000 24.000000 -> 23.000000 0.000000 $ ./build/bin/clang 1.cpp -O1 -target x86_64-unknown-unknown -mavx -o 1.elf && ./1.elf 23.000000 24.000000 -> 23.000000 24.000000 ^^^^^^^^^---- INCORRECT RESULT Note the difference in the -O1 build result. Here's our initial local analysis of the bug (from Andrea Di Biagio). Any help appreciated. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem is caused by an incorrect simplification performed by the Stack Slot Coloring pass. The stack slot coloring pass attempts to reduce the number of stack slots introduced by the register allocator when inserting spill/reload code. After register allocation, we may end up with code that looks similar to the following pseudo-code: //// 1. SPILL_REGISTER(regA, SlotA); 2. ... 3. regB = RELOAD_REGISTER(SlotA); 4. SPILL_REGISTER(regB, SlotB); 5. ... 6. RegC = RELOAD_REGISTER(SlotB); 7. ... /// Here we have two stack slots respectively named 'SlotA' and 'SlotB'. SlotA is used to spill the value of 'regA'. Later on, the value of 'regA' is reloaded into register 'regB'. That register is also spilled in memory to a different stack location. Eventually, it is reloaded from stack into yet another register. Pass Stack Slot Coloring is able to see that 'SlotA' is live between instructions #1 and #3. Stack slot 'SlotB' is live between instructions #4 and #6. Since the liveness ranges for the two slots is disjoint, pass Stack Slot Coloring attempts to reuse the same stack slot for both spills/reloads. So we end up with code similar to this: //// 1. SPILL_REGISTER(regA, SlotA); 2. ... 3. regB = RELOAD_REGISTER(SlotA); 4. SPILL_REGISTER(regB, SlotA); 5. ... 6. RegC = RELOAD_REGISTER(SlotA); 7. ... /// So, we saved one stack location, and we exposed some redundancy in the code. Now, the following two instructions are redundant: 3. regB = RELOAD_REGISTER(SlotA); 4. SPILL_REGISTER(regB, SlotA); Therefore, the Stack Slot Coloring pass removes the redundant spill/reload sequence (3. and 4.). The problem is that the Stack Slot Coloring pass uses two hook implemented by the TTI (Target Instruction Info) interface to figure out if an instruction is a "load from stack slot" or a "store to stack slot". On x86, opcodes MOVSS/MOVSD/VMOVSS/VMOVSD are all valid opcodes for loads/stores from/to stack slots. This is problematic because MOVSS/MOVSD/VMOVSS/VMOVSD are "partial loads/stores" since those operats on 32/64 bits only (instead of 128/256). In particular, MOVSSrm/MOVSDrm are zero-extending loads. Let's then have a look at the following example: //// 1. VMOVUPDmr regA, SlotA; 2. ... 3. regB = VMOVSDrm SlotA; 4. VMOVUPDmr regB, SlotA; 5. ... 6. RegC = VMOVUPDrm SlotA; 7. ... /// The problem here is that the following sequence of instructions cannot be removed because the load is a partial (zero-extending) load! 3. regB = VMOVSDrm SlotA; 4. VMOVUPDmr regB, SlotA; Unfortunately, the stack slot coloring doesn't have the notion of "partial load/store", so it believes that the above sequence can be safely removed. Therefore the bug. Strictly speaking, this is not a regression with respect to the previous release. The code in the stack slot coloring has always been problematic. So, the bug has always been latent; it has been accidentally exposed due to other unrelated changes to regalloc and other parts of the code generator. Unfortunately there is not an easy workaround for this issue. One way to fix this issue is to propagate the information that MOVSS/MOVSD instructions as "partial loads/partial store" instructions. We could extend the TII interface by adding a "isPartialLoad/isPartialStore" hook; that hook would be used by the Stack Slot Coloring to filter out load-store pairs that are unsafe to remove. Ideally, it would be nice to be less conservative and also be able to propagate the size information for the partial load/store, so that we don't fail to simplify the following sequence: 3. regB = VMOVSDrm SlotA; 4. VMOVSDmr regB, SlotA; ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- You are receiving this mail because: You are on the CC list for the bug.
_______________________________________________ llvm-bugs mailing list llvm-bugs@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs