https://llvm.org/bugs/show_bug.cgi?id=25899
Bug ID: 25899
Summary: Loads and Stores are not always coalesced
Product: libraries
Version: 3.7
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedb...@nondot.org
Reporter: haneef...@gmail.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

Clang (LLVM?) sometimes generates inefficient code for loads and stores, yet at other times recognizes that *the same code* can be optimized into fewer loads/stores. For example, take this simple code:

```
#include <stdint.h>

int l32 (const uint8_t *b)
{
    int r = 0;
    r ^= b[0];
    r ^= b[1] << 8;
    r ^= b[2] << 16;
    r ^= b[3] << 24;
    return r;
}

int f (int a)
{
    return l32 ((void *) &a);
}
```

`clang -O2` generates (clang 3.7, Intel syntax, extraneous contents removed):

```
l32:
        movzx   eax, byte ptr [rdi]
        movzx   ecx, byte ptr [rdi + 1]
        shl     ecx, 8
        or      ecx, eax
        movzx   edx, byte ptr [rdi + 2]
        shl     edx, 16
        or      edx, ecx
        movzx   eax, byte ptr [rdi + 3]
        shl     eax, 24
        or      eax, edx
        ret

f:
        mov     eax, edi
        ret
```

Since it was able to optimize f() into a simple register move, it must have recognized that the loads could be coalesced into a single load (or that a little-endian load was being compiled for an architecture that happened to be little endian). Hence, it is quite odd that it did not perform the same optimization on l32() itself and reduce it to something more like:

```
l32:
        mov     eax, dword ptr [rdi]
        ret
```