https://llvm.org/bugs/show_bug.cgi?id=26859
Bug ID: 26859
Summary: [x86, SSE] phaddw / phaddd wrongly generated
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedb...@nondot.org
Reporter: spatel+l...@rotateright.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

Packed horizontal add - phaddd / phaddw:

These are SSSE3 (yes, 3 Ss) instructions that should probably never be generated for performance reasons, only to save on code size. They're just about guaranteed to be slow because they operate across vector lanes.

Here we're not only generating these things, but for a target that doesn't have SSSE3:

$ cat accum.c
int please_no_phaddd(int *x) {
  int sum = 0;
  for (int i=0; i<1024; ++i)
    sum += x[i];
  return sum;
}

short please_no_phaddw(short *x) {
  short sum = 0;
  for (int i=0; i<1024; ++i)
    sum += x[i];
  return sum;
}

bin $ ./clang -O2 -S -o - accum.c -msse -fno-unroll-loops|grep phadd
    .globl  _please_no_phaddd
_please_no_phaddd:                      ## @please_no_phaddd
    phaddd  %xmm1, %xmm1
    .globl  _please_no_phaddw
_please_no_phaddw:                      ## @please_no_phaddw
    phaddw  %xmm0, %xmm0
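
For readers unfamiliar with the instruction, here is a rough scalar model in plain C of what phaddd computes on 4 x 32-bit lanes, next to a shuffle-and-add reduction that needs only SSE2-level operations (pshufd + paddd). This is a sketch for illustration only: the type v4u32 and all function names are made up here, and the report does not say which sequence the backend should emit instead, so the shuffle-and-add variant is an assumption about a plausible replacement, not a description of LLVM's codegen.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for one XMM register holding 4 x 32-bit lanes.
   Unsigned lanes are used so that wraparound is well defined in C. */
typedef struct { uint32_t lane[4]; } v4u32;

/* Model of "phaddd src, dst" (SSSE3): adjacent pairs of dst are summed into
   the low two lanes, adjacent pairs of src into the high two lanes. */
v4u32 model_phaddd(v4u32 dst, v4u32 src) {
    v4u32 r;
    r.lane[0] = dst.lane[0] + dst.lane[1];
    r.lane[1] = dst.lane[2] + dst.lane[3];
    r.lane[2] = src.lane[0] + src.lane[1];
    r.lane[3] = src.lane[2] + src.lane[3];
    return r;
}

/* Model of "pshufd $imm, src, dst" (SSE2): result lane i is the source lane
   selected by bits [2i+1:2i] of the immediate. */
v4u32 model_pshufd(v4u32 src, unsigned imm) {
    v4u32 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = src.lane[(imm >> (2 * i)) & 3u];
    return r;
}

/* Model of "paddd src, dst" (SSE2): plain lane-wise add. */
v4u32 model_paddd(v4u32 a, v4u32 b) {
    v4u32 r;
    for (int i = 0; i < 4; ++i)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

/* Horizontal sum using two phaddd steps, the shape seen in the output above. */
uint32_t hsum_with_phaddd(v4u32 v) {
    v = model_phaddd(v, v);   /* [a+b, c+d, a+b, c+d] */
    v = model_phaddd(v, v);   /* lane 0 now holds a+b+c+d */
    return v.lane[0];
}

/* Horizontal sum using only shuffles and lane-wise adds (assumed alternative). */
uint32_t hsum_with_shuffles(v4u32 v) {
    v = model_paddd(v, model_pshufd(v, 0x4e)); /* swap 64-bit halves, add */
    v = model_paddd(v, model_pshufd(v, 0xb1)); /* swap adjacent lanes, add */
    return v.lane[0];
}
```

Both reductions leave the same total in lane 0, so nothing in the result requires phaddd; the difference is only in which instructions are needed and how they perform on a given CPU.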