https://llvm.org/bugs/show_bug.cgi?id=30654
Bug ID: 30654
Summary: [LoopVectorizer/SCEV] induction with truncation prevents vectorization. Need runtime overflow test.
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Loop Optimizer
Assignee: unassignedb...@nondot.org
Reporter: dorit.nuz...@intel.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

Saw this missed optimization in a Geekbench workload:

We have a signed int index ‘w_ix’ which is incremented by an unsigned long ‘step’. When compiling with -m32 all is well (unsigned long and int have the same width there). However, when compiling with -m64, the result of the 64-bit unsigned long addition may not fit back into the 32-bit signed int index, so the index may overflow (wrap):

    for.body:
      %w_ix.014 = phi i64 [ %add3, %for.body ], [ 0, %for.body.preheader ]
      %sext = shl i64 %w_ix.014, 32
      %idxprom = ashr exact i64 %sext, 32
      %add3 = add i64 %idxprom, %step

As a result the loop vectorizer fails with:

    “LV: PHI is not a poly recurrence… Found an unidentified PHI”.
    "loop not vectorized: value that could not be identified as reduction is used outside the loop."

In order to guarantee that the induction behaves nicely, we need to identify this pattern (addition with 64-to-32-bit truncation) and generate a runtime signed int overflow check (e.g. check that step*loopTripCount is small enough).

This is a reduced testcase:

    #include <stdlib.h>

    float in[1000];
    float out[1000];

    void test(size_t out_start, size_t size, size_t step)
    {
      int w_ix = 0;
      for (size_t out_offset = 0; out_offset < size; ++out_offset)
      {
        size_t out_ix = out_start + out_offset;
        float w = in[w_ix];
        out[out_ix] += w;
        w_ix += step;
      }
    }

(I compiled it with -m64 -Ofast -static -march=core-avx2.)
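To make the suggested runtime check concrete, here is a source-level sketch of what a versioned loop could look like. It is only an illustration of the idea, not what the vectorizer emits or how LLVM implements loop versioning. The names `index_stays_in_int_range` and `test_versioned` are hypothetical and do not appear in the original testcase. The check is deliberately conservative: it requires step*size to fit in a signed int, so a large unsigned step that would act as a negative step after truncation simply takes the slow path.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

float in[1000];
float out[1000];

/* Hypothetical helper: returns true when step * trip_count <= INT_MAX,
 * i.e. the truncated index can never wrap during the loop.
 * Division is used so the check itself cannot overflow. */
static bool index_stays_in_int_range(size_t step, size_t trip_count)
{
  if (trip_count == 0)
    return true; /* the loop body never runs */
  return step <= (size_t)INT_MAX / trip_count;
}

/* Hypothetical hand-versioned variant of test() from the report. */
void test_versioned(size_t out_start, size_t size, size_t step)
{
  if (index_stays_in_int_range(step, size)) {
    /* Fast path: w_ix equals out_offset * step with no wrapping,
     * so the index is a plain affine induction. */
    for (size_t out_offset = 0; out_offset < size; ++out_offset)
      out[out_start + out_offset] += in[out_offset * step];
  } else {
    /* Slow path: the original loop, truncation included. */
    int w_ix = 0;
    for (size_t out_offset = 0; out_offset < size; ++out_offset) {
      out[out_start + out_offset] += in[w_ix];
      w_ix += step;
    }
  }
}
```

With the check hoisted out of the loop, the fast path contains no truncation, which is the property the vectorizer needs in order to recognize the induction.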