https://bugs.llvm.org/show_bug.cgi?id=35282

            Bug ID: 35282
           Summary: Functional bug: LoopVectorizer must update the nuw/nsw
                    flags
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Loop Optimizer
          Assignee: unassignedb...@nondot.org
          Reporter: serguei.kat...@azul.com
                CC: llvm-bugs@lists.llvm.org

Consider the following simple loop:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128-ni:1"
target triple = "x86_64-unknown-linux-gnu"

define i64 @test(i32 %a) {
entry:
  %factor = mul nsw i32 %a, -5
  br label %loop

loop:
  %b = phi i32 [ -35, %entry ], [ %e, %loop ]
  %c = phi i32 [ 10, %entry ], [ %f, %loop ]
  %d = sub nuw nsw i32 %b, %a
  %e = add nsw i32 %factor, %d
  %f = add nuw nsw i32 %c, 1
  %cmp = icmp ugt i32 %c, 226
  br i1 %cmp, label %done, label %loop

done:
  %.lcssa = phi i32 [ %e, %loop ]
  %g = sext i32 %.lcssa to i64
  ret i64 %g
}

The Loop Vectorizer (opt --loop-vectorize test.ll -S) transforms it to:
...
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i32 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %vec.phi = phi <4 x i32> [ <i32 -35, i32 0, i32 0, i32 0>, %vector.ph ], [ %4, %vector.body ]
  %vec.phi1 = phi <4 x i32> [ zeroinitializer, %vector.ph ], [ %5, %vector.body ]
  %offset.idx = add i32 10, %index
  %broadcast.splatinsert = insertelement <4 x i32> undef, i32 %offset.idx, i32 0
  %broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer
  %induction = add <4 x i32> %broadcast.splat, <i32 0, i32 1, i32 2, i32 3>
  %induction2 = add <4 x i32> %broadcast.splat, <i32 4, i32 5, i32 6, i32 7>
  %0 = add i32 %offset.idx, 0
  %1 = add i32 %offset.idx, 4
  %2 = sub nuw nsw <4 x i32> %vec.phi, %broadcast.splat4
  %3 = sub nuw nsw <4 x i32> %vec.phi1, %broadcast.splat6
  %4 = add nsw <4 x i32> %broadcast.splat8, %2
  %5 = add nsw <4 x i32> %broadcast.splat10, %3
  %index.next = add i32 %index, 8
  %6 = icmp eq i32 %index.next, 216
  br i1 %6, label %middle.block, label %vector.body, !llvm.loop !0
...

The problematic instruction is
  %3 = sub nuw nsw <4 x i32> %vec.phi1, %broadcast.splat6
On the first iteration %vec.phi1 is zero, and "0 - value" with nuw is poison
unless value is zero, so the optimizer may assume the result is zero as well.
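
To make the wrap condition concrete, here is a minimal Python sketch of the
unsigned arithmetic. It is only a model: the helper names (to_u32, sub_nuw)
and the sample value a = 1 are assumptions chosen for illustration, not part
of LLVM. Poison is represented as None.

```python
MASK32 = 0xFFFFFFFF

def to_u32(x):
    """Reinterpret a Python int as an unsigned 32-bit value."""
    return x & MASK32

def sub_nuw(lhs, rhs):
    """Model `sub nuw i32`: the result, or None (poison) on unsigned wrap."""
    l, r = to_u32(lhs), to_u32(rhs)
    if l < r:
        return None  # unsigned wrap, so the nuw flag makes the result poison
    return l - r

a = 1  # hypothetical input value for %a

# Scalar loop, first iteration: %b starts at -35 (0xFFFFFFDD), no wrap.
scalar = sub_nuw(-35, a)

# Vectorized loop, first iteration: %vec.phi1 starts as zeroinitializer,
# so every lane computes 0 - a, which wraps for any nonzero a.
lanes = [sub_nuw(init, a) for init in (0, 0, 0, 0)]

print(hex(scalar))
print(lanes)
```

In this model the scalar subtraction is well defined for a = 1, while every
lane of the zero-initialized vector phi is poison, which is why keeping nuw
on the widened instruction is unsound.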

So after, for example, full loop unrolling, LLVM is allowed to eliminate this
instruction on the first iteration, but that changes the result of the
original loop.

So the vectorizer should either drop the nuw/nsw flags or be smarter about
when it keeps them.

Could someone working on LLVM vectorizer take a look at this?
