| Field | Value |
|---|---|
| Issue | 144145 |
| Summary | [LoopVectorizer] If Conversion example doesn't work with floating-point types |
| Labels | new issue |
| Assignees | |
| Reporter | graham-yiu |

Here is the example from [Vectorizers/if-conversion](https://llvm.org/docs/Vectorizers.html#if-conversion), converted to floating-point types:
```
float foo(float *A, float *B, int n) {
  float sum = 0;
  for (int i = 0; i < n; ++i)
    if (A[i] > B[i])
      sum += A[i] + 5.0;
  return sum;
}
```
When compiled with `clang -fvectorize -fno-slp-vectorize -O3 -ffast-math -S -c if_conv.c`, the loop is not vectorized, even though `-ffast-math` is enabled. From what I can see, the loop body in the IR is slightly different from the integer version, so the loop vectorizer does not recognize it as a candidate:
```
for.body:                                         ; preds = %for.body.preheader, %for.body
  %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
  %sum.014 = phi float [ 0.000000e+00, %for.body.preheader ], [ %sum.1, %for.body ]
  %arrayidx = getelementptr inbounds float, ptr %A, i64 %indvars.iv
  %0 = load float, ptr %arrayidx, align 4, !tbaa !5
  %arrayidx2 = getelementptr inbounds float, ptr %B, i64 %indvars.iv
  %1 = load float, ptr %arrayidx2, align 4, !tbaa !5
  %cmp3 = fcmp fast ogt float %0, %1
  %add = fadd fast float %sum.014, 5.000000e+00
  %add6 = fadd fast float %add, %0
  %sum.1 = select nsz i1 %cmp3, float %add6, float %sum.014
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
  br i1 %exitcond.not, label %for.cond.cleanup.loopexit, label %for.body, !llvm.loop !9
}
```
Trace:
```
LV: Checking a loop in 'foo' from if_conv.c
LV: Loop hints: force=? width=0 interleave=0
LV: Found a loop: for.body
LV: Found an induction variable.
LV: Not vectorizing: Found an unidentified PHI %sum.014 = phi float [ 0.000000e+00, %for.body.preheader ], [ %sum.1, %for.body ]
LV: Interleaving disabled by the pass manager
LV: Can't vectorize the instructions or CFG
LV: Not vectorizing: Cannot prove legality.
```
Compare this with the loop body generated for integer types:
```
for.body:                                         ; preds = %for.body.preheader, %for.body
  %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
  %sum.014 = phi i32 [ 0, %for.body.preheader ], [ %sum.1, %for.body ]
  %arrayidx = getelementptr inbounds i32, ptr %A, i64 %indvars.iv
  %0 = load i32, ptr %arrayidx, align 4, !tbaa !11
  %arrayidx2 = getelementptr inbounds i32, ptr %B, i64 %indvars.iv
  %1 = load i32, ptr %arrayidx2, align 4, !tbaa !11
  %cmp3 = icmp sgt i32 %0, %1
  %add = add nsw i32 %0, 5
  %add6 = select i1 %cmp3, i32 %add, i32 0
  %sum.1 = add nsw i32 %add6, %sum.014
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %exitcond.not = icmp eq i64 %indvars.iv.next, %wide.trip.count
  br i1 %exitcond.not, label %for.cond.cleanup.loopexit, label %for.body, !llvm.loop !13
}
```
Trace:
```
LV: Checking a loop in 'foo_int' from if_conv.c
LV: Loop hints: force=? width=0 interleave=0
LV: Found a loop: for.body
LV: Found an induction variable.
LV: We can vectorize this loop!
```
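
To make the difference between the two IR listings easier to see, it can be restated in C. This is only a sketch, not compiler output. `sum_select_accumulator` mirrors the floating-point IR, where the `select` chooses between the updated accumulator and the previous one. `sum_select_addend` mirrors the integer IR, where the `select` chooses the addend (or 0) and the add into `sum` is unconditional. Both function names are hypothetical, and whether clang lowers either function to exactly the IR shown in this report has not been checked.

```c
#include <stddef.h>

/* Shape of the floating-point IR in this report (an illustration only):
 * the add is computed first, then a select picks either the updated
 * accumulator or the old one. */
float sum_select_accumulator(const float *A, const float *B, size_t n) {
  float sum = 0.0f;
  for (size_t i = 0; i < n; ++i) {
    float updated = (sum + 5.0f) + A[i];   /* %add, %add6 */
    sum = (A[i] > B[i]) ? updated : sum;   /* %sum.1 = select ... */
  }
  return sum;
}

/* Shape of the integer IR in this report, written with floats:
 * a select picks the addend (or zero), and the add into the
 * accumulator happens on every iteration. */
float sum_select_addend(const float *A, const float *B, size_t n) {
  float sum = 0.0f;
  for (size_t i = 0; i < n; ++i) {
    float addend = (A[i] > B[i]) ? (A[i] + 5.0f) : 0.0f; /* %add6 = select ... */
    sum += addend;                                        /* %sum.1 = add ... */
  }
  return sum;
}
```

For inputs such as `A = {1, 5, 3, 8}` and `B = {2, 4, 3, 7}`, both functions return 23. In general floating-point arithmetic the two forms can differ (different association changes rounding, and adding 0.0 to -0.0 changes the sign of zero), so they are equivalent only under fast-math style assumptions.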