https://llvm.org/bugs/show_bug.cgi?id=30627
Bug ID: 30627
Summary: [AArch64] LLVM vectorizer produces erroneous output
Product: new-bugs
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedb...@nondot.org
Reporter: hxy9...@gmail.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

Created attachment 17413
  --> https://llvm.org/bugs/attachment.cgi?id=17413&action=edit
IR from reduced cpp code.

Since patch https://reviews.llvm.org/rL282418, trunk LLVM produces erroneous output in the vectorized part of the code when the flags "-O3 -cl-fast-relaxed-math" are used, with the command listed below:

    clang --target=aarch64-unknown-linux-gnu -O3 -cl-fast-relaxed-math reduced.O0.strip.ll -emit-llvm -S

The erroneous IR is as follows:

    ; <label>:82: ; preds = %82, %77
      %83 = phi i64 [ 0, %77 ], [ %188, %82 ]
      %84 = phi <2 x i64> [ <i64 0, i64 1>, %77 ], [ %189, %82 ]
      ...
      %108 = add nuw nsw <2 x i64> %84, <i64 1, i64 1>
      %109 = add <2 x i64> %84, <i64 3, i64 3>
      %110 = extractelement <2 x i64> %108, i32 0
      %111 = icmp slt i64 %110, %64
      %112 = extractelement <2 x i64> %109, i32 0
      %113 = icmp slt i64 %112, %64
      %114 = insertelement <2 x i1> undef, i1 %111, i32 0
      %115 = shufflevector <2 x i1> %114, <2 x i1> undef, <2 x i32> zeroinitializer
      %116 = insertelement <2 x i1> undef, i1 %113, i32 0
      %117 = shufflevector <2 x i1> %116, <2 x i1> undef, <2 x i32> zeroinitializer
      %118 = select <2 x i1> %115, <2 x i64> %108, <2 x i64> zeroinitializer
      %119 = select <2 x i1> %117, <2 x i64> %109, <2 x i64> zeroinitializer
      ...
Before the patch, the generated IR looked like the following:

      %43 = add nuw nsw <2 x i64> %vec.ind, <i64 1, i64 1>
      %44 = add <2 x i64> %vec.ind, <i64 3, i64 3>
      %45 = icmp slt <2 x i64> %43, %broadcast.splat158
      %46 = icmp slt <2 x i64> %44, %broadcast.splat158
      %47 = select <2 x i1> %45, <2 x i64> %43, <2 x i64> zeroinitializer
      %48 = select <2 x i1> %46, <2 x i64> %44, <2 x i64> zeroinitializer

The scalar values %111 and %113 are computed from only the 1st element of vectors %108 and %109. They are then inserted into both lanes of the vectors %115 and %117, which are used as the conditions of the select instructions. Before the patch, the select instructions used the vector comparisons %45 and %46, which compare every lane of %43 and %44 against the bound.

For example, suppose %108 is <7, 8> and %64 is 8. %111 is 1 (7 < 8), so %114 has the value <1, undef>. %115 propagates the 1st value to the 2nd lane, so %115 is <1, 1>, which is used in the select. Before the patch, %43 is <7, 8> and %broadcast.splat158 is <8, 8>, so %45 is <1, 0>, which is used in the select and produces a different result.

We believe the problem is in the scalarization part of the pass, as the commit message suggests:

    [LV] Scalarize instructions marked scalar after vectorization

    This patch ensures that we actually scalarize instructions marked
    scalar after vectorization. Previously, such instructions may have
    been vectorized instead.

    Differential Revision: https://reviews.llvm.org/D23889
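
To make the difference between the two IR shapes concrete, here is a rough lane-level model in Python. It is only a sketch, not LLVM code: the function names and the sample values (lanes <7, 8> with a bound of 8) are hypothetical and chosen so that the lanes straddle the bound.

```python
# Hypothetical lane-level model of the two IR shapes in this report.
# This is not LLVM code; names and sample values are illustrative only.

def vector_compare_select(vec, bound):
    """Pre-patch shape: icmp slt on every lane, then a per-lane select."""
    mask = [lane < bound for lane in vec]      # vector icmp slt
    return [lane if keep else 0 for lane, keep in zip(vec, mask)]

def scalarized_compare_select(vec, bound):
    """Post-patch shape: compare lane 0 only, splat the i1, then select."""
    cond = vec[0] < bound                      # extractelement + scalar icmp slt
    mask = [cond] * len(vec)                   # insertelement + shufflevector splat
    return [lane if keep else 0 for lane, keep in zip(vec, mask)]

if __name__ == "__main__":
    vec, bound = [7, 8], 8                     # assumed sample values
    print(vector_compare_select(vec, bound))       # [7, 0]
    print(scalarized_compare_select(vec, bound))   # [7, 8]
```

The two functions agree whenever every lane falls on the same side of the bound, so the miscompile would only be visible on iterations where the lanes straddle it.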