https://llvm.org/bugs/show_bug.cgi?id=31032
Bug ID: 31032
Summary: [X86] Extra scalar move in scalar intrinsic sequences
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedb...@nondot.org
Reporter: craig.top...@gmail.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

This IR sequence produces an extra move or blend of zero into the upper bits just before the minss instruction.

    ; Function Attrs: nounwind
    define i16 @test1(float %f) #0 {
      %tmp = insertelement <4 x float> undef, float %f, i32 0
      %tmp10 = insertelement <4 x float> %tmp, float 0.000000e+00, i32 1
      %tmp11 = insertelement <4 x float> %tmp10, float 0.000000e+00, i32 2
      %tmp12 = insertelement <4 x float> %tmp11, float 0.000000e+00, i32 3
      %1 = extractelement <4 x float> %tmp12, i32 0
      %2 = fsub float %1, 1.000000e+00
      %3 = insertelement <4 x float> %tmp12, float %2, i32 0
      %4 = extractelement <4 x float> %3, i32 0
      %5 = fmul float %4, 5.000000e-01
      %6 = insertelement <4 x float> %3, float %5, i32 0
      %tmp48 = tail call <4 x float> @llvm.x86.sse.min.ss(<4 x float> %6, <4 x float> <float 6.553500e+04, float 0.000000e+00, float 0.000000e+00, float 0.000000e+00>)
      %tmp59 = tail call <4 x float> @llvm.x86.sse.max.ss(<4 x float> %tmp48, <4 x float> zeroinitializer)
      %tmp.upgrd.1 = tail call i32 @llvm.x86.sse.cvttss2si(<4 x float> %tmp59)
      %tmp69 = trunc i32 %tmp.upgrd.1 to i16
      ret i16 %tmp69
    }

Output for the AVX target:

    vxorps %xmm1, %xmm1, %xmm1
    vaddss LCPI0_0(%rip), %xmm0, %xmm0
    vmulss LCPI0_1(%rip), %xmm0, %xmm0
    vblendps $1, %xmm0, %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[1,2,3]
    vminss LCPI0_2(%rip), %xmm0, %xmm0
    vmaxss %xmm1, %xmm0, %xmm0
    vcvttss2si %xmm0, %eax
    ## kill: %AX<def> %AX<kill> %EAX<kill>
    retq

This occurs because the floats constant fold forward to the final insertelement at %6 creating a vzmovl SDNode that is lowered as a blendps or movss depending on which features are enabled.
Ideally we'd realize that the min, max, and vcvttss2si don't need the upper bits and remove the move. I suspect running this code through InstCombine first might figure that out.