https://llvm.org/bugs/show_bug.cgi?id=28001
Bug ID: 28001
Summary: [x86, SSE] recognize min/max FP patterns
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Scalar Optimizations
Assignee: unassignedb...@nondot.org
Reporter: spatel+l...@rotateright.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

This is the FP version of http://reviews.llvm.org/D20774, but we need a different solution. The problem in this case is not the bitcasts. Either the x86 intrinsic is interfering with IR pattern recognition, or the backend needs to recognize this as a select of 'fcmp oge'.

  ; Declaration of the SSE compare intrinsic so the snippet parses on its own.
  declare <4 x float> @llvm.x86.sse.cmp.ps(<4 x float>, <4 x float>, i8)

  define <4 x i32> @gibsonfp(<4 x float> %a, <4 x float> %b) {
    %bc1 = bitcast <4 x float> %a to <4 x i32>
    %bc2 = bitcast <4 x float> %b to <4 x i32>
    ; Immediate 1 selects the less-than predicate: mask = (b < a) per lane.
    %cmpps = tail call <4 x float> @llvm.x86.sse.cmp.ps(<4 x float> %b, <4 x float> %a, i8 1)
    %cmpbc = bitcast <4 x float> %cmpps to <4 x i32>
    %neg = xor <4 x i32> %cmpbc, <i32 -1, i32 -1, i32 -1, i32 -1>
    ; Bitwise blend: (mask & a) | (~mask & b).
    %and1 = and <4 x i32> %cmpbc, %bc1
    %and2 = and <4 x i32> %neg, %bc2
    %or = or <4 x i32> %and1, %and2
    ret <4 x i32> %or
  }

So instead of:

  _gibsonfp:
    movaps   %xmm1, %xmm2
    cmpltps  %xmm0, %xmm2
    andps    %xmm2, %xmm0
    andnps   %xmm1, %xmm2
    orps     %xmm2, %xmm0
    retq

We should be able to produce:

    maxps    %xmm1, %xmm0
    retq

Note:
1. The FP min/max instructions have been around since SSE1, so this can apply to the vast majority of subtargets.
2. This does not require fast-math in the general case, but there may be cases that simplify further if we have some kind of fast-math decoration on the compare intrinsic.
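To illustrate why note 2 holds, here is a minimal scalar sketch in Python that models one 32-bit lane of the compare/and/andn/or sequence and one lane of `maxps %xmm1, %xmm0`. It is an emulation for checking the reasoning, not LLVM code. The helper names (`f32_bits`, `blend_lane`, `maxps_lane`) are made up for this example, and the `maxps` model assumes the documented behavior that the second source operand is returned when the comparison is false, which covers NaN inputs and zeros of opposite sign.

```python
import struct

def f32_bits(x: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of x as an int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def blend_lane(a: float, b: float) -> int:
    """One lane of the cmpltps/andps/andnps/orps sequence from the report."""
    # cmpltps with predicate 1: all-ones if b < a, else zero (false on NaN).
    mask = 0xFFFFFFFF if b < a else 0
    # (mask & a) | (~mask & b), performed on the raw bit patterns.
    return (mask & f32_bits(a)) | (~mask & 0xFFFFFFFF & f32_bits(b))

def maxps_lane(a: float, b: float) -> int:
    """One lane of `maxps %xmm1, %xmm0`, assuming xmm0 = a and xmm1 = b."""
    # Assumed model: the result is a only if a > b; otherwise it is b,
    # including when either input is NaN or both are zeros.
    return f32_bits(a) if a > b else f32_bits(b)

# Compare both models over ordinary values, signed zeros, infinities and NaN.
samples = [1.0, -1.0, 2.5, 0.0, -0.0, float("inf"), float("-inf"), float("nan")]
mismatches = [(a, b) for a in samples for b in samples
              if blend_lane(a, b) != maxps_lane(a, b)]
print("mismatching lane pairs:", len(mismatches))
```

Under these assumptions the two models agree bit for bit on every sampled pair, including the NaN and signed-zero cases, which is consistent with the claim that the fold to `maxps` does not need fast-math.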