https://llvm.org/bugs/show_bug.cgi?id=31283
            Bug ID: 31283
           Summary: Terrible ARM shuffle lowering for extend from <2 x i8> to <8 x i8>
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: ARM
          Assignee: unassignedb...@nondot.org
          Reporter: efrie...@codeaurora.org
                CC: llvm-bugs@lists.llvm.org
    Classification: Unclassified

Testcase:

define <8 x i8> @f(<2 x i8> *%p) {
  %t = load <2 x i8>, <2 x i8> *%p
  %r = shufflevector <2 x i8> %t, <2 x i8> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
  ret <8 x i8> %r
}

With llc -mtriple=armv7--linux-gnueabihf:

        vld1.16 {d16[0]}, [r0:16]
        vldr    d18, .LCPI0_0
        vmovl.u8        q8, d16
        vmovl.u16       q8, d16
        vtbl.8  d0, {d16}, d18
        bx      lr

The first instruction is great... vld1.16 produces exactly the result we want.
The problem is the following four instructions, which add up to an identity
shuffle.

I think what's happening is that we treat vld1.16+vmovl.u8+vmovl.u16 as a
single, legal DAG node ("load<LD2[%p], anyext from v2i8>"), so shuffle
combining never tries to eliminate the extra shuffles.
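To make the "identity shuffle" claim concrete, here is a rough byte-level sketch in Python of the five NEON instructions above. It is an illustration under assumptions, not the backend's own model: registers are plain lists of bytes, little-endian lane order is assumed, the helper names are made up for this example, and the stale contents of d16 before the load are stood in for by an arbitrary garbage value.

```python
def vld1_16_lane0(mem, d):
    """vld1.16 {d[0]}, [addr]: load 2 bytes into 16-bit lane 0, keep other lanes."""
    return list(mem[:2]) + list(d[2:])

def vmovl_u8(d):
    """vmovl.u8 q, d: zero-extend each of 8 bytes to 16 bits (16 bytes out)."""
    out = []
    for b in d:
        out += [b, 0]
    return out

def vmovl_u16(d):
    """vmovl.u16 q, d: zero-extend each of 4 halfwords to 32 bits (16 bytes out)."""
    out = []
    for i in range(0, 8, 2):
        out += [d[i], d[i + 1], 0, 0]
    return out

def vtbl_8(table, idx):
    """vtbl.8 d, {table}, idx: per-byte lookup; out-of-range indices yield 0."""
    return [table[i] if i < len(table) else 0 for i in idx]

def lowered_sequence(mem, garbage=0xAA):
    """Run the generated code on two input bytes.

    Returns (d16 right after vld1.16, final d0). The garbage value is an
    assumption that models whatever d16 held before the load.
    """
    d16 = vld1_16_lane0(mem, [garbage] * 8)
    after_load = list(d16)
    # Constant pool entry from the DAG dump: <0, 4, -1, -1, -1, -1, -1, -1>
    d18 = [0, 4] + [0xFF] * 6
    d16 = vmovl_u8(d16)[:8]    # the next instruction reads only the low half (d16) of q8
    d16 = vmovl_u16(d16)[:8]   # likewise, vtbl reads only d16
    d0 = vtbl_8(d16, d18)
    return after_load, d0

after_load, d0 = lowered_sequence([0x12, 0x34])
# Lanes 0 and 1 are the only lanes the shufflevector defines; they match.
print(after_load[:2], d0[:2])
```

In this model the only difference between the vld1.16 result and the final d0 is in lanes 2 through 7, which the shufflevector leaves undefined anyway.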
Excerpts from SelectionDAG debug output:

Optimized type-legalized selection DAG: BB#0 'f:'
SelectionDAG has 14 nodes:
  t0: ch = EntryToken
  t18: i32 = extract_vector_elt t16, Constant:i32<0>
  t21: i32 = extract_vector_elt t16, Constant:i32<1>
  t24: v8i8 = BUILD_VECTOR t18, t21, undef:i32, undef:i32, undef:i32, undef:i32, undef:i32, undef:i32
  t9: f64 = bitcast t24
  t11: ch,glue = CopyToReg t0, Register:f64 %D0, t9
  t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
  t16: v2i32,ch = load<LD2[%p], anyext from v2i8> t0, t2, undef:i32
  t12: ch = ARMISD::RET_FLAG t11, Register:f64 %D0, t11:1

Legalized selection DAG: BB#0 'f:'
SelectionDAG has 15 nodes:
  t0: ch = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
  t16: v2i32,ch = load<LD2[%p], anyext from v2i8> t0, t2, undef:i32
  t25: v8i8 = bitcast t16
  t38: i32 = ARMISD::Wrapper TargetConstantPool:i32<<8 x i8> <i8 0, i8 4, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>> 0
  t35: f64,ch = load<LD8[ConstantPool]> t0, t38, undef:i32
  t36: v8i8 = bitcast t35
  t32: v8i8 = ARMISD::VTBL1 t25, t36
  t9: f64 = bitcast t32
  t11: ch,glue = CopyToReg t0, Register:f64 %D0, t9
  t12: ch = ARMISD::RET_FLAG t11, Register:f64 %D0, t11:1
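The VTBL1 mask <0, 4, -1, ...> in the legalized DAG looks like what you would get from reading the low byte of each i32 lane after the v2i32 -> v8i8 bitcast. The Python sketch below shows that arithmetic. It assumes a little-endian layout, models the unspecified anyext upper bits as zero, and uses hypothetical helper names that do not correspond to anything in LLVM.

```python
import struct

def bitcast_v2i32_to_v8i8(lanes):
    """Reinterpret two 32-bit lanes as eight bytes (little-endian assumed)."""
    return list(struct.pack("<2I", *lanes))

def low_byte_mask(num_lanes, lane_bytes, out_len):
    """Byte indices of the low byte of each wide lane, padded with -1 (undef)."""
    mask = [i * lane_bytes for i in range(num_lanes)]
    return mask + [-1] * (out_len - num_lanes)

# t16: v2i32 = load ... anyext from v2i8 (upper bits modeled as zero here)
t16 = [0x12, 0x34]
t25 = bitcast_v2i32_to_v8i8(t16)   # t25: v8i8 = bitcast t16
mask = low_byte_mask(2, 4, 8)      # compare with the constant pool entry

print(t25)    # bytes of the two i32 lanes
print(mask)   # indices selected by the table lookup
```

Under these assumptions the computed mask matches the constant pool entry in the dump, which is consistent with the VTBL1 doing nothing more than undoing the extension folded into the load.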