Issue |
144227
|
Summary |
[SelectionDAG][x86] Shuffle pyramid is not eliminated
|
Labels |
new issue
|
Assignees |
|
Reporter |
sutajo
|
https://godbolt.org/z/zEonMr7da
The above example illustrates the problem.
There are 2 minimum reductions in this snippet, but only the second is compiled into the efficient form:
```
vextracti128 xmm1, ymm0, 1
vpminuw xmm0, xmm0, xmm1
vphminposuw xmm0, xmm0
```
The first one is computed with shuffles and mins:
```
vextracti128 xmm2, ymm0, 1
vpminuw xmm2, xmm0, xmm2
vpshufd xmm3, xmm2, 238 # xmm3 = xmm2[2,3,2,3]
vpminuw xmm3, xmm2, xmm3
vpshufd xmm4, xmm3, 85 # xmm4 = xmm3[1,1,1,1]
vpminuw xmm3, xmm3, xmm4
vpsrld xmm4, xmm3, 16
vphminposuw xmm2, xmm2
```
Normally, `llvm.vector.reduce.umin.v16i16` is converted into a series of vector_shuffle and umin operations in the SelectionDAG.
```
Initial selection DAG: %bb.0 'test_reduce_v16i16_with_sharing:start'
SelectionDAG has 47 nodes:
t0: ch,glue = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t4: v16i16,ch = load<(load (s256))> t0, t2, undef:i64
t9: v16i16 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t4, poison:v16i16
t10: v16i16 = umin t4, t9
t11: v16i16 = vector_shuffle<4,5,6,7,u,u,u,u,u,u,u,u,u,u,u,u> t10, poison:v16i16
t12: v16i16 = umin t10, t11
t13: v16i16 = vector_shuffle<2,3,u,u,u,u,u,u,u,u,u,u,u,u,u,u> t12, poison:v16i16
t14: v16i16 = umin t12, t13
t17: i32 = Constant<0>
t15: v16i16 = vector_shuffle<1,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u> t14, poison:v16i16
t16: v16i16 = umin t14, t15
t19: i16 = extract_vector_elt t16, Constant:i64<0>
t21: v1i16 = insert_vector_elt poison:v1i16, t19, Constant:i64<0>
t22: v16i16 = concat_vectors t21, t21, t21, t21, t21, t21, t21, t21, t21, t21, t21, t21, t21, t21, t21, t21
t24: v16i1 = setcc t4, t22, seteq:ch
t6: i64,ch = CopyFromReg t0, Register:i64 %1
t7: v16i16,ch = load<(load (s256))> t0, t6, undef:i64
t26: v16i16 = BUILD_VECTOR Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>, Constant:i16<-1>
t27: v16i16 = vselect t24, t7, t26
t28: v16i16 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t27, poison:v16i16
t29: v16i16 = umin t27, t28
t30: v16i16 = vector_shuffle<4,5,6,7,u,u,u,u,u,u,u,u,u,u,u,u> t29, poison:v16i16
t31: v16i16 = umin t29, t30
t32: v16i16 = vector_shuffle<2,3,u,u,u,u,u,u,u,u,u,u,u,u,u,u> t31, poison:v16i16
t33: v16i16 = umin t31, t32
t38: i16,i16 = merge_values undef:i16, undef:i16
t39: i16,i16 = merge_values t19, undef:i16
t34: v16i16 = vector_shuffle<1,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u> t33, poison:v16i16
t35: v16i16 = umin t33, t34
t36: i16 = extract_vector_elt t35, Constant:i64<0>
t40: i16,i16 = merge_values t39, t36
t43: ch,glue = CopyToReg t0, Register:i16 $ax, t40
t45: ch,glue = CopyToReg t43, Register:i16 $dx, t40:1, t43:1
t46: ch = X86ISD::RET_GLUE t45, TargetConstant:i32<0>, Register:i16 $ax, Register:i16 $dx, t45:1
```
In the x86 backend, the `combineExtractVectorElt` function is supposed detect binary operation reductions and replace these with a more efficient implementation. The problem is that `combineExtractVectorElt` only replaces the result of `extract_vector_elt` (in this case t19 and 36), but it does not replace t16 and t35.
During the combination steps, t22 is transformed to `vector_shuffle<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0> t16, undef:v16i16`, so it depends on t16, and thus it keeps the reduction chain alive.
A possbile solution to this would be:
Let's say `N = extract_vector_elt(V,0)` and `V` comes from a reduction.
If `N` gets replaced with `M`, we could replace `V` with `scalar_to_vector(M)`.
This would eliminate all references to the inefficient reduction.
This is safe to do, since:
- All elements of V expect from the zeroth are known to be undefined
- M does not depend on V, so there will be no cycles.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs