| Issue | 85691 |
| --- | --- |
| Summary | `castAwayContractionLeadingOneDim` introduces unnecessary transposes on outer unit dims |
| Labels | new issue |
| Assignees | |
| Reporter | KoolJBlack |

During IREE's mmt4d lowering, we encounter a vector-matrix product represented by an intermediate `vector.contract` of the following form:
```mlir
%result = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32>
```
Passing this through the `castAwayContractionLeadingOneDim` pattern produces the following:
```mlir
%0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32>
%1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32>
%2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32>
%3 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32>
%4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32>
```
The introduced `vector.transpose` carries the leading unit dimensions further into the overall flow and cannot be trivially folded away downstream. This makes it harder for subsequent patterns to process vector-times-matrix contracts properly.
The cause is that the [pattern](https://github.com/llvm/llvm-project/blob/cf835b96b13bec3b5df1962bae609934edda6d55/mlir/lib/Dialect/Vector/Transforms/VectorDropLeadUnitDim.cpp#L333) requires the cast-away dimensions to be outermost for all operands, while the dimension to drop is chosen from the accumulator of the contract.
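To illustrate where the `[1, 0, 2]` permutation comes from, here is a simplified Python model. It is not the code in `VectorDropLeadUnitDim.cpp`: the function name `perm_to_front` and the list encoding of affine-map results are assumptions made for this sketch. It only shows that moving the dropped iteration dimension to the front of each operand forces a transpose on the LHS.

```python
def perm_to_front(map_results, dim_to_drop):
    """Return the permutation that moves `dim_to_drop` to position 0.

    `map_results` lists the iteration dims an operand indexes, e.g.
    (d0, d1, d3) is encoded as [0, 1, 3]. Returns None when the operand
    does not use the dim, and the identity when it is already outermost.
    """
    if dim_to_drop not in map_results:
        return None
    pos = map_results.index(dim_to_drop)
    # Dropped dim first, remaining dims in their original relative order.
    return [pos] + [i for i in range(len(map_results)) if i != pos]


# Maps from the contract in this issue; the accumulator's leading dim is d1.
lhs_map, rhs_map, acc_map = [0, 1, 3], [0, 2, 3], [1, 2]
dim_to_drop = acc_map[0]

lhs_perm = perm_to_front(lhs_map, dim_to_drop)  # [1, 0, 2]: a transpose is needed
rhs_perm = perm_to_front(rhs_map, dim_to_drop)  # None: the RHS does not index d1
acc_perm = perm_to_front(acc_map, dim_to_drop)  # [0, 1]: already outermost
```

In this model the LHS is the only operand that needs a non-identity permutation, and that permutation swaps two dimensions that are both of size 1.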
### Thoughts
In practice, transposing outer unit dimensions in this instance does not affect the underlying data layout. The pattern could therefore omit this transpose from its final output. Alternatively, a transpose canonicalizer that folds transposes of outer leading unit dimensions could also apply here.
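The layout claim can be checked with a small Python sketch. The helper names below are hypothetical, and the condition in `moves_only_unit_dims` is one assumed way a fold could be guarded, not an existing MLIR API: a permutation is treated as layout-preserving when the non-unit dimensions keep their relative order. The brute-force `preserves_layout` confirms this against row-major offsets.

```python
from itertools import product


def moves_only_unit_dims(shape, perm):
    """True if `perm` keeps the non-unit dims in their original relative order."""
    non_unit = [d for d in perm if shape[d] != 1]
    return non_unit == sorted(non_unit)


def linear_index(shape, idx):
    """Row-major linear offset of `idx` in a tensor of the given shape."""
    offset = 0
    for size, i in zip(shape, idx):
        offset = offset * size + i
    return offset


def preserves_layout(shape, perm):
    """Brute-force check: every element keeps its row-major offset."""
    out_shape = [shape[d] for d in perm]
    for out_idx in product(*(range(s) for s in out_shape)):
        # Result dim `out_dim` reads from source dim `src_dim`.
        src_idx = [0] * len(shape)
        for out_dim, src_dim in enumerate(perm):
            src_idx[src_dim] = out_idx[out_dim]
        if linear_index(shape, src_idx) != linear_index(out_shape, out_idx):
            return False
    return True


# The transpose from this issue: vector<1x1x8xi32> with permutation [1, 0, 2].
issue_is_foldable = moves_only_unit_dims([1, 1, 8], [1, 0, 2])
issue_keeps_layout = preserves_layout([1, 1, 8], [1, 0, 2])
```

For the shape and permutation in this issue both checks succeed, whereas a genuine transpose such as `[1, 0]` on a `2x8` shape fails both.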
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs