| Field | Value |
|---|---|
| Issue | 145879 |
| Summary | [mlir][linalg] using outputDimSize in `getPackOpSourceOrPaddedSource` to lower linalg.pack for sources with dynamic leading dims |
| Labels | mlir |
| Assignees | |
| Reporter | rYm-A |

Lowering the following `linalg.pack` op triggers an assertion:
```
func.func @main(%arg0: tensor<?x?x8xf32>, %arg1: tensor<?x?x8x8x1xf32>, %cst: f32) -> tensor<?x?x8x8x1xf32> {
%result = linalg.pack %arg0
padding_value(%cst : f32)
outer_dims_perm = [0, 1, 2]
inner_dims_pos = [1, 2]
inner_tiles = [8, 1]
into %arg1
{lowering_config = #iree_codegen.lowering_config<tile_sizes = [[64, 8, 8], [1, 1, 1]]>}
: tensor<?x?x8xf32> -> tensor<?x?x8x8x1xf32>
return %result : tensor<?x?x8x8x1xf32>
}
```
...using
```
iree-opt packOp.mlir \
--pass-pipeline="builtin.module(func.func(iree-llvmcpu-tile{tiling-level=1}, iree-codegen-decompose-pack-unpack-ops))" \
--debug \
--mlir-disable-threading
```
...will generate the following IR after `iree-llvmcpu-tile{tiling-level=1}`:
```
func.func @main(%arg0: tensor<?x?x8xf32>, %arg1: tensor<?x?x8x8x1xf32>, %arg2: f32) -> tensor<?x?x8x8x1xf32> {
%c8 = arith.constant 8 : index
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%dim = tensor.dim %arg1, %c0 : tensor<?x?x8x8x1xf32>
%dim_0 = tensor.dim %arg1, %c1 : tensor<?x?x8x8x1xf32>
%0 = scf.for %arg3 = %c0 to %dim step %c1 iter_args(%arg4 = %arg1) -> (tensor<?x?x8x8x1xf32>) {
%1 = scf.for %arg5 = %c0 to %dim_0 step %c1 iter_args(%arg6 = %arg4) -> (tensor<?x?x8x8x1xf32>) {
%2 = scf.for %arg7 = %c0 to %c8 step %c1 iter_args(%arg8 = %arg6) -> (tensor<?x?x8x8x1xf32>) {
%dim_1 = tensor.dim %arg0, %c0 : tensor<?x?x8xf32>
%dim_2 = tensor.dim %arg0, %c1 : tensor<?x?x8xf32>
%3 = affine.min affine_map<(d0)[s0] -> (-d0 + s0, 1)>(%arg3)[%dim_1]
%4 = affine.apply affine_map<(d0) -> (d0 * 8)>(%arg5)
%5 = affine.min affine_map<(d0)[s0] -> (d0 * -8 + s0, 8)>(%arg5)[%dim_2]
%extracted_slice = tensor.extract_slice %arg0[%arg3, %4, %arg7] [%3, %5, 1] [1, 1, 1] : tensor<?x?x8xf32> to tensor<?x?x1xf32>
%extracted_slice_3 = tensor.extract_slice %arg8[%arg3, %arg5, %arg7, 0, 0] [1, 1, 1, 8, 1] [1, 1, 1, 1, 1] : tensor<?x?x8x8x1xf32> to tensor<1x1x1x8x1xf32>
%pack = linalg.pack %extracted_slice padding_value(%arg2 : f32)
outer_dims_perm = [0, 1, 2]
inner_dims_pos = [1, 2]
inner_tiles = [8, 1]
into %extracted_slice_3 {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[64, 8, 8], [1, 1, 1]]>} :
tensor<?x?x1xf32> -> tensor<1x1x1x8x1xf32>
%inserted_slice = tensor.insert_slice %pack into %arg8[%arg3, %arg5, %arg7, 0, 0] [1, 1, 1, 8, 1] [1, 1, 1, 1, 1] : tensor<1x1x1x8x1xf32> into tensor<?x?x8x8x1xf32>
scf.yield %inserted_slice : tensor<?x?x8x8x1xf32>
}
scf.yield %2 : tensor<?x?x8x8x1xf32>
}
scf.yield %1 : tensor<?x?x8x8x1xf32>
}
return %0 : tensor<?x?x8x8x1xf32>
}
```
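To see where the dynamic sizes of the tiled source come from, the three affine maps in the innermost loop can be evaluated by hand. The following is a minimal Python sketch of that arithmetic only; the function name `tile_sizes` and the sample dimensions are made up for illustration and are not part of IREE or MLIR.

```python
def tile_sizes(i, j, dim0, dim1):
    """Evaluate the affine maps from the tiled IR for loop indices i and j.

    dim0 and dim1 stand in for the runtime values of %dim_1 and %dim_2
    (the two dynamic dims of %arg0).
    """
    size0 = min(dim0 - i, 1)        # affine_map<(d0)[s0] -> (-d0 + s0, 1)>
    offset1 = j * 8                 # affine_map<(d0) -> (d0 * 8)>
    size1 = min(dim1 - 8 * j, 8)    # affine_map<(d0)[s0] -> (d0 * -8 + s0, 8)>
    return size0, offset1, size1


# Hypothetical runtime shape tensor<3x20x8xf32>.
print(tile_sizes(0, 0, 3, 20))  # full tile along dim 1
print(tile_sizes(2, 2, 3, 20))  # last, partial tile along dim 1
```

For any in-range `i` the first size evaluates to 1 at runtime, but because it is computed by `affine.min` on a dynamic value, the slice type in the IR above still records that dim as `?`.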
The `iree-codegen-decompose-pack-unpack-ops` conversion then triggers this [assert](https://github.com/llvm/llvm-project/blob/66f84c8b8a762832af39e91370018f8f8307a0fc/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp#L1057) in `getPackOpSourceOrPaddedSource`, because the outermost source dim of the tiled `linalg.pack` op is dynamic rather than 1.
@banach-space, why not use packOp.SourceType() instead? The `iree-llvmcpu-tile{tiling-level=1}` pass already ensures that the non-tiled outer dimension of the `linalg.pack` result is set to 1.
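The difference between checking the source shape and checking the result shape can be modelled outside MLIR. The following is a minimal Python sketch, not the code in `Transforms.cpp`: the helper names and the `DYNAMIC` marker are hypothetical, and an identity `outer_dims_perm` is assumed, as in the reproducer.

```python
DYNAMIC = None  # stand-in for a '?' dimension in a tensor type


def non_tiled_dims(src_rank, inner_dims_pos):
    """Source dims that are not listed in inner_dims_pos."""
    return [d for d in range(src_rank) if d not in inner_dims_pos]


def source_dims_are_unit(src_shape, inner_dims_pos):
    """Model of a check on the source type: every non-tiled dim must be 1."""
    dims = non_tiled_dims(len(src_shape), inner_dims_pos)
    return all(src_shape[d] == 1 for d in dims)


def result_outer_dims_are_unit(dest_shape, src_rank, inner_dims_pos):
    """Model of the same check on the result's outer dims.

    Assumes an identity outer_dims_perm, so source dim d maps to
    result outer dim d.
    """
    dims = non_tiled_dims(src_rank, inner_dims_pos)
    return all(dest_shape[d] == 1 for d in dims)


# Shapes of the tiled op: tensor<?x?x1xf32> -> tensor<1x1x1x8x1xf32>
src = (DYNAMIC, DYNAMIC, 1)
dest = (1, 1, 1, 8, 1)
inner_dims_pos = [1, 2]

print(source_dims_are_unit(src, inner_dims_pos))                   # False
print(result_outer_dims_are_unit(dest, len(src), inner_dims_pos))  # True
```

In this model the only non-tiled dim is dim 0: it is `?` in the source, so the source-based check fails, while the corresponding outer dim of the result is statically 1.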
This time, not a duplicated issue 🙂