Issue 145879
Summary [mlir][linalg] using outputDimSize in `getPackOpSourceOrPaddedSource` to lower linalg.pack for sources with dynamic leading dims
Labels mlir
Assignees
Reporter rYm-A
    Lowering the following linalg.pack op triggers an assertion failure:

```
func.func @main(%arg0: tensor<?x?x8xf32>, %arg1:  tensor<?x?x8x8x1xf32>, %cst: f32 ) -> tensor<?x?x8x8x1xf32> {
  %result = linalg.pack %arg0
    padding_value(%cst : f32)
    outer_dims_perm = [0, 1, 2]
    inner_dims_pos = [1, 2]
    inner_tiles = [8, 1]
    into %arg1
    {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[64, 8, 8], [1, 1, 1]]>}
    : tensor<?x?x8xf32> -> tensor<?x?x8x8x1xf32>
  return %result : tensor<?x?x8x8x1xf32>
}
```

...using 

```
iree-opt packOp.mlir \
--pass-pipeline="builtin.module(func.func(iree-llvmcpu-tile{tiling-level=1}, iree-codegen-decompose-pack-unpack-ops))"  \
--debug \
--mlir-disable-threading 
```

...will generate the following IR after `iree-llvmcpu-tile{tiling-level=1}`:

```
func.func @main(%arg0: tensor<?x?x8xf32>, %arg1: tensor<?x?x8x8x1xf32>, %arg2: f32) -> tensor<?x?x8x8x1xf32> {
  %c8 = arith.constant 8 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %dim = tensor.dim %arg1, %c0 : tensor<?x?x8x8x1xf32>
  %dim_0 = tensor.dim %arg1, %c1 : tensor<?x?x8x8x1xf32>
  %0 = scf.for %arg3 = %c0 to %dim step %c1 iter_args(%arg4 = %arg1) -> (tensor<?x?x8x8x1xf32>) {
    %1 = scf.for %arg5 = %c0 to %dim_0 step %c1 iter_args(%arg6 = %arg4) -> (tensor<?x?x8x8x1xf32>) {
      %2 = scf.for %arg7 = %c0 to %c8 step %c1 iter_args(%arg8 = %arg6) -> (tensor<?x?x8x8x1xf32>) {
        %dim_1 = tensor.dim %arg0, %c0 : tensor<?x?x8xf32>
        %dim_2 = tensor.dim %arg0, %c1 : tensor<?x?x8xf32>
        %3 = affine.min affine_map<(d0)[s0] -> (-d0 + s0, 1)>(%arg3)[%dim_1]
        %4 = affine.apply affine_map<(d0) -> (d0 * 8)>(%arg5)
        %5 = affine.min affine_map<(d0)[s0] -> (d0 * -8 + s0, 8)>(%arg5)[%dim_2]
        %extracted_slice = tensor.extract_slice %arg0[%arg3, %4, %arg7] [%3, %5, 1] [1, 1, 1] : tensor<?x?x8xf32> to tensor<?x?x1xf32>
        %extracted_slice_3 = tensor.extract_slice %arg8[%arg3, %arg5, %arg7, 0, 0] [1, 1, 1, 8, 1] [1, 1, 1, 1, 1] : tensor<?x?x8x8x1xf32> to tensor<1x1x1x8x1xf32>
        %pack = linalg.pack %extracted_slice padding_value(%arg2 : f32)
          outer_dims_perm = [0, 1, 2]
          inner_dims_pos = [1, 2]
          inner_tiles = [8, 1]
          into %extracted_slice_3 {lowering_config = #iree_codegen.lowering_config<tile_sizes = [[64, 8, 8], [1, 1, 1]]>}
          : tensor<?x?x1xf32> -> tensor<1x1x1x8x1xf32>
        %inserted_slice = tensor.insert_slice %pack into %arg8[%arg3, %arg5, %arg7, 0, 0] [1, 1, 1, 8, 1] [1, 1, 1, 1, 1] : tensor<1x1x1x8x1xf32> into tensor<?x?x8x8x1xf32>
        scf.yield %inserted_slice : tensor<?x?x8x8x1xf32>
      }
      scf.yield %2 : tensor<?x?x8x8x1xf32>
    }
    scf.yield %1 : tensor<?x?x8x8x1xf32>
  }
  return %0 : tensor<?x?x8x8x1xf32>
}
```

The `iree-codegen-decompose-pack-unpack-ops` pass then triggers this [assert](https://github.com/llvm/llvm-project/blob/66f84c8b8a762832af39e91370018f8f8307a0fc/mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp#L1057) in `getPackOpSourceOrPaddedSource`, because the outermost dim of the tiled linalg.pack source is dynamic (`?`) rather than statically 1.

@banach-space, why not use packOp.SourceType() instead? The `iree-llvmcpu-tile{tiling-level=1}` pass already ensures that the non-tiled outer dimension of the linalg.pack result is 1.
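
To make the distinction concrete, here is a minimal Python sketch of the two checks as I understand them. It is a simplified model, not the MLIR C++ code: the function names and the `DYNAMIC` sentinel are hypothetical, and it assumes an identity `outer_dims_perm`, as in the tiled pack above.

```python
# Simplified model of the unit-dim check; NOT the actual MLIR implementation.
# DYNAMIC and both function names are hypothetical, for illustration only.
DYNAMIC = None  # stand-in for a dynamic ("?") dimension


def non_tiled_dims(source_rank, inner_dims_pos):
    """Source dims that are not listed in inner_dims_pos."""
    return [d for d in range(source_rank) if d not in inner_dims_pos]


def source_dims_are_unit(source_shape, inner_dims_pos):
    """Models checking the source type: non-tiled source dims must be 1."""
    dims = non_tiled_dims(len(source_shape), inner_dims_pos)
    return all(source_shape[d] == 1 for d in dims)


def result_dims_are_unit(source_shape, result_shape, inner_dims_pos):
    """Models checking the result type instead (identity outer_dims_perm assumed)."""
    dims = non_tiled_dims(len(source_shape), inner_dims_pos)
    return all(result_shape[d] == 1 for d in dims)


# Shapes of the tiled pack: tensor<?x?x1xf32> -> tensor<1x1x1x8x1xf32>
source_shape = [DYNAMIC, DYNAMIC, 1]
result_shape = [1, 1, 1, 8, 1]
inner_dims_pos = [1, 2]

source_ok = source_dims_are_unit(source_shape, inner_dims_pos)
result_ok = result_dims_are_unit(source_shape, result_shape, inner_dims_pos)
```

In this model, dim 0 is the only non-tiled dim. It is dynamic in the source but statically 1 in the result, so the source-based check fails while the result-based check passes.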

This time, not a duplicated issue 🙂
