Issue 142107
Summary [mlir][vector][linalg] Fix incorrect in_bounds attribute calculation
Labels mlir
Assignees banach-space
Reporter banach-space
    This ticket is about the `in_bounds` attribute from the `Vector`s [vector.transfer_read](https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop) + [vector.transfer_write](https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_write-vectortransferwriteop) Ops. 

There are at least two places in the Linalg vectoriser where the computation of this attribute is too weak: 
* [Link](https://github.com/llvm/llvm-project/blob/28eb66b79413950e584b2c3d8581ae5dc858d6dc/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L2886-L2921) to a code block in `vectorizeAsInsertSliceOp`.
* [Link](https://github.com/llvm/llvm-project/blob/28eb66b79413950e584b2c3d8581ae5dc858d6dc/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L1555-L1565) to a code block in `createWriteOrMaskedWrite`.

Note that in both cases, the write/read indices are ignored when computing the attribute (you will have to open the links). This means that the following example is always inferred as "in bounds". However, that should depend on the indices:
```mlir
vector.transfer_write %14, %arg8[%arg3, %arg5] {in_bounds = [true, true]} : vector<8x4xf32>, memref<8x32xf32>
```

Effectively, `vectorizeAsInsertSliceOp` and `createWriteOrMaskedWrite` assume that the indices are always `0`, but that is not always the case (e.g. in the example above).

In https://github.com/llvm/llvm-project/pull/141244 I am refactoring `vectorizeAsInsertSliceOp` to re-use `createWriteOrMaskedWrite` - this way there will be only one hook to fix. Note that this issue also affects [createReadOrMaskedRead](https://github.com/llvm/llvm-project/blob/28eb66b79413950e584b2c3d8581ae5dc858d6dc/mlir/lib/Dialect/Vector/Utils/VectorUtils.cpp#L328-L369), which is implemented in "VectorUtils.cpp" (the other hooks are implemented in "Vectorization.cpp").

Also in https://github.com/llvm/llvm-project/pull/141244, I am replacing this calculation:
```cpp
    for (unsigned i = 0; i < rank; i++)
 inBoundsVal[i] = (destShape[i] == inputVecSizesForLeadingDims[i]) &&
 !ShapedType::isDynamic(destShape[i]);
```
with (`==` --> `>=`):
```cpp
    for (unsigned i = 0; i < rank; i++)
      inBoundsVal[i] = (destShape[i] >= inputVecSizesForLeadingDims[i]) &&
 !ShapedType::isDynamic(destShape[i]);
```
If the write/read indices are `0` (and that's the only situation in which the computation above is valid), then both `==` and `>=` are valid (i.e there is not need to require `eq`, `geq` is also fine).

Addressing this might be tricky - switching from "in-bounds" to "out-of-bounds" will mean that `vector.maskedload` will be generated instead of `vector.load` in many places.

NEXT STEPS:
* Land https://github.com/llvm/llvm-project/pull/141244 - it maintains the current behaviour and will simplify the next steps.
* Investigate the impact of using "out-of-bounds" instead of "in-bounds".
* Look into value-bounds-analysis as a tool to infer that the indices are indeed `0`, even when using SSA values (that will often be the case in practice).

As a specific example, this was generated from [ransform-vector.mlir](https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Vector/transform-vector.mlir):

```mlir
 %0 = scf.for %arg3 = %c0 to %c8 step %c8 iter_args(%arg4 = %arg2) -> (tensor<8x32xf32>) {
      %1 = scf.for %arg5 = %c0 to %c32 step %c4 iter_args(%arg6 = %arg4) -> (tensor<8x32xf32>) {
        %2 = scf.for %arg7 = %c0 to %c16 step %c2 iter_args(%arg8 = %arg6) -> (tensor<8x32xf32>) {
 %3 = vector.transfer_read %arg0[%arg3, %arg7], %cst {in_bounds = [true, true]} : tensor<8x16xf32>, vector<8x2xf32>
          %4 = vector.transfer_read %arg1[%arg7, %arg5], %cst {in_bounds = [true, true]} : tensor<16x32xf32>, vector<2x4xf32>
          %5 = vector.transfer_read %arg8[%arg3, %arg5], %cst {in_bounds = [true, true]} : tensor<8x32xf32>, vector<8x4xf32>
          %6 = vector.contract {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %3, %4, %5 : vector<8x2xf32>, vector<2x4xf32> into vector<8x4xf32>
          %7 = vector.transfer_write %6, %arg8[%arg3, %arg5] {in_bounds = [false, false]} : vector<8x4xf32>, tensor<8x32xf32>
 scf.yield %7 : tensor<8x32xf32>
        }
        scf.yield %2 : tensor<8x32xf32>
      }
      scf.yield %1 : tensor<8x32xf32>
    }
 return %0 : tensor<8x32xf32>
  }
```

It's quite clear that the range of indices for this `vector.transfer_write` make it always "in-bounds":
```mlir
%7 = vector.transfer_write %6, %arg8[%arg3, %arg5] {in_bounds = [false, false]} : vector<8x4xf32>, tensor<8x32xf32>
```
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to