| Issue | 140259 |
| --- | --- |
| Summary | [mlir] Inconsistent output when executing MLIR program with `--scf-parallel-loop-fusion` |
| Labels | mlir |
| Assignees | |
| Reporter | Emilyaxe |

git version: ba631508ae7f
system: `Ubuntu 18.04.6 LTS`
## Description:
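I get different results when executing the same MLIR program with and without `--scf-parallel-loop-fusion`: without the pass the program prints all zeros, while with the pass the first two rows of the output contain garbage values.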
## Steps to Reproduce:
### 1. **MLIR Program (a.mlir):**
```
module {
func.func private @printMemrefI32(tensor<*xi32>)
func.func private @printMemrefF32(tensor<*xf32>)
func.func @main() {
%idx0 = index.constant 0
%1 = "tosa.const"() <{values = dense<-5857> : tensor<1x4x4xi32>}> : () -> tensor<1x4x4xi32>
%10 = tosa.clz %1 : (tensor<1x4x4xi32>) -> tensor<1x4x4xi32>
%28 = tosa.cast %10 : (tensor<1x4x4xi32>) -> tensor<1x4x4xf32>
%48 = tosa.floor %28 : (tensor<1x4x4xf32>) -> tensor<1x4x4xf32>
%55 = tosa.reverse %48 {axis = 1 : i32} : (tensor<1x4x4xf32>) -> tensor<1x4x4xf32>
%cast_5 = tensor.cast %55 : tensor<1x4x4xf32> to tensor<*xf32>
call @printMemrefF32(%cast_5) : (tensor<*xf32>) -> ()
return
}
}
```
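As a sanity check on which result is the correct one, the expected output of this program can be worked out by hand. The sketch below is a plain-Python model of the four TOSA ops, assuming `tosa.clz` counts the leading zero bits of the 32-bit two's-complement representation. The helper names `clz32` and `expected_output` are illustrative only and are not part of MLIR.

```python
import math


def clz32(x: int) -> int:
    """Count leading zero bits of x viewed as a 32-bit two's-complement value."""
    u = x & 0xFFFFFFFF          # reinterpret as unsigned 32-bit
    return 32 - u.bit_length()  # bit_length() is 0 for u == 0, giving 32


def expected_output(value: int = -5857, rows: int = 4, cols: int = 4):
    """Model const -> clz -> cast -> floor -> reverse(axis=1) on a 1xRxC tensor."""
    # tosa.const: 1 x rows x cols tensor filled with `value`
    t = [[[value] * cols for _ in range(rows)]]
    # tosa.clz on i32
    t = [[[clz32(v) for v in row] for row in plane] for plane in t]
    # tosa.cast i32 -> f32
    t = [[[float(v) for v in row] for row in plane] for plane in t]
    # tosa.floor
    t = [[[float(math.floor(v)) for v in row] for row in plane] for plane in t]
    # tosa.reverse along axis 1 (the row axis of each plane)
    t = [plane[::-1] for plane in t]
    return t


if __name__ == "__main__":
    for row in expected_output()[0]:
        print(row)
```

Under these assumptions `-5857` has its sign bit set, so `clz` yields 0 for every element and the whole pipeline should produce an all-zero 1x4x4 tensor.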
### 2. **Command to run without `--scf-parallel-loop-fusion`:**
```
/pathto/mlir-opt a.mlir -tosa-to-scf -tosa-to-arith \
| /pathto/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /pathto/mlir-opt -convert-scf-to-cf --cse \
-one-shot-bufferize="bufferize-function-boundaries" -finalize-memref-to-llvm -convert-linalg-to-parallel-loops \
-convert-scf-to-cf -finalize-memref-to-llvm -convert-math-to-llvm -convert-arith-to-llvm \
-convert-index-to-llvm -convert-cf-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts \
| timeout 10 /pathto/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/pathto/libmlir_c_runner_utils.so \
--shared-libs=/pathto/libmlir_runner_utils.so \
--shared-libs=/pathto/libmlir_async_runtime.so
```
### 3. **Output without `--scf-parallel-loop-fusion`:**
```
Unranked Memref base@ = 0x55a554495b40 rank = 3 offset = 0 sizes = [1, 4, 4] strides = [16, 4, 1] data =
[[[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]]]
```
### 4. **Command to run with `--scf-parallel-loop-fusion`:**
```
/pathto/mlir-opt a.mlir -tosa-to-scf -tosa-to-arith \
| /pathto/mlir-opt -pass-pipeline="builtin.module(func.func(tosa-to-linalg))" \
| /pathto/mlir-opt -convert-scf-to-cf --cse \
-one-shot-bufferize="bufferize-function-boundaries" -finalize-memref-to-llvm -convert-linalg-to-parallel-loops \
--scf-parallel-loop-fusion -convert-scf-to-cf -finalize-memref-to-llvm -convert-math-to-llvm -convert-arith-to-llvm \
-convert-index-to-llvm -convert-cf-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts \
| timeout 10 /pathto/mlir-cpu-runner -e main -entry-point-result=void \
--shared-libs=/pathto/libmlir_c_runner_utils.so \
--shared-libs=/pathto/libmlir_runner_utils.so \
--shared-libs=/pathto/libmlir_async_runtime.so
```
### 5. **Output with `--scf-parallel-loop-fusion`:**
```
Unranked Memref base@ = 0x5625e8d1c780 rank = 3 offset = 0 sizes = [1, 4, 4] strides = [16, 4, 1] data =
"" 7.20647e+31, 2.89575e+32, 2.48354e+27],
[5.12554e-11, 2.63753e+23, 7.00624e+22, 5.61567e+13],
[0, 0, 0, 0],
[0, 0, 0, 0]]]
```
### 6. **Analysis for this case:**
This issue can be consistently reproduced by enabling the `--scf-parallel-loop-fusion` option, which fuses the last two `scf.parallel` loops.
I'm not sure whether the root cause lies in the `--scf-parallel-loop-fusion` transformation itself or in the subsequent `-convert-cf-to-llvm` lowering.
For reference:
[input.txt](https://github.com/user-attachments/files/20254061/input.txt)
contains the IR before applying `--scf-parallel-loop-fusion`.
[output.txt](https://github.com/user-attachments/files/20254107/output.txt)
contains the IR after applying `--scf-parallel-loop-fusion`.
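
The pattern in the fused output (garbage in rows 0 and 1, zeros in rows 2 and 3) is what one would expect if an elementwise producer loop were fused with the `tosa.reverse` consumer loop and the fused body then ran sequentially: iteration `j` would read row `3 - j` of the intermediate buffer before that row had been written. This is only a hypothesis and has not been confirmed against the attached IR. The Python sketch below models it, with a hypothetical `UNINIT` marker standing in for uninitialized memory and a plain copy standing in for the producer.

```python
UNINIT = "uninit"  # hypothetical marker for memory that has not been written yet


def run_unfused(src):
    """Producer loop runs to completion before the reversing consumer loop."""
    n = len(src)
    tmp = [UNINIT] * n
    out = [UNINIT] * n
    for j in range(n):           # producer: fills the intermediate buffer
        tmp[j] = src[j]
    for j in range(n):           # consumer: reverse along the row axis
        out[j] = tmp[n - 1 - j]
    return out


def run_fused(src):
    """Both bodies share one loop; the consumer reads a different index
    than the one the producer has just written."""
    n = len(src)
    tmp = [UNINIT] * n
    out = [UNINIT] * n
    for j in range(n):
        tmp[j] = src[j]          # producer body for iteration j
        out[j] = tmp[n - 1 - j]  # consumer body reads row n-1-j
    return out


if __name__ == "__main__":
    rows = [0.0, 0.0, 0.0, 0.0]  # one entry per row of the 4x4 plane
    print("unfused:", run_unfused(rows))
    print("fused:  ", run_fused(rows))
```

In this model the fused loop leaves the first half of the rows uninitialized and the second half correct, matching the shape of the reported output. If that is what happens here, the fusion pass would be treating the two loops as independent even though the consumer accesses the buffer at a different index than the producer writes.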