| Field | Value |
|---|---|
| Issue | 139221 |
| Summary | [MLIR] Inconsistent output when executing MLIR program with and without `-convert-affine-for-to-gpu` |
| Labels | mlir |
| Assignees | |
| Reporter | Lambor24 |

My git version is [145aa66](https://github.com/llvm/llvm-project/commit/145aa66f689c24c0cf2fffd995ba83678cfaa310).
## Description:
I get inconsistent results when executing the same MLIR program with and without the `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}` pass.
## Steps to Reproduce:
### 1. **MLIR Program (test.mlir)**:
```
module {
  memref.global "private" constant @__constant_4xi16 : memref<4xi16> = dense<-1> {alignment = 64 : i64}
  func.func private @printMemrefI16(memref<*xi16>) attributes {llvm.emit_c_interface}
  func.func @main() {
    %c-1_i16 = arith.constant -1 : i16
    %c0_i16 = arith.constant 0 : i16
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<i16>
    affine.store %c0_i16, %alloc[] : memref<i16>
    affine.for %arg0 = 0 to 4 {
      %0 = affine.load %alloc[] : memref<i16>
      %1 = arith.addi %0, %c-1_i16 : i16
      affine.store %1, %alloc[] : memref<i16>
    }
    %expand_shape = memref.expand_shape %alloc [] output_shape [1] : memref<i16> into memref<1xi16>
    %cast = memref.cast %expand_shape : memref<1xi16> to memref<*xi16>
    call @printMemrefI16(%cast) : (memref<*xi16>) -> ()
    return
  }
}
```
### 2. **Command to Run Without `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**
```
/path/llvm-project/build/bin/mlir-opt test.mlir -lower-affine -gpu-lower-to-nvvm-pipeline | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so
```
### 3. **Output Without `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**
```
[-4]
```
### 4. **Command to Run With `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**
```
/path/llvm-project/build/bin/mlir-opt test.mlir -pass-pipeline="builtin.module(func.func(convert-affine-for-to-gpu{gpu-block-dims=1 gpu-thread-dims=0}))" | \
/path/llvm-project/build/bin/mlir-opt -lower-affine -gpu-lower-to-nvvm-pipeline | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so
```
### 5. **Output With `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**
```
[-1]
```
I'm not sure whether there is a bug in my program or whether incorrect usage of the above passes caused this result.
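
For reference, here is a small Python model of the loop in `test.mlir`, showing one reading under which both outputs can arise. It is only a sketch: the function names are hypothetical, and the second function rests on the assumption that after the GPU conversion every iteration loads `%alloc` before any iteration stores to it. I have not confirmed that this is what the pass actually generates.

```python
# Hypothetical model of the loop in test.mlir; not the pass's actual implementation.

def run_sequential(trip_count: int = 4, step: int = -1) -> int:
    """Iterations run in order, so each load sees the previous store."""
    cell = 0  # affine.store %c0_i16, %alloc[]
    for _ in range(trip_count):
        loaded = cell         # affine.load %alloc[]
        cell = loaded + step  # arith.addi, then affine.store
    return cell


def run_as_independent_iterations(trip_count: int = 4, step: int = -1) -> int:
    """ASSUMPTION: every iteration loads before any iteration stores,
    as could happen if each iteration ran on its own GPU block or thread."""
    cell = 0
    loaded = [cell for _ in range(trip_count)]  # every iteration reads 0
    for value in loaded:
        cell = value + step  # every iteration stores the same result
    return cell


print([run_sequential()])                 # [-4]
print([run_as_independent_iterations()])  # [-1]
```

Under that assumption, the difference would come from the loop-carried dependence on `%alloc` rather than from the arithmetic itself.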