Issue 139221
Summary [MLIR] Inconsistent output when executing MLIR program with and without `-convert-affine-for-to-gpu`
Labels mlir
Assignees
Reporter Lambor24
    My git version is [145aa66](https://github.com/llvm/llvm-project/commit/145aa66f689c24c0cf2fffd995ba83678cfaa310).

## Description:
I get inconsistent results when executing the same MLIR program with and without the `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}` pass.

## Steps to Reproduce:

### 1. **MLIR Program (test.mlir)**:

```
module {
  memref.global "private" constant @__constant_4xi16 : memref<4xi16> = dense<-1> {alignment = 64 : i64}
  func.func private @printMemrefI16(memref<*xi16>) attributes {llvm.emit_c_interface}
  func.func @main() {
    %c-1_i16 = arith.constant -1 : i16
    %c0_i16 = arith.constant 0 : i16
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<i16>
    affine.store %c0_i16, %alloc[] : memref<i16>
    affine.for %arg0 = 0 to 4 {
      %0 = affine.load %alloc[] : memref<i16>
      %1 = arith.addi %0, %c-1_i16 : i16
      affine.store %1, %alloc[] : memref<i16>
    }
    %expand_shape = memref.expand_shape %alloc [] output_shape [1] : memref<i16> into memref<1xi16>
    %cast = memref.cast %expand_shape : memref<1xi16> to memref<*xi16>
    call @printMemrefI16(%cast) : (memref<*xi16>) -> ()
    return
  }
}
```

### 2. **Command to Run Without `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```
/path/llvm-project/build/bin/mlir-opt test.mlir -lower-affine -gpu-lower-to-nvvm-pipeline | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so
```

### 3. **Output Without `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```
[-4]
```

### 4. **Command to Run With `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```
/path/llvm-project/build/bin/mlir-opt test.mlir -pass-pipeline="builtin.module(func.func(convert-affine-for-to-gpu{gpu-block-dims=1 gpu-thread-dims=0}))" | \
/path/llvm-project/build/bin/mlir-opt -lower-affine -gpu-lower-to-nvvm-pipeline | \
/path/llvm-project/build/bin/mlir-runner -e main -entry-point-result=void \
-shared-libs=/path/llvm-project/build/lib/libmlir_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_c_runner_utils.so \
-shared-libs=/path/llvm-project/build/lib/libmlir_cuda_runtime.so
```

### 5. **Output With `-convert-affine-for-to-gpu{gpu-block-dims=0 gpu-thread-dims=1}`:**

```
[-1]
```

I'm not sure whether there is a bug in my program or whether I am using the above passes incorrectly.
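
One possible reading of the two outputs, offered as an assumption rather than a confirmed diagnosis: the loop body loads from and stores to the same `%alloc` cell, so each iteration depends on the previous one. If the conversion maps the four iterations to GPU blocks that run concurrently, every iteration could load the initial `0` and store `-1`. The Python sketch below models only that one interleaving. It does not call MLIR, and the function names are hypothetical.

```python
def wrap_i16(value: int) -> int:
    """Wrap an integer to signed 16-bit, mirroring i16 arithmetic."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


def run_sequential(trip_count: int, step: int) -> int:
    """Model the affine.for loop executed in order on a single thread."""
    cell = 0  # affine.store %c0_i16, %alloc[]
    for _ in range(trip_count):
        loaded = cell                    # affine.load %alloc[]
        cell = wrap_i16(loaded + step)   # arith.addi, then affine.store
    return cell


def run_as_parallel_iterations(trip_count: int, step: int) -> int:
    """Model one possible interleaving when each iteration is its own GPU block.

    Assumption: every iteration loads the cell before any store lands,
    so all of them observe the initial value.
    """
    cell = 0
    loaded_values = [cell for _ in range(trip_count)]  # all loads happen first
    for loaded in loaded_values:
        cell = wrap_i16(loaded + step)   # every store writes the same result
    return cell


if __name__ == "__main__":
    print([run_sequential(4, -1)])              # [-4]
    print([run_as_parallel_iterations(4, -1)])  # [-1]
```

Under this assumption the sequential model reproduces `[-4]` and the concurrent model reproduces `[-1]`. If that is what happens, the question is whether the pass is expected to reject or preserve loop-carried dependences, or whether checking for them is left to the caller.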