| Issue |
159699
|
| Summary |
[MLIR] TransferOptimization::storeToLoadForwarding doesn't seem to support RAW pattern with separate write ops.
|
| Labels |
mlir:vector
|
| Assignees |
|
| Reporter |
LWenH
|
https://github.com/llvm/llvm-project/blob/a2efa7ab207d78bf753b4a9651070fee283d8217/mlir/lib/Dialect/Vector/Transforms/VectorTransferOpTransforms.cpp#L237
According to the description of storeToLoadForwarding, if there are multiple candidate vector.transfer_write operations where the last write post-dominates the other writes and also dominates the read operation, those writes should be merged and collectively forwarded to the read's users. However, the current implementation of vector::checkSameValueRAW does not appear to support such cases, because it requires the VectorType of the read and the write to match exactly. Could MLIR add support for this scenario, or alternatively could the comment on storeToLoadForwarding be refined to more accurately reflect its actual behavior?
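For contrast, a minimal sketch of the RAW pattern the current forwarding does handle, where the write and read vector types match exactly (the SSA names, padding value, and memref shape below are illustrative, not taken from real IR):

```mlir
// Write and read cover the same 32x32 region with identical vector types,
// so storeToLoadForwarding can replace uses of %r with %v directly.
vector.transfer_write %v, %alloc[%c0, %c0] {in_bounds = [true, true]} : vector<32x32xi32>, memref<256x256xi32, #gpu.address_space<workgroup>>
%r = vector.transfer_read %alloc[%c0, %c0], %c0_i32 {in_bounds = [true, true]} : memref<256x256xi32, #gpu.address_space<workgroup>>, vector<32x32xi32>
```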
Consider the following mlir case:
```mlir
vector.transfer_write %1, %alloc[%c32, %c32] {in_bounds = [true, true]} : vector<32x32xi32>, memref<256x256xi32, #gpu.address_space<workgroup>>
vector.transfer_write %2, %alloc[%c32, %c0] {in_bounds = [true, true]} : vector<32x32xi32>, memref<256x256xi32, #gpu.address_space<workgroup>>
vector.transfer_write %3, %alloc[%c0, %c32] {in_bounds = [true, true]} : vector<32x32xi32>, memref<256x256xi32, #gpu.address_space<workgroup>>
vector.transfer_write %4, %alloc[%c0, %c0] {in_bounds = [true, true]} : vector<32x32xi32>, memref<256x256xi32, #gpu.address_space<workgroup>>
%5 = vector.transfer_read %alloc[%c0, %c0], %c0_i32 {in_bounds = [true, true]} : memref<256x256xi32, #gpu.address_space<workgroup>>, vector<64x64xi32>
use %5
```
In this case, %1, %2, %3, and %4 can all be viewed as sub-vectors of %5 and together form the complete %5. However, the current **vector::checkSameValueRAW** detects that the vector type written by vector.transfer_write %4 (the lastWrite op), vector<32x32xi32>, does not match the vector type read by %5 = vector.transfer_read, vector<64x64xi32>, and therefore skips this forwarding opportunity. So I'm curious whether the MLIR community has plans to address this.
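For illustration only, one way such a merged forwarding could materialize the read without going through memory is by composing the four written sub-vectors with vector.insert_strided_slice. This is my own sketch of a hypothetical rewrite, not an existing MLIR transform; the %empty initializer and the offset mapping are assumptions derived from the write indices above:

```mlir
// Hypothetical result of forwarding all four writes into the 64x64 read:
// each 32x32 sub-vector is inserted at the offset where it was written.
%empty = arith.constant dense<0> : vector<64x64xi32>
%a = vector.insert_strided_slice %4, %empty {offsets = [0, 0], strides = [1, 1]} : vector<32x32xi32> into vector<64x64xi32>
%b = vector.insert_strided_slice %3, %a {offsets = [0, 32], strides = [1, 1]} : vector<32x32xi32> into vector<64x64xi32>
%c = vector.insert_strided_slice %2, %b {offsets = [32, 0], strides = [1, 1]} : vector<32x32xi32> into vector<64x64xi32>
%d = vector.insert_strided_slice %1, %c {offsets = [32, 32], strides = [1, 1]} : vector<32x32xi32> into vector<64x64xi32>
// uses of %5 would then use %d instead
```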
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs