Issue 163882
Summary [DirectX] Mismatched data layout causing validation errors about exceeding TGSM storage
Labels new issue
Assignees
Reporter Icohedron
99 DML shaders are [failing to validate](https://microsoft.visualstudio.com/WindowsAI/_build/results?buildId=131899824&view=ms.vss-test-web.build-test-results-tab) after #163587 with the error:
```
error: Total Thread Group Shared Memory storage is 43688, exceeded 32768.
Validation failed.
```

All 99 failing DML shaders have names of the form `QuantizedGemm*` (e.g., `QuantizedGemm_20480_16_0_uint4_packed32_float16_native_accum32_0`).

## Minimal reproducible test case
```hlsl
// compile args: -T cs_6_7 -E CSMain -enable-16bit-types -Fo output.dat
groupshared float16_t smem[10240];
[numthreads(1, 1, 1)] 
void CSMain() {
  smem[0] = 0;
}
```
A comparison of the DXIL output before and after the PR commit c87e0e8fe0ea14dcd84e835c0f7b02c5b0edca70 shows that the only difference, apart from the Clang version string, is the data layout.
```
1c1
< target datalayout = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"
---
> target datalayout = "e-m:e-p:32:32-i1:8-i8:8-i16:32-i32:32-i64:64-f16:32-f32:32-f64:64-n8:16:32:64"
270c270
< !1 = !{!"clang version 22.0.0git ([email protected]:Icohedron/llvm-project.git 72c6e4b230ddb5ca85361e145e177245319b271e)"}
---
> !1 = !{!"clang version 22.0.0git ([email protected]:Icohedron/llvm-project.git c87e0e8fe0ea14dcd84e835c0f7b02c5b0edca70)"}
 ```
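
To make the change easier to read, the two layout strings from the diff can be compared field by field. The following is only a small sketch: the helper names `parse_datalayout` and `diff_datalayout` are made up for illustration, and the parsing simply splits on `-` and the first `:` rather than implementing the full LLVM data layout grammar.

```python
def parse_datalayout(layout: str) -> dict[str, str]:
    """Map each spec's leading token (e.g. 'f16') to the rest (e.g. '16').

    Hypothetical helper: a naive split, not a full data layout parser.
    """
    fields = {}
    for spec in layout.split("-"):
        key, _, value = spec.partition(":")
        fields[key] = value
    return fields


def diff_datalayout(a: str, b: str) -> dict[str, tuple]:
    """Return {field: (value_in_a, value_in_b)} for every field that differs."""
    fa, fb = parse_datalayout(a), parse_datalayout(b)
    return {
        key: (fa.get(key), fb.get(key))
        for key in sorted(fa.keys() | fb.keys())
        if fa.get(key) != fb.get(key)
    }


# Layout strings copied from the diff above.
before = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"
after = "e-m:e-p:32:32-i1:8-i8:8-i16:32-i32:32-i64:64-f16:32-f32:32-f64:64-n8:16:32:64"

changed = diff_datalayout(before, after)
for field, (old, new) in changed.items():
    print(f"{field}: {old} -> {new}")
```

Only three fields differ: `i1` (32 to 8), `i16` (16 to 32), and `f16` (16 to 32). The two 16-bit entries are the ones relevant to a `float16_t` array.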

 When the same DML shader is compiled with DXC, the resulting module has a data layout of
 ```
 target datalayout = "e-m:e-p:32:32-i1:32-i8:8-i16:16-i32:32-i64:64-f16:16-f32:32-f64:64-n8:16:32:64"
 ```
 which matches the data layout that Clang emitted before the PR.
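
A likely explanation for the size increase, sketched below, is that the `f16:32` entry raises the ABI alignment of `half` to 4 bytes. This assumes the validator sizes the array the way LLVM's `DataLayout` computes allocation size (store size rounded up to the ABI alignment); that is an assumption about the validator, not something confirmed in this issue. The function names and the `TGSM_LIMIT` constant are illustrative, with the limit taken from the error message.

```python
TGSM_LIMIT = 32768  # bytes, from the validation error message


def align_to(size: int, align: int) -> int:
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def array_alloc_bytes(elem_bits: int, abi_align_bits: int, count: int) -> int:
    """Bytes occupied by `count` elements, assuming each element's stride is
    its store size rounded up to its ABI alignment (the LLVM alloc-size rule)."""
    store_bytes = (elem_bits + 7) // 8
    stride = align_to(store_bytes, abi_align_bits // 8)
    return stride * count


# groupshared float16_t smem[10240] from the minimal test case.
size_f16_16 = array_alloc_bytes(16, 16, 10240)  # old layout, f16:16
size_f16_32 = array_alloc_bytes(16, 32, 10240)  # new layout, f16:32

print(size_f16_16, size_f16_16 <= TGSM_LIMIT)
print(size_f16_32, size_f16_32 <= TGSM_LIMIT)
```

Under this assumption the array in the minimal test case takes 20480 bytes with `f16:16`, which fits within the limit, and 40960 bytes with `f16:32`, which does not.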