Issue 83626
Summary [DXIL] implement dot intrinsic lowering
Labels new issue
Assignees farzonl
Reporter farzonl
There are three parts.

First, to do float/half dot products we need to support three opcodes with varying argument lengths:

- `@dx.op.dot2.f32(i32 54 ...)` - 4 arguments a[0], a[1], b[0], b[1]
- `@dx.op.dot3.f32(i32 55 ...)` - 6 arguments a[0], a[1], a[2], b[0], b[1], b[2]
- `@dx.op.dot4.f32(i32 56 ...)` - 8 arguments a[0], a[1], a[2], a[3], b[0], b[1], b[2], b[3]

For each of these we will need to emit 4 to 8 `extractelement` instructions before we call the intrinsic.
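
As a rough illustration of the flattening described above, the following Python sketch models how the opcode and scalar argument order depend on vector length. It is not LLVM code: `DOT_OPCODES` and `flatten_dot_operands` are hypothetical names, and only the opcode numbers and argument order are taken from the list above.

```python
# Hypothetical model of the flattening step; not the actual LLVM pass.
# Opcode names and numbers are the ones listed in this issue.
DOT_OPCODES = {
    2: ("dx.op.dot2.f32", 54),
    3: ("dx.op.dot3.f32", 55),
    4: ("dx.op.dot4.f32", 56),
}


def flatten_dot_operands(a, b):
    """Return (op name, opcode, scalar operands) for a float dot product.

    `a` and `b` stand in for the vector operands. Each entry models the
    result of one extractelement, so a vector of length n yields 2*n scalars.
    """
    if len(a) != len(b):
        raise ValueError("dot operands must have the same length")
    if len(a) not in DOT_OPCODES:
        raise ValueError("only 2-, 3- and 4-element vectors are modeled")
    name, opcode = DOT_OPCODES[len(a)]
    # All elements of a come first, then all elements of b.
    scalars = list(a) + list(b)
    return name, opcode, scalars


print(flatten_dot_operands(["a0", "a1", "a2"], ["b0", "b1", "b2"]))
```

A three-element input selects `dx.op.dot3.f32` with opcode 55 and six scalar operands, matching the second bullet above.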

Part 1 would be to create a pass that flattens the vectors to scalars into the form shown above.

Part 2 is to modify DXIL.td to represent the DXIL ops.

For good references on behavior see:
- https://godbolt.org/z/TbvshPchs
- [lib/HLSL/HLOperationLower.cpp](https://github.com/microsoft/DirectXShaderCompiler/blob/main/lib/HLSL/HLOperationLower.cpp#L6646C5-L6651C37)

Part 3 is to support integer dot products.

We will need to create a DxilTrinaryOperation, which we don't currently have, and further support the lowering of DXIL::OpCode::UMad and DXIL::OpCode::IMad.

See
- https://godbolt.org/z/srbzrhMbq
- [TranslateIDot](https://github.com/microsoft/DirectXShaderCompiler/blob/main/lib/HLSL/HLOperationLower.cpp#L2451C1-L2467C1)

The format for integer dot product is slightly different. You don't fetch all the vector elements up front; you do it in stages:

1. Extract a[i] and b[i].
2. Multiply a[i] and b[i].
3. Extract a[i+1] and b[i+1].
4. Perform an IMad on (a[i+1], b[i+1], a[i]*b[i]), e.g. `%IMad = call i32 @dx.op.tertiary.i32(i32 48, i32 %4, i32 %5, i32 %3)`.

For each additional vector element, you daisy-chain another pair of extracts and feed the previous IMad result into the next IMad as its accumulator.
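
The staged integer lowering can be modeled in a few lines of Python. This is only a sketch of the arithmetic, not of the IR that would be emitted: `wrap_i32`, `imad` and `int_dot` are hypothetical helper names, and the sketch assumes IMad computes `a * b + c` with signed 32-bit wraparound.

```python
# Hypothetical model of the mul + IMad chain; not the actual lowering code.

def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit value (assumed i32 semantics)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


def imad(a, b, c):
    """Model of IMad: multiply a and b, then add the accumulator c."""
    return wrap_i32(a * b + c)


def int_dot(a, b):
    """Integer dot product built as one multiply followed by n-1 IMad steps."""
    if len(a) != len(b) or not a:
        raise ValueError("operands must be non-empty and of equal length")
    # Stage 1: extract a[0], b[0] and multiply them.
    acc = wrap_i32(a[0] * b[0])
    # Later stages: extract a[i], b[i] and chain an IMad onto the result so far.
    for i in range(1, len(a)):
        acc = imad(a[i], b[i], acc)
    return acc


print(int_dot([1, 2, 3], [4, 5, 6]))
```

For a vector of length n this performs one multiply and n-1 chained IMad steps, which is the daisy chain described above. An unsigned variant using UMad would follow the same shape with unsigned wraparound.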