Re: [apache/tvm-rfcs] [RFC] Scalable vectors in TIR (PR #104)

Elen Kalda Wed, 13 Sep 2023 09:50:35 -0700

Thanks for your comments @kparzysz-quic! Some clarifying questions and 
thoughts:


> Add a parameter to tir.vscale to state the minimal assumed vector length. For 
> AArch64 SVE it will be 128 (bits), but some other non-SVE architecture can 
> provide a different value (via a target hook, or something like that).

Happy to include it, but I'd like to better understand the value it would add. 
AFAIK `llvm::vscale` does not have a minimum vector length associated with it; 
that length is encoded in the "multiplier", e.g. in
```
%wide.load11 = load <vscale x 4 x i32>, ptr %12
```
the 4 represents min_vector_length / size_of_the_data_type (here 128 / 32). If 
we follow that philosophy and mimic LLVM's `vscale` in TIR, then it will be the 
responsibility of the author of the target-specific schedule to set that 
multiplier correctly. It would be different if we opted for something like 
`vfactor` instead of `vscale` (as originally proposed in the RFC), since 
`vfactor` would essentially represent the number of elements in a vector, which 
would depend on the minimum length. 
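
To make the arithmetic concrete, here is a minimal Python sketch of how I think 
about the multiplier. The function names are made up for illustration and are 
not part of TVM or LLVM; the only inputs are the minimum vector length and the 
element width in bits.

```python
def vscale_multiplier(min_vector_bits: int, element_bits: int) -> int:
    """Elements in one minimum-length vector, i.e. the N in <vscale x N x iM>."""
    if element_bits <= 0 or min_vector_bits % element_bits != 0:
        raise ValueError("minimum vector length must be a multiple of the element width")
    return min_vector_bits // element_bits


def runtime_lanes(vscale: int, multiplier: int) -> int:
    """Lanes actually processed once the hardware's vscale is known."""
    return vscale * multiplier


# SVE with a 128-bit minimum and i32 elements gives the 4 in <vscale x 4 x i32>.
multiplier = vscale_multiplier(128, 32)
print(multiplier)

# On hypothetical 512-bit hardware vscale would be 512 / 128 = 4.
print(runtime_lanes(4, multiplier))
```

With `vscale` the schedule author picks the multiplier per data type, whereas a 
`vfactor`-style value would already be the product of the two.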

I'm mostly looking at it from the point of view of SVE, so I'm interested to 
learn if there is a case for it in other scalable architecture extensions. 

> If you plan to include predication eventually, that would be something that a 
> lot of targets could use. The LLVM intrinsics for predicated operations do 
> not explicitly require SVE, they can be used with fixed-sized vectors as well.

Agreed! This might require its own mini-RFC. 

> For dealing with unknown vector lengths and simultaneously allowing 
> specific lengths per use-site we could either
> 1. Require that if Ramp/Broadcast has lanes == -1, then the base/value member 
> must be a TIR intrinsic specifying the vscale for the value. E.g. 
> Ramp(tir.vscale(128, base), stride, -1) or Broadcast(tir.vscale(256, value), 
> -1).
> 2. Extend Ramp and Broadcast to take lanes as PrimExpr, with restrictions on 
> what that expression can contain.

Option 2 is what we propose in this RFC. From some prototyping experience, it 
would let us use all the current infrastructure for vectors in TVM, and the LLVM 
codegen pretty much "just works", with roughly 10 lines to map `tir.vscale` to 
`llvm::vscale` (that applies to simple consecutive loads and stores; it's a bit 
more complex for things like ramps with stride != 1). I'm not in favour of 
exposing -1 to the user in any form, e.g. from TVMScript or just from printing 
TIR, since it is not a particularly intuitive interface. The only reason for -1 
is the DLPack standard, for which we need a way to express scalable vectors. 
Another idea to handle this would be to add a new field to `DLDataType`, e.g. 
`bool is_scalable`, but I'm not sure how feasible changing that standard is.
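
As a rough illustration of what "lanes as an expression" means for Option 2, 
here is a toy Python model. It is not TVM code: `ScalableLanes` and 
`expand_ramp` are hypothetical names, and the lanes expression is restricted to 
the simple form `multiplier * vscale`. It only shows that a ramp's elements are 
fully determined once `vscale` is known, without a -1 sentinel appearing anywhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScalableLanes:
    """Hypothetical stand-in for a lanes expression of the form multiplier * vscale."""
    multiplier: int

    def resolve(self, vscale: int) -> int:
        # The concrete lane count is only known once vscale is fixed by the hardware.
        return self.multiplier * vscale


def expand_ramp(base: int, stride: int, lanes: ScalableLanes, vscale: int) -> List[int]:
    """Element i of a ramp is base + i * stride, for i in [0, lanes)."""
    return [base + i * stride for i in range(lanes.resolve(vscale))]


# Same ramp, two hardware vector lengths (vscale = 1 and vscale = 2).
print(expand_ramp(0, 1, ScalableLanes(4), vscale=1))
print(expand_ramp(0, 1, ScalableLanes(4), vscale=2))
```

The printed form of such a ramp can then show `4 * vscale` directly, which reads 
better than a bare -1.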

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/104#issuecomment-1717982991

Message ID: <apache/tvm-rfcs/pull/104/c1717982...@github.com>