I think there's some confusion about the difference between what we have referred to as `vscale` and `vfactor`. I'll try to summarise the difference and the respective pros and cons.

For reference, this is how LLVM represents vectors (copied from the [documentation](https://llvm.org/docs/LangRef.html#vector-type)):

```
< <# elements> x <elementtype> >          ; Fixed-length vector
< vscale x <# elements> x <elementtype> > ; Scalable vector
```

A concrete example of a scalable vector:

```
<vscale x 4 x float>
```

or

```
<vscale x 16 x i8>
```

To construct these vectors we need to know the minimum vector length (SVE's 128 bits in these examples) and the size of the data type of the vector elements (32 bits or 8 bits in these examples).

## Vscale

This would mirror LLVM's `vscale` intrinsic, so if we had a TIR intrinsic with the same meaning, a TVM vector of floats that would exactly map to a hardware vector would look like

```
ramp(base, stride, 4 * vscale) # or vscale(4) depending on which UI we want to go for
```

### Pros
1. When eyeballing the TIR, the meaning of the `vscale` intrinsic is intuitive since it matches LLVM
2. It makes translating the expressions involving `vscale` that exist outside of the vectors very easy in codegen, since we just have to map `tir.vscale` -> `llvm.vscale`
3. Since we can pull the information about the vector element data type from the ramp node, we can deduce the minimum vector length from the multiplier
4. Makes it simpler to support arbitrarily long vectors\*

### Cons
1. Representing `lanes` in the runtime data type is very awkward (see the comments above)
2. It's harder to place restrictions on what `ramp->lanes` can be, so it can accidentally get set to something nonsensical. This could be alleviated by using `vscale(4)` though, as recommended by @kparzysz-quic

## Vfactor

This was proposed in the first version of this RFC. A TVM vector that would map to a hardware vector would be:

```
ramp(base, stride, vfactor)
```

In this case the constant is implicitly absorbed into `vfactor` and will be deduced during codegen.
The minimum vector length should be known to the backend-specific codegen, and the data type size can be pulled from the data type of the elements in the vector.

### Pros
1. Simpler to use in scheduling: you don't have to worry about data type size and minimum vector length
2. Less visual clutter
3. Easier to create a robust implementation since we can enforce that if `lanes` of the ramp is not `int`, it is `vfactor` (unless we go into the territory of arbitrarily long vectors\*)
4. `DLDataType` representation is less of an issue, we can just go for -1

### Cons
1. We don't know the implicit data type of `vfactor` that is outside of the vector (this is a big problem)

## \*The arbitrarily long vectors

This is the "vectors with multiple vector width" that @tqchen mentioned. It refers to there being no restrictions on the length of TIR vectors, and subsequently LLVM vectors, in TVM. I've seen things like

```
<1024 x float>
```

coming out of TVM's codegen. I've always wondered if this is a feature or a (mostly harmless) side effect. LLVM itself deals with it by breaking these vectors down into a sequence of vector instructions that match the hardware length. SVE support in LLVM can also do that for SVE vectors, so in theory we could create vectors like

```
<vscale x 512 x float>
```

So the question is whether we want to support creating these vectors in TVM. If we do, the `vscale` approach would be more appropriate. I agree, though, that it is probably not particularly useful. So it depends on how much we care about feature parity between the vector types.
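
To make the arithmetic behind both options concrete, here is a minimal Python sketch. It is plain Python, not TVM or LLVM API: the helper names `min_lanes`, `llvm_scalable_type` and `hardware_vectors_needed` are made up for illustration, and the 128-bit minimum vector length is the SVE value assumed in the examples above. It shows how the `vscale` multiplier follows from the element size (which is what codegen would have to deduce for `vfactor`), and how many hardware vectors an "arbitrarily long" scalable vector would span.

```python
# Assumption: SVE's minimum vector length, as used in the examples above.
SVE_MIN_VECTOR_BITS = 128


def min_lanes(element_bits: int, min_vector_bits: int = SVE_MIN_VECTOR_BITS) -> int:
    """Multiplier of vscale for a vector that fills exactly one hardware vector."""
    if element_bits <= 0 or min_vector_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the minimum vector length")
    return min_vector_bits // element_bits


def llvm_scalable_type(multiplier: int, element_type: str) -> str:
    """Spell out the LLVM scalable vector type for a given multiplier."""
    return f"<vscale x {multiplier} x {element_type}>"


def hardware_vectors_needed(
    multiplier: int, element_bits: int, min_vector_bits: int = SVE_MIN_VECTOR_BITS
) -> int:
    """Number of hardware vectors a <vscale x multiplier x T> value spans.

    Both the value and the hardware vector grow by the same vscale factor,
    so the count does not depend on the runtime value of vscale.
    """
    total_bits = multiplier * element_bits
    if total_bits % min_vector_bits != 0:
        raise ValueError("vector does not split into whole hardware vectors")
    return total_bits // min_vector_bits


if __name__ == "__main__":
    print(llvm_scalable_type(min_lanes(32), "float"))
    print(llvm_scalable_type(min_lanes(8), "i8"))
    print(hardware_vectors_needed(512, 32))
```

With `vscale`, the user writes the result of `min_lanes` explicitly (e.g. `4 * vscale`); with `vfactor`, the same division would happen implicitly inside the backend-specific codegen.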