@cbalint13 @tqchen Thank you for your input! This thread has been dormant for a bit, but we're still on it!
> A comprehensive presentation on SVE design booth on RISCV and ARM from > perspective of LLVM. The presentation captures all the design details of the SVE rationale in LLVM including arch comparisions. https://youtu.be/-ox8iJmbp0c?feature=shared (Vector Codegen / Luke Lau) Thanks for sharing this, a really nice presentation! I'm trying to think how RVV's features will align with this RFC proposal... I think LLVM can be a good source of inspiration there :) Based on my (quite basic) understanding of RVV, there are two features that need consideration: **1. Addressing several vectors at once (`LMUL`)** They have resolved it in LLVM by encoding the `LMUL` value into the multiplier of `vscale`. Since this proposal follows the LLVM convection in expressing the scalable length, it can easily be adopted in TVM. As it currently stands, it will be up to the schedule author to do the maths and figure out the correct multiplier. It would be even easier if we implemented both `vscale` and `vfactor`...* **2. Predication** If I understood it correctly, there are two ways of setting the active lanes: 1. By providing a bitmask as a predicate to the operation - I'd expect LLVM RVV backend supports `llvm.get_active_lane_mask` for that purpose, so for this case the current proposal should work 2. By setting the `VL` register to the number of active lanes - I suppose that's the feature @cbalint13 you mentioned in your previous comment? I can think of few options there: 1. If it is more like a status register that will apply to several instructions, we can use pragmas/TensorIR block attributes 2. If it is not an expensive operation, we can add an optional argument to ramps and broadcasts to indicate the active lanes 3. The RISC-V backend in TVM should have all the information to translate `tir.get_active_lane_mask` to appropriate LLVM intrinsics that set the VL register. ### * ... implement both `vscale` and `vfactor` > Why not have both `vfactor` (abstract) concept along with `vscale` (real), > where the `vfactor` would be a "virtual" teller of how a single true type > `vscale` ramps ? This make the "implicit data type to be know" on one hand, > and also would be expressive enough for "vectors with multiple vector width". Sorry I missed this before! That's a good point, I think there would be benefits in having both of these available. It would certainly make expressing multiples of vector length simpler, e.g. ``` Ramp(base, 1, 2 * vfactor) ``` would imply `LMUL = 2`. If we want to keep `vfactor` as a user facing convenience function, we could do the translation from `vfactor` to `n * vscale` in the Ramp constructor so we wouldn't need to teach the TVM internal passes how to deal with it. --- @tqchen > Just to circle back here a bit. the main root issue is that we are using > runtime::DataType, which is supposely being concrete through out the TIR node. > This places restrictions on what we can normally represent. A more > comprehensive update would change the PrimExpr's field to also an object, as > per StructInfo in the relax. That would requires bit more thinking, which > likely can get around the issues mentioned in the thread(of passing around > runtime::DataType which is not an object). I think I see what you mean and I agree, if we had something of a base type `Object` representing the data type that would give us much more freedom in expressing the compile time data type. I see how this would be a pretty invasive change though, I'm also not sure how this would interoperate with the `DLDataType` dependent runtime implementation (but I also don't know the runtime implementation very well). > I think in short term making the protocol of lanes = -1 and lanes = -8(for > vscale(8)) may not be a bad idea. The main reason is I cannot think of > another possible use of the lanes field other than for the SVE. I'm fine with going with this option (especially if we manage to hide the -8 from the user) as it is probably the least invasive and sturdy option that will allow us to achieve our goals. @lhutton has been prototyping an additional field in `runtime::DataType`, but it's a bit of a can of worms (as per his post above). We intend to upload a draft prototype soon, then you guys will have something more concrete to look at :) -- Reply to this email directly or view it on GitHub: https://github.com/apache/tvm-rfcs/pull/104#issuecomment-1847090278 You are receiving this because you are subscribed to this thread. Message ID: <apache/tvm-rfcs/pull/104/c1847090...@github.com>