Thanks for bringing this up again.

A few suggestions to make it more general:
1. Add a parameter to `tir.vscale` to state the minimal assumed vector length.  
For AArch64 SVE it will be 128 (bits), but some other non-SVE architecture can 
provide a different value (for example, via a target hook).  That way more 
targets can take advantage of it.
2. The special case of `lanes == -1` in Ramp does not easily extend to multiple 
parameters, but there are ways to handle it (see below).
3. If you plan to include predication eventually, that would be something that 
a lot of targets could use.  The LLVM intrinsics for predicated operations do 
not explicitly require SVE; they can be used with fixed-size vectors as well.
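
To make the first suggestion concrete, here is a rough sketch of how a minimal assumed vector length would translate into lane counts. It is plain Python, not TVM code; the helper names `min_lanes` and `runtime_lanes` are hypothetical, and it assumes the actual lane count is the minimum lane count multiplied by the runtime `vscale`.

```python
def min_lanes(min_vector_bits: int, element_bits: int) -> int:
    """Lanes guaranteed to fit in the minimal assumed vector length."""
    if element_bits <= 0 or min_vector_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the vector length")
    return min_vector_bits // element_bits


def runtime_lanes(min_vector_bits: int, element_bits: int, vscale: int) -> int:
    """Lanes available on hardware whose vectors are `vscale` times the minimum."""
    if vscale < 1:
        raise ValueError("vscale must be at least 1")
    return vscale * min_lanes(min_vector_bits, element_bits)


# AArch64 SVE: the minimal vector length is 128 bits.
sve_min = min_lanes(128, 32)               # 32-bit elements
sve_256 = runtime_lanes(128, 32, vscale=2)  # a 256-bit SVE implementation

# A hypothetical non-SVE target reporting a 256-bit minimum via a target hook.
other_min = min_lanes(256, 32)
```

The point of the parameter is that only the first argument changes per target; the rest of the reasoning stays the same.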

For dealing with unknown vector lengths while still allowing a specific 
length per use site, we could either:
1. Require that if Ramp/Broadcast has `lanes == -1`, then the `base`/`value` 
member must be a TIR intrinsic specifying the vscale for the value.  E.g. 
`Ramp(tir.vscale(128, base), stride, -1)` or `Broadcast(tir.vscale(256, value), 
-1)`.
2. Extend Ramp and Broadcast to take `lanes` as `PrimExpr`, with restrictions 
on what that expression can contain.
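
As a rough illustration of how the two options differ, the sketch below models them with plain Python dataclasses. None of this is TVM's API: `VScale`, `Mul`, `Ramp`, and the two checker functions are hypothetical stand-ins, and the restriction used for option 2 (a positive constant multiplied by a vscale) is only one assumed example of what "restrictions" could mean.

```python
from dataclasses import dataclass

SCALABLE = -1  # sentinel for "scalable" lanes, as in the RFC


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class VScale:
    """Stand-in for a vscale intrinsic with a minimal vector length in bits."""
    min_bits: int
    value: object = None  # wrapped base/value, used by option 1 only


@dataclass(frozen=True)
class Mul:
    a: object
    b: object


@dataclass(frozen=True)
class Ramp:
    base: object
    stride: object
    lanes: object  # int in option 1; int or expression in option 2


def check_option1(ramp: Ramp) -> bool:
    """Option 1: lanes == -1 is valid only if base is a vscale intrinsic."""
    if ramp.lanes == SCALABLE:
        return isinstance(ramp.base, VScale)
    return isinstance(ramp.lanes, int) and ramp.lanes > 0


def check_option2(ramp: Ramp) -> bool:
    """Option 2: lanes is a positive int or (assumed) `const * vscale`."""
    lanes = ramp.lanes
    if isinstance(lanes, int):
        return lanes > 0
    return (
        isinstance(lanes, Mul)
        and isinstance(lanes.a, int)
        and lanes.a > 0
        and isinstance(lanes.b, VScale)
    )


opt1 = Ramp(VScale(128, Var("base")), 1, SCALABLE)
opt2 = Ramp(Var("base"), 1, Mul(4, VScale(128)))
```

Option 1 keeps `lanes` an integer and moves the length information into the operand; option 2 leaves the operand alone and puts the information where the lane count already lives, at the cost of validating an expression.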

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/104#issuecomment-1705543104