[The tracking issue](https://github.com/apache/tvm/issues/16455)
--
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/104#issuecomment-1906190519
You are receiving this because you are subscribed to this thread.
Message ID:
Merged #104 into main.
Thanks @tqchen, good point! I updated the Future Possibilities section with
some ideas for enabling scalable vector support in the meta schedule.
Thanks for working through this. One final comment, on `Exposing scalable
vectors to tuning`. Let us discuss it through
[MetaSchedule](https://github.com/apache/tvm-rfcs/blob/main/rfcs/0005-meta-schedule-autotensorir.md)
as that is a more synergistic approach to tuning moving forward and also works
Thanks everyone for all the good discussion so far! ❤️ We've had this RFC
public for over 4 months now and the prototype up for a few weeks, and from what I
can see, there are currently no outstanding issues here - hence we'd like to
proceed with merging this RFC next week. I'll then create a trac
> if predication is involved, maybe we can explicitly do A.store(...)? where
> predicate can be a kwarg
Thanks @tqchen for the good suggestion, I included it into the RFC text (as an
extension to `vload` and `vstore`).
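To make the intended semantics of the predicate kwarg concrete, here is a plain-Python model of a predicated (masked) vector store. The `vstore` function below is a hypothetical sketch for illustration only, not the TVM API:

```python
# Illustrative model of a predicated (masked) vector store, assuming a
# hypothetical `vstore(buffer, start, values, predicate=...)` extension.
def vstore(buffer, start, values, predicate=None):
    """Store `values` into buffer[start:start+len(values)], writing lane i
    only where predicate[i] is True (all lanes if predicate is None)."""
    if predicate is None:
        predicate = [True] * len(values)
    for i, (v, p) in enumerate(zip(values, predicate)):
        if p:
            buffer[start + i] = v

buf = [0] * 8
vstore(buf, 0, [1, 2, 3, 4], predicate=[True, False, True, False])
# buf is now [1, 0, 3, 0, 0, 0, 0, 0]
```

Inactive lanes leave the underlying memory untouched, which is what makes tail predication a drop-in replacement for a scalar cleanup loop.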
I also included a note about the "-8" decision regarding `runtime::DataT
if predication is involved, maybe we can explicitly do A.store(...)? where
predicate can be a kwarg
A change that has not yet been included in the prototype was the predicate
representation on buffer loads/stores in TVMScript programs. This was briefly
referenced in the RFC:
https://github.com/apache/tvm-rfcs/pull/104/files#diff-6724c2a24eb34f7094b4ff2e8562f7812e6e22c8197f51792f4b5cdfa811fec4R
Happy new year everyone! 🎉 Here's the SVE prototype, as promised -
https://github.com/apache/tvm/pull/16347. It's made by @lhutton1, @neildhickey
and me.
@tqchen @cbalint13 @Lunderberg @kparzysz-quic et al please have a look!
> I'm also not sure how this would interoperate with the DLDataType dependent
> runtime implementation (but I also don't know the runtime implementation very
> well).
Given SVE is only a compile-time concept, we likely don't need a DLDataType
counterpart if we remove the runtime data type from the
@cbalint13 @tqchen Thank you for your input! This thread has been dormant for
a bit, but we're still on it!
> A comprehensive presentation on SVE design, both on RISC-V and Arm, from the
> perspective of LLVM.
The presentation captures all the design details of the SVE rationale in LLVM
including ar
Just to circle back here a bit: the main root issue is that we are using
`runtime::DataType`, which is supposed to be concrete throughout the TIR nodes.
This places restrictions on what we can normally represent. A more
comprehensive update would change the PrimExpr's field to also be an object, as
FYI,
@ekalda , @lhutton1 , @tqchen
A comprehensive presentation on SVE design, both on RISC-V and Arm, from the
perspective of LLVM.
The presentation captures all the design details of the SVE rationale in LLVM,
including arch comparisons.
* https://youtu.be/-ox8iJmbp0c?feature=shared (Vector Codegen
Regarding the changes required to support scalability in the data type, I've
been prototyping adding a new `scalable_` attribute to `DataType` that wraps
`DLDataType`.
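A minimal model of that wrapper idea might look like the following. This is a hypothetical sketch, not the prototype's actual code; the `scalable_` name and the field layout are assumptions based on the description above:

```python
# Hypothetical sketch: wrap a DLDataType-like struct with an extra
# `scalable_` attribute, leaving the existing lanes field untouched so
# the lanes value keeps its usual meaning of "number of lanes".
from dataclasses import dataclass

@dataclass
class DLDataTypeModel:
    code: int   # DLPack type code (e.g. 2 for float)
    bits: int
    lanes: int

@dataclass
class DataTypeModel:
    dtype: DLDataTypeModel
    scalable_: bool = False  # True => lanes are per vscale unit

    @property
    def lanes(self):
        return self.dtype.lanes

# float32 x (vscale x 4) lanes
dt = DataTypeModel(DLDataTypeModel(code=2, bits=32, lanes=4), scalable_=True)
```

The appeal of this layout is that `lanes` stays non-negative, at the cost of the extra flag not being representable inside a plain `DLDataType` crossing the FFI boundary.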
However, I've run into what I believe is an issue when accessing data types at
compile time across the FFI boundary between pyt
Thanks @ekalda for the nice work on the proposal, permit a few personal points
of view supporting the initiative:
### Pros
> 1. When eyeballing the TIR, the meaning of the `vscale` intrinsic is
> intuitive since it's matches LLVM
> 2. It makes translating the expressions involving `vscale` that e
I think there's a confusion about the difference between what we have referred
to as `vscale` and `vfactor`. I'll try to summarise the difference and the
respective pros and cons.
For reference, this is how LLVM represents vectors (copied from the
[documentation](https://llvm.org/docs/Lang
I think assuming a single vector width (vscale) and using `kScalableVectorMark=-1`
to mark it would be a good tradeoff, given it may not be that useful to create
vectors with multiple vector widths anyway for optimization reasons.
If we want to go beyond a single symbolic variable, having some expli
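The single-symbolic-vscale encoding can be sketched in plain Python. The encoding below (negative `lanes` meaning "this many lanes per vscale unit") is a hypothetical illustration of the tradeoff being discussed, not the actual TVM implementation:

```python
# Sketch: reuse the `lanes` field of a DataType-like structure to mark
# scalable vectors. Positive lanes => fixed-length vector; negative
# lanes => scalable, with -lanes lanes per vscale unit. Hypothetical.
from dataclasses import dataclass

@dataclass
class DataTypeModel:
    code: str   # e.g. "float"
    bits: int
    lanes: int  # > 0: fixed lanes; < 0: scalable

    def is_scalable(self):
        return self.lanes < 0

    def __str__(self):
        if self.is_scalable():
            return f"{self.code}{self.bits}xvscalex{-self.lanes}"
        if self.lanes > 1:
            return f"{self.code}{self.bits}x{self.lanes}"
        return f"{self.code}{self.bits}"

print(DataTypeModel("float", 32, 4))    # float32x4
print(DataTypeModel("float", 32, -4))   # float32xvscalex4
```

Because there is only one symbolic vscale, the sign bit is enough to distinguish scalable from fixed vectors without touching the `DLDataType` layout; going beyond a single symbolic variable would need something more explicit.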
Regarding changing the `DLDataType`, I can see how it could have a widely
disruptive impact. Scalable vectors are here to stay though, so could this be a
way to future-proof the `DLPack` standard? 🤷♀️
One of the main problems we have with using -1 to denote scalable vectors is
that it doesn't capture
> I guess we could pass an argument to the vectorizer whether to generate
> SVE-friendly code. If this is limited to emitting additional TIR builtins,
> then I'm ok with that. I just want to be able to reuse as much of the
> vectorization code as possible between SVE and non-SVE targets.
@kparz
Agreeing with @kparzysz-quic, changes that update the `DLDataType` would need
to be approached very cautiously. I usually lean toward allowing short-term
breakages if they lead to better long-term code health, but updating the
`DLDataType` would be very wide-reaching even for my tastes.
One
> Is there any technical reason blocking us from extending DLDataType to have a
> `is_scalable` vector field, allowing us to maintain the meaning of the lanes
> field to represent the number of lanes?
DLDataType comes from dlpack not TVM. Changing it may affect the ABI of any
function acceptin
> Another idea to handle this would be to add a new field to DLDataType, e.g.
> bool is_scalable, but I'm not sure how feasible changing that standard is.
I feel extending DLDataType to represent scalable vectors explicitly would be a
more robust design than depending on interpreting -1 in a spe
I guess we could pass an argument to the vectorizer whether to generate
SVE-friendly code. If this is limited to emitting additional TIR builtins,
then I'm ok with that. I just want to be able to reuse as much of the
vectorization code as possible between SVE and non-SVE targets.
As far as pr
> What I'm aiming at is to be able to lower the TIR to a generic CPU, that is
> to an architecture that does not support SVE. The TIR will need to have some
> default lowering in CodeGenLLVM/CodeGenCPU, so being able to do that is
> important. For that, we should be able to assume that vscale is
> Could it instead be in a target-dependent lowering pass?
Sure. My idea is to have a single SVE-aware vectorization pass in TVM, and
then be able to utilize it for all targets. I'm particularly interested in
predication. How the codegen is done doesn't matter much.
> What I'm aiming at is to be able to lower the TIR to a generic CPU, that is
> to an architecture that does not support SVE. The TIR will need to have some
> default lowering in CodeGenLLVM/CodeGenCPU, so being able to do that is
> important.
Could it instead be in a target-dependent lowering
Sorry for the delay... What I'm aiming at is to be able to lower the TIR to a
generic CPU, that is to an architecture that does not support SVE. The TIR
will need to have some default lowering in CodeGenLLVM/CodeGenCPU, so being
able to do that is important. For that, we should be able to ass
I'm back from holiday and want to get this RFC moving again! Thanks for all the
good discussion so far, I've made some changes to the RFC:
* Use `vscale` directly instead of `vfactor`, and use a TIR intrinsic to
represent `vscale` instead of introducing a new node
* Opt for predication instead of clea
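The combination of a `vscale`-based loop extent and predication can be illustrated with a plain-Python model of the lowered loop. This is a hypothetical sketch of the lowering idea, not TVM's actual output; `vscale` and `min_lanes` values are made up for the example:

```python
# Plain-Python model of a loop vectorized with a scalable vector length
# and tail predication. With predication there is no separate scalar
# cleanup loop: the final iteration masks off the out-of-bounds lanes.
def saxpy(a, x, y, n, vscale=2, min_lanes=4):
    vl = vscale * min_lanes              # runtime vector length in lanes
    for base in range(0, n, vl):
        # predicate: lane i is active iff base + i < n
        pred = [base + i < n for i in range(vl)]
        for i in range(vl):
            if pred[i]:
                y[base + i] += a * x[base + i]

x = list(range(10))
y = [0.0] * 10
saxpy(2.0, x, y, 10)   # n=10 is not a multiple of vl=8: tail is predicated
```

The same loop structure works for any value of `vscale`, which is exactly what makes the generated code vector-length agnostic.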
Thanks for your comments @kparzysz-quic! Some clarifying questions and
thoughts:
> Add a parameter to tir.vscale to state the minimal assumed vector length. For
> AArch64 SVE it will be 128 (bits), but some other non-SVE architecture can
> provide a different value (via a target hook, or somet
Thanks for bringing this up again.
A few suggestions to make it more general:
1. Add a parameter to `tir.vscale` to state the minimal assumed vector length.
For AArch64 SVE it will be 128 (bits), but some other non-SVE architecture can
provide a different value (via a target hook, or something
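The relationship between `vscale` and the minimal assumed vector length can be sketched as follows. The function and parameter names are hypothetical, modeling the suggestion of a target-configurable granule:

```python
# Sketch: vscale is the hardware vector length divided by the minimal
# assumed vector length (the granule). For AArch64 SVE the granule is
# 128 bits, so a 512-bit implementation runs with vscale = 4.
def compute_vscale(hw_vector_bits, min_vector_bits=128):
    assert hw_vector_bits % min_vector_bits == 0, "VL must be a multiple"
    return hw_vector_bits // min_vector_bits

print(compute_vscale(128))   # 1 (minimal SVE implementation)
print(compute_vscale(512))   # 4 (e.g. a 512-bit SVE machine)
```

A non-SVE target could supply its own granule via a target hook, so the same symbolic `vscale` machinery degenerates cleanly to fixed-width vectors (vscale = 1).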
@tqchen Thanks for elaborating on the GPU programming model, I see the
parallels between programming for variable number of threads and vectors with
unknown lengths. The S1 option looks quite similar to what is described in this
RFC, except it uses scoping instead of marking the variable with
`T.
BTW, after writing it down, we can find that perhaps it is not necessary (for
S1) to explicitly introduce a special vscale. Another approach is that we can
mark an SVE scope, and use a normal tvm variable `n` to mark the sve extent.
```python
# note vscale = n
n = T.let(call(tvm.builtin.vscale()
```
It might be useful to also bring some discussions to the forums. Here is a
quick related sketch of GPU-related models:
```python
for y in range(64):
    for x in range(64):
        C[y, x] = A[y, x] * (B[y] + 1)
```
Say we are interested in the original program. In normal GPU programming
terminology, we w
Thanks for your comments @tqchen, much appreciated! I want to ask some
clarifications and expand on some of the points you made, based on my
understanding.
TL;DR:
- We need to be able to express `vscale` dependent `extent`s in the TIR `For`
nodes
- Aside from predication, SVE vectors are not muc
Some quick comments
- I think we should use TIR intrinsics (as opposed to a new node, which would
add extra burden in the IR)
- In general, it might be useful to know the information that a value is
multiple of something (e.g. 128), so having something like `x * 128` might help
- I would still
Tagging some people who have been involved in related discussions before:
@tqchen @kparzysz-quic @masahi
This RFC is to add support for vector-length-agnostic programming in the TVM stack.
You can view, comment on, or merge this pull request online at:
https://github.com/apache/tvm-rfcs/pull/104
-- Commit Summary --
* [RFC] Scalable vectors in TIR
-- File Changes --
A rfcs/0104-scalable-vecto