We discussed this at the TVM Community Meeting this morning. There was a
presentation about the approach followed by some discussion. Thanks @MJKlaiber
@cgerum @SebastianBoblestETAS @paulpb @PhilippvK @r.stahl @aca88 for bringing
this to the meeting!
Here are some notes (please feel free to correct them if I got anything wrong!):
- The current graph partitioning approach is the same one used in the
compiler today. It's compatible with the Collage partitioning work, which is in
progress and not yet RFC'd.
- Would the v1 support Tensor Expression (TE), or are we skipping that?
- Mikael understands that CreatePrimFunc can handle TE, so it should be
natively supported.
- Paolo: use the standard lowering flow, as is done for Ethos-U.
- The proposal makes an explicit distinction between S-TIR and NS-TIR. Would
there be different hooks? e.g. here we can register TIR scheduling passes vs.
TIR passes.
- Will it be possible to contribute S-TIR back to the compiler or just NS-TIR?
- Scheduling passes work on S-TIR; the passes in the boxes behind the
schedules are injected into the lowering via the pass context. These passes do
not return S-TIR; they are part of the lowering from S-TIR to NS-TIR. At the
moment, this means calling tvm.lower() and injecting those passes into tvm.lower().
- In the Relay-to-TIR hook, we are already trying to figure out the lowering
order, which might not match the partitioning order. We want to see the memory
available after compiling the C functions but before lowering the Ethos-U
functions. Any thoughts on whether it's possible to configure the order of
partitioning in this flow?
- Why? We need to see the amount of live memory available after running the
default TVM flow.
- Relay passes can see the whole IRModule; past that point, a TIR pass sees
only the functions for a particular target.
- The order needs to be decided and it varies by registration point.
- Q: Are there common accelerator passes that are in use in TVM, or does
everyone do something different?
- There are common touch points; those are the "plumbing" mentioned in the
slides, e.g. graph partitioning, scheduling, and code generation.
- UMA isn't trying to box anyone into a particular flow; instead, it suggests
one way of doing this from a broader set of options, to serve as a guide for
folks who may be new to TVM.
- Question from Federico, who is integrating an accelerator of his own.
- VTA uses memory scopes to define buffers in block-ram. Are we planning to
accommodate that in UMA?
- You could write your own schedules and passes to do this; storage_scope
is roughly the way to do this at the runtime level. You can also leverage USMP
to define memory pools and use it as a pass to schedule them.
---
[Visit
Topic](https://discuss.tvm.apache.org/t/rfc-uma-universal-modular-accelerator-interface/12039/13)
to respond.