We discussed this at the TVM Community Meeting this morning. There was a presentation about the approach, followed by some discussion. Thanks @MJKlaiber
@cgerum @SebastianBoblestETAS @paulpb @PhilippvK @r.stahl @aca88 for bringing 
this to the meeting!

Here are some notes (please feel free to correct them if I got anything wrong!):
- The current graph partitioning approach is the same one that's used in the compiler today. It's compatible with the Collage partitioning work, which is in progress and has not yet been RFC'd (see the sketch below).
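  For reference, a minimal sketch of the standard BYOC partitioning sequence this builds on. The backend name `"my_accel"` and its pattern table are illustrative assumptions, not part of the proposal:

```python
import tvm
from tvm import relay
from tvm.relay.op.contrib.register import get_pattern_table

# "my_accel" is a hypothetical backend; this assumes its operator patterns
# were registered earlier via register_pattern_table("my_accel", ...).
def partition_for_my_accel(mod):
    seq = tvm.transform.Sequential([
        relay.transform.MergeComposite(get_pattern_table("my_accel")),
        relay.transform.AnnotateTarget("my_accel"),
        relay.transform.MergeCompilerRegions(),
        relay.transform.PartitionGraph(),
    ])
    return seq(mod)
```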
- Would the v1 support Tensor Expression (TE), or are we skipping that?
  - Mikael's understanding is that CreatePrimFunc can consume TE, so TE should be natively supported (see the sketch below).
  - Paolo: the standard lowering is used, as is done for Ethos-U.
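  A minimal sketch of the CreatePrimFunc route; the element-wise add computation is illustrative:

```python
import tvm
from tvm import te

# An operator expressed in TE.
A = te.placeholder((128,), name="A")
B = te.compute((128,), lambda i: A[i] + 1.0, name="B")

# CreatePrimFunc turns the TE compute into a schedulable TIR PrimFunc.
prim_func = te.create_prim_func([A, B])
mod = tvm.IRModule({"main": prim_func})
print(mod.script())
```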
- The proposal has an explicit differentiation between S-TIR and NS-TIR. Would there be different hooks? E.g., one to register TIR scheduling passes vs. one for TIR passes.
  - Will it be possible to contribute S-TIR back to the compiler or just NS-TIR?
     - Scheduling passes work on S-TIR; the passes shown in the boxes behind the schedules are injected into the lowering via the pass context. Those passes do not return S-TIR; they are part of the lowering from S-TIR to NS-TIR. At the moment this means calling tvm.lower() and injecting the passes into it (see the sketch below).
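  As a concrete illustration (not part of the RFC text), a minimal sketch of injecting a custom TIR pass into tvm.lower() through the pass context; `my_accel_pass` is a hypothetical no-op placeholder:

```python
import tvm
from tvm import tir

# A placeholder TIR pass; a real backend pass would rewrite func.body here.
@tir.transform.prim_func_pass(opt_level=0)
def my_accel_pass(func, mod, ctx):
    return func

# Injecting the pass at lowering phase 1 through the pass context: it runs
# inside the S-TIR -> NS-TIR lowering rather than producing S-TIR itself.
with tvm.transform.PassContext(config={"tir.add_lower_pass": [(1, my_accel_pass)]}):
    pass  # tvm.lower(...) invoked here picks up the injected pass
```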
- In the Relay-to-TIR hook, they are already trying to control the lowering order, which might not match the partitioning order. They want to see the memory available after compiling the C functions but before lowering the Ethos-U functions. Any thoughts on whether it's possible to configure the order of partitioning in this flow?
   - Why? Need to see the amount of live memory available after running the 
default TVM flow.
   - Relay passes can see the whole IRModule; past that point, a TIR pass only sees the functions for its particular target (see the sketch below).
   - The order needs to be decided and it varies by registration point.
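To illustrate that visibility point, a minimal sketch of a Relay-level module pass that inspects the whole IRModule; `InspectPartitions` is a made-up name, and the `Compiler` attribute check assumes the standard BYOC partitioning was run:

```python
import tvm

# A Relay-level module pass receives the entire IRModule, including every
# partitioned function, so whole-program decisions can be made here.
@tvm.transform.module_pass(opt_level=0)
class InspectPartitions:
    def transform_module(self, mod, ctx):
        for gvar, func in mod.functions.items():
            # Functions produced by partitioning carry a "Compiler" attribute
            # naming the external backend they were offloaded to.
            compiler = func.attrs["Compiler"] if func.attrs and "Compiler" in func.attrs else None
            print(gvar.name_hint, compiler)
        return mod
```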

- Q: Are there common accelerator passes that are in use in TVM, or does 
everyone do something different?
   - There are common touch points; those are the "plumbing" mentioned in the slide presentation, e.g., graph partitioning, scheduling, and code generation.
   - UMA isn't trying to box anyone into a particular flow; instead, it suggests one way of doing this from a broader set of options, to serve as a guide for folks who may be new to TVM.
- Question from Federico, who is integrating an accelerator of his own.
  - VTA uses memory scopes to define buffers in block-ram. Are we planning to 
accommodate that in UMA?
    - You could write your own schedules and passes to do this. storage_scope is the way to express this at the runtime level. You can also leverage USMP to define memory pools and let it schedule buffers into them (see the sketch below).
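A minimal sketch of the USMP route mentioned above; the pool name and 256 KiB size are illustrative assumptions standing in for on-chip block RAM:

```python
import tvm
from tvm.ir.memory_pools import (
    PoolInfoProperties,
    WorkspaceMemoryPools,
    WorkspacePoolInfo,
)

# A hypothetical workspace pool standing in for block RAM.
target = tvm.target.Target("c")
sram = WorkspacePoolInfo(
    "sram", [target], PoolInfoProperties(size_hint_bytes=256 * 1024)
)

# With USMP enabled, relay.build(..., workspace_memory_pools=...) plans
# intermediate buffers into the declared pool.
with tvm.transform.PassContext(config={"tir.usmp.enable": True}):
    pass  # relay.build(mod, target, workspace_memory_pools=WorkspaceMemoryPools([sram]), ...)
```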