We discussed this at the TVM Community Meeting this morning. There was a presentation about the approach followed by some discussion. Thanks @MJKlaiber @cgerum @SebastianBoblestETAS @paulpb @PhilippvK @r.stahl @aca88 for bringing this to the meeting!
Here are some notes (please feel free to correct them if I got anything wrong!):

- The current graph partitioning approach is the same one that's used in the compiler today. It's compatible with the Collage partitioning, which is in the works and not yet RFC'd.
- Would the v1 support Tensor Expression (TE), or are we skipping that?
  - Mikael understands that CreatePrimFunc can support TE, so it should be natively supported.
  - Paolo: using standard lowering, as is done by Ethos-U.
- The proposal has an explicit differentiation between S-TIR and NS-TIR. Would there be different hooks? e.g. here we can register TIR scheduling passes vs. TIR passes.
  - Will it be possible to contribute S-TIR back to the compiler, or just NS-TIR?
  - Scheduling passes work on S-TIR; the passes in the boxes behind the schedules are injected into the lowering by the pass context. Passes do not return S-TIR. They are part of the lowering from S-TIR to NS-TIR. At the moment, this is done by calling tvm.lower() and injecting those passes into tvm.lower().
- In the Relay-to-TIR hook, we are already trying to figure out the lowering order, which might not match the partitioning order. We want to see the memory available after compiling C functions but before lowering Ethos-U functions. Any thoughts on whether it's possible to configure the order of partitioning in this flow?
  - Why? We need to see the amount of live memory available after running the default TVM flow.
  - Relay passes can see the whole IRModule; past that, a TIR pass sees only the functions for a particular target.
  - The order needs to be decided, and it varies by registration point.
- Q: Are there common accelerator passes that are in use in TVM, or does everyone do something different?
  - There are common touch points; those are the "plumbing" mentioned in this slide presentation, e.g. graph partitioning, scheduling, code generation.
  - UMA isn't trying to box anyone into a particular flow. Instead, it's just trying to suggest one way of doing this from a broader set of options, to serve as a guide for folks who may be new to TVM.
- Question from Federico, who is integrating an accelerator of his own:
  - VTA uses memory scopes to define buffers in block RAM. Are we planning to accommodate that in UMA?
  - You could write your own schedules and passes to do this. storage_scope is kind of the way to do this at the runtime level. You can also leverage USMP to define memory pools and use it as a pass to schedule.
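To make the point about injecting passes into the lowering a bit more concrete, here is a toy sketch in plain Python. It is not TVM code and not how UMA is implemented: `lower`, `flatten`, and `map_to_accel` are hypothetical names, and a "function" is just a list of op names. It only illustrates that an injected pass is registered against a phase of the lowering, so what it sees depends on where it is registered. In TVM itself, as far as I understand it, the corresponding mechanism is the `tir.add_lower_pass` option of the pass context.

```python
from typing import Callable, Dict, List, Tuple

# Toy stand-in for a TIR function: just a list of op names (an assumption).
Func = List[str]
Pass = Callable[[Func], Func]


def lower(func: Func,
          builtin: Dict[int, List[Pass]],
          injected: List[Tuple[int, Pass]]) -> Func:
    """Run passes phase by phase.

    In this toy model, the built-in passes of a phase run first, followed by
    any injected passes registered for the same phase.
    """
    phases = sorted(set(builtin) | {phase for phase, _ in injected})
    for phase in phases:
        for pass_fn in builtin.get(phase, []):
            func = pass_fn(func)
        for injected_phase, pass_fn in injected:
            if injected_phase == phase:
                func = pass_fn(func)
    return func


def flatten(func: Func) -> Func:
    # Hypothetical built-in pass: strips a "nested_" prefix from op names.
    return [op.replace("nested_", "") for op in func]


def map_to_accel(func: Func) -> Func:
    # Hypothetical accelerator pass: only recognizes the flattened "conv2d".
    return ["accel_" + op if op == "conv2d" else op for op in func]


# Injected into the same phase as flatten: it sees the flattened ops.
result = lower(["nested_conv2d", "relu"], {0: [flatten]}, [(0, map_to_accel)])

# Injected before flatten's phase: the pattern is not there yet.
too_early = lower(["nested_conv2d", "relu"], {1: [flatten]}, [(0, map_to_accel)])
```

In the first call the accelerator pass rewrites `conv2d`; in the second it runs before the op has been flattened and changes nothing, which is the same kind of ordering concern raised above for the Relay-to-TIR hook.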