Hi @MJKlaiber ,

Apologies for not getting back to this in time.
Thanks for the proposal! It broadly looks like a wrapper around the Target Hooks 
RFC (by @Mousius ): 
https://github.com/apache/tvm-rfcs/blob/main/rfcs/0010-target-registered-compiler-flow-customisation.md
that exposes a nice, structured interface to Python. It is nice to see progress 
on this :)

I would like to suggest some potential text changes for the formal RFC, for the 
benefit of those of us who are familiar with the existing flow (especially 
around naming).

[quote="MJKlaiber, post:1, topic:12039"]
UMA Partitioning:

* Register relay passes
* Register patterns - supported sub-graph operations
* Order: pre-partitioning passes, Graph partitioning, post-partitioning passes
* UMAPartitioner baseclass (Python only) has to be inherited by 
accelerator-specific Partitioners (e.g. Accelerator A Partitioner, etc)
[/quote]

Maybe it is worth mentioning that these are currently implemented as 
`partition_for_<backend/target>` functions?
https://github.com/apache/tvm/blob/f1ff61a7b9adb61eaa0db842bbf71c704d350154/python/tvm/relay/op/contrib/dnnl.py#L210
https://github.com/apache/tvm/blob/f1ff61a7b9adb61eaa0db842bbf71c704d350154/python/tvm/relay/op/contrib/tensorrt.py#L83
https://github.com/apache/tvm/blob/f1ff61a7b9adb61eaa0db842bbf71c704d350154/python/tvm/relay/op/contrib/ethosu.py#L1660
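
For readers less familiar with the existing flow, here is a rough sketch of the shape those functions tend to take: a fixed, ordered sequence of passes wrapped in one function. This is a toy model only. The "module" is just a list recording which passes ran, `partition_for_my_backend` and the pre/post pass names are hypothetical, and the four middle names simply echo the Relay passes (MergeComposite, AnnotateTarget, MergeCompilerRegions, PartitionGraph) without calling TVM.

```python
from typing import Callable, List

# Stand-in for a Relay IRModule: a record of which passes have run (assumption).
Module = List[str]
Pass = Callable[[Module], Module]


def make_pass(name: str) -> Pass:
    """Return a toy pass that appends its name to the module's history."""
    def run(mod: Module) -> Module:
        return mod + [name]
    return run


def partition_for_my_backend(mod: Module) -> Module:
    """Hypothetical partition_for_<backend>: a fixed, ordered pass sequence."""
    pipeline = [
        make_pass("pre_partitioning"),      # backend-specific preparation
        make_pass("MergeComposite"),        # fuse ops matching registered patterns
        make_pass("AnnotateTarget"),        # mark operations the backend supports
        make_pass("MergeCompilerRegions"),  # join adjacent supported regions
        make_pass("PartitionGraph"),        # split regions into external functions
        make_pass("post_partitioning"),     # backend-specific clean-up
    ]
    for run_pass in pipeline:
        mod = run_pass(mod)
    return mod


history = partition_for_my_backend([])
print(history)
```

Seen this way, the proposed UMAPartitioner looks like a structured way to declare the first and last entries of that list, which is why I think the RFC text could reference the existing naming.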

I am a bit curious why this interface is specifically positioned as an 
"accelerator" (as in UMA) partitioner, though. 
Would it not also be used for optimized library support, as we currently 
have today with BYOC?

[quote="MJKlaiber, post:1, topic:12039"]
```
class MyCustomAcceleratorPartitioner(UMAPartitioner):
    @property
    def target_name(self):
        return "my_custom_accelerator"

    def _register_patterns(self):
        self._register_pattern("conv1d_relu", conv1d_relu_pattern())
    
    def _register_relay_passes(self):
        self._register_relay_pass(1, ConfigGenerator())
        self._register_relay_pass(2, BufferScopeAnnotator())
```
[/quote]

Since the proposal suggests using properly registered targets, is there any 
reason we should stick to target_name (str) as opposed to the actual TargetKind?
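
To illustrate the concern, here is a small sketch in plain Python. Everything in it is hypothetical: `TargetKindStub`, `REGISTERED_KINDS` and both partitioner classes are stand-ins I made up for the example, not TVM APIs. The point is only that a bare string accepts a misspelt name silently, whereas resolving against a registry fails at construction time.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TargetKindStub:
    """Hypothetical stand-in for a registered target kind."""
    name: str


# Hypothetical registry of known kinds (assumption, not the TVM registry).
REGISTERED_KINDS: Dict[str, TargetKindStub] = {
    "my_custom_accelerator": TargetKindStub("my_custom_accelerator"),
}


def lookup_kind(name: str) -> TargetKindStub:
    """Resolve a name to a registered kind, failing loudly if it is unknown."""
    try:
        return REGISTERED_KINDS[name]
    except KeyError:
        raise ValueError(f"target kind {name!r} is not registered") from None


class StringKeyedPartitioner:
    """Keeps only a string, so a typo is not detected here."""
    def __init__(self, target_name: str) -> None:
        self.target_name = target_name


class KindKeyedPartitioner:
    """Resolves the kind up front, so a typo raises immediately."""
    def __init__(self, target_name: str) -> None:
        self.target_kind = lookup_kind(target_name)


silent = StringKeyedPartitioner("my_custom_acelerator")  # misspelt, accepted
try:
    KindKeyedPartitioner("my_custom_acelerator")
    typo_detected = False
except ValueError:
    typo_detected = True
print(typo_detected)
```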

Following up on the above question, what are your thoughts on moving the 
UMAPartitioner inside relay.build(...) ?

Also, this seems to be proposed on top of S-TIR (as opposed to the "legacy" 
TE->TIR pipeline). Would you be able to share the motivation for separating 
tir_schedules and tir_passes? (I'm asking mainly because they will all be S-TIR 
--> S-TIR IRModule passes.)
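
To make the question concrete, here is a toy sketch under the assumption that a schedule and a pass are both module-to-module transforms. The "module" is a plain dict, and the labels `tile_conv1d` and `insert_dma_copies` are hypothetical names invented for the example. If both share one signature, a single ordered list could hold them.

```python
from typing import Callable, Dict, List

# Stand-in for an IRModule: function name -> transforms applied so far (assumption).
Module = Dict[str, List[str]]
ModulePass = Callable[[Module], Module]


def as_module_pass(label: str) -> ModulePass:
    """Wrap a named transform so it maps a module to a new module."""
    def run(mod: Module) -> Module:
        return {fn: steps + [label] for fn, steps in mod.items()}
    return run


# Hypothetical registrations: one "schedule" and one "pass", same signature,
# so they can live in the same ordered pipeline.
tir_pipeline: List[ModulePass] = [
    as_module_pass("schedule:tile_conv1d"),
    as_module_pass("pass:insert_dma_copies"),
]

mod: Module = {"conv1d_relu": []}
for transform in tir_pipeline:
    mod = transform(mod)
print(mod)
```

If there is a reason the two need to be registered separately (ordering constraints, tuning, etc.), it would be good to capture it in the RFC text.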

Following on from the above question, is there an ambition to hand S-TIR back 
over to the core compiler?

Following up on Mark's comments,

[quote="mbs-octoml, post:5, topic:12039"]
My group here at OctoML have been looking at bringing a backend placement 
search capability to TVM, a la the ‘Collage’ paper 
(https://arxiv.org/pdf/2111.00655.pdf). Under that approach there’s no longer 
a notion of a BYOC uniquely partitioning the graph according to its rules and 
heuristics in ‘one shot’. Instead the BYOC must convey the rules (patterns, 
predicates) for which operators could potentially be offloaded, and leave the 
actual partitioning to the main Collage searcher.
[/quote]

Mark, we are very much looking forward to the RFC for this, especially the 
reference-level explanation, to see where this work is headed. I believe it 
would be good to know this, given our mutual interest in structuring BYOC targets.

However, I think we all share the ambition to replace kCompiler strings with 
targets, if we can get more support from the community.




