[Apache TVM Discuss] [Development/pre-RFC] Introducing TY-NNP backend with end2end TensorIR integration

Mark Shields via Apache TVM Discuss Tue, 04 Jan 2022 15:31:16 -0800


Hi @wrongtest, thanks for the nice write up.

> Currently, we have to hack the compile engine to find the pre-scheduled
> PrimFunc from a standalone cache, we are glad to know what is the best way to
> achieve this goal.

Here are some thoughts; please correct any misunderstandings I might have.

Yes, the relay_to_tir pass that @Mousius added a few months back could work,
since it seems you want to take over scheduling completely. You can use
annotations to convey layout hints from your analysis pass to lowering, or just
compute them on-the-fly in your hook (though you're probably doing a global
analysis?). You'd have to implement caching yourself: when Chris and I were
trying to decide whether caching was worth building into the relay_to_tir
machinery, the consensus was that it is straightforward to implement directly
in each hook function. That gives you full control over both the conversion to
TIR and the rewritten call_lowered you leave behind. I think everyone would be
happy to extend that machinery if you find it lacking.
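
To make the "caching directly in the hook" point concrete, here is a minimal sketch in plain Python. It deliberately uses no TVM APIs: `CachingHook`, `toy_lower`, and the string keys are all hypothetical stand-ins, and a real hook would key the cache on something like a structural hash of the Relay function plus whatever layout hint your analysis attached.

```python
from typing import Callable, Dict, Tuple


class CachingHook:
    """Memoizes an expensive lowering callback per (function key, layout hint).

    Hypothetical illustration only; not part of the relay_to_tir machinery.
    """

    def __init__(self, lower: Callable[[str, str], str]) -> None:
        self._lower = lower
        self._cache: Dict[Tuple[str, str], str] = {}
        self.misses = 0  # number of times the real lowering actually ran

    def __call__(self, func_key: str, layout_hint: str) -> str:
        key = (func_key, layout_hint)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._lower(func_key, layout_hint)
        return self._cache[key]


def toy_lower(func_key: str, layout_hint: str) -> str:
    # Stand-in for the real scheduling work that would produce a PrimFunc.
    return f"primfunc<{func_key}@{layout_hint}>"


hook = CachingHook(toy_lower)
first = hook("conv2d_3x3", "NCHW16c")
second = hook("conv2d_3x3", "NCHW16c")  # served from the cache
other = hook("conv2d_3x3", "NHWC")      # different hint, lowered again
```

Including the layout hint in the key matters here: the same primitive lowered under two different layouts should yield two distinct cached results.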

We've also been mulling over another approach to incremental layout
optimization, though it's by no means ready to use out-of-the-box (but maybe it
sparks your interest?). We can now invoke lowering multiple times, and with a
bit more work we could even restrict lowering to trigger only on particular
'focus' primitive functions, i.e. we don't have to lower all at once. We've also
done some legwork to allow virtual device annotations to flow both into and out
of already lowered PrimFuncs, albeit currently only for memory scope and not
layout. Putting those together, we could imagine:
- allow layout constraints to appear in VirtualDevices, just as we now do for
memory/storage scope.
- choose a subset of 'critical' primitives (maybe just one) to lower, and give
lowering free rein to choose the best layout. Capture that choice in the
PrimFunc using VirtualDevices on the arguments.
- re-run device planning to flow the new layout constraints to
yet-to-be-lowered primitives. Where layouts have a hard disagreement, insert the
necessary layout transforms as per the bijections you describe.
- re-run lowering on the next set of 'critical' primitives, this time
respecting any layout constraints already imposed on the arguments; as
before, any still-unconstrained arguments can have their layout chosen during
lowering.
- repeat until all primitives are lowered.
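
The loop above can be sketched as a toy model in plain Python. Nothing here is TVM API: `Prim`, `lower_incrementally`, and the layout strings are hypothetical, and the sketch folds "device planning" into a simple dictionary of per-tensor constraints rather than modelling it as a separate pass.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Prim:
    name: str
    args: List[str]                 # tensors this primitive reads or writes
    preferred: str                  # layout it picks when unconstrained
    required: Optional[str] = None  # hard layout requirement, if any


def lower_incrementally(prims: List[Prim]):
    """Lower primitives in the given (most critical first) order."""
    constraints: Dict[str, str] = {}                  # tensor -> fixed layout
    chosen: Dict[str, Dict[str, str]] = {}            # prim -> arg -> layout
    transforms: List[Tuple[str, str, str, str]] = []  # (tensor, src, dst, prim)
    for prim in prims:
        per_arg: Dict[str, str] = {}
        for tensor in prim.args:
            have = constraints.get(tensor)
            if have is None:
                # Unconstrained: lowering is free to choose, and the choice
                # becomes a constraint for primitives lowered later.
                per_arg[tensor] = constraints[tensor] = (
                    prim.required or prim.preferred
                )
            elif prim.required is not None and prim.required != have:
                # Hard disagreement: record a layout transform.
                transforms.append((tensor, have, prim.required, prim.name))
                per_arg[tensor] = prim.required
            else:
                # Respect the layout already imposed on this argument.
                per_arg[tensor] = have
        chosen[prim.name] = per_arg
    return chosen, transforms


chosen, transforms = lower_incrementally([
    Prim("conv2d", ["x", "y"], preferred="NCHW16c"),
    Prim("relu", ["y", "z"], preferred="NCHW"),
    Prim("softmax", ["z", "out"], preferred="NHWC", required="NHWC"),
])
```

In this example `relu` inherits `NCHW16c` for `y` from the already lowered `conv2d`, while `softmax` forces a transform on `z` because its hard requirement conflicts with the layout `relu` chose.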

Would be happy to talk more about that if you see a connection.

---
[Visit
Topic](https://discuss.tvm.apache.org/t/introducing-ty-nnp-backend-with-end2end-tensorir-integration/11807/6)
to respond.
