We discussed this at the [TVM Community
Meeting](https://discuss.tvm.apache.org/t/next-tvm-community-meeting-may-25/12807)
this morning. Here are notes (thanks to mbs-octoml for taking these):
- Most discussion was around whether the approach could also help the related
  problem of long training times and the large number of distinct kernels
  for fully statically-shaped models, e.g. resnet50 with its dozens of
  static shape instances for conv2d.
It is useful to consider kernel compilation in two steps:
- Start with the N instances of the same operator, with different
shapes.
  - After tuning those independently, we may end up with M << N schedules.
  - However, each schedule is instantiated for the original static shape
    so that no dimension variables are left behind at run time; thus we
    are back to N implementations.
  There is a significant performance impact from having only M kernel
  implementations which retain their parameterization on dimension variables.
  However, TVM could already support not doing that inlining today; see the
  sketch below.
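
  To make the contrast concrete, here is a minimal sketch (not from the RFC or
  the meeting) of the two compilation styles using TVM's TE API; the
  vector-add operator, the shapes, and the target are illustrative only:

  ```python
  # Sketch only: contrasts static-shape instantiation with keeping a
  # dimension variable. Operator and shapes are made up for illustration.
  import tvm
  from tvm import te

  def make_vector_add(n):
      """Build a vector-add kernel; `n` may be a Python int (static shape)
      or a te.var (shape stays symbolic and is bound at call time)."""
      A = te.placeholder((n,), name="A")
      B = te.placeholder((n,), name="B")
      C = te.compute((n,), lambda i: A[i] + B[i], name="C")
      s = te.create_schedule(C.op)
      return tvm.build(s, [A, B, C], target="llvm")

  # N static instantiations: one compiled kernel per shape, no dimension
  # variables left at run time.
  static_kernels = {n: make_vector_add(n) for n in (128, 256, 512)}

  # One shape-parameterized kernel: the length remains a dimension variable.
  n = te.var("n")
  dynamic_kernel = make_vector_add(n)
  ```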
  DietCode could be used to reduce M: the learned decision tree would dispatch
  from a fully shape-polymorphic implementation to the specific schedule
  implementations (a toy dispatch sketch follows below). But at this stage the
  focus is on dynamic shape workloads.
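
  For illustration only, here is a toy Python sketch of that dispatch idea;
  the hand-written predicates stand in for the learned decision tree and the
  plain functions stand in for tuned kernels (none of these names come from
  DietCode):

  ```python
  # Hypothetical sketch, not DietCode's API: a shape-polymorphic entry point
  # routes to one of M tuned implementations at run time.
  def small_seq_kernel(x):   # stand-in for a schedule tuned for short inputs
      return sum(x)

  def large_seq_kernel(x):   # stand-in for a schedule tuned for long inputs
      return sum(x)

  # (predicate, kernel) pairs approximate the learned decision tree; a real
  # dispatcher would be generated from the tuning results.
  DISPATCH_TABLE = [
      (lambda shape: shape[0] <= 128, small_seq_kernel),
      (lambda shape: True, large_seq_kernel),
  ]

  def dispatch(x):
      shape = (len(x),)
      for predicate, kernel in DISPATCH_TABLE:
          if predicate(shape):
              return kernel(x)

  print(dispatch(list(range(64))))    # routed to small_seq_kernel
  print(dispatch(list(range(4096))))  # routed to large_seq_kernel
  ```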
- No changes to Relay are required. Though Relay's dynamic shape functions
  support data-dependent output shapes, currently DietCode does not search
  over output shapes and assumes only input shapes need to be inspected. But
  it sounds like this could be addressed in future work.
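
  As a hedged aside on why data-dependent output shapes are a separate
  problem from input-shape dispatch, here is a small NumPy example (not
  Relay or DietCode code) where the output shape depends on the tensor's
  values rather than on its input shape:

  ```python
  # The output shape of an op like nonzero is only known at run time,
  # so inspecting input shapes alone cannot determine it.
  import numpy as np

  x = np.array([0, 3, 0, 7, 0, 0, 1])
  y = np.nonzero(x)[0]       # input shape (7,) is known ahead of time...
  print(x.shape, y.shape)    # ...but the output shape (3,) is data-dependent
  ```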