We discussed this at the [TVM Community Meeting](https://discuss.tvm.apache.org/t/next-tvm-community-meeting-may-25/12807) this morning. Here are notes (thanks to mbs-octoml for taking these):
 - Most of the discussion was around whether the approach could help with the
   related problem of long tuning times and the large number of distinct
   kernels for fully statically-shaped models, e.g. ResNet-50 with its dozens
   of static shape instances of conv2d.

   It is useful to consider kernel compilation in two steps:
    - Start with the N instances of the same operator, with different
      shapes.
    - After tuning those independently, we may end up with M << N schedules.
    - However, each schedule is then instantiated for its original static
      shape so that no dimension variables are left behind at run time, which
      puts us back at N implementations.

   There is a significant performance impact to keeping only M kernel
   implementations that retain their parameterization on dimension variables.
   However, TVM could support skipping that instantiation (leaving the
   dimension variables in place) today.
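
   As a rough sketch (not from the meeting, and assuming TVM's `te` API), the
   parameterized case looks like this: the batch dimension `n` stays symbolic,
   so one compiled function covers every shape, while substituting a constant
   for `n` gives the fully instantiated static variant:

   ```python
   import tvm
   from tvm import te

   # Symbolic batch dimension: one compiled kernel serves many shapes.
   n = te.var("n")
   A = te.placeholder((n, 1024), name="A")
   B = te.placeholder((1024, 1024), name="B")
   k = te.reduce_axis((0, 1024), name="k")
   C = te.compute((n, 1024),
                  lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
                  name="C")

   s = te.create_schedule(C.op)
   # ... a tuned schedule would be applied here ...
   # `n` survives into the generated code as a run-time dimension variable.
   f = tvm.build(s, [A, B, C], target="llvm")

   # Replacing `n = te.var("n")` with a constant (e.g. `n = 128`) yields the
   # fully instantiated, static-shape variant: typically faster, but one
   # compiled artifact per shape.
   ```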

   DietCode could be used to reduce M. The learned decision tree would dispatch
   from a fully shape-polymorphic implementation to a specific tuned schedule.
   But at this stage the focus is on dynamic-shape workloads.
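
   A hypothetical sketch of that dispatch (the function, kernel names, and
   thresholds below are made up; DietCode's learned decision tree would replace
   the hand-written rules):

   ```python
   # Hypothetical illustration only: route a call from a shape-polymorphic
   # entry point to one of M tuned kernel variants based on the runtime
   # input shape.
   def dispatch_dense(a, b, kernels):
       seq_len = a.shape[0]
       # A learned decision tree would pick these boundaries automatically.
       if seq_len <= 64:
           return kernels["short"](a, b)
       elif seq_len <= 512:
           return kernels["medium"](a, b)
       return kernels["long"](a, b)
   ```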

 - No changes to Relay are required. Though Relay's dynamic shape functions
   support data-dependent output shapes, DietCode currently does not search
   over output shapes and assumes only input shapes need to be inspected. But
   it sounds like this could be addressed in future work.
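
   For reference, a minimal Relay example of a data-dependent output shape
   (using the standard `argwhere` op): the number of output rows depends on
   the values in `x`, not just its shape, which is the case DietCode does not
   yet search over:

   ```python
   import tvm
   from tvm import relay

   # The output shape of argwhere depends on how many elements of `x` are
   # non-zero, i.e. on the data, not only on the static input shape.
   x = relay.var("x", shape=(100,), dtype="float32")
   y = relay.argwhere(x)
   mod = tvm.IRModule.from_expr(relay.Function([x], y))
   print(mod)
   ```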
