[Apache TVM Discuss] [Development/RFC] [RFC] TensorIR: A schedulable IR for TVM

2020-09-15 Thread Matt Barrett via Apache TVM Discuss
Thanks for this RFC, I think it's a great idea and will help solve a number of issues I've been facing recently. I'm particularly interested in what 'tensorize' will look like for this new IR. Could you give a snippet as an example? I'm also interested in what the interaction of this will be

[Apache TVM Discuss] [Development/RFC] [RFC] TensorIR: A schedulable IR for TVM

2020-09-15 Thread Matt Barrett via Apache TVM Discuss
Thanks for this explanation. I'm interested in whether it might be possible to match tensor intrinsics of variable size. For example, Arm SVE introduces vector instructions of variable size.

[Apache TVM Discuss] [Development] Creating store_at in TVM

2020-10-05 Thread Matt Barrett via Apache TVM Discuss
Halide provides the scheduling primitive `store_at` (as well as `store_root`) to move where the storage of a tensor happens independently of the compute. This is very useful when we want to make use of the sliding window optimization and create rolling buffers - both of which can be critical

[Apache TVM Discuss] [Development/RFC] [RFC] 'Cascade' Scheduling

2020-10-08 Thread Matt Barrett via Apache TVM Discuss
We at Arm would like to introduce a new optimization technique to TVM which we refer to as 'cascading', which aims to optimize a network's peak memory usage. While in many of the systems TVM targets today there is plentiful memory available, for the embedded devices that uTVM aims to target, w

[Apache TVM Discuss] [Development/RFC] [RFC] 'Cascade' Scheduling

2020-10-08 Thread Matt Barrett via Apache TVM Discuss
Thanks for the feedback :) Tiling the output computations + `compute_at` is actually exactly what I've been doing to prototype this - and you're right that for a sufficiently large tile the recompute isn't particularly bad. I think the rolling buffers aren't immediately essential, but they wou
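As a rough illustration of why recompute shrinks with tile size, the sketch below estimates the recompute factor for two stacked convolutions tiled along the height axis. It is not taken from the thread: the function name `recompute_factor` is hypothetical, and it assumes stride 1, no padding, height-only tiling, and ignores the image boundary.

```python
def recompute_factor(tile_rows: int, kernel_rows: int = 3) -> float:
    """Approximate ratio of intermediate rows computed with tiling vs. without.

    Assumption (not from the thread): stride 1, no padding, tiling along
    height only. Producing `tile_rows` output rows of the second convolution
    needs `tile_rows + kernel_rows - 1` rows of the first convolution's
    output, and neighbouring tiles recompute the overlapping rows.
    """
    if tile_rows < 1 or kernel_rows < 1:
        raise ValueError("tile_rows and kernel_rows must be positive")
    return (tile_rows + kernel_rows - 1) / tile_rows


if __name__ == "__main__":
    # Larger tiles amortise the overlap, so the factor approaches 1.
    for rows in (1, 2, 8, 32):
        print(f"tile_rows={rows:2d} recompute factor={recompute_factor(rows):.4f}")
```

Under these assumptions a one-row tile triples the work on the intermediate tensor, while an 8-row tile adds only 25%, which matches the observation that recompute is tolerable for a sufficiently large tile.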

[Apache TVM Discuss] [Development/RFC] [RFC] 'Cascade' Scheduling

2020-10-11 Thread Matt Barrett via Apache TVM Discuss
The limitation is perhaps more with TE/TIR than it is TOPI, in that currently *all* the scheduling decisions need to happen together. The proposed changes with TensorIR would lift that constraint, but for it to actually be useful the FuseOps pass would have to become a TIR pass rather than a R

[Apache TVM Discuss] [Development/RFC] [RFC] 'Cascade' Scheduling

2020-10-12 Thread Matt Barrett via Apache TVM Discuss
I should emphasize that while I've used 2 convolutions to illustrate a situation in which this technique is useful, it actually generalizes to any operators which have some degree of locality to them (e.g. max pool). In that sense, we're not interested in matching particular well-defined patter

[Apache TVM Discuss] [Development/RFC] [RFC] Refactor the compile_engine to expose a Relay -> TE translator

2020-11-09 Thread Matt Barrett via Apache TVM Discuss
The current design of the compile_engine utilises ScheduleGetter to translate a primitive function into a scheduled tensor expression. However, as it is an all-in-one pass, this means it is directly coupled to the schedules defined in TOPI. It would instead be useful to break this into two sta

[Apache TVM Discuss] [Development/RFC] [RFC] Refactor the compile_engine to expose a Relay -> TE translator

2020-11-09 Thread Matt Barrett via Apache TVM Discuss
> it seems you need to call `lower_call` twice (one in `TETranslator` and another in `ScheduleGetter`). In this case, seems like you still select the schedule in `TETranslator`

So yes, I do call it twice and really this is a consequence of 'lower_call' also probably needing a similar ref

[Apache TVM Discuss] [Development/RFC] [RFC] Refactor the compile_engine to expose a Relay -> TE translator

2020-11-10 Thread Matt Barrett via Apache TVM Discuss
@Hzfengsy @spectrometerHBH I'd be interested to hear your thoughts on this as I imagine it could have some overlap with the work you're doing on TensorIR.

[Apache TVM Discuss] [Development] Duplication of the driver between C++ and Python

2021-04-14 Thread Matt Barrett via Apache TVM Discuss
I've been looking into the TVM lower/build pipeline recently and have encountered an unusual duplication around the 'driver'. In particular, we have two files `src/driver/driver_api.cc` and `python/tvm/driver/build_module.py` which both seem to independently define almost identical functionali

[Apache TVM Discuss] [Development/RFC] [RFC][TFLite frontend] Create models for frontend testing by directly writing TFLite buffers

2021-04-26 Thread Matt Barrett via Apache TVM Discuss
@siju-samuel @FrozenGene you may also be interested in this proposal.

[Apache TVM Discuss] [Development/RFC] [RFC] Introducing a 'rolling_buffer' scheduling primitive

2021-04-26 Thread Matt Barrett via Apache TVM Discuss
### What is a rolling buffer?

A rolling buffer (at least for the purposes of this RFC) is a buffer where one of the dimensions should be addressed via modulo arithmetic. This gives it a 'wrap-around' behaviour which makes it self-overwriting. This means they are effectively just higher dimens
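The modulo addressing described above can be sketched in plain Python. This is only an illustration of the idea, not the proposed TVM primitive: the `RollingBuffer` class and the `vertical_sum3` helper are hypothetical names, and the example assumes a 3-row sliding window over a 2-D image.

```python
class RollingBuffer:
    """Holds only the last `window` rows; the row axis is addressed modulo `window`."""

    def __init__(self, window: int, width: int):
        self.window = window
        self.rows = [[0] * width for _ in range(window)]

    def store(self, row_index: int, values):
        # Wrap-around: row `window` lands on slot 0 and overwrites row 0.
        self.rows[row_index % self.window] = list(values)

    def load(self, row_index: int):
        return self.rows[row_index % self.window]


def vertical_sum3(image):
    """Sum each group of 3 consecutive rows while storing only 3 rows at a time."""
    buf = RollingBuffer(window=3, width=len(image[0]))
    out = []
    for y, row in enumerate(image):
        buf.store(y, row)
        if y >= 2:  # the window is full from the third row onwards
            window_rows = (buf.load(y - 2), buf.load(y - 1), buf.load(y))
            out.append([sum(col) for col in zip(*window_rows)])
    return out


if __name__ == "__main__":
    print(vertical_sum3([[1, 2], [3, 4], [5, 6], [7, 8]]))
```

The consumer still indexes rows by their logical position (`y - 2`, `y - 1`, `y`); only the storage is folded onto three physical rows, which is what makes the buffer self-overwriting.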

[Apache TVM Discuss] [Development/pre-RFC] [RFC] Introducing a 'rolling_buffer' scheduling primitive

2021-11-25 Thread Matt Barrett via Apache TVM Discuss
Hi, Thanks for your interest in this :) Regarding 1, this is a good point. Initially we were only intending to use rolling_buffer in our custom TIR lowering pipeline (doesn't use driver_api.cc), but I agree it'd be much better to also include it in the standard flow. I'll work on a patch to e