Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-03 Thread Siyuan Feng
@tmoreau89 Exactly! For now, we use the NCHWnc layout, the same layout as VTA. -- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/dmlc/tvm/issues/4052#issuecomment-537816661

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-03 Thread Siyuan Feng
@yangjunpro Really happy to see another solution for TensorCore. You are right! I just extended the TVM intrinsics to support it. It does cause some trouble for programmers who write the schedule; it is not easy to write a high-performance schedule. I'm really curious about how to use IR passes to recognize …

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-03 Thread 孙敏敏
@Hzfengsy Sure, we will show the code as well as a sample schedule very soon. It's under internal review now. As you will see, the schedule for TensorCore CodeGen looks no different from a normal matmul schedule for GPU. Everything is done in IR passes, including matrix_a/matrix_b/accumulator …

[dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread 雾雨魔理沙
I am trying to train MNIST using TVM, and I hit an issue: there are two functions, loss and infer; one calculates the loss and gradient, and the other just does inference. However, create_executor/aot/vm all take only a single entry point. If there are multiple entry points, the passes will be called multiple …
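The setup described above can be sketched in plain Python (this is hypothetical illustration, not the actual Relay API): both `loss` and `infer` live in one shared module, so each optimization pass rewrites the shared definitions once, instead of once per entry point.

```python
# Hypothetical sketch of the multiple-entry-point idea (plain Python,
# NOT the actual Relay API). One module holds both `loss` and `infer`,
# and passes run once over the whole module.

class Module:
    def __init__(self):
        self.functions = {}

    def define(self, name, fn):
        self.functions[name] = fn

    def run_passes(self, passes):
        # Each pass rewrites every function exactly once, however many
        # entry points the module exposes.
        for p in passes:
            self.functions = {n: p(f) for n, f in self.functions.items()}

mod = Module()
mod.define("infer", lambda x: x * 2)            # forward pass only
mod.define("loss", lambda x: (x * 2 - 1) ** 2)  # forward pass + objective

identity_pass = lambda f: f  # stand-in for a real optimization pass
mod.run_passes([identity_pass])

print(mod.functions["infer"](3), mod.functions["loss"](3))  # 6 25
```

With a single-entry API, compiling `loss` and `infer` separately forces the passes to run twice over largely shared code, which is the duplication being complained about.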

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Junru Shao
Just to clarify: > There are two functions, loss and infer; one calculates the loss and gradient, > and the other just does inference. Do you mean that there are two modes, training and inference, so that there are multiple entry points in a Relay Module?

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Yao Wang
Can we have a more detailed example to help clarify this issue? -- View it on GitHub: https://github.com/dmlc/tvm/issues/4054#issuecomment-538060274

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Junru Shao
And also: > 'compile then execute' is not enough for all deep learning workloads. For > example, using our partial evaluator to specialize > training/validation/testing data means we must compile only after we have > loaded all the data. So in DL, common practice is that we specify the input …

Re: [dmlc/tvm] [DEV] TVM v0.6 Roadmap (#2623)

2019-10-03 Thread Haichen Shen
# TVM Monthly - September 2019 https://discuss.tvm.ai/t/tvm-monthly-september-2019 -- View it on GitHub: https://github.com/dmlc/tvm/issues/2623#issuecomment-538074210

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread 雾雨魔理沙
@junrushao1994 Yes, that is what I mean: inference mode and training mode, each mode compiled to one function. We do the partial evaluation primarily to partially evaluate the control flow with respect to the data. Other frameworks also do this, but they require manual loop unrolling.

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Junru Shao
@MarisaKirisame Do you mean you want the training loop itself to be part of Relay? -- View it on GitHub: https://github.com/dmlc/tvm/issues/4054#issuecomment-538140726

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread 雾雨魔理沙
No. Suppose I have a TreeLSTM. Normally, I cannot do any operator fusion/batching, because of the control flow everywhere. Using the partial evaluator on each batch of training data individually will solve this problem.
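The idea above can be sketched in framework-free Python (this is not Relay's actual partial evaluator; all names are illustrative): once the tree structure of one input is known, the recursive control flow of a TreeLSTM-style traversal collapses into a straight-line list of ops, which a compiler could then fuse or batch.

```python
# Illustrative sketch of partially evaluating tree-shaped control flow
# (hypothetical names, NOT Relay code). For a *known* tree, the branchy
# recursion specializes into a flat, data-independent op sequence.

def specialize(tree, ops=None):
    """Flatten a TreeLSTM-style traversal for a known tree.

    tree: ("leaf", value) or ("node", left_subtree, right_subtree)
    Returns a straight-line program: a list of (op, ...) with no branches.
    """
    if ops is None:
        ops = []
    if tree[0] == "leaf":
        ops.append(("embed", tree[1]))
    else:
        specialize(tree[1], ops)
        specialize(tree[2], ops)
        ops.append(("combine",))  # merge the two child states
    return ops

# One concrete input: the tree ((a, b), c)
tree = ("node", ("node", ("leaf", "a"), ("leaf", "b")), ("leaf", "c"))
program = specialize(tree)
print(program)
# [('embed', 'a'), ('embed', 'b'), ('combine',), ('embed', 'c'), ('combine',)]
```

After specialization there is no control flow left, so downstream fusion and batching passes see only a fixed op list, at the cost of recompiling per tree shape.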

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread 雾雨魔理沙
However, that requires the ability to create multiple entries (one entry per batch). If we also want to use Relay for any form of JIT, we must be able to interleave running Relay with adding more definitions to a Relay module.

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Junru Shao
I see. So for MNIST there is no such issue; but for TreeLSTM, it is true that we are not able to do more optimization if we don't do code replication.

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Junru Shao
Let's get back to the original topic, which is broader IMO. First of all, depending on your scenario, incremental compilation may or may not be doable, e.g. on edge devices where there is only space for the TVM runtime, not the compiler. That said, I am actually in favor of incremental compilation, or some pr…

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread 雾雨魔理沙
@junrushao1994 We unroll the control flow to be more efficient. Maybe multiple modules can work, but there can't be code sharing between them.

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread Junru Shao
My point is that we don't have to do full data-dependent unrolling; we can unroll a deterministic 4 or 10 steps to make it data-independent.
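The contrast can be sketched as follows (illustrative Python, not TVM/Relay code): a data-dependent loop has an unknown trip count at compile time, while a deterministic 4-step unrolling yields straight-line code the compiler can schedule without seeing the input.

```python
# Illustrative sketch (hypothetical names, NOT TVM code): data-dependent
# looping vs. a fixed, deterministic 4-step unrolling.

def step(state):
    # Stand-in for one RNN/LSTM cell update.
    return state + 1

def run_data_dependent(state, stop):
    # Trip count depends on the data: hard to specialize ahead of time.
    while state < stop:
        state = step(state)
    return state

def run_unrolled_4(state):
    # Deterministic 4 steps: the compiler sees straight-line code and can
    # fuse or schedule it without knowing the input.
    state = step(state)
    state = step(state)
    state = step(state)
    state = step(state)
    return state

print(run_unrolled_4(0))  # 4
```

The trade-off is that a fixed unroll factor only covers sequences of that length; anything longer still needs a loop or a fallback path.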

Re: [dmlc/tvm] [RFC][Relay] Multiple Entries & Incremental compilation (#4054)

2019-10-03 Thread 雾雨魔理沙
@junrushao1994 That is only possible for LSTM. For TreeLSTM, if you do it, it will blow up exponentially, and lots of time will be spent testing the match on all the cases, unless some tricks are used (for example, a decision tree for pattern matching instead of a linear scan). Still, this doesn't allow …
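The matching trick mentioned above can be sketched like this (hypothetical names, not TVM code): with many specialized versions of a tree model, a naive dispatcher tests each case in turn; keying the cases by a discriminating feature of the input, here a fingerprint of the tree's shape, turns the match into a single lookup, like one level of a decision tree.

```python
# Illustrative sketch of decision-tree-style dispatch over specialized
# cases (hypothetical names, NOT TVM code).

def shape_key(tree):
    # Structure-only fingerprint: ignores leaf values, keeps tree shape.
    if tree[0] == "leaf":
        return "L"
    return "(" + shape_key(tree[1]) + shape_key(tree[2]) + ")"

# Specialized (partially evaluated) versions, keyed by input shape.
specialized = {
    "L": lambda t: "run_leaf_version",
    "(LL)": lambda t: "run_pair_version",
}

def dispatch(tree):
    # One dict lookup instead of testing every case in sequence.
    fn = specialized.get(shape_key(tree))
    return fn(tree) if fn else "fallback_generic_version"

print(dispatch(("node", ("leaf", 1), ("leaf", 2))))  # run_pair_version
```

This makes dispatch cheap, but it does not address the underlying concern: the number of distinct tree shapes, and hence specialized versions, can still grow exponentially.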