Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-29 Thread Siyuan Feng
Closed #4052.

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-15 Thread Siyuan Feng
I have chatted with @minminsun and his team these days. Just as they mentioned in https://github.com/dmlc/tvm/issues/4105#issuecomment-542032766, we can have different frontends but only one backend. In my previous implementation, users could only use fragments with a 16x16x16 shape and row-major layout. To …
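For context, a sketch of what generalizing beyond the single supported configuration could look like: Volta/Turing WMMA supports three fp16 fragment shapes (m16n16k16, m32n8k16, m8n32k16) and row- or column-major operand layouts. The helper below is hypothetical, written in the TVM 0.6-era API, not Siyuan's actual code:

```python
import tvm

# WMMA fragment shapes supported by Volta/Turing for fp16 inputs
WMMA_SHAPES = [(16, 16, 16), (32, 8, 16), (8, 32, 16)]

def gemm_fragment(shape=(16, 16, 16), a_layout='row_major'):
    """Hypothetical helper: declare a single-fragment GEMM compute.
    Before the extension, only shape=(16, 16, 16) with row-major
    operands could be tensorized."""
    m, n, l = shape
    a_shape = (m, l) if a_layout == 'row_major' else (l, m)
    A = tvm.placeholder(a_shape, name='A', dtype='float16')
    B = tvm.placeholder((l, n), name='B', dtype='float16')
    k = tvm.reduce_axis((0, l), name='k')
    # index A according to its layout: A[i, k] row-major, A[k, i] col-major
    load_a = (lambda i, kk: A[i, kk]) if a_layout == 'row_major' \
        else (lambda i, kk: A[kk, i])
    C = tvm.compute(
        (m, n),
        lambda i, j: tvm.sum(load_a(i, k).astype('float32') *
                             B[k, j].astype('float32'), axis=k),
        name='C')
    return A, B, C
```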

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-07 Thread Siyuan Feng
@soiferj Thank you for such a helpful comment. I have just extended the schedule to BatchMatMul. You can check the schedule in my fork repo: https://github.com/Hzfengsy/tvm/blob/master/tests/python/unittest/test_schedule_tensor_core.py#L101
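The linked test lives in Siyuan's fork, but the core of the extension is small: the compute gains an outer batch axis while the inner 16x16x16 tiles stay tensorizable. A minimal sketch in the TVM 0.6-era API (sizes are hypothetical, chosen as multiples of the fragment shape):

```python
import tvm

batch, n, m, l = 4, 64, 64, 64  # hypothetical sizes, multiples of 16
A = tvm.placeholder((batch, n, l), name='A', dtype='float16')
B = tvm.placeholder((batch, l, m), name='B', dtype='float16')
k = tvm.reduce_axis((0, l), name='k')
# same mixed-precision pattern as the gemm case, with a leading
# batch axis that a schedule can bind to blockIdx.z
C = tvm.compute(
    (batch, n, m),
    lambda b, i, j: tvm.sum(A[b, i, k].astype('float32') *
                            B[b, k, j].astype('float32'), axis=k),
    name='C')
```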

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-06 Thread Jon Soifer
Would it be easy to extend your gemm schedule into a schedule for BatchMatMul? That would help round out the TensorCore story for matrix multiplication.

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-03 Thread 孙敏敏
@Hzfengsy Sure, we will show the code as well as a sample schedule very soon. It's under internal review now. As you will see, the schedule for TensorCore CodeGen looks no different from a normal matmul schedule for GPU. Everything is done in IR passes, including matrix_a/matrix_b/accumulator …
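To illustrate the claim, here is what such a plain GPU matmul schedule looks like in the TVM 0.6-era API; nothing Tensor-Core-specific appears in it, and under the IR-pass approach the passes would recognize the pattern and rewrite the loads, stores, and compute into wmma operations. Tile sizes below are hypothetical:

```python
import tvm

n = 1024
A = tvm.placeholder((n, n), name='A', dtype='float16')
B = tvm.placeholder((n, n), name='B', dtype='float16')
k = tvm.reduce_axis((0, n), name='k')
C = tvm.compute(
    (n, n),
    lambda i, j: tvm.sum(A[i, k].astype('float32') *
                         B[k, j].astype('float32'), axis=k),
    name='C')

s = tvm.create_schedule(C.op)
CL = s.cache_write(C, 'local')  # per-thread accumulator

# 64x64 output tile per block, 4x4 sub-tile per thread (16x16 threads)
i, j = s[C].op.axis
bi, ii = s[C].split(i, factor=64)
bj, jj = s[C].split(j, factor=64)
ty, ni = s[C].split(ii, factor=4)
tx, mi = s[C].split(jj, factor=4)
s[C].reorder(bi, bj, ty, tx, ni, mi)
s[C].bind(bi, tvm.thread_axis('blockIdx.y'))
s[C].bind(bj, tvm.thread_axis('blockIdx.x'))
s[C].bind(ty, tvm.thread_axis('threadIdx.y'))
s[C].bind(tx, tvm.thread_axis('threadIdx.x'))
s[CL].compute_at(s[C], tx)

print(tvm.lower(s, [A, B, C], simple_mode=True))
```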

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-03 Thread Siyuan Feng
@yangjunpro Really happy to see another solution for TensorCore. You are right! I just extended the TVM intrinsics to support it. It does cause some trouble for programmers who write the schedule; it is not easy to write a high-performance schedule. I'm really curious about how to use IR passes to recognize …

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-03 Thread Siyuan Feng
@tmoreau89 Exactly! For now, we use the NCHWnc layout, the same layout as VTA.
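For concreteness, a sketch of how an NCHWnc-packed input tensor could be declared. The sizes and the exact axis ordering are assumptions here; the lower-case n and c are the tiling factors of the batch and channel dimensions, and the innermost two axes form the tile that maps onto a Tensor Core fragment:

```python
import tvm

# hypothetical sizes; bn and bc are the inner tiling factors
N, C, H, W = 256, 256, 14, 14
bn, bc = 16, 16

# NCHWnc: batch and channel are split into outer (N//bn, C//bc)
# and inner (bn, bc) axes, so each innermost bn x bc tile is a
# contiguous block suitable for a 16x16 fragment load
data = tvm.placeholder((N // bn, C // bc, H, W, bn, bc),
                       name='data', dtype='float16')
```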

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-02 Thread Bing Xu
cc @Laurawly

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-02 Thread Jun Yang
Nice to see other folks working on adding TensorCore support to TVM; we have also been working on enhancing TVM to incorporate TensorCore schedule support. If my understanding is correct, @Hzfengsy, your solution is based on extending TVM's intrinsics, while our solution puts most of the complexity …

Re: [dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-02 Thread Thierry Moreau
Very welcome work, @Hzfengsy!

[dmlc/tvm] [RFC] Tensor Core Support (#4052)

2019-10-02 Thread Siyuan Feng
Tensor Core is a defining feature of NVIDIA's new Volta and Turing GPU architectures, giving a massive boost to matrix multiplication and convolution. Tensor Cores enable us to use mixed precision to achieve higher throughput without sacrificing accuracy. ## Tensor Core Overview Each Tensor …
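The mixed-precision pattern the RFC refers to is fp16 operands with fp32 accumulation, which is what a Tensor Core executes per instruction. Expressed as a TVM compute, a minimal sketch in the 0.6-era API (the 16x16x16 shape matches one WMMA fragment):

```python
import tvm

n = 16
A = tvm.placeholder((n, n), name='A', dtype='float16')
B = tvm.placeholder((n, n), name='B', dtype='float16')
k = tvm.reduce_axis((0, n), name='k')
# fp16 inputs, products accumulated in fp32: the mixed-precision
# contract a Tensor Core implements in hardware
C = tvm.compute(
    (n, n),
    lambda i, j: tvm.sum(A[i, k].astype('float32') *
                         B[k, j].astype('float32'), axis=k),
    name='C')
```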