We do support generating OpenCL, so we could run on Mali GPUs. However, we hadn't tested on Mali GPUs when we completed Ansor. There are some differences compared with NVIDIA GPUs: for example, on Mali we shouldn't use `cache_read("shared")`, because Mali GPUs don't have separate shared memory like NVIDIA GPUs do. We should also generate `vectorize` explicitly, which is not required on NVIDIA GPUs.
We have collected performance data for TFLite quantized models on ARM CPU, but we didn't put it in the paper. I am glad to share it: the target is 4 cores of Cortex-A53, the QNNPACK commit is b7bacb1899e6fa3a934c1dd6128096f2e1abf071, and only convolution layers are counted. As you can see, we have competitive performance compared with TFLite (2.1) and libraries like QNNPACK. However, there is still room to improve; for example, we should generate the paired instructions (`smlal` / `smlal2`), which could perhaps be done via tensorize.
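For readers unfamiliar with the pair, here is an illustrative Python sketch (not the actual tensorize intrinsic) of what the AArch64 instructions `smlal` / `smlal2` compute: a widening signed multiply-accumulate from int16 lanes into int32 accumulators, where `smlal` consumes the low half of a 128-bit vector of eight int16 lanes and `smlal2` the high half.

```python
def smlal(acc, a, b):
    # Low 4 lanes: acc[i] += a[i] * b[i], products widened int16 -> int32.
    return [acc[i] + a[i] * b[i] for i in range(4)]

def smlal2(acc, a, b):
    # High 4 lanes of the same 128-bit registers.
    return [acc[i] + a[4 + i] * b[4 + i] for i in range(4)]

a = [1, 2, 3, 4, 5, 6, 7, 8]  # eight int16 lanes
b = [2] * 8
acc_lo = smlal([0, 0, 0, 0], a, b)   # -> [2, 4, 6, 8]
acc_hi = smlal2([0, 0, 0, 0], a, b)  # -> [10, 12, 14, 16]
```

Emitting the two as a pair keeps the int32 accumulators in registers across the whole int16 vector, which is why a tensorize intrinsic covering both halves could help quantized convolution.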