Hi There,

In VTA, the first conv layer runs on the CPU and is not offloaded to the FPGA.
In most cases this is a performance bottleneck and needs optimization. Below
are some ideas for optimizing it; comments are welcome.

Regards
Hua

1. Train the network so that the first conv layer supports int8 input and
    weights, and add a feature to VTA that lets the 16*16 MAC array compute
    the 3-input-channel convolution.
 
2. When running on arm-cpu, it seems that only one CPU core is used for the
    first conv computation. We could parallelize the first conv across
    multiple CPU cores to speed it up.
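
One possible way to read idea 1 is to zero-pad the 3 input channels up to the width of the MAC block, so the reduction over channels has the shape the hardware expects. The sketch below is plain Python rather than VTA or TVM code. `BLOCK_IN`, `pad_channels`, `mac`, and `conv2d_point` are hypothetical names made up for illustration, and the 16-wide input-channel block is an assumption taken from the "16*16 MAC" wording above. It only checks that the padded computation gives the same result as the 3-channel one.

```python
# Hypothetical sketch: zero-pad a 3-channel int8 input to 16 channels so the
# channel reduction matches an assumed 16-wide MAC block. Not VTA code.

BLOCK_IN = 16  # assumed input-channel width of the MAC array


def pad_channels(vec, block=BLOCK_IN):
    """Zero-pad a per-pixel channel vector up to a multiple of `block`."""
    padded_len = -(-len(vec) // block) * block  # round up to a multiple of block
    return list(vec) + [0] * (padded_len - len(vec))


def mac(inp, wgt):
    """Multiply-accumulate over channels (a plain int stands in for the accumulator)."""
    return sum(a * b for a, b in zip(inp, wgt))


def conv2d_point(image, kernel, y, x):
    """One output value of a stride-1 convolution without spatial padding.

    image:  H x W x C nested lists, kernel: KH x KW x C nested lists.
    """
    acc = 0
    for ky, krow in enumerate(kernel):
        for kx, wgt in enumerate(krow):
            acc += mac(image[y + ky][x + kx], wgt)
    return acc


# Small deterministic test data: int8-range pixels, small integer weights.
image = [[[(7 * y + 3 * x + c) % 256 - 128 for c in range(3)]
          for x in range(4)] for y in range(4)]
kernel = [[[(5 * ky + 2 * kx + c) % 7 - 3 for c in range(3)]
           for kx in range(3)] for ky in range(3)]

# Pad both operands along the channel axis only.
padded_image = [[pad_channels(px) for px in row] for row in image]
padded_kernel = [[pad_channels(w) for w in row] for row in kernel]

reference = [[conv2d_point(image, kernel, y, x) for x in range(2)]
             for y in range(2)]
padded = [[conv2d_point(padded_image, padded_kernel, y, x) for x in range(2)]
          for y in range(2)]

# The zero channels contribute nothing, so both results must match.
assert reference == padded
```

Under that 16-lane assumption, only 3 of the 16 input lanes carry real data, so plain padding would keep the array mostly idle. Any dedicated 3-channel feature would presumably need to do better than this baseline.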

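For idea 2, the work can be split because each output channel of the convolution is independent of the others. The sketch below shows that split in plain Python with a thread pool. `conv_one_filter` and `conv_parallel` are hypothetical helpers, not TVM functions, and the worker count of 4 is an arbitrary assumption. In TVM itself this kind of split would more likely be expressed as a `parallel` annotation on an outer loop axis in the schedule; whether the current arm-cpu schedule for this layer already does so is not something this sketch verifies.

```python
from concurrent.futures import ThreadPoolExecutor


def conv_one_filter(image, kernel):
    """Stride-1 convolution (no spatial padding) of an H x W x C image
    with a single KH x KW x C filter."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[y + ky][x + kx][c] * kernel[ky][kx][c]
                for ky in range(kh)
                for kx in range(kw)
                for c in range(len(kernel[ky][kx]))
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def conv_parallel(image, filters, num_workers=4):
    """Compute each output channel as an independent task on a worker pool."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() preserves input order, so results are indexed by output channel.
        return list(pool.map(lambda f: conv_one_filter(image, f), filters))


# Small deterministic test data: a 5 x 5 x 3 image and eight 3 x 3 x 3 filters.
image = [[[(7 * y + 3 * x + c) % 256 - 128 for c in range(3)]
          for x in range(5)] for y in range(5)]
filters = [
    [[[(n + 5 * ky + 2 * kx + c) % 7 - 3 for c in range(3)]
      for kx in range(3)] for ky in range(3)]
    for n in range(8)
]

serial = [conv_one_filter(image, f) for f in filters]
parallel = conv_parallel(image, filters)

# Splitting by output channel must not change the result.
assert serial == parallel
```

In CPython the global interpreter lock keeps these pure-Python tasks from actually running at the same time, so this shows only how the work divides, not a real speedup. A compiled kernel with one thread per core would be needed to see the gain.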




---
[Visit Topic](https://discuss.tvm.ai/t/vta-first-conv-layer-optimize/6766/1) to 
respond.
