Thank you very much for your reply.
The hardware I use is AArch64 CPU with 8 cores. I refer to this tutorial to deploy tvm:https://tvm.apache.org/docs/deploy/cpp_deploy.html.The c++ thread that load and use tvm library is bound to 3 intermediate frequency cpus, and TVM_NUM_THREADS is set to 1(There is a question that confuses me: the larger the TVM_NUM_THREADS, the worse the performance, so TVM_NUM_THREADS is set to 1.I did not figure out why is the optimal TVM_NUM_THREADS not 3 or 8.) According to your conclusion, I think it is not easy to make tvm beyond MNN on my current hardware(ARM CPU, 8 cores, not want to occupy all the cpus), which makes me feel a little frustrated.But I will still make some efforts , such as try Ansor.If there is any progress, I will be happy to discuss further with you. --- [Visit Topic](https://discuss.tvm.apache.org/t/strassen-algorithm-for-dense/2661/10) to respond. You are receiving this because you enabled mailing list mode. To unsubscribe from these emails, [click here](https://discuss.tvm.apache.org/email/unsubscribe/5bae229acf286db8ec68742de6833d21b833c7ac57e4221d6352f80ff4a264d2).