thanks @yrchen and colleagues for the RFC! overall it's very exciting work. a couple of thoughts:

- is your eventual target bare-metal devices, or does your runtime require a kernel?
- `riscv_cpu` target: in the past we had introduced a special `micro_dev` target for µTVM work. recently, we deprecated that in favor of the `llvm` and `c` targets. then, when creating the list of candidate schedules for a given op, we (for ARM) analyze the ISA supported by the CPU given in `-mcpu`. is it possible to do something similar with RISC-V (i.e. encode the P extension in some flag such as `-mcpu=rv32p`)?
- LLVM support for the RISC-V P extension, and codegen: since you will need to build TVM against a forked LLVM, is it possible to use the `c` backend for any tests in CI until LLVM formally supports RISC-V P? we could then include a forked LLVM compiler in one of the CI docker images while still compiling TVM itself against mainline LLVM. you could take a look at the [GEMM impl](https://github.com/apache/incubator-tvm/blob/master/python/tvm/topi/arm_cpu/cortex_m7/micro_kernel/gemm.py) for cortex-m7 as an example of how to do that.
- RISC-V custom runtime: your sample `host.cpp` link was broken, but is it the one [here](https://github.com/nthu-pllab/RISCV-DLR/blob/master/example/pre_quant_mobilenet_v1_tflite/host.cpp)? I'm also beginning to look at AOT compilation, which looks somewhat similar to your `kernel.inc` code (but would be generated from TVM). there are some additional considerations, such as memory planning, that may depend more on the device layout. do you have a full example of `kernel.inc` anywhere I could look at?
- it looks like the function signatures in your DLR differ from the typically generated signature:

  ```
  typedef int (*TVMBackendPackedCFunc)(TVMValue* args, int* type_codes, int num_args,
                                       TVMValue* out_ret_value, int* out_ret_tcode,
                                       void* resource_handle);
  ```

  it seems like the main difference between this function and the `DLR` version is the lack of the `out_*` and `resource_handle` params?
- did you try using the new µTVM RPC server-based runtime with spike? this would allow you to use the graph runtime from the TVM python binary and perform autotuning. would it be possible to use that to submit the schedules as one PR and then split any runtime changes into another? we modified the [micro_tflite](https://tvm.apache.org/docs/tutorials/micro/micro_tflite.html) tutorial to demonstrate one use of that runtime.
- I don't quite understand your evaluation numbers. are these measured over a fixed time period? otherwise, it seems like there should be fewer instructions executed using the intrinsic for one inference run, correct?
- what is your plan for upstreaming the binutils and riscv-isa-sim work?
- for testing in CI, would we need to build a spike docker image?
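on the evaluation-numbers point: my mental model for why the intrinsic should retire fewer instructions per inference is sketched below. it models an SMAQA-style quad multiply-accumulate from the draft P extension (the instruction name and exact semantics here are my reading of the draft spec, and the model ignores 32-bit accumulator wraparound): a 16-element int8 dot product takes 4 packed MACs instead of 16 scalar ones.

```python
import numpy as np

def pack4(lanes):
    """pack four int8 lanes (little-endian) into one 32-bit register value."""
    return int.from_bytes(np.asarray(lanes, dtype=np.int8).tobytes(), "little")

def smaqa(acc, rs1, rs2):
    """software model of an SMAQA-style instruction (assumption from the draft
    P spec): add the four signed 8x8 products packed in rs1/rs2 to the
    accumulator. a real 32-bit accumulator would wrap; this model does not."""
    a = np.frombuffer(rs1.to_bytes(4, "little"), dtype=np.int8).astype(np.int32)
    b = np.frombuffer(rs2.to_bytes(4, "little"), dtype=np.int8).astype(np.int32)
    return acc + int(a @ b)

def int8_dot(x, y):
    """int8 dot product using one packed MAC per 4 elements; returns the
    result and the number of 'instructions' issued (len(x)/4 instead of the
    len(x) a scalar loop would need)."""
    acc, instrs = 0, 0
    for i in range(0, len(x), 4):
        acc = smaqa(acc, pack4(x[i:i + 4]), pack4(y[i:i + 4]))
        instrs += 1
    return acc, instrs
```

so for one inference run the dynamic instruction count of the packed kernel should be roughly a quarter of the scalar one, which is why a fixed-workload measurement showing otherwise would be surprising.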
---

[Visit Topic](https://discuss.tvm.apache.org/t/rfc-enable-tvm-qnn-on-risc-v-with-subword-simd-computation/7967/3) to respond.