Thanks @giuseros, I agree with what you said about removing overheads for embedded.
In the meantime, it is also worth thinking about some form of standardization specifically for embedded land that maintains the minimalism while still offering some generality. For example, some standardization around W1a, which removes the overhead of string lookup but still preserves the CPackedFunc, might be helpful. The CPackedFunc would then serve as a generic way for users to plug in customized operators (because we still need a somewhat type-erased function to remain general). We might also be able to further reduce the overhead if we aggressively perform link-time optimization and inline all the CPackedFunc calls, making the generated code effectively equivalent to standard calls.

So it would be great if we could work together to come up with such a standardization that we can use across the board. Once it happens (e.g. in the form of W1a), we can provide add-on libraries that expose the tiny standard API to the C runtime so that we can invoke the generated code through RPC, and then remove such dependencies when it comes to actual deployment.
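To make the idea a bit more concrete, here is a rough sketch of what a direct, lookup-free call through a type-erased function could look like. It assumes that W1a amounts to the generated code calling the packed function by symbol instead of resolving it by name at runtime. All names here (`PackedValue`, the `kType*` codes, the simplified `CPackedFunc` typedef, `my_custom_add`, `generated_call_add`) are hypothetical and only loosely modelled on a packed-call convention; this is not the actual TVM signature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type-erased value; illustrative only, not the real TVM ABI. */
typedef union {
  int64_t v_int64;
  double v_float64;
  void* v_handle;
} PackedValue;

/* Hypothetical type codes describing which union member is active. */
enum { kTypeInt = 0, kTypeFloat = 1, kTypeHandle = 2 };

/* Hypothetical CPackedFunc-style signature: returns 0 on success. */
typedef int (*CPackedFunc)(PackedValue* args, const int* type_codes,
                           int num_args, PackedValue* out_ret,
                           int* out_ret_tcode);

/* A user-supplied custom operator exposed through the type-erased signature. */
static int my_custom_add(PackedValue* args, const int* type_codes,
                         int num_args, PackedValue* out_ret,
                         int* out_ret_tcode) {
  if (num_args != 2 || type_codes[0] != kTypeInt || type_codes[1] != kTypeInt) {
    return -1; /* wrong arity or argument types */
  }
  out_ret->v_int64 = args[0].v_int64 + args[1].v_int64;
  *out_ret_tcode = kTypeInt;
  return 0;
}

/* What generated code could look like: the operator is referenced directly
 * by symbol, so there is no string lookup in a function registry. */
static int generated_call_add(int64_t a, int64_t b, int64_t* result) {
  PackedValue args[2];
  int type_codes[2] = {kTypeInt, kTypeInt};
  PackedValue ret;
  int ret_tcode = -1;
  int status;

  assert(result != 0);
  args[0].v_int64 = a;
  args[1].v_int64 = b;

  status = my_custom_add(args, type_codes, 2, &ret, &ret_tcode);
  if (status != 0 || ret_tcode != kTypeInt) {
    return -1;
  }
  *result = ret.v_int64;
  return 0;
}
```

Because the callee is a plain symbol with a fixed signature, a compiler doing link-time optimization is free to inline `my_custom_add` into `generated_call_add`, at which point the argument packing can in principle be optimized away. Whether that happens in practice depends on the toolchain and flags.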