Hi all, 
I finally have a first version of the AOT work up as an upstream PR.
## PR
You can find the PR here:  https://github.com/apache/tvm/pull/7785

At this stage, I gladly accept any feedback on things that can be improved in 
the PR or on issues I might have overlooked. Please help me smooth the rough 
edges of this work :slight_smile: 
## Limitations
There are two main limitations of the current work:
* We didn't add support for LLVM code generation. We thought it better to 
agree on the overall picture first, using the `c` backend as a proof of 
concept, and then take care of the LLVM backend.
* We didn't include support for `LetNode` in the `aot_codegen`. Support for 
`LetNode` is in the pipeline and will be added soon.
## Next steps
Bear in mind that this is only the first step of a journey. We are currently 
working on different improvements to AOT, in particular:
* **LLVM support** LLVM support is currently being worked on and we are almost 
there
* **Name mangling** We are adding name mangling into the picture, i.e., the 
user should be able to specify a prefix and this prefix should be added to all 
the global names used in the library. In this way, we will enable the user to 
build and link more than one network in the same application.
* **DLTensor surgery** Since the memory allocation is done statically, we don't 
need to carry `DLTensor` through the generated code. It exposes metadata that 
is not consumed by the codegen and that increases the size of the binary image 
to be flashed on the microcontroller.
* **Unpack the runner function signature** We would like to change the API of 
the runner function so that it no longer has a packed signature. This avoids 
instantiating `type_id`s and forcing a dynamic size of the function stack, 
neither of which adds any benefit in the embedded space, while both take a 
toll in terms of code size, performance and power.
* **`int64_t` surgery** Using `int64_t` on embedded devices usually increases 
register spilling, which means power and performance will be heavily 
affected. We are removing this datatype everywhere it is used.
* **Remove param lookup through `__lookup_linked_param`** To keep things 
simple, we are currently reusing the `__lookup_linked_param` function to 
access the parameters in the library. However, with AOT we can simply create a 
TIR builtin that accesses the parameters directly, without the overhead of a 
function invocation. This is again with the aim of saving power, performance 
and space. 
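
To make the name mangling and unpacked signature items above a bit more concrete, here is a minimal C sketch of the idea. It is not the code the PR generates: `packed_value_t`, `TYPE_CODE_HANDLE`, `MANGLE`, the `net_a`/`net_b` prefixes and the toy "networks" are all hypothetical names made up for illustration. It only contrasts a typed, fixed-arity runner with a packed-style wrapper, and shows how a prefix lets two networks coexist in one binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packed argument value (not TVM's real type). */
typedef union {
  int64_t v_int64;
  double v_float64;
  void *v_handle;
} packed_value_t;

/* Hypothetical type code meaning "opaque pointer". */
enum { TYPE_CODE_HANDLE = 3 };

/* Name mangling: prepend a user-chosen prefix to a global symbol name. */
#define CONCAT_(prefix, name) prefix##_##name
#define MANGLE(prefix, name) CONCAT_(prefix, name)

#define NUM_ELEMS 4

/* Unpacked runner for a toy network "net_a": typed pointers, fixed arity,
 * no type codes and no 64-bit integers. Expands to net_a_run. */
int32_t MANGLE(net_a, run)(const float *input, float *output) {
  assert(input != NULL && output != NULL);
  for (int32_t i = 0; i < NUM_ELEMS; ++i) {
    output[i] = input[i] * 2.0f;
  }
  return 0;
}

/* A second toy network in the same binary. Expands to net_b_run, so it
 * does not clash with net_a_run at link time. */
int32_t MANGLE(net_b, run)(const float *input, float *output) {
  assert(input != NULL && output != NULL);
  for (int32_t i = 0; i < NUM_ELEMS; ++i) {
    output[i] = input[i] + 1.0f;
  }
  return 0;
}

/* Packed-style wrapper: the caller must build an argument array plus a
 * parallel array of type codes, and the callee must validate both at
 * run time before it can reach the typed implementation. */
int32_t net_a_run_packed(packed_value_t *args, const int32_t *type_codes,
                         int32_t num_args) {
  if (num_args != 2) {
    return -1;
  }
  if (type_codes[0] != TYPE_CODE_HANDLE || type_codes[1] != TYPE_CODE_HANDLE) {
    return -1;
  }
  return net_a_run((const float *)args[0].v_handle,
                   (float *)args[1].v_handle);
}
```

With the unpacked form, an application calls `net_a_run(in, out)` and `net_b_run(in, out)` directly. The packed wrapper needs the extra argument and type-code arrays at every call site, which is the bookkeeping we would like to drop on embedded targets.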

cc: @ramana-arm @manupa-arm @areusch @matt-arm @stoa @mjs




