cc @MJKlaiber 

@Mousius thanks for splitting this off into another RFC. I agree implementing a 
low-overhead embedded interface is super important. A couple thoughts:

At a high level, it would be great to explicitly spell out the entire interface 
we expect to implement here. I think it might be useful to include an entire 
`main()` program (either here or perhaps linked as a branch if it's long) just 
to ensure we aren't leaving anything out.

### Runtime vs compile time knowledge

A key question we should tackle here is when model metadata should be 
available. Basically there are two scenarios:

S1. The user wants to use model metadata in the compilation flow.

S2. The user wants to write functions that make use of model metadata at 
runtime.

In my opinion we need to support both, so any metadata stored in a struct here 
should also be present in JSON generated as part of Model Library Format.
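
To make S1 and S2 concrete, here is a minimal sketch of a generated header that 
exposes the same values twice: as `#define` constants for compile-time use, and 
as a `const` struct for runtime use. Every name in it (`TVM_MY_MODEL_*`, 
`tvm_my_model_metadata_t`, the helper function) is hypothetical and only meant 
to illustrate the idea; the JSON in Model Library Format would carry the same 
numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generated header; none of these names exist in TVM today. */

/* S1: compile-time view, usable in array sizes and preprocessor checks. */
#define TVM_MY_MODEL_NUM_INPUTS 2
#define TVM_MY_MODEL_INPUT1_BYTES (1 * 32 * 32 * 3) /* dimensions are examples */
#define TVM_MY_MODEL_INPUT2_BYTES (10 * 5 * 5 * 3)

/* S2: runtime view, holding the same values in a read-only struct. */
typedef struct {
  uint32_t num_inputs;
  size_t input_bytes[TVM_MY_MODEL_NUM_INPUTS];
} tvm_my_model_metadata_t;

static const tvm_my_model_metadata_t tvm_my_model_metadata = {
    .num_inputs = TVM_MY_MODEL_NUM_INPUTS,
    .input_bytes = {TVM_MY_MODEL_INPUT1_BYTES, TVM_MY_MODEL_INPUT2_BYTES},
};

/* Example of a user function that consumes the metadata at runtime. */
static size_t tvm_my_model_total_input_bytes(const tvm_my_model_metadata_t* md) {
  assert(md != NULL);
  size_t total = 0;
  for (uint32_t i = 0; i < md->num_inputs; ++i) {
    total += md->input_bytes[i];
  }
  return total;
}
```

Since the struct is initialized from the constants, the two views cannot drift 
apart within one generated artifact.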

### Model Input and Output Allocation

I think it'd be great to illustrate how we expect users to allocate model 
inputs and outputs. This is kind of there, but it would be great to propose the 
thing end-to-end. In particular, I'm curious how a user should size the 
tensors. One possible sketch is to generate code like:
```
typedef struct {
    uint8_t input1[1 * 32 * 32 * 3];   // dimensions are examples
    int8_t input2[10 * 5 * 5 * 3];
} tvm_model_input_t;
```
This allows users with simple memory layout requirements to just declare the 
struct in the correct memory address space, and fill data as needed. It also 
serves as documentation-as-code of the required inputs and memory. We could 
move the buffer sizes to be constants, too. I want to ensure users retain 
control of all memory allocations, but we should design the API such that the 
typical case is very easy to use.
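
As a rough illustration of the "buffer sizes as constants" variant, the sketch 
below names the element counts and shows a user filling one input. The 
`TVM_MODEL_INPUT*_LEN` macros and `fill_input1` are hypothetical names I made 
up for this example; only the struct layout comes from the snippet above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generated constants; dimensions are examples. */
#define TVM_MODEL_INPUT1_LEN (1 * 32 * 32 * 3)
#define TVM_MODEL_INPUT2_LEN (10 * 5 * 5 * 3)

typedef struct {
  uint8_t input1[TVM_MODEL_INPUT1_LEN];
  int8_t input2[TVM_MODEL_INPUT2_LEN];
} tvm_model_input_t;

/* The user decides where the struct lives; here it is a plain global. */
static tvm_model_input_t g_model_input;

/* Hypothetical application helper: copies a frame into input1.
 * Returns 0 on success, -1 if the frame length does not match the tensor. */
static int fill_input1(tvm_model_input_t* in, const uint8_t* frame,
                       size_t frame_len) {
  assert(in != NULL && frame != NULL);
  if (frame_len != sizeof(in->input1)) {
    return -1;
  }
  memcpy(in->input1, frame, frame_len);
  return 0;
}
```

The application can size its own staging buffers from the same constants, so a 
change in model shape shows up as a compile-time change rather than a runtime 
surprise.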

### Custom-Workspace Compilation

I would take this a step further and ask whether we can make the workspace size 
a `#define` constant so that the user can allocate the space at compile time, 
or whether we expect it to live in the Model Library Format metadata as the 
means to access it at compile time. For example, instead of:

```
TVMSetWorkspaces(&context, malloc(TVMGetWorkspaceSize(model, 0)));
TVMExecute(&my_model, inputs, outputs, context);
```

I'd like people to be able to write:
```
uint8_t g_workspace[TVM_MODEL_NAME_WORKSPACE_BYTES];

int main(void) {
  TVMSetWorkspaces(&context, g_workspace);
  return 0;
}
```

Finally, is it possible that whatever context is needed to identify the 
workspace could optionally live in flash? This has some benefits e.g. in simple 
deployment scenarios when the workspace is allocated as global memory. In this 
case, it's not possible to overwrite it with invalid pointers, which is a class 
of bugs that can be hard to track down on embedded platforms.
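
Here is a minimal sketch of what a flash-resident context could look like, 
assuming the context is nothing more than a table of workspace pointers. The 
type `tvm_const_context_t`, the helper, and the placeholder size are all 
hypothetical. Whether `const` data actually ends up in flash depends on the 
toolchain and linker script, though that is the typical outcome for read-only 
data with static storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value; in practice this would be generated per model. */
#define TVM_MODEL_NAME_WORKSPACE_BYTES 4096

/* The workspace itself must be writable, so it stays in RAM. */
static uint8_t g_workspace[TVM_MODEL_NAME_WORKSPACE_BYTES];

/* The pointer table is const, so the linker is free to place it in
 * read-only memory; the pointers are fixed at link time. */
static void* const g_workspace_table[] = {g_workspace};

/* Hypothetical read-only variant of the context from the RFC. */
typedef struct {
  void* const* workspace; /* pointers cannot be changed through this struct */
} tvm_const_context_t;

static const tvm_const_context_t g_context = {g_workspace_table};

/* Looks up one workspace; the caller supplies a valid index. */
static void* tvm_context_workspace(const tvm_const_context_t* ctx, size_t index) {
  assert(ctx != NULL);
  return ctx->workspace[index];
}
```

With this layout, an attempt to assign to `g_context.workspace` or to an entry 
of `g_workspace_table` is rejected by the compiler.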

### Context

> Paired with the model descriptor, this provides any contextual information 
> required to run the model, such as an application driven workspace 
> configuration:
> 
> ```
> typedef struct {
>       void** workspace; /** Pointers to different memory to use as a 
> workspace */
> } TVMContext;
> ```

I'd like to avoid general-purpose structs if possible, at least at this phase 
of the implementation. While I think it's likely some top-level glue struct 
will eventually be a useful entry point for developers (and something is likely 
going to be needed as `resource_handle`), I think there are still quite a few 
things related to e.g. multi-core and accelerator dispatch yet to be decided. 
Rather than provide a sort of "kitchen sink" struct, I'd like to encourage us 
to define dedicated places for each orthogonal aspect of the computation. I 
think it'd be great to make progress on the API in this RFC and tackle the 
accelerator dispatch question in a follow-on.

### Generated APIs vs function pointers

When considering how to write user-facing APIs, I think we have a couple of 
choices:

G1. Generate a function call table e.g. `TVMModel` and write wrapper functions 
around it.

G2. Generate a wrapper function with a standard interface (or perhaps a 
standard templated model interface).

Here, I'm not necessarily proposing to generate a wrapper function with 
model-specific signatures (though that has been proposed elsewhere). Instead, I 
am just wondering whether it's necessary to place the `entrypoint` function 
pointer in `TVMModel`. It seems like we may have some desire to generate 
model-specific C++ metadata outside of that generated by the AOT codegen, so I 
wonder if it's worth it to just build a small codegen dedicated to this 
user-facing API now. Doing this would also remove the need for "accessor" 
functions such as `TVMGetTVMVersionMajor`.
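
To compare the two options side by side, here is a rough sketch. The entrypoint 
signature, the `TVMModel` layout, and every function and macro name other than 
`TVMGetTVMVersionMajor` are assumptions made for illustration, and the 
"model" just adds one to its input so the example can run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the AOT-generated entrypoint; the signature is assumed. */
static int my_model_entrypoint(const uint8_t* input, uint8_t* output) {
  output[0] = (uint8_t)(input[0] + 1); /* placeholder computation */
  return 0;
}

/* G1: a function call table plus generic wrapper and accessor functions. */
typedef struct {
  int (*entrypoint)(const uint8_t* input, uint8_t* output);
  uint32_t tvm_version_major;
} TVMModel; /* hypothetical layout */

static const TVMModel my_model = {my_model_entrypoint, 0};

static int TVMExecuteViaTable(const TVMModel* model, const uint8_t* input,
                              uint8_t* output) {
  assert(model != NULL && model->entrypoint != NULL);
  return model->entrypoint(input, output); /* indirect call */
}

static uint32_t TVMGetTVMVersionMajor(const TVMModel* model) {
  assert(model != NULL);
  return model->tvm_version_major;
}

/* G2: a generated wrapper with a standard interface. Metadata becomes a
 * constant, so no accessor function is needed. */
#define TVM_MY_MODEL_TVM_VERSION_MAJOR 0 /* placeholder value */

static int tvm_my_model_run(const uint8_t* input, uint8_t* output) {
  return my_model_entrypoint(input, output); /* direct call */
}
```

In the G2 form the call is direct and the version is visible to the 
preprocessor, which is the main reason I lean towards a small dedicated codegen.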

### Accelerator binding

If possible, I'd like to defer this to a separate RFC. There are lots of 
questions to be answered there, and we would need to review a lifecycle diagram 
of the accelerator to answer them.




