@mousius Thanks for this important RFC. I'd like to approach this from the other way around: what parts of device configuration does it make sense for TVM to involve itself with, and which parts don't necessitate any binding between the TVM runtime and device control? In this framework you can almost think of TVM as providing a set of "device callbacks" which it invokes when the user requests it to do something that might necessitate a change in device state.
In the Module-based Model Runtime Interface, there is a lifecycle roughly as follows:

```
     +---------------+
     | uninitialized | <-----------------+
     +---------------+                   |
            ↓ (instantiate Executor)     | (destruct Executor)
     +---------------+ ------------------+
+--> |  initialized  | <- device memories available for preloading; constants loaded
|    +---------------+
|           ↓ (start of run())
|    +---------------+
|    |   executing   | <- device available to launch compute tasks with low latency
|    +---------------+
|           ↓ (end of run())
+-----------+
```

I can appreciate we aren't necessarily keeping compatibility down to the letter with the Module-based Model Runtime (MBMR) in microTVM. However, internally the compiler needs some model of the executor strategy to work with. I don't *think* we've conceptually gone away from the MBMR model yet, and I'd prefer to keep with it even if the specific APIs used on microcontrollers don't exactly replicate the C++ executors.

With this in mind, it seems like we may want a callback for each transition in the graph above. I could see this as:

* `TVMDeviceOpen` -- to match instantiating the executor. Contract is that the device memory becomes available for use.
* `TVMDeviceActivate` -- to match starting the `run` function. Contract is that the device exits any low-power state which may impact inference latency.
* `TVMDeviceDeactivate` -- to match ending the `run` function. Contract is that the device may re-enter any low-power state left in `Activate`, but must maintain device memory state.
* `TVMDeviceClose` -- to match destructing the executor. Contract is that the device may be released for others to use.

Now these look suspiciously similar to your Open/Close and Init/Destroy, so forgive me if I've written a bunch of text only to agree with you. I'm not attached to the names I've used, but let's make sure we write down the contracts for these functions. Your function signatures look fine to me.
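To make the four contracts above a bit more concrete, here's a minimal sketch in C that treats them as transitions of a small state machine. Everything beyond the four callback names is an assumption for illustration: the fields on `tvm_device_t`, the `DeviceState` enum, the `Transition` and `RunInference` helpers, and the return convention (0 on success, -1 on an invalid transition) are hypothetical and not taken from the RFC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lifecycle states mirroring the diagram above. */
typedef enum {
  kDeviceUninitialized = 0,
  kDeviceInitialized,
  kDeviceExecuting
} DeviceState;

/* Hypothetical stand-in for the platform-specific tvm_device_t. */
typedef struct {
  DeviceState state;
  int memory_available; /* nonzero once device memory may be used */
  int low_power;        /* nonzero while a low-power state is allowed */
} tvm_device_t;

/* Move from `from` to `to`; reject any transition not in the diagram. */
static int32_t Transition(tvm_device_t* dev, DeviceState from, DeviceState to) {
  if (dev == NULL || dev->state != from) return -1;
  dev->state = to;
  return 0;
}

/* Executor instantiated: device memory becomes available for use. */
int32_t TVMDeviceOpen(tvm_device_t* dev) {
  if (Transition(dev, kDeviceUninitialized, kDeviceInitialized) != 0) return -1;
  dev->memory_available = 1;
  dev->low_power = 1;
  return 0;
}

/* Start of run(): leave any low-power state that would hurt latency. */
int32_t TVMDeviceActivate(tvm_device_t* dev) {
  if (Transition(dev, kDeviceInitialized, kDeviceExecuting) != 0) return -1;
  dev->low_power = 0;
  return 0;
}

/* End of run(): low power is allowed again; memory state is kept. */
int32_t TVMDeviceDeactivate(tvm_device_t* dev) {
  if (Transition(dev, kDeviceExecuting, kDeviceInitialized) != 0) return -1;
  dev->low_power = 1;
  return 0;
}

/* Executor destructed: the device may be released for others to use. */
int32_t TVMDeviceClose(tvm_device_t* dev) {
  if (Transition(dev, kDeviceInitialized, kDeviceUninitialized) != 0) return -1;
  dev->memory_available = 0;
  return 0;
}

/* Hypothetical run(): bracket the compute with Activate/Deactivate. */
int32_t RunInference(tvm_device_t* dev) {
  if (TVMDeviceActivate(dev) != 0) return -1;
  /* ... operator calls would be launched here ... */
  return TVMDeviceDeactivate(dev);
}
```

The point of the sketch is only that each callback corresponds to exactly one edge in the diagram, so calling `TVMDeviceActivate` before `TVMDeviceOpen`, or `TVMDeviceClose` in the middle of a run, is rejected rather than silently accepted.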
### Type of `tvm_device_t`

I like the idea of making this platform-specific, but I wonder if there will be device-specific state that may be unnecessarily replicated across multiple accelerators (e.g. `tvm_device_t` is sort of forced to be a union struct if it is only platform-specific and not device-specific). Should we further narrow this to e.g. `tvm_device_woofles_t`?

### Device API functions

It would be best to assume we'll need to implement the full C++ Device API even if most of the functions are no-ops.

### Follow-ups

> When this is packed in the lowering phase, the `resource_handle` will be
> assumed to exist as the last argument after being provided by the executor
> code generation. The eventual `Call` returned in `lower_tvm_builtin.c`
> contains the `resource_handle` by removing this final argument:

Is this specific to the AOT main function's TIR? It seems like it may be hard to verify that a TIR Call node has `resource_handle` included correctly with the args. Should we track `resource_handle` separately from the `ins` and `outs`? (I realize this may have been the subject of another PR which I pushed back on, so now that I have context we could probably reconsider.)

> Initially, devices will be defined by Target name or external compiler name.
> This means if you mark an operator as needing an external `woofles` compiler
> it would result in a devices struct such as:

It would be great to note that this applies to the Target string.

Finally, it would be great to spell out the full Device API somewhere so the full extent of this proposal is clear.