25/01/2023 20:01, Jerin Jacob:
> On Wed, Jan 25, 2023 at 7:50 PM Thomas Monjalon <tho...@monjalon.net> wrote:
> > 14/11/2022 13:02, jer...@marvell.com:
> > > ML Model: An ML model is an algorithm trained over a dataset. A model consists of
> > > a procedure/algorithm and the data/pattern required to make predictions on live data.
> > > Once the model is created and trained outside of the DPDK scope, it can be loaded
> > > via rte_ml_model_load() and then started using the rte_ml_model_start() API.
> > > rte_ml_model_params_update() can be used to update model parameters such as weights
> > > and bias without unloading the model via rte_ml_model_unload().
> >
> > The fact that the model is prepared outside means the model format is free
> > and probably different per mldev driver.
> > I think it is OK, but it requires a lot of documentation effort to explain
> > how to bind the model and its parameters with the DPDK API.
> > Also we may need to pass some metadata from the model builder
> > to the inference engine in order to enable optimizations prepared in the model.
> > And the other way around, we may need inference capabilities in order to generate
> > an optimized model which can run in the inference engine.
>
> The base API specification is kept to the absolute minimum. Currently, the weight
> and bias parameters are updated through rte_ml_model_params_update(). It can be
> extended when there are drivers that support it, or if you have any specific
> parameter you would like to add to rte_ml_model_params_update().
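To make sure I read this right, the intended usage would be something along
these lines (a hypothetical sketch on my side; only the rte_ml_* function names
come from the RFC, while the stop/start placement, their argument lists and the
buffer layout are my assumptions):

/* Hypothetical sketch, not from the RFC: refresh the weights/bias of an
 * already loaded model without unloading it.  new_params points to a blob
 * in the driver's native parameter format; the API does not describe its
 * layout, which is part of my question below.
 */
static int
update_model_params(int16_t dev_id, int16_t model_id, void *new_params)
{
	int ret;

	ret = rte_ml_model_stop(dev_id, model_id);	/* assumed signature */
	if (ret != 0)
		return ret;
	ret = rte_ml_model_params_update(dev_id, model_id, new_params);
	if (ret != 0)
		return ret;
	return rte_ml_model_start(dev_id, model_id);	/* assumed signature */
}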
This function is
int rte_ml_model_params_update(int16_t dev_id, int16_t model_id, void *buffer);
How are we supposed to provide separate parameters in this void*?

> Other metadata like batch, shapes and formats are queried using
> rte_ml_io_info().

Copying:
+/** Input and output data information structure
+ *
+ * Specifies the type and shape of input and output data.
+ */
+struct rte_ml_io_info {
+	char name[RTE_ML_STR_MAX];
+	/**< Name of data */
+	struct rte_ml_io_shape shape;
+	/**< Shape of data */
+	enum rte_ml_io_type qtype;
+	/**< Type of quantized data */
+	enum rte_ml_io_type dtype;
+	/**< Type of de-quantized data */
+};

Is it the right place to notify the app that some model optimizations
are supported? (example: merge some operations in the graph)

> > [...]
> > > Typical application utilisation of the ML API will follow the following
> > > programming flow.
> > >
> > > - rte_ml_dev_configure()
> > > - rte_ml_dev_queue_pair_setup()
> > > - rte_ml_model_load()
> > > - rte_ml_model_start()
> > > - rte_ml_model_info()
> > > - rte_ml_dev_start()
> > > - rte_ml_enqueue_burst()
> > > - rte_ml_dequeue_burst()
> > > - rte_ml_model_stop()
> > > - rte_ml_model_unload()
> > > - rte_ml_dev_stop()
> > > - rte_ml_dev_close()
> >
> > Where is parameters update in this flow?
>
> Added the mandatory APIs in the top-level flow doc.
> rte_ml_model_params_update() is used to update the parameters.

The question is "where" should it be done?
Before/after start?

> > Should we update all parameters at once, or can it be done more fine-grained?
>
> Currently, rte_ml_model_params_update() can be used to update weights
> and bias via a buffer when the device is in the stop state,
> without unloading the model.

The question is: can we update a single parameter? And how?

> > Question about the memory used by mldev:
> > Can we manage where the memory is allocated (host, device, mix, etc)?
>
> Just passing buffer pointers now, like other subsystems.
> Other EAL infra services can take care of the locality of memory, as it
> is not specific to mldev.

I was thinking about the memory allocation required by the inference engine.
How do we specify where to allocate?
Is it just hardcoded in the driver?
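To illustrate what I mean, my understanding of the current model is that the
application would do something like the following (a sketch on my side, not
from the RFC; input_size/output_size are placeholders and the rte_ml_op wiring
is left out):

/* Sketch: the application allocates the I/O buffers itself and only hands
 * raw pointers to mldev.  Nothing here lets it request device-local memory;
 * whether the driver copies to the device or DMAs in place is hidden.
 */
void *input  = rte_malloc("ml_input",  input_size,  RTE_CACHE_LINE_SIZE);
void *output = rte_malloc("ml_output", output_size, RTE_CACHE_LINE_SIZE);
/* ...fill input, reference both pointers from the op(s) passed to
 * rte_ml_enqueue_burst(), then release them with rte_free() when done... */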