RE: [PATCH v5 23/26] regexdev: remove experimental tag
Hi Stephen,

As I replied to the previous patch, please don't remove the experimental tag
from this lib, since Nvidia will probably remove support in the near future.
The same goes for Marvell, so this lib may be deprecated soon.
I don't think we want to signal to everyone that these functions are here to
stay, and we don't want to force on future HW providers an API that doesn't
meet their needs.

Thanks,
Ori

> -----Original Message-----
> From: Stephen Hemminger
> Sent: Friday, October 20, 2023 11:58 PM
> To: dev@dpdk.org
> Cc: Stephen Hemminger; Ori Kam
> Subject: [PATCH v5 23/26] regexdev: remove experimental tag
>
> This library was added in 22.11.
> Time to make it not experimental.
>
> Signed-off-by: Stephen Hemminger
> ---
>  lib/regexdev/rte_regexdev.h | 92 -
>  lib/regexdev/version.map    |  2 +-
>  2 files changed, 1 insertion(+), 93 deletions(-)
>
> diff --git a/lib/regexdev/rte_regexdev.h b/lib/regexdev/rte_regexdev.h
> index d50af775b551..3ea1f0c061a0 100644
> --- a/lib/regexdev/rte_regexdev.h
> +++ b/lib/regexdev/rte_regexdev.h
> @@ -226,9 +226,6 @@ extern int rte_regexdev_logtype;
>  } while (0)
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Check if dev_id is ready.
>   *
>   * @param dev_id
> @@ -238,27 +235,19 @@ extern int rte_regexdev_logtype;
>   * - 0 if device state is not in ready state.
>   * - 1 if device state is ready state.
>   */
> -__rte_experimental
>  int rte_regexdev_is_valid_dev(uint16_t dev_id);
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Get the total number of RegEx devices that have been successfully
>   * initialised.
>   *
>   * @return
>   *   The total number of usable RegEx devices.
>   */
> -__rte_experimental
>  uint8_t
>  rte_regexdev_count(void);
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Get the device identifier for the named RegEx device.
> * > * @param name > @@ -268,7 +257,6 @@ rte_regexdev_count(void); > * Returns RegEx device identifier on success. > * - <0: Failure to find named RegEx device. > */ > -__rte_experimental > int > rte_regexdev_get_dev_id(const char *name); > > @@ -628,9 +616,6 @@ struct rte_regexdev_info { > }; > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Retrieve the contextual information of a RegEx device. > * > * @param dev_id > @@ -644,7 +629,6 @@ struct rte_regexdev_info { > * - 0: Success, driver updates the contextual information of the RegEx > device > * - <0: Error code returned by the driver info get function. > */ > -__rte_experimental > int > rte_regexdev_info_get(uint8_t dev_id, struct rte_regexdev_info *dev_info); > > @@ -723,9 +707,6 @@ struct rte_regexdev_config { > }; > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Configure a RegEx device. > * > * This function must be invoked first before any other function in the > @@ -743,7 +724,6 @@ struct rte_regexdev_config { > * @return > * - 0: Success, device configured. Otherwise negative errno is returned. > */ > -__rte_experimental > int > rte_regexdev_configure(uint8_t dev_id, const struct rte_regexdev_config > *cfg); > > @@ -782,9 +762,6 @@ struct rte_regexdev_qp_conf { > }; > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Allocate and set up a RegEx queue pair for a RegEx device. > * > * @param dev_id > @@ -799,15 +776,11 @@ struct rte_regexdev_qp_conf { > * @return > * 0 on success. Otherwise negative errno is returned. > */ > -__rte_experimental > int > rte_regexdev_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id, > const struct rte_regexdev_qp_conf *qp_conf); > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Start a RegEx device. 
> * > * The device start step is the last one and consists of setting the RegEx > @@ -822,14 +795,10 @@ rte_regexdev_queue_pair_setup(uint8_t dev_id, > uint16_t queue_pair_id, > * @return > * 0 on success. Otherwise negative errno is returned. > */ > -__rte_experimental > int > rte_regexdev_start(uint8_t dev_id); > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Stop a RegEx device. > * > * Stop a RegEx device. The device can be restarted with a call to > @@ -845,14 +814,10 @@ rte_regexdev_start(uint8_t dev_id); > * @return > * 0 on success. Otherwise negative errno is returned. > */ > -__rte_experimental > int > rte_regexdev_stop(uint8_t dev_id); > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Close a RegEx device. The device cannot be restarted! > * > * @param dev_id > @@ -861,7 +8
Re: [RFC PATCH v4 1/4] dts: code adjustments for sphinx
On 8/31/23 11:04, Juraj Linkeš wrote: sphinx-build only imports the Python modules when building the documentation; it doesn't run DTS. This requires changes that make the code importable without running it. This means: * properly guarding argument parsing in the if __name__ == '__main__' block. * the logger used by DTS runner underwent the same treatment so that it doesn't create unnecessary log files. * however, DTS uses the arguments to construct an object holding global variables. The defaults for the global variables needed to be moved from argument parsing elsewhere. * importing the remote_session module from framework resulted in circular imports because of one module trying to import another module. This is fixed by more granular imports. Signed-off-by: Juraj Linkeš --- dts/framework/config/__init__.py | 3 - dts/framework/dts.py | 34 ++- dts/framework/remote_session/__init__.py | 41 - .../interactive_remote_session.py | 0 .../{remote => }/interactive_shell.py | 0 .../{remote => }/python_shell.py | 0 .../remote_session/remote/__init__.py | 27 -- .../{remote => }/remote_session.py| 0 .../{remote => }/ssh_session.py | 0 .../{remote => }/testpmd_shell.py | 0 dts/framework/settings.py | 92 +++ dts/framework/test_suite.py | 3 +- dts/framework/testbed_model/__init__.py | 12 +-- dts/framework/testbed_model/common.py | 29 ++ dts/framework/testbed_model/{hw => }/cpu.py | 13 +++ dts/framework/testbed_model/hw/__init__.py| 27 -- .../linux_session.py | 4 +- dts/framework/testbed_model/node.py | 22 - .../os_session.py | 14 +-- dts/framework/testbed_model/{hw => }/port.py | 0 .../posix_session.py | 2 +- dts/framework/testbed_model/sut_node.py | 8 +- dts/framework/testbed_model/tg_node.py| 30 +- .../traffic_generator/__init__.py | 24 + .../capturing_traffic_generator.py| 2 +- .../{ => traffic_generator}/scapy.py | 17 +--- .../traffic_generator.py | 16 +++- .../testbed_model/{hw => }/virtual_device.py | 0 dts/framework/utils.py| 53 +-- dts/main.py | 3 +- 30 files changed, 
229 insertions(+), 247 deletions(-) rename dts/framework/remote_session/{remote => }/interactive_remote_session.py (100%) rename dts/framework/remote_session/{remote => }/interactive_shell.py (100%) rename dts/framework/remote_session/{remote => }/python_shell.py (100%) delete mode 100644 dts/framework/remote_session/remote/__init__.py rename dts/framework/remote_session/{remote => }/remote_session.py (100%) rename dts/framework/remote_session/{remote => }/ssh_session.py (100%) rename dts/framework/remote_session/{remote => }/testpmd_shell.py (100%) create mode 100644 dts/framework/testbed_model/common.py rename dts/framework/testbed_model/{hw => }/cpu.py (95%) delete mode 100644 dts/framework/testbed_model/hw/__init__.py rename dts/framework/{remote_session => testbed_model}/linux_session.py (98%) rename dts/framework/{remote_session => testbed_model}/os_session.py (97%) rename dts/framework/testbed_model/{hw => }/port.py (100%) rename dts/framework/{remote_session => testbed_model}/posix_session.py (99%) create mode 100644 dts/framework/testbed_model/traffic_generator/__init__.py rename dts/framework/testbed_model/{ => traffic_generator}/capturing_traffic_generator.py (99%) rename dts/framework/testbed_model/{ => traffic_generator}/scapy.py (96%) rename dts/framework/testbed_model/{ => traffic_generator}/traffic_generator.py (80%) rename dts/framework/testbed_model/{hw => }/virtual_device.py (100%) diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py index cb7e00ba34..5de8b54bcf 100644 --- a/dts/framework/config/__init__.py +++ b/dts/framework/config/__init__.py @@ -324,6 +324,3 @@ def load_config() -> Configuration: config: dict[str, Any] = warlock.model_factory(schema, name="_Config")(config_data) config_obj: Configuration = Configuration.from_dict(dict(config)) return config_obj - - -CONFIGURATION = load_config() diff --git a/dts/framework/dts.py b/dts/framework/dts.py index f773f0c38d..925a212210 100644 --- a/dts/framework/dts.py 
+++ b/dts/framework/dts.py @@ -3,22 +3,23 @@ # Copyright(c) 2022-2023 PANTHEON.tech s.r.o. # Copyright(c) 2022-2023 University of New Hampshire +import logging import sys from .config import ( -CONFIGURATION, BuildTargetConfiguration, ExecutionConfiguration, TestSuiteConfig, +load_config, ) from .exception import BlockingTestSuiteError f
Re: [PATCH] eal: fix modify data area after memset
2023-09-22 16:12 (UTC+0800), Fengnan Chang:
> ping
>
> Fengnan Chang wrote on Tue, Sep 12, 2023 at 17:05:
> >
> > Let's look at this path:
> > malloc_elem_free
> >    ->malloc_elem_join_adjacent_free
> >       ->join_elem(elem, elem->next)
> >
> > 0. The current elem's pad > 0.
> > 1. The data area is memset in malloc_elem_free first.
> > 2. The next elem is free, so we try to join the current elem and the next one.
> > 3. join_elem modifies inner->size. This address was already
> >    memset in step 1, so the content at that address becomes non-zero.
> >
> > If the user calls rte_zmalloc and this elem is picked, the user
> > does not get all-zero memory.

malloc_elem_join_adjacent_free() always calls memset() after join_elem(),
for the next and the previous element respectively.
How can this bug be reproduced?
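
Whatever the allocation sequence turns out to be, a reproducer for the reported symptom would have to scan a buffer returned by rte_zmalloc() for non-zero bytes. A minimal sketch of such a check in plain C follows. The helper name first_nonzero_offset is hypothetical and is not part of DPDK, and the sketch assumes the caller supplies the buffer and its length.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a DPDK API: return the offset of the first
 * non-zero byte in buf, or -1 if all len bytes are zero.
 */
static ptrdiff_t
first_nonzero_offset(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		if (p[i] != 0)
			return (ptrdiff_t)i;
	}
	return -1;
}
```

In a reproducer, any result other than -1 on memory obtained from rte_zmalloc() would confirm the report and give the offset of the stale byte, which could then be compared with the position of the inner element header.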
RE: [PATCH] net/virtio: fix the setting of the vector for link state interrupt
Hi Maxime, > -Original Message- > From: Maxime Coquelin > Sent: Friday, October 20, 2023 5:01 PM > To: Ma, WenwuX ; dev@dpdk.org > Cc: chenbo@intel.com; Ling, WeiX ; > sta...@dpdk.org > Subject: Re: [PATCH] net/virtio: fix the setting of the vector for link state > interrupt > > Hi Wenwu, > > Please reword the commit title to something: > net/virtio: fix link state interrupt vector setting > > On 8/7/23 05:15, Wenwu Ma wrote: > > The settings of the vector for link state interrupts should be done > > before the initialization of the device is completed. > > > > Fixes: ee85024cf5f7 ("net/virtio: complete init stage at the right > > place") > > Cc: sta...@dpdk.org > > > > Signed-off-by: Wenwu Ma > > --- > > drivers/net/virtio/virtio_ethdev.c | 16 > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/net/virtio/virtio_ethdev.c > > b/drivers/net/virtio/virtio_ethdev.c > > index 2c23f1c00e..1801b0ae47 100644 > > --- a/drivers/net/virtio/virtio_ethdev.c > > +++ b/drivers/net/virtio/virtio_ethdev.c > > @@ -1912,6 +1912,14 @@ virtio_init_device(struct rte_eth_dev *eth_dev, > uint64_t req_features) > > } > > } > > > > + if (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC) > > + /* Enable vector (0) for Link State Interrupt */ > > + if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) == > > + VIRTIO_MSI_NO_VECTOR) { > > + PMD_DRV_LOG(ERR, "failed to set config vector"); > > + return -EBUSY; > > + } > > + > > virtio_reinit_complete(hw); > > > > return 0; > > @@ -2237,14 +2245,6 @@ virtio_dev_configure(struct rte_eth_dev *dev) > > hw->has_tx_offload = tx_offload_enabled(hw); > > hw->has_rx_offload = rx_offload_enabled(hw); > > > > - if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC) > > - /* Enable vector (0) for Link State Interrupt */ > > - if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) == > > - VIRTIO_MSI_NO_VECTOR) { > > - PMD_DRV_LOG(ERR, "failed to set config vector"); > > - return -EBUSY; > > - } > > - > > if (virtio_with_packed_queue(hw)) { > > #if 
defined(RTE_ARCH_X86_64) && defined(CC_AVX512_SUPPORT)
> >  		if ((hw->use_vec_rx || hw->use_vec_tx) &&
>
> It looks good to me, so I can change the title myself while applying if that
> is OK for you.
>

I will submit a new patch with your reworded title.

> Reviewed-by: Maxime Coquelin
>
> By the way, can you tell me which backends you have tested it with?
> Only Virtio-PCI? Or also Virtio-user?
>

Test steps:

1. Bind 1 NIC port to the vfio-pci driver:

dpdk-devbind.py --force --bind=vfio-pci :4b:00.0

2. Start dpdk-testpmd as the back-end:

x86_64-native-linuxapp-gcc/app/dpdk-testpmd -l 1-5 -n 8 -a :4b:00.0 \
--vdev net_vhost0,iface=/root/dpdk/vhost-net,queues=4 \
-- -i --nb-cores=4 --rxq=4 --txq=4 --rss-ip
testpmd>start

3. Start a VM with QEMU-8.0.0 as the front-end:

taskset -c 20,21,22,23,24,25,26,27 /home/QEMU/qemu-8.0.0/bin/qemu-system-x86_64 -name vm0 -enable-kvm -pidfile /tmp/.vm0.pid \
-daemonize -monitor unix:/tmp/vm0_monitor.sock,server,nowait -netdev user,id=nttsip1,hostfwd=tcp:10.239.252.245:6000-:22 -device e1000,netdev=nttsip1 \
-cpu host -smp 4 -m 8192 -object memory-backend-file,id=mem,size=8192M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc \
-chardev socket,path=/tmp/vm0_qga0.sock,server,nowait,id=vm0_qga0 -device virtio-serial -device virtserialport,chardev=vm0_qga0,name=org.qemu.guest_agent.0 -vnc :4 \
-drive file=/home/image/ubuntu2004.img -chardev socket,id=char0,path=/root/dpdk/vhost-net -netdev type=vhost-user,id=netdev0,chardev=char0,vhostforce,queues=4 \
-device virtio-net-pci,netdev=netdev0,mac=00:11:22:33:44:55,disable-modern=true,mrg_rxbuf=on,csum=on,mq=on,vectors=10

4. Connect to the VM over SSH, build the dpdk-l3fwd-power app, and then start it:

CC=gcc meson -Denable_kmods=True -Dlibdir=lib --default-library=static x86_64-native-linuxapp-gcc
ninja -C x86_64-native-linuxapp-gcc
meson configure -Dexamples=l3fwd-power x86_64-native-linuxapp-gcc
ninja -C x86_64-native-linuxapp-gcc
echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
dpdk-devbind.py -b vfio-pci :00:05.0
./x86_64-native-linuxapp-gcc/examples/dpdk-l3fwd-power -c 0xf -n 4 --log-level='user1,7' -- -p 1 -P --config '(0,0,0),(0,1,1),(0,2,2),(0,3,3)' --no-numa --parse-ptype --interrupt-only

The VM crashes when the dpdk-l3fwd-power app is started in a VM launched with
QEMU-8.0.0. It works well when the VM is launched with a QEMU version earlier
than 8.0.0.

> Thanks,
> Maxime
[PATCH v2] net/virtio: fix link state interrupt vector setting
The settings of the vector for link state interrupts should be done
before the initialization of the device is completed.

Fixes: ee85024cf5f7 ("net/virtio: complete init stage at the right place")
Cc: sta...@dpdk.org

Signed-off-by: Wenwu Ma
Tested-by: Wei Ling
Reviewed-by: Maxime Coquelin
---
v2:
 - rewording of the title
---
 drivers/net/virtio/virtio_ethdev.c | 16
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 3ab56ef769..c2c0a1a111 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1912,6 +1912,14 @@ virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
 		}
 	}

+	if (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
+		/* Enable vector (0) for Link State Interrupt */
+		if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) ==
+				VIRTIO_MSI_NO_VECTOR) {
+			PMD_DRV_LOG(ERR, "failed to set config vector");
+			return -EBUSY;
+		}
+
 	virtio_reinit_complete(hw);

 	return 0;
@@ -2237,14 +2245,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
 	hw->has_tx_offload = tx_offload_enabled(hw);
 	hw->has_rx_offload = rx_offload_enabled(hw);

-	if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
-		/* Enable vector (0) for Link State Interrupt */
-		if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) ==
-				VIRTIO_MSI_NO_VECTOR) {
-			PMD_DRV_LOG(ERR, "failed to set config vector");
-			return -EBUSY;
-		}
-
 	if (virtio_with_packed_queue(hw)) {
 #if defined(RTE_ARCH_X86_64) && defined(CC_AVX512_SUPPORT)
 		if ((hw->use_vec_rx || hw->use_vec_tx) &&
--
2.34.1
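
The ordering that the patch enforces (set the config vector first, signal init completion afterwards) can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: struct mock_dev and the mock_* functions are hypothetical stand-ins, not DPDK or virtio code, and the rule that the mock rejects a vector change after completion is an assumption made for illustration, not a statement about how any real backend behaves.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a virtio device; not a DPDK structure. */
struct mock_dev {
	bool init_complete;	/* set once the driver signals it is ready */
	int config_vector;	/* -1 while no vector is assigned */
};

/* Assumed rule of this model: the vector may only be set before init completes. */
static int
mock_set_config_irq(struct mock_dev *dev, int vec)
{
	if (dev->init_complete)
		return -EBUSY;
	dev->config_vector = vec;
	return 0;
}

/* Mark the mock device as fully initialized. */
static void
mock_reinit_complete(struct mock_dev *dev)
{
	dev->init_complete = true;
}

/* Same ordering as in the patch: vector first, then completion. */
static int
mock_init_device(struct mock_dev *dev, bool want_lsc)
{
	if (want_lsc && mock_set_config_irq(dev, 0) != 0) {
		fprintf(stderr, "failed to set config vector\n");
		return -EBUSY;
	}
	mock_reinit_complete(dev);
	return 0;
}
```

In this model, calling mock_init_device() on a fresh device assigns vector 0 and then completes initialization, while a later mock_set_config_irq() call returns -EBUSY, which mirrors why the call was moved out of the configure step and into the init step.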
RE: [PATCH v2 01/14] eal: make bitops a stable API
> -Original Message- > From: Stephen Hemminger > Sent: Saturday, October 21, 2023 5:41 AM > To: dev@dpdk.org > Cc: Stephen Hemminger ; Cristian > Dumitrescu ; Joyce Kong > > Subject: [PATCH v2 01/14] eal: make bitops a stable API > > These were added in 20.05 release. > > Signed-off-by: Stephen Hemminger Reviewed-by: Joyce Kong > --- > lib/eal/include/rte_bitmap.h | 8 lib/eal/include/rte_bitops.h | > 40 - > --- > 2 files changed, 48 deletions(-) > > diff --git a/lib/eal/include/rte_bitmap.h b/lib/eal/include/rte_bitmap.h index > 46a822768d50..ec819595624c 100644 > --- a/lib/eal/include/rte_bitmap.h > +++ b/lib/eal/include/rte_bitmap.h > @@ -203,9 +203,6 @@ rte_bitmap_init(uint32_t n_bits, uint8_t *mem, > uint32_t mem_size) } > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Bitmap clear slab overhead bits. > * > * @param slabs > @@ -215,7 +212,6 @@ rte_bitmap_init(uint32_t n_bits, uint8_t *mem, > uint32_t mem_size) > * @param pos > * The start bit position in the slabs to be cleared. > */ > -__rte_experimental > static inline void > __rte_bitmap_clear_slab_overhead_bits(uint64_t *slabs, uint32_t slab_size, > uint32_t pos) > @@ -235,9 +231,6 @@ __rte_bitmap_clear_slab_overhead_bits(uint64_t > *slabs, uint32_t slab_size, } > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change without prior notice. > - * > * Bitmap initialization with all bits set > * > * @param n_bits > @@ -249,7 +242,6 @@ __rte_bitmap_clear_slab_overhead_bits(uint64_t > *slabs, uint32_t slab_size, > * @return > * Handle to bitmap instance. 
> */ > -__rte_experimental > static inline struct rte_bitmap * > rte_bitmap_init_with_all_set(uint32_t n_bits, uint8_t *mem, uint32_t > mem_size) { diff --git a/lib/eal/include/rte_bitops.h > b/lib/eal/include/rte_bitops.h index 6b8ae8d3acf6..29d24b3a780e 100644 > --- a/lib/eal/include/rte_bitops.h > +++ b/lib/eal/include/rte_bitops.h > @@ -42,9 +42,6 @@ extern "C" { > /* 32-bit relaxed operations > */ > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > - * > * Get the target bit from a 32-bit value without memory ordering. > * > * @param nr > @@ -54,7 +51,6 @@ extern "C" { > * @return > * The target bit. > */ > -__rte_experimental > static inline uint32_t > rte_bit_relaxed_get32(unsigned int nr, volatile uint32_t *addr) { @@ -65,9 > +61,6 @@ rte_bit_relaxed_get32(unsigned int nr, volatile uint32_t *addr) } > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > - * > * Set the target bit in a 32-bit value to 1 without memory ordering. > * > * @param nr > @@ -75,7 +68,6 @@ rte_bit_relaxed_get32(unsigned int nr, volatile uint32_t > *addr) > * @param addr > * The address holding the bit. > */ > -__rte_experimental > static inline void > rte_bit_relaxed_set32(unsigned int nr, volatile uint32_t *addr) { @@ -86,9 > +78,6 @@ rte_bit_relaxed_set32(unsigned int nr, volatile uint32_t *addr) } > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > - * > * Clear the target bit in a 32-bit value to 0 without memory ordering. > * > * @param nr > @@ -96,7 +85,6 @@ rte_bit_relaxed_set32(unsigned int nr, volatile uint32_t > *addr) > * @param addr > * The address holding the bit. 
> */ > -__rte_experimental > static inline void > rte_bit_relaxed_clear32(unsigned int nr, volatile uint32_t *addr) { @@ > -107,9 > +95,6 @@ rte_bit_relaxed_clear32(unsigned int nr, volatile uint32_t *addr) } > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > - * > * Return the original bit from a 32-bit value, then set it to 1 without > * memory ordering. > * > @@ -120,7 +105,6 @@ rte_bit_relaxed_clear32(unsigned int nr, volatile > uint32_t *addr) > * @return > * The original bit. > */ > -__rte_experimental > static inline uint32_t > rte_bit_relaxed_test_and_set32(unsigned int nr, volatile uint32_t *addr) > { @@ -133,9 +117,6 @@ rte_bit_relaxed_test_and_set32(unsigned int nr, > volatile uint32_t *addr) } > > /** > - * @warning > - * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice > - * > * Return the original bit from a 32-bit value, then clear it to 0 without > * memory ordering. > * > @@ -146,7 +127,6 @@ rte_bit_relaxed_test_and_set32(unsigned int nr, > volatile uint32_t *addr) > * @return > * The original bit. > */ > -__rte_experimental > static inline uint32_t > rte_bit_relaxed_test_and_clear32(unsigned int nr, volatile uint32_t *addr) > { @@ -161,9 +141,6 @@ rte_bit_relaxed_test_and_clear32(unsigned int nr, > volatile uint32_t *addr) > /* 64-bit rel
[PATCH] app/testpmd: support config all offload
Extend supports all offload configuration in following commands: 1. port config 0 rx_offload all on/off 2. port config 0 tx_offload all on/off 3. port 0 rxq 0 rx_offload all on/off 4. port 0 txq 0 tx_offload all on/off Signed-off-by: Chengwen Feng --- app/test-pmd/cmdline.c | 112 +++- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 8 +- 2 files changed, 68 insertions(+), 52 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 679ca47b94..35f5e4bbc0 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -763,7 +763,7 @@ static void cmd_help_long_parsed(void *parsed_result, "port config (port_id) udp_tunnel_port add|rm vxlan|geneve|ecpri (udp_port)\n\n" "Add/remove UDP tunnel port for tunneling offload\n\n" - "port config rx_offload vlan_strip|" + "port config rx_offload all|vlan_strip|" "ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|" "outer_ipv4_cksum|macsec_strip|" "vlan_filter|vlan_extend|scatter|" @@ -771,7 +771,7 @@ static void cmd_help_long_parsed(void *parsed_result, " Enable or disable a per port Rx offloading" " on all Rx queues of a port\n\n" - "port (port_id) rxq (queue_id) rx_offload vlan_strip|" + "port (port_id) rxq (queue_id) rx_offload all|vlan_strip|" "ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|" "outer_ipv4_cksum|macsec_strip|" "vlan_filter|vlan_extend|scatter|" @@ -779,7 +779,7 @@ static void cmd_help_long_parsed(void *parsed_result, "Enable or disable a per queue Rx offloading" " only on a specific Rx queue\n\n" - "port config (port_id) tx_offload vlan_insert|" + "port config (port_id) tx_offload all|vlan_insert|" "ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|" "udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|" "gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|" @@ -788,7 +788,7 @@ static void cmd_help_long_parsed(void *parsed_result, "Enable or disable a per port Tx offloading" " on all Tx queues of a port\n\n" - "port (port_id) txq (queue_id) tx_offload vlan_insert|" + "port (port_id) txq 
(queue_id) tx_offload all|vlan_insert|" "ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|" "udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|" "gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|macsec_insert" @@ -2,7 +2,7 @@ static cmdline_parse_token_string_t cmd_config_per_port_rx_offload_result_rx_off static cmdline_parse_token_string_t cmd_config_per_port_rx_offload_result_offload = TOKEN_STRING_INITIALIZER (struct cmd_config_per_port_rx_offload_result, -offload, "vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#" +offload, "all#vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#" "qinq_strip#outer_ipv4_cksum#macsec_strip#" "vlan_filter#vlan_extend#" "scatter#buffer_split#timestamp#security#" @@ -11155,8 +11155,8 @@ cmd_config_per_port_rx_offload_parsed(void *parsed_result, portid_t port_id = res->port_id; struct rte_eth_dev_info dev_info; struct rte_port *port = &ports[port_id]; - uint64_t single_offload; uint16_t nb_rx_queues; + uint64_t offload; int q; int ret; @@ -11167,25 +11167,29 @@ cmd_config_per_port_rx_offload_parsed(void *parsed_result, return; } - single_offload = search_rx_offload(res->offload); - if (single_offload == 0) { - fprintf(stderr, "Unknown offload name: %s\n", res->offload); - return; - } - ret = eth_dev_info_get_print_err(port_id, &dev_info); if (ret != 0) return; + if (!strcmp(res->offload, "all")) { + offload = dev_info.rx_offload_capa; + } else { + offload = search_rx_offload(res->offload); + if (offload == 0) { + fprintf(stderr, "Unknown offload name: %s\n", res->offload); + return; + } + } + nb_rx_queues = dev_info.nb_rx_queues; if (!strcmp(res->on_off, "on")) { - port->dev_conf.rxmode.offloads |= single_offload; + port->dev_conf.rxmode.offloads |= offload; for (q = 0; q < nb_rx_queues; q++) - port->
Re: [PATCH] app/testpmd: support config all offload
Add cc to testpmd's maintainer due tools failed to add. On 2023/10/23 10:29, Chengwen Feng wrote: > Extend supports all offload configuration in following commands: > 1. port config 0 rx_offload all on/off > 2. port config 0 tx_offload all on/off > 3. port 0 rxq 0 rx_offload all on/off > 4. port 0 txq 0 tx_offload all on/off > > Signed-off-by: Chengwen Feng > --- > app/test-pmd/cmdline.c | 112 +++- > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 8 +- > 2 files changed, 68 insertions(+), 52 deletions(-) > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c > index 679ca47b94..35f5e4bbc0 100644 > --- a/app/test-pmd/cmdline.c > +++ b/app/test-pmd/cmdline.c > @@ -763,7 +763,7 @@ static void cmd_help_long_parsed(void *parsed_result, > "port config (port_id) udp_tunnel_port add|rm > vxlan|geneve|ecpri (udp_port)\n\n" > "Add/remove UDP tunnel port for tunneling > offload\n\n" > > - "port config rx_offload vlan_strip|" > + "port config rx_offload all|vlan_strip|" > "ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|" > "outer_ipv4_cksum|macsec_strip|" > "vlan_filter|vlan_extend|scatter|" > @@ -771,7 +771,7 @@ static void cmd_help_long_parsed(void *parsed_result, > " Enable or disable a per port Rx offloading" > " on all Rx queues of a port\n\n" > > - "port (port_id) rxq (queue_id) rx_offload vlan_strip|" > + "port (port_id) rxq (queue_id) rx_offload > all|vlan_strip|" > "ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|" > "outer_ipv4_cksum|macsec_strip|" > "vlan_filter|vlan_extend|scatter|" > @@ -779,7 +779,7 @@ static void cmd_help_long_parsed(void *parsed_result, > "Enable or disable a per queue Rx offloading" > " only on a specific Rx queue\n\n" > > - "port config (port_id) tx_offload vlan_insert|" > + "port config (port_id) tx_offload all|vlan_insert|" > "ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|" > "udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|" > "gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|" > @@ -788,7 +788,7 @@ static void 
cmd_help_long_parsed(void *parsed_result, > "Enable or disable a per port Tx offloading" > " on all Tx queues of a port\n\n" > > - "port (port_id) txq (queue_id) tx_offload vlan_insert|" > + "port (port_id) txq (queue_id) tx_offload > all|vlan_insert|" > "ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|" > "udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|" > "gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|macsec_insert" > @@ -2,7 +2,7 @@ static cmdline_parse_token_string_t > cmd_config_per_port_rx_offload_result_rx_off > static cmdline_parse_token_string_t > cmd_config_per_port_rx_offload_result_offload = > TOKEN_STRING_INITIALIZER > (struct cmd_config_per_port_rx_offload_result, > - offload, "vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#" > + offload, > "all#vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#" > "qinq_strip#outer_ipv4_cksum#macsec_strip#" > "vlan_filter#vlan_extend#" > "scatter#buffer_split#timestamp#security#" > @@ -11155,8 +11155,8 @@ cmd_config_per_port_rx_offload_parsed(void > *parsed_result, > portid_t port_id = res->port_id; > struct rte_eth_dev_info dev_info; > struct rte_port *port = &ports[port_id]; > - uint64_t single_offload; > uint16_t nb_rx_queues; > + uint64_t offload; > int q; > int ret; > > @@ -11167,25 +11167,29 @@ cmd_config_per_port_rx_offload_parsed(void > *parsed_result, > return; > } > > - single_offload = search_rx_offload(res->offload); > - if (single_offload == 0) { > - fprintf(stderr, "Unknown offload name: %s\n", res->offload); > - return; > - } > - > ret = eth_dev_info_get_print_err(port_id, &dev_info); > if (ret != 0) > return; > > + if (!strcmp(res->offload, "all")) { > + offload = dev_info.rx_offload_capa; > + } else { > + offload = search_rx_offload(res->offload); > + if (offload == 0) { > + fprintf(stderr, "Unknown offload name: %s\n", > res->offload); > + return; > + } > + } > + > nb_rx_queues = dev_info.nb_rx_queues; > if (!strcmp(res->on_off, "on")) { > - port-
RE: [PATCH v2 10/14] eal: mark rte_atomic128_cmp_exchange as stable
> -Original Message- > From: Stephen Hemminger > Sent: Saturday, October 21, 2023 5:41 AM > To: dev@dpdk.org > Cc: Stephen Hemminger ; Ruifeng Wang > ; > Bruce Richardson ; Konstantin Ananyev > > Subject: [PATCH v2 10/14] eal: mark rte_atomic128_cmp_exchange as stable > > This has been around since 2021. > > Signed-off-by: Stephen Hemminger > --- > lib/eal/arm/include/rte_atomic_64.h | 1 - > lib/eal/include/generic/rte_atomic.h | 1 - > lib/eal/x86/include/rte_atomic_64.h | 1 - > 3 files changed, 3 deletions(-) > > diff --git a/lib/eal/arm/include/rte_atomic_64.h > b/lib/eal/arm/include/rte_atomic_64.h > index 75d8ba6092cc..96205e6ad372 100644 > --- a/lib/eal/arm/include/rte_atomic_64.h > +++ b/lib/eal/arm/include/rte_atomic_64.h > @@ -94,7 +94,6 @@ __ATOMIC128_CAS_OP(__cas_128_acq_rel, "caspal") > > #endif > > -__rte_experimental > static inline int > rte_atomic128_cmp_exchange(rte_int128_t *dst, rte_int128_t *exp, > const rte_int128_t *src, unsigned int weak, int success, diff > --git > a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h > index db6797e379f3..048b547ea62b 100644 > --- a/lib/eal/include/generic/rte_atomic.h > +++ b/lib/eal/include/generic/rte_atomic.h > @@ -1147,7 +1147,6 @@ typedef struct { > * @return > * Non-zero on success; 0 on failure. > */ > -__rte_experimental > static inline int > rte_atomic128_cmp_exchange(rte_int128_t *dst, > rte_int128_t *exp, > diff --git a/lib/eal/x86/include/rte_atomic_64.h > b/lib/eal/x86/include/rte_atomic_64.h > index 0edee8627224..e968bbf0ce65 100644 > --- a/lib/eal/x86/include/rte_atomic_64.h > +++ b/lib/eal/x86/include/rte_atomic_64.h > @@ -182,7 +182,6 @@ static inline void rte_atomic64_clear(rte_atomic64_t *v) > > /* 128 bit atomic operations > -*/ > > -__rte_experimental > static inline int > rte_atomic128_cmp_exchange(rte_int128_t *dst, > rte_int128_t *exp, > -- > 2.39.2 Acked-by: Ruifeng Wang
[PATCH] eal: support lcore usage ratio
Currently, the lcore usage output displays only two key fields, busy_cycles
and total_cycles, which makes it inconvenient to read the usage ratio at a
glance. Add an lcore usage ratio field.

Signed-off-by: Chengwen Feng
---
 lib/eal/common/eal_common_lcore.c | 34 ---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/lib/eal/common/eal_common_lcore.c b/lib/eal/common/eal_common_lcore.c
index ceda714ca5..d1d0da2dd0 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -446,6 +446,12 @@ rte_lcore_register_usage_cb(rte_lcore_usage_cb cb)
 	lcore_usage_cb = cb;
 }

+static float
+calc_usage_ratio(const struct rte_lcore_usage *usage)
+{
+	return (usage->busy_cycles * 100.0) / (usage->total_cycles == 0 ? 1 : usage->total_cycles);
+}
+
 static int
 lcore_dump_cb(unsigned int lcore_id, void *arg)
 {
@@ -462,8 +468,9 @@ lcore_dump_cb(unsigned int lcore_id, void *arg)
 	/* Guard against concurrent modification of lcore_usage_cb. */
 	usage_cb = lcore_usage_cb;
 	if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
-		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64,
-				usage.busy_cycles, usage.total_cycles) < 0) {
+		if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64" (ratio %.3f%%)",
+				usage.busy_cycles, usage.total_cycles,
+				calc_usage_ratio(&usage)) < 0) {
 			return -ENOMEM;
 		}
 	}
@@ -511,11 +518,19 @@ struct lcore_telemetry_info {
 	struct rte_tel_data *d;
 };

+static void
+format_usage_ratio(char *buf, uint16_t size, const struct rte_lcore_usage *usage)
+{
+	float ratio = calc_usage_ratio(usage);
+	snprintf(buf, size, "%.3f%%", ratio);
+}
+
 static int
 lcore_telemetry_info_cb(unsigned int lcore_id, void *arg)
 {
 	struct rte_config *cfg = rte_eal_get_configuration();
 	struct lcore_telemetry_info *info = arg;
+	char ratio_str[RTE_TEL_MAX_STRING_LEN];
 	struct rte_lcore_usage usage;
 	struct rte_tel_data *cpuset;
 	rte_lcore_usage_cb usage_cb;
@@ -544,6 +559,8 @@ lcore_telemetry_info_cb(unsigned int lcore_id, void *arg)
 	if (usage_cb != NULL &&
usage_cb(lcore_id, &usage) == 0) { rte_tel_data_add_dict_uint(info->d, "total_cycles", usage.total_cycles); rte_tel_data_add_dict_uint(info->d, "busy_cycles", usage.busy_cycles); + format_usage_ratio(ratio_str, sizeof(ratio_str), &usage); + rte_tel_data_add_dict_string(info->d, "usage_ratio", ratio_str); } return 0; @@ -574,11 +591,13 @@ struct lcore_telemetry_usage { struct rte_tel_data *lcore_ids; struct rte_tel_data *total_cycles; struct rte_tel_data *busy_cycles; + struct rte_tel_data *usage_ratio; }; static int lcore_telemetry_usage_cb(unsigned int lcore_id, void *arg) { + char ratio_str[RTE_TEL_MAX_STRING_LEN]; struct lcore_telemetry_usage *u = arg; struct rte_lcore_usage usage; rte_lcore_usage_cb usage_cb; @@ -591,6 +610,8 @@ lcore_telemetry_usage_cb(unsigned int lcore_id, void *arg) rte_tel_data_add_array_uint(u->lcore_ids, lcore_id); rte_tel_data_add_array_uint(u->total_cycles, usage.total_cycles); rte_tel_data_add_array_uint(u->busy_cycles, usage.busy_cycles); + format_usage_ratio(ratio_str, sizeof(ratio_str), &usage); + rte_tel_data_add_array_string(u->usage_ratio, ratio_str); } return 0; @@ -603,15 +624,19 @@ handle_lcore_usage(const char *cmd __rte_unused, const char *params __rte_unused struct lcore_telemetry_usage usage; struct rte_tel_data *total_cycles; struct rte_tel_data *busy_cycles; + struct rte_tel_data *usage_ratio; struct rte_tel_data *lcore_ids; lcore_ids = rte_tel_data_alloc(); total_cycles = rte_tel_data_alloc(); busy_cycles = rte_tel_data_alloc(); - if (lcore_ids == NULL || total_cycles == NULL || busy_cycles == NULL) { + usage_ratio = rte_tel_data_alloc(); + if (lcore_ids == NULL || total_cycles == NULL || busy_cycles == NULL || + usage_ratio == NULL) { rte_tel_data_free(lcore_ids); rte_tel_data_free(total_cycles); rte_tel_data_free(busy_cycles); + rte_tel_data_free(usage_ratio); return -ENOMEM; } @@ -619,12 +644,15 @@ handle_lcore_usage(const char *cmd __rte_unused, const char *params __rte_unused rte_tel_data_start_array(lcore_ids, 
RTE_TEL_UINT_VAL); rte_tel_data_start_array(total_cycles, RTE_TEL_UINT_VAL); rte_tel_data_start_array(busy_cycles, RTE_TEL_UINT_VAL); + rte_tel_data_start_array(usage_ratio, RTE_TEL_STRING_VAL); rte_tel_data_add_dict_cont
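For readers following the ratio calculation in the lcore usage patch above: it substitutes 1 for a zero total_cycles so the division is always defined, then formats the result with "%.3f%%". A minimal standalone sketch of that idea follows. The names `lcore_usage_sample`, `usage_ratio_percent` and `format_usage_ratio_sketch` are illustrative only and are not DPDK symbols, and the sketch uses double where the patch uses float.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for struct rte_lcore_usage (hypothetical name). */
struct lcore_usage_sample {
	uint64_t total_cycles;
	uint64_t busy_cycles;
};

/* Busy cycles as a percentage of total; a zero total is replaced by 1 to avoid dividing by zero. */
static double
usage_ratio_percent(const struct lcore_usage_sample *usage)
{
	uint64_t total = usage->total_cycles == 0 ? 1 : usage->total_cycles;

	return (usage->busy_cycles * 100.0) / total;
}

/* Format the ratio with three decimals and a percent sign; returns snprintf's result. */
static int
format_usage_ratio_sketch(char *buf, size_t size, const struct lcore_usage_sample *usage)
{
	return snprintf(buf, size, "%.3f%%", usage_ratio_percent(usage));
}
```

With 50 busy cycles out of 200 total, the formatted string is "25.000%". Because the zero guard only replaces the divisor, a sample with no cycles recorded at all reports 0.000%.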
[PATCH] maintainers: volunteer to maintain power library
Add co-maintainer for power library. Signed-off-by: Sivaprasad Tummala --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 4083658697..d4d7546eb6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1668,6 +1668,7 @@ F: lib/pci/ Power management M: Anatoly Burakov M: David Hunt +M: Sivaprasad Tummala F: lib/power/ F: doc/guides/prog_guide/power_man.rst F: app/test/test_power* -- 2.34.1
[PATCH v8 01/34] ml/cnxk: drop support for register polling
Dropped support for device argument "poll_mem" for cnxk ML driver. Support to use registers for polling is removed and DDR addresses would be used for polling. Signed-off-by: Srikanth Yalavarthi --- doc/guides/mldevs/cnxk.rst | 16 - drivers/ml/cnxk/cn10k_ml_dev.c | 36 +-- drivers/ml/cnxk/cn10k_ml_dev.h | 13 +--- drivers/ml/cnxk/cn10k_ml_ops.c | 111 - drivers/ml/cnxk/cn10k_ml_ops.h | 6 -- 5 files changed, 18 insertions(+), 164 deletions(-) diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst index b79bc540d9..1834b1f905 100644 --- a/doc/guides/mldevs/cnxk.rst +++ b/doc/guides/mldevs/cnxk.rst @@ -180,22 +180,6 @@ Runtime Config Options in the fast path enqueue burst operation. -**Polling memory location** (default ``ddr``) - - ML cnxk driver provides the option to select the memory location to be used - for polling to check the inference request completion. - Driver supports using either the DDR address space (``ddr``) - or ML registers (``register``) as polling locations. - The parameter ``poll_mem`` is used to specify the poll location. - - For example:: - - -a :00:10.0,poll_mem="register" - - With the above configuration, ML cnxk driver is configured to use ML registers - for polling in fastpath requests. 
- - Debugging Options - diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c index 983138a7f2..e3c2badcef 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.c +++ b/drivers/ml/cnxk/cn10k_ml_dev.c @@ -23,7 +23,6 @@ #define CN10K_ML_DEV_CACHE_MODEL_DATA "cache_model_data" #define CN10K_ML_OCM_ALLOC_MODE"ocm_alloc_mode" #define CN10K_ML_DEV_HW_QUEUE_LOCK "hw_queue_lock" -#define CN10K_ML_FW_POLL_MEM "poll_mem" #define CN10K_ML_OCM_PAGE_SIZE "ocm_page_size" #define CN10K_ML_FW_PATH_DEFAULT "/lib/firmware/mlip-fw.bin" @@ -32,7 +31,6 @@ #define CN10K_ML_DEV_CACHE_MODEL_DATA_DEFAULT 1 #define CN10K_ML_OCM_ALLOC_MODE_DEFAULT"lowest" #define CN10K_ML_DEV_HW_QUEUE_LOCK_DEFAULT 1 -#define CN10K_ML_FW_POLL_MEM_DEFAULT "ddr" #define CN10K_ML_OCM_PAGE_SIZE_DEFAULT 16384 /* ML firmware macros */ @@ -54,7 +52,6 @@ static const char *const valid_args[] = {CN10K_ML_FW_PATH, CN10K_ML_DEV_CACHE_MODEL_DATA, CN10K_ML_OCM_ALLOC_MODE, CN10K_ML_DEV_HW_QUEUE_LOCK, -CN10K_ML_FW_POLL_MEM, CN10K_ML_OCM_PAGE_SIZE, NULL}; @@ -103,9 +100,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde bool hw_queue_lock_set = false; bool ocm_page_size_set = false; char *ocm_alloc_mode = NULL; - bool poll_mem_set = false; bool fw_path_set = false; - char *poll_mem = NULL; char *fw_path = NULL; int ret = 0; bool found; @@ -189,17 +184,6 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde hw_queue_lock_set = true; } - if (rte_kvargs_count(kvlist, CN10K_ML_FW_POLL_MEM) == 1) { - ret = rte_kvargs_process(kvlist, CN10K_ML_FW_POLL_MEM, &parse_string_arg, -&poll_mem); - if (ret < 0) { - plt_err("Error processing arguments, key = %s\n", CN10K_ML_FW_POLL_MEM); - ret = -EINVAL; - goto exit; - } - poll_mem_set = true; - } - if (rte_kvargs_count(kvlist, CN10K_ML_OCM_PAGE_SIZE) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_OCM_PAGE_SIZE, &parse_integer_arg, &mldev->ocm_page_size); @@ -280,18 +264,6 @@ cn10k_mldev_parse_devargs(struct 
rte_devargs *devargs, struct cn10k_ml_dev *mlde } plt_info("ML: %s = %d", CN10K_ML_DEV_HW_QUEUE_LOCK, mldev->hw_queue_lock); - if (!poll_mem_set) { - mldev->fw.poll_mem = CN10K_ML_FW_POLL_MEM_DEFAULT; - } else { - if (!((strcmp(poll_mem, "ddr") == 0) || (strcmp(poll_mem, "register") == 0))) { - plt_err("Invalid argument, %s = %s\n", CN10K_ML_FW_POLL_MEM, poll_mem); - ret = -EINVAL; - goto exit; - } - mldev->fw.poll_mem = poll_mem; - } - plt_info("ML: %s = %s", CN10K_ML_FW_POLL_MEM, mldev->fw.poll_mem); - if (!ocm_page_size_set) { mldev->ocm_page_size = CN10K_ML_OCM_PAGE_SIZE_DEFAULT; } else { @@ -450,10 +422,7 @@ cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw) if (fw->report_dpe_warnings) flags = flags | FW_REPORT_DPE_WARNING_BITMASK;
[PATCH v8 00/34] Implementation of revised ml/cnxk driver
This patch series implements a revised ml/cnxk driver to support models compiled with the TVM compiler framework. TVM models use a hybrid mode for execution, with regions of the model executing on the ML accelerator and the rest executing on CPU cores. This series of commits reorganizes the ml/cnxk driver and adds support for executing multiple regions within a TVM model. v8: - Updated CMake dependency resolution of external dependencies - Updated mldevs/cnxk documentation - Updated meson config files for cn9k and cn10k to include cmake v7: - Updated steps to build dependencies in cnxk mldev documentation - Replace str functions with rte_str functions - Drop use of rte_exit in ml/cnxk driver v6: - Added depends info for series. This series depends on patch-132887 - Fix merge conflicts with dpdk-23.11-rc1 - Fix issues with ml/cnxk driver release notes - Added build dependency information for dlpack headers v5: - Fix build failures for individual patches in the series - Finished build testing with devtools/test-meson-builds.sh script v4: - Squashed release notes - Updated external build dependency info in documentation v3: - Reduced use of RTE_MLDEV_CNXK_ENABLE_MVTVM macro - Added stubs file with dummy functions to use when TVM is disabled - Dropped patch with internal function to read firmware - Updated ML CNXK PMD documentation - Added external library dependency info in documentation - Added release notes for 23.11 v2: - Fix xstats reporting - Fix issues reported by klocwork static analysis tool - Update external header inclusions v1: - Initial changes Anup Prabhu (2): ml/cnxk: enable OCM check for multilayer TVM model ml/cnxk: enable fast-path ops for TVM models Prince Takkar (2): ml/cnxk: update internal TVM model info structure ml/cnxk: support quantize and dequantize callback Srikanth Yalavarthi (30): ml/cnxk: drop support for register polling ml/cnxk: add generic cnxk device structure ml/cnxk: add generic model and layer structures ml/cnxk: add generic
cnxk request structure ml/cnxk: add generic cnxk xstats structures ml/cnxk: rename cnxk ops function pointers struct ml/cnxk: update device handling functions ml/cnxk: update queue-pair handling functions ml/cnxk: update model load and unload functions ml/cnxk: update model start and stop functions ml/cnxk: update model utility functions ml/cnxk: update data quantization functions ml/cnxk: update device debug functions ml/cnxk: update device stats functions ml/cnxk: update device and model xstats functions ml/cnxk: update fast path functions ml/cnxk: move error handling to cnxk layer ml/cnxk: support config and close of tvmdp library ml/cnxk: add structures to support TVM model type ml/cnxk: add support for identify model type ml/cnxk: add support to parse TVM model objects ml/cnxk: fetch layer info and load TVM model ml/cnxk: update internal info for TVM model ml/cnxk: enable model unload in tvmdp library ml/cnxk: support start and stop for TVM models ml/cnxk: support device dump for TVM models ml/cnxk: enable reporting model runtime as xstats ml/cnxk: implement I/O alloc and free callbacks ml/cnxk: add generic ML malloc and free callback ml/cnxk: enable creation of mvtvm virtual device config/arm/arm64_cn10k_linux_gcc |1 + config/arm/arm64_cn9k_linux_gcc|1 + doc/guides/mldevs/cnxk.rst | 223 +- doc/guides/rel_notes/release_23_11.rst |3 + drivers/ml/cnxk/cn10k_ml_dev.c | 416 ++-- drivers/ml/cnxk/cn10k_ml_dev.h | 457 +--- drivers/ml/cnxk/cn10k_ml_model.c | 403 ++-- drivers/ml/cnxk/cn10k_ml_model.h | 151 +- drivers/ml/cnxk/cn10k_ml_ocm.c | 111 +- drivers/ml/cnxk/cn10k_ml_ocm.h | 15 +- drivers/ml/cnxk/cn10k_ml_ops.c | 2828 drivers/ml/cnxk/cn10k_ml_ops.h | 358 ++- drivers/ml/cnxk/cnxk_ml_dev.c | 22 + drivers/ml/cnxk/cnxk_ml_dev.h | 120 + drivers/ml/cnxk/cnxk_ml_io.c | 95 + drivers/ml/cnxk/cnxk_ml_io.h | 88 + drivers/ml/cnxk/cnxk_ml_model.c| 94 + drivers/ml/cnxk/cnxk_ml_model.h| 192 ++ drivers/ml/cnxk/cnxk_ml_ops.c | 1690 ++ drivers/ml/cnxk/cnxk_ml_ops.h | 87 + 
drivers/ml/cnxk/cnxk_ml_utils.c| 15 + drivers/ml/cnxk/cnxk_ml_utils.h| 17 + drivers/ml/cnxk/cnxk_ml_xstats.h | 152 ++ drivers/ml/cnxk/meson.build| 70 + drivers/ml/cnxk/mvtvm_ml_dev.c | 196 ++ drivers/ml/cnxk/mvtvm_ml_dev.h | 40 + drivers/ml/cnxk/mvtvm_ml_model.c | 392 drivers/ml/cnxk/mvtvm_ml_model.h | 90 + drivers/ml/cnxk/mvtvm_ml_ops.c | 652 ++ drivers/ml/cnxk/mvtvm_ml_ops.h | 82 + drivers/ml/cnxk/mvtvm_ml_stubs.c | 141 ++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 36 + 32 files changed, 6279 insertions
[PATCH v8 05/34] ml/cnxk: add generic cnxk xstats structures
Introduced generic xstats structures and renamed cn10k xstats enumerations with cnxk prefix. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.h | 86 +--- drivers/ml/cnxk/cn10k_ml_model.h | 6 +- drivers/ml/cnxk/cn10k_ml_ops.c | 169 ++- drivers/ml/cnxk/cnxk_ml_xstats.h | 128 +++ 4 files changed, 209 insertions(+), 180 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_xstats.h diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index 1852d4f6c9..be989e0a20 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -10,6 +10,7 @@ #include "cn10k_ml_ocm.h" #include "cnxk_ml_io.h" +#include "cnxk_ml_xstats.h" /* Dummy Device ops */ extern struct rte_ml_dev_ops ml_dev_dummy_ops; @@ -121,89 +122,6 @@ struct cn10k_ml_fw { struct cnxk_ml_req *req; }; -/* Extended stats types enum */ -enum cn10k_ml_xstats_type { - /* Number of models loaded */ - nb_models_loaded, - - /* Number of models unloaded */ - nb_models_unloaded, - - /* Number of models started */ - nb_models_started, - - /* Number of models stopped */ - nb_models_stopped, - - /* Average inference hardware latency */ - avg_hw_latency, - - /* Minimum hardware latency */ - min_hw_latency, - - /* Maximum hardware latency */ - max_hw_latency, - - /* Average firmware latency */ - avg_fw_latency, - - /* Minimum firmware latency */ - min_fw_latency, - - /* Maximum firmware latency */ - max_fw_latency, -}; - -/* Extended stats function type enum. 
*/ -enum cn10k_ml_xstats_fn_type { - /* Device function */ - CN10K_ML_XSTATS_FN_DEVICE, - - /* Model function */ - CN10K_ML_XSTATS_FN_MODEL, -}; - -/* Function pointer to get xstats for a type */ -typedef uint64_t (*cn10k_ml_xstats_fn)(struct rte_ml_dev *dev, uint16_t obj_idx, - enum cn10k_ml_xstats_type stat); - -/* Extended stats entry structure */ -struct cn10k_ml_xstats_entry { - /* Name-ID map */ - struct rte_ml_dev_xstats_map map; - - /* xstats mode, device or model */ - enum rte_ml_dev_xstats_mode mode; - - /* Type of xstats */ - enum cn10k_ml_xstats_type type; - - /* xstats function */ - enum cn10k_ml_xstats_fn_type fn_id; - - /* Object ID, model ID for model stat type */ - uint16_t obj_idx; - - /* Allowed to reset the stat */ - uint8_t reset_allowed; - - /* An offset to be taken away to emulate resets */ - uint64_t reset_value; -}; - -/* Extended stats data */ -struct cn10k_ml_xstats { - /* Pointer to xstats entries */ - struct cn10k_ml_xstats_entry *entries; - - /* Store num stats and offset of the stats for each model */ - uint16_t count_per_model[ML_CNXK_MAX_MODELS]; - uint16_t offset_for_model[ML_CNXK_MAX_MODELS]; - uint16_t count_mode_device; - uint16_t count_mode_model; - uint16_t count; -}; - /* Device private data */ struct cn10k_ml_dev { /* Device ROC */ @@ -216,7 +134,7 @@ struct cn10k_ml_dev { struct cn10k_ml_ocm ocm; /* Extended stats data */ - struct cn10k_ml_xstats xstats; + struct cnxk_ml_xstats xstats; /* Enable / disable model data caching */ int cache_model_data; diff --git a/drivers/ml/cnxk/cn10k_ml_model.h b/drivers/ml/cnxk/cn10k_ml_model.h index 74ada1531a..5c32f48c68 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.h +++ b/drivers/ml/cnxk/cn10k_ml_model.h @@ -404,7 +404,7 @@ struct cn10k_ml_layer_addr { }; /* Model fast-path stats */ -struct cn10k_ml_layer_stats { +struct cn10k_ml_layer_xstats { /* Total hardware latency, sum of all inferences */ uint64_t hw_latency_tot; @@ -447,10 +447,10 @@ struct cn10k_ml_layer_data { struct cnxk_ml_req 
*req; /* Layer: Stats for burst ops */ - struct cn10k_ml_layer_stats *burst_stats; + struct cn10k_ml_layer_xstats *burst_xstats; /* Layer: Stats for sync ops */ - struct cn10k_ml_layer_stats *sync_stats; + struct cn10k_ml_layer_xstats *sync_xstats; }; struct cn10k_ml_model_data { diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 25ebb28993..b470955ffd 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -10,6 +10,7 @@ #include "cnxk_ml_dev.h" #include "cnxk_ml_model.h" #include "cnxk_ml_ops.h" +#include "cnxk_ml_xstats.h" /* ML model macros */ #define CN10K_ML_MODEL_MEMZONE_NAME "ml_cn10k_model_mz" @@ -425,26 +426,6 @@ cn10k_ml_prep_fp_job_descriptor(struct cn10k_ml_dev *cn10k_mldev, struct cnxk_ml req->cn10k_req.jd.model_run.num_batches = op->nb_batches; } -struct xstat_info { - char name[32]; - enum cn10k_ml_xstats_type type; - uint8_t reset_allowed; -}; -
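The xstats entry moved by this patch carries a reset_allowed flag and a reset_value described as "an offset to be taken away to emulate resets". The excerpt does not show how the driver applies it, so the following is only a sketch of the usual pattern under that assumption: the raw counter keeps counting, and a reset records the current raw value as the offset. `xstat_entry_sketch`, `xstat_read` and `xstat_reset` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down counterpart of the xstats entry in the patch. */
struct xstat_entry_sketch {
	uint8_t reset_allowed; /* whether the stat may be reset */
	uint64_t reset_value;  /* offset subtracted from the raw counter */
};

/* Value reported to the caller: raw counter minus the offset recorded at the last reset. */
static uint64_t
xstat_read(const struct xstat_entry_sketch *e, uint64_t raw)
{
	return raw - e->reset_value;
}

/* Emulate a reset by remembering the current raw value; returns 0 on success, -1 if not allowed. */
static int
xstat_reset(struct xstat_entry_sketch *e, uint64_t raw)
{
	if (!e->reset_allowed)
		return -1;

	e->reset_value = raw;
	return 0;
}
```

The appeal of this approach is that the underlying counter is never written by the reset path, so the fast path that increments it needs no extra synchronization with resets.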
[PATCH v8 03/34] ml/cnxk: add generic model and layer structures
Introduce generic cnxk model and layer structure. These structures would enable supporting models with multiple layers. A model is a collection of multiple independent layers with flow dependencies between the layers. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.h | 9 +- drivers/ml/cnxk/cn10k_ml_model.c | 247 drivers/ml/cnxk/cn10k_ml_model.h | 122 ++-- drivers/ml/cnxk/cn10k_ml_ocm.c | 50 ++-- drivers/ml/cnxk/cn10k_ml_ocm.h | 9 +- drivers/ml/cnxk/cn10k_ml_ops.c | 488 +-- drivers/ml/cnxk/cnxk_ml_io.h | 79 + drivers/ml/cnxk/cnxk_ml_model.c | 7 + drivers/ml/cnxk/cnxk_ml_model.h | 111 +++ drivers/ml/cnxk/meson.build | 1 + 10 files changed, 653 insertions(+), 470 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_io.h create mode 100644 drivers/ml/cnxk/cnxk_ml_model.c create mode 100644 drivers/ml/cnxk/cnxk_ml_model.h diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index f9da1548c4..99ff0a344a 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -9,6 +9,8 @@ #include "cn10k_ml_ocm.h" +#include "cnxk_ml_io.h" + /* Dummy Device ops */ extern struct rte_ml_dev_ops ml_dev_dummy_ops; @@ -21,9 +23,6 @@ extern struct rte_ml_dev_ops ml_dev_dummy_ops; /* Device alignment size */ #define ML_CN10K_ALIGN_SIZE 128 -/* Maximum number of models per device */ -#define ML_CN10K_MAX_MODELS 16 - /* Maximum number of queue-pairs per device, spinlock version */ #define ML_CN10K_MAX_QP_PER_DEVICE_SL 16 @@ -455,8 +454,8 @@ struct cn10k_ml_xstats { struct cn10k_ml_xstats_entry *entries; /* Store num stats and offset of the stats for each model */ - uint16_t count_per_model[ML_CN10K_MAX_MODELS]; - uint16_t offset_for_model[ML_CN10K_MAX_MODELS]; + uint16_t count_per_model[ML_CNXK_MAX_MODELS]; + uint16_t offset_for_model[ML_CNXK_MAX_MODELS]; uint16_t count_mode_device; uint16_t count_mode_model; uint16_t count; diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c index 
cc46ca2efd..d033d6deff 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.c +++ b/drivers/ml/cnxk/cn10k_ml_model.c @@ -6,10 +6,10 @@ #include -#include "cn10k_ml_model.h" #include "cn10k_ml_ocm.h" #include "cnxk_ml_dev.h" +#include "cnxk_ml_model.h" static enum rte_ml_io_type cn10k_ml_io_type_map(uint8_t type) @@ -311,19 +311,17 @@ cn10k_ml_model_metadata_update(struct cn10k_ml_model_metadata *metadata) } void -cn10k_ml_model_addr_update(struct cn10k_ml_model *model, uint8_t *buffer, uint8_t *base_dma_addr) +cn10k_ml_layer_addr_update(struct cnxk_ml_layer *layer, uint8_t *buffer, uint8_t *base_dma_addr) { struct cn10k_ml_model_metadata *metadata; - struct cn10k_ml_model_addr *addr; + struct cn10k_ml_layer_addr *addr; size_t model_data_size; uint8_t *dma_addr_load; uint8_t *dma_addr_run; - uint8_t i; - uint8_t j; int fpos; - metadata = &model->metadata; - addr = &model->addr; + metadata = &layer->glow.metadata; + addr = &layer->glow.addr; model_data_size = metadata->init_model.file_size + metadata->main_model.file_size + metadata->finish_model.file_size + metadata->weights_bias.file_size; @@ -361,102 +359,138 @@ cn10k_ml_model_addr_update(struct cn10k_ml_model *model, uint8_t *buffer, uint8_ addr->wb_base_addr = PLT_PTR_SUB(dma_addr_load, metadata->weights_bias.mem_offset); addr->wb_load_addr = PLT_PTR_ADD(addr->wb_base_addr, metadata->weights_bias.mem_offset); rte_memcpy(addr->wb_load_addr, PLT_PTR_ADD(buffer, fpos), metadata->weights_bias.file_size); +} + +void +cn10k_ml_layer_info_update(struct cnxk_ml_layer *layer) +{ + struct cn10k_ml_model_metadata *metadata; + uint8_t i; + uint8_t j; + + metadata = &layer->glow.metadata; /* Inputs */ - addr->total_input_sz_d = 0; - addr->total_input_sz_q = 0; + layer->info.nb_inputs = metadata->model.num_input; + layer->info.total_input_sz_d = 0; + layer->info.total_input_sz_q = 0; for (i = 0; i < metadata->model.num_input; i++) { if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) { - addr->input[i].nb_dims = 4; - addr->input[i].shape[0] = 
metadata->input1[i].shape.w; - addr->input[i].shape[1] = metadata->input1[i].shape.x; - addr->input[i].shape[2] = metadata->input1[i].shape.y; - addr->input[i].shape[3] = metadata->input1[i].shape.z; - - addr->input[i].nb_elements = + rte_strscpy(layer->info.input[i].name, + (char *)metadata->input1[i].input_name, MRVL_ML_INPUT_NAME_LEN); + layer->info.inpu
[PATCH v8 02/34] ml/cnxk: add generic cnxk device structure
Introduce generic cnxk device structure. This structure is a top level device structure for the driver, which would encapsulate the target / platform specific device structure. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.c | 316 ++-- drivers/ml/cnxk/cn10k_ml_dev.h | 47 +-- drivers/ml/cnxk/cn10k_ml_model.c | 15 +- drivers/ml/cnxk/cn10k_ml_model.h | 8 +- drivers/ml/cnxk/cn10k_ml_ocm.c | 60 ++-- drivers/ml/cnxk/cn10k_ml_ops.c | 495 +-- drivers/ml/cnxk/cnxk_ml_dev.c| 11 + drivers/ml/cnxk/cnxk_ml_dev.h| 58 drivers/ml/cnxk/meson.build | 1 + 9 files changed, 562 insertions(+), 449 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_dev.c create mode 100644 drivers/ml/cnxk/cnxk_ml_dev.h diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c index e3c2badcef..3bc61443d8 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.c +++ b/drivers/ml/cnxk/cn10k_ml_dev.c @@ -10,13 +10,14 @@ #include #include -#include - #include -#include "cn10k_ml_dev.h" +#include + #include "cn10k_ml_ops.h" +#include "cnxk_ml_dev.h" + #define CN10K_ML_FW_PATH "fw_path" #define CN10K_ML_FW_ENABLE_DPE_WARNINGS "enable_dpe_warnings" #define CN10K_ML_FW_REPORT_DPE_WARNINGS "report_dpe_warnings" @@ -58,9 +59,6 @@ static const char *const valid_args[] = {CN10K_ML_FW_PATH, /* Supported OCM page sizes: 1KB, 2KB, 4KB, 8KB and 16KB */ static const int valid_ocm_page_size[] = {1024, 2048, 4096, 8192, 16384}; -/* Dummy operations for ML device */ -struct rte_ml_dev_ops ml_dev_dummy_ops = {0}; - static int parse_string_arg(const char *key __rte_unused, const char *value, void *extra_args) { @@ -90,7 +88,7 @@ parse_integer_arg(const char *key __rte_unused, const char *value, void *extra_a } static int -cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mldev) +cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *cn10k_mldev) { bool enable_dpe_warnings_set = false; bool report_dpe_warnings_set = false; @@ -127,7 +125,7 @@ 
cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_FW_ENABLE_DPE_WARNINGS) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_FW_ENABLE_DPE_WARNINGS, -&parse_integer_arg, &mldev->fw.enable_dpe_warnings); +&parse_integer_arg, &cn10k_mldev->fw.enable_dpe_warnings); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_FW_ENABLE_DPE_WARNINGS); @@ -139,7 +137,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_FW_REPORT_DPE_WARNINGS) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_FW_REPORT_DPE_WARNINGS, -&parse_integer_arg, &mldev->fw.report_dpe_warnings); +&parse_integer_arg, &cn10k_mldev->fw.report_dpe_warnings); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_FW_REPORT_DPE_WARNINGS); @@ -151,7 +149,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_DEV_CACHE_MODEL_DATA) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_CACHE_MODEL_DATA, &parse_integer_arg, -&mldev->cache_model_data); +&cn10k_mldev->cache_model_data); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_DEV_CACHE_MODEL_DATA); @@ -174,7 +172,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK, &parse_integer_arg, -&mldev->hw_queue_lock); +&cn10k_mldev->hw_queue_lock); if (ret < 0) { plt_err("Error processing arguments, key = %s\n", CN10K_ML_DEV_HW_QUEUE_LOCK); @@ -186,7 +184,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde if (rte_kvargs_count(kvlist, CN10K_ML_OCM_PAGE_SIZE) == 1) { ret = rte_kvargs_process(kvlist, CN10K_ML_OCM_PAGE_SIZE, &parse_integer_arg, -&mldev->ocm_page_size); +&cn10k_mldev->ocm_page_size); if (ret < 0) {
[PATCH v8 09/34] ml/cnxk: update model load and unload functions
Implemented cnxk wrapper functions to load and unload ML models. Wrapper functions would invoke the cn10k model load and unload functions. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_model.c | 244 - drivers/ml/cnxk/cn10k_ml_model.h | 26 ++- drivers/ml/cnxk/cn10k_ml_ops.c | 296 ++- drivers/ml/cnxk/cn10k_ml_ops.h | 12 +- drivers/ml/cnxk/cnxk_ml_dev.h| 15 ++ drivers/ml/cnxk/cnxk_ml_ops.c| 144 ++- drivers/ml/cnxk/cnxk_ml_ops.h| 2 + 7 files changed, 462 insertions(+), 277 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c index d2f1c761be..48d70027ca 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.c +++ b/drivers/ml/cnxk/cn10k_ml_model.c @@ -316,42 +316,31 @@ cn10k_ml_layer_addr_update(struct cnxk_ml_layer *layer, uint8_t *buffer, uint8_t { struct cn10k_ml_model_metadata *metadata; struct cn10k_ml_layer_addr *addr; - size_t model_data_size; uint8_t *dma_addr_load; - uint8_t *dma_addr_run; int fpos; metadata = &layer->glow.metadata; addr = &layer->glow.addr; - model_data_size = metadata->init_model.file_size + metadata->main_model.file_size + - metadata->finish_model.file_size + metadata->weights_bias.file_size; /* Base address */ addr->base_dma_addr_load = base_dma_addr; - addr->base_dma_addr_run = PLT_PTR_ADD(addr->base_dma_addr_load, model_data_size); /* Init section */ dma_addr_load = addr->base_dma_addr_load; - dma_addr_run = addr->base_dma_addr_run; fpos = sizeof(struct cn10k_ml_model_metadata); addr->init_load_addr = dma_addr_load; - addr->init_run_addr = dma_addr_run; rte_memcpy(dma_addr_load, PLT_PTR_ADD(buffer, fpos), metadata->init_model.file_size); /* Main section */ dma_addr_load += metadata->init_model.file_size; - dma_addr_run += metadata->init_model.file_size; fpos += metadata->init_model.file_size; addr->main_load_addr = dma_addr_load; - addr->main_run_addr = dma_addr_run; rte_memcpy(dma_addr_load, PLT_PTR_ADD(buffer, fpos), metadata->main_model.file_size); /* Finish section */ dma_addr_load += 
metadata->main_model.file_size; - dma_addr_run += metadata->main_model.file_size; fpos += metadata->main_model.file_size; addr->finish_load_addr = dma_addr_load; - addr->finish_run_addr = dma_addr_run; rte_memcpy(dma_addr_load, PLT_PTR_ADD(buffer, fpos), metadata->finish_model.file_size); /* Weights and Bias section */ @@ -363,142 +352,148 @@ cn10k_ml_layer_addr_update(struct cnxk_ml_layer *layer, uint8_t *buffer, uint8_t } void -cn10k_ml_layer_info_update(struct cnxk_ml_layer *layer) +cn10k_ml_layer_io_info_set(struct cnxk_ml_io_info *io_info, + struct cn10k_ml_model_metadata *metadata) { - struct cn10k_ml_model_metadata *metadata; uint8_t i; uint8_t j; - metadata = &layer->glow.metadata; - /* Inputs */ - layer->info.nb_inputs = metadata->model.num_input; - layer->info.total_input_sz_d = 0; - layer->info.total_input_sz_q = 0; + io_info->nb_inputs = metadata->model.num_input; + io_info->total_input_sz_d = 0; + io_info->total_input_sz_q = 0; for (i = 0; i < metadata->model.num_input; i++) { if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) { - rte_strscpy(layer->info.input[i].name, - (char *)metadata->input1[i].input_name, MRVL_ML_INPUT_NAME_LEN); - layer->info.input[i].dtype = metadata->input1[i].input_type; - layer->info.input[i].qtype = metadata->input1[i].model_input_type; - layer->info.input[i].nb_dims = 4; - layer->info.input[i].shape[0] = metadata->input1[i].shape.w; - layer->info.input[i].shape[1] = metadata->input1[i].shape.x; - layer->info.input[i].shape[2] = metadata->input1[i].shape.y; - layer->info.input[i].shape[3] = metadata->input1[i].shape.z; - layer->info.input[i].nb_elements = + rte_strscpy(io_info->input[i].name, (char *)metadata->input1[i].input_name, + MRVL_ML_INPUT_NAME_LEN); + io_info->input[i].dtype = metadata->input1[i].input_type; + io_info->input[i].qtype = metadata->input1[i].model_input_type; + io_info->input[i].nb_dims = 4; + io_info->input[i].shape[0] = metadata->input1[i].shape.w; + io_info->input[i].shape[1] = metadata->input1[i].shape.x; + 
io_info->input[i].shape[2] = metadata->input1[i].shap
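The address bookkeeping that remains in cn10k_ml_layer_addr_update after this patch places the init, main and finish sections back to back, advancing the load address by each section's file size. A minimal sketch of that layout step is below. The struct and function names are illustrative, and the sketch leaves out the rte_memcpy of section data and the weights-and-bias handling that the driver performs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes of three consecutive sections in a model image. */
struct section_sizes {
	size_t init_size;
	size_t main_size;
	size_t finish_size;
};

/* Hypothetical load addresses computed for those sections. */
struct section_addrs {
	uint8_t *init_load_addr;
	uint8_t *main_load_addr;
	uint8_t *finish_load_addr;
};

/* Place each section directly after the previous one, starting at base. */
static void
layout_sections(uint8_t *base, const struct section_sizes *sz, struct section_addrs *out)
{
	uint8_t *cursor = base;

	out->init_load_addr = cursor;
	cursor += sz->init_size;

	out->main_load_addr = cursor;
	cursor += sz->main_size;

	out->finish_load_addr = cursor;
}
```

For sizes 8, 16 and 4, the sections land at offsets 0, 8 and 24 from the base address.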
[PATCH v8 04/34] ml/cnxk: add generic cnxk request structure
Added generic cnxk request structure. Moved common fields from cn10k structures to cnxk structure. Moved job related structures and enumerations to ops headers. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.c | 72 +++ drivers/ml/cnxk/cn10k_ml_dev.h | 269 + drivers/ml/cnxk/cn10k_ml_model.c | 6 +- drivers/ml/cnxk/cn10k_ml_model.h | 4 +- drivers/ml/cnxk/cn10k_ml_ops.c | 331 +-- drivers/ml/cnxk/cn10k_ml_ops.h | 296 +++ drivers/ml/cnxk/cnxk_ml_ops.c| 7 + drivers/ml/cnxk/cnxk_ml_ops.h| 63 ++ drivers/ml/cnxk/meson.build | 1 + 9 files changed, 557 insertions(+), 492 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_ops.c create mode 100644 drivers/ml/cnxk/cnxk_ml_ops.h diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c index 3bc61443d8..fc6f78d414 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.c +++ b/drivers/ml/cnxk/cn10k_ml_dev.c @@ -14,9 +14,8 @@ #include -#include "cn10k_ml_ops.h" - #include "cnxk_ml_dev.h" +#include "cnxk_ml_ops.h" #define CN10K_ML_FW_PATH "fw_path" #define CN10K_ML_FW_ENABLE_DPE_WARNINGS "enable_dpe_warnings" @@ -400,20 +399,23 @@ cn10k_ml_pci_remove(struct rte_pci_device *pci_dev) static void cn10k_ml_fw_print_info(struct cn10k_ml_fw *fw) { - plt_info("ML Firmware Version = %s", fw->req->jd.fw_load.version); - - plt_ml_dbg("Firmware capabilities = 0x%016lx", fw->req->jd.fw_load.cap.u64); - plt_ml_dbg("Version = %s", fw->req->jd.fw_load.version); - plt_ml_dbg("core0_debug_ptr = 0x%016lx", fw->req->jd.fw_load.debug.core0_debug_ptr); - plt_ml_dbg("core1_debug_ptr = 0x%016lx", fw->req->jd.fw_load.debug.core1_debug_ptr); - plt_ml_dbg("debug_buffer_size = %u bytes", fw->req->jd.fw_load.debug.debug_buffer_size); + plt_info("ML Firmware Version = %s", fw->req->cn10k_req.jd.fw_load.version); + + plt_ml_dbg("Firmware capabilities = 0x%016lx", fw->req->cn10k_req.jd.fw_load.cap.u64); + plt_ml_dbg("Version = %s", fw->req->cn10k_req.jd.fw_load.version); + plt_ml_dbg("core0_debug_ptr = 0x%016lx", + 
fw->req->cn10k_req.jd.fw_load.debug.core0_debug_ptr); + plt_ml_dbg("core1_debug_ptr = 0x%016lx", + fw->req->cn10k_req.jd.fw_load.debug.core1_debug_ptr); + plt_ml_dbg("debug_buffer_size = %u bytes", + fw->req->cn10k_req.jd.fw_load.debug.debug_buffer_size); plt_ml_dbg("core0_exception_buffer = 0x%016lx", - fw->req->jd.fw_load.debug.core0_exception_buffer); + fw->req->cn10k_req.jd.fw_load.debug.core0_exception_buffer); plt_ml_dbg("core1_exception_buffer = 0x%016lx", - fw->req->jd.fw_load.debug.core1_exception_buffer); + fw->req->cn10k_req.jd.fw_load.debug.core1_exception_buffer); plt_ml_dbg("exception_state_size = %u bytes", - fw->req->jd.fw_load.debug.exception_state_size); - plt_ml_dbg("flags = 0x%016lx", fw->req->jd.fw_load.flags); + fw->req->cn10k_req.jd.fw_load.debug.exception_state_size); + plt_ml_dbg("flags = 0x%016lx", fw->req->cn10k_req.jd.fw_load.flags); } uint64_t @@ -458,29 +460,30 @@ cn10k_ml_fw_load_asim(struct cn10k_ml_fw *fw) roc_ml_reg_save(&cn10k_mldev->roc, ML_MLR_BASE); /* Update FW load completion structure */ - fw->req->jd.hdr.jce.w1.u64 = PLT_U64_CAST(&fw->req->status); - fw->req->jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_LOAD; - fw->req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &fw->req->result); - fw->req->jd.fw_load.flags = cn10k_ml_fw_flags_get(fw); - plt_write64(ML_CNXK_POLL_JOB_START, &fw->req->status); + fw->req->cn10k_req.jd.hdr.jce.w1.u64 = PLT_U64_CAST(&fw->req->cn10k_req.status); + fw->req->cn10k_req.jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_LOAD; + fw->req->cn10k_req.jd.hdr.result = + roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &fw->req->cn10k_req.result); + fw->req->cn10k_req.jd.fw_load.flags = cn10k_ml_fw_flags_get(fw); + plt_write64(ML_CNXK_POLL_JOB_START, &fw->req->cn10k_req.status); plt_wmb(); /* Enqueue FW load through scratch registers */ timeout = true; timeout_cycle = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz(); - roc_ml_scratch_enqueue(&cn10k_mldev->roc, &fw->req->jd); + 
roc_ml_scratch_enqueue(&cn10k_mldev->roc, &fw->req->cn10k_req.jd); plt_rmb(); do { if (roc_ml_scratch_is_done_bit_set(&cn10k_mldev->roc) && - (plt_read64(&fw->req->status) == ML_CNXK_POLL_JOB_FINISH)) { + (plt_read64(&fw->req->cn10k_req.status) == ML_CNXK_POLL_JOB_FINISH)) { timeout = false; break; } } while (plt_tsc_cycles() < tim
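The firmware-load path in this patch reaches the job descriptor and the status word through the new cn10k_req member of the generic request. A minimal sketch of that nesting and of the completion poll follows. It assumes simplified stand-in types: every sketch_* name is hypothetical, the op field is only an assumed example of a common field, and a plain poll count replaces the driver's TSC-based timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the driver's definitions. */
#define SKETCH_POLL_JOB_START 0
#define SKETCH_POLL_JOB_FINISH 1

/* Device-specific part of a request. */
struct sketch_cn10k_req {
	uint64_t fw_load_flags;   /* stands in for jd.fw_load.flags */
	volatile uint64_t status; /* completion word written by the device */
};

/* Generic request: the embedded device-specific part plus common fields. */
struct sketch_cnxk_req {
	struct sketch_cn10k_req cn10k_req;
	void *op; /* assumed example of a common field */
};

/* Poll the completion word a bounded number of times.
 * Returns 0 once the job has finished, -1 on timeout.
 */
static int
sketch_req_wait(struct sketch_cnxk_req *req, uint64_t max_polls)
{
	uint64_t i;

	assert(req != NULL);
	for (i = 0; i < max_polls; i++) {
		if (req->cn10k_req.status == SKETCH_POLL_JOB_FINISH)
			return 0;
	}

	return -1;
}
```

Callers that used to write req->status now write req->cn10k_req.status, which is the mechanical change visible throughout the diff.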
[PATCH v8 13/34] ml/cnxk: update device debug functions
Added cnxk wrapper for device dump and selftest debug functions. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_model.c | 118 + drivers/ml/cnxk/cn10k_ml_model.h | 1 + drivers/ml/cnxk/cn10k_ml_ocm.c | 8 +- drivers/ml/cnxk/cn10k_ml_ocm.h | 2 +- drivers/ml/cnxk/cn10k_ml_ops.c | 176 ++- drivers/ml/cnxk/cn10k_ml_ops.h | 4 +- drivers/ml/cnxk/cnxk_ml_model.c | 33 ++ drivers/ml/cnxk/cnxk_ml_model.h | 2 + drivers/ml/cnxk/cnxk_ml_ops.c| 39 ++- drivers/ml/cnxk/cnxk_ml_utils.c | 15 +++ drivers/ml/cnxk/cnxk_ml_utils.h | 17 +++ drivers/ml/cnxk/meson.build | 1 + 12 files changed, 235 insertions(+), 181 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_utils.c create mode 100644 drivers/ml/cnxk/cnxk_ml_utils.h diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c index 48d70027ca..af9d5a666f 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.c +++ b/drivers/ml/cnxk/cn10k_ml_model.c @@ -11,6 +11,7 @@ #include "cnxk_ml_dev.h" #include "cnxk_ml_model.h" #include "cnxk_ml_ops.h" +#include "cnxk_ml_utils.h" static enum rte_ml_io_type cn10k_ml_io_type_map(uint8_t type) @@ -598,3 +599,120 @@ cn10k_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *mo rte_ml_io_type_size_get(io_info->output[i].qtype); } } + +void +cn10k_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer, FILE *fp) +{ + struct cn10k_ml_ocm *ocm; + char str[STR_LEN]; + uint8_t i; + uint8_t j; + + ocm = &cnxk_mldev->cn10k_mldev.ocm; + + /* Print debug info */ + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, " Layer Information (Layer ID: %u, Name: %s)\n", + cnxk_mldev->index_map[layer->index].layer_id, layer->name); + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "index", layer->index); + fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", layer->name); + fprintf(fp, "%*s : %u.%u.%u.%u\n", FIELD_LEN, "version", + layer->glow.metadata.model.version[0], layer->glow.metadata.model.version[1], + 
layer->glow.metadata.model.version[2], layer->glow.metadata.model.version[3]); + fprintf(fp, "%*s : 0x%016lx\n", FIELD_LEN, "layer", PLT_U64_CAST(layer)); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", layer->batch_size); + + /* Print model state */ + if (layer->state == ML_CNXK_LAYER_STATE_LOADED) + fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "loaded"); + if (layer->state == ML_CNXK_LAYER_STATE_JOB_ACTIVE) + fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "job_active"); + if (layer->state == ML_CNXK_LAYER_STATE_STARTED) + fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "started"); + + /* Print OCM status */ + fprintf(fp, "%*s : %" PRIu64 " bytes\n", FIELD_LEN, "wb_size", + layer->glow.metadata.model.ocm_wb_range_end - + layer->glow.metadata.model.ocm_wb_range_start + 1); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "wb_pages", layer->glow.ocm_map.wb_pages); + fprintf(fp, "%*s : %" PRIu64 " bytes\n", FIELD_LEN, "scratch_size", + ocm->size_per_tile - layer->glow.metadata.model.ocm_tmp_range_floor); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "scratch_pages", layer->glow.ocm_map.scratch_pages); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_tiles", + layer->glow.metadata.model.tile_end - layer->glow.metadata.model.tile_start + 1); + + if (layer->state == ML_CNXK_LAYER_STATE_STARTED) { + fprintf(fp, "%*s : 0x%0*" PRIx64 "\n", FIELD_LEN, "tilemask", + ML_CN10K_OCM_NUMTILES / 4, layer->glow.ocm_map.tilemask); + fprintf(fp, "%*s : 0x%" PRIx64 "\n", FIELD_LEN, "ocm_wb_start", + layer->glow.ocm_map.wb_page_start * ocm->page_size); + } + + fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_inputs", layer->glow.metadata.model.num_input); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_outputs", layer->glow.metadata.model.num_output); + fprintf(fp, "\n"); + + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, "%8s %16s %12s %18s\n", "input", "input_name", "input_type", + "model_input_type"); + cnxk_ml_print_line(fp, LINE_LEN); + for (i = 0; i < layer->glow.metadata.model.num_input; i++) { + if (i < 
MRVL_ML_NUM_INPUT_OUTPUT_1) { + fprintf(fp, "%8u ", i); + fprintf(fp, "%*s ", 16, layer->glow.metadata.input1[i].input_name); + rte_ml_io_type_to_str(layer->glow.metadata.input1[i].input_type, str, + STR_LEN); + fprintf(fp, "%*s ", 12, str); + rt
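The layer dump above relies on two small idioms: a separator helper and "%*s" rows that right-align the field name in a fixed width. A rough sketch of both, assuming a width of 16 and '=' as the separator character; the real FIELD_LEN value and the body of cnxk_ml_print_line are not shown in this excerpt, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_FIELD_LEN 16 /* assumed width; FIELD_LEN's value is not shown */

/* Format one "name : value" row with the name right-aligned in a fixed width.
 * Returns the number of characters snprintf would have written.
 */
static int
sketch_format_field(char *buf, size_t len, const char *name, unsigned int value)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "%*s : %u\n", SKETCH_FIELD_LEN, name, value);
}

/* Print a separator of len '=' characters followed by a newline. */
static void
sketch_print_line(FILE *fp, int len)
{
	int i;

	for (i = 0; i < len; i++)
		fputc('=', fp);
	fputc('\n', fp);
}
```

Passing the width as an argument to "%*s" keeps every row aligned on the colon without hard-coding the padding in each format string.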
[PATCH v8 06/34] ml/cnxk: rename cnxk ops function pointers struct
Renamed cn10k ML ops structure with cnxk prefix. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.c | 2 +- drivers/ml/cnxk/cn10k_ml_ops.c | 73 +- drivers/ml/cnxk/cn10k_ml_ops.h | 34 +++- drivers/ml/cnxk/cnxk_ml_ops.c | 36 + drivers/ml/cnxk/cnxk_ml_ops.h | 2 + 5 files changed, 91 insertions(+), 56 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c index fc6f78d414..91813e9d0a 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.c +++ b/drivers/ml/cnxk/cn10k_ml_dev.c @@ -345,7 +345,7 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de goto pmd_destroy; } - dev->dev_ops = &cn10k_ml_ops; + dev->dev_ops = &cnxk_ml_ops; } else { plt_err("CN10K ML Ops are not supported on secondary process"); dev->dev_ops = &ml_dev_dummy_ops; diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index b470955ffd..a44fb26215 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -119,7 +119,7 @@ cnxk_ml_qp_destroy(const struct rte_ml_dev *dev, struct cnxk_ml_qp *qp) return 0; } -static int +int cn10k_ml_dev_queue_pair_release(struct rte_ml_dev *dev, uint16_t queue_pair_id) { struct cnxk_ml_qp *qp; @@ -860,7 +860,7 @@ cn10k_ml_cache_model_data(struct rte_ml_dev *dev, uint16_t model_id) return ret; } -static int +int cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info) { struct cn10k_ml_dev *cn10k_mldev; @@ -888,7 +888,7 @@ cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info) return 0; } -static int +int cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *conf) { struct rte_ml_dev_info dev_info; @@ -1087,7 +1087,7 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c return ret; } -static int +int cn10k_ml_dev_close(struct rte_ml_dev *dev) { struct cn10k_ml_dev *cn10k_mldev; @@ -1160,7 +1160,7 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev) return 
rte_dev_remove(dev->device); } -static int +int cn10k_ml_dev_start(struct rte_ml_dev *dev) { struct cn10k_ml_dev *cn10k_mldev; @@ -1180,7 +1180,7 @@ cn10k_ml_dev_start(struct rte_ml_dev *dev) return 0; } -static int +int cn10k_ml_dev_stop(struct rte_ml_dev *dev) { struct cn10k_ml_dev *cn10k_mldev; @@ -1200,7 +1200,7 @@ cn10k_ml_dev_stop(struct rte_ml_dev *dev) return 0; } -static int +int cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id, const struct rte_ml_dev_qp_conf *qp_conf, int socket_id) { @@ -1241,7 +1241,7 @@ cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id, return 0; } -static int +int cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats) { struct cnxk_ml_qp *qp; @@ -1258,7 +1258,7 @@ cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats) return 0; } -static void +void cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev) { struct cnxk_ml_qp *qp; @@ -1273,7 +1273,7 @@ cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev) } } -static int +int cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode, int32_t model_id, struct rte_ml_dev_xstats_map *xstats_map, uint32_t size) @@ -1321,7 +1321,7 @@ cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mod return idx; } -static int +int cn10k_ml_dev_xstats_by_name_get(struct rte_ml_dev *dev, const char *name, uint16_t *stat_id, uint64_t *value) { @@ -1363,7 +1363,7 @@ cn10k_ml_dev_xstats_by_name_get(struct rte_ml_dev *dev, const char *name, uint16 return -EINVAL; } -static int +int cn10k_ml_dev_xstats_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode, int32_t model_id, const uint16_t stat_ids[], uint64_t values[], uint16_t nb_ids) { @@ -1427,7 +1427,7 @@ cn10k_ml_dev_xstats_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode return idx; } -static int +int cn10k_ml_dev_xstats_reset(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode, int32_t 
model_id, const uint16_t stat_ids[], uint16_t nb_ids) { @@ -1441,7 +1441,7 @@ cn10k_ml_dev_xstats_reset(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mo return 0; } -static int +int cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp) { struct cn10k_ml_dev *cn10k_mldev; @@ -1528,7 +1528,7 @@ cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp) return 0; } -static i
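Dropping static from the cn10k handlers lets an ops table defined in another file refer to them. A small sketch of that arrangement, assuming a two-entry table and a dummy table whose entries are all NULL; the sketch_* names are hypothetical and the contents of the real ml_dev_dummy_ops are not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_dev;

/* Table of driver entry points, similar in spirit to struct rte_ml_dev_ops. */
struct sketch_dev_ops {
	int (*dev_start)(struct sketch_dev *dev);
	int (*dev_stop)(struct sketch_dev *dev);
};

struct sketch_dev {
	const struct sketch_dev_ops *ops;
	int started;
};

/* Prototypes as they would appear in a shared header. */
int sketch_cn10k_dev_start(struct sketch_dev *dev);
int sketch_cn10k_dev_stop(struct sketch_dev *dev);

/* Device-specific handlers; non-static so another file can reference them. */
int
sketch_cn10k_dev_start(struct sketch_dev *dev)
{
	dev->started = 1;
	return 0;
}

int
sketch_cn10k_dev_stop(struct sketch_dev *dev)
{
	dev->started = 0;
	return 0;
}

/* Ops table owned by the common layer. */
static const struct sketch_dev_ops sketch_cnxk_ops = {
	.dev_start = sketch_cn10k_dev_start,
	.dev_stop = sketch_cn10k_dev_stop,
};

/* Assumed shape of a dummy table: no handlers installed. */
static const struct sketch_dev_ops sketch_dummy_ops = {
	.dev_start = NULL,
	.dev_stop = NULL,
};

/* Dispatch through the table; report -ENOTSUP when no handler is installed. */
static int
sketch_dev_start(struct sketch_dev *dev)
{
	assert(dev != NULL && dev->ops != NULL);
	if (dev->ops->dev_start == NULL)
		return -ENOTSUP;

	return dev->ops->dev_start(dev);
}
```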
[PATCH v8 14/34] ml/cnxk: update device stats functions
Added cnxk wrapper function to handle ML device stats.

Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 32 --
 drivers/ml/cnxk/cn10k_ml_ops.h |  2 --
 drivers/ml/cnxk/cnxk_ml_ops.c  | 36 --
 3 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index a56d002d4c..8cbf700f6e 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -770,38 +770,6 @@ cn10k_ml_dev_stop(struct cnxk_ml_dev *cnxk_mldev)
 	return 0;
 }
 
-int
-cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats)
-{
-	struct cnxk_ml_qp *qp;
-	int qp_id;
-
-	for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
-		qp = dev->data->queue_pairs[qp_id];
-		stats->enqueued_count += qp->stats.enqueued_count;
-		stats->dequeued_count += qp->stats.dequeued_count;
-		stats->enqueue_err_count += qp->stats.enqueue_err_count;
-		stats->dequeue_err_count += qp->stats.dequeue_err_count;
-	}
-
-	return 0;
-}
-
-void
-cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev)
-{
-	struct cnxk_ml_qp *qp;
-	int qp_id;
-
-	for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
-		qp = dev->data->queue_pairs[qp_id];
-		qp->stats.enqueued_count = 0;
-		qp->stats.dequeued_count = 0;
-		qp->stats.enqueue_err_count = 0;
-		qp->stats.dequeue_err_count = 0;
-	}
-}
-
 int
 cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode,
 			      int32_t model_id, struct rte_ml_dev_xstats_map *xstats_map,
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 5fda98ae88..47e7cb12af 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -298,8 +298,6 @@ int cn10k_ml_dev_stop(struct cnxk_ml_dev *cnxk_mldev);
 int cn10k_ml_dev_dump(struct cnxk_ml_dev *cnxk_mldev, FILE *fp);
 int cn10k_ml_dev_selftest(struct cnxk_ml_dev *cnxk_mldev);
-int cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats);
-void cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev);
 int cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode,
 				  int32_t model_id, struct rte_ml_dev_xstats_map *xstats_map,
 				  uint32_t size);
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 66b88ddae1..c75317d6da 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -489,6 +489,38 @@ cnxk_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id,
 	return 0;
 }
 
+static int
+cnxk_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats)
+{
+	struct cnxk_ml_qp *qp;
+	int qp_id;
+
+	for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
+		qp = dev->data->queue_pairs[qp_id];
+		stats->enqueued_count += qp->stats.enqueued_count;
+		stats->dequeued_count += qp->stats.dequeued_count;
+		stats->enqueue_err_count += qp->stats.enqueue_err_count;
+		stats->dequeue_err_count += qp->stats.dequeue_err_count;
+	}
+
+	return 0;
+}
+
+static void
+cnxk_ml_dev_stats_reset(struct rte_ml_dev *dev)
+{
+	struct cnxk_ml_qp *qp;
+	int qp_id;
+
+	for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
+		qp = dev->data->queue_pairs[qp_id];
+		qp->stats.enqueued_count = 0;
+		qp->stats.dequeued_count = 0;
+		qp->stats.enqueue_err_count = 0;
+		qp->stats.dequeue_err_count = 0;
+	}
+}
+
 static int
 cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, uint16_t *model_id)
 {
@@ -772,8 +804,8 @@ struct rte_ml_dev_ops cnxk_ml_ops = {
 	.dev_queue_pair_release = cnxk_ml_dev_queue_pair_release,
 
 	/* Stats ops */
-	.dev_stats_get = cn10k_ml_dev_stats_get,
-	.dev_stats_reset = cn10k_ml_dev_stats_reset,
+	.dev_stats_get = cnxk_ml_dev_stats_get,
+	.dev_stats_reset = cnxk_ml_dev_stats_reset,
 	.dev_xstats_names_get = cn10k_ml_dev_xstats_names_get,
 	.dev_xstats_by_name_get = cn10k_ml_dev_xstats_by_name_get,
 	.dev_xstats_get = cn10k_ml_dev_xstats_get,
--
2.42.0
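The stats handlers moved by this patch sum four per-queue-pair counters into one device-level structure and clear them on reset. A standalone sketch of the same aggregation, using hypothetical sketch_* types in place of the rte_ml and cnxk structures. Unlike the patch code, the sketch zeroes the output itself and skips NULL queue pairs; whether callers of the driver functions guarantee either is not visible in this excerpt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the four counters kept per queue pair. */
struct sketch_stats {
	uint64_t enqueued_count;
	uint64_t dequeued_count;
	uint64_t enqueue_err_count;
	uint64_t dequeue_err_count;
};

struct sketch_qp {
	struct sketch_stats stats;
};

/* Sum the counters of all configured queue pairs into *out. */
static void
sketch_stats_get(struct sketch_qp *const *qps, uint16_t nb_qps, struct sketch_stats *out)
{
	uint16_t i;

	assert(qps != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	for (i = 0; i < nb_qps; i++) {
		if (qps[i] == NULL)
			continue;
		out->enqueued_count += qps[i]->stats.enqueued_count;
		out->dequeued_count += qps[i]->stats.dequeued_count;
		out->enqueue_err_count += qps[i]->stats.enqueue_err_count;
		out->dequeue_err_count += qps[i]->stats.dequeue_err_count;
	}
}

/* Clear the counters of all configured queue pairs. */
static void
sketch_stats_reset(struct sketch_qp *const *qps, uint16_t nb_qps)
{
	uint16_t i;

	assert(qps != NULL);
	for (i = 0; i < nb_qps; i++) {
		if (qps[i] == NULL)
			continue;
		memset(&qps[i]->stats, 0, sizeof(qps[i]->stats));
	}
}
```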
[PATCH v8 07/34] ml/cnxk: update device handling functions
Implement CNXK wrapper functions for dev_info_get, dev_configure, dev_close, dev_start and dev_stop. The wrapper functions allocate / release common resources for the ML driver and invoke device specific functions. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ops.c | 230 ++ drivers/ml/cnxk/cn10k_ml_ops.h | 16 +- drivers/ml/cnxk/cnxk_ml_dev.h | 3 + drivers/ml/cnxk/cnxk_ml_ops.c | 286 - drivers/ml/cnxk/cnxk_ml_ops.h | 3 + 5 files changed, 314 insertions(+), 224 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index a44fb26215..f8c51ab394 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -101,7 +101,7 @@ qp_memzone_name_get(char *name, int size, int dev_id, int qp_id) snprintf(name, size, "cnxk_ml_qp_mem_%u:%u", dev_id, qp_id); } -static int +int cnxk_ml_qp_destroy(const struct rte_ml_dev *dev, struct cnxk_ml_qp *qp) { const struct rte_memzone *qp_mem; @@ -861,20 +861,12 @@ cn10k_ml_cache_model_data(struct rte_ml_dev *dev, uint16_t model_id) } int -cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info) +cn10k_ml_dev_info_get(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_dev_info *dev_info) { struct cn10k_ml_dev *cn10k_mldev; - struct cnxk_ml_dev *cnxk_mldev; - if (dev_info == NULL) - return -EINVAL; - - cnxk_mldev = dev->data->dev_private; cn10k_mldev = &cnxk_mldev->cn10k_mldev; - memset(dev_info, 0, sizeof(struct rte_ml_dev_info)); - dev_info->driver_name = dev->device->driver->name; - dev_info->max_models = ML_CNXK_MAX_MODELS; if (cn10k_mldev->hw_queue_lock) dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE_SL; else @@ -889,143 +881,17 @@ cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info) } int -cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *conf) +cn10k_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct rte_ml_dev_config *conf) { - struct rte_ml_dev_info dev_info; struct 
cn10k_ml_dev *cn10k_mldev; - struct cnxk_ml_dev *cnxk_mldev; - struct cnxk_ml_model *model; struct cn10k_ml_ocm *ocm; - struct cnxk_ml_qp *qp; - uint16_t model_id; - uint32_t mz_size; uint16_t tile_id; - uint16_t qp_id; int ret; - if (dev == NULL || conf == NULL) - return -EINVAL; + RTE_SET_USED(conf); - /* Get CN10K device handle */ - cnxk_mldev = dev->data->dev_private; cn10k_mldev = &cnxk_mldev->cn10k_mldev; - cn10k_ml_dev_info_get(dev, &dev_info); - if (conf->nb_models > dev_info.max_models) { - plt_err("Invalid device config, nb_models > %u\n", dev_info.max_models); - return -EINVAL; - } - - if (conf->nb_queue_pairs > dev_info.max_queue_pairs) { - plt_err("Invalid device config, nb_queue_pairs > %u\n", dev_info.max_queue_pairs); - return -EINVAL; - } - - if (cnxk_mldev->state == ML_CNXK_DEV_STATE_PROBED) { - plt_ml_dbg("Configuring ML device, nb_queue_pairs = %u, nb_models = %u", - conf->nb_queue_pairs, conf->nb_models); - - /* Load firmware */ - ret = cn10k_ml_fw_load(cnxk_mldev); - if (ret != 0) - return ret; - } else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_CONFIGURED) { - plt_ml_dbg("Re-configuring ML device, nb_queue_pairs = %u, nb_models = %u", - conf->nb_queue_pairs, conf->nb_models); - } else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_STARTED) { - plt_err("Device can't be reconfigured in started state\n"); - return -ENOTSUP; - } else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_CLOSED) { - plt_err("Device can't be reconfigured after close\n"); - return -ENOTSUP; - } - - /* Configure queue-pairs */ - if (dev->data->queue_pairs == NULL) { - mz_size = sizeof(dev->data->queue_pairs[0]) * conf->nb_queue_pairs; - dev->data->queue_pairs = - rte_zmalloc("cn10k_mldev_queue_pairs", mz_size, RTE_CACHE_LINE_SIZE); - if (dev->data->queue_pairs == NULL) { - dev->data->nb_queue_pairs = 0; - plt_err("Failed to get memory for queue_pairs, nb_queue_pairs %u", - conf->nb_queue_pairs); - return -ENOMEM; - } - } else { /* Re-configure */ - void **queue_pairs; - - /* 
Release all queue pairs as ML spec doesn't support queue_pair_destroy. */ - for (qp_id = 0; qp_id < dev->data->nb_
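The split described in this patch is that the cnxk wrapper does the argument checks and fills the common fields, then hands over to the cn10k function for hardware-dependent values. A minimal sketch of that layering for dev_info_get, assuming simplified structures and made-up limits; the sketch_* names and the numeric values are hypothetical, only the order of operations follows the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limits; the driver's constants are not reproduced here. */
#define SKETCH_MAX_MODELS 16
#define SKETCH_MAX_QP_SL 16
#define SKETCH_MAX_QP_LF 1

struct sketch_dev_info {
	const char *driver_name;
	uint16_t max_models;
	uint16_t max_queue_pairs;
};

struct sketch_mldev {
	const char *driver_name;
	int hw_queue_lock;
};

/* Device-specific layer: fills only what depends on the hardware.
 * The caller is expected to have validated both pointers.
 */
static int
sketch_cn10k_dev_info_get(struct sketch_mldev *mldev, struct sketch_dev_info *info)
{
	assert(mldev != NULL && info != NULL);

	if (mldev->hw_queue_lock)
		info->max_queue_pairs = SKETCH_MAX_QP_SL;
	else
		info->max_queue_pairs = SKETCH_MAX_QP_LF;

	return 0;
}

/* Common layer: validate, fill common fields, then delegate. */
static int
sketch_cnxk_dev_info_get(struct sketch_mldev *mldev, struct sketch_dev_info *info)
{
	if (mldev == NULL || info == NULL)
		return -EINVAL;

	memset(info, 0, sizeof(*info));
	info->driver_name = mldev->driver_name;
	info->max_models = SKETCH_MAX_MODELS;

	return sketch_cn10k_dev_info_get(mldev, info);
}
```

Keeping the NULL checks and the memset in one place means a second device type would only need to supply its own hardware-specific function.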
[PATCH v8 17/34] ml/cnxk: move error handling to cnxk layer
Move error type structures to cnxk layer. cn10k layer to handle fw and hw error sub-types only. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.h | 41 ++- drivers/ml/cnxk/cn10k_ml_ops.c | 93 +- drivers/ml/cnxk/cnxk_ml_dev.c | 8 +++ drivers/ml/cnxk/cnxk_ml_dev.h | 18 +++ drivers/ml/cnxk/cnxk_ml_ops.c | 2 +- 5 files changed, 78 insertions(+), 84 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index 94a94d996f..2e7eb6c9ef 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -52,38 +52,27 @@ struct cnxk_ml_dev; struct cnxk_ml_req; struct cnxk_ml_qp; -/* Error types enumeration */ -enum cn10k_ml_error_etype { - /* 0x0 */ ML_ETYPE_NO_ERROR = 0, /* No error */ - /* 0x1 */ ML_ETYPE_FW_NONFATAL, /* Firmware non-fatal error */ - /* 0x2 */ ML_ETYPE_HW_NONFATAL, /* Hardware non-fatal error */ - /* 0x3 */ ML_ETYPE_HW_FATAL, /* Hardware fatal error */ - /* 0x4 */ ML_ETYPE_HW_WARNING, /* Hardware warning */ - /* 0x5 */ ML_ETYPE_DRIVER, /* Driver specific error */ - /* 0x6 */ ML_ETYPE_UNKNOWN, /* Unknown error */ -}; - /* Firmware non-fatal error sub-type */ enum cn10k_ml_error_stype_fw_nf { - /* 0x0 */ ML_FW_ERR_NOERR = 0, /* No error */ - /* 0x1 */ ML_FW_ERR_UNLOAD_ID_NOT_FOUND, /* Model ID not found during load */ - /* 0x2 */ ML_FW_ERR_LOAD_LUT_OVERFLOW, /* Lookup table overflow at load */ - /* 0x3 */ ML_FW_ERR_ID_IN_USE, /* Model ID already in use */ - /* 0x4 */ ML_FW_ERR_INVALID_TILEMASK,/* Invalid OCM tilemask */ - /* 0x5 */ ML_FW_ERR_RUN_LUT_OVERFLOW,/* Lookup table overflow at run */ - /* 0x6 */ ML_FW_ERR_RUN_ID_NOT_FOUND,/* Model ID not found during run */ - /* 0x7 */ ML_FW_ERR_COMMAND_NOTSUP, /* Unsupported command */ - /* 0x8 */ ML_FW_ERR_DDR_ADDR_RANGE, /* DDR address out of range */ - /* 0x9 */ ML_FW_ERR_NUM_BATCHES_INVALID, /* Invalid number of batches */ - /* 0xA */ ML_FW_ERR_INSSYNC_TIMEOUT, /* INS sync timeout */ + /* 0x0 */ ML_CN10K_FW_ERR_NOERR = 0, /* No error */ + /* 
0x1 */ ML_CN10K_FW_ERR_UNLOAD_ID_NOT_FOUND, /* Model ID not found during load */ + /* 0x2 */ ML_CN10K_FW_ERR_LOAD_LUT_OVERFLOW, /* Lookup table overflow at load */ + /* 0x3 */ ML_CN10K_FW_ERR_ID_IN_USE, /* Model ID already in use */ + /* 0x4 */ ML_CN10K_FW_ERR_INVALID_TILEMASK,/* Invalid OCM tilemask */ + /* 0x5 */ ML_CN10K_FW_ERR_RUN_LUT_OVERFLOW,/* Lookup table overflow at run */ + /* 0x6 */ ML_CN10K_FW_ERR_RUN_ID_NOT_FOUND,/* Model ID not found during run */ + /* 0x7 */ ML_CN10K_FW_ERR_COMMAND_NOTSUP, /* Unsupported command */ + /* 0x8 */ ML_CN10K_FW_ERR_DDR_ADDR_RANGE, /* DDR address out of range */ + /* 0x9 */ ML_CN10K_FW_ERR_NUM_BATCHES_INVALID, /* Invalid number of batches */ + /* 0xA */ ML_CN10K_FW_ERR_INSSYNC_TIMEOUT, /* INS sync timeout */ }; /* Driver error sub-type */ enum cn10k_ml_error_stype_driver { - /* 0x0 */ ML_DRIVER_ERR_NOERR = 0, /* No error */ - /* 0x1 */ ML_DRIVER_ERR_UNKNOWN, /* Unable to determine error sub-type */ - /* 0x2 */ ML_DRIVER_ERR_EXCEPTION, /* Firmware exception */ - /* 0x3 */ ML_DRIVER_ERR_FW_ERROR, /* Unknown firmware error */ + /* 0x0 */ ML_CN10K_DRIVER_ERR_NOERR = 0, /* No error */ + /* 0x1 */ ML_CN10K_DRIVER_ERR_UNKNOWN, /* Unable to determine error sub-type */ + /* 0x2 */ ML_CN10K_DRIVER_ERR_EXCEPTION, /* Firmware exception */ + /* 0x3 */ ML_CN10K_DRIVER_ERR_FW_ERROR, /* Unknown firmware error */ }; /* Error structure */ diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 8116c8dedb..65eaaf030d 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -22,47 +22,27 @@ #define ML_FLAGS_POLL_COMPL BIT(0) #define ML_FLAGS_SSO_COMPL BIT(1) -/* Error message length */ -#define ERRMSG_LEN 32 - -/* Error type database */ -static const struct cn10k_ml_etype_db { - enum cn10k_ml_error_etype etype; - char name[ERRMSG_LEN]; -} ml_etype_db[] = { - {ML_ETYPE_NO_ERROR, "NO_ERROR"},{ML_ETYPE_FW_NONFATAL, "FW_NON_FATAL"}, - {ML_ETYPE_HW_NONFATAL, "HW_NON_FATAL"}, {ML_ETYPE_HW_FATAL, 
"HW_FATAL"}, - {ML_ETYPE_HW_WARNING, "HW_WARNING"},{ML_ETYPE_DRIVER, "DRIVER_ERROR"}, - {ML_ETYPE_UNKNOWN, "UNKNOWN_ERROR"}, -}; - /* Hardware non-fatal error subtype database */ -static const struct cn10k_ml_stype_db_hw_nf { - enum cn10k_ml_error_stype_fw_nf stype; - char msg[ERRMSG_LEN]; -} ml_stype_db_hw_nf[] = { - {ML_FW_ERR_NOERR, "NO ERROR"}, - {ML_FW_ERR_UNLOAD_ID_NOT_FOUND, "UNLOAD MODEL ID NOT FOUND"}, - {ML_FW_ERR_LOAD_LUT_OVERFLOW, "LOAD LUT OVERFLOW"}, - {ML_F
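The error type database removed from the cn10k file pairs each error type with a short name. A sketch of such a table with a lookup helper follows. The enumerator order and the name strings mirror the removed code; the sketch_* identifiers and the lookup function itself are hypothetical, since the excerpt does not show how the driver consults the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ERRMSG_LEN 32

/* Same order as the error type enumeration removed by the patch. */
enum sketch_etype {
	SKETCH_ETYPE_NO_ERROR = 0,
	SKETCH_ETYPE_FW_NONFATAL,
	SKETCH_ETYPE_HW_NONFATAL,
	SKETCH_ETYPE_HW_FATAL,
	SKETCH_ETYPE_HW_WARNING,
	SKETCH_ETYPE_DRIVER,
	SKETCH_ETYPE_UNKNOWN,
};

/* Error type database: one entry per enumerator, in enumerator order. */
static const struct sketch_etype_db {
	enum sketch_etype etype;
	char name[SKETCH_ERRMSG_LEN];
} sketch_etype_table[] = {
	{SKETCH_ETYPE_NO_ERROR, "NO_ERROR"},
	{SKETCH_ETYPE_FW_NONFATAL, "FW_NON_FATAL"},
	{SKETCH_ETYPE_HW_NONFATAL, "HW_NON_FATAL"},
	{SKETCH_ETYPE_HW_FATAL, "HW_FATAL"},
	{SKETCH_ETYPE_HW_WARNING, "HW_WARNING"},
	{SKETCH_ETYPE_DRIVER, "DRIVER_ERROR"},
	{SKETCH_ETYPE_UNKNOWN, "UNKNOWN_ERROR"},
};

/* Map an error type to its name; out-of-range values map to UNKNOWN_ERROR. */
static const char *
sketch_etype_name(unsigned int etype)
{
	if (etype > (unsigned int)SKETCH_ETYPE_UNKNOWN)
		return "UNKNOWN_ERROR";

	/* The table is indexed by error type; verify the entry matches. */
	assert((unsigned int)sketch_etype_table[etype].etype == etype);

	return sketch_etype_table[etype].name;
}
```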
[PATCH v8 15/34] ml/cnxk: update device and model xstats functions
Added cnxk wrapper function to handle ML device and model extended stats. Handling resources for the xstats is done in the cnxk layer. Introduced internal xstats group. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.h | 4 - drivers/ml/cnxk/cn10k_ml_ops.c | 531 +++ drivers/ml/cnxk/cn10k_ml_ops.h | 16 +- drivers/ml/cnxk/cnxk_ml_dev.h| 5 + drivers/ml/cnxk/cnxk_ml_ops.c| 481 +++- drivers/ml/cnxk/cnxk_ml_xstats.h | 21 +- 6 files changed, 551 insertions(+), 507 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index be989e0a20..bde9d08901 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -10,7 +10,6 @@ #include "cn10k_ml_ocm.h" #include "cnxk_ml_io.h" -#include "cnxk_ml_xstats.h" /* Dummy Device ops */ extern struct rte_ml_dev_ops ml_dev_dummy_ops; @@ -133,9 +132,6 @@ struct cn10k_ml_dev { /* OCM info */ struct cn10k_ml_ocm ocm; - /* Extended stats data */ - struct cnxk_ml_xstats xstats; - /* Enable / disable model data caching */ int cache_model_data; diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 8cbf700f6e..776ad60401 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -198,107 +198,21 @@ cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_r req->cn10k_req.jd.model_run.num_batches = op->nb_batches; } -static int -cn10k_ml_xstats_init(struct rte_ml_dev *dev) -{ - struct cn10k_ml_dev *cn10k_mldev; - struct cnxk_ml_dev *cnxk_mldev; - uint16_t nb_stats; - uint16_t stat_id; - uint16_t model; - uint16_t i; - - cnxk_mldev = dev->data->dev_private; - cn10k_mldev = &cnxk_mldev->cn10k_mldev; - - /* Allocate memory for xstats entries. 
Don't allocate during reconfigure */ - nb_stats = RTE_DIM(device_xstats) + ML_CNXK_MAX_MODELS * RTE_DIM(layer_xstats); - if (cn10k_mldev->xstats.entries == NULL) - cn10k_mldev->xstats.entries = rte_zmalloc( - "cn10k_ml_xstats", sizeof(struct cnxk_ml_xstats_entry) * nb_stats, - PLT_CACHE_LINE_SIZE); - - if (cn10k_mldev->xstats.entries == NULL) - return -ENOMEM; - - /* Initialize device xstats */ - stat_id = 0; - for (i = 0; i < RTE_DIM(device_xstats); i++) { - cn10k_mldev->xstats.entries[stat_id].map.id = stat_id; - snprintf(cn10k_mldev->xstats.entries[stat_id].map.name, -sizeof(cn10k_mldev->xstats.entries[stat_id].map.name), "%s", -device_xstats[i].name); - - cn10k_mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_DEVICE; - cn10k_mldev->xstats.entries[stat_id].type = device_xstats[i].type; - cn10k_mldev->xstats.entries[stat_id].fn_id = CNXK_ML_XSTATS_FN_DEVICE; - cn10k_mldev->xstats.entries[stat_id].obj_idx = 0; - cn10k_mldev->xstats.entries[stat_id].reset_allowed = device_xstats[i].reset_allowed; - stat_id++; - } - cn10k_mldev->xstats.count_mode_device = stat_id; - - /* Initialize model xstats */ - for (model = 0; model < ML_CNXK_MAX_MODELS; model++) { - cn10k_mldev->xstats.offset_for_model[model] = stat_id; - - for (i = 0; i < RTE_DIM(layer_xstats); i++) { - cn10k_mldev->xstats.entries[stat_id].map.id = stat_id; - cn10k_mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_MODEL; - cn10k_mldev->xstats.entries[stat_id].type = layer_xstats[i].type; - cn10k_mldev->xstats.entries[stat_id].fn_id = CNXK_ML_XSTATS_FN_MODEL; - cn10k_mldev->xstats.entries[stat_id].obj_idx = model; - cn10k_mldev->xstats.entries[stat_id].reset_allowed = - layer_xstats[i].reset_allowed; - - /* Name of xstat is updated during model load */ - snprintf(cn10k_mldev->xstats.entries[stat_id].map.name, - sizeof(cn10k_mldev->xstats.entries[stat_id].map.name), -"Model-%u-%s", model, layer_xstats[i].name); - - stat_id++; - } - - cn10k_mldev->xstats.count_per_model[model] = 
RTE_DIM(layer_xstats); - } - - cn10k_mldev->xstats.count_mode_model = stat_id - cn10k_mldev->xstats.count_mode_device; - cn10k_mldev->xstats.count = stat_id; - - return 0; -} - static void -cn10k_ml_xstats_uninit(struct rte_ml_dev *dev) +cn10k_ml_xstats_layer_name_update(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, + uint16_t layer_id) { - struct cn10k_
[PATCH v8 08/34] ml/cnxk: update queue-pair handling functions
Added cnxk wrapper function to handle ML device queue-pairs. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ops.c | 135 + drivers/ml/cnxk/cn10k_ml_ops.h | 7 +- drivers/ml/cnxk/cnxk_ml_ops.c | 153 - drivers/ml/cnxk/cnxk_ml_ops.h | 3 - 4 files changed, 154 insertions(+), 144 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index f8c51ab394..9691cf03e3 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -95,93 +95,12 @@ cn10k_ml_get_poll_ptr(struct cnxk_ml_req *req) return plt_read64(req->status); } -static void -qp_memzone_name_get(char *name, int size, int dev_id, int qp_id) -{ - snprintf(name, size, "cnxk_ml_qp_mem_%u:%u", dev_id, qp_id); -} - -int -cnxk_ml_qp_destroy(const struct rte_ml_dev *dev, struct cnxk_ml_qp *qp) -{ - const struct rte_memzone *qp_mem; - char name[RTE_MEMZONE_NAMESIZE]; - int ret; - - qp_memzone_name_get(name, RTE_MEMZONE_NAMESIZE, dev->data->dev_id, qp->id); - qp_mem = rte_memzone_lookup(name); - ret = rte_memzone_free(qp_mem); - if (ret) - return ret; - - rte_free(qp); - - return 0; -} - -int -cn10k_ml_dev_queue_pair_release(struct rte_ml_dev *dev, uint16_t queue_pair_id) -{ - struct cnxk_ml_qp *qp; - int ret; - - qp = dev->data->queue_pairs[queue_pair_id]; - if (qp == NULL) - return -EINVAL; - - ret = cnxk_ml_qp_destroy(dev, qp); - if (ret) { - plt_err("Could not destroy queue pair %u", queue_pair_id); - return ret; - } - - dev->data->queue_pairs[queue_pair_id] = NULL; - - return 0; -} - -static struct cnxk_ml_qp * -cnxk_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_desc, int socket_id) +void +cn10k_ml_qp_initialize(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_qp *qp) { - const struct rte_memzone *qp_mem; - char name[RTE_MEMZONE_NAMESIZE]; - struct cnxk_ml_qp *qp; - uint32_t len; - uint8_t *va; uint64_t i; - /* Allocate queue pair */ - qp = rte_zmalloc_socket("cn10k_ml_pmd_queue_pair", sizeof(struct cnxk_ml_qp), ROC_ALIGN, - 
socket_id); - if (qp == NULL) { - plt_err("Could not allocate queue pair"); - return NULL; - } - - /* For request queue */ - len = nb_desc * sizeof(struct cnxk_ml_req); - qp_memzone_name_get(name, RTE_MEMZONE_NAMESIZE, dev->data->dev_id, qp_id); - qp_mem = rte_memzone_reserve_aligned( - name, len, socket_id, RTE_MEMZONE_SIZE_HINT_ONLY | RTE_MEMZONE_256MB, ROC_ALIGN); - if (qp_mem == NULL) { - plt_err("Could not reserve memzone: %s", name); - goto qp_free; - } - - va = qp_mem->addr; - memset(va, 0, len); - - /* Initialize Request queue */ - qp->id = qp_id; - qp->queue.reqs = (struct cnxk_ml_req *)va; - qp->queue.head = 0; - qp->queue.tail = 0; - qp->queue.wait_cycles = ML_CNXK_CMD_TIMEOUT * plt_tsc_hz(); - qp->nb_desc = nb_desc; - qp->stats.enqueued_count = 0; - qp->stats.dequeued_count = 0; - qp->stats.enqueue_err_count = 0; - qp->stats.dequeue_err_count = 0; + RTE_SET_USED(cnxk_mldev); /* Initialize job command */ for (i = 0; i < qp->nb_desc; i++) { @@ -189,13 +108,6 @@ cnxk_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t nb_desc qp->queue.reqs[i].cn10k_req.jcmd.w1.s.jobptr = PLT_U64_CAST(&qp->queue.reqs[i].cn10k_req.jd); } - - return qp; - -qp_free: - rte_free(qp); - - return NULL; } static void @@ -1002,47 +914,6 @@ cn10k_ml_dev_stop(struct cnxk_ml_dev *cnxk_mldev) return 0; } -int -cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id, - const struct rte_ml_dev_qp_conf *qp_conf, int socket_id) -{ - struct rte_ml_dev_info dev_info; - struct cnxk_ml_qp *qp; - uint32_t nb_desc; - - if (queue_pair_id >= dev->data->nb_queue_pairs) { - plt_err("Queue-pair id = %u (>= max queue pairs supported, %u)\n", queue_pair_id, - dev->data->nb_queue_pairs); - return -EINVAL; - } - - if (dev->data->queue_pairs[queue_pair_id] != NULL) - cn10k_ml_dev_queue_pair_release(dev, queue_pair_id); - - cnxk_ml_dev_info_get(dev, &dev_info); - if ((qp_conf->nb_desc > dev_info.max_desc) || (qp_conf->nb_desc == 0)) { - plt_err("Could not setup queue 
pair for %u descriptors", qp_conf->nb_desc); - return -EINVAL; - } - plt_ml_dbg("Creating queue-pair, queue_pair_id = %u, nb_desc = %u", queue_pair_id, - qp
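After this patch the cn10k file keeps only cn10k_ml_qp_initialize, which points each job command at its own job descriptor, while allocation and descriptor-count checks leave the file. A sketch of that division of labour, assuming the common create path calls the device-specific step after allocating. The excerpt does not show the new call site, so that ordering is an assumption, and calloc stands in for the driver's rte_zmalloc and memzone calls. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_req {
	uint64_t jobptr; /* stands in for cn10k_req.jcmd.w1.s.jobptr */
	uint64_t jd;     /* stands in for the job descriptor */
};

struct sketch_qp {
	uint32_t nb_desc;
	uint64_t head;
	uint64_t tail;
	struct sketch_req *reqs;
};

/* Device-specific step: point each job command at its own job descriptor. */
static void
sketch_cn10k_qp_initialize(struct sketch_qp *qp)
{
	uint64_t i;

	assert(qp != NULL && qp->reqs != NULL);
	for (i = 0; i < qp->nb_desc; i++)
		qp->reqs[i].jobptr = (uint64_t)(uintptr_t)&qp->reqs[i].jd;
}

/* Common step: validate the descriptor count, allocate, then delegate.
 * Returns NULL on an invalid count or allocation failure.
 */
static struct sketch_qp *
sketch_cnxk_qp_create(uint32_t nb_desc, uint32_t max_desc)
{
	struct sketch_qp *qp;

	if (nb_desc == 0 || nb_desc > max_desc)
		return NULL;

	qp = calloc(1, sizeof(*qp));
	if (qp == NULL)
		return NULL;

	qp->reqs = calloc(nb_desc, sizeof(*qp->reqs));
	if (qp->reqs == NULL) {
		free(qp);
		return NULL;
	}
	qp->nb_desc = nb_desc;

	sketch_cn10k_qp_initialize(qp);

	return qp;
}

/* Release a queue pair created by sketch_cnxk_qp_create(). */
static void
sketch_cnxk_qp_destroy(struct sketch_qp *qp)
{
	if (qp == NULL)
		return;

	free(qp->reqs);
	free(qp);
}
```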
[PATCH v8 10/34] ml/cnxk: update model start and stop functions
Implemented cnxk wrapper functions to start and stop ML models. Wrapper functions would invoke the cn10k model start and stop functions. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ocm.c | 28 ++-- drivers/ml/cnxk/cn10k_ml_ocm.h | 12 +- drivers/ml/cnxk/cn10k_ml_ops.c | 282 - drivers/ml/cnxk/cn10k_ml_ops.h | 8 +- drivers/ml/cnxk/cnxk_ml_ops.c | 48 +- drivers/ml/cnxk/cnxk_ml_ops.h | 1 + 6 files changed, 240 insertions(+), 139 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.c b/drivers/ml/cnxk/cn10k_ml_ocm.c index d71c36eae6..2197e5e0ed 100644 --- a/drivers/ml/cnxk/cn10k_ml_ocm.c +++ b/drivers/ml/cnxk/cn10k_ml_ocm.c @@ -215,11 +215,10 @@ cn10k_ml_ocm_tilecount(uint64_t tilemask, int *start, int *end) * scratch & WB pages and OCM allocation mode. */ int -cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t wb_pages, +cn10k_ml_ocm_tilemask_find(struct cnxk_ml_dev *cnxk_mldev, uint8_t num_tiles, uint16_t wb_pages, uint16_t scratch_pages, uint64_t *tilemask) { struct cn10k_ml_dev *cn10k_mldev; - struct cnxk_ml_dev *cnxk_mldev; struct cn10k_ml_ocm *ocm; uint16_t used_scratch_pages_max; @@ -238,7 +237,6 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t w int max_slot_sz; int page_id; - cnxk_mldev = dev->data->dev_private; cn10k_mldev = &cnxk_mldev->cn10k_mldev; ocm = &cn10k_mldev->ocm; @@ -333,12 +331,10 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t w } void -cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t layer_id, +cn10k_ml_ocm_reserve_pages(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, uint16_t layer_id, uint64_t tilemask, int wb_page_start, uint16_t wb_pages, uint16_t scratch_pages) { - struct cn10k_ml_dev *cn10k_mldev; - struct cnxk_ml_dev *cnxk_mldev; struct cnxk_ml_model *model; struct cnxk_ml_layer *layer; struct cn10k_ml_ocm *ocm; @@ -351,10 +347,8 @@ cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t 
model_id, uint16_t l int tile_id; int page_id; - cnxk_mldev = dev->data->dev_private; - cn10k_mldev = &cnxk_mldev->cn10k_mldev; - ocm = &cn10k_mldev->ocm; - model = dev->data->models[model_id]; + ocm = &cnxk_mldev->cn10k_mldev.ocm; + model = cnxk_mldev->mldev->data->models[model_id]; layer = &model->layer[layer_id]; /* Get first set bit, tile_start */ @@ -396,12 +390,10 @@ cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t l } void -cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t layer_id) +cn10k_ml_ocm_free_pages(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, uint16_t layer_id) { struct cnxk_ml_model *local_model; struct cnxk_ml_layer *local_layer; - struct cn10k_ml_dev *cn10k_mldev; - struct cnxk_ml_dev *cnxk_mldev; struct cnxk_ml_model *model; struct cnxk_ml_layer *layer; struct cn10k_ml_ocm *ocm; @@ -416,10 +408,8 @@ cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t laye uint16_t i; uint16_t j; - cnxk_mldev = dev->data->dev_private; - cn10k_mldev = &cnxk_mldev->cn10k_mldev; - ocm = &cn10k_mldev->ocm; - model = dev->data->models[model_id]; + ocm = &cnxk_mldev->cn10k_mldev.ocm; + model = cnxk_mldev->mldev->data->models[model_id]; layer = &model->layer[layer_id]; /* Update OCM info for WB memory */ @@ -438,8 +428,8 @@ cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t laye /* Get max scratch pages required, excluding the current model */ scratch_resize_pages = 0; - for (i = 0; i < dev->data->nb_models; i++) { - local_model = dev->data->models[i]; + for (i = 0; i < cnxk_mldev->mldev->data->nb_models; i++) { + local_model = cnxk_mldev->mldev->data->models[i]; if (local_model == NULL) continue; diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.h b/drivers/ml/cnxk/cn10k_ml_ocm.h index 720f8caf76..97b723a56a 100644 --- a/drivers/ml/cnxk/cn10k_ml_ocm.h +++ b/drivers/ml/cnxk/cn10k_ml_ocm.h @@ -8,6 +8,8 @@ #include #include +struct cnxk_ml_dev; + /* Number of OCM tiles. 
*/ #define ML_CN10K_OCM_NUMTILES 0x8 @@ -75,12 +77,12 @@ struct cn10k_ml_ocm { }; int cn10k_ml_ocm_tilecount(uint64_t tilemask, int *start, int *end); -int cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t wb_pages, +int cn10k_ml_ocm_tilemask_find(struct cnxk_ml_dev *cnxk_mldev, uint8_t num_tiles, uint16_t wb_pages,
[PATCH v8 19/34] ml/cnxk: add structures to support TVM model type
Introduced model type, sub-type and layer type. Added internal structures for TVM model objects. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ocm.c | 3 ++ drivers/ml/cnxk/cn10k_ml_ops.c | 6 ++- drivers/ml/cnxk/cnxk_ml_model.h | 66 +++- drivers/ml/cnxk/cnxk_ml_ops.c| 52 - drivers/ml/cnxk/mvtvm_ml_model.h | 46 ++ 5 files changed, 160 insertions(+), 13 deletions(-) create mode 100644 drivers/ml/cnxk/mvtvm_ml_model.h diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.c b/drivers/ml/cnxk/cn10k_ml_ocm.c index dc315cce10..749ddeb344 100644 --- a/drivers/ml/cnxk/cn10k_ml_ocm.c +++ b/drivers/ml/cnxk/cn10k_ml_ocm.c @@ -435,6 +435,9 @@ cn10k_ml_ocm_free_pages(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, uint1 for (j = 0; j < local_model->nb_layers; j++) { local_layer = &local_model->layer[j]; + if (local_layer->type != ML_CNXK_LAYER_TYPE_MRVL) + continue; + if (local_layer != layer && local_layer->glow.ocm_map.ocm_reserved) { if (IS_BIT_SET(local_layer->glow.ocm_map.tilemask, tile_id)) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 65eaaf030d..a471e98fbf 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -725,6 +725,9 @@ cn10k_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params * if (ret != 0) return ret; + /* Set model sub type */ + model->subtype = ML_CNXK_MODEL_SUBTYPE_GLOW_MRVL; + /* Copy metadata to internal buffer */ rte_memcpy(&model->glow.metadata, params->addr, sizeof(struct cn10k_ml_model_metadata)); cn10k_ml_model_metadata_update(&model->glow.metadata); @@ -746,6 +749,7 @@ cn10k_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params * /* Load layer and get the index */ layer = &model->layer[0]; + layer->type = ML_CNXK_LAYER_TYPE_MRVL; ret = cn10k_ml_layer_load(cnxk_mldev, model->model_id, NULL, params->addr, params->size, &layer->index); if (ret != 0) { @@ -969,7 +973,7 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const char 
*layer_name) if (ret < 0) { cn10k_ml_layer_stop(device, model_id, layer_name); } else { - if (cn10k_mldev->cache_model_data) + if (cn10k_mldev->cache_model_data && model->type == ML_CNXK_MODEL_TYPE_GLOW) ret = cn10k_ml_cache_model_data(cnxk_mldev, layer); } diff --git a/drivers/ml/cnxk/cnxk_ml_model.h b/drivers/ml/cnxk/cnxk_ml_model.h index f618e5aa5f..f100eca203 100644 --- a/drivers/ml/cnxk/cnxk_ml_model.h +++ b/drivers/ml/cnxk/cnxk_ml_model.h @@ -11,6 +11,10 @@ #include "cn10k_ml_model.h" +#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM +#include "mvtvm_ml_model.h" +#endif + #include "cnxk_ml_io.h" struct cnxk_ml_dev; @@ -18,6 +22,48 @@ struct cnxk_ml_model; struct cnxk_ml_qp; struct cnxk_ml_req; +/* Model type */ +enum cnxk_ml_model_type { + /* Unknown model type */ + ML_CNXK_MODEL_TYPE_UNKNOWN, + + /* Invalid model type */ + ML_CNXK_MODEL_TYPE_INVALID, + + /* Glow compiled model, for MLIP target */ + ML_CNXK_MODEL_TYPE_GLOW, + + /* TVM compiled model, for ARM64 / ARM64 + MLIP target */ + ML_CNXK_MODEL_TYPE_TVM, +}; + +/* Model subtype */ +enum cnxk_ml_model_subtype { + /* Marvell Glow model */ + ML_CNXK_MODEL_SUBTYPE_GLOW_MRVL, + + /* TVM model with single MRVL region */ + ML_CNXK_MODEL_SUBTYPE_TVM_MRVL, + + /* TVM model with LLVM regions only */ + ML_CNXK_MODEL_SUBTYPE_TVM_LLVM, + + /* TVM hybrid model, with both MRVL and LLVM regions or (> 1) MRVL regions*/ + ML_CNXK_MODEL_SUBTYPE_TVM_HYBRID, +}; + +/* Layer type */ +enum cnxk_ml_layer_type { + /* MRVL layer, for MLIP target*/ + ML_CNXK_LAYER_TYPE_UNKNOWN = 0, + + /* MRVL layer, for MLIP target*/ + ML_CNXK_LAYER_TYPE_MRVL, + + /* LLVM layer, for ARM64 target*/ + ML_CNXK_LAYER_TYPE_LLVM, +}; + /* Model state */ enum cnxk_ml_model_state { /* Unknown state */ @@ -53,6 +99,9 @@ struct cnxk_ml_layer { /* Name*/ char name[RTE_ML_STR_MAX]; + /* Type */ + enum cnxk_ml_layer_type type; + /* Model handle */ struct cnxk_ml_model *model; @@ -83,14 +132,27 @@ struct cnxk_ml_model { /* Device reference */ struct cnxk_ml_dev 
*cnxk_mldev; + /* Type */ + enum cnxk_ml_model_type type; + + /* Model subtype */ + enum cnxk_ml_model_subtype subtype; + /* ID */ uint16_t model_id; /* Name */ char name[R
[PATCH v8 11/34] ml/cnxk: update model utility functions
Added cnxk wrapper function to update model params and
fetch model info.

Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 38 ++-
 drivers/ml/cnxk/cn10k_ml_ops.h | 5 ++--
 drivers/ml/cnxk/cnxk_ml_ops.c | 48 --
 3 files changed, 56 insertions(+), 35 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 40f484158a..3ff82829f0 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1835,45 +1835,23 @@ cn10k_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model)
 }
 
 int
-cn10k_ml_model_info_get(struct rte_ml_dev *dev, uint16_t model_id,
-			struct rte_ml_model_info *model_info)
+cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
+			     void *buffer)
 {
-	struct cnxk_ml_model *model;
-
-	model = dev->data->models[model_id];
-
-	if (model == NULL) {
-		plt_err("Invalid model_id = %u", model_id);
-		return -EINVAL;
-	}
-
-	rte_memcpy(model_info, model->info, sizeof(struct rte_ml_model_info));
-	model_info->input_info = ((struct rte_ml_model_info *)model->info)->input_info;
-	model_info->output_info = ((struct rte_ml_model_info *)model->info)->output_info;
-
-	return 0;
-}
-
-int
-cn10k_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buffer)
-{
-	struct cnxk_ml_model *model;
-
-	model = dev->data->models[model_id];
+	struct cnxk_ml_layer *layer;
 
-	if (model == NULL) {
-		plt_err("Invalid model_id = %u", model_id);
-		return -EINVAL;
-	}
+	RTE_SET_USED(cnxk_mldev);
 
 	if (model->state == ML_CNXK_MODEL_STATE_UNKNOWN)
 		return -1;
 	else if (model->state != ML_CNXK_MODEL_STATE_LOADED)
 		return -EBUSY;
 
+	layer = &model->layer[0];
+
 	/* Update model weights & bias */
-	rte_memcpy(model->layer[0].glow.addr.wb_load_addr, buffer,
-		   model->layer[0].glow.metadata.weights_bias.file_size);
+	rte_memcpy(layer->glow.addr.wb_load_addr, buffer,
+		   layer->glow.metadata.weights_bias.file_size);
 
 	return 0;
 }
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index a222a43d55..ef12069f0d 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -317,9 +317,8 @@ int cn10k_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_para
 int cn10k_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
 int cn10k_ml_model_start(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
 int cn10k_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
-int cn10k_ml_model_info_get(struct rte_ml_dev *dev, uint16_t model_id,
-			    struct rte_ml_model_info *model_info);
-int cn10k_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buffer);
+int cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
+				 void *buffer);
 
 /* I/O ops */
 int cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id,
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index b61ed45876..9ce37fcfd1 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -604,6 +604,50 @@ cnxk_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id)
 	return cn10k_ml_model_stop(cnxk_mldev, model);
 }
 
+static int
+cnxk_ml_model_info_get(struct rte_ml_dev *dev, uint16_t model_id,
+		       struct rte_ml_model_info *model_info)
+{
+	struct rte_ml_model_info *info;
+	struct cnxk_ml_model *model;
+
+	if ((dev == NULL) || (model_info == NULL))
+		return -EINVAL;
+
+	model = dev->data->models[model_id];
+	if (model == NULL) {
+		plt_err("Invalid model_id = %u", model_id);
+		return -EINVAL;
+	}
+
+	info = (struct rte_ml_model_info *)model->info;
+	rte_memcpy(model_info, info, sizeof(struct rte_ml_model_info));
+	model_info->input_info = info->input_info;
+	model_info->output_info = info->output_info;
+
+	return 0;
+}
+
+static int
+cnxk_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buffer)
+{
+	struct cnxk_ml_dev *cnxk_mldev;
+	struct cnxk_ml_model *model;
+
+	if ((dev == NULL) || (buffer == NULL))
+		return -EINVAL;
+
+	cnxk_mldev = dev->data->dev_private;
+
+	model = dev->data->models[model_id];
+	if (model == NULL) {
+		plt_err("Invalid model_id = %u", model_id);
+		return -EINVAL;
+	}
+
+	return cn10k_ml_model_params_update(cnxk_mldev, model, buffer);
+}
+
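The split in this patch (generic wrapper validates and resolves the model, hardware-specific function does the work) can be sketched in isolation. This is only an illustration under assumptions: every `sketch_*` name, the fixed-size model table and the 16-byte weights area are hypothetical and not part of the driver, and the bounds check on `model_id` is an addition that the patch itself does not contain.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_MODELS 4 /* hypothetical table size */

enum sketch_state { SKETCH_STATE_UNKNOWN, SKETCH_STATE_LOADED, SKETCH_STATE_STARTED };

struct sketch_model {
	enum sketch_state state;
	uint8_t weights[16]; /* stand-in for the weights and bias area */
};

struct sketch_dev {
	struct sketch_model *models[SKETCH_MAX_MODELS];
};

/* Backend step: only the state check and the copy remain here. */
int
sketch_backend_params_update(struct sketch_model *model, const void *buffer)
{
	if (model->state == SKETCH_STATE_UNKNOWN)
		return -1;
	else if (model->state != SKETCH_STATE_LOADED)
		return -EBUSY;

	memcpy(model->weights, buffer, sizeof(model->weights));
	return 0;
}

/* Wrapper step: validate arguments, resolve the model, then delegate. */
int
sketch_params_update(struct sketch_dev *dev, uint16_t model_id, const void *buffer)
{
	struct sketch_model *model;

	if (dev == NULL || buffer == NULL)
		return -EINVAL;

	/* Bounds check added for the sketch; the patch indexes directly. */
	if (model_id >= SKETCH_MAX_MODELS)
		return -EINVAL;

	model = dev->models[model_id];
	if (model == NULL)
		return -EINVAL;

	return sketch_backend_params_update(model, buffer);
}
```

With this shape the backend never sees a model id, so it can also be called from code paths that already hold a model pointer.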
[PATCH v8 21/34] ml/cnxk: add support to parse TVM model objects
Added support to parse TVM model objects from the model archive buffer. Added support to check for all expected objects and copy TVM model objects to internal buffers. Signed-off-by: Srikanth Yalavarthi Signed-off-by: Anup Prabhu --- drivers/ml/cnxk/cnxk_ml_ops.c| 5 ++- drivers/ml/cnxk/mvtvm_ml_model.c | 57 + drivers/ml/cnxk/mvtvm_ml_model.h | 2 ++ drivers/ml/cnxk/mvtvm_ml_ops.c | 62 drivers/ml/cnxk/mvtvm_ml_ops.h | 3 ++ drivers/ml/cnxk/mvtvm_ml_stubs.c | 11 ++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 3 ++ 7 files changed, 142 insertions(+), 1 deletion(-) diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index ebc78e36e9..85b37161d2 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -1079,7 +1079,10 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, u model, PLT_ALIGN_CEIL(sizeof(struct cnxk_ml_model), dev_info.align_size)); dev->data->models[lcl_model_id] = model; - ret = cn10k_ml_model_load(cnxk_mldev, params, model); + if (type == ML_CNXK_MODEL_TYPE_GLOW) + ret = cn10k_ml_model_load(cnxk_mldev, params, model); + else + ret = mvtvm_ml_model_load(cnxk_mldev, params, model); if (ret != 0) goto error; diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c index ab5f8baa67..4c9a080c05 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.c +++ b/drivers/ml/cnxk/mvtvm_ml_model.c @@ -53,3 +53,60 @@ mvtvm_ml_model_type_get(struct rte_ml_model_params *params) return ML_CNXK_MODEL_TYPE_TVM; } + +int +mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params, struct mvtvm_ml_model_object *object) +{ + bool object_found[ML_MVTVM_MODEL_OBJECT_MAX] = {false, false, false}; + struct archive_entry *entry; + struct archive *a; + uint8_t i; + int ret; + + /* Open archive */ + a = archive_read_new(); + archive_read_support_filter_all(a); + archive_read_support_format_all(a); + + ret = archive_read_open_memory(a, params->addr, params->size); + if (ret != ARCHIVE_OK) + return 
archive_errno(a); + + /* Read archive */ + while (archive_read_next_header(a, &entry) == ARCHIVE_OK) { + for (i = 0; i < ML_MVTVM_MODEL_OBJECT_MAX; i++) { + if (!object_found[i] && + (strcmp(archive_entry_pathname(entry), mvtvm_object_list[i]) == 0)) { + memcpy(object[i].name, mvtvm_object_list[i], RTE_ML_STR_MAX); + object[i].size = archive_entry_size(entry); + object[i].buffer = rte_malloc(NULL, object[i].size, 0); + + if (archive_read_data(a, object[i].buffer, object[i].size) != + object[i].size) { + plt_err("Failed to read object from model archive: %s", + object[i].name); + goto error; + } + object_found[i] = true; + } + } + archive_read_data_skip(a); + } + + /* Check if all objects are parsed */ + for (i = 0; i < ML_MVTVM_MODEL_OBJECT_MAX; i++) { + if (!object_found[i]) { + plt_err("Object %s not found in archive!\n", mvtvm_object_list[i]); + goto error; + } + } + return 0; + +error: + for (i = 0; i < ML_MVTVM_MODEL_OBJECT_MAX; i++) { + if (object[i].buffer != NULL) + rte_free(object[i].buffer); + } + + return -EINVAL; +} diff --git a/drivers/ml/cnxk/mvtvm_ml_model.h b/drivers/ml/cnxk/mvtvm_ml_model.h index b6162fceec..b11b66f495 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.h +++ b/drivers/ml/cnxk/mvtvm_ml_model.h @@ -44,5 +44,7 @@ struct mvtvm_ml_model_data { }; enum cnxk_ml_model_type mvtvm_ml_model_type_get(struct rte_ml_model_params *params); +int mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params, + struct mvtvm_ml_model_object *object); #endif /* _MVTVM_ML_MODEL_H_ */ diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c index 88c6d5a864..e2413b6b15 100644 --- a/drivers/ml/cnxk/mvtvm_ml_ops.c +++ b/drivers/ml/cnxk/mvtvm_ml_ops.c @@ -8,8 +8,12 @@ #include #include "cnxk_ml_dev.h" +#include "cnxk_ml_model.h" #include "cnxk_ml_ops.h" +/* ML model macros */ +#define MVTVM_ML_MODEL_MEMZONE_NAME "ml_mvtvm_model_mz" + int mvtvm_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct rte_ml_dev_config *conf) { @@ -39,3 
+43,61 @@ mvtvm_ml_dev_close(struct cnxk_ml_dev *cnxk_mldev)
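The "check for all expected objects" step of the parser can be looked at without libarchive. A minimal sketch, assuming the archive entry names have already been collected into an array; the three object names and all `sketch_*` identifiers are hypothetical placeholders, since the contents of `mvtvm_object_list` are not shown in this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_OBJECT_MAX 3

/* Hypothetical object names; the driver keeps its own list in mvtvm_object_list. */
static const char *const sketch_object_list[SKETCH_OBJECT_MAX] = {
	"mod.so", "mod.json", "mod.params",
};

/* Return 0 when every expected object appears among the entries, else -EINVAL. */
int
sketch_check_objects(const char *const *entries, size_t nb_entries)
{
	bool found[SKETCH_OBJECT_MAX] = {false};
	size_t e;
	size_t i;

	/* Mark each expected object the first time a matching entry is seen */
	for (e = 0; e < nb_entries; e++) {
		for (i = 0; i < SKETCH_OBJECT_MAX; i++) {
			if (!found[i] && strcmp(entries[e], sketch_object_list[i]) == 0)
				found[i] = true;
		}
	}

	/* Any object still unmarked means the archive is incomplete */
	for (i = 0; i < SKETCH_OBJECT_MAX; i++) {
		if (!found[i])
			return -EINVAL;
	}

	return 0;
}
```

Unrelated entries are ignored and entry order does not matter, which matches the loop structure in mvtvm_ml_model_blob_parse().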
[PATCH v8 12/34] ml/cnxk: update data quantization functions
Added cnxk wrapper functions to quantize input data and dequantize output data. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ops.c | 164 - drivers/ml/cnxk/cn10k_ml_ops.h | 7 -- drivers/ml/cnxk/cnxk_ml_io.c | 95 +++ drivers/ml/cnxk/cnxk_ml_io.h | 3 + drivers/ml/cnxk/cnxk_ml_ops.c | 78 +++- drivers/ml/cnxk/meson.build| 1 + 6 files changed, 175 insertions(+), 173 deletions(-) create mode 100644 drivers/ml/cnxk/cnxk_ml_io.c diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 3ff82829f0..c68e6c620c 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -1856,170 +1856,6 @@ cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_mode return 0; } -int -cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **dbuffer, -struct rte_ml_buff_seg **qbuffer) -{ - struct cnxk_ml_model *model; - uint8_t model_input_type; - uint8_t *lcl_dbuffer; - uint8_t *lcl_qbuffer; - uint8_t input_type; - float qscale; - uint32_t i; - uint32_t j; - int ret; - - model = dev->data->models[model_id]; - - if (model == NULL) { - plt_err("Invalid model_id = %u", model_id); - return -EINVAL; - } - - lcl_dbuffer = dbuffer[0]->addr; - lcl_qbuffer = qbuffer[0]->addr; - - for (i = 0; i < model->layer[0].glow.metadata.model.num_input; i++) { - if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) { - input_type = model->layer[0].glow.metadata.input1[i].input_type; - model_input_type = model->layer[0].glow.metadata.input1[i].model_input_type; - qscale = model->layer[0].glow.metadata.input1[i].qscale; - } else { - j = i - MRVL_ML_NUM_INPUT_OUTPUT_1; - input_type = model->layer[0].glow.metadata.input2[j].input_type; - model_input_type = model->layer[0].glow.metadata.input2[j].model_input_type; - qscale = model->layer[0].glow.metadata.input2[j].qscale; - } - - if (input_type == model_input_type) { - rte_memcpy(lcl_qbuffer, lcl_dbuffer, model->layer[0].info.input[i].sz_d); - } else { - switch 
(model->layer[0].glow.metadata.input1[i].model_input_type) { - case RTE_ML_IO_TYPE_INT8: - ret = rte_ml_io_float32_to_int8( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_UINT8: - ret = rte_ml_io_float32_to_uint8( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_INT16: - ret = rte_ml_io_float32_to_int16( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_UINT16: - ret = rte_ml_io_float32_to_uint16( - qscale, model->layer[0].info.input[i].nb_elements, - lcl_dbuffer, lcl_qbuffer); - break; - case RTE_ML_IO_TYPE_FP16: - ret = rte_ml_io_float32_to_float16( - model->layer[0].info.input[i].nb_elements, lcl_dbuffer, - lcl_qbuffer); - break; - default: - plt_err("Unsupported model_input_type[%u] : %u", i, - model->layer[0].glow.metadata.input1[i].model_input_type); - ret = -ENOTSUP; - } - if (ret < 0) - return ret; - } - - lcl_dbuffer += model->layer[0].info.input[i].sz_d; - lcl_qbuffer += model->layer[0].info.input[i].sz_q; - } - - return 0; -} - -int -cn10k_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **qbuffer, - struct rte_ml_buff_seg **dbuffer) -{ - struct cnxk_ml_model *model; - uint8_t model_output_type; - uint8_t *lcl_qbuffer; - uint8_t *lcl_dbuffer; -
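For readers unfamiliar with the conversion helpers called from the quantize loop, the float32 to int8 case can be sketched as scale, round, saturate. This is an assumption-laden illustration: `quant_float32_to_int8` is a hypothetical name, and the rounding mode (half away from zero) and the saturation bounds are choices made for the sketch, so the behaviour of the real rte_ml_io_float32_to_int8() may differ in detail.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Quantize nb_elements floats to int8 as round(input * scale), saturated. */
int
quant_float32_to_int8(float scale, size_t nb_elements, const float *input, int8_t *output)
{
	size_t i;

	if (input == NULL || output == NULL)
		return -EINVAL;

	for (i = 0; i < nb_elements; i++) {
		float scaled = input[i] * scale;

		if (scaled >= 127.0f)
			output[i] = INT8_MAX; /* saturate high */
		else if (scaled <= -128.0f)
			output[i] = INT8_MIN; /* saturate low */
		else /* round half away from zero, then truncate */
			output[i] = (int8_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
	}

	return 0;
}
```

In the driver loop the source pointer then advances by the dequantized size (sz_d) and the destination pointer by the quantized size (sz_q) of each input.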
[PATCH v8 22/34] ml/cnxk: fetch layer info and load TVM model
Added support to fetch TVM model layer information and update internal structures based on the layer information Set callback functions for layer load and unload and enable model loading using TVMDP library. Added support to fetch full metadata after model load. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_model.c | 11 + drivers/ml/cnxk/cn10k_ml_model.h | 2 + drivers/ml/cnxk/cn10k_ml_ops.c | 7 ++- drivers/ml/cnxk/mvtvm_ml_model.c | 25 ++ drivers/ml/cnxk/mvtvm_ml_model.h | 4 ++ drivers/ml/cnxk/mvtvm_ml_ops.c | 81 drivers/ml/cnxk/mvtvm_ml_stubs.c | 10 drivers/ml/cnxk/mvtvm_ml_stubs.h | 3 ++ 8 files changed, 141 insertions(+), 2 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c index af9d5a666f..0325cd54f1 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.c +++ b/drivers/ml/cnxk/cn10k_ml_model.c @@ -716,3 +716,14 @@ cn10k_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer cnxk_ml_print_line(fp, LINE_LEN); fprintf(fp, "\n"); } + +int +cn10k_ml_model_get_layer_id(struct cnxk_ml_model *model, const char *layer_name, uint16_t *layer_id) +{ + if (model->type == ML_CNXK_MODEL_TYPE_TVM) + return mvtvm_ml_model_get_layer_id(model, layer_name, layer_id); + + *layer_id = 0; + + return 0; +} diff --git a/drivers/ml/cnxk/cn10k_ml_model.h b/drivers/ml/cnxk/cn10k_ml_model.h index 45f2ed5fcf..6744175cd5 100644 --- a/drivers/ml/cnxk/cn10k_ml_model.h +++ b/drivers/ml/cnxk/cn10k_ml_model.h @@ -461,5 +461,7 @@ void cn10k_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_mode struct cnxk_ml_io_info *io_info, struct cn10k_ml_model_metadata *metadata); void cn10k_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer, FILE *fp); +int cn10k_ml_model_get_layer_id(struct cnxk_ml_model *model, const char *layer_name, + uint16_t *layer_id); #endif /* _CN10K_ML_MODEL_H_ */ diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index a471e98fbf..4191ccc840 
100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -576,7 +576,7 @@ cn10k_ml_layer_load(void *device, uint16_t model_id, const char *layer_name, uin size_t layer_xstats_size; uint8_t *base_dma_addr; uint16_t scratch_pages; - uint16_t layer_id = 0; + uint16_t layer_id; uint16_t wb_pages; uint64_t mz_size; uint16_t idx; @@ -584,7 +584,6 @@ cn10k_ml_layer_load(void *device, uint16_t model_id, const char *layer_name, uin int ret; PLT_SET_USED(size); - PLT_SET_USED(layer_name); cnxk_mldev = (struct cnxk_ml_dev *)device; if (cnxk_mldev == NULL) { @@ -598,6 +597,10 @@ cn10k_ml_layer_load(void *device, uint16_t model_id, const char *layer_name, uin return -EINVAL; } + ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id); + if (ret != 0) + return ret; + layer = &model->layer[layer_id]; ret = cn10k_ml_model_metadata_check(buffer, size); diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c index 4c9a080c05..8536fd8927 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.c +++ b/drivers/ml/cnxk/mvtvm_ml_model.c @@ -110,3 +110,28 @@ mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params, struct mvtvm_ml_mo return -EINVAL; } + +int +mvtvm_ml_model_get_layer_id(struct cnxk_ml_model *model, const char *layer_name, uint16_t *layer_id) +{ + uint16_t i; + + for (i = 0; i < model->mvtvm.metadata.model.nb_layers; i++) { + if (strcmp(model->layer[i].name, layer_name) == 0) + break; + } + + if (i == model->mvtvm.metadata.model.nb_layers) { + plt_err("Invalid layer name: %s", layer_name); + return -EINVAL; + } + + if (model->layer[i].type != ML_CNXK_LAYER_TYPE_MRVL) { + plt_err("Invalid layer type, name: %s type: %d", layer_name, model->layer[i].type); + return -EINVAL; + } + + *layer_id = i; + + return 0; +} diff --git a/drivers/ml/cnxk/mvtvm_ml_model.h b/drivers/ml/cnxk/mvtvm_ml_model.h index b11b66f495..6cb2639876 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.h +++ b/drivers/ml/cnxk/mvtvm_ml_model.h @@ -11,6 +11,8 @@ 
#include "cnxk_ml_io.h" +struct cnxk_ml_model; + /* Maximum number of objects per model */ #define ML_MVTVM_MODEL_OBJECT_MAX 3 @@ -46,5 +48,7 @@ struct mvtvm_ml_model_data { enum cnxk_ml_model_type mvtvm_ml_model_type_get(struct rte_ml_model_params *params); int mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params, struct mvtvm_ml_model_object *object); +int mvtvm_ml_model_get_layer_id(struct cnxk_ml_mo
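The name-to-index lookup added in mvtvm_ml_model_get_layer_id() can be tried standalone. A minimal sketch, assuming a flat array of layers; the `lookup_*` and `LOOKUP_*` names are hypothetical stand-ins for the cnxk structures and enum values.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

enum lookup_layer_type { LOOKUP_LAYER_UNKNOWN = 0, LOOKUP_LAYER_MRVL, LOOKUP_LAYER_LLVM };

struct lookup_layer {
	const char *name;
	enum lookup_layer_type type;
};

/* Resolve layer_name to its index; only MRVL (accelerator) layers are accepted. */
int
lookup_layer_id(const struct lookup_layer *layers, uint16_t nb_layers, const char *layer_name,
		uint16_t *layer_id)
{
	uint16_t i;

	for (i = 0; i < nb_layers; i++) {
		if (strcmp(layers[i].name, layer_name) == 0)
			break;
	}

	/* Loop ran to the end: no layer with this name */
	if (i == nb_layers)
		return -EINVAL;

	/* Name matched, but the layer does not run on the accelerator */
	if (layers[i].type != LOOKUP_LAYER_MRVL)
		return -EINVAL;

	*layer_id = i;

	return 0;
}
```

The output argument is written only on success, so a caller can rely on its previous value after a failed lookup.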
[PATCH v8 16/34] ml/cnxk: update fast path functions
Implemented cnxk layer fast-path functions and added support for model specific fast-path functions. CNXK layer functions would invoke model specific fast-path functions. Added support for model specific poll handling functions and updated internal inference sync function. Drop use of rte_ml_op as argument. Updated function arguments to enable the function to be used as callback by TVM HW runtime. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.h | 5 - drivers/ml/cnxk/cn10k_ml_ops.c | 241 drivers/ml/cnxk/cn10k_ml_ops.h | 13 +- drivers/ml/cnxk/cnxk_ml_model.h | 14 ++ drivers/ml/cnxk/cnxk_ml_ops.c | 128 + drivers/ml/cnxk/cnxk_ml_ops.h | 7 + 6 files changed, 216 insertions(+), 192 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index bde9d08901..94a94d996f 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -143,11 +143,6 @@ struct cn10k_ml_dev { /* JCMD enqueue function handler */ bool (*ml_jcmdq_enqueue)(struct roc_ml *roc_ml, struct ml_job_cmd_s *job_cmd); - - /* Poll handling function pointers */ - void (*set_poll_addr)(struct cnxk_ml_req *req); - void (*set_poll_ptr)(struct cnxk_ml_req *req); - uint64_t (*get_poll_ptr)(struct cnxk_ml_req *req); }; uint64_t cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw); diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 776ad60401..8116c8dedb 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -65,24 +65,12 @@ static const struct cn10k_ml_stype_db_driver { {ML_DRIVER_ERR_FW_ERROR, "UNKNOWN FIRMWARE ERROR"}, }; -static inline void +__rte_hot void cn10k_ml_set_poll_addr(struct cnxk_ml_req *req) { req->status = &req->cn10k_req.status; } -static inline void -cn10k_ml_set_poll_ptr(struct cnxk_ml_req *req) -{ - plt_write64(ML_CNXK_POLL_JOB_START, req->status); -} - -static inline uint64_t -cn10k_ml_get_poll_ptr(struct cnxk_ml_req *req) -{ - return plt_read64(req->status); -} - void 
cn10k_ml_qp_initialize(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_qp *qp) { @@ -177,7 +165,7 @@ cn10k_ml_prep_sp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_l static __rte_always_inline void cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_req *req, - struct rte_ml_op *op) + uint16_t index, void *input, void *output, uint16_t nb_batches) { struct cn10k_ml_dev *cn10k_mldev; @@ -185,17 +173,17 @@ cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_r req->cn10k_req.jd.hdr.jce.w0.u64 = 0; req->cn10k_req.jd.hdr.jce.w1.u64 = PLT_U64_CAST(req->status); - req->cn10k_req.jd.hdr.model_id = op->model_id; + req->cn10k_req.jd.hdr.model_id = index; req->cn10k_req.jd.hdr.job_type = ML_CN10K_JOB_TYPE_MODEL_RUN; req->cn10k_req.jd.hdr.fp_flags = ML_FLAGS_POLL_COMPL; req->cn10k_req.jd.hdr.sp_flags = 0x0; req->cn10k_req.jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->cn10k_req.result); req->cn10k_req.jd.model_run.input_ddr_addr = - PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, op->input[0]->addr)); + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, input)); req->cn10k_req.jd.model_run.output_ddr_addr = - PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, op->output[0]->addr)); - req->cn10k_req.jd.model_run.num_batches = op->nb_batches; + PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, output)); + req->cn10k_req.jd.model_run.num_batches = nb_batches; } static void @@ -311,30 +299,15 @@ cn10k_ml_model_xstat_get(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *l static int cn10k_ml_cache_model_data(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer) { - struct rte_ml_buff_seg seg[2]; - struct rte_ml_buff_seg *inp; - struct rte_ml_buff_seg *out; - struct rte_ml_op op; - char str[RTE_MEMZONE_NAMESIZE]; const struct plt_memzone *mz; uint64_t isize = 0; uint64_t osize = 0; int ret = 0; - uint32_t i; - - inp = &seg[0]; - out = &seg[1]; /* Create input and output buffers. 
*/ - for (i = 0; i < layer->info.nb_inputs; i++) - isize += layer->info.input[i].sz_q; - - for (i = 0; i < layer->info.nb_outputs; i++) - osize += layer->info.output[i].sz_q; - - isize = layer->batch_size * isize; - osize = layer->batch_size * osize; + isize = layer->info.total_input_sz_q; + osize = layer->info.total_output_sz_q; snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u", "ml_dummy_io", layer->index); mz = plt_m
[PATCH v8 25/34] ml/cnxk: enable OCM check for multilayer TVM model
From: Anup Prabhu

Enabled check for OCM size requirement for multi-layer
TVM model. Compute OCM scratch and WB requirement for
all layers during the load stage.

Signed-off-by: Anup Prabhu
---
 drivers/ml/cnxk/cnxk_ml_ops.c | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index ce668e1eb6..d1471971e4 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1023,8 +1023,12 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, u
 
 	char str[RTE_MEMZONE_NAMESIZE];
 	const struct plt_memzone *mz;
+	uint16_t max_scratch_pages;
+	struct cn10k_ml_ocm *ocm;
 	uint64_t model_info_size;
+	uint16_t total_wb_pages;
 	uint16_t lcl_model_id;
+	uint16_t layer_id;
 	uint64_t mz_size;
 	bool found;
 	int ret;
@@ -1086,6 +1090,62 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, u
 	if (ret != 0)
 		goto error;
 
+	max_scratch_pages = 0;
+	total_wb_pages = 0;
+	layer_id = 0;
+
+	ocm = &cnxk_mldev->cn10k_mldev.ocm;
+
+	if (model->type == ML_CNXK_MODEL_TYPE_GLOW) {
+		total_wb_pages = total_wb_pages + model->layer[layer_id].glow.ocm_map.wb_pages;
+		max_scratch_pages = PLT_MAX(max_scratch_pages,
+					    model->layer[layer_id].glow.ocm_map.scratch_pages);
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+	} else {
+		for (layer_id = 0; layer_id < model->mvtvm.metadata.model.nb_layers; layer_id++) {
+			if (model->layer[layer_id].type == ML_CNXK_LAYER_TYPE_MRVL) {
+				total_wb_pages = total_wb_pages +
+						 model->layer[layer_id].glow.ocm_map.wb_pages;
+				max_scratch_pages =
+					PLT_MAX(max_scratch_pages,
+						model->layer[layer_id].glow.ocm_map.scratch_pages);
+			}
+		}
+#endif
+	}
+
+	if ((total_wb_pages + max_scratch_pages) > ocm->num_pages) {
+		plt_err("model_id = %u: total_wb_pages (%u) + scratch_pages (%u) > %u\n",
+			lcl_model_id, total_wb_pages, max_scratch_pages, ocm->num_pages);
+
+		if (model->type == ML_CNXK_MODEL_TYPE_GLOW) {
+			plt_ml_dbg("layer_id = %u: wb_pages = %u, scratch_pages = %u\n", layer_id,
+				   model->layer[layer_id].glow.ocm_map.wb_pages,
+				   model->layer[layer_id].glow.ocm_map.scratch_pages);
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+		} else {
+			for (layer_id = 0; layer_id < model->mvtvm.metadata.model.nb_layers;
+			     layer_id++) {
+				if (model->layer[layer_id].type == ML_CNXK_LAYER_TYPE_MRVL) {
+					plt_ml_dbg(
+						"layer_id = %u: wb_pages = %u, scratch_pages = %u\n",
+						layer_id,
+						model->layer[layer_id].glow.ocm_map.wb_pages,
+						model->layer[layer_id].glow.ocm_map.scratch_pages);
+				}
+			}
+#endif
+		}
+
+		if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
+			cn10k_ml_model_unload(cnxk_mldev, model);
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+		else {
+			mvtvm_ml_model_unload(cnxk_mldev, model);
+			return -ENOMEM;
+		}
+#endif
+	}
 	plt_spinlock_init(&model->lock);
 	model->state = ML_CNXK_MODEL_STATE_LOADED;
 	cnxk_mldev->nb_models_loaded++;
-- 
2.42.0
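The page budget rule in this patch is: WB pages add up across accelerator layers, scratch pages are shared so only the largest request counts, and the sum must fit in the OCM. A minimal sketch of that arithmetic, assuming a flat per-layer array; the `ocm_sketch_*` names and the `on_accelerator` flag are hypothetical, and unlike the load path above the sketch returns -ENOMEM for every model type.

```c
#include <errno.h>
#include <stdint.h>

struct ocm_sketch_layer {
	int on_accelerator;     /* non-zero for layers that use OCM (MRVL type) */
	uint16_t wb_pages;      /* weights and bias pages, held per layer */
	uint16_t scratch_pages; /* scratch pages, shared between layers */
};

/* Return 0 when sum(wb_pages) + max(scratch_pages) fits in num_pages, else -ENOMEM. */
int
ocm_sketch_check(const struct ocm_sketch_layer *layers, uint16_t nb_layers, uint16_t num_pages)
{
	uint32_t total_wb_pages = 0; /* wider than uint16_t so the sum cannot wrap */
	uint16_t max_scratch_pages = 0;
	uint16_t i;

	for (i = 0; i < nb_layers; i++) {
		if (!layers[i].on_accelerator)
			continue;

		total_wb_pages += layers[i].wb_pages;
		if (layers[i].scratch_pages > max_scratch_pages)
			max_scratch_pages = layers[i].scratch_pages;
	}

	if (total_wb_pages + max_scratch_pages > num_pages)
		return -ENOMEM;

	return 0;
}
```

For example, two accelerator layers with (10 WB, 4 scratch) and (5 WB, 8 scratch) need 15 + 8 = 23 pages, regardless of any CPU-only layers in between.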
[PATCH v8 26/34] ml/cnxk: support start and stop for TVM models
Added support to start and stop TVM models. TVM model start would invoke layer start for all Glow layers part of the model. TVM model stop would invoke layer stop for all Glow layers part of the model. Signed-off-by: Srikanth Yalavarthi Signed-off-by: Anup Prabhu --- drivers/ml/cnxk/cn10k_ml_ops.c | 16 ++ drivers/ml/cnxk/cnxk_ml_ops.c| 14 +++-- drivers/ml/cnxk/mvtvm_ml_ops.c | 52 drivers/ml/cnxk/mvtvm_ml_ops.h | 2 ++ drivers/ml/cnxk/mvtvm_ml_stubs.c | 18 +++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 2 ++ 6 files changed, 96 insertions(+), 8 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index e7208391fd..2d308802cf 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -827,7 +827,7 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name) struct cn10k_ml_ocm *ocm; struct cnxk_ml_req *req; - uint16_t layer_id = 0; + uint16_t layer_id; bool job_enqueued; bool job_dequeued; uint8_t num_tiles; @@ -838,8 +838,6 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name) bool locked; int ret = 0; - PLT_SET_USED(layer_name); - cnxk_mldev = (struct cnxk_ml_dev *)device; if (cnxk_mldev == NULL) { plt_err("Invalid device = %p", device); @@ -852,6 +850,10 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name) return -EINVAL; } + ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id); + if (ret != 0) + return ret; + layer = &model->layer[layer_id]; cn10k_mldev = &cnxk_mldev->cn10k_mldev; ocm = &cn10k_mldev->ocm; @@ -1015,14 +1017,12 @@ cn10k_ml_layer_stop(void *device, uint16_t model_id, const char *layer_name) struct cn10k_ml_ocm *ocm; struct cnxk_ml_req *req; - uint16_t layer_id = 0; + uint16_t layer_id; bool job_enqueued; bool job_dequeued; bool locked; int ret = 0; - PLT_SET_USED(layer_name); - cnxk_mldev = (struct cnxk_ml_dev *)device; if (cnxk_mldev == NULL) { plt_err("Invalid device = %p", device); @@ -1035,6 +1035,10 @@ 
cn10k_ml_layer_stop(void *device, uint16_t model_id, const char *layer_name) return -EINVAL; } + ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id); + if (ret != 0) + return ret; + layer = &model->layer[layer_id]; cn10k_mldev = &cnxk_mldev->cn10k_mldev; ocm = &cn10k_mldev->ocm; diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index d1471971e4..c38c60bf76 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -1216,7 +1216,12 @@ cnxk_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id) return -EINVAL; } - return cn10k_ml_model_start(cnxk_mldev, model); + if (model->type == ML_CNXK_MODEL_TYPE_GLOW) + return cn10k_ml_model_start(cnxk_mldev, model); + else + return mvtvm_ml_model_start(cnxk_mldev, model); + + return 0; } int @@ -1236,7 +1241,12 @@ cnxk_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id) return -EINVAL; } - return cn10k_ml_model_stop(cnxk_mldev, model); + if (model->type == ML_CNXK_MODEL_TYPE_GLOW) + return cn10k_ml_model_stop(cnxk_mldev, model); + else + return mvtvm_ml_model_stop(cnxk_mldev, model); + + return 0; } static int diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c index 3847f9b6b9..323c7c6fb6 100644 --- a/drivers/ml/cnxk/mvtvm_ml_ops.c +++ b/drivers/ml/cnxk/mvtvm_ml_ops.c @@ -213,3 +213,55 @@ mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *mode return plt_memzone_free(mz); } + +int +mvtvm_ml_model_start(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model) +{ + struct cnxk_ml_layer *layer; + + uint16_t layer_id = 0; + int ret = 0; + +next_layer: + layer = &model->layer[layer_id]; + if (layer->type == ML_CNXK_LAYER_TYPE_MRVL) { + ret = cn10k_ml_layer_start(cnxk_mldev, model->model_id, layer->name); + if (ret != 0) { + plt_err("Layer start failed, model_id = %u, layer_name = %s, error = %d", + model->model_id, layer->name, ret); + return ret; + } + } + layer_id++; + + if (layer_id < model->nb_layers) + goto 
next_layer; + + return 0; +} + +int +mvtvm_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model) +{ + struct cnxk_ml_layer *layer; + + uint16_t layer_id = 0; +
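For illustration, the layer walk in mvtvm_ml_model_start() can also be written as a plain counted loop. The following is a minimal standalone sketch, not the driver code: every demo_* name, the two layer kinds and the start_rc stub field are hypothetical stand-ins for the cnxk types and for cn10k_ml_layer_start(). One behavioural difference is worth noting: the goto form in the patch always visits layer 0, while this loop visits nothing when nb_layers is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer kinds: only accelerator layers need an explicit start. */
enum demo_layer_type { DEMO_LAYER_ACCEL, DEMO_LAYER_HOST };

struct demo_layer {
	const char *name;
	enum demo_layer_type type;
	int start_rc; /* value the stubbed start call returns */
	int started;
};

struct demo_model {
	uint16_t nb_layers;
	struct demo_layer *layer;
};

/* Stub playing the role of the per-layer start callback. */
static int
demo_layer_start(struct demo_layer *layer)
{
	if (layer->start_rc != 0)
		return layer->start_rc;

	layer->started = 1;
	return 0;
}

/* Start accelerator layers in order; stop at the first failure. */
int
demo_model_start(struct demo_model *model)
{
	uint16_t layer_id;
	int ret;

	assert(model != NULL);

	for (layer_id = 0; layer_id < model->nb_layers; layer_id++) {
		struct demo_layer *layer = &model->layer[layer_id];

		if (layer->type != DEMO_LAYER_ACCEL)
			continue;

		ret = demo_layer_start(layer);
		if (ret != 0)
			return ret;
	}

	return 0;
}
```

Like the patch, the sketch does not roll back layers that were already started when a later layer fails; whether the caller should stop them is a design question the sketch leaves open.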
[PATCH v8 28/34] ml/cnxk: support device dump for TVM models
Enabled support to print TVM model layer info. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cnxk_ml_model.c | 7 +++- drivers/ml/cnxk/mvtvm_ml_model.c | 59 drivers/ml/cnxk/mvtvm_ml_model.h | 2 ++ drivers/ml/cnxk/mvtvm_ml_stubs.c | 8 + drivers/ml/cnxk/mvtvm_ml_stubs.h | 2 ++ 5 files changed, 77 insertions(+), 1 deletion(-) diff --git a/drivers/ml/cnxk/cnxk_ml_model.c b/drivers/ml/cnxk/cnxk_ml_model.c index 02f80410ec..ed6a1ed866 100644 --- a/drivers/ml/cnxk/cnxk_ml_model.c +++ b/drivers/ml/cnxk/cnxk_ml_model.c @@ -68,6 +68,8 @@ cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, cnxk_ml_print_line(fp, LINE_LEN); fprintf(fp, "%*s : %u\n", FIELD_LEN, "model_id", model->model_id); fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", model->name); + fprintf(fp, "%*s : %d\n", FIELD_LEN, "type", model->type); + fprintf(fp, "%*s : %d\n", FIELD_LEN, "subtype", model->subtype); fprintf(fp, "%*s : 0x%016lx\n", FIELD_LEN, "model", PLT_U64_CAST(model)); fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", model->batch_size); fprintf(fp, "%*s : %u\n", FIELD_LEN, "nb_layers", model->nb_layers); @@ -84,6 +86,9 @@ cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, for (layer_id = 0; layer_id < model->nb_layers; layer_id++) { layer = &model->layer[layer_id]; - cn10k_ml_layer_print(cnxk_mldev, layer, fp); + if (layer->type == ML_CNXK_LAYER_TYPE_MRVL) + cn10k_ml_layer_print(cnxk_mldev, layer, fp); + else + mvtvm_ml_layer_print(cnxk_mldev, layer, fp); } } diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c index 650dd970bd..ffbcec8b80 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.c +++ b/drivers/ml/cnxk/mvtvm_ml_model.c @@ -13,6 +13,7 @@ #include "cnxk_ml_dev.h" #include "cnxk_ml_model.h" +#include "cnxk_ml_utils.h" /* Objects list */ char mvtvm_object_list[ML_MVTVM_MODEL_OBJECT_MAX][RTE_ML_STR_MAX] = {"mod.so", "mod.json", @@ -311,3 +312,61 @@ mvtvm_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, 
struct cnxk_ml_model *mo cn10k_ml_model_info_set(cnxk_mldev, model, &model->mvtvm.info, &model->layer[0].glow.metadata); } + +void +mvtvm_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer, FILE *fp) +{ + char str[STR_LEN]; + uint8_t i; + + /* Print debug info */ + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, " Layer Information (Layer ID: %u, Name: %s)\n", + cnxk_mldev->index_map[layer->index].layer_id, layer->name); + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "layer_id", + cnxk_mldev->index_map[layer->index].layer_id); + fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", layer->name); + fprintf(fp, "%*s : %d\n", FIELD_LEN, "type", layer->type); + fprintf(fp, "%*s : 0x%016lx\n", FIELD_LEN, "layer", PLT_U64_CAST(layer)); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", layer->batch_size); + + /* Print model state */ + if (layer->state == ML_CNXK_LAYER_STATE_LOADED) + fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "loaded"); + if (layer->state == ML_CNXK_LAYER_STATE_JOB_ACTIVE) + fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "job_active"); + if (layer->state == ML_CNXK_LAYER_STATE_STARTED) + fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "started"); + + fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_inputs", layer->info.nb_inputs); + fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_outputs", layer->info.nb_outputs); + fprintf(fp, "\n"); + + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, "%8s %16s %12s\n", "input", "input_name", "input_type"); + cnxk_ml_print_line(fp, LINE_LEN); + for (i = 0; i < layer->info.nb_inputs; i++) { + fprintf(fp, "%8u ", i); + fprintf(fp, "%*s ", 16, layer->info.input[i].name); + rte_ml_io_type_to_str(layer->info.input[i].qtype, str, STR_LEN); + fprintf(fp, "%*s ", 12, str); + } + fprintf(fp, "\n"); + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, "\n"); + + cnxk_ml_print_line(fp, LINE_LEN); + fprintf(fp, "%8s %16s %12s\n", "output", "output_name", "output_type"); + cnxk_ml_print_line(fp, 
LINE_LEN); + for (i = 0; i < layer->info.nb_outputs; i++) { + fprintf(fp, "%8u ", i); + fprintf(fp, "%*s ", 16, layer->info.output[i].name); + rte_ml_io_type_to_str(layer->info.output[i].qtype, str, STR_LEN); + fprintf(fp, "%*s ", 12, str); + fprintf(fp, "\n"); + } + fprintf(fp, "\n"); + cnxk_ml_print_line(fp, LINE_
[PATCH v8 20/34] ml/cnxk: add support for identify model type
Enable support to parse model buffer to identify the model type and model sub-type. Enabled basic checks for Glow model type buffer. Signed-off-by: Srikanth Yalavarthi Signed-off-by: Anup Prabhu --- drivers/ml/cnxk/cnxk_ml_model.c | 49 drivers/ml/cnxk/cnxk_ml_model.h | 3 ++ drivers/ml/cnxk/cnxk_ml_ops.c| 8 + drivers/ml/cnxk/meson.build | 6 drivers/ml/cnxk/mvtvm_ml_model.c | 55 drivers/ml/cnxk/mvtvm_ml_model.h | 2 ++ drivers/ml/cnxk/mvtvm_ml_stubs.c | 9 ++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 1 + 8 files changed, 133 insertions(+) create mode 100644 drivers/ml/cnxk/mvtvm_ml_model.c diff --git a/drivers/ml/cnxk/cnxk_ml_model.c b/drivers/ml/cnxk/cnxk_ml_model.c index b069d4e3a5..02f80410ec 100644 --- a/drivers/ml/cnxk/cnxk_ml_model.c +++ b/drivers/ml/cnxk/cnxk_ml_model.c @@ -2,11 +2,60 @@ * Copyright (c) 2023 Marvell. */ +#include #include #include "cnxk_ml_model.h" #include "cnxk_ml_utils.h" +enum cnxk_ml_model_type +cnxk_ml_model_get_type(struct rte_ml_model_params *params) +{ + struct cn10k_ml_model_metadata_header *metadata_header; + enum cnxk_ml_model_type type; + uint32_t payload_crc32c; + uint32_t header_crc32c; + + type = mvtvm_ml_model_type_get(params); + if (type == ML_CNXK_MODEL_TYPE_TVM) + return ML_CNXK_MODEL_TYPE_TVM; + else if (type == ML_CNXK_MODEL_TYPE_INVALID) + return ML_CNXK_MODEL_TYPE_INVALID; + + /* Check model magic string */ + metadata_header = (struct cn10k_ml_model_metadata_header *)params->addr; + if (strncmp((char *)metadata_header->magic, MRVL_ML_MODEL_MAGIC_STRING, 4) != 0) { + plt_err("Invalid Glow model, magic = %s", metadata_header->magic); + return ML_CNXK_MODEL_TYPE_INVALID; + } + + /* Header CRC check */ + if (metadata_header->header_crc32c != 0) { + header_crc32c = rte_hash_crc( + params->addr, + sizeof(struct cn10k_ml_model_metadata_header) - sizeof(uint32_t), 0); + + if (header_crc32c != metadata_header->header_crc32c) { + plt_err("Invalid Glow model, Header CRC mismatch"); + return ML_CNXK_MODEL_TYPE_INVALID; + } + } + + /* 
Payload CRC check */ + if (metadata_header->payload_crc32c != 0) { + payload_crc32c = rte_hash_crc( + PLT_PTR_ADD(params->addr, sizeof(struct cn10k_ml_model_metadata_header)), + params->size - sizeof(struct cn10k_ml_model_metadata_header), 0); + + if (payload_crc32c != metadata_header->payload_crc32c) { + plt_err("Invalid Glow model, Payload CRC mismatch"); + return ML_CNXK_MODEL_TYPE_INVALID; + } + } + + return ML_CNXK_MODEL_TYPE_GLOW; +} + void cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, FILE *fp) { diff --git a/drivers/ml/cnxk/cnxk_ml_model.h b/drivers/ml/cnxk/cnxk_ml_model.h index f100eca203..a2fced46a2 100644 --- a/drivers/ml/cnxk/cnxk_ml_model.h +++ b/drivers/ml/cnxk/cnxk_ml_model.h @@ -13,6 +13,8 @@ #ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM #include "mvtvm_ml_model.h" +#else +#include "mvtvm_ml_stubs.h" #endif #include "cnxk_ml_io.h" @@ -184,6 +186,7 @@ struct cnxk_ml_model { set_poll_addr_t set_poll_addr; }; +enum cnxk_ml_model_type cnxk_ml_model_get_type(struct rte_ml_model_params *params); void cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, FILE *fp); #endif /* _CNXK_ML_MODEL_H_ */ diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index 96f87128f9..ebc78e36e9 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -1018,6 +1018,7 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, u { struct rte_ml_dev_info dev_info; struct cnxk_ml_dev *cnxk_mldev; + enum cnxk_ml_model_type type; struct cnxk_ml_model *model; char str[RTE_MEMZONE_NAMESIZE]; @@ -1033,6 +1034,12 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, u cnxk_mldev = dev->data->dev_private; + type = cnxk_ml_model_get_type(params); + if (type == ML_CNXK_MODEL_TYPE_INVALID) { + plt_err("Invalid / unsupported model type"); + return -EINVAL; + } + /* Find model ID */ found = false; for (lcl_model_id = 0; lcl_model_id < 
dev->data->nb_models; lcl_model_id++) { @@ -1066,6 +1073,7 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, u model = mz->addr; model->cnxk_mldev = cnxk_mldev; + model->type = type; model->model_id = lcl_model_id; model->info = PLT_PTR_ADD(
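The Glow buffer checks in cnxk_ml_model_get_type() follow a common pattern: compare a magic string, then verify a header CRC and a payload CRC, treating a stored CRC of zero as "not present". Below is a minimal standalone sketch of that pattern under stated assumptions: the header layout, the "DEMO" magic and all demo_* names are hypothetical, and demo_crc32c() is a plain bitwise CRC-32C used as a stand-in checksum, not a claim about what rte_hash_crc() returns. The sketch also adds a buffer-size check that the quoted hunk does not show.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC "DEMO" /* hypothetical 4-byte magic */

/* Hypothetical header; the header CRC is assumed to be the last field. */
struct demo_header {
	uint8_t magic[4];
	uint32_t version;
	uint32_t payload_crc;
	uint32_t header_crc;
};

enum demo_model_type { DEMO_TYPE_INVALID, DEMO_TYPE_GLOW };

/* Bitwise CRC-32C (Castagnoli polynomial, reflected), no pre/post inversion. */
uint32_t
demo_crc32c(const void *data, size_t len, uint32_t crc)
{
	const uint8_t *p = data;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}

	return crc;
}

enum demo_model_type
demo_model_get_type(const void *addr, size_t size)
{
	const uint8_t *bytes = addr;
	struct demo_header hdr;

	/* Extra guard: never read a header from a too-small buffer. */
	if (addr == NULL || size < sizeof(hdr))
		return DEMO_TYPE_INVALID;

	/* Copy out so the check does not depend on buffer alignment. */
	memcpy(&hdr, addr, sizeof(hdr));

	if (memcmp(hdr.magic, DEMO_MAGIC, 4) != 0)
		return DEMO_TYPE_INVALID;

	/* Header CRC covers everything before the CRC field itself. */
	if (hdr.header_crc != 0 &&
	    hdr.header_crc != demo_crc32c(bytes, sizeof(hdr) - sizeof(uint32_t), 0))
		return DEMO_TYPE_INVALID;

	/* Payload CRC covers the bytes that follow the header. */
	if (hdr.payload_crc != 0 &&
	    hdr.payload_crc != demo_crc32c(bytes + sizeof(hdr), size - sizeof(hdr), 0))
		return DEMO_TYPE_INVALID;

	return DEMO_TYPE_GLOW;
}
```

Using memcmp() on the fixed-width magic avoids relying on a terminating NUL inside the header, which may or may not matter for the real metadata layout.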
[PATCH v8 29/34] ml/cnxk: enable reporting model runtime as xstats
Added model xstats entries to compute runtime latency. Allocated internal resources for TVM model xstats. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ops.c | 9 +++ drivers/ml/cnxk/cn10k_ml_ops.h | 2 + drivers/ml/cnxk/cnxk_ml_ops.c| 131 +++ drivers/ml/cnxk/cnxk_ml_ops.h| 1 + drivers/ml/cnxk/cnxk_ml_xstats.h | 7 ++ drivers/ml/cnxk/mvtvm_ml_model.h | 24 ++ drivers/ml/cnxk/mvtvm_ml_ops.c | 96 +- drivers/ml/cnxk/mvtvm_ml_ops.h | 8 ++ drivers/ml/cnxk/mvtvm_ml_stubs.c | 23 ++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 6 ++ 10 files changed, 289 insertions(+), 18 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 2d308802cf..0c67ce7b40 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -197,6 +197,15 @@ cn10k_ml_xstats_layer_name_update(struct cnxk_ml_dev *cnxk_mldev, uint16_t model } } +void +cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, + uint16_t stat_id, uint16_t entry, char *suffix) +{ + snprintf(cnxk_mldev->xstats.entries[stat_id].map.name, +sizeof(cnxk_mldev->xstats.entries[stat_id].map.name), "%s-%s-%s", +model->glow.metadata.model.name, model_xstats[entry].name, suffix); +} + #define ML_AVG_FOREACH_QP(cnxk_mldev, layer, qp_id, str, value, count) \ do { \ value = 0; \ diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h index 3d18303ed3..045e2e6cd2 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.h +++ b/drivers/ml/cnxk/cn10k_ml_ops.h @@ -331,6 +331,8 @@ int cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name int cn10k_ml_layer_stop(void *device, uint16_t model_id, const char *layer_name); /* xstats ops */ +void cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, + uint16_t stat_id, uint16_t entry, char *suffix); uint64_t cn10k_ml_model_xstat_get(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer, enum cnxk_ml_xstats_type type); diff --git 
a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index c38c60bf76..2632d70d8c 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -138,7 +138,8 @@ cnxk_ml_xstats_init(struct cnxk_ml_dev *cnxk_mldev) /* Allocate memory for xstats entries. Don't allocate during reconfigure */ nb_stats = RTE_DIM(device_xstats) + - RTE_DIM(layer_xstats) * ML_CNXK_MAX_MODELS * ML_CNXK_MODEL_MAX_LAYERS; + RTE_DIM(layer_xstats) * ML_CNXK_MAX_MODELS * ML_CNXK_MODEL_MAX_LAYERS + + RTE_DIM(model_xstats) * ML_CNXK_MAX_MODELS; if (cnxk_mldev->xstats.entries == NULL) cnxk_mldev->xstats.entries = rte_zmalloc( "cnxk_ml_xstats", sizeof(struct cnxk_ml_xstats_entry) * nb_stats, @@ -169,6 +170,25 @@ cnxk_ml_xstats_init(struct cnxk_ml_dev *cnxk_mldev) for (model = 0; model < ML_CNXK_MAX_MODELS; model++) { cnxk_mldev->xstats.offset_for_model[model] = stat_id; + for (i = 0; i < RTE_DIM(model_xstats); i++) { + cnxk_mldev->xstats.entries[stat_id].map.id = stat_id; + cnxk_mldev->xstats.entries[stat_id].mode = RTE_ML_DEV_XSTATS_MODEL; + cnxk_mldev->xstats.entries[stat_id].group = CNXK_ML_XSTATS_GROUP_MODEL; + cnxk_mldev->xstats.entries[stat_id].type = model_xstats[i].type; + cnxk_mldev->xstats.entries[stat_id].fn_id = CNXK_ML_XSTATS_FN_MODEL; + cnxk_mldev->xstats.entries[stat_id].obj_idx = model; + cnxk_mldev->xstats.entries[stat_id].layer_id = -1; + cnxk_mldev->xstats.entries[stat_id].reset_allowed = + model_xstats[i].reset_allowed; + + /* Name of xstat is updated during model load */ + snprintf(cnxk_mldev->xstats.entries[stat_id].map.name, + sizeof(cnxk_mldev->xstats.entries[stat_id].map.name), +"Model-%u-%s", model, model_xstats[i].name); + + stat_id++; + } + for (layer = 0; layer < ML_CNXK_MODEL_MAX_LAYERS; layer++) { cnxk_mldev->xstats.offset_for_layer[model][layer] = stat_id; @@ -195,7 +215,8 @@ cnxk_ml_xstats_init(struct cnxk_ml_dev *cnxk_mldev) cnxk_mldev->xstats.count_per_layer[model][layer] = RTE_DIM(
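The sizing change in cnxk_ml_xstats_init() is easier to follow as arithmetic: the table holds the device entries, then, for each model slot, the model entries followed by the per-layer entries. A minimal sketch of that indexing follows. The counts and the DEMO_* names are hypothetical, and the ordering (device entries first, model entries ahead of that model's layer entries) is an assumption read from the hunk above rather than a statement about the final driver.

```c
#include <stdint.h>

/* Hypothetical table dimensions, chosen only to make the numbers concrete. */
#define DEMO_NB_DEVICE_XSTATS 4u
#define DEMO_NB_MODEL_XSTATS 6u
#define DEMO_NB_LAYER_XSTATS 12u
#define DEMO_MAX_MODELS 16u
#define DEMO_MAX_LAYERS 8u

/* Total entries: device + per-layer for every slot + per-model for every slot. */
uint32_t
demo_xstats_total(void)
{
	return DEMO_NB_DEVICE_XSTATS +
	       DEMO_NB_LAYER_XSTATS * DEMO_MAX_MODELS * DEMO_MAX_LAYERS +
	       DEMO_NB_MODEL_XSTATS * DEMO_MAX_MODELS;
}

/* Index of the first entry that belongs to a model slot. */
uint32_t
demo_xstats_model_offset(uint32_t model)
{
	uint32_t per_model = DEMO_NB_MODEL_XSTATS + DEMO_MAX_LAYERS * DEMO_NB_LAYER_XSTATS;

	return DEMO_NB_DEVICE_XSTATS + model * per_model;
}

/* Index of the first entry of one layer; layer entries follow the model entries. */
uint32_t
demo_xstats_layer_offset(uint32_t model, uint32_t layer)
{
	return demo_xstats_model_offset(model) + DEMO_NB_MODEL_XSTATS +
	       layer * DEMO_NB_LAYER_XSTATS;
}
```

With these numbers the table has 4 + 12 * 16 * 8 + 6 * 16 = 1636 entries, which is also the offset one past the last model slot.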
[PATCH v8 23/34] ml/cnxk: update internal info for TVM model
Enabled updating internal IO info structures for TVM model. Compute static fields related to the model I/O. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cnxk_ml_ops.c| 4 ++ drivers/ml/cnxk/mvtvm_ml_model.c | 111 +++ drivers/ml/cnxk/mvtvm_ml_model.h | 2 + drivers/ml/cnxk/mvtvm_ml_ops.c | 3 + drivers/ml/cnxk/mvtvm_ml_stubs.c | 9 +++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 1 + 6 files changed, 130 insertions(+) diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index 85b37161d2..1565e521fd 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -1244,6 +1244,8 @@ cnxk_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buf if (model->type == ML_CNXK_MODEL_TYPE_GLOW) info = cn10k_ml_model_io_info_get(model, 0); + else + info = mvtvm_ml_model_io_info_get(model, 0); if (info == NULL) return -EINVAL; @@ -1296,6 +1298,8 @@ cnxk_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_b if (model->type == ML_CNXK_MODEL_TYPE_GLOW) info = cn10k_ml_model_io_info_get(model, model->nb_layers - 1); + else + info = mvtvm_ml_model_io_info_get(model, model->nb_layers - 1); if (info == NULL) return -EINVAL; diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c index 8536fd8927..b40b0a13af 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.c +++ b/drivers/ml/cnxk/mvtvm_ml_model.c @@ -7,6 +7,8 @@ #include +#include + #include #include "cnxk_ml_model.h" @@ -135,3 +137,112 @@ mvtvm_ml_model_get_layer_id(struct cnxk_ml_model *model, const char *layer_name, return 0; } + +static enum rte_ml_io_type +mvtvm_ml_io_type_map(uint8_t type) +{ + switch (type) { + case kDLInt: + return RTE_ML_IO_TYPE_INT32; + case kDLUInt: + return RTE_ML_IO_TYPE_UINT32; + case kDLFloat: + return RTE_ML_IO_TYPE_FP32; + case kDLBfloat: + return RTE_ML_IO_TYPE_BFLOAT16; + } + + return RTE_ML_IO_TYPE_UNKNOWN; +} + +void +mvtvm_ml_model_io_info_set(struct cnxk_ml_model *model) +{ + struct 
tvmdp_model_metadata *metadata; + int32_t i; + int32_t j; + + if (model->subtype == ML_CNXK_MODEL_SUBTYPE_TVM_MRVL) + goto tvm_mrvl_model; + + metadata = &model->mvtvm.metadata; + + /* Inputs, set for layer_id = 0 */ + model->mvtvm.info.nb_inputs = metadata->model.num_input; + model->mvtvm.info.total_input_sz_d = 0; + model->mvtvm.info.total_input_sz_q = 0; + for (i = 0; i < metadata->model.num_input; i++) { + rte_strscpy(model->mvtvm.info.input[i].name, metadata->input[i].name, + TVMDP_NAME_STRLEN); + model->mvtvm.info.input[i].dtype = + mvtvm_ml_io_type_map(metadata->input[i].datatype.code); + model->mvtvm.info.input[i].qtype = + mvtvm_ml_io_type_map(metadata->input[i].model_datatype.code); + model->mvtvm.info.input[i].nb_dims = metadata->input[i].ndim; + + model->mvtvm.info.input[i].nb_elements = 1; + for (j = 0; j < metadata->input[i].ndim; j++) { + model->mvtvm.info.input[i].shape[j] = metadata->input[i].shape[j]; + model->mvtvm.info.input[i].nb_elements *= metadata->input[i].shape[j]; + } + + model->mvtvm.info.input[i].sz_d = + model->mvtvm.info.input[i].nb_elements * + rte_ml_io_type_size_get(model->mvtvm.info.input[i].dtype); + model->mvtvm.info.input[i].sz_q = + model->mvtvm.info.input[i].nb_elements * + rte_ml_io_type_size_get(model->mvtvm.info.input[i].qtype); + + model->mvtvm.info.total_input_sz_d += model->mvtvm.info.input[i].sz_d; + model->mvtvm.info.total_input_sz_q += model->mvtvm.info.input[i].sz_q; + + plt_ml_dbg("model_id = %u, input[%u] - sz_d = %u sz_q = %u", model->model_id, i, + model->mvtvm.info.input[i].sz_d, model->mvtvm.info.input[i].sz_q); + } + + /* Outputs, set for nb_layers - 1 */ + model->mvtvm.info.nb_outputs = metadata->model.num_output; + model->mvtvm.info.total_output_sz_d = 0; + model->mvtvm.info.total_output_sz_q = 0; + for (i = 0; i < metadata->model.num_output; i++) { + rte_strscpy(model->mvtvm.info.output[i].name, metadata->output[i].name, + TVMDP_NAME_STRLEN); + model->mvtvm.info.output[i].dtype = + 
mvtvm_ml_io_type_map(metadata->output[i].datatype.code); + model->mvtvm.info.output[i].qtype = +
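The static I/O fields computed in mvtvm_ml_model_io_info_set() reduce to two steps per tensor: the element count is the product of the shape, and each byte size is that count times the element size of the corresponding type. A minimal standalone sketch, assuming a hypothetical demo_io_type enum in place of the rte_ml_io_type values and a local size table in place of rte_ml_io_type_size_get():

```c
#include <stdint.h>

/* Hypothetical element types; not the rte_ml_io_type enumeration. */
enum demo_io_type { DEMO_TYPE_INT8, DEMO_TYPE_FP16, DEMO_TYPE_FP32 };

/* Size of one element in bytes, 0 for an unknown type. */
uint32_t
demo_io_type_size(enum demo_io_type type)
{
	switch (type) {
	case DEMO_TYPE_INT8:
		return 1;
	case DEMO_TYPE_FP16:
		return 2;
	case DEMO_TYPE_FP32:
		return 4;
	}

	return 0;
}

/* Product of all dimensions; an empty shape counts as one element. */
uint64_t
demo_nb_elements(const int64_t *shape, int32_t ndim)
{
	uint64_t nb_elements = 1;
	int32_t j;

	for (j = 0; j < ndim; j++)
		nb_elements *= (uint64_t)shape[j];

	return nb_elements;
}

/* Bytes needed to hold the tensor in the given element type. */
uint64_t
demo_io_size(const int64_t *shape, int32_t ndim, enum demo_io_type type)
{
	return demo_nb_elements(shape, ndim) * demo_io_type_size(type);
}
```

For a 1 x 3 x 224 x 224 input this gives 150528 elements, so 602112 bytes as FP32 (the sz_d role) and 150528 bytes as INT8 (the sz_q role).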
[PATCH v8 24/34] ml/cnxk: enable model unload in tvmdp library
Enable unloading model using external tvmdp library. Updated layer unload callback to support multiple layers. Signed-off-by: Srikanth Yalavarthi Signed-off-by: Anup Prabhu --- drivers/ml/cnxk/cn10k_ml_ops.c | 8 +--- drivers/ml/cnxk/cnxk_ml_ops.c| 7 +-- drivers/ml/cnxk/mvtvm_ml_ops.c | 28 drivers/ml/cnxk/mvtvm_ml_ops.h | 1 + drivers/ml/cnxk/mvtvm_ml_stubs.c | 9 + drivers/ml/cnxk/mvtvm_ml_stubs.h | 1 + 6 files changed, 49 insertions(+), 5 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 4191ccc840..e7208391fd 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -780,11 +780,9 @@ cn10k_ml_layer_unload(void *device, uint16_t model_id, const char *layer_name) struct cnxk_ml_layer *layer; char str[RTE_MEMZONE_NAMESIZE]; - uint16_t layer_id = 0; + uint16_t layer_id; int ret; - PLT_SET_USED(layer_name); - cnxk_mldev = (struct cnxk_ml_dev *)device; if (cnxk_mldev == NULL) { plt_err("Invalid device = %p", device); @@ -797,6 +795,10 @@ cn10k_ml_layer_unload(void *device, uint16_t model_id, const char *layer_name) return -EINVAL; } + ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id); + if (ret != 0) + return ret; + layer = &model->layer[layer_id]; snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u_%u", CN10K_ML_LAYER_MEMZONE_NAME, diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index 1565e521fd..ce668e1eb6 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -1107,7 +1107,7 @@ cnxk_ml_model_unload(struct rte_ml_dev *dev, uint16_t model_id) struct cnxk_ml_model *model; char str[RTE_MEMZONE_NAMESIZE]; - int ret; + int ret = 0; if (dev == NULL) return -EINVAL; @@ -1125,7 +1125,10 @@ cnxk_ml_model_unload(struct rte_ml_dev *dev, uint16_t model_id) return -EBUSY; } - ret = cn10k_ml_model_unload(cnxk_mldev, model); + if (model->type == ML_CNXK_MODEL_TYPE_GLOW) + ret = cn10k_ml_model_unload(cnxk_mldev, model); + else + ret = 
mvtvm_ml_model_unload(cnxk_mldev, model); if (ret != 0) return ret; diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c index e21bf2dc07..3847f9b6b9 100644 --- a/drivers/ml/cnxk/mvtvm_ml_ops.c +++ b/drivers/ml/cnxk/mvtvm_ml_ops.c @@ -185,3 +185,31 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params * return ret; } + +int +mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model) +{ + char str[RTE_MEMZONE_NAMESIZE]; + const struct plt_memzone *mz; + int ret; + + RTE_SET_USED(cnxk_mldev); + + /* Initialize model in TVMDP */ + ret = tvmdp_model_unload(model->model_id); + if (ret != 0) { + plt_err("TVMDP: Model unload failed, model_id = %u, error = %d", model->model_id, + ret); + return ret; + } + + snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u", MVTVM_ML_MODEL_MEMZONE_NAME, model->model_id); + mz = rte_memzone_lookup(str); + if (mz == NULL) { + plt_err("Memzone lookup failed for TVM model: model_id = %u, mz = %s", + model->model_id, str); + return -EINVAL; + } + + return plt_memzone_free(mz); +} diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.h b/drivers/ml/cnxk/mvtvm_ml_ops.h index 6607537599..770794fe7d 100644 --- a/drivers/ml/cnxk/mvtvm_ml_ops.h +++ b/drivers/ml/cnxk/mvtvm_ml_ops.h @@ -18,5 +18,6 @@ int mvtvm_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct rte_ml_d int mvtvm_ml_dev_close(struct cnxk_ml_dev *cnxk_mldev); int mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params *params, struct cnxk_ml_model *model); +int mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model); #endif /* _MVTVM_ML_OPS_H_ */ diff --git a/drivers/ml/cnxk/mvtvm_ml_stubs.c b/drivers/ml/cnxk/mvtvm_ml_stubs.c index 80a9a90b4e..a17a76e41f 100644 --- a/drivers/ml/cnxk/mvtvm_ml_stubs.c +++ b/drivers/ml/cnxk/mvtvm_ml_stubs.c @@ -63,3 +63,12 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params * return -EINVAL; } + +int 
+mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model) +{ + RTE_SET_USED(cnxk_mldev); + RTE_SET_USED(model); + + return -EINVAL; +} diff --git a/drivers/ml/cnxk/mvtvm_ml_stubs.h b/drivers/ml/cnxk/mvtvm_ml_stubs.h index 29f721072a..3776fb5369 100644 --- a/drivers/ml/cnxk/mvtvm_ml_stubs.h +++ b/drivers/ml/cnxk/mvtvm_ml_stubs.h @@ -15,6 +15,7 @@ int mvtvm_ml_dev_configu
[PATCH v8 31/34] ml/cnxk: add generic ML malloc and free callback
Implemented generic ML malloc and free callbacks

Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 30 ++
 drivers/ml/cnxk/cn10k_ml_ops.h |  3 +++
 drivers/ml/cnxk/mvtvm_ml_ops.c |  2 ++
 3 files changed, 35 insertions(+)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 7802425c87..01b0a44caa 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1497,3 +1497,33 @@ cn10k_ml_io_free(void *device, uint16_t model_id, const char *layer_name)
 
 	return plt_memzone_free(mz);
 }
+
+int
+cn10k_ml_malloc(const char *name, size_t size, uint32_t align, void **addr)
+{
+	const struct plt_memzone *mz;
+
+	mz = plt_memzone_reserve_aligned(name, size, 0, align);
+	if (mz == NULL) {
+		plt_err("ml_malloc failed: Unable to allocate memory: name = %s", name);
+		return -ENOMEM;
+	}
+
+	*addr = mz->addr;
+
+	return 0;
+}
+
+int
+cn10k_ml_free(const char *name)
+{
+	const struct plt_memzone *mz;
+
+	mz = plt_memzone_lookup(name);
+	if (mz == NULL) {
+		plt_err("ml_free failed: Memzone not found: name = %s", name);
+		return -EINVAL;
+	}
+
+	return plt_memzone_free(mz);
+}
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 9c41c1c0b0..eb3e1c139c 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -333,6 +333,9 @@ int cn10k_ml_io_alloc(void *device, uint16_t model_id, const char *layer_name,
 		      uint64_t **input_qbuffer, uint64_t **output_qbuffer);
 int cn10k_ml_io_free(void *device, uint16_t model_id, const char *layer_name);
 
+int cn10k_ml_malloc(const char *name, size_t size, uint32_t align, void **addr);
+int cn10k_ml_free(const char *name);
+
 /* xstats ops */
 void cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
 				   uint16_t stat_id, uint16_t entry, char *suffix);
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index a50b31ec6e..9d59e28661 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -234,6 +234,8 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params *
 		callback->tvmrt_glow_layer_unload = cn10k_ml_layer_unload;
 		callback->tvmrt_io_alloc = cn10k_ml_io_alloc;
 		callback->tvmrt_io_free = cn10k_ml_io_free;
+		callback->tvmrt_malloc = cn10k_ml_malloc;
+		callback->tvmrt_free = cn10k_ml_free;
 	} else {
 		callback = NULL;
 	}
-- 
2.42.0
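The malloc/free callback pair in this patch is keyed by name: the allocator registers a named, aligned region, and the free routine finds it again by that name alone. A minimal standalone sketch of the same contract, using a small static table and malloc() instead of memzones, is shown below. All demo_* names and limits are hypothetical; unlike the patch, the sketch rejects duplicate names and a zero alignment explicitly, which is a simplification and not a description of memzone behaviour.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ZONES 8
#define DEMO_NAME_LEN 32

struct demo_zone {
	char name[DEMO_NAME_LEN];
	void *base; /* pointer returned by malloc(); NULL marks a free slot */
};

static struct demo_zone demo_zones[DEMO_MAX_ZONES];

static struct demo_zone *
demo_zone_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_ZONES; i++) {
		if (demo_zones[i].base != NULL && strcmp(demo_zones[i].name, name) == 0)
			return &demo_zones[i];
	}

	return NULL;
}

/* Reserve a named region; *addr is aligned to 'align' (a power of two). */
int
demo_malloc(const char *name, size_t size, uint32_t align, void **addr)
{
	size_t i;

	if (name == NULL || addr == NULL || strlen(name) >= DEMO_NAME_LEN)
		return -EINVAL;
	if (align == 0 || (align & (align - 1)) != 0)
		return -EINVAL;
	if (demo_zone_lookup(name) != NULL)
		return -EEXIST;

	for (i = 0; i < DEMO_MAX_ZONES; i++) {
		if (demo_zones[i].base != NULL)
			continue;

		/* Over-allocate so an aligned address always fits inside. */
		demo_zones[i].base = malloc(size + align - 1);
		if (demo_zones[i].base == NULL)
			return -ENOMEM;

		strcpy(demo_zones[i].name, name);
		*addr = (void *)(((uintptr_t)demo_zones[i].base + align - 1) &
				 ~((uintptr_t)align - 1));
		return 0;
	}

	return -ENOMEM; /* table full */
}

/* Release a region by name; the caller does not need to keep the pointer. */
int
demo_free(const char *name)
{
	struct demo_zone *zone;

	if (name == NULL)
		return -EINVAL;

	zone = demo_zone_lookup(name);
	if (zone == NULL)
		return -EINVAL;

	free(zone->base);
	zone->base = NULL;

	return 0;
}
```

Keeping the base pointer in the table is what lets the free side work from the name only, which matches the shape of the callback signatures above.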
[PATCH v8 27/34] ml/cnxk: update internal TVM model info structure
From: Prince Takkar

Added support to update internal model info structure for TVM
models.

Signed-off-by: Prince Takkar
Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/mvtvm_ml_model.c | 65
 drivers/ml/cnxk/mvtvm_ml_model.h |  2 +
 drivers/ml/cnxk/mvtvm_ml_ops.c   |  3 ++
 3 files changed, 70 insertions(+)

diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index b40b0a13af..650dd970bd 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -11,6 +11,7 @@
 
 #include
 
+#include "cnxk_ml_dev.h"
 #include "cnxk_ml_model.h"
 
 /* Objects list */
@@ -246,3 +247,67 @@ mvtvm_ml_model_io_info_get(struct cnxk_ml_model *model, uint16_t layer_id)
 
 	return &model->mvtvm.info;
 }
+
+void
+mvtvm_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model)
+{
+	struct tvmdp_model_metadata *metadata;
+	struct rte_ml_model_info *info;
+	struct rte_ml_io_info *output;
+	struct rte_ml_io_info *input;
+	uint8_t i;
+
+	info = PLT_PTR_CAST(model->info);
+	input = PLT_PTR_ADD(info, sizeof(struct rte_ml_model_info));
+	output = PLT_PTR_ADD(input, ML_CNXK_MODEL_MAX_INPUT_OUTPUT * sizeof(struct rte_ml_io_info));
+
+	/* Reset model info */
+	memset(info, 0, sizeof(struct rte_ml_model_info));
+
+	if (model->subtype == ML_CNXK_MODEL_SUBTYPE_TVM_MRVL)
+		goto tvm_mrvl_model;
+
+	metadata = &model->mvtvm.metadata;
+	rte_memcpy(info->name, metadata->model.name, TVMDP_NAME_STRLEN);
+	snprintf(info->version, RTE_ML_STR_MAX, "%u.%u.%u.%u", metadata->model.version[0],
+		 metadata->model.version[1], metadata->model.version[2],
+		 metadata->model.version[3]);
+	info->model_id = model->model_id;
+	info->device_id = cnxk_mldev->mldev->data->dev_id;
+	info->io_layout = RTE_ML_IO_LAYOUT_SPLIT;
+	info->min_batches = model->batch_size;
+	info->max_batches = model->batch_size;
+	info->nb_inputs = metadata->model.num_input;
+	info->input_info = input;
+	info->nb_outputs = metadata->model.num_output;
+	info->output_info = output;
+	info->wb_size = 0;
+
+	/* Set input info */
+	for (i = 0; i < info->nb_inputs; i++) {
+		rte_memcpy(input[i].name, metadata->input[i].name, MRVL_ML_INPUT_NAME_LEN);
+		input[i].nb_dims = metadata->input[i].ndim;
+		input[i].shape = &model->mvtvm.info.input[i].shape[0];
+		input[i].type = model->mvtvm.info.input[i].qtype;
+		input[i].nb_elements = model->mvtvm.info.input[i].nb_elements;
+		input[i].size = model->mvtvm.info.input[i].nb_elements *
+				rte_ml_io_type_size_get(model->mvtvm.info.input[i].qtype);
+	}
+
+	/* Set output info */
+	for (i = 0; i < info->nb_outputs; i++) {
+		rte_memcpy(output[i].name, metadata->output[i].name, MRVL_ML_OUTPUT_NAME_LEN);
+		output[i].nb_dims = metadata->output[i].ndim;
+		output[i].shape = &model->mvtvm.info.output[i].shape[0];
+		output[i].type = model->mvtvm.info.output[i].qtype;
+		output[i].nb_elements = model->mvtvm.info.output[i].nb_elements;
+		output[i].size = model->mvtvm.info.output[i].nb_elements *
+				 rte_ml_io_type_size_get(model->mvtvm.info.output[i].qtype);
+	}
+
+	return;
+
+tvm_mrvl_model:
+	cn10k_ml_model_info_set(cnxk_mldev, model, &model->mvtvm.info,
+				&model->layer[0].glow.metadata);
+}
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.h b/drivers/ml/cnxk/mvtvm_ml_model.h
index e86581bc6a..a1247ffbde 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.h
+++ b/drivers/ml/cnxk/mvtvm_ml_model.h
@@ -11,6 +11,7 @@
 
 #include "cnxk_ml_io.h"
 
+struct cnxk_ml_dev;
 struct cnxk_ml_model;
 
 /* Maximum number of objects per model */
@@ -52,5 +53,6 @@ int mvtvm_ml_model_get_layer_id(struct cnxk_ml_model *model, const char *layer_n
 				uint16_t *layer_id);
 void mvtvm_ml_model_io_info_set(struct cnxk_ml_model *model);
 struct cnxk_ml_io_info *mvtvm_ml_model_io_info_get(struct cnxk_ml_model *model, uint16_t layer_id);
+void mvtvm_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
 
 #endif /* _MVTVM_ML_MODEL_H_ */
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index 323c7c6fb6..c6872cd89a 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -178,6 +178,9 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params *
 	/* Update model I/O data */
 	mvtvm_ml_model_io_info_set(model);
 
+	/* Set model info */
+	mvtvm_ml_model_info_set(cnxk_mldev, model);
+
 	return 0;
 
 error:
-- 
2.42.0
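mvtvm_ml_model_info_set() relies on a single memory block that holds the model info header, followed by a fixed number of input descriptors, followed by the same number of output descriptors; the input and output pointers are derived by pointer arithmetic. A minimal standalone sketch of that layout is given below. The demo_* structures and DEMO_MAX_IO are hypothetical and far smaller than the rte_ml_* ones, and the sketch allocates the block itself with calloc() where the driver receives it from elsewhere.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_IO 4u /* hypothetical, plays the role of a max input/output count */

struct demo_io_info {
	char name[16];
	uint32_t nb_dims;
	uint64_t size;
};

struct demo_model_info {
	char name[16];
	uint32_t nb_inputs;
	struct demo_io_info *input_info;
	uint32_t nb_outputs;
	struct demo_io_info *output_info;
};

/*
 * Allocate one zeroed block laid out as:
 *   [ header | DEMO_MAX_IO input slots | DEMO_MAX_IO output slots ]
 * and point the header at the two descriptor arrays inside it.
 */
struct demo_model_info *
demo_info_alloc(uint32_t nb_inputs, uint32_t nb_outputs)
{
	struct demo_model_info *info;
	struct demo_io_info *input;
	struct demo_io_info *output;

	if (nb_inputs > DEMO_MAX_IO || nb_outputs > DEMO_MAX_IO)
		return NULL;

	info = calloc(1, sizeof(*info) + 2 * DEMO_MAX_IO * sizeof(*input));
	if (info == NULL)
		return NULL;

	input = (struct demo_io_info *)((uint8_t *)info + sizeof(*info));
	output = input + DEMO_MAX_IO; /* outputs start after all input slots */

	info->nb_inputs = nb_inputs;
	info->input_info = input;
	info->nb_outputs = nb_outputs;
	info->output_info = output;

	return info;
}
```

Because the output array starts after all DEMO_MAX_IO input slots rather than after nb_inputs entries, both array positions are fixed for the lifetime of the block, and a single free() releases everything.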
[PATCH v8 32/34] ml/cnxk: support quantize and dequantize callback
From: Prince Takkar Added support for quantize and dequantize callback functions for TVM models. Signed-off-by: Prince Takkar --- drivers/ml/cnxk/mvtvm_ml_ops.c | 129 + drivers/ml/cnxk/mvtvm_ml_ops.h | 4 + 2 files changed, 133 insertions(+) diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c index 9d59e28661..39c8bf0f04 100644 --- a/drivers/ml/cnxk/mvtvm_ml_ops.c +++ b/drivers/ml/cnxk/mvtvm_ml_ops.c @@ -2,11 +2,15 @@ * Copyright (c) 2023 Marvell. */ +#include + #include #include #include #include +#include + #include "cnxk_ml_dev.h" #include "cnxk_ml_model.h" #include "cnxk_ml_ops.h" @@ -236,6 +240,8 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params * callback->tvmrt_io_free = cn10k_ml_io_free; callback->tvmrt_malloc = cn10k_ml_malloc; callback->tvmrt_free = cn10k_ml_free; + callback->tvmrt_quantize = mvtvm_ml_io_quantize; + callback->tvmrt_dequantize = mvtvm_ml_io_dequantize; } else { callback = NULL; } @@ -366,3 +372,126 @@ mvtvm_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model) return 0; } + +int +mvtvm_ml_io_quantize(void *device, uint16_t model_id, const char *layer_name, +const DLTensor **deq_tensor, void *qbuffer) +{ + struct cnxk_ml_io_info *info = NULL; + struct cnxk_ml_dev *cnxk_mldev; + struct cnxk_ml_model *model; + uint16_t layer_id = 0; + uint8_t *lcl_dbuffer; + uint8_t *lcl_qbuffer; + uint32_t i; + int ret; + +#ifdef CNXK_ML_DEV_DEBUG + if ((device == NULL) || (deq_tensor == NULL) || (qbuffer == NULL)) + return -EINVAL; +#endif + + cnxk_mldev = (struct cnxk_ml_dev *)device; + + model = cnxk_mldev->mldev->data->models[model_id]; +#ifdef CNXK_ML_DEV_DEBUG + if (model == NULL) { + plt_err("Invalid model_id = %u", model_id); + return -EINVAL; + } +#endif + + /* Get layer id */ + for (layer_id = 0; layer_id < model->mvtvm.metadata.model.nb_layers; layer_id++) { + if (strcmp(model->layer[layer_id].name, layer_name) == 0) + break; + } + +#ifdef CNXK_ML_DEV_DEBUG + if (layer_id 
== model->mvtvm.metadata.model.nb_layers) { + plt_err("Invalid layer name: %s", layer_name); + return -EINVAL; + } + + if (model->layer[layer_id].type != ML_CNXK_LAYER_TYPE_MRVL) { + plt_err("Invalid layer name / type: %s", layer_name); + return -EINVAL; + } +#endif + + info = &model->layer[layer_id].info; + lcl_qbuffer = (uint8_t *)qbuffer; + + for (i = 0; i < info->nb_inputs; i++) { + lcl_dbuffer = PLT_PTR_ADD(deq_tensor[i]->data, deq_tensor[i]->byte_offset); + + ret = cnxk_ml_io_quantize_single(&info->input[i], lcl_dbuffer, lcl_qbuffer); + if (ret < 0) + return ret; + + lcl_qbuffer += info->input[i].sz_q; + } + + return 0; +} + +int +mvtvm_ml_io_dequantize(void *device, uint16_t model_id, const char *layer_name, void *qbuffer, + const DLTensor **deq_tensor) +{ + struct cnxk_ml_io_info *info = NULL; + struct cnxk_ml_dev *cnxk_mldev; + struct cnxk_ml_model *model; + uint16_t layer_id = 0; + uint8_t *lcl_dbuffer; + uint8_t *lcl_qbuffer; + uint32_t i; + int ret; + +#ifdef CNXK_ML_DEV_DEBUG + if ((device == NULL) || (deq_tensor == NULL) || (qbuffer == NULL)) + return -EINVAL; +#endif + + cnxk_mldev = (struct cnxk_ml_dev *)device; + + model = cnxk_mldev->mldev->data->models[model_id]; +#ifdef CNXK_ML_DEV_DEBUG + if (model == NULL) { + plt_err("Invalid model_id = %u", model_id); + return -EINVAL; + } +#endif + + for (layer_id = 0; layer_id < model->mvtvm.metadata.model.nb_layers; layer_id++) { + if (strcmp(model->layer[layer_id].name, layer_name) == 0) + break; + } + +#ifdef CNXK_ML_DEV_DEBUG + if (layer_id == model->mvtvm.metadata.model.nb_layers) { + plt_err("Invalid layer name: %s", layer_name); + return -EINVAL; + } + + if (model->layer[layer_id].type != ML_CNXK_LAYER_TYPE_MRVL) { + plt_err("Invalid layer name / type: %s", layer_name); + return -EINVAL; + } +#endif + + info = &model->layer[layer_id].info; + lcl_qbuffer = (uint8_t *)qbuffer; + + for (i = 0; i < info->nb_outputs; i++) { + lcl_dbuffer = PLT_PTR_ADD(deq_tensor[i]->data, deq_tensor[i]->byte_offset); + + 
ret = cnxk_ml_io_dequantize_single(&info->output[i], lcl_qbuffer, lcl_dbuffer); + if (ret < 0) +
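For readers following the callback logic: the quantize path above reads each tensor's payload at `data + byte_offset` and packs the quantized results back to back, advancing the output pointer by `sz_q` per input. Below is a minimal standalone sketch of that walk. The `sketch_*` types and functions are hypothetical stand-ins for `DLTensor`, `cnxk_ml_io` and `cnxk_ml_io_quantize_single`, and the int8 "multiply by scale, round, saturate" conversion is an assumption made only for illustration; the driver's real conversion is not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for DLTensor. */
struct sketch_tensor {
	void *data;           /* base pointer of the dequantized data */
	uint64_t byte_offset; /* payload offset from data, in bytes */
};

/* Hypothetical, simplified stand-in for struct cnxk_ml_io. */
struct sketch_io {
	uint32_t nb_elements; /* number of float elements in this input */
	float scale;          /* assumed convention: q = round(d * scale) */
	uint32_t sz_q;        /* bytes this input occupies in the packed buffer */
};

/* Assumed conversion: float to int8, round half away from zero, saturate. */
static void
sketch_quantize_single(const struct sketch_io *io, const float *d, int8_t *q)
{
	uint32_t i;

	for (i = 0; i < io->nb_elements; i++) {
		float v = d[i] * io->scale;

		v = (v >= 0.0f) ? v + 0.5f : v - 0.5f;
		if (v > 127.0f)
			v = 127.0f;
		if (v < -128.0f)
			v = -128.0f;
		q[i] = (int8_t)v;
	}
}

/* Walk all inputs, packing quantized data contiguously into qbuffer. */
int
sketch_io_quantize(const struct sketch_io *io, uint32_t nb_inputs,
		   const struct sketch_tensor *const *tensors, void *qbuffer)
{
	uint8_t *lcl_qbuffer = qbuffer;
	uint32_t i;

	if (io == NULL || tensors == NULL || qbuffer == NULL)
		return -EINVAL;

	for (i = 0; i < nb_inputs; i++) {
		const uint8_t *lcl_dbuffer =
			(const uint8_t *)tensors[i]->data + tensors[i]->byte_offset;

		/* One byte per element in this sketch, so sz_q must cover them. */
		assert(io[i].sz_q >= io[i].nb_elements);

		sketch_quantize_single(&io[i], (const float *)lcl_dbuffer,
				       (int8_t *)lcl_qbuffer);
		lcl_qbuffer += io[i].sz_q;
	}

	return 0;
}
```

The dequantize direction would mirror this loop over the outputs, with the source and destination buffers swapped.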
[PATCH v8 30/34] ml/cnxk: implement I/O alloc and free callbacks
Implemented callback functions for IO allocation and free for Glow layers. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_ops.c | 87 ++ drivers/ml/cnxk/cn10k_ml_ops.h | 3 ++ drivers/ml/cnxk/mvtvm_ml_ops.c | 2 + 3 files changed, 92 insertions(+) diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 0c67ce7b40..7802425c87 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -1410,3 +1410,90 @@ cn10k_ml_inference_sync(void *device, uint16_t index, void *input, void *output, error_enqueue: return ret; } + +int +cn10k_ml_io_alloc(void *device, uint16_t model_id, const char *layer_name, uint64_t **input_qbuffer, + uint64_t **output_qbuffer) +{ + struct cnxk_ml_dev *cnxk_mldev; + struct cnxk_ml_model *model; + struct cnxk_ml_layer *layer; + + char str[RTE_MEMZONE_NAMESIZE]; + const struct plt_memzone *mz; + uint64_t output_size; + uint64_t input_size; + uint16_t layer_id; + int ret; + + cnxk_mldev = (struct cnxk_ml_dev *)device; + if (cnxk_mldev == NULL) { + plt_err("Invalid device = %p", device); + return -EINVAL; + } + + model = cnxk_mldev->mldev->data->models[model_id]; + if (model == NULL) { + plt_err("Invalid model_id = %u", model_id); + return -EINVAL; + } + + ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id); + if (ret != 0) + return ret; + + layer = &model->layer[layer_id]; + input_size = PLT_ALIGN_CEIL(layer->info.total_input_sz_q, ML_CN10K_ALIGN_SIZE); + output_size = PLT_ALIGN_CEIL(layer->info.total_output_sz_q, ML_CN10K_ALIGN_SIZE); + + sprintf(str, "cn10k_ml_io_mz_%u_%u", model_id, layer_id); + mz = plt_memzone_reserve_aligned(str, input_size + output_size, 0, ML_CN10K_ALIGN_SIZE); + if (mz == NULL) { + plt_err("io_alloc failed: Unable to allocate memory: model_id = %u, layer_name = %s", + model_id, layer_name); + return -ENOMEM; + } + + *input_qbuffer = mz->addr; + *output_qbuffer = PLT_PTR_ADD(mz->addr, input_size); + + return 0; +} + +int +cn10k_ml_io_free(void 
*device, uint16_t model_id, const char *layer_name) +{ + struct cnxk_ml_dev *cnxk_mldev; + struct cnxk_ml_model *model; + + char str[RTE_MEMZONE_NAMESIZE]; + const struct plt_memzone *mz; + uint16_t layer_id; + int ret; + + cnxk_mldev = (struct cnxk_ml_dev *)device; + if (cnxk_mldev == NULL) { + plt_err("Invalid device = %p", device); + return -EINVAL; + } + + model = cnxk_mldev->mldev->data->models[model_id]; + if (model == NULL) { + plt_err("Invalid model_id = %u", model_id); + return -EINVAL; + } + + ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id); + if (ret != 0) + return ret; + + sprintf(str, "cn10k_ml_io_mz_%u_%u", model_id, layer_id); + mz = plt_memzone_lookup(str); + if (mz == NULL) { + plt_err("io_free failed: Memzone not found: model_id = %u, layer_name = %s", + model_id, layer_name); + return -EINVAL; + } + + return plt_memzone_free(mz); +} diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h index 045e2e6cd2..9c41c1c0b0 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.h +++ b/drivers/ml/cnxk/cn10k_ml_ops.h @@ -329,6 +329,9 @@ int cn10k_ml_layer_load(void *device, uint16_t model_id, const char *layer_name, int cn10k_ml_layer_unload(void *device, uint16_t model_id, const char *layer_name); int cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name); int cn10k_ml_layer_stop(void *device, uint16_t model_id, const char *layer_name); +int cn10k_ml_io_alloc(void *device, uint16_t model_id, const char *layer_name, + uint64_t **input_qbuffer, uint64_t **output_qbuffer); +int cn10k_ml_io_free(void *device, uint16_t model_id, const char *layer_name); /* xstats ops */ void cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model, diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c index abfbae2b3a..a50b31ec6e 100644 --- a/drivers/ml/cnxk/mvtvm_ml_ops.c +++ b/drivers/ml/cnxk/mvtvm_ml_ops.c @@ -232,6 +232,8 @@ mvtvm_ml_model_load(struct cnxk_ml_dev 
*cnxk_mldev, struct rte_ml_model_params * callback = &model->mvtvm.cb; callback->tvmrt_glow_layer_load = cn10k_ml_layer_load; callback->tvmrt_glow_layer_unload = cn10k_ml_layer_unload; + callback->tvmrt_io_alloc = cn10k_ml_io_alloc; + callback->tvmrt_io_free = cn10k_ml_io_free; } else { ca
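The allocation scheme in `cn10k_ml_io_alloc` reserves one region sized for the aligned input plus the aligned output, and places the output buffer immediately after the aligned input. A minimal sketch of the same layout using plain C11 `aligned_alloc` instead of a memzone follows. `SKETCH_ALIGN` (128 here) and the `sketch_*` names are hypothetical; the real value of `ML_CN10K_ALIGN_SIZE` is not visible in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical alignment; stands in for ML_CN10K_ALIGN_SIZE. */
#define SKETCH_ALIGN 128u
#define SKETCH_ALIGN_CEIL(v, a) ((((v) + (a) - 1) / (a)) * (a))

struct sketch_io_region {
	void *base;   /* start of the single allocation */
	void *input;  /* quantized input buffer */
	void *output; /* quantized output buffer, right after the aligned input */
};

int
sketch_io_alloc(uint64_t total_input_sz_q, uint64_t total_output_sz_q,
		struct sketch_io_region *r)
{
	uint64_t input_size;
	uint64_t output_size;

	if (r == NULL || (total_input_sz_q == 0 && total_output_sz_q == 0))
		return -EINVAL;

	input_size = SKETCH_ALIGN_CEIL(total_input_sz_q, SKETCH_ALIGN);
	output_size = SKETCH_ALIGN_CEIL(total_output_sz_q, SKETCH_ALIGN);

	/* Both sizes are multiples of SKETCH_ALIGN, as aligned_alloc requires. */
	r->base = aligned_alloc(SKETCH_ALIGN, input_size + output_size);
	if (r->base == NULL)
		return -ENOMEM;
	assert(((uintptr_t)r->base % SKETCH_ALIGN) == 0);

	r->input = r->base;
	r->output = (uint8_t *)r->base + input_size;

	return 0;
}

void
sketch_io_free(struct sketch_io_region *r)
{
	free(r->base);
	r->base = NULL;
	r->input = NULL;
	r->output = NULL;
}
```

Rounding the input size up before computing the output pointer is what keeps the output buffer on the same alignment boundary as the input.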
[PATCH v8 33/34] ml/cnxk: enable fast-path ops for TVM models
From: Anup Prabhu Enable fast-path ops support for TVM models. Models would use TVMDP library function calls to execute inference operations for Hybrid and LLVM model sub-types. For TVM MRVL model subtypes that have a single MRVL layer, the inference requests are directly enqueued to hardware by the driver. Signed-off-by: Anup Prabhu Signed-off-by: Srikanth Yalavarthi --- doc/guides/rel_notes/release_23_11.rst | 3 + drivers/ml/cnxk/cn10k_ml_ops.c | 4 - drivers/ml/cnxk/cnxk_ml_io.h | 6 ++ drivers/ml/cnxk/cnxk_ml_ops.c | 4 + drivers/ml/cnxk/cnxk_ml_ops.h | 5 + drivers/ml/cnxk/mvtvm_ml_model.c | 20 drivers/ml/cnxk/mvtvm_ml_model.h | 6 ++ drivers/ml/cnxk/mvtvm_ml_ops.c | 124 + drivers/ml/cnxk/mvtvm_ml_ops.h | 43 + 9 files changed, 211 insertions(+), 4 deletions(-) diff --git a/doc/guides/rel_notes/release_23_11.rst b/doc/guides/rel_notes/release_23_11.rst index 0a6fc76a9d..5fcf2a1897 100644 --- a/doc/guides/rel_notes/release_23_11.rst +++ b/doc/guides/rel_notes/release_23_11.rst @@ -243,6 +243,9 @@ New Features Added dispatcher library which purpose is to help decouple different parts (modules) of an eventdev-based application. +* **Updated Marvell cnxk mldev driver.** + + * Added support for models compiled using TVM framework. 
Removed Items - diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index 01b0a44caa..b9d30278c6 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -371,10 +371,6 @@ cn10k_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct rte_ml_dev_c else cn10k_mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_lf; - cnxk_mldev->mldev->enqueue_burst = cnxk_ml_enqueue_burst; - cnxk_mldev->mldev->dequeue_burst = cnxk_ml_dequeue_burst; - cnxk_mldev->mldev->op_error_get = cn10k_ml_op_error_get; - return 0; } diff --git a/drivers/ml/cnxk/cnxk_ml_io.h b/drivers/ml/cnxk/cnxk_ml_io.h index 5de166c252..6d5d25a7c9 100644 --- a/drivers/ml/cnxk/cnxk_ml_io.h +++ b/drivers/ml/cnxk/cnxk_ml_io.h @@ -47,6 +47,12 @@ struct cnxk_ml_io { /* Scale */ float scale; + + /* Dequantized offset */ + uint32_t offset_d; + + /* Quantized offset */ + uint32_t offset_q; }; /* Model / Layer IO structure */ diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c index 2632d70d8c..bf266d4d6e 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.c +++ b/drivers/ml/cnxk/cnxk_ml_ops.c @@ -632,6 +632,10 @@ cnxk_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *co cnxk_mldev->max_nb_layers = cnxk_mldev->cn10k_mldev.fw.req->cn10k_req.jd.fw_load.cap.s.max_models; + cnxk_mldev->mldev->enqueue_burst = cnxk_ml_enqueue_burst; + cnxk_mldev->mldev->dequeue_burst = cnxk_ml_dequeue_burst; + cnxk_mldev->mldev->op_error_get = cn10k_ml_op_error_get; + /* Allocate and initialize index_map */ if (cnxk_mldev->index_map == NULL) { cnxk_mldev->index_map = diff --git a/drivers/ml/cnxk/cnxk_ml_ops.h b/drivers/ml/cnxk/cnxk_ml_ops.h index ab32676b3e..7b49793a57 100644 --- a/drivers/ml/cnxk/cnxk_ml_ops.h +++ b/drivers/ml/cnxk/cnxk_ml_ops.h @@ -24,6 +24,11 @@ struct cnxk_ml_req { union { /* CN10K */ struct cn10k_ml_req cn10k_req; + +#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM + /* MVTVM */ + struct mvtvm_ml_req mvtvm_req; +#endif }; /* Address of status field 
*/ diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c index ffbcec8b80..95bde6a9cb 100644 --- a/drivers/ml/cnxk/mvtvm_ml_model.c +++ b/drivers/ml/cnxk/mvtvm_ml_model.c @@ -198,6 +198,16 @@ mvtvm_ml_model_io_info_set(struct cnxk_ml_model *model) model->mvtvm.info.total_input_sz_d += model->mvtvm.info.input[i].sz_d; model->mvtvm.info.total_input_sz_q += model->mvtvm.info.input[i].sz_q; + model->mvtvm.info.input[i].offset_d = model->mvtvm.info.total_input_sz_d; + model->mvtvm.info.input[i].offset_q = model->mvtvm.info.total_input_sz_q; + + model->mvtvm.input_tensor[i].device = metadata->input[i].device; + model->mvtvm.input_tensor[i].ndim = metadata->input[i].ndim; + model->mvtvm.input_tensor[i].dtype = metadata->input[i].datatype; + model->mvtvm.input_tensor[i].shape = metadata->input[i].shape; + model->mvtvm.input_tensor[i].strides = NULL; + model->mvtvm.input_tensor[i].byte_offset = model->mvtvm.info.input[i].offset_q; + plt_ml_dbg("model_id = %u, input[%u] - sz_d = %u sz_q = %u", model->model_id, i, model->mvtvm.info.input[i].sz_d, model->mvtvm.info.input[i].sz_q); } @@ -231,6
[PATCH v8 34/34] ml/cnxk: enable creation of mvtvm virtual device
Enable support to create a mvtvm virtual device on systems without a PCI-based ML HW accelerator. Signed-off-by: Srikanth Yalavarthi --- doc/guides/mldevs/cnxk.rst | 50 +++- drivers/ml/cnxk/cn10k_ml_dev.c | 8 ++ drivers/ml/cnxk/cn10k_ml_dev.h | 3 + drivers/ml/cnxk/cnxk_ml_dev.c| 3 + drivers/ml/cnxk/cnxk_ml_dev.h| 21 drivers/ml/cnxk/cnxk_ml_ops.c| 82 + drivers/ml/cnxk/meson.build | 1 + drivers/ml/cnxk/mvtvm_ml_dev.c | 196 +++ drivers/ml/cnxk/mvtvm_ml_dev.h | 40 +++ drivers/ml/cnxk/mvtvm_ml_ops.c | 31 + drivers/ml/cnxk/mvtvm_ml_ops.h | 2 + drivers/ml/cnxk/mvtvm_ml_stubs.c | 18 +++ drivers/ml/cnxk/mvtvm_ml_stubs.h | 2 + 13 files changed, 433 insertions(+), 24 deletions(-) create mode 100644 drivers/ml/cnxk/mvtvm_ml_dev.c create mode 100644 drivers/ml/cnxk/mvtvm_ml_dev.h diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst index a4d8903896..28e5b5b87f 100644 --- a/doc/guides/mldevs/cnxk.rst +++ b/doc/guides/mldevs/cnxk.rst @@ -239,6 +239,23 @@ Bind the ML PF device to the vfio_pci driver: usertools/dpdk-devbind.py -u :00:10.0 usertools/dpdk-devbind.py -b vfio-pci :00:10.0 +VDEV support + + +On platforms which don't support ML hardware acceleration through PCI device, the +Marvell ML CNXK PMD can execute inference operations on a vdev with the ML models +compiled using Apache TVM framework. + +VDEV can be enabled by passing the EAL arguments + +.. code-block:: console + + --vdev ml_mvtvm + +VDEV can also be used on platforms with ML HW accelerator. However, to use VDEV in +this case, the PCI device has to be unbound. When the PCI device is bound, creation +of vdev is skipped. + Runtime Config Options -- @@ -249,6 +266,8 @@ Runtime Config Options The parameter ``fw_path`` can be used by the user to load ML firmware from a custom path. + This option is supported only on PCI HW accelerator. 
+ For example:: -a :00:10.0,fw_path="/home/user/ml_fw.bin" @@ -264,6 +283,8 @@ Runtime Config Options When enabled, firmware would mask the DPE non-fatal hardware errors as warnings. The parameter ``enable_dpe_warnings`` is used fo this configuration. + This option is supported only on PCI HW accelerator. + For example:: -a :00:10.0,enable_dpe_warnings=0 @@ -280,11 +301,19 @@ Runtime Config Options Caching of model data improves the inferencing throughput / latency for the model. The parameter ``cache_model_data`` is used to enable data caching. + This option is supported on PCI HW accelerator and vdev. + For example:: -a :00:10.0,cache_model_data=0 - With the above configuration, model data caching is disabled. + With the above configuration, model data caching is disabled on HW accelerator. + + For example:: + + --vdev ml_mvtvm,cache_model_data=0 + + With the above configuration, model data caching is disabled on vdev. **OCM allocation mode** (default ``lowest``) @@ -300,6 +329,8 @@ Runtime Config Options ``largest`` Allocate OCM for the model from the slot with largest amount of free space. + This option is supported only on PCI HW accelerator. + For example:: -a :00:10.0,ocm_alloc_mode=lowest @@ -317,6 +348,8 @@ Runtime Config Options Supported page sizes by the driver are 1 KB, 2 KB, 4 KB, 8 KB and 16 KB. Default page size is 16 KB. + This option is supported only on PCI HW accelerator. + For example:: -a :00:10.0,ocm_page_size=8192 @@ -341,6 +374,8 @@ Runtime Config Options Enabling spinlock version would disable restrictions on the number of queue-pairs that can be supported by the driver. + This option is supported only on PCI HW accelerator. + For example:: -a :00:10.0,hw_queue_lock=1 @@ -349,6 +384,19 @@ Runtime Config Options in the fast path enqueue burst operation. +**Maximum queue pairs** (default ``1``) + + VDEV supports additional EAL arguments to configure the maximum number of + queue-pairs on the ML device through the option ``max_qps``. 
+ + This option is supported only on vdev. + + For example:: + + --vdev ml_mvtvm,max_qps=4 + + With the above configuration, 4 queue-pairs are created on the vdev. + Debugging Options - diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c index 91813e9d0a..41f3b7a95d 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.c +++ b/drivers/ml/cnxk/cn10k_ml_dev.c @@ -309,6 +309,12 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de PLT_SET_USED(pci_drv); + if (cnxk_ml_dev_initialized == 1) { + plt_err("ML CNXK device already initialized!"); + plt_err("Cannot initialize CN10K PCI dev"); + return -EINVAL; + } + init_params = (struct rte_ml_dev_pmd_init_params){
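The documentation hunk above lists two options accepted by the vdev, `max_qps` and `cache_model_data`, passed as a comma-separated `key=value` list after `--vdev ml_mvtvm`. A DPDK driver would normally parse these with the `rte_kvargs` library; the sketch below only illustrates the parsing and validation logic in plain C so it can be compiled on its own. All `sketch_*` names are hypothetical, the default of 1 for `cache_model_data` is an assumption (only the `max_qps` default of 1 is stated in this hunk), and the upper bound on `max_qps` is likewise assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_vdev_args {
	unsigned int max_qps;          /* default 1, per the documentation hunk */
	unsigned int cache_model_data; /* default 1 is an assumption */
};

/* Parse one "key=value" item; only the two vdev options are accepted. */
static int
sketch_parse_item(char *item, struct sketch_vdev_args *out)
{
	char *eq = strchr(item, '=');
	unsigned long v;
	char *end;

	if (eq == NULL || eq[1] < '0' || eq[1] > '9')
		return -EINVAL;
	*eq = '\0';

	errno = 0;
	v = strtoul(eq + 1, &end, 10);
	if (errno != 0 || *end != '\0')
		return -EINVAL;

	if (strcmp(item, "max_qps") == 0) {
		if (v == 0 || v > UINT16_MAX) /* assumed limits */
			return -EINVAL;
		out->max_qps = (unsigned int)v;
	} else if (strcmp(item, "cache_model_data") == 0) {
		if (v > 1)
			return -EINVAL;
		out->cache_model_data = (unsigned int)v;
	} else {
		return -EINVAL; /* e.g. fw_path is PCI-only per the docs */
	}

	return 0;
}

/* Parse the argument string that follows "ml_mvtvm," on the command line. */
int
sketch_parse_vdev_args(const char *args, struct sketch_vdev_args *out)
{
	const char *p = args;

	assert(out != NULL); /* programming error, not a user input error */
	out->max_qps = 1;
	out->cache_model_data = 1;

	while (p != NULL && *p != '\0') {
		const char *comma = strchr(p, ',');
		size_t len = (comma != NULL) ? (size_t)(comma - p) : strlen(p);
		char item[64];
		int ret;

		if (len == 0 || len >= sizeof(item))
			return -EINVAL;
		memcpy(item, p, len);
		item[len] = '\0';

		ret = sketch_parse_item(item, out);
		if (ret != 0)
			return ret;

		p += len;
		if (*p == ',')
			p++;
	}

	return 0;
}
```

Rejecting unknown keys outright matches the spirit of the "supported only on PCI HW accelerator" notes, though the driver may well choose to ignore or warn instead.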
[PATCH v8 18/34] ml/cnxk: support config and close of tvmdp library
Added support to configure and close TVMDP library based on ML device configuration options. Updated meson build to enable Jansson, TVM runtime, TVMDP library as build dependencies. Signed-off-by: Srikanth Yalavarthi --- config/arm/arm64_cn10k_linux_gcc | 1 + config/arm/arm64_cn9k_linux_gcc | 1 + doc/guides/mldevs/cnxk.rst | 169 +++ drivers/ml/cnxk/cnxk_ml_ops.c| 7 ++ drivers/ml/cnxk/cnxk_ml_ops.h| 6 ++ drivers/ml/cnxk/meson.build | 58 +++ drivers/ml/cnxk/mvtvm_ml_ops.c | 41 drivers/ml/cnxk/mvtvm_ml_ops.h | 19 drivers/ml/cnxk/mvtvm_ml_stubs.c | 26 + drivers/ml/cnxk/mvtvm_ml_stubs.h | 15 +++ 10 files changed, 343 insertions(+) create mode 100644 drivers/ml/cnxk/mvtvm_ml_ops.c create mode 100644 drivers/ml/cnxk/mvtvm_ml_ops.h create mode 100644 drivers/ml/cnxk/mvtvm_ml_stubs.c create mode 100644 drivers/ml/cnxk/mvtvm_ml_stubs.h diff --git a/config/arm/arm64_cn10k_linux_gcc b/config/arm/arm64_cn10k_linux_gcc index 05d2d64cf2..fa904af5d0 100644 --- a/config/arm/arm64_cn10k_linux_gcc +++ b/config/arm/arm64_cn10k_linux_gcc @@ -5,6 +5,7 @@ ar = 'aarch64-linux-gnu-gcc-ar' strip = 'aarch64-linux-gnu-strip' pkgconfig = 'aarch64-linux-gnu-pkg-config' pcap-config = '' +cmake = 'cmake' [host_machine] system = 'linux' diff --git a/config/arm/arm64_cn9k_linux_gcc b/config/arm/arm64_cn9k_linux_gcc index 7416454de0..646ce4b5d3 100644 --- a/config/arm/arm64_cn9k_linux_gcc +++ b/config/arm/arm64_cn9k_linux_gcc @@ -5,6 +5,7 @@ ar = 'aarch64-linux-gnu-gcc-ar' strip = 'aarch64-linux-gnu-strip' pkgconfig = 'aarch64-linux-gnu-pkg-config' pcap-config = '' +cmake = 'cmake' [host_machine] system = 'linux' diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst index 1834b1f905..a4d8903896 100644 --- a/doc/guides/mldevs/cnxk.rst +++ b/doc/guides/mldevs/cnxk.rst @@ -46,6 +46,175 @@ or cross-compiled on an x86 platform. Refer to :doc:`../platform/cnxk` for instructions to build your DPDK application. 
+Compilation Prerequisites +- + +This driver requires external libraries to optionally enable support for +models compiled using Apache TVM framework. The following dependencies are +not part of DPDK and must be installed separately: + +- **Jansson** + + This library enables support to parse and read JSON files. + +- **DLPack** + + This library provides headers for open in-memory tensor structures. + +.. note:: + +DPDK CNXK ML driver requires DLPack version 0.7 + +.. code-block:: console + +git clone https://github.com/dmlc/dlpack.git +cd dlpack +git checkout v0.7 -b v0.7 +cmake -S ./ -B build \ + -DCMAKE_INSTALL_PREFIX= \ + -DBUILD_MOCK=OFF +make -C build +make -C build install + +*Cross-compiling for AArch64* + +.. code-block:: console + +git clone https://github.com/dmlc/dlpack.git +cd dlpack +git checkout v0.7 -b v0.7 +cmake -S ./ -B build \ + -DCMAKE_INSTALL_PREFIX= \ + -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \ + -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \ + -DBUILD_MOCK=OFF +make -C build +make -C build install + +- **DMLC** + + This is a common bricks library for building scalable and portable distributed + machine learning. + +.. code-block:: console + +git clone https://github.com/dmlc/dmlc-core.git +cd dmlc-core +git checkout main +cmake -S ./ -B build \ + -DCMAKE_INSTALL_PREFIX= \ + -DCMAKE_C_FLAGS="-fpermissive" \ + -DCMAKE_CXX_FLAGS="-fpermissive" \ + -DUSE_OPENMP=OFF +make -C build +make -C build install + +*Cross-compiling for AArch64* + +.. code-block:: console + +git clone https://github.com/dmlc/dmlc-core.git +cd dmlc-core +git checkout main +cmake -S ./ -B build \ + -DCMAKE_INSTALL_PREFIX= \ + -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \ + -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \ + -DCMAKE_C_FLAGS="-fpermissive" \ + -DCMAKE_CXX_FLAGS="-fpermissive" \ + -DUSE_OPENMP=OFF +make -C build +make -C build install + +- **TVM** + + Apache TVM provides runtime libraries used to execute models on CPU cores + or hardware accelerators. + +.. 
note:: + +DPDK CNXK ML driver requires TVM version 0.10.0 + +.. code-block:: console + +git clone https://github.com/apache/tvm.git +cd tvm +git checkout v0.11.0 -b v0.11.0 +git submodule update --init +cmake -S ./ -B build \ + -DCMAKE_INSTALL_PREFIX= \ + -DBUILD_STATIC_RUNTIME=OFF +make -C build +make -C build install + +*Cross-compiling for AArch64* + +.. code-block:: console + +git clone https://github.com/apache/tvm.git +cd tvm +git checkout v0.11.0 -b v0.11.0 +git submodule update --init +cmake -S ./ -B build \ + -DCMAKE_INSTALL_PREFIX= \ + -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \ + -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \ + -DMACHINE_NAME=aarch64-linux-
[PATCH v2] common/qat: limit configuration to the primary process
This change prevents certain configuration functions from being called by the secondary process. Signed-off-by: Arkadiusz Kusztal --- v2: - fixed incorrect function call - rephrased comments drivers/common/qat/qat_device.c | 115 +++- drivers/common/qat/qat_device.h | 2 + 2 files changed, 67 insertions(+), 50 deletions(-) diff --git a/drivers/common/qat/qat_device.c b/drivers/common/qat/qat_device.c index cbf1e6a988..b7bd2ade4a 100644 --- a/drivers/common/qat/qat_device.c +++ b/drivers/common/qat/qat_device.c @@ -13,6 +13,15 @@ #include "adf_pf2vf_msg.h" #include "qat_pf2vf.h" +#define NOT_NULL(arg, func, msg, ...) \ + do {\ + if (arg == NULL) { \ + QAT_LOG(ERR,\ + msg, ##__VA_ARGS__);\ + func; \ + } \ + } while (0) + /* Hardware device information per generation */ struct qat_gen_hw_data qat_gen_config[QAT_N_GENS]; struct qat_dev_hw_spec_funcs *qat_dev_hw_spec[QAT_N_GENS]; @@ -173,6 +182,29 @@ qat_dev_parse_cmd(const char *str, struct qat_dev_cmd_param } } +static enum qat_device_gen +pick_gen(const struct rte_pci_device *pci_dev) +{ + switch (pci_dev->id.device_id) { + case 0x0443: + return QAT_GEN1; + case 0x37c9: + case 0x19e3: + case 0x6f55: + case 0x18ef: + return QAT_GEN2; + case 0x18a1: + return QAT_GEN3; + case 0x4941: + case 0x4943: + case 0x4945: + return QAT_GEN4; + default: + QAT_LOG(ERR, "Invalid dev_id, can't determine generation"); + return QAT_N_GENS; + } +} + struct qat_pci_device * qat_pci_device_allocate(struct rte_pci_device *pci_dev, struct qat_dev_cmd_param *qat_dev_cmd_param) @@ -190,25 +222,8 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev, rte_pci_device_name(&pci_dev->addr, name, sizeof(name)); snprintf(name+strlen(name), QAT_DEV_NAME_MAX_LEN-strlen(name), "_qat"); - switch (pci_dev->id.device_id) { - case 0x0443: - qat_dev_gen = QAT_GEN1; - break; - case 0x37c9: - case 0x19e3: - case 0x6f55: - case 0x18ef: - qat_dev_gen = QAT_GEN2; - break; - case 0x18a1: - qat_dev_gen = QAT_GEN3; - break; - case 0x4941: - case 0x4943: - case 
0x4945: - qat_dev_gen = QAT_GEN4; - break; - default: + qat_dev_gen = pick_gen(pci_dev); + if (qat_dev_gen == QAT_N_GENS) { QAT_LOG(ERR, "Invalid dev_id, can't determine generation"); return NULL; } @@ -265,20 +280,15 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev, qat_dev->dev_private = qat_dev + 1; strlcpy(qat_dev->name, name, QAT_DEV_NAME_MAX_LEN); qat_dev->qat_dev_id = qat_dev_id; - qat_pci_devs[qat_dev_id].pci_dev = pci_dev; qat_dev->qat_dev_gen = qat_dev_gen; ops_hw = qat_dev_hw_spec[qat_dev->qat_dev_gen]; - if (ops_hw->qat_dev_get_misc_bar == NULL) { - QAT_LOG(ERR, "qat_dev_get_misc_bar function pointer not set"); - rte_memzone_free(qat_dev_mz); - return NULL; - } + NOT_NULL(ops_hw->qat_dev_get_misc_bar, goto error, + "QAT internal error! qat_dev_get_misc_bar function not set"); if (ops_hw->qat_dev_get_misc_bar(&mem_resource, pci_dev) == 0) { if (mem_resource->addr == NULL) { QAT_LOG(ERR, "QAT cannot get access to VF misc bar"); - rte_memzone_free(qat_dev_mz); - return NULL; + goto error; } qat_dev->misc_bar_io_addr = mem_resource->addr; } else @@ -291,22 +301,45 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev, QAT_LOG(ERR, "Cannot acquire ring configuration for QAT_%d", qat_dev_id); - rte_memzone_free(qat_dev_mz); - return NULL; + goto error; + } + NOT_NULL(ops_hw->qat_dev_reset_ring_pairs, goto error, + "QAT internal error! Reset ring pairs function not set, gen : %d", + qat_dev_gen); + if (ops_hw->qat_dev_reset_ring_pairs(qat_dev)) { + QAT_LOG(ERR, + "Cannot reset ring pairs, does pf driver support pf2vf comms?" + ); + goto error; } + NOT_NULL(ops_hw->qat_dev_get_slice_map, goto error, + "QAT internal error! Get slice map function not set, gen : %d", + qat_dev_gen); +
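The patch above replaces repeated "log, free, return NULL" blocks with a `NOT_NULL(arg, func, msg, ...)` macro whose action is `goto error`, so every failure shares one cleanup path. A minimal standalone sketch of that pattern follows. The `SKETCH_NOT_NULL` macro, the `sketch_*` types and the live-device counter are hypothetical and exist only to make the example testable; `fprintf` stands in for `QAT_LOG`. As in the patch, `##__VA_ARGS__` is a GNU extension supported by GCC and Clang. The sketch also parenthesizes the macro argument, which the patch version does not.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Log and run `action` (for example `goto error`) when `arg` is NULL. */
#define SKETCH_NOT_NULL(arg, action, msg, ...)                    \
	do {                                                      \
		if ((arg) == NULL) {                              \
			fprintf(stderr, msg "\n", ##__VA_ARGS__); \
			action;                                   \
		}                                                 \
	} while (0)

/* Hypothetical per-generation ops table. */
struct sketch_ops {
	int (*get_misc_bar)(void);
	int (*reset_ring_pairs)(void);
};

struct sketch_dev {
	int gen;
};

/* Counts devices allocated and not yet released, to show cleanup works. */
int sketch_live_devs;

int
sketch_ok(void)
{
	return 0;
}

void
sketch_dev_free(struct sketch_dev *dev)
{
	if (dev != NULL) {
		free(dev);
		sketch_live_devs--;
	}
}

struct sketch_dev *
sketch_dev_allocate(const struct sketch_ops *ops, int gen)
{
	struct sketch_dev *dev;

	assert(ops != NULL);

	dev = malloc(sizeof(*dev));
	if (dev == NULL)
		return NULL;
	sketch_live_devs++;

	SKETCH_NOT_NULL(ops->get_misc_bar, goto error,
			"internal error: get_misc_bar not set, gen: %d", gen);
	if (ops->get_misc_bar() != 0)
		goto error;

	SKETCH_NOT_NULL(ops->reset_ring_pairs, goto error,
			"internal error: reset_ring_pairs not set, gen: %d", gen);
	if (ops->reset_ring_pairs() != 0)
		goto error;

	dev->gen = gen;
	return dev;

error:
	/* Single cleanup path shared by every failure above. */
	sketch_dev_free(dev);
	return NULL;
}
```

Hiding a `goto` inside a macro keeps call sites short but makes control flow less obvious to a reader, which is presumably why the patch passes the action explicitly at each call site.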
[PATCH v2] maintainers: update email address
I left Intel and joined Nvidia, so update my email address. Signed-off-by: Chenbo Xia Acked-by: Maxime Coquelin --- .mailmap| 2 +- MAINTAINERS | 12 ++-- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/.mailmap b/.mailmap index 3f5bab26a8..d40b3ad6c0 100644 --- a/.mailmap +++ b/.mailmap @@ -213,7 +213,7 @@ Charles Brett Charles Myers Charles Stoll Chas Williams <3ch...@gmail.com> -Chenbo Xia +Chenbo Xia Chengchang Tang Chengfeng Ye Chenghu Yao diff --git a/MAINTAINERS b/MAINTAINERS index 4083658697..b1c9495a00 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -50,7 +50,7 @@ T: git://dpdk.org/next/dpdk-next-net-mlx Next-virtio Tree M: Maxime Coquelin -M: Chenbo Xia +M: Chenbo Xia T: git://dpdk.org/next/dpdk-next-virtio Next-crypto Tree @@ -594,7 +594,7 @@ F: drivers/bus/dpaa/ F: drivers/bus/fslmc/ PCI bus driver -M: Chenbo Xia +M: Chenbo Xia M: Nipun Gupta F: drivers/bus/pci/ @@ -983,7 +983,7 @@ F: doc/guides/nics/features/vmxnet3.ini Vhost-user M: Maxime Coquelin -M: Chenbo Xia +M: Chenbo Xia T: git://dpdk.org/next/dpdk-next-virtio F: lib/vhost/ F: doc/guides/prog_guide/vhost_lib.rst @@ -997,7 +997,7 @@ F: doc/guides/sample_app_ug/vdpa.rst Vhost PMD M: Maxime Coquelin -M: Chenbo Xia +M: Chenbo Xia T: git://dpdk.org/next/dpdk-next-virtio F: drivers/net/vhost/ F: doc/guides/nics/vhost.rst @@ -1005,7 +1005,7 @@ F: doc/guides/nics/features/vhost.ini Virtio PMD M: Maxime Coquelin -M: Chenbo Xia +M: Chenbo Xia T: git://dpdk.org/next/dpdk-next-virtio F: drivers/net/virtio/ F: doc/guides/nics/virtio.rst @@ -1661,7 +1661,7 @@ F: app/test/test_rcu* F: doc/guides/prog_guide/rcu_lib.rst PCI -M: Chenbo Xia +M: Chenbo Xia M: Gaetan Rivet F: lib/pci/ -- 2.39.3 (Apple Git-145)
Re: [RFC PATCH v4 1/4] dts: code adjustments for sphinx
> > My only nitpick comment would be on the name of the file common.py that only contains the MesonArgs class. Looks good otherwise.

Could you elaborate a bit more, Yoan? The common.py module is meant to be extended with code common to all other modules in the testbed_model package. Right now MesonArgs is the only thing that fits in common.py, but we could move other code there as well. We could also rename common.py to something else, but then that purpose would no longer be clear.

I'm finishing the docstrings soon, so expect a new version where things like this will be clearer. :-)