RE: [PATCH v5 23/26] regexdev: remove experimental tag

2023-10-22 Thread Ori Kam
Hi Stephen,

As I replied to the previous patch,
please don't remove the experimental tag from this lib, since
Nvidia will probably remove support in the near future, and the same goes for Marvell.
So this lib may be deprecated soon.

I don't think we want to signal to everyone that those functions are here to stay,
and we don't want to force a future HW provider to use an API that doesn't meet its
needs.
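
For context: while an API is tagged experimental, an application must explicitly
opt in before it may call it, so keeping the tag keeps users on notice. A minimal
sketch (generic application code, not from this thread):

/*
 * Hypothetical application snippet. The opt-in define must come before
 * any DPDK header is included; without it, calls to __rte_experimental
 * symbols fail to compile.
 */
#define ALLOW_EXPERIMENTAL_API 1

#include <errno.h>
#include <rte_regexdev.h>

static int
regex_dev_lookup(const char *name)
{
	int dev_id = rte_regexdev_get_dev_id(name);	/* experimental today */

	if (dev_id < 0)
		return dev_id;
	return rte_regexdev_is_valid_dev(dev_id) ? dev_id : -ENODEV;
}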

Thanks,
Ori

> -Original Message-
> From: Stephen Hemminger 
> Sent: Friday, October 20, 2023 11:58 PM
> To: dev@dpdk.org
> Cc: Stephen Hemminger ; Ori Kam
> 
> Subject: [PATCH v5 23/26] regexdev: remove experimental tag
> 
> This library was added in 22.11.
> Time to make it not experimental.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/regexdev/rte_regexdev.h | 92 -
>  lib/regexdev/version.map|  2 +-
>  2 files changed, 1 insertion(+), 93 deletions(-)
> 
> diff --git a/lib/regexdev/rte_regexdev.h b/lib/regexdev/rte_regexdev.h
> index d50af775b551..3ea1f0c061a0 100644
> --- a/lib/regexdev/rte_regexdev.h
> +++ b/lib/regexdev/rte_regexdev.h
> @@ -226,9 +226,6 @@ extern int rte_regexdev_logtype;
>  } while (0)
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Check if dev_id is ready.
>   *
>   * @param dev_id
> @@ -238,27 +235,19 @@ extern int rte_regexdev_logtype;
>   *   - 0 if device state is not in ready state.
>   *   - 1 if device state is ready state.
>   */
> -__rte_experimental
>  int rte_regexdev_is_valid_dev(uint16_t dev_id);
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Get the total number of RegEx devices that have been successfully
>   * initialised.
>   *
>   * @return
>   *   The total number of usable RegEx devices.
>   */
> -__rte_experimental
>  uint8_t
>  rte_regexdev_count(void);
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Get the device identifier for the named RegEx device.
>   *
>   * @param name
> @@ -268,7 +257,6 @@ rte_regexdev_count(void);
>   *   Returns RegEx device identifier on success.
>   *   - <0: Failure to find named RegEx device.
>   */
> -__rte_experimental
>  int
>  rte_regexdev_get_dev_id(const char *name);
> 
> @@ -628,9 +616,6 @@ struct rte_regexdev_info {
>  };
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Retrieve the contextual information of a RegEx device.
>   *
>   * @param dev_id
> @@ -644,7 +629,6 @@ struct rte_regexdev_info {
>   *   - 0: Success, driver updates the contextual information of the RegEx 
> device
>   *   - <0: Error code returned by the driver info get function.
>   */
> -__rte_experimental
>  int
>  rte_regexdev_info_get(uint8_t dev_id, struct rte_regexdev_info *dev_info);
> 
> @@ -723,9 +707,6 @@ struct rte_regexdev_config {
>  };
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Configure a RegEx device.
>   *
>   * This function must be invoked first before any other function in the
> @@ -743,7 +724,6 @@ struct rte_regexdev_config {
>   * @return
>   *   - 0: Success, device configured. Otherwise negative errno is returned.
>   */
> -__rte_experimental
>  int
>  rte_regexdev_configure(uint8_t dev_id, const struct rte_regexdev_config 
> *cfg);
> 
> @@ -782,9 +762,6 @@ struct rte_regexdev_qp_conf {
>  };
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Allocate and set up a RegEx queue pair for a RegEx device.
>   *
>   * @param dev_id
> @@ -799,15 +776,11 @@ struct rte_regexdev_qp_conf {
>   * @return
>   *   0 on success. Otherwise negative errno is returned.
>   */
> -__rte_experimental
>  int
>  rte_regexdev_queue_pair_setup(uint8_t dev_id, uint16_t queue_pair_id,
> const struct rte_regexdev_qp_conf *qp_conf);
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Start a RegEx device.
>   *
>   * The device start step is the last one and consists of setting the RegEx
> @@ -822,14 +795,10 @@ rte_regexdev_queue_pair_setup(uint8_t dev_id,
> uint16_t queue_pair_id,
>   * @return
>   *   0 on success. Otherwise negative errno is returned.
>   */
> -__rte_experimental
>  int
>  rte_regexdev_start(uint8_t dev_id);
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Stop a RegEx device.
>   *
>   * Stop a RegEx device. The device can be restarted with a call to
> @@ -845,14 +814,10 @@ rte_regexdev_start(uint8_t dev_id);
>   * @return
>   *   0 on success. Otherwise negative errno is returned.
>   */
> -__rte_experimental
>  int
>  rte_regexdev_stop(uint8_t dev_id);
> 
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Close a RegEx device. The device cannot be restarted!
>   *
>   * @param dev_id
> @@ -861,7 +8

Re: [RFC PATCH v4 1/4] dts: code adjustments for sphinx

2023-10-22 Thread Yoan Picchi

On 8/31/23 11:04, Juraj Linkeš wrote:

sphinx-build only imports the Python modules when building the
documentation; it doesn't run DTS. This requires changes that make the
code importable without running it. This means:
* properly guarding argument parsing in the if __name__ == '__main__'
   block.
* the logger used by the DTS runner underwent the same treatment so that it
   doesn't create unnecessary log files.
* however, DTS uses the arguments to construct an object holding global
   variables. The defaults for the global variables needed to be moved
   out of argument parsing.
* importing the remote_session module from framework resulted in
   circular imports because of one module trying to import another.
   This is fixed by more granular imports.

Signed-off-by: Juraj Linkeš 
---
  dts/framework/config/__init__.py  |  3 -
  dts/framework/dts.py  | 34 ++-
  dts/framework/remote_session/__init__.py  | 41 -
  .../interactive_remote_session.py |  0
  .../{remote => }/interactive_shell.py |  0
  .../{remote => }/python_shell.py  |  0
  .../remote_session/remote/__init__.py | 27 --
  .../{remote => }/remote_session.py|  0
  .../{remote => }/ssh_session.py   |  0
  .../{remote => }/testpmd_shell.py |  0
  dts/framework/settings.py | 92 +++
  dts/framework/test_suite.py   |  3 +-
  dts/framework/testbed_model/__init__.py   | 12 +--
  dts/framework/testbed_model/common.py | 29 ++
  dts/framework/testbed_model/{hw => }/cpu.py   | 13 +++
  dts/framework/testbed_model/hw/__init__.py| 27 --
  .../linux_session.py  |  4 +-
  dts/framework/testbed_model/node.py   | 22 -
  .../os_session.py | 14 +--
  dts/framework/testbed_model/{hw => }/port.py  |  0
  .../posix_session.py  |  2 +-
  dts/framework/testbed_model/sut_node.py   |  8 +-
  dts/framework/testbed_model/tg_node.py| 30 +-
  .../traffic_generator/__init__.py | 24 +
  .../capturing_traffic_generator.py|  2 +-
  .../{ => traffic_generator}/scapy.py  | 17 +---
  .../traffic_generator.py  | 16 +++-
  .../testbed_model/{hw => }/virtual_device.py  |  0
  dts/framework/utils.py| 53 +--
  dts/main.py   |  3 +-
  30 files changed, 229 insertions(+), 247 deletions(-)
  rename dts/framework/remote_session/{remote => 
}/interactive_remote_session.py (100%)
  rename dts/framework/remote_session/{remote => }/interactive_shell.py (100%)
  rename dts/framework/remote_session/{remote => }/python_shell.py (100%)
  delete mode 100644 dts/framework/remote_session/remote/__init__.py
  rename dts/framework/remote_session/{remote => }/remote_session.py (100%)
  rename dts/framework/remote_session/{remote => }/ssh_session.py (100%)
  rename dts/framework/remote_session/{remote => }/testpmd_shell.py (100%)
  create mode 100644 dts/framework/testbed_model/common.py
  rename dts/framework/testbed_model/{hw => }/cpu.py (95%)
  delete mode 100644 dts/framework/testbed_model/hw/__init__.py
  rename dts/framework/{remote_session => testbed_model}/linux_session.py (98%)
  rename dts/framework/{remote_session => testbed_model}/os_session.py (97%)
  rename dts/framework/testbed_model/{hw => }/port.py (100%)
  rename dts/framework/{remote_session => testbed_model}/posix_session.py (99%)
  create mode 100644 dts/framework/testbed_model/traffic_generator/__init__.py
  rename dts/framework/testbed_model/{ => 
traffic_generator}/capturing_traffic_generator.py (99%)
  rename dts/framework/testbed_model/{ => traffic_generator}/scapy.py (96%)
  rename dts/framework/testbed_model/{ => 
traffic_generator}/traffic_generator.py (80%)
  rename dts/framework/testbed_model/{hw => }/virtual_device.py (100%)

diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py
index cb7e00ba34..5de8b54bcf 100644
--- a/dts/framework/config/__init__.py
+++ b/dts/framework/config/__init__.py
@@ -324,6 +324,3 @@ def load_config() -> Configuration:
  config: dict[str, Any] = warlock.model_factory(schema, 
name="_Config")(config_data)
  config_obj: Configuration = Configuration.from_dict(dict(config))
  return config_obj
-
-
-CONFIGURATION = load_config()
diff --git a/dts/framework/dts.py b/dts/framework/dts.py
index f773f0c38d..925a212210 100644
--- a/dts/framework/dts.py
+++ b/dts/framework/dts.py
@@ -3,22 +3,23 @@
  # Copyright(c) 2022-2023 PANTHEON.tech s.r.o.
  # Copyright(c) 2022-2023 University of New Hampshire
  
+import logging

  import sys
  
  from .config import (

-CONFIGURATION,
  BuildTargetConfiguration,
  ExecutionConfiguration,
  TestSuiteConfig,
+load_config,
  )
  from .exception import BlockingTestSuiteError
  f

Re: [PATCH] eal: fix modify data area after memset

2023-10-22 Thread Dmitry Kozlyuk
2023-09-22 16:12 (UTC+0800), Fengnan Chang:
> ping
> 
Fengnan Chang wrote on Tuesday, 12 September 2023 at 17:05:
> >
> > Let's look at this path:
> > malloc_elem_free  
> >->malloc_elem_join_adjacent_free
> >   ->join_elem(elem, elem->next)  
> >
> > 0. cur elem's pad > 0
> > 1. the data area is memset in malloc_elem_free first.
> > 2. the next elem is free, so try to join cur elem and next.
> > 3. in join_elem, try to modify inner->size; this address was
> > memset in step 1, so the write makes its content non-zero.
> >
> > If a user calls rte_zmalloc and picks this elem, it can't get
> > all-zeroed memory.

malloc_elem_join_adjacent_free() always calls memset() after join_elem(),
for the next and the previous element respectively.
How to reproduce this bug?
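
For readers following the sequence above, a minimal standalone sketch of the
write pattern being described (simplified; this is not the actual rte_malloc
element layout):

#include <stdint.h>
#include <string.h>

int
main(void)
{
	uint8_t region[64];
	uint64_t joined_size = 128;	/* stand-in for the size join_elem() stores */

	memset(region, 0, sizeof(region));	/* step 1: free path zeroes the data area */

	/* step 3: join_elem() updates the inner element's size field, which
	 * lies inside the range zeroed above. */
	memcpy(region + 32, &joined_size, sizeof(joined_size));

	return region[32] != 0;	/* true: the "zeroed" region is dirty again */
}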


RE: [PATCH] net/virtio: fix the setting of the vector for link state interrupt

2023-10-22 Thread Ma, WenwuX
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, October 20, 2023 5:01 PM
> To: Ma, WenwuX ; dev@dpdk.org
> Cc: chenbo@intel.com; Ling, WeiX ;
> sta...@dpdk.org
> Subject: Re: [PATCH] net/virtio: fix the setting of the vector for link state
> interrupt
> 
> Hi Wenwu,
> 
> Please reword the commit title to something:
> net/virtio: fix link state interrupt vector setting
> 
> On 8/7/23 05:15, Wenwu Ma wrote:
> > The setting of the vector for link state interrupts should be done
> > before the initialization of the device is completed.
> >
> > Fixes: ee85024cf5f7 ("net/virtio: complete init stage at the right
> > place")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Wenwu Ma 
> > ---
> >   drivers/net/virtio/virtio_ethdev.c | 16 
> >   1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > b/drivers/net/virtio/virtio_ethdev.c
> > index 2c23f1c00e..1801b0ae47 100644
> > --- a/drivers/net/virtio/virtio_ethdev.c
> > +++ b/drivers/net/virtio/virtio_ethdev.c
> > @@ -1912,6 +1912,14 @@ virtio_init_device(struct rte_eth_dev *eth_dev,
> uint64_t req_features)
> > }
> > }
> >
> > +   if (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
> > +   /* Enable vector (0) for Link State Interrupt */
> > +   if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) ==
> > +   VIRTIO_MSI_NO_VECTOR) {
> > +   PMD_DRV_LOG(ERR, "failed to set config vector");
> > +   return -EBUSY;
> > +   }
> > +
> > virtio_reinit_complete(hw);
> >
> > return 0;
> > @@ -2237,14 +2245,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
> > hw->has_tx_offload = tx_offload_enabled(hw);
> > hw->has_rx_offload = rx_offload_enabled(hw);
> >
> > -   if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
> > -   /* Enable vector (0) for Link State Interrupt */
> > -   if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) ==
> > -   VIRTIO_MSI_NO_VECTOR) {
> > -   PMD_DRV_LOG(ERR, "failed to set config vector");
> > -   return -EBUSY;
> > -   }
> > -
> > if (virtio_with_packed_queue(hw)) {
> >   #if defined(RTE_ARCH_X86_64) && defined(CC_AVX512_SUPPORT)
> > if ((hw->use_vec_rx || hw->use_vec_tx) &&
> 
> It looks good to me, so I can change the title myself while applying, if that's
> OK for you.
> 
I will submit a new patch with your reworded title.

> Reviewed-by: Maxime Coquelin 
> 
> By the way, can you tell me with which backends have you tested it with?
> Only Virtio-PCI? Or also Virtio-user?
> 
Test steps:

1. Bind one NIC port to the vfio-pci driver:

dpdk-devbind.py --force --bind=vfio-pci 0000:4b:00.0

2. Start dpdk-testpmd as the back-end:

x86_64-native-linuxapp-gcc/app/dpdk-testpmd  -l 1-5 -n 8 -a 0000:4b:00.0  \
--vdev net_vhost0,iface=/root/dpdk/vhost-net,queues=4\
-- -i --nb-cores=4 --rxq=4 --txq=4 --rss-ip
testpmd> start

3. Start the VM with QEMU-8.0.0 as the front-end:

taskset -c 20,21,22,23,24,25,26,27 /home/QEMU/qemu-8.0.0/bin/qemu-system-x86_64 
 -name vm0 -enable-kvm -pidfile /tmp/.vm0.pid \
-daemonize -monitor unix:/tmp/vm0_monitor.sock,server,nowait -netdev 
user,id=nttsip1,hostfwd=tcp:10.239.252.245:6000-:22 -device 
e1000,netdev=nttsip1  \
-cpu host -smp 4 -m 8192 -object 
memory-backend-file,id=mem,size=8192M,mem-path=/dev/hugepages,share=on -numa 
node,memdev=mem -mem-prealloc \
-chardev socket,path=/tmp/vm0_qga0.sock,server,nowait,id=vm0_qga0 -device 
virtio-serial -device 
virtserialport,chardev=vm0_qga0,name=org.qemu.guest_agent.0 -vnc :4 \
-drive file=/home/image/ubuntu2004.img -chardev 
socket,id=char0,path=/root/dpdk/vhost-net -netdev 
type=vhost-user,id=netdev0,chardev=char0,vhostforce,queues=4 \
-device 
virtio-net-pci,netdev=netdev0,mac=00:11:22:33:44:55,disable-modern=true,mrg_rxbuf=on,csum=on,mq=on,vectors=10

4. SSH into the VM, build the dpdk-l3fwd-power app, and then start it:

CC=gcc meson -Denable_kmods=True -Dlibdir=lib  --default-library=static 
x86_64-native-linuxapp-gcc
ninja -C x86_64-native-linuxapp-gcc
meson configure -Dexamples=l3fwd-power x86_64-native-linuxapp-gcc
ninja -C x86_64-native-linuxapp-gcc

echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
dpdk-devbind.py -b vfio-pci 0000:00:05.0

./x86_64-native-linuxapp-gcc/examples/dpdk-l3fwd-power  -c 0xf -n 4 
--log-level='user1,7' -- -p 1 -P --config '(0,0,0),(0,1,1),(0,2,2),(0,3,3)' 
--no-numa  --parse-ptype --interrupt-only

The VM will crash when starting the dpdk-l3fwd-power app in the VM with 
QEMU-8.0.0; it works well when the VM is started with QEMU versions older than 
8.0.0.

> Thanks,
> Maxime



[PATCH v2] net/virtio: fix link state interrupt vector setting

2023-10-22 Thread Wenwu Ma
The setting of the vector for link state interrupts
should be done before the initialization of the device
is completed.

Fixes: ee85024cf5f7 ("net/virtio: complete init stage at the right place")
Cc: sta...@dpdk.org

Signed-off-by: Wenwu Ma 
Tested-by: Wei Ling 
Reviewed-by: Maxime Coquelin 
---
v2:
 - rewording of the title

---
 drivers/net/virtio/virtio_ethdev.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 3ab56ef769..c2c0a1a111 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1912,6 +1912,14 @@ virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t 
req_features)
}
}
 
+   if (eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
+   /* Enable vector (0) for Link State Interrupt */
+   if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) ==
+   VIRTIO_MSI_NO_VECTOR) {
+   PMD_DRV_LOG(ERR, "failed to set config vector");
+   return -EBUSY;
+   }
+
virtio_reinit_complete(hw);
 
return 0;
@@ -2237,14 +2245,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
hw->has_tx_offload = tx_offload_enabled(hw);
hw->has_rx_offload = rx_offload_enabled(hw);
 
-   if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
-   /* Enable vector (0) for Link State Interrupt */
-   if (VIRTIO_OPS(hw)->set_config_irq(hw, 0) ==
-   VIRTIO_MSI_NO_VECTOR) {
-   PMD_DRV_LOG(ERR, "failed to set config vector");
-   return -EBUSY;
-   }
-
if (virtio_with_packed_queue(hw)) {
 #if defined(RTE_ARCH_X86_64) && defined(CC_AVX512_SUPPORT)
if ((hw->use_vec_rx || hw->use_vec_tx) &&
-- 
2.34.1



RE: [PATCH v2 01/14] eal: make bitops a stable API

2023-10-22 Thread Joyce Kong
> -Original Message-
> From: Stephen Hemminger 
> Sent: Saturday, October 21, 2023 5:41 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger ; Cristian
> Dumitrescu ; Joyce Kong
> 
> Subject: [PATCH v2 01/14] eal: make bitops a stable API
>
> These were added in the 20.05 release.
>
> Signed-off-by: Stephen Hemminger 

Reviewed-by: Joyce Kong 

> ---
>  lib/eal/include/rte_bitmap.h |  8 --------
>  lib/eal/include/rte_bitops.h | 40 ----------------
>  2 files changed, 48 deletions(-)
>
> diff --git a/lib/eal/include/rte_bitmap.h b/lib/eal/include/rte_bitmap.h index
> 46a822768d50..ec819595624c 100644
> --- a/lib/eal/include/rte_bitmap.h
> +++ b/lib/eal/include/rte_bitmap.h
> @@ -203,9 +203,6 @@ rte_bitmap_init(uint32_t n_bits, uint8_t *mem, uint32_t mem_size)
>  }
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Bitmap clear slab overhead bits.
>   *
>   * @param slabs
> @@ -215,7 +212,6 @@ rte_bitmap_init(uint32_t n_bits, uint8_t *mem, uint32_t mem_size)
>   * @param pos
>   *   The start bit position in the slabs to be cleared.
>   */
> -__rte_experimental
>  static inline void
>  __rte_bitmap_clear_slab_overhead_bits(uint64_t *slabs, uint32_t slab_size,
> uint32_t pos)
> @@ -235,9 +231,6 @@ __rte_bitmap_clear_slab_overhead_bits(uint64_t *slabs, uint32_t slab_size,
>  }
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice.
> - *
>   * Bitmap initialization with all bits set
>   *
>   * @param n_bits
> @@ -249,7 +242,6 @@ __rte_bitmap_clear_slab_overhead_bits(uint64_t *slabs, uint32_t slab_size,
>   * @return
>   *   Handle to bitmap instance.
>   */
> -__rte_experimental
>  static inline struct rte_bitmap *
>  rte_bitmap_init_with_all_set(uint32_t n_bits, uint8_t *mem, uint32_t
> mem_size)  { diff --git a/lib/eal/include/rte_bitops.h
> b/lib/eal/include/rte_bitops.h index 6b8ae8d3acf6..29d24b3a780e 100644
> --- a/lib/eal/include/rte_bitops.h
> +++ b/lib/eal/include/rte_bitops.h
> @@ -42,9 +42,6 @@ extern "C" {
>  /* 32-bit relaxed operations 
> */
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> - *
>   * Get the target bit from a 32-bit value without memory ordering.
>   *
>   * @param nr
> @@ -54,7 +51,6 @@ extern "C" {
>   * @return
>   *   The target bit.
>   */
> -__rte_experimental
>  static inline uint32_t
>  rte_bit_relaxed_get32(unsigned int nr, volatile uint32_t *addr)
>  {
> @@ -65,9 +61,6 @@ rte_bit_relaxed_get32(unsigned int nr, volatile uint32_t *addr)
>  }
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> - *
>   * Set the target bit in a 32-bit value to 1 without memory ordering.
>   *
>   * @param nr
> @@ -75,7 +68,6 @@ rte_bit_relaxed_get32(unsigned int nr, volatile uint32_t
> *addr)
>   * @param addr
>   *   The address holding the bit.
>   */
> -__rte_experimental
>  static inline void
>  rte_bit_relaxed_set32(unsigned int nr, volatile uint32_t *addr)
>  {
> @@ -86,9 +78,6 @@ rte_bit_relaxed_set32(unsigned int nr, volatile uint32_t *addr)
>  }
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> - *
>   * Clear the target bit in a 32-bit value to 0 without memory ordering.
>   *
>   * @param nr
> @@ -96,7 +85,6 @@ rte_bit_relaxed_set32(unsigned int nr, volatile uint32_t
> *addr)
>   * @param addr
>   *   The address holding the bit.
>   */
> -__rte_experimental
>  static inline void
>  rte_bit_relaxed_clear32(unsigned int nr, volatile uint32_t *addr)
>  {
> @@ -107,9 +95,6 @@ rte_bit_relaxed_clear32(unsigned int nr, volatile uint32_t *addr)
>  }
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> - *
>   * Return the original bit from a 32-bit value, then set it to 1 without
>   * memory ordering.
>   *
> @@ -120,7 +105,6 @@ rte_bit_relaxed_clear32(unsigned int nr, volatile
> uint32_t *addr)
>   * @return
>   *   The original bit.
>   */
> -__rte_experimental
>  static inline uint32_t
>  rte_bit_relaxed_test_and_set32(unsigned int nr, volatile uint32_t *addr)
>  {
> @@ -133,9 +117,6 @@ rte_bit_relaxed_test_and_set32(unsigned int nr, volatile uint32_t *addr)
>  }
>
>  /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice
> - *
>   * Return the original bit from a 32-bit value, then clear it to 0 without
>   * memory ordering.
>   *
> @@ -146,7 +127,6 @@ rte_bit_relaxed_test_and_set32(unsigned int nr,
> volatile uint32_t *addr)
>   * @return
>   *   The original bit.
>   */
> -__rte_experimental
>  static inline uint32_t
>  rte_bit_relaxed_test_and_clear32(unsigned int nr, volatile uint32_t *addr)
>  {
> @@ -161,9 +141,6 @@ rte_bit_relaxed_test_and_clear32(unsigned int nr, volatile uint32_t *addr)
>  /* 64-bit rel

[PATCH] app/testpmd: support config all offload

2023-10-22 Thread Chengwen Feng
Extend support for configuring all offloads in the following commands:
1. port config 0 rx_offload all on/off
2. port config 0 tx_offload all on/off
3. port 0 rxq 0 rx_offload all on/off
4. port 0 txq 0 tx_offload all on/off

Signed-off-by: Chengwen Feng 
---
 app/test-pmd/cmdline.c  | 112 +++-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   8 +-
 2 files changed, 68 insertions(+), 52 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 679ca47b94..35f5e4bbc0 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -763,7 +763,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"port config (port_id) udp_tunnel_port add|rm 
vxlan|geneve|ecpri (udp_port)\n\n"
"Add/remove UDP tunnel port for tunneling 
offload\n\n"
 
-   "port config  rx_offload vlan_strip|"
+   "port config  rx_offload all|vlan_strip|"
"ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
"outer_ipv4_cksum|macsec_strip|"
"vlan_filter|vlan_extend|scatter|"
@@ -771,7 +771,7 @@ static void cmd_help_long_parsed(void *parsed_result,
" Enable or disable a per port Rx offloading"
" on all Rx queues of a port\n\n"
 
-   "port (port_id) rxq (queue_id) rx_offload vlan_strip|"
+   "port (port_id) rxq (queue_id) rx_offload 
all|vlan_strip|"
"ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
"outer_ipv4_cksum|macsec_strip|"
"vlan_filter|vlan_extend|scatter|"
@@ -779,7 +779,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"Enable or disable a per queue Rx offloading"
" only on a specific Rx queue\n\n"
 
-   "port config (port_id) tx_offload vlan_insert|"
+   "port config (port_id) tx_offload all|vlan_insert|"
"ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|"
"udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|"
"gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|"
@@ -788,7 +788,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"Enable or disable a per port Tx offloading"
" on all Tx queues of a port\n\n"
 
-   "port (port_id) txq (queue_id) tx_offload vlan_insert|"
+   "port (port_id) txq (queue_id) tx_offload 
all|vlan_insert|"
"ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|"
"udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|"
"gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|macsec_insert"
@@ -2,7 +2,7 @@ static cmdline_parse_token_string_t 
cmd_config_per_port_rx_offload_result_rx_off
 static cmdline_parse_token_string_t 
cmd_config_per_port_rx_offload_result_offload =
TOKEN_STRING_INITIALIZER
(struct cmd_config_per_port_rx_offload_result,
-offload, "vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#"
+offload, 
"all#vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#"
   "qinq_strip#outer_ipv4_cksum#macsec_strip#"
   "vlan_filter#vlan_extend#"
   "scatter#buffer_split#timestamp#security#"
@@ -11155,8 +11155,8 @@ cmd_config_per_port_rx_offload_parsed(void 
*parsed_result,
portid_t port_id = res->port_id;
struct rte_eth_dev_info dev_info;
struct rte_port *port = &ports[port_id];
-   uint64_t single_offload;
uint16_t nb_rx_queues;
+   uint64_t offload;
int q;
int ret;
 
@@ -11167,25 +11167,29 @@ cmd_config_per_port_rx_offload_parsed(void 
*parsed_result,
return;
}
 
-   single_offload = search_rx_offload(res->offload);
-   if (single_offload == 0) {
-   fprintf(stderr, "Unknown offload name: %s\n", res->offload);
-   return;
-   }
-
ret = eth_dev_info_get_print_err(port_id, &dev_info);
if (ret != 0)
return;
 
+   if (!strcmp(res->offload, "all")) {
+   offload = dev_info.rx_offload_capa;
+   } else {
+   offload = search_rx_offload(res->offload);
+   if (offload == 0) {
+   fprintf(stderr, "Unknown offload name: %s\n", 
res->offload);
+   return;
+   }
+   }
+
nb_rx_queues = dev_info.nb_rx_queues;
if (!strcmp(res->on_off, "on")) {
-   port->dev_conf.rxmode.offloads |= single_offload;
+   port->dev_conf.rxmode.offloads |= offload;
for (q = 0; q < nb_rx_queues; q++)
-   port->

Re: [PATCH] app/testpmd: support config all offload

2023-10-22 Thread fengchengwen
Add Cc to testpmd's maintainer, since the tooling failed to add it.

On 2023/10/23 10:29, Chengwen Feng wrote:
> Extend support for configuring all offloads in the following commands:
> 1. port config 0 rx_offload all on/off
> 2. port config 0 tx_offload all on/off
> 3. port 0 rxq 0 rx_offload all on/off
> 4. port 0 txq 0 tx_offload all on/off
> 
> Signed-off-by: Chengwen Feng 
> ---
>  app/test-pmd/cmdline.c  | 112 +++-
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |   8 +-
>  2 files changed, 68 insertions(+), 52 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 679ca47b94..35f5e4bbc0 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -763,7 +763,7 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "port config (port_id) udp_tunnel_port add|rm 
> vxlan|geneve|ecpri (udp_port)\n\n"
>   "Add/remove UDP tunnel port for tunneling 
> offload\n\n"
>  
> - "port config  rx_offload vlan_strip|"
> + "port config  rx_offload all|vlan_strip|"
>   "ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
>   "outer_ipv4_cksum|macsec_strip|"
>   "vlan_filter|vlan_extend|scatter|"
> @@ -771,7 +771,7 @@ static void cmd_help_long_parsed(void *parsed_result,
>   " Enable or disable a per port Rx offloading"
>   " on all Rx queues of a port\n\n"
>  
> - "port (port_id) rxq (queue_id) rx_offload vlan_strip|"
> + "port (port_id) rxq (queue_id) rx_offload 
> all|vlan_strip|"
>   "ipv4_cksum|udp_cksum|tcp_cksum|tcp_lro|qinq_strip|"
>   "outer_ipv4_cksum|macsec_strip|"
>   "vlan_filter|vlan_extend|scatter|"
> @@ -779,7 +779,7 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "Enable or disable a per queue Rx offloading"
>   " only on a specific Rx queue\n\n"
>  
> - "port config (port_id) tx_offload vlan_insert|"
> + "port config (port_id) tx_offload all|vlan_insert|"
>   "ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|"
>   "udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|"
>   "gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|"
> @@ -788,7 +788,7 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "Enable or disable a per port Tx offloading"
>   " on all Tx queues of a port\n\n"
>  
> - "port (port_id) txq (queue_id) tx_offload vlan_insert|"
> + "port (port_id) txq (queue_id) tx_offload 
> all|vlan_insert|"
>   "ipv4_cksum|udp_cksum|tcp_cksum|sctp_cksum|tcp_tso|"
>   "udp_tso|outer_ipv4_cksum|qinq_insert|vxlan_tnl_tso|"
>   "gre_tnl_tso|ipip_tnl_tso|geneve_tnl_tso|macsec_insert"
> @@ -2,7 +2,7 @@ static cmdline_parse_token_string_t 
> cmd_config_per_port_rx_offload_result_rx_off
>  static cmdline_parse_token_string_t 
> cmd_config_per_port_rx_offload_result_offload =
>   TOKEN_STRING_INITIALIZER
>   (struct cmd_config_per_port_rx_offload_result,
> -  offload, "vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#"
> +  offload, 
> "all#vlan_strip#ipv4_cksum#udp_cksum#tcp_cksum#tcp_lro#"
>  "qinq_strip#outer_ipv4_cksum#macsec_strip#"
>  "vlan_filter#vlan_extend#"
>  "scatter#buffer_split#timestamp#security#"
> @@ -11155,8 +11155,8 @@ cmd_config_per_port_rx_offload_parsed(void 
> *parsed_result,
>   portid_t port_id = res->port_id;
>   struct rte_eth_dev_info dev_info;
>   struct rte_port *port = &ports[port_id];
> - uint64_t single_offload;
>   uint16_t nb_rx_queues;
> + uint64_t offload;
>   int q;
>   int ret;
>  
> @@ -11167,25 +11167,29 @@ cmd_config_per_port_rx_offload_parsed(void 
> *parsed_result,
>   return;
>   }
>  
> - single_offload = search_rx_offload(res->offload);
> - if (single_offload == 0) {
> - fprintf(stderr, "Unknown offload name: %s\n", res->offload);
> - return;
> - }
> -
>   ret = eth_dev_info_get_print_err(port_id, &dev_info);
>   if (ret != 0)
>   return;
>  
> + if (!strcmp(res->offload, "all")) {
> + offload = dev_info.rx_offload_capa;
> + } else {
> + offload = search_rx_offload(res->offload);
> + if (offload == 0) {
> + fprintf(stderr, "Unknown offload name: %s\n", 
> res->offload);
> + return;
> + }
> + }
> +
>   nb_rx_queues = dev_info.nb_rx_queues;
>   if (!strcmp(res->on_off, "on")) {
> - port-

RE: [PATCH v2 10/14] eal: mark rte_atomic128_cmp_exchange as stable

2023-10-22 Thread Ruifeng Wang
> -Original Message-
> From: Stephen Hemminger 
> Sent: Saturday, October 21, 2023 5:41 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger ; Ruifeng Wang 
> ;
> Bruce Richardson ; Konstantin Ananyev
> 
> Subject: [PATCH v2 10/14] eal: mark rte_atomic128_cmp_exchange as stable
> 
> This has been around since 2021.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/eal/arm/include/rte_atomic_64.h  | 1 -  
> lib/eal/include/generic/rte_atomic.h | 1 -
> lib/eal/x86/include/rte_atomic_64.h  | 1 -
>  3 files changed, 3 deletions(-)
> 
> diff --git a/lib/eal/arm/include/rte_atomic_64.h 
> b/lib/eal/arm/include/rte_atomic_64.h
> index 75d8ba6092cc..96205e6ad372 100644
> --- a/lib/eal/arm/include/rte_atomic_64.h
> +++ b/lib/eal/arm/include/rte_atomic_64.h
> @@ -94,7 +94,6 @@ __ATOMIC128_CAS_OP(__cas_128_acq_rel, "caspal")
> 
>  #endif
> 
> -__rte_experimental
>  static inline int
>  rte_atomic128_cmp_exchange(rte_int128_t *dst, rte_int128_t *exp,
>   const rte_int128_t *src, unsigned int weak, int success,
> diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
> index db6797e379f3..048b547ea62b 100644
> --- a/lib/eal/include/generic/rte_atomic.h
> +++ b/lib/eal/include/generic/rte_atomic.h
> @@ -1147,7 +1147,6 @@ typedef struct {
>   * @return
>   *   Non-zero on success; 0 on failure.
>   */
> -__rte_experimental
>  static inline int
>  rte_atomic128_cmp_exchange(rte_int128_t *dst,
>  rte_int128_t *exp,
> diff --git a/lib/eal/x86/include/rte_atomic_64.h 
> b/lib/eal/x86/include/rte_atomic_64.h
> index 0edee8627224..e968bbf0ce65 100644
> --- a/lib/eal/x86/include/rte_atomic_64.h
> +++ b/lib/eal/x86/include/rte_atomic_64.h
> @@ -182,7 +182,6 @@ static inline void rte_atomic64_clear(rte_atomic64_t *v)
> 
>  /* 128 bit atomic operations 
> -*/
> 
> -__rte_experimental
>  static inline int
>  rte_atomic128_cmp_exchange(rte_int128_t *dst,
>  rte_int128_t *exp,
> --
> 2.39.2

Acked-by: Ruifeng Wang 



[PATCH] eal: support lcore usage ratio

2023-10-22 Thread Chengwen Feng
Currently, the lcore usage displays only two key fields: busy_cycles and
total_cycles, which makes it inconvenient to obtain the usage ratio
immediately. So add a lcore usage ratio field.

Signed-off-by: Chengwen Feng 
---
 lib/eal/common/eal_common_lcore.c | 34 ---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/lib/eal/common/eal_common_lcore.c 
b/lib/eal/common/eal_common_lcore.c
index ceda714ca5..d1d0da2dd0 100644
--- a/lib/eal/common/eal_common_lcore.c
+++ b/lib/eal/common/eal_common_lcore.c
@@ -446,6 +446,12 @@ rte_lcore_register_usage_cb(rte_lcore_usage_cb cb)
lcore_usage_cb = cb;
 }
 
+static float
+calc_usage_ratio(const struct rte_lcore_usage *usage)
+{
+   return (usage->busy_cycles * 100.0) / (usage->total_cycles == 0 ? 1 : 
usage->total_cycles);
+}
+
 static int
 lcore_dump_cb(unsigned int lcore_id, void *arg)
 {
@@ -462,8 +468,9 @@ lcore_dump_cb(unsigned int lcore_id, void *arg)
/* Guard against concurrent modification of lcore_usage_cb. */
usage_cb = lcore_usage_cb;
if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
-   if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64,
-   usage.busy_cycles, usage.total_cycles) < 0) {
+   if (asprintf(&usage_str, ", busy cycles %"PRIu64"/%"PRIu64" 
(ratio %.3f%%)",
+   usage.busy_cycles, usage.total_cycles,
+   calc_usage_ratio(&usage)) < 0) {
return -ENOMEM;
}
}
@@ -511,11 +518,19 @@ struct lcore_telemetry_info {
struct rte_tel_data *d;
 };
 
+static void
+format_usage_ratio(char *buf, uint16_t size, const struct rte_lcore_usage 
*usage)
+{
+   float ratio = calc_usage_ratio(usage);
+   snprintf(buf, size, "%.3f%%", ratio);
+}
+
 static int
 lcore_telemetry_info_cb(unsigned int lcore_id, void *arg)
 {
struct rte_config *cfg = rte_eal_get_configuration();
struct lcore_telemetry_info *info = arg;
+   char ratio_str[RTE_TEL_MAX_STRING_LEN];
struct rte_lcore_usage usage;
struct rte_tel_data *cpuset;
rte_lcore_usage_cb usage_cb;
@@ -544,6 +559,8 @@ lcore_telemetry_info_cb(unsigned int lcore_id, void *arg)
if (usage_cb != NULL && usage_cb(lcore_id, &usage) == 0) {
rte_tel_data_add_dict_uint(info->d, "total_cycles", 
usage.total_cycles);
rte_tel_data_add_dict_uint(info->d, "busy_cycles", 
usage.busy_cycles);
+   format_usage_ratio(ratio_str, sizeof(ratio_str), &usage);
+   rte_tel_data_add_dict_string(info->d, "usage_ratio", ratio_str);
}
 
return 0;
@@ -574,11 +591,13 @@ struct lcore_telemetry_usage {
struct rte_tel_data *lcore_ids;
struct rte_tel_data *total_cycles;
struct rte_tel_data *busy_cycles;
+   struct rte_tel_data *usage_ratio;
 };
 
 static int
 lcore_telemetry_usage_cb(unsigned int lcore_id, void *arg)
 {
+   char ratio_str[RTE_TEL_MAX_STRING_LEN];
struct lcore_telemetry_usage *u = arg;
struct rte_lcore_usage usage;
rte_lcore_usage_cb usage_cb;
@@ -591,6 +610,8 @@ lcore_telemetry_usage_cb(unsigned int lcore_id, void *arg)
rte_tel_data_add_array_uint(u->lcore_ids, lcore_id);
rte_tel_data_add_array_uint(u->total_cycles, 
usage.total_cycles);
rte_tel_data_add_array_uint(u->busy_cycles, usage.busy_cycles);
+   format_usage_ratio(ratio_str, sizeof(ratio_str), &usage);
+   rte_tel_data_add_array_string(u->usage_ratio, ratio_str);
}
 
return 0;
@@ -603,15 +624,19 @@ handle_lcore_usage(const char *cmd __rte_unused, const 
char *params __rte_unused
struct lcore_telemetry_usage usage;
struct rte_tel_data *total_cycles;
struct rte_tel_data *busy_cycles;
+   struct rte_tel_data *usage_ratio;
struct rte_tel_data *lcore_ids;
 
lcore_ids = rte_tel_data_alloc();
total_cycles = rte_tel_data_alloc();
busy_cycles = rte_tel_data_alloc();
-   if (lcore_ids == NULL || total_cycles == NULL || busy_cycles == NULL) {
+   usage_ratio = rte_tel_data_alloc();
+   if (lcore_ids == NULL || total_cycles == NULL || busy_cycles == NULL ||
+   usage_ratio == NULL) {
rte_tel_data_free(lcore_ids);
rte_tel_data_free(total_cycles);
rte_tel_data_free(busy_cycles);
+   rte_tel_data_free(usage_ratio);
return -ENOMEM;
}
 
@@ -619,12 +644,15 @@ handle_lcore_usage(const char *cmd __rte_unused, const 
char *params __rte_unused
rte_tel_data_start_array(lcore_ids, RTE_TEL_UINT_VAL);
rte_tel_data_start_array(total_cycles, RTE_TEL_UINT_VAL);
rte_tel_data_start_array(busy_cycles, RTE_TEL_UINT_VAL);
+   rte_tel_data_start_array(usage_ratio, RTE_TEL_STRING_VAL);
rte_tel_data_add_dict_cont
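
The ratio added above is plain percentage arithmetic with a divide-by-zero
guard; a standalone sketch of the same formula (mirroring the patch's
calc_usage_ratio(), illustrative only):

#include <stdint.h>
#include <stdio.h>

/* Percentage of busy cycles, guarding against total_cycles == 0. */
static float
usage_ratio(uint64_t busy_cycles, uint64_t total_cycles)
{
	return (busy_cycles * 100.0) / (total_cycles == 0 ? 1 : total_cycles);
}

int
main(void)
{
	printf("%.3f%%\n", usage_ratio(25000000, 100000000));	/* prints 25.000% */
	return 0;
}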

[PATCH] maintainers: volunteer to maintain power library

2023-10-22 Thread Sivaprasad Tummala
Add co-maintainer for power library.

Signed-off-by: Sivaprasad Tummala 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4083658697..d4d7546eb6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1668,6 +1668,7 @@ F: lib/pci/
 Power management
 M: Anatoly Burakov 
 M: David Hunt 
+M: Sivaprasad Tummala 
 F: lib/power/
 F: doc/guides/prog_guide/power_man.rst
 F: app/test/test_power*
-- 
2.34.1



[PATCH v8 01/34] ml/cnxk: drop support for register polling

2023-10-22 Thread Srikanth Yalavarthi
Dropped support for the device argument "poll_mem" for the cnxk
ML driver. Support for using registers for polling is removed,
and DDR addresses will be used for polling instead.

Signed-off-by: Srikanth Yalavarthi 
---
 doc/guides/mldevs/cnxk.rst |  16 -
 drivers/ml/cnxk/cn10k_ml_dev.c |  36 +--
 drivers/ml/cnxk/cn10k_ml_dev.h |  13 +---
 drivers/ml/cnxk/cn10k_ml_ops.c | 111 -
 drivers/ml/cnxk/cn10k_ml_ops.h |   6 --
 5 files changed, 18 insertions(+), 164 deletions(-)

diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst
index b79bc540d9..1834b1f905 100644
--- a/doc/guides/mldevs/cnxk.rst
+++ b/doc/guides/mldevs/cnxk.rst
@@ -180,22 +180,6 @@ Runtime Config Options
   in the fast path enqueue burst operation.
 
 
-**Polling memory location** (default ``ddr``)
-
-  ML cnxk driver provides the option to select the memory location to be used
-  for polling to check the inference request completion.
-  Driver supports using either the DDR address space (``ddr``)
-  or ML registers (``register``) as polling locations.
-  The parameter ``poll_mem`` is used to specify the poll location.
-
-  For example::
-
- -a 0000:00:10.0,poll_mem="register"
-
-  With the above configuration, ML cnxk driver is configured to use ML 
registers
-  for polling in fastpath requests.
-
-
 Debugging Options
 -
 
diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c
index 983138a7f2..e3c2badcef 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.c
+++ b/drivers/ml/cnxk/cn10k_ml_dev.c
@@ -23,7 +23,6 @@
 #define CN10K_ML_DEV_CACHE_MODEL_DATA  "cache_model_data"
 #define CN10K_ML_OCM_ALLOC_MODE"ocm_alloc_mode"
 #define CN10K_ML_DEV_HW_QUEUE_LOCK "hw_queue_lock"
-#define CN10K_ML_FW_POLL_MEM   "poll_mem"
 #define CN10K_ML_OCM_PAGE_SIZE "ocm_page_size"
 
 #define CN10K_ML_FW_PATH_DEFAULT   "/lib/firmware/mlip-fw.bin"
@@ -32,7 +31,6 @@
 #define CN10K_ML_DEV_CACHE_MODEL_DATA_DEFAULT  1
 #define CN10K_ML_OCM_ALLOC_MODE_DEFAULT"lowest"
 #define CN10K_ML_DEV_HW_QUEUE_LOCK_DEFAULT 1
-#define CN10K_ML_FW_POLL_MEM_DEFAULT   "ddr"
 #define CN10K_ML_OCM_PAGE_SIZE_DEFAULT 16384
 
 /* ML firmware macros */
@@ -54,7 +52,6 @@ static const char *const valid_args[] = {CN10K_ML_FW_PATH,
 CN10K_ML_DEV_CACHE_MODEL_DATA,
 CN10K_ML_OCM_ALLOC_MODE,
 CN10K_ML_DEV_HW_QUEUE_LOCK,
-CN10K_ML_FW_POLL_MEM,
 CN10K_ML_OCM_PAGE_SIZE,
 NULL};
 
@@ -103,9 +100,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, 
struct cn10k_ml_dev *mlde
bool hw_queue_lock_set = false;
bool ocm_page_size_set = false;
char *ocm_alloc_mode = NULL;
-   bool poll_mem_set = false;
bool fw_path_set = false;
-   char *poll_mem = NULL;
char *fw_path = NULL;
int ret = 0;
bool found;
@@ -189,17 +184,6 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, 
struct cn10k_ml_dev *mlde
hw_queue_lock_set = true;
}
 
-   if (rte_kvargs_count(kvlist, CN10K_ML_FW_POLL_MEM) == 1) {
-   ret = rte_kvargs_process(kvlist, CN10K_ML_FW_POLL_MEM, 
&parse_string_arg,
-&poll_mem);
-   if (ret < 0) {
-   plt_err("Error processing arguments, key = %s\n", 
CN10K_ML_FW_POLL_MEM);
-   ret = -EINVAL;
-   goto exit;
-   }
-   poll_mem_set = true;
-   }
-
if (rte_kvargs_count(kvlist, CN10K_ML_OCM_PAGE_SIZE) == 1) {
ret = rte_kvargs_process(kvlist, CN10K_ML_OCM_PAGE_SIZE, 
&parse_integer_arg,
 &mldev->ocm_page_size);
@@ -280,18 +264,6 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, 
struct cn10k_ml_dev *mlde
}
plt_info("ML: %s = %d", CN10K_ML_DEV_HW_QUEUE_LOCK, 
mldev->hw_queue_lock);
 
-   if (!poll_mem_set) {
-   mldev->fw.poll_mem = CN10K_ML_FW_POLL_MEM_DEFAULT;
-   } else {
-   if (!((strcmp(poll_mem, "ddr") == 0) || (strcmp(poll_mem, 
"register") == 0))) {
-   plt_err("Invalid argument, %s = %s\n", 
CN10K_ML_FW_POLL_MEM, poll_mem);
-   ret = -EINVAL;
-   goto exit;
-   }
-   mldev->fw.poll_mem = poll_mem;
-   }
-   plt_info("ML: %s = %s", CN10K_ML_FW_POLL_MEM, mldev->fw.poll_mem);
-
if (!ocm_page_size_set) {
mldev->ocm_page_size = CN10K_ML_OCM_PAGE_SIZE_DEFAULT;
} else {
@@ -450,10 +422,7 @@ cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw)
if (fw->report_dpe_warnings)
flags = flags | FW_REPORT_DPE_WARNING_BITMASK;

[PATCH v8 00/34] Implementation of revised ml/cnxk driver

2023-10-22 Thread Srikanth Yalavarthi
This patch series is an implementation of the revised ml/cnxk driver
to support models compiled with the TVM compiler framework. TVM models
use a hybrid mode for execution, with regions of the model executing
on the ML accelerator and the rest executing on CPU cores.

This series of commits reorganizes the ml/cnxk driver and adds support
to execute multiple regions within a TVM model.

v8:
  - Updated CMake dependency resolution of external dependencies
  - Updated mldevs/cnxk documentation
  - Updated meson config files for cn9k and cn10k to include cmake

v7:
  - Updated steps to build dependencies in cnxk mldev documentation
  - Replace str functions with rte_str functions
  - Drop use of rte_exit in ml/cnxk driver

v6:
  - Added depends info for series. This series depends on patch-132887
  - Fix merge conflicts with dpdk-23.11-rc1
  - Fix issues with ml/cnxk driver release notes
  - Added build dependency information for dlpack headers

v5:
  - Fix build failures for individual patches in the series
  - Finished build testing with devtools/test-meson-builds.sh script

v4:
  - Squashed release notes
  - Updated external build dependency info in documentation

v3:
  - Reduced use of RTE_MLDEV_CNXK_ENABLE_MVTVM macro
  - Added stubs file with dummy functions to use when TVM is disabled
  - Dropped patch with internal function to read firmware
  - Updated ML CNXK PMD documentation
  - Added external library dependency info in documentation
  - Added release notes for 23.11

v2:
  - Fix xstats reporting
  - Fix issues reported by klocwork static analysis tool
  - Update external header inclusions

v1:
  - Initial changes

Anup Prabhu (2):
  ml/cnxk: enable OCM check for multilayer TVM model
  ml/cnxk: enable fast-path ops for TVM models

Prince Takkar (2):
  ml/cnxk: update internal TVM model info structure
  ml/cnxk: support quantize and dequantize callback

Srikanth Yalavarthi (30):
  ml/cnxk: drop support for register polling
  ml/cnxk: add generic cnxk device structure
  ml/cnxk: add generic model and layer structures
  ml/cnxk: add generic cnxk request structure
  ml/cnxk: add generic cnxk xstats structures
  ml/cnxk: rename cnxk ops function pointers struct
  ml/cnxk: update device handling functions
  ml/cnxk: update queue-pair handling functions
  ml/cnxk: update model load and unload functions
  ml/cnxk: update model start and stop functions
  ml/cnxk: update model utility functions
  ml/cnxk: update data quantization functions
  ml/cnxk: update device debug functions
  ml/cnxk: update device stats functions
  ml/cnxk: update device and model xstats functions
  ml/cnxk: update fast path functions
  ml/cnxk: move error handling to cnxk layer
  ml/cnxk: support config and close of tvmdp library
  ml/cnxk: add structures to support TVM model type
  ml/cnxk: add support for identify model type
  ml/cnxk: add support to parse TVM model objects
  ml/cnxk: fetch layer info and load TVM model
  ml/cnxk: update internal info for TVM model
  ml/cnxk: enable model unload in tvmdp library
  ml/cnxk: support start and stop for TVM models
  ml/cnxk: support device dump for TVM models
  ml/cnxk: enable reporting model runtime as xstats
  ml/cnxk: implement I/O alloc and free callbacks
  ml/cnxk: add generic ML malloc and free callback
  ml/cnxk: enable creation of mvtvm virtual device

 config/arm/arm64_cn10k_linux_gcc   |1 +
 config/arm/arm64_cn9k_linux_gcc|1 +
 doc/guides/mldevs/cnxk.rst |  223 +-
 doc/guides/rel_notes/release_23_11.rst |3 +
 drivers/ml/cnxk/cn10k_ml_dev.c |  416 ++--
 drivers/ml/cnxk/cn10k_ml_dev.h |  457 +---
 drivers/ml/cnxk/cn10k_ml_model.c   |  403 ++--
 drivers/ml/cnxk/cn10k_ml_model.h   |  151 +-
 drivers/ml/cnxk/cn10k_ml_ocm.c |  111 +-
 drivers/ml/cnxk/cn10k_ml_ocm.h |   15 +-
 drivers/ml/cnxk/cn10k_ml_ops.c | 2828 
 drivers/ml/cnxk/cn10k_ml_ops.h |  358 ++-
 drivers/ml/cnxk/cnxk_ml_dev.c  |   22 +
 drivers/ml/cnxk/cnxk_ml_dev.h  |  120 +
 drivers/ml/cnxk/cnxk_ml_io.c   |   95 +
 drivers/ml/cnxk/cnxk_ml_io.h   |   88 +
 drivers/ml/cnxk/cnxk_ml_model.c|   94 +
 drivers/ml/cnxk/cnxk_ml_model.h|  192 ++
 drivers/ml/cnxk/cnxk_ml_ops.c  | 1690 ++
 drivers/ml/cnxk/cnxk_ml_ops.h  |   87 +
 drivers/ml/cnxk/cnxk_ml_utils.c|   15 +
 drivers/ml/cnxk/cnxk_ml_utils.h|   17 +
 drivers/ml/cnxk/cnxk_ml_xstats.h   |  152 ++
 drivers/ml/cnxk/meson.build|   70 +
 drivers/ml/cnxk/mvtvm_ml_dev.c |  196 ++
 drivers/ml/cnxk/mvtvm_ml_dev.h |   40 +
 drivers/ml/cnxk/mvtvm_ml_model.c   |  392 
 drivers/ml/cnxk/mvtvm_ml_model.h   |   90 +
 drivers/ml/cnxk/mvtvm_ml_ops.c |  652 ++
 drivers/ml/cnxk/mvtvm_ml_ops.h |   82 +
 drivers/ml/cnxk/mvtvm_ml_stubs.c   |  141 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.h   |   36 +
 32 files changed, 6279 insertions

[PATCH v8 05/34] ml/cnxk: add generic cnxk xstats structures

2023-10-22 Thread Srikanth Yalavarthi
Introduced generic xstats structures and renamed the cn10k
xstats enumerations with a cnxk prefix.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.h   |  86 +---
 drivers/ml/cnxk/cn10k_ml_model.h |   6 +-
 drivers/ml/cnxk/cn10k_ml_ops.c   | 169 ++-
 drivers/ml/cnxk/cnxk_ml_xstats.h | 128 +++
 4 files changed, 209 insertions(+), 180 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_xstats.h

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h
index 1852d4f6c9..be989e0a20 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.h
+++ b/drivers/ml/cnxk/cn10k_ml_dev.h
@@ -10,6 +10,7 @@
 #include "cn10k_ml_ocm.h"
 
 #include "cnxk_ml_io.h"
+#include "cnxk_ml_xstats.h"
 
 /* Dummy Device ops */
 extern struct rte_ml_dev_ops ml_dev_dummy_ops;
@@ -121,89 +122,6 @@ struct cn10k_ml_fw {
struct cnxk_ml_req *req;
 };
 
-/* Extended stats types enum */
-enum cn10k_ml_xstats_type {
-   /* Number of models loaded */
-   nb_models_loaded,
-
-   /* Number of models unloaded */
-   nb_models_unloaded,
-
-   /* Number of models started */
-   nb_models_started,
-
-   /* Number of models stopped */
-   nb_models_stopped,
-
-   /* Average inference hardware latency */
-   avg_hw_latency,
-
-   /* Minimum hardware latency */
-   min_hw_latency,
-
-   /* Maximum hardware latency */
-   max_hw_latency,
-
-   /* Average firmware latency */
-   avg_fw_latency,
-
-   /* Minimum firmware latency */
-   min_fw_latency,
-
-   /* Maximum firmware latency */
-   max_fw_latency,
-};
-
-/* Extended stats function type enum. */
-enum cn10k_ml_xstats_fn_type {
-   /* Device function */
-   CN10K_ML_XSTATS_FN_DEVICE,
-
-   /* Model function */
-   CN10K_ML_XSTATS_FN_MODEL,
-};
-
-/* Function pointer to get xstats for a type */
-typedef uint64_t (*cn10k_ml_xstats_fn)(struct rte_ml_dev *dev, uint16_t 
obj_idx,
-  enum cn10k_ml_xstats_type stat);
-
-/* Extended stats entry structure */
-struct cn10k_ml_xstats_entry {
-   /* Name-ID map */
-   struct rte_ml_dev_xstats_map map;
-
-   /* xstats mode, device or model */
-   enum rte_ml_dev_xstats_mode mode;
-
-   /* Type of xstats */
-   enum cn10k_ml_xstats_type type;
-
-   /* xstats function */
-   enum cn10k_ml_xstats_fn_type fn_id;
-
-   /* Object ID, model ID for model stat type */
-   uint16_t obj_idx;
-
-   /* Allowed to reset the stat */
-   uint8_t reset_allowed;
-
-   /* An offset to be taken away to emulate resets */
-   uint64_t reset_value;
-};
-
-/* Extended stats data */
-struct cn10k_ml_xstats {
-   /* Pointer to xstats entries */
-   struct cn10k_ml_xstats_entry *entries;
-
-   /* Store num stats and offset of the stats for each model */
-   uint16_t count_per_model[ML_CNXK_MAX_MODELS];
-   uint16_t offset_for_model[ML_CNXK_MAX_MODELS];
-   uint16_t count_mode_device;
-   uint16_t count_mode_model;
-   uint16_t count;
-};
-
 /* Device private data */
 struct cn10k_ml_dev {
/* Device ROC */
@@ -216,7 +134,7 @@ struct cn10k_ml_dev {
struct cn10k_ml_ocm ocm;
 
/* Extended stats data */
-   struct cn10k_ml_xstats xstats;
+   struct cnxk_ml_xstats xstats;
 
/* Enable / disable model data caching */
int cache_model_data;
diff --git a/drivers/ml/cnxk/cn10k_ml_model.h b/drivers/ml/cnxk/cn10k_ml_model.h
index 74ada1531a..5c32f48c68 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.h
+++ b/drivers/ml/cnxk/cn10k_ml_model.h
@@ -404,7 +404,7 @@ struct cn10k_ml_layer_addr {
 };
 
 /* Model fast-path stats */
-struct cn10k_ml_layer_stats {
+struct cn10k_ml_layer_xstats {
/* Total hardware latency, sum of all inferences */
uint64_t hw_latency_tot;
 
@@ -447,10 +447,10 @@ struct cn10k_ml_layer_data {
struct cnxk_ml_req *req;
 
/* Layer: Stats for burst ops */
-   struct cn10k_ml_layer_stats *burst_stats;
+   struct cn10k_ml_layer_xstats *burst_xstats;
 
/* Layer: Stats for sync ops */
-   struct cn10k_ml_layer_stats *sync_stats;
+   struct cn10k_ml_layer_xstats *sync_xstats;
 };
 
 struct cn10k_ml_model_data {
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 25ebb28993..b470955ffd 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -10,6 +10,7 @@
 #include "cnxk_ml_dev.h"
 #include "cnxk_ml_model.h"
 #include "cnxk_ml_ops.h"
+#include "cnxk_ml_xstats.h"
 
 /* ML model macros */
 #define CN10K_ML_MODEL_MEMZONE_NAME "ml_cn10k_model_mz"
@@ -425,26 +426,6 @@ cn10k_ml_prep_fp_job_descriptor(struct cn10k_ml_dev 
*cn10k_mldev, struct cnxk_ml
req->cn10k_req.jd.model_run.num_batches = op->nb_batches;
 }
 
-struct xstat_info {
-   char name[32];
-   enum cn10k_ml_xstats_type type;
-   uint8_t reset_allowed;
-};
-

[PATCH v8 03/34] ml/cnxk: add generic model and layer structures

2023-10-22 Thread Srikanth Yalavarthi
Introduce generic cnxk model and layer structures. These
structures enable supporting models with multiple
layers. A model is a collection of multiple independent
layers with flow dependencies between the layers.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.h   |   9 +-
 drivers/ml/cnxk/cn10k_ml_model.c | 247 
 drivers/ml/cnxk/cn10k_ml_model.h | 122 ++--
 drivers/ml/cnxk/cn10k_ml_ocm.c   |  50 ++--
 drivers/ml/cnxk/cn10k_ml_ocm.h   |   9 +-
 drivers/ml/cnxk/cn10k_ml_ops.c   | 488 +--
 drivers/ml/cnxk/cnxk_ml_io.h |  79 +
 drivers/ml/cnxk/cnxk_ml_model.c  |   7 +
 drivers/ml/cnxk/cnxk_ml_model.h  | 111 +++
 drivers/ml/cnxk/meson.build  |   1 +
 10 files changed, 653 insertions(+), 470 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_io.h
 create mode 100644 drivers/ml/cnxk/cnxk_ml_model.c
 create mode 100644 drivers/ml/cnxk/cnxk_ml_model.h

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h
index f9da1548c4..99ff0a344a 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.h
+++ b/drivers/ml/cnxk/cn10k_ml_dev.h
@@ -9,6 +9,8 @@
 
 #include "cn10k_ml_ocm.h"
 
+#include "cnxk_ml_io.h"
+
 /* Dummy Device ops */
 extern struct rte_ml_dev_ops ml_dev_dummy_ops;
 
@@ -21,9 +23,6 @@ extern struct rte_ml_dev_ops ml_dev_dummy_ops;
 /* Device alignment size */
 #define ML_CN10K_ALIGN_SIZE 128
 
-/* Maximum number of models per device */
-#define ML_CN10K_MAX_MODELS 16
-
 /* Maximum number of queue-pairs per device, spinlock version */
 #define ML_CN10K_MAX_QP_PER_DEVICE_SL 16
 
@@ -455,8 +454,8 @@ struct cn10k_ml_xstats {
struct cn10k_ml_xstats_entry *entries;
 
/* Store num stats and offset of the stats for each model */
-   uint16_t count_per_model[ML_CN10K_MAX_MODELS];
-   uint16_t offset_for_model[ML_CN10K_MAX_MODELS];
+   uint16_t count_per_model[ML_CNXK_MAX_MODELS];
+   uint16_t offset_for_model[ML_CNXK_MAX_MODELS];
uint16_t count_mode_device;
uint16_t count_mode_model;
uint16_t count;
diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c
index cc46ca2efd..d033d6deff 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.c
+++ b/drivers/ml/cnxk/cn10k_ml_model.c
@@ -6,10 +6,10 @@
 
 #include 
 
-#include "cn10k_ml_model.h"
 #include "cn10k_ml_ocm.h"
 
 #include "cnxk_ml_dev.h"
+#include "cnxk_ml_model.h"
 
 static enum rte_ml_io_type
 cn10k_ml_io_type_map(uint8_t type)
@@ -311,19 +311,17 @@ cn10k_ml_model_metadata_update(struct 
cn10k_ml_model_metadata *metadata)
 }
 
 void
-cn10k_ml_model_addr_update(struct cn10k_ml_model *model, uint8_t *buffer, 
uint8_t *base_dma_addr)
+cn10k_ml_layer_addr_update(struct cnxk_ml_layer *layer, uint8_t *buffer, 
uint8_t *base_dma_addr)
 {
struct cn10k_ml_model_metadata *metadata;
-   struct cn10k_ml_model_addr *addr;
+   struct cn10k_ml_layer_addr *addr;
size_t model_data_size;
uint8_t *dma_addr_load;
uint8_t *dma_addr_run;
-   uint8_t i;
-   uint8_t j;
int fpos;
 
-   metadata = &model->metadata;
-   addr = &model->addr;
+   metadata = &layer->glow.metadata;
+   addr = &layer->glow.addr;
model_data_size = metadata->init_model.file_size + 
metadata->main_model.file_size +
  metadata->finish_model.file_size + 
metadata->weights_bias.file_size;
 
@@ -361,102 +359,138 @@ cn10k_ml_model_addr_update(struct cn10k_ml_model 
*model, uint8_t *buffer, uint8_
addr->wb_base_addr = PLT_PTR_SUB(dma_addr_load, 
metadata->weights_bias.mem_offset);
addr->wb_load_addr = PLT_PTR_ADD(addr->wb_base_addr, 
metadata->weights_bias.mem_offset);
rte_memcpy(addr->wb_load_addr, PLT_PTR_ADD(buffer, fpos), 
metadata->weights_bias.file_size);
+}
+
+void
+cn10k_ml_layer_info_update(struct cnxk_ml_layer *layer)
+{
+   struct cn10k_ml_model_metadata *metadata;
+   uint8_t i;
+   uint8_t j;
+
+   metadata = &layer->glow.metadata;
 
/* Inputs */
-   addr->total_input_sz_d = 0;
-   addr->total_input_sz_q = 0;
+   layer->info.nb_inputs = metadata->model.num_input;
+   layer->info.total_input_sz_d = 0;
+   layer->info.total_input_sz_q = 0;
for (i = 0; i < metadata->model.num_input; i++) {
if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) {
-   addr->input[i].nb_dims = 4;
-   addr->input[i].shape[0] = metadata->input1[i].shape.w;
-   addr->input[i].shape[1] = metadata->input1[i].shape.x;
-   addr->input[i].shape[2] = metadata->input1[i].shape.y;
-   addr->input[i].shape[3] = metadata->input1[i].shape.z;
-
-   addr->input[i].nb_elements =
+   rte_strscpy(layer->info.input[i].name,
+   (char *)metadata->input1[i].input_name, 
MRVL_ML_INPUT_NAME_LEN);
+   layer->info.inpu
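
To make the model/layer split described in this commit concrete, a minimal
sketch (the field names here are illustrative, not the driver's actual
definitions):

#include <stdint.h>

#define MAX_LAYERS 8

struct ml_layer {
	char name[64];
	void *target_ctx;	/* target-specific (e.g. glow) layer state */
};

struct ml_model {
	uint16_t nb_layers;
	struct ml_layer layers[MAX_LAYERS];
	/* flow dependency: layers[i] consumes the output of layers[dep[i]];
	 * -1 means the layer reads the model's external input. */
	int16_t dep[MAX_LAYERS];
};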

[PATCH v8 02/34] ml/cnxk: add generic cnxk device structure

2023-10-22 Thread Srikanth Yalavarthi
Introduce a generic cnxk device structure. This structure is
the top-level device structure for the driver, which
encapsulates the target / platform specific device structure.
Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.c   | 316 ++--
 drivers/ml/cnxk/cn10k_ml_dev.h   |  47 +--
 drivers/ml/cnxk/cn10k_ml_model.c |  15 +-
 drivers/ml/cnxk/cn10k_ml_model.h |   8 +-
 drivers/ml/cnxk/cn10k_ml_ocm.c   |  60 ++--
 drivers/ml/cnxk/cn10k_ml_ops.c   | 495 +--
 drivers/ml/cnxk/cnxk_ml_dev.c|  11 +
 drivers/ml/cnxk/cnxk_ml_dev.h|  58 
 drivers/ml/cnxk/meson.build  |   1 +
 9 files changed, 562 insertions(+), 449 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_dev.c
 create mode 100644 drivers/ml/cnxk/cnxk_ml_dev.h

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c
index e3c2badcef..3bc61443d8 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.c
+++ b/drivers/ml/cnxk/cn10k_ml_dev.c
@@ -10,13 +10,14 @@
 #include 
 #include 
 
-#include 
-
 #include 
 
-#include "cn10k_ml_dev.h"
+#include 
+
 #include "cn10k_ml_ops.h"
 
+#include "cnxk_ml_dev.h"
+
 #define CN10K_ML_FW_PATH   "fw_path"
 #define CN10K_ML_FW_ENABLE_DPE_WARNINGS "enable_dpe_warnings"
 #define CN10K_ML_FW_REPORT_DPE_WARNINGS "report_dpe_warnings"
@@ -58,9 +59,6 @@ static const char *const valid_args[] = {CN10K_ML_FW_PATH,
 /* Supported OCM page sizes: 1KB, 2KB, 4KB, 8KB and 16KB */
 static const int valid_ocm_page_size[] = {1024, 2048, 4096, 8192, 16384};
 
-/* Dummy operations for ML device */
-struct rte_ml_dev_ops ml_dev_dummy_ops = {0};
-
 static int
 parse_string_arg(const char *key __rte_unused, const char *value, void 
*extra_args)
 {
@@ -90,7 +88,7 @@ parse_integer_arg(const char *key __rte_unused, const char 
*value, void *extra_a
 }
 
 static int
-cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mldev)
+cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *cn10k_mldev)
 {
bool enable_dpe_warnings_set = false;
bool report_dpe_warnings_set = false;
@@ -127,7 +125,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde
 
 	if (rte_kvargs_count(kvlist, CN10K_ML_FW_ENABLE_DPE_WARNINGS) == 1) {
 		ret = rte_kvargs_process(kvlist, CN10K_ML_FW_ENABLE_DPE_WARNINGS,
-					 &parse_integer_arg, &mldev->fw.enable_dpe_warnings);
+					 &parse_integer_arg, &cn10k_mldev->fw.enable_dpe_warnings);
 		if (ret < 0) {
 			plt_err("Error processing arguments, key = %s\n",
 				CN10K_ML_FW_ENABLE_DPE_WARNINGS);
@@ -139,7 +137,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde
 
 	if (rte_kvargs_count(kvlist, CN10K_ML_FW_REPORT_DPE_WARNINGS) == 1) {
 		ret = rte_kvargs_process(kvlist, CN10K_ML_FW_REPORT_DPE_WARNINGS,
-					 &parse_integer_arg, &mldev->fw.report_dpe_warnings);
+					 &parse_integer_arg, &cn10k_mldev->fw.report_dpe_warnings);
 		if (ret < 0) {
 			plt_err("Error processing arguments, key = %s\n",
 				CN10K_ML_FW_REPORT_DPE_WARNINGS);
@@ -151,7 +149,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde
 
 	if (rte_kvargs_count(kvlist, CN10K_ML_DEV_CACHE_MODEL_DATA) == 1) {
 		ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_CACHE_MODEL_DATA, &parse_integer_arg,
-					 &mldev->cache_model_data);
+					 &cn10k_mldev->cache_model_data);
 		if (ret < 0) {
 			plt_err("Error processing arguments, key = %s\n",
 				CN10K_ML_DEV_CACHE_MODEL_DATA);
@@ -174,7 +172,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde
 
 	if (rte_kvargs_count(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK) == 1) {
 		ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK, &parse_integer_arg,
-					 &mldev->hw_queue_lock);
+					 &cn10k_mldev->hw_queue_lock);
 		if (ret < 0) {
 			plt_err("Error processing arguments, key = %s\n",
 				CN10K_ML_DEV_HW_QUEUE_LOCK);
@@ -186,7 +184,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde
 
 	if (rte_kvargs_count(kvlist, CN10K_ML_OCM_PAGE_SIZE) == 1) {
 		ret = rte_kvargs_process(kvlist, CN10K_ML_OCM_PAGE_SIZE, &parse_integer_arg,
-					 &mldev->ocm_page_size);
+					 &cn10k_mldev->ocm_page_size);
 		if (ret < 0) {
 

[PATCH v8 09/34] ml/cnxk: update model load and unload functions

2023-10-22 Thread Srikanth Yalavarthi
Implemented cnxk wrapper functions to load and unload
ML models. The wrappers invoke the cn10k model load and
unload functions.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_model.c | 244 -
 drivers/ml/cnxk/cn10k_ml_model.h |  26 ++-
 drivers/ml/cnxk/cn10k_ml_ops.c   | 296 ++-
 drivers/ml/cnxk/cn10k_ml_ops.h   |  12 +-
 drivers/ml/cnxk/cnxk_ml_dev.h|  15 ++
 drivers/ml/cnxk/cnxk_ml_ops.c| 144 ++-
 drivers/ml/cnxk/cnxk_ml_ops.h|   2 +
 7 files changed, 462 insertions(+), 277 deletions(-)
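
A sketch of the wrapper shape described above, with hypothetical *_sketch
names standing in for the real driver types: the cnxk layer validates
arguments and delegates, while the cn10k function does the hardware work.

#include <errno.h>
#include <stddef.h>

struct cnxk_ml_dev_sketch;   /* stand-ins for the driver structures */
struct cnxk_ml_model_sketch;

int cn10k_ml_model_unload_sketch(struct cnxk_ml_dev_sketch *dev,
				 struct cnxk_ml_model_sketch *model);

int
cnxk_ml_model_unload_wrapper_sketch(struct cnxk_ml_dev_sketch *cnxk_mldev,
				    struct cnxk_ml_model_sketch *model)
{
	if (cnxk_mldev == NULL || model == NULL)
		return -EINVAL;

	/* Target-specific unload (layer release, OCM cleanup). */
	return cn10k_ml_model_unload_sketch(cnxk_mldev, model);
}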

diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c
index d2f1c761be..48d70027ca 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.c
+++ b/drivers/ml/cnxk/cn10k_ml_model.c
@@ -316,42 +316,31 @@ cn10k_ml_layer_addr_update(struct cnxk_ml_layer *layer, 
uint8_t *buffer, uint8_t
 {
struct cn10k_ml_model_metadata *metadata;
struct cn10k_ml_layer_addr *addr;
-   size_t model_data_size;
uint8_t *dma_addr_load;
-   uint8_t *dma_addr_run;
int fpos;
 
metadata = &layer->glow.metadata;
addr = &layer->glow.addr;
-   model_data_size = metadata->init_model.file_size + 
metadata->main_model.file_size +
- metadata->finish_model.file_size + 
metadata->weights_bias.file_size;
 
/* Base address */
addr->base_dma_addr_load = base_dma_addr;
-   addr->base_dma_addr_run = PLT_PTR_ADD(addr->base_dma_addr_load, 
model_data_size);
 
/* Init section */
dma_addr_load = addr->base_dma_addr_load;
-   dma_addr_run = addr->base_dma_addr_run;
fpos = sizeof(struct cn10k_ml_model_metadata);
addr->init_load_addr = dma_addr_load;
-   addr->init_run_addr = dma_addr_run;
rte_memcpy(dma_addr_load, PLT_PTR_ADD(buffer, fpos), 
metadata->init_model.file_size);
 
/* Main section */
dma_addr_load += metadata->init_model.file_size;
-   dma_addr_run += metadata->init_model.file_size;
fpos += metadata->init_model.file_size;
addr->main_load_addr = dma_addr_load;
-   addr->main_run_addr = dma_addr_run;
rte_memcpy(dma_addr_load, PLT_PTR_ADD(buffer, fpos), 
metadata->main_model.file_size);
 
/* Finish section */
dma_addr_load += metadata->main_model.file_size;
-   dma_addr_run += metadata->main_model.file_size;
fpos += metadata->main_model.file_size;
addr->finish_load_addr = dma_addr_load;
-   addr->finish_run_addr = dma_addr_run;
rte_memcpy(dma_addr_load, PLT_PTR_ADD(buffer, fpos), 
metadata->finish_model.file_size);
 
/* Weights and Bias section */
@@ -363,142 +352,148 @@ cn10k_ml_layer_addr_update(struct cnxk_ml_layer *layer, 
uint8_t *buffer, uint8_t
 }
 
 void
-cn10k_ml_layer_info_update(struct cnxk_ml_layer *layer)
+cn10k_ml_layer_io_info_set(struct cnxk_ml_io_info *io_info,
+  struct cn10k_ml_model_metadata *metadata)
 {
-   struct cn10k_ml_model_metadata *metadata;
uint8_t i;
uint8_t j;
 
-   metadata = &layer->glow.metadata;
-
/* Inputs */
-   layer->info.nb_inputs = metadata->model.num_input;
-   layer->info.total_input_sz_d = 0;
-   layer->info.total_input_sz_q = 0;
+   io_info->nb_inputs = metadata->model.num_input;
+   io_info->total_input_sz_d = 0;
+   io_info->total_input_sz_q = 0;
for (i = 0; i < metadata->model.num_input; i++) {
if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) {
-   rte_strscpy(layer->info.input[i].name,
-   (char *)metadata->input1[i].input_name, 
MRVL_ML_INPUT_NAME_LEN);
-   layer->info.input[i].dtype = 
metadata->input1[i].input_type;
-   layer->info.input[i].qtype = 
metadata->input1[i].model_input_type;
-   layer->info.input[i].nb_dims = 4;
-   layer->info.input[i].shape[0] = 
metadata->input1[i].shape.w;
-   layer->info.input[i].shape[1] = 
metadata->input1[i].shape.x;
-   layer->info.input[i].shape[2] = 
metadata->input1[i].shape.y;
-   layer->info.input[i].shape[3] = 
metadata->input1[i].shape.z;
-   layer->info.input[i].nb_elements =
+   rte_strscpy(io_info->input[i].name, (char 
*)metadata->input1[i].input_name,
+   MRVL_ML_INPUT_NAME_LEN);
+   io_info->input[i].dtype = 
metadata->input1[i].input_type;
+   io_info->input[i].qtype = 
metadata->input1[i].model_input_type;
+   io_info->input[i].nb_dims = 4;
+   io_info->input[i].shape[0] = 
metadata->input1[i].shape.w;
+   io_info->input[i].shape[1] = 
metadata->input1[i].shape.x;
+   io_info->input[i].shape[2] = 
metadata->input1[i].shap

[PATCH v8 04/34] ml/cnxk: add generic cnxk request structure

2023-10-22 Thread Srikanth Yalavarthi
Added a generic cnxk request structure. Moved common fields
from the cn10k structures to the cnxk structure, and moved
job-related structures and enumerations to the ops headers.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.c   |  72 +++
 drivers/ml/cnxk/cn10k_ml_dev.h   | 269 +
 drivers/ml/cnxk/cn10k_ml_model.c |   6 +-
 drivers/ml/cnxk/cn10k_ml_model.h |   4 +-
 drivers/ml/cnxk/cn10k_ml_ops.c   | 331 +--
 drivers/ml/cnxk/cn10k_ml_ops.h   | 296 +++
 drivers/ml/cnxk/cnxk_ml_ops.c|   7 +
 drivers/ml/cnxk/cnxk_ml_ops.h|  63 ++
 drivers/ml/cnxk/meson.build  |   1 +
 9 files changed, 557 insertions(+), 492 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_ops.c
 create mode 100644 drivers/ml/cnxk/cnxk_ml_ops.h
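
The shape of the change, sketched with illustrative names: common request
fields live at the top level and the target-specific job state is embedded,
which is why accesses in the diff below become fw->req->cn10k_req.jd.

#include <stdint.h>

/* Target-specific job state (illustrative stand-in for the CN10K
 * job descriptor, result and completion word). */
struct cn10k_ml_req_sketch {
	uint64_t jd;     /* job descriptor */
	uint64_t result; /* job result */
	uint64_t status; /* polled completion word */
};

/* Generic request: embedded target state plus common fields. */
struct cnxk_ml_req_sketch {
	struct cn10k_ml_req_sketch cn10k_req; /* per-target job state */
	void *op;                             /* common: associated ML op */
};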

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c
index 3bc61443d8..fc6f78d414 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.c
+++ b/drivers/ml/cnxk/cn10k_ml_dev.c
@@ -14,9 +14,8 @@
 
 #include 
 
-#include "cn10k_ml_ops.h"
-
 #include "cnxk_ml_dev.h"
+#include "cnxk_ml_ops.h"
 
 #define CN10K_ML_FW_PATH   "fw_path"
 #define CN10K_ML_FW_ENABLE_DPE_WARNINGS "enable_dpe_warnings"
@@ -400,20 +399,23 @@ cn10k_ml_pci_remove(struct rte_pci_device *pci_dev)
 static void
 cn10k_ml_fw_print_info(struct cn10k_ml_fw *fw)
 {
-   plt_info("ML Firmware Version = %s", fw->req->jd.fw_load.version);
-
-   plt_ml_dbg("Firmware capabilities = 0x%016lx", 
fw->req->jd.fw_load.cap.u64);
-   plt_ml_dbg("Version = %s", fw->req->jd.fw_load.version);
-   plt_ml_dbg("core0_debug_ptr = 0x%016lx", 
fw->req->jd.fw_load.debug.core0_debug_ptr);
-   plt_ml_dbg("core1_debug_ptr = 0x%016lx", 
fw->req->jd.fw_load.debug.core1_debug_ptr);
-   plt_ml_dbg("debug_buffer_size = %u bytes", 
fw->req->jd.fw_load.debug.debug_buffer_size);
+   plt_info("ML Firmware Version = %s", 
fw->req->cn10k_req.jd.fw_load.version);
+
+   plt_ml_dbg("Firmware capabilities = 0x%016lx", 
fw->req->cn10k_req.jd.fw_load.cap.u64);
+   plt_ml_dbg("Version = %s", fw->req->cn10k_req.jd.fw_load.version);
+   plt_ml_dbg("core0_debug_ptr = 0x%016lx",
+  fw->req->cn10k_req.jd.fw_load.debug.core0_debug_ptr);
+   plt_ml_dbg("core1_debug_ptr = 0x%016lx",
+  fw->req->cn10k_req.jd.fw_load.debug.core1_debug_ptr);
+   plt_ml_dbg("debug_buffer_size = %u bytes",
+  fw->req->cn10k_req.jd.fw_load.debug.debug_buffer_size);
plt_ml_dbg("core0_exception_buffer = 0x%016lx",
-  fw->req->jd.fw_load.debug.core0_exception_buffer);
+  fw->req->cn10k_req.jd.fw_load.debug.core0_exception_buffer);
plt_ml_dbg("core1_exception_buffer = 0x%016lx",
-  fw->req->jd.fw_load.debug.core1_exception_buffer);
+  fw->req->cn10k_req.jd.fw_load.debug.core1_exception_buffer);
plt_ml_dbg("exception_state_size = %u bytes",
-  fw->req->jd.fw_load.debug.exception_state_size);
-   plt_ml_dbg("flags = 0x%016lx", fw->req->jd.fw_load.flags);
+  fw->req->cn10k_req.jd.fw_load.debug.exception_state_size);
+   plt_ml_dbg("flags = 0x%016lx", fw->req->cn10k_req.jd.fw_load.flags);
 }
 
 uint64_t
@@ -458,29 +460,30 @@ cn10k_ml_fw_load_asim(struct cn10k_ml_fw *fw)
roc_ml_reg_save(&cn10k_mldev->roc, ML_MLR_BASE);
 
/* Update FW load completion structure */
-	fw->req->jd.hdr.jce.w1.u64 = PLT_U64_CAST(&fw->req->status);
-	fw->req->jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_LOAD;
-	fw->req->jd.hdr.result = roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &fw->req->result);
-	fw->req->jd.fw_load.flags = cn10k_ml_fw_flags_get(fw);
-	plt_write64(ML_CNXK_POLL_JOB_START, &fw->req->status);
+	fw->req->cn10k_req.jd.hdr.jce.w1.u64 = PLT_U64_CAST(&fw->req->cn10k_req.status);
+	fw->req->cn10k_req.jd.hdr.job_type = ML_CN10K_JOB_TYPE_FIRMWARE_LOAD;
+	fw->req->cn10k_req.jd.hdr.result =
+		roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &fw->req->cn10k_req.result);
+	fw->req->cn10k_req.jd.fw_load.flags = cn10k_ml_fw_flags_get(fw);
+	plt_write64(ML_CNXK_POLL_JOB_START, &fw->req->cn10k_req.status);
 	plt_wmb();
 
 	/* Enqueue FW load through scratch registers */
 	timeout = true;
 	timeout_cycle = plt_tsc_cycles() + ML_CNXK_CMD_TIMEOUT * plt_tsc_hz();
-	roc_ml_scratch_enqueue(&cn10k_mldev->roc, &fw->req->jd);
+	roc_ml_scratch_enqueue(&cn10k_mldev->roc, &fw->req->cn10k_req.jd);
 
 	plt_rmb();
 	do {
 		if (roc_ml_scratch_is_done_bit_set(&cn10k_mldev->roc) &&
-		    (plt_read64(&fw->req->status) == ML_CNXK_POLL_JOB_FINISH)) {
+		    (plt_read64(&fw->req->cn10k_req.status) == ML_CNXK_POLL_JOB_FINISH)) {
timeout = false;
break;
}
} while (plt_tsc_cycles() < tim

[PATCH v8 13/34] ml/cnxk: update device debug functions

2023-10-22 Thread Srikanth Yalavarthi
Added cnxk wrappers for the device dump and selftest
debug functions.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_model.c | 118 +
 drivers/ml/cnxk/cn10k_ml_model.h |   1 +
 drivers/ml/cnxk/cn10k_ml_ocm.c   |   8 +-
 drivers/ml/cnxk/cn10k_ml_ocm.h   |   2 +-
 drivers/ml/cnxk/cn10k_ml_ops.c   | 176 ++-
 drivers/ml/cnxk/cn10k_ml_ops.h   |   4 +-
 drivers/ml/cnxk/cnxk_ml_model.c  |  33 ++
 drivers/ml/cnxk/cnxk_ml_model.h  |   2 +
 drivers/ml/cnxk/cnxk_ml_ops.c|  39 ++-
 drivers/ml/cnxk/cnxk_ml_utils.c  |  15 +++
 drivers/ml/cnxk/cnxk_ml_utils.h  |  17 +++
 drivers/ml/cnxk/meson.build  |   1 +
 12 files changed, 235 insertions(+), 181 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_utils.c
 create mode 100644 drivers/ml/cnxk/cnxk_ml_utils.h
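
cnxk_ml_utils.c is only a few lines; a plausible shape of the
cnxk_ml_print_line() helper the dump code below relies on (the actual
implementation may differ):

#include <stdio.h>

/* Print a separator line of 'len' dashes to the dump stream. */
void
cnxk_ml_print_line(FILE *fp, int len)
{
	int i;

	for (i = 0; i < len; i++)
		fprintf(fp, "-");
	fprintf(fp, "\n");
}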

diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c
index 48d70027ca..af9d5a666f 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.c
+++ b/drivers/ml/cnxk/cn10k_ml_model.c
@@ -11,6 +11,7 @@
 #include "cnxk_ml_dev.h"
 #include "cnxk_ml_model.h"
 #include "cnxk_ml_ops.h"
+#include "cnxk_ml_utils.h"
 
 static enum rte_ml_io_type
 cn10k_ml_io_type_map(uint8_t type)
@@ -598,3 +599,120 @@ cn10k_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *mo
 
rte_ml_io_type_size_get(io_info->output[i].qtype);
}
 }
+
+void
+cn10k_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer *layer, FILE *fp)
+{
+   struct cn10k_ml_ocm *ocm;
+   char str[STR_LEN];
+   uint8_t i;
+   uint8_t j;
+
+   ocm = &cnxk_mldev->cn10k_mldev.ocm;
+
+   /* Print debug info */
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, " Layer Information (Layer ID: %u, Name: %s)\n",
+   cnxk_mldev->index_map[layer->index].layer_id, layer->name);
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "index", layer->index);
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", layer->name);
+   fprintf(fp, "%*s : %u.%u.%u.%u\n", FIELD_LEN, "version",
+		layer->glow.metadata.model.version[0], layer->glow.metadata.model.version[1],
+		layer->glow.metadata.model.version[2], layer->glow.metadata.model.version[3]);
+	fprintf(fp, "%*s : 0x%016lx\n", FIELD_LEN, "layer", PLT_U64_CAST(layer));
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", layer->batch_size);
+
+   /* Print model state */
+   if (layer->state == ML_CNXK_LAYER_STATE_LOADED)
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "loaded");
+   if (layer->state == ML_CNXK_LAYER_STATE_JOB_ACTIVE)
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "job_active");
+   if (layer->state == ML_CNXK_LAYER_STATE_STARTED)
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "started");
+
+   /* Print OCM status */
+   fprintf(fp, "%*s : %" PRIu64 " bytes\n", FIELD_LEN, "wb_size",
+   layer->glow.metadata.model.ocm_wb_range_end -
+   layer->glow.metadata.model.ocm_wb_range_start + 1);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "wb_pages", 
layer->glow.ocm_map.wb_pages);
+   fprintf(fp, "%*s : %" PRIu64 " bytes\n", FIELD_LEN, "scratch_size",
+   ocm->size_per_tile - 
layer->glow.metadata.model.ocm_tmp_range_floor);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "scratch_pages", 
layer->glow.ocm_map.scratch_pages);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_tiles",
+   layer->glow.metadata.model.tile_end - 
layer->glow.metadata.model.tile_start + 1);
+
+	if (layer->state == ML_CNXK_LAYER_STATE_STARTED) {
+		fprintf(fp, "%*s : 0x%0*" PRIx64 "\n", FIELD_LEN, "tilemask",
+			ML_CN10K_OCM_NUMTILES / 4, layer->glow.ocm_map.tilemask);
+		fprintf(fp, "%*s : 0x%" PRIx64 "\n", FIELD_LEN, "ocm_wb_start",
+			layer->glow.ocm_map.wb_page_start * ocm->page_size);
+	}
+
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_inputs", 
layer->glow.metadata.model.num_input);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_outputs", 
layer->glow.metadata.model.num_output);
+   fprintf(fp, "\n");
+
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, "%8s  %16s  %12s  %18s\n", "input", "input_name", 
"input_type",
+   "model_input_type");
+   cnxk_ml_print_line(fp, LINE_LEN);
+   for (i = 0; i < layer->glow.metadata.model.num_input; i++) {
+   if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) {
+   fprintf(fp, "%8u  ", i);
+   fprintf(fp, "%*s  ", 16, 
layer->glow.metadata.input1[i].input_name);
+   
rte_ml_io_type_to_str(layer->glow.metadata.input1[i].input_type, str,
+ STR_LEN);
+   fprintf(fp, "%*s  ", 12, str);
+   
rt

[PATCH v8 06/34] ml/cnxk: rename cnxk ops function pointers struct

2023-10-22 Thread Srikanth Yalavarthi
Renamed the cn10k ML ops structure to use the cnxk prefix.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.c |  2 +-
 drivers/ml/cnxk/cn10k_ml_ops.c | 73 +-
 drivers/ml/cnxk/cn10k_ml_ops.h | 34 +++-
 drivers/ml/cnxk/cnxk_ml_ops.c  | 36 +
 drivers/ml/cnxk/cnxk_ml_ops.h  |  2 +
 5 files changed, 91 insertions(+), 56 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c
index fc6f78d414..91813e9d0a 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.c
+++ b/drivers/ml/cnxk/cn10k_ml_dev.c
@@ -345,7 +345,7 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de
goto pmd_destroy;
}
 
-   dev->dev_ops = &cn10k_ml_ops;
+   dev->dev_ops = &cnxk_ml_ops;
} else {
plt_err("CN10K ML Ops are not supported on secondary process");
dev->dev_ops = &ml_dev_dummy_ops;
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index b470955ffd..a44fb26215 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -119,7 +119,7 @@ cnxk_ml_qp_destroy(const struct rte_ml_dev *dev, struct 
cnxk_ml_qp *qp)
return 0;
 }
 
-static int
+int
 cn10k_ml_dev_queue_pair_release(struct rte_ml_dev *dev, uint16_t queue_pair_id)
 {
struct cnxk_ml_qp *qp;
@@ -860,7 +860,7 @@ cn10k_ml_cache_model_data(struct rte_ml_dev *dev, uint16_t 
model_id)
return ret;
 }
 
-static int
+int
 cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info)
 {
struct cn10k_ml_dev *cn10k_mldev;
@@ -888,7 +888,7 @@ cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct 
rte_ml_dev_info *dev_info)
return 0;
 }
 
-static int
+int
 cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config 
*conf)
 {
struct rte_ml_dev_info dev_info;
@@ -1087,7 +1087,7 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const 
struct rte_ml_dev_config *c
return ret;
 }
 
-static int
+int
 cn10k_ml_dev_close(struct rte_ml_dev *dev)
 {
struct cn10k_ml_dev *cn10k_mldev;
@@ -1160,7 +1160,7 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev)
return rte_dev_remove(dev->device);
 }
 
-static int
+int
 cn10k_ml_dev_start(struct rte_ml_dev *dev)
 {
struct cn10k_ml_dev *cn10k_mldev;
@@ -1180,7 +1180,7 @@ cn10k_ml_dev_start(struct rte_ml_dev *dev)
return 0;
 }
 
-static int
+int
 cn10k_ml_dev_stop(struct rte_ml_dev *dev)
 {
struct cn10k_ml_dev *cn10k_mldev;
@@ -1200,7 +1200,7 @@ cn10k_ml_dev_stop(struct rte_ml_dev *dev)
return 0;
 }
 
-static int
+int
 cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id,
  const struct rte_ml_dev_qp_conf *qp_conf, int 
socket_id)
 {
@@ -1241,7 +1241,7 @@ cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, 
uint16_t queue_pair_id,
return 0;
 }
 
-static int
+int
 cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats)
 {
struct cnxk_ml_qp *qp;
@@ -1258,7 +1258,7 @@ cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct 
rte_ml_dev_stats *stats)
return 0;
 }
 
-static void
+void
 cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev)
 {
struct cnxk_ml_qp *qp;
@@ -1273,7 +1273,7 @@ cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev)
}
 }
 
-static int
+int
 cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum 
rte_ml_dev_xstats_mode mode,
  int32_t model_id, struct rte_ml_dev_xstats_map 
*xstats_map,
  uint32_t size)
@@ -1321,7 +1321,7 @@ cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, 
enum rte_ml_dev_xstats_mod
return idx;
 }
 
-static int
+int
 cn10k_ml_dev_xstats_by_name_get(struct rte_ml_dev *dev, const char *name, 
uint16_t *stat_id,
uint64_t *value)
 {
@@ -1363,7 +1363,7 @@ cn10k_ml_dev_xstats_by_name_get(struct rte_ml_dev *dev, 
const char *name, uint16
return -EINVAL;
 }
 
-static int
+int
 cn10k_ml_dev_xstats_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode 
mode, int32_t model_id,
const uint16_t stat_ids[], uint64_t values[], uint16_t 
nb_ids)
 {
@@ -1427,7 +1427,7 @@ cn10k_ml_dev_xstats_get(struct rte_ml_dev *dev, enum 
rte_ml_dev_xstats_mode mode
return idx;
 }
 
-static int
+int
 cn10k_ml_dev_xstats_reset(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode 
mode,
  int32_t model_id, const uint16_t stat_ids[], uint16_t 
nb_ids)
 {
@@ -1441,7 +1441,7 @@ cn10k_ml_dev_xstats_reset(struct rte_ml_dev *dev, enum 
rte_ml_dev_xstats_mode mo
return 0;
 }
 
-static int
+int
 cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
 {
struct cn10k_ml_dev *cn10k_mldev;
@@ -1528,7 +1528,7 @@ cn10k_ml_dev_dump(struct rte_ml_dev *dev, FILE *fp)
return 0;
 }
 
-static i

[PATCH v8 14/34] ml/cnxk: update device stats functions

2023-10-22 Thread Srikanth Yalavarthi
Added cnxk wrapper functions to handle ML device stats.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 32 --
 drivers/ml/cnxk/cn10k_ml_ops.h |  2 --
 drivers/ml/cnxk/cnxk_ml_ops.c  | 36 --
 3 files changed, 34 insertions(+), 36 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index a56d002d4c..8cbf700f6e 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -770,38 +770,6 @@ cn10k_ml_dev_stop(struct cnxk_ml_dev *cnxk_mldev)
return 0;
 }
 
-int
-cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats)
-{
-   struct cnxk_ml_qp *qp;
-   int qp_id;
-
-   for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
-   qp = dev->data->queue_pairs[qp_id];
-   stats->enqueued_count += qp->stats.enqueued_count;
-   stats->dequeued_count += qp->stats.dequeued_count;
-   stats->enqueue_err_count += qp->stats.enqueue_err_count;
-   stats->dequeue_err_count += qp->stats.dequeue_err_count;
-   }
-
-   return 0;
-}
-
-void
-cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev)
-{
-   struct cnxk_ml_qp *qp;
-   int qp_id;
-
-   for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
-   qp = dev->data->queue_pairs[qp_id];
-   qp->stats.enqueued_count = 0;
-   qp->stats.dequeued_count = 0;
-   qp->stats.enqueue_err_count = 0;
-   qp->stats.dequeue_err_count = 0;
-   }
-}
-
 int
 cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode,
 			      int32_t model_id, struct rte_ml_dev_xstats_map *xstats_map,
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 5fda98ae88..47e7cb12af 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -298,8 +298,6 @@ int cn10k_ml_dev_stop(struct cnxk_ml_dev *cnxk_mldev);
 int cn10k_ml_dev_dump(struct cnxk_ml_dev *cnxk_mldev, FILE *fp);
 int cn10k_ml_dev_selftest(struct cnxk_ml_dev *cnxk_mldev);
 
-int cn10k_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats 
*stats);
-void cn10k_ml_dev_stats_reset(struct rte_ml_dev *dev);
 int cn10k_ml_dev_xstats_names_get(struct rte_ml_dev *dev, enum rte_ml_dev_xstats_mode mode,
 				  int32_t model_id, struct rte_ml_dev_xstats_map *xstats_map,
 				  uint32_t size);
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 66b88ddae1..c75317d6da 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -489,6 +489,38 @@ cnxk_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id,
return 0;
 }
 
+static int
+cnxk_ml_dev_stats_get(struct rte_ml_dev *dev, struct rte_ml_dev_stats *stats)
+{
+   struct cnxk_ml_qp *qp;
+   int qp_id;
+
+   for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
+   qp = dev->data->queue_pairs[qp_id];
+   stats->enqueued_count += qp->stats.enqueued_count;
+   stats->dequeued_count += qp->stats.dequeued_count;
+   stats->enqueue_err_count += qp->stats.enqueue_err_count;
+   stats->dequeue_err_count += qp->stats.dequeue_err_count;
+   }
+
+   return 0;
+}
+
+static void
+cnxk_ml_dev_stats_reset(struct rte_ml_dev *dev)
+{
+   struct cnxk_ml_qp *qp;
+   int qp_id;
+
+   for (qp_id = 0; qp_id < dev->data->nb_queue_pairs; qp_id++) {
+   qp = dev->data->queue_pairs[qp_id];
+   qp->stats.enqueued_count = 0;
+   qp->stats.dequeued_count = 0;
+   qp->stats.enqueue_err_count = 0;
+   qp->stats.dequeue_err_count = 0;
+   }
+}
+
 static int
 cnxk_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *params, 
uint16_t *model_id)
 {
@@ -772,8 +804,8 @@ struct rte_ml_dev_ops cnxk_ml_ops = {
.dev_queue_pair_release = cnxk_ml_dev_queue_pair_release,
 
/* Stats ops */
-   .dev_stats_get = cn10k_ml_dev_stats_get,
-   .dev_stats_reset = cn10k_ml_dev_stats_reset,
+   .dev_stats_get = cnxk_ml_dev_stats_get,
+   .dev_stats_reset = cnxk_ml_dev_stats_reset,
.dev_xstats_names_get = cn10k_ml_dev_xstats_names_get,
.dev_xstats_by_name_get = cn10k_ml_dev_xstats_by_name_get,
.dev_xstats_get = cn10k_ml_dev_xstats_get,
-- 
2.42.0



[PATCH v8 07/34] ml/cnxk: update device handling functions

2023-10-22 Thread Srikanth Yalavarthi
Implement CNXK wrapper functions for dev_info_get,
dev_configure, dev_close, dev_start and dev_stop. The
wrapper functions allocate / release common resources
for the ML driver and invoke device specific functions.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 230 ++
 drivers/ml/cnxk/cn10k_ml_ops.h |  16 +-
 drivers/ml/cnxk/cnxk_ml_dev.h  |   3 +
 drivers/ml/cnxk/cnxk_ml_ops.c  | 286 -
 drivers/ml/cnxk/cnxk_ml_ops.h  |   3 +
 5 files changed, 314 insertions(+), 224 deletions(-)
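
A condensed sketch of the split this patch makes, assuming the driver's
internal headers; the body is abbreviated and only the call shape follows
the diff (validation and common resources in cnxk, bring-up in cn10k):

static int
cnxk_ml_dev_configure_sketch(struct rte_ml_dev *dev, const struct rte_ml_dev_config *conf)
{
	struct cnxk_ml_dev *cnxk_mldev;
	int ret;

	if (dev == NULL || conf == NULL)
		return -EINVAL;

	cnxk_mldev = dev->data->dev_private;

	/* Common work: validate conf, allocate queue-pair and model arrays. */

	/* Target-specific work: firmware load, OCM and tile setup. */
	ret = cn10k_ml_dev_configure(cnxk_mldev, conf);
	if (ret != 0)
		return ret;

	cnxk_mldev->state = ML_CNXK_DEV_STATE_CONFIGURED;
	return 0;
}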

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index a44fb26215..f8c51ab394 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -101,7 +101,7 @@ qp_memzone_name_get(char *name, int size, int dev_id, int 
qp_id)
snprintf(name, size, "cnxk_ml_qp_mem_%u:%u", dev_id, qp_id);
 }
 
-static int
+int
 cnxk_ml_qp_destroy(const struct rte_ml_dev *dev, struct cnxk_ml_qp *qp)
 {
const struct rte_memzone *qp_mem;
@@ -861,20 +861,12 @@ cn10k_ml_cache_model_data(struct rte_ml_dev *dev, 
uint16_t model_id)
 }
 
 int
-cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info)
+cn10k_ml_dev_info_get(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_dev_info *dev_info)
 {
struct cn10k_ml_dev *cn10k_mldev;
-   struct cnxk_ml_dev *cnxk_mldev;
 
-   if (dev_info == NULL)
-   return -EINVAL;
-
-   cnxk_mldev = dev->data->dev_private;
cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 
-   memset(dev_info, 0, sizeof(struct rte_ml_dev_info));
-   dev_info->driver_name = dev->device->driver->name;
-   dev_info->max_models = ML_CNXK_MAX_MODELS;
if (cn10k_mldev->hw_queue_lock)
dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE_SL;
else
@@ -889,143 +881,17 @@ cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct 
rte_ml_dev_info *dev_info)
 }
 
 int
-cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *conf)
+cn10k_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct rte_ml_dev_config *conf)
 {
-   struct rte_ml_dev_info dev_info;
struct cn10k_ml_dev *cn10k_mldev;
-   struct cnxk_ml_dev *cnxk_mldev;
-   struct cnxk_ml_model *model;
struct cn10k_ml_ocm *ocm;
-   struct cnxk_ml_qp *qp;
-   uint16_t model_id;
-   uint32_t mz_size;
uint16_t tile_id;
-   uint16_t qp_id;
int ret;
 
-   if (dev == NULL || conf == NULL)
-   return -EINVAL;
+   RTE_SET_USED(conf);
 
-   /* Get CN10K device handle */
-   cnxk_mldev = dev->data->dev_private;
cn10k_mldev = &cnxk_mldev->cn10k_mldev;
 
-   cn10k_ml_dev_info_get(dev, &dev_info);
-   if (conf->nb_models > dev_info.max_models) {
-   plt_err("Invalid device config, nb_models > %u\n", 
dev_info.max_models);
-   return -EINVAL;
-   }
-
-   if (conf->nb_queue_pairs > dev_info.max_queue_pairs) {
-   plt_err("Invalid device config, nb_queue_pairs > %u\n", 
dev_info.max_queue_pairs);
-   return -EINVAL;
-   }
-
-   if (cnxk_mldev->state == ML_CNXK_DEV_STATE_PROBED) {
-   plt_ml_dbg("Configuring ML device, nb_queue_pairs = %u, 
nb_models = %u",
-  conf->nb_queue_pairs, conf->nb_models);
-
-   /* Load firmware */
-   ret = cn10k_ml_fw_load(cnxk_mldev);
-   if (ret != 0)
-   return ret;
-   } else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_CONFIGURED) {
-   plt_ml_dbg("Re-configuring ML device, nb_queue_pairs = %u, 
nb_models = %u",
-  conf->nb_queue_pairs, conf->nb_models);
-   } else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_STARTED) {
-   plt_err("Device can't be reconfigured in started state\n");
-   return -ENOTSUP;
-   } else if (cnxk_mldev->state == ML_CNXK_DEV_STATE_CLOSED) {
-   plt_err("Device can't be reconfigured after close\n");
-   return -ENOTSUP;
-   }
-
-   /* Configure queue-pairs */
-   if (dev->data->queue_pairs == NULL) {
-   mz_size = sizeof(dev->data->queue_pairs[0]) * 
conf->nb_queue_pairs;
-   dev->data->queue_pairs =
-   rte_zmalloc("cn10k_mldev_queue_pairs", mz_size, 
RTE_CACHE_LINE_SIZE);
-   if (dev->data->queue_pairs == NULL) {
-   dev->data->nb_queue_pairs = 0;
-   plt_err("Failed to get memory for queue_pairs, 
nb_queue_pairs %u",
-   conf->nb_queue_pairs);
-   return -ENOMEM;
-   }
-   } else { /* Re-configure */
-   void **queue_pairs;
-
-   /* Release all queue pairs as ML spec doesn't support 
queue_pair_destroy. */
-   for (qp_id = 0; qp_id < dev->data->nb_

[PATCH v8 17/34] ml/cnxk: move error handling to cnxk layer

2023-10-22 Thread Srikanth Yalavarthi
Move the error type structures to the cnxk layer. The cn10k
layer now handles only the fw and hw error sub-types.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.h | 41 ++-
 drivers/ml/cnxk/cn10k_ml_ops.c | 93 +-
 drivers/ml/cnxk/cnxk_ml_dev.c  |  8 +++
 drivers/ml/cnxk/cnxk_ml_dev.h  | 18 +++
 drivers/ml/cnxk/cnxk_ml_ops.c  |  2 +-
 5 files changed, 78 insertions(+), 84 deletions(-)
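
For context, a hedged sketch of how a composite error code can be split
across the two layers after this patch: the generic error type is owned by
the cnxk layer and the sub-type tables stay in cn10k. The bit layout here
is illustrative only, not the hardware format:

#include <stdint.h>

static inline void
ml_error_decode_sketch(uint64_t code, uint16_t *etype, uint16_t *stype)
{
	*etype = (uint16_t)((code >> 16) & 0xFFFF); /* generic error type (cnxk) */
	*stype = (uint16_t)(code & 0xFFFF);         /* fw/hw/driver sub-type (cn10k) */
}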

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h
index 94a94d996f..2e7eb6c9ef 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.h
+++ b/drivers/ml/cnxk/cn10k_ml_dev.h
@@ -52,38 +52,27 @@ struct cnxk_ml_dev;
 struct cnxk_ml_req;
 struct cnxk_ml_qp;
 
-/* Error types enumeration */
-enum cn10k_ml_error_etype {
-   /* 0x0 */ ML_ETYPE_NO_ERROR = 0, /* No error */
-   /* 0x1 */ ML_ETYPE_FW_NONFATAL,  /* Firmware non-fatal error */
-   /* 0x2 */ ML_ETYPE_HW_NONFATAL,  /* Hardware non-fatal error */
-   /* 0x3 */ ML_ETYPE_HW_FATAL, /* Hardware fatal error */
-   /* 0x4 */ ML_ETYPE_HW_WARNING,   /* Hardware warning */
-   /* 0x5 */ ML_ETYPE_DRIVER,   /* Driver specific error */
-   /* 0x6 */ ML_ETYPE_UNKNOWN,  /* Unknown error */
-};
-
 /* Firmware non-fatal error sub-type */
 enum cn10k_ml_error_stype_fw_nf {
-	/* 0x0 */ ML_FW_ERR_NOERR = 0,		 /* No error */
-	/* 0x1 */ ML_FW_ERR_UNLOAD_ID_NOT_FOUND, /* Model ID not found during load */
-	/* 0x2 */ ML_FW_ERR_LOAD_LUT_OVERFLOW,	 /* Lookup table overflow at load */
-	/* 0x3 */ ML_FW_ERR_ID_IN_USE,		 /* Model ID already in use */
-	/* 0x4 */ ML_FW_ERR_INVALID_TILEMASK,	 /* Invalid OCM tilemask */
-	/* 0x5 */ ML_FW_ERR_RUN_LUT_OVERFLOW,	 /* Lookup table overflow at run */
-	/* 0x6 */ ML_FW_ERR_RUN_ID_NOT_FOUND,	 /* Model ID not found during run */
-	/* 0x7 */ ML_FW_ERR_COMMAND_NOTSUP,	 /* Unsupported command */
-	/* 0x8 */ ML_FW_ERR_DDR_ADDR_RANGE,	 /* DDR address out of range */
-	/* 0x9 */ ML_FW_ERR_NUM_BATCHES_INVALID, /* Invalid number of batches */
-	/* 0xA */ ML_FW_ERR_INSSYNC_TIMEOUT,	 /* INS sync timeout */
+	/* 0x0 */ ML_CN10K_FW_ERR_NOERR = 0,	       /* No error */
+	/* 0x1 */ ML_CN10K_FW_ERR_UNLOAD_ID_NOT_FOUND, /* Model ID not found during load */
+	/* 0x2 */ ML_CN10K_FW_ERR_LOAD_LUT_OVERFLOW,   /* Lookup table overflow at load */
+	/* 0x3 */ ML_CN10K_FW_ERR_ID_IN_USE,	       /* Model ID already in use */
+	/* 0x4 */ ML_CN10K_FW_ERR_INVALID_TILEMASK,    /* Invalid OCM tilemask */
+	/* 0x5 */ ML_CN10K_FW_ERR_RUN_LUT_OVERFLOW,    /* Lookup table overflow at run */
+	/* 0x6 */ ML_CN10K_FW_ERR_RUN_ID_NOT_FOUND,    /* Model ID not found during run */
+	/* 0x7 */ ML_CN10K_FW_ERR_COMMAND_NOTSUP,      /* Unsupported command */
+	/* 0x8 */ ML_CN10K_FW_ERR_DDR_ADDR_RANGE,      /* DDR address out of range */
+	/* 0x9 */ ML_CN10K_FW_ERR_NUM_BATCHES_INVALID, /* Invalid number of batches */
+	/* 0xA */ ML_CN10K_FW_ERR_INSSYNC_TIMEOUT,     /* INS sync timeout */
 };
 
 /* Driver error sub-type */
 enum cn10k_ml_error_stype_driver {
-	/* 0x0 */ ML_DRIVER_ERR_NOERR = 0, /* No error */
-	/* 0x1 */ ML_DRIVER_ERR_UNKNOWN,   /* Unable to determine error sub-type */
-	/* 0x2 */ ML_DRIVER_ERR_EXCEPTION, /* Firmware exception */
-	/* 0x3 */ ML_DRIVER_ERR_FW_ERROR,  /* Unknown firmware error */
+	/* 0x0 */ ML_CN10K_DRIVER_ERR_NOERR = 0, /* No error */
+	/* 0x1 */ ML_CN10K_DRIVER_ERR_UNKNOWN,	 /* Unable to determine error sub-type */
+	/* 0x2 */ ML_CN10K_DRIVER_ERR_EXCEPTION, /* Firmware exception */
+	/* 0x3 */ ML_CN10K_DRIVER_ERR_FW_ERROR,  /* Unknown firmware error */
 };
 
 /* Error structure */
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 8116c8dedb..65eaaf030d 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -22,47 +22,27 @@
 #define ML_FLAGS_POLL_COMPL BIT(0)
 #define ML_FLAGS_SSO_COMPL  BIT(1)
 
-/* Error message length */
-#define ERRMSG_LEN 32
-
-/* Error type database */
-static const struct cn10k_ml_etype_db {
-   enum cn10k_ml_error_etype etype;
-   char name[ERRMSG_LEN];
-} ml_etype_db[] = {
-   {ML_ETYPE_NO_ERROR, "NO_ERROR"},{ML_ETYPE_FW_NONFATAL, 
"FW_NON_FATAL"},
-   {ML_ETYPE_HW_NONFATAL, "HW_NON_FATAL"}, {ML_ETYPE_HW_FATAL, "HW_FATAL"},
-   {ML_ETYPE_HW_WARNING, "HW_WARNING"},{ML_ETYPE_DRIVER, 
"DRIVER_ERROR"},
-   {ML_ETYPE_UNKNOWN, "UNKNOWN_ERROR"},
-};
-
 /* Hardware non-fatal error subtype database */
-static const struct cn10k_ml_stype_db_hw_nf {
-   enum cn10k_ml_error_stype_fw_nf stype;
-   char msg[ERRMSG_LEN];
-} ml_stype_db_hw_nf[] = {
-   {ML_FW_ERR_NOERR, "NO ERROR"},
-   {ML_FW_ERR_UNLOAD_ID_NOT_FOUND, "UNLOAD MODEL ID NOT FOUND"},
-   {ML_FW_ERR_LOAD_LUT_OVERFLOW, "LOAD LUT OVERFLOW"},
-   {ML_F

[PATCH v8 15/34] ml/cnxk: update device and model xstats functions

2023-10-22 Thread Srikanth Yalavarthi
Added cnxk wrapper functions to handle ML device and model
extended stats. Resource handling for the xstats is done
in the cnxk layer. Introduced an internal xstats group.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.h   |   4 -
 drivers/ml/cnxk/cn10k_ml_ops.c   | 531 +++
 drivers/ml/cnxk/cn10k_ml_ops.h   |  16 +-
 drivers/ml/cnxk/cnxk_ml_dev.h|   5 +
 drivers/ml/cnxk/cnxk_ml_ops.c| 481 +++-
 drivers/ml/cnxk/cnxk_ml_xstats.h |  21 +-
 6 files changed, 551 insertions(+), 507 deletions(-)
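
The table sizing implied by the removed cn10k_ml_xstats_init() carries over
to the cnxk layer: one entry per device stat plus a fixed per-model set for
every model slot. A minimal sketch (max_models is ML_CNXK_MAX_MODELS in the
driver):

#include <stddef.h>

static inline size_t
ml_xstats_table_size_sketch(size_t nb_device_stats, size_t nb_layer_stats,
			    size_t max_models)
{
	return nb_device_stats + max_models * nb_layer_stats;
}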

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h
index be989e0a20..bde9d08901 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.h
+++ b/drivers/ml/cnxk/cn10k_ml_dev.h
@@ -10,7 +10,6 @@
 #include "cn10k_ml_ocm.h"
 
 #include "cnxk_ml_io.h"
-#include "cnxk_ml_xstats.h"
 
 /* Dummy Device ops */
 extern struct rte_ml_dev_ops ml_dev_dummy_ops;
@@ -133,9 +132,6 @@ struct cn10k_ml_dev {
/* OCM info */
struct cn10k_ml_ocm ocm;
 
-   /* Extended stats data */
-   struct cnxk_ml_xstats xstats;
-
/* Enable / disable model data caching */
int cache_model_data;
 
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 8cbf700f6e..776ad60401 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -198,107 +198,21 @@ cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev 
*cnxk_mldev, struct cnxk_ml_r
req->cn10k_req.jd.model_run.num_batches = op->nb_batches;
 }
 
-static int
-cn10k_ml_xstats_init(struct rte_ml_dev *dev)
-{
-   struct cn10k_ml_dev *cn10k_mldev;
-   struct cnxk_ml_dev *cnxk_mldev;
-   uint16_t nb_stats;
-   uint16_t stat_id;
-   uint16_t model;
-   uint16_t i;
-
-   cnxk_mldev = dev->data->dev_private;
-   cn10k_mldev = &cnxk_mldev->cn10k_mldev;
-
-	/* Allocate memory for xstats entries. Don't allocate during reconfigure */
-	nb_stats = RTE_DIM(device_xstats) + ML_CNXK_MAX_MODELS * RTE_DIM(layer_xstats);
-   if (cn10k_mldev->xstats.entries == NULL)
-   cn10k_mldev->xstats.entries = rte_zmalloc(
-   "cn10k_ml_xstats", sizeof(struct cnxk_ml_xstats_entry) 
* nb_stats,
-   PLT_CACHE_LINE_SIZE);
-
-   if (cn10k_mldev->xstats.entries == NULL)
-   return -ENOMEM;
-
-   /* Initialize device xstats */
-   stat_id = 0;
-   for (i = 0; i < RTE_DIM(device_xstats); i++) {
-   cn10k_mldev->xstats.entries[stat_id].map.id = stat_id;
-   snprintf(cn10k_mldev->xstats.entries[stat_id].map.name,
-sizeof(cn10k_mldev->xstats.entries[stat_id].map.name), 
"%s",
-device_xstats[i].name);
-
-   cn10k_mldev->xstats.entries[stat_id].mode = 
RTE_ML_DEV_XSTATS_DEVICE;
-   cn10k_mldev->xstats.entries[stat_id].type = 
device_xstats[i].type;
-   cn10k_mldev->xstats.entries[stat_id].fn_id = 
CNXK_ML_XSTATS_FN_DEVICE;
-   cn10k_mldev->xstats.entries[stat_id].obj_idx = 0;
-   cn10k_mldev->xstats.entries[stat_id].reset_allowed = 
device_xstats[i].reset_allowed;
-   stat_id++;
-   }
-   cn10k_mldev->xstats.count_mode_device = stat_id;
-
-   /* Initialize model xstats */
-   for (model = 0; model < ML_CNXK_MAX_MODELS; model++) {
-   cn10k_mldev->xstats.offset_for_model[model] = stat_id;
-
-   for (i = 0; i < RTE_DIM(layer_xstats); i++) {
-   cn10k_mldev->xstats.entries[stat_id].map.id = stat_id;
-   cn10k_mldev->xstats.entries[stat_id].mode = 
RTE_ML_DEV_XSTATS_MODEL;
-   cn10k_mldev->xstats.entries[stat_id].type = 
layer_xstats[i].type;
-   cn10k_mldev->xstats.entries[stat_id].fn_id = 
CNXK_ML_XSTATS_FN_MODEL;
-   cn10k_mldev->xstats.entries[stat_id].obj_idx = model;
-   cn10k_mldev->xstats.entries[stat_id].reset_allowed =
-   layer_xstats[i].reset_allowed;
-
-   /* Name of xstat is updated during model load */
-   snprintf(cn10k_mldev->xstats.entries[stat_id].map.name,
-
sizeof(cn10k_mldev->xstats.entries[stat_id].map.name),
-"Model-%u-%s", model, layer_xstats[i].name);
-
-   stat_id++;
-   }
-
-   cn10k_mldev->xstats.count_per_model[model] = 
RTE_DIM(layer_xstats);
-   }
-
-   cn10k_mldev->xstats.count_mode_model = stat_id - 
cn10k_mldev->xstats.count_mode_device;
-   cn10k_mldev->xstats.count = stat_id;
-
-   return 0;
-}
-
 static void
-cn10k_ml_xstats_uninit(struct rte_ml_dev *dev)
+cn10k_ml_xstats_layer_name_update(struct cnxk_ml_dev *cnxk_mldev, uint16_t 
model_id,
+ uint16_t layer_id)
 {
-   struct cn10k_

[PATCH v8 08/34] ml/cnxk: update queue-pair handling functions

2023-10-22 Thread Srikanth Yalavarthi
Added cnxk wrapper function to handle ML device queue-pairs.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 135 +
 drivers/ml/cnxk/cn10k_ml_ops.h |   7 +-
 drivers/ml/cnxk/cnxk_ml_ops.c  | 153 -
 drivers/ml/cnxk/cnxk_ml_ops.h  |   3 -
 4 files changed, 154 insertions(+), 144 deletions(-)
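
A sketch of the queue-pair split after this patch: generic allocation stays
in the cnxk layer, and cn10k_ml_qp_initialize() only fills in the target job
commands. Everything except cn10k_ml_qp_initialize() is abbreviated from the
diff and assumes the driver's internal headers:

static int
cnxk_ml_queue_pair_setup_sketch(struct rte_ml_dev *dev, uint16_t qp_id,
				const struct rte_ml_dev_qp_conf *qp_conf, int socket_id)
{
	struct cnxk_ml_dev *cnxk_mldev = dev->data->dev_private;
	struct cnxk_ml_qp *qp;

	/* Generic: memzone-backed descriptor ring, stats reset to zero. */
	qp = cnxk_ml_qp_create(dev, qp_id, qp_conf->nb_desc, socket_id);
	if (qp == NULL)
		return -ENOMEM;

	/* Target hook: per-descriptor CN10K job command setup. */
	cn10k_ml_qp_initialize(cnxk_mldev, qp);

	dev->data->queue_pairs[qp_id] = qp;
	return 0;
}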

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index f8c51ab394..9691cf03e3 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -95,93 +95,12 @@ cn10k_ml_get_poll_ptr(struct cnxk_ml_req *req)
return plt_read64(req->status);
 }
 
-static void
-qp_memzone_name_get(char *name, int size, int dev_id, int qp_id)
-{
-   snprintf(name, size, "cnxk_ml_qp_mem_%u:%u", dev_id, qp_id);
-}
-
-int
-cnxk_ml_qp_destroy(const struct rte_ml_dev *dev, struct cnxk_ml_qp *qp)
-{
-   const struct rte_memzone *qp_mem;
-   char name[RTE_MEMZONE_NAMESIZE];
-   int ret;
-
-   qp_memzone_name_get(name, RTE_MEMZONE_NAMESIZE, dev->data->dev_id, 
qp->id);
-   qp_mem = rte_memzone_lookup(name);
-   ret = rte_memzone_free(qp_mem);
-   if (ret)
-   return ret;
-
-   rte_free(qp);
-
-   return 0;
-}
-
-int
-cn10k_ml_dev_queue_pair_release(struct rte_ml_dev *dev, uint16_t queue_pair_id)
-{
-   struct cnxk_ml_qp *qp;
-   int ret;
-
-   qp = dev->data->queue_pairs[queue_pair_id];
-   if (qp == NULL)
-   return -EINVAL;
-
-   ret = cnxk_ml_qp_destroy(dev, qp);
-   if (ret) {
-   plt_err("Could not destroy queue pair %u", queue_pair_id);
-   return ret;
-   }
-
-   dev->data->queue_pairs[queue_pair_id] = NULL;
-
-   return 0;
-}
-
-static struct cnxk_ml_qp *
-cnxk_ml_qp_create(const struct rte_ml_dev *dev, uint16_t qp_id, uint32_t 
nb_desc, int socket_id)
+void
+cn10k_ml_qp_initialize(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_qp *qp)
 {
-   const struct rte_memzone *qp_mem;
-   char name[RTE_MEMZONE_NAMESIZE];
-   struct cnxk_ml_qp *qp;
-   uint32_t len;
-   uint8_t *va;
uint64_t i;
 
-   /* Allocate queue pair */
-   qp = rte_zmalloc_socket("cn10k_ml_pmd_queue_pair", sizeof(struct 
cnxk_ml_qp), ROC_ALIGN,
-   socket_id);
-   if (qp == NULL) {
-   plt_err("Could not allocate queue pair");
-   return NULL;
-   }
-
-   /* For request queue */
-   len = nb_desc * sizeof(struct cnxk_ml_req);
-   qp_memzone_name_get(name, RTE_MEMZONE_NAMESIZE, dev->data->dev_id, 
qp_id);
-   qp_mem = rte_memzone_reserve_aligned(
-   name, len, socket_id, RTE_MEMZONE_SIZE_HINT_ONLY | 
RTE_MEMZONE_256MB, ROC_ALIGN);
-   if (qp_mem == NULL) {
-   plt_err("Could not reserve memzone: %s", name);
-   goto qp_free;
-   }
-
-   va = qp_mem->addr;
-   memset(va, 0, len);
-
-   /* Initialize Request queue */
-   qp->id = qp_id;
-   qp->queue.reqs = (struct cnxk_ml_req *)va;
-   qp->queue.head = 0;
-   qp->queue.tail = 0;
-   qp->queue.wait_cycles = ML_CNXK_CMD_TIMEOUT * plt_tsc_hz();
-   qp->nb_desc = nb_desc;
-   qp->stats.enqueued_count = 0;
-   qp->stats.dequeued_count = 0;
-   qp->stats.enqueue_err_count = 0;
-   qp->stats.dequeue_err_count = 0;
+   RTE_SET_USED(cnxk_mldev);
 
/* Initialize job command */
for (i = 0; i < qp->nb_desc; i++) {
@@ -189,13 +108,6 @@ cnxk_ml_qp_create(const struct rte_ml_dev *dev, uint16_t 
qp_id, uint32_t nb_desc
qp->queue.reqs[i].cn10k_req.jcmd.w1.s.jobptr =
PLT_U64_CAST(&qp->queue.reqs[i].cn10k_req.jd);
}
-
-   return qp;
-
-qp_free:
-   rte_free(qp);
-
-   return NULL;
 }
 
 static void
@@ -1002,47 +914,6 @@ cn10k_ml_dev_stop(struct cnxk_ml_dev *cnxk_mldev)
return 0;
 }
 
-int
-cn10k_ml_dev_queue_pair_setup(struct rte_ml_dev *dev, uint16_t queue_pair_id,
- const struct rte_ml_dev_qp_conf *qp_conf, int 
socket_id)
-{
-   struct rte_ml_dev_info dev_info;
-   struct cnxk_ml_qp *qp;
-   uint32_t nb_desc;
-
-   if (queue_pair_id >= dev->data->nb_queue_pairs) {
-   plt_err("Queue-pair id = %u (>= max queue pairs supported, 
%u)\n", queue_pair_id,
-   dev->data->nb_queue_pairs);
-   return -EINVAL;
-   }
-
-   if (dev->data->queue_pairs[queue_pair_id] != NULL)
-   cn10k_ml_dev_queue_pair_release(dev, queue_pair_id);
-
-   cnxk_ml_dev_info_get(dev, &dev_info);
-   if ((qp_conf->nb_desc > dev_info.max_desc) || (qp_conf->nb_desc == 0)) {
-   plt_err("Could not setup queue pair for %u descriptors", 
qp_conf->nb_desc);
-   return -EINVAL;
-   }
-   plt_ml_dbg("Creating queue-pair, queue_pair_id = %u, nb_desc = %u", 
queue_pair_id,
-  qp

[PATCH v8 10/34] ml/cnxk: update model start and stop functions

2023-10-22 Thread Srikanth Yalavarthi
Implemented cnxk wrapper functions to start and stop
ML models. The wrappers invoke the cn10k model start
and stop functions.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ocm.c |  28 ++--
 drivers/ml/cnxk/cn10k_ml_ocm.h |  12 +-
 drivers/ml/cnxk/cn10k_ml_ops.c | 282 -
 drivers/ml/cnxk/cn10k_ml_ops.h |   8 +-
 drivers/ml/cnxk/cnxk_ml_ops.c  |  48 +-
 drivers/ml/cnxk/cnxk_ml_ops.h  |   1 +
 6 files changed, 240 insertions(+), 139 deletions(-)
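
The start/stop wrappers follow the same delegation shape as load/unload; a
sketch using the cn10k entry point from this patch, with illustrative
argument checks:

static int
cnxk_ml_model_start_sketch(struct rte_ml_dev *dev, uint16_t model_id)
{
	struct cnxk_ml_dev *cnxk_mldev;
	struct cnxk_ml_model *model;

	if (dev == NULL)
		return -EINVAL;

	cnxk_mldev = dev->data->dev_private;
	model = dev->data->models[model_id];
	if (model == NULL)
		return -EINVAL;

	/* OCM tilemask selection and page reservation happen below this. */
	return cn10k_ml_model_start(cnxk_mldev, model);
}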

diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.c b/drivers/ml/cnxk/cn10k_ml_ocm.c
index d71c36eae6..2197e5e0ed 100644
--- a/drivers/ml/cnxk/cn10k_ml_ocm.c
+++ b/drivers/ml/cnxk/cn10k_ml_ocm.c
@@ -215,11 +215,10 @@ cn10k_ml_ocm_tilecount(uint64_t tilemask, int *start, int *end)
  * scratch & WB pages and OCM allocation mode.
  */
 int
-cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t wb_pages,
+cn10k_ml_ocm_tilemask_find(struct cnxk_ml_dev *cnxk_mldev, uint8_t num_tiles, uint16_t wb_pages,
 			   uint16_t scratch_pages, uint64_t *tilemask)
 {
struct cn10k_ml_dev *cn10k_mldev;
-   struct cnxk_ml_dev *cnxk_mldev;
struct cn10k_ml_ocm *ocm;
 
uint16_t used_scratch_pages_max;
@@ -238,7 +237,6 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t w
int max_slot_sz;
int page_id;
 
-   cnxk_mldev = dev->data->dev_private;
cn10k_mldev = &cnxk_mldev->cn10k_mldev;
ocm = &cn10k_mldev->ocm;
 
@@ -333,12 +331,10 @@ cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, 
uint8_t num_tiles, uint16_t w
 }
 
 void
-cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t layer_id,
+cn10k_ml_ocm_reserve_pages(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, uint16_t layer_id,
 			   uint64_t tilemask, int wb_page_start, uint16_t wb_pages,
 			   uint16_t scratch_pages)
 {
-   struct cn10k_ml_dev *cn10k_mldev;
-   struct cnxk_ml_dev *cnxk_mldev;
struct cnxk_ml_model *model;
struct cnxk_ml_layer *layer;
struct cn10k_ml_ocm *ocm;
@@ -351,10 +347,8 @@ cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t l
int tile_id;
int page_id;
 
-   cnxk_mldev = dev->data->dev_private;
-   cn10k_mldev = &cnxk_mldev->cn10k_mldev;
-   ocm = &cn10k_mldev->ocm;
-   model = dev->data->models[model_id];
+   ocm = &cnxk_mldev->cn10k_mldev.ocm;
+   model = cnxk_mldev->mldev->data->models[model_id];
layer = &model->layer[layer_id];
 
/* Get first set bit, tile_start */
@@ -396,12 +390,10 @@ cn10k_ml_ocm_reserve_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t l
 }
 
 void
-cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t layer_id)
+cn10k_ml_ocm_free_pages(struct cnxk_ml_dev *cnxk_mldev, uint16_t model_id, uint16_t layer_id)
 {
struct cnxk_ml_model *local_model;
struct cnxk_ml_layer *local_layer;
-   struct cn10k_ml_dev *cn10k_mldev;
-   struct cnxk_ml_dev *cnxk_mldev;
struct cnxk_ml_model *model;
struct cnxk_ml_layer *layer;
struct cn10k_ml_ocm *ocm;
@@ -416,10 +408,8 @@ cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t laye
uint16_t i;
uint16_t j;
 
-   cnxk_mldev = dev->data->dev_private;
-   cn10k_mldev = &cnxk_mldev->cn10k_mldev;
-   ocm = &cn10k_mldev->ocm;
-   model = dev->data->models[model_id];
+   ocm = &cnxk_mldev->cn10k_mldev.ocm;
+   model = cnxk_mldev->mldev->data->models[model_id];
layer = &model->layer[layer_id];
 
/* Update OCM info for WB memory */
@@ -438,8 +428,8 @@ cn10k_ml_ocm_free_pages(struct rte_ml_dev *dev, uint16_t model_id, uint16_t laye
 
 		/* Get max scratch pages required, excluding the current model */
scratch_resize_pages = 0;
-   for (i = 0; i < dev->data->nb_models; i++) {
-   local_model = dev->data->models[i];
+   for (i = 0; i < cnxk_mldev->mldev->data->nb_models; i++) {
+   local_model = cnxk_mldev->mldev->data->models[i];
if (local_model == NULL)
continue;
 
diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.h b/drivers/ml/cnxk/cn10k_ml_ocm.h
index 720f8caf76..97b723a56a 100644
--- a/drivers/ml/cnxk/cn10k_ml_ocm.h
+++ b/drivers/ml/cnxk/cn10k_ml_ocm.h
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+struct cnxk_ml_dev;
+
 /* Number of OCM tiles. */
 #define ML_CN10K_OCM_NUMTILES 0x8
 
@@ -75,12 +77,12 @@ struct cn10k_ml_ocm {
 };
 
 int cn10k_ml_ocm_tilecount(uint64_t tilemask, int *start, int *end);
-int cn10k_ml_ocm_tilemask_find(struct rte_ml_dev *dev, uint8_t num_tiles, uint16_t wb_pages,
+int cn10k_ml_ocm_tilemask_find(struct cnxk_ml_dev *cnxk_mldev, uint8_t num_tiles, uint16_t wb_pages,
  

[PATCH v8 19/34] ml/cnxk: add structures to support TVM model type

2023-10-22 Thread Srikanth Yalavarthi
Introduced model type, sub-type and layer type. Added
internal structures for TVM model objects.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ocm.c   |  3 ++
 drivers/ml/cnxk/cn10k_ml_ops.c   |  6 ++-
 drivers/ml/cnxk/cnxk_ml_model.h  | 66 +++-
 drivers/ml/cnxk/cnxk_ml_ops.c| 52 -
 drivers/ml/cnxk/mvtvm_ml_model.h | 46 ++
 5 files changed, 160 insertions(+), 13 deletions(-)
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_model.h
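
A small hypothetical helper over the new enum, e.g. for the debug dumps
added earlier in this series; it is not part of this patch:

static const char *
cnxk_ml_model_type_str_sketch(enum cnxk_ml_model_type type)
{
	switch (type) {
	case ML_CNXK_MODEL_TYPE_GLOW:
		return "glow";
	case ML_CNXK_MODEL_TYPE_TVM:
		return "tvm";
	case ML_CNXK_MODEL_TYPE_INVALID:
		return "invalid";
	default:
		return "unknown";
	}
}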

diff --git a/drivers/ml/cnxk/cn10k_ml_ocm.c b/drivers/ml/cnxk/cn10k_ml_ocm.c
index dc315cce10..749ddeb344 100644
--- a/drivers/ml/cnxk/cn10k_ml_ocm.c
+++ b/drivers/ml/cnxk/cn10k_ml_ocm.c
@@ -435,6 +435,9 @@ cn10k_ml_ocm_free_pages(struct cnxk_ml_dev *cnxk_mldev, 
uint16_t model_id, uint1
 
for (j = 0; j < local_model->nb_layers; j++) {
local_layer = &local_model->layer[j];
+   if (local_layer->type != 
ML_CNXK_LAYER_TYPE_MRVL)
+   continue;
+
if (local_layer != layer &&
local_layer->glow.ocm_map.ocm_reserved) {
if 
(IS_BIT_SET(local_layer->glow.ocm_map.tilemask, tile_id))
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 65eaaf030d..a471e98fbf 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -725,6 +725,9 @@ cn10k_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
if (ret != 0)
return ret;
 
+   /* Set model sub type */
+   model->subtype = ML_CNXK_MODEL_SUBTYPE_GLOW_MRVL;
+
/* Copy metadata to internal buffer */
rte_memcpy(&model->glow.metadata, params->addr, sizeof(struct 
cn10k_ml_model_metadata));
cn10k_ml_model_metadata_update(&model->glow.metadata);
@@ -746,6 +749,7 @@ cn10k_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
 
/* Load layer and get the index */
layer = &model->layer[0];
+   layer->type = ML_CNXK_LAYER_TYPE_MRVL;
ret = cn10k_ml_layer_load(cnxk_mldev, model->model_id, NULL, 
params->addr, params->size,
  &layer->index);
if (ret != 0) {
@@ -969,7 +973,7 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const 
char *layer_name)
if (ret < 0) {
cn10k_ml_layer_stop(device, model_id, layer_name);
} else {
-   if (cn10k_mldev->cache_model_data)
+   if (cn10k_mldev->cache_model_data && model->type == 
ML_CNXK_MODEL_TYPE_GLOW)
ret = cn10k_ml_cache_model_data(cnxk_mldev, layer);
}
 
diff --git a/drivers/ml/cnxk/cnxk_ml_model.h b/drivers/ml/cnxk/cnxk_ml_model.h
index f618e5aa5f..f100eca203 100644
--- a/drivers/ml/cnxk/cnxk_ml_model.h
+++ b/drivers/ml/cnxk/cnxk_ml_model.h
@@ -11,6 +11,10 @@
 
 #include "cn10k_ml_model.h"
 
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+#include "mvtvm_ml_model.h"
+#endif
+
 #include "cnxk_ml_io.h"
 
 struct cnxk_ml_dev;
@@ -18,6 +22,48 @@ struct cnxk_ml_model;
 struct cnxk_ml_qp;
 struct cnxk_ml_req;
 
+/* Model type */
+enum cnxk_ml_model_type {
+   /* Unknown model type */
+   ML_CNXK_MODEL_TYPE_UNKNOWN,
+
+   /* Invalid model type */
+   ML_CNXK_MODEL_TYPE_INVALID,
+
+   /* Glow compiled model, for MLIP target */
+   ML_CNXK_MODEL_TYPE_GLOW,
+
+   /* TVM compiled model, for ARM64 / ARM64 + MLIP target */
+   ML_CNXK_MODEL_TYPE_TVM,
+};
+
+/* Model subtype */
+enum cnxk_ml_model_subtype {
+   /* Marvell Glow model */
+   ML_CNXK_MODEL_SUBTYPE_GLOW_MRVL,
+
+   /* TVM model with single MRVL region */
+   ML_CNXK_MODEL_SUBTYPE_TVM_MRVL,
+
+   /* TVM model with LLVM regions only */
+   ML_CNXK_MODEL_SUBTYPE_TVM_LLVM,
+
+	/* TVM hybrid model, with both MRVL and LLVM regions or (> 1) MRVL regions */
+   ML_CNXK_MODEL_SUBTYPE_TVM_HYBRID,
+};
+
+/* Layer type */
+enum cnxk_ml_layer_type {
+	/* Unknown layer type */
+	ML_CNXK_LAYER_TYPE_UNKNOWN = 0,
+
+	/* MRVL layer, for MLIP target */
+	ML_CNXK_LAYER_TYPE_MRVL,
+
+	/* LLVM layer, for ARM64 target */
+	ML_CNXK_LAYER_TYPE_LLVM,
+};
+
 /* Model state */
 enum cnxk_ml_model_state {
/* Unknown state */
@@ -53,6 +99,9 @@ struct cnxk_ml_layer {
/* Name*/
char name[RTE_ML_STR_MAX];
 
+   /* Type */
+   enum cnxk_ml_layer_type type;
+
/* Model handle */
struct cnxk_ml_model *model;
 
@@ -83,14 +132,27 @@ struct cnxk_ml_model {
/* Device reference */
struct cnxk_ml_dev *cnxk_mldev;
 
+   /* Type */
+   enum cnxk_ml_model_type type;
+
+   /* Model subtype */
+   enum cnxk_ml_model_subtype subtype;
+
/* ID */
uint16_t model_id;
 
/* Name */
char name[R

[PATCH v8 11/34] ml/cnxk: update model utility functions

2023-10-22 Thread Srikanth Yalavarthi
Added cnxk wrapper functions to update model params and
fetch model info.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 38 ++-
 drivers/ml/cnxk/cn10k_ml_ops.h |  5 ++--
 drivers/ml/cnxk/cnxk_ml_ops.c  | 48 --
 3 files changed, 56 insertions(+), 35 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 40f484158a..3ff82829f0 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1835,45 +1835,23 @@ cn10k_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model)
 }
 
 int
-cn10k_ml_model_info_get(struct rte_ml_dev *dev, uint16_t model_id,
-   struct rte_ml_model_info *model_info)
+cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
+			     void *buffer)
 {
-   struct cnxk_ml_model *model;
-
-   model = dev->data->models[model_id];
-
-   if (model == NULL) {
-   plt_err("Invalid model_id = %u", model_id);
-   return -EINVAL;
-   }
-
-   rte_memcpy(model_info, model->info, sizeof(struct rte_ml_model_info));
-	model_info->input_info = ((struct rte_ml_model_info *)model->info)->input_info;
-	model_info->output_info = ((struct rte_ml_model_info *)model->info)->output_info;
-
-   return 0;
-}
-
-int
-cn10k_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buffer)
-{
-   struct cnxk_ml_model *model;
-
-   model = dev->data->models[model_id];
+   struct cnxk_ml_layer *layer;
 
-   if (model == NULL) {
-   plt_err("Invalid model_id = %u", model_id);
-   return -EINVAL;
-   }
+   RTE_SET_USED(cnxk_mldev);
 
if (model->state == ML_CNXK_MODEL_STATE_UNKNOWN)
return -1;
else if (model->state != ML_CNXK_MODEL_STATE_LOADED)
return -EBUSY;
 
+   layer = &model->layer[0];
+
/* Update model weights & bias */
-   rte_memcpy(model->layer[0].glow.addr.wb_load_addr, buffer,
-  model->layer[0].glow.metadata.weights_bias.file_size);
+   rte_memcpy(layer->glow.addr.wb_load_addr, buffer,
+  layer->glow.metadata.weights_bias.file_size);
 
return 0;
 }
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index a222a43d55..ef12069f0d 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -317,9 +317,8 @@ int cn10k_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_para
 int cn10k_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
 int cn10k_ml_model_start(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
 int cn10k_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model);
-int cn10k_ml_model_info_get(struct rte_ml_dev *dev, uint16_t model_id,
-   struct rte_ml_model_info *model_info);
-int cn10k_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buffer);
+int cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
+				 void *buffer);
 
 /* I/O ops */
 int cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id,
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index b61ed45876..9ce37fcfd1 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -604,6 +604,50 @@ cnxk_ml_model_stop(struct rte_ml_dev *dev, uint16_t 
model_id)
return cn10k_ml_model_stop(cnxk_mldev, model);
 }
 
+static int
+cnxk_ml_model_info_get(struct rte_ml_dev *dev, uint16_t model_id,
+  struct rte_ml_model_info *model_info)
+{
+   struct rte_ml_model_info *info;
+   struct cnxk_ml_model *model;
+
+   if ((dev == NULL) || (model_info == NULL))
+   return -EINVAL;
+
+   model = dev->data->models[model_id];
+   if (model == NULL) {
+   plt_err("Invalid model_id = %u", model_id);
+   return -EINVAL;
+   }
+
+   info = (struct rte_ml_model_info *)model->info;
+   rte_memcpy(model_info, info, sizeof(struct rte_ml_model_info));
+   model_info->input_info = info->input_info;
+   model_info->output_info = info->output_info;
+
+   return 0;
+}
+
+static int
+cnxk_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void 
*buffer)
+{
+   struct cnxk_ml_dev *cnxk_mldev;
+   struct cnxk_ml_model *model;
+
+   if ((dev == NULL) || (buffer == NULL))
+   return -EINVAL;
+
+   cnxk_mldev = dev->data->dev_private;
+
+   model = dev->data->models[model_id];
+   if (model == NULL) {
+   plt_err("Invalid model_id = %u", model_id);
+   return -EINVAL;
+   }
+
+   return cn10k_ml_model_params_update(cnxk_mldev, model, buffer);
+}
+

[PATCH v8 21/34] ml/cnxk: add support to parse TVM model objects

2023-10-22 Thread Srikanth Yalavarthi
Added support to parse TVM model objects from the model
archive buffer, check that all expected objects are
present, and copy the TVM model objects to internal buffers.

Signed-off-by: Srikanth Yalavarthi 
Signed-off-by: Anup Prabhu 
---
 drivers/ml/cnxk/cnxk_ml_ops.c|  5 ++-
 drivers/ml/cnxk/mvtvm_ml_model.c | 57 +
 drivers/ml/cnxk/mvtvm_ml_model.h |  2 ++
 drivers/ml/cnxk/mvtvm_ml_ops.c   | 62 
 drivers/ml/cnxk/mvtvm_ml_ops.h   |  3 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.c | 11 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  3 ++
 7 files changed, 142 insertions(+), 1 deletion(-)
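
The parser below is built on libarchive; as background, a minimal
self-contained walk over an in-memory archive using the same calls
(error handling trimmed for brevity):

#include <archive.h>
#include <archive_entry.h>
#include <stddef.h>
#include <stdio.h>

static int
list_archive_entries(const void *buf, size_t size)
{
	struct archive_entry *entry;
	struct archive *a;

	a = archive_read_new();
	archive_read_support_filter_all(a);
	archive_read_support_format_all(a);
	if (archive_read_open_memory(a, buf, size) != ARCHIVE_OK) {
		archive_read_free(a);
		return -1;
	}

	/* Iterate entry headers; skip bodies that are not needed. */
	while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
		printf("%s (%lld bytes)\n", archive_entry_pathname(entry),
		       (long long)archive_entry_size(entry));
		archive_read_data_skip(a);
	}

	archive_read_free(a);
	return 0;
}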

diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index ebc78e36e9..85b37161d2 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1079,7 +1079,10 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct 
rte_ml_model_params *params, u
model, PLT_ALIGN_CEIL(sizeof(struct cnxk_ml_model), 
dev_info.align_size));
dev->data->models[lcl_model_id] = model;
 
-   ret = cn10k_ml_model_load(cnxk_mldev, params, model);
+   if (type == ML_CNXK_MODEL_TYPE_GLOW)
+   ret = cn10k_ml_model_load(cnxk_mldev, params, model);
+   else
+   ret = mvtvm_ml_model_load(cnxk_mldev, params, model);
if (ret != 0)
goto error;
 
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index ab5f8baa67..4c9a080c05 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -53,3 +53,60 @@ mvtvm_ml_model_type_get(struct rte_ml_model_params *params)
 
return ML_CNXK_MODEL_TYPE_TVM;
 }
+
+int
+mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params, struct 
mvtvm_ml_model_object *object)
+{
+   bool object_found[ML_MVTVM_MODEL_OBJECT_MAX] = {false, false, false};
+   struct archive_entry *entry;
+   struct archive *a;
+   uint8_t i;
+   int ret;
+
+   /* Open archive */
+   a = archive_read_new();
+   archive_read_support_filter_all(a);
+   archive_read_support_format_all(a);
+
+   ret = archive_read_open_memory(a, params->addr, params->size);
+   if (ret != ARCHIVE_OK)
+   return archive_errno(a);
+
+   /* Read archive */
+   while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {
+   for (i = 0; i < ML_MVTVM_MODEL_OBJECT_MAX; i++) {
+   if (!object_found[i] &&
+   (strcmp(archive_entry_pathname(entry), 
mvtvm_object_list[i]) == 0)) {
+   memcpy(object[i].name, mvtvm_object_list[i], 
RTE_ML_STR_MAX);
+   object[i].size = archive_entry_size(entry);
+   object[i].buffer = rte_malloc(NULL, object[i].size, 0);
+   if (object[i].buffer == NULL)
+   goto error;
+
+   if (archive_read_data(a, object[i].buffer, 
object[i].size) !=
+   object[i].size) {
+   plt_err("Failed to read object from 
model archive: %s",
+   object[i].name);
+   goto error;
+   }
+   object_found[i] = true;
+   }
+   }
+   archive_read_data_skip(a);
+   }
+
+   /* Check if all objects are parsed */
+   for (i = 0; i < ML_MVTVM_MODEL_OBJECT_MAX; i++) {
+   if (!object_found[i]) {
+   plt_err("Object %s not found in archive!\n", 
mvtvm_object_list[i]);
+   goto error;
+   }
+   }
+   return 0;
+
+error:
+   for (i = 0; i < ML_MVTVM_MODEL_OBJECT_MAX; i++) {
+   if (object[i].buffer != NULL)
+   rte_free(object[i].buffer);
+   }
+
+   return -EINVAL;
+}
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.h b/drivers/ml/cnxk/mvtvm_ml_model.h
index b6162fceec..b11b66f495 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.h
+++ b/drivers/ml/cnxk/mvtvm_ml_model.h
@@ -44,5 +44,7 @@ struct mvtvm_ml_model_data {
 };
 
 enum cnxk_ml_model_type mvtvm_ml_model_type_get(struct rte_ml_model_params 
*params);
+int mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params,
+ struct mvtvm_ml_model_object *object);
 
 #endif /* _MVTVM_ML_MODEL_H_ */
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index 88c6d5a864..e2413b6b15 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -8,8 +8,12 @@
 #include 
 
 #include "cnxk_ml_dev.h"
+#include "cnxk_ml_model.h"
 #include "cnxk_ml_ops.h"
 
+/* ML model macros */
+#define MVTVM_ML_MODEL_MEMZONE_NAME "ml_mvtvm_model_mz"
+
 int
 mvtvm_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct 
rte_ml_dev_config *conf)
 {
@@ -39,3 +43,61 @@ mvtvm_ml_dev_close(struct cnxk_ml_dev *cnxk_mldev)
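
A minimal caller sketch for mvtvm_ml_model_blob_parse() (not part of the
patch). The object array should be zero-initialized, since the error path
frees any non-NULL object[i].buffer:

	struct mvtvm_ml_model_object object[ML_MVTVM_MODEL_OBJECT_MAX];
	int ret;

	memset(object, 0, sizeof(object));
	ret = mvtvm_ml_model_blob_parse(params, object);
	if (ret != 0)
		return ret; /* an expected object was missing or unreadable */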
 

[PATCH v8 12/34] ml/cnxk: update data quantization functions

2023-10-22 Thread Srikanth Yalavarthi
Added cnxk wrapper functions to quantize input data and
dequantize output data.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 164 -
 drivers/ml/cnxk/cn10k_ml_ops.h |   7 --
 drivers/ml/cnxk/cnxk_ml_io.c   |  95 +++
 drivers/ml/cnxk/cnxk_ml_io.h   |   3 +
 drivers/ml/cnxk/cnxk_ml_ops.c  |  78 +++-
 drivers/ml/cnxk/meson.build|   1 +
 6 files changed, 175 insertions(+), 173 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_io.c

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 3ff82829f0..c68e6c620c 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1856,170 +1856,6 @@ cn10k_ml_model_params_update(struct cnxk_ml_dev 
*cnxk_mldev, struct cnxk_ml_mode
return 0;
 }
 
-int
-cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct 
rte_ml_buff_seg **dbuffer,
-struct rte_ml_buff_seg **qbuffer)
-{
-   struct cnxk_ml_model *model;
-   uint8_t model_input_type;
-   uint8_t *lcl_dbuffer;
-   uint8_t *lcl_qbuffer;
-   uint8_t input_type;
-   float qscale;
-   uint32_t i;
-   uint32_t j;
-   int ret;
-
-   model = dev->data->models[model_id];
-
-   if (model == NULL) {
-   plt_err("Invalid model_id = %u", model_id);
-   return -EINVAL;
-   }
-
-   lcl_dbuffer = dbuffer[0]->addr;
-   lcl_qbuffer = qbuffer[0]->addr;
-
-   for (i = 0; i < model->layer[0].glow.metadata.model.num_input; i++) {
-   if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) {
-   input_type = 
model->layer[0].glow.metadata.input1[i].input_type;
-   model_input_type = 
model->layer[0].glow.metadata.input1[i].model_input_type;
-   qscale = model->layer[0].glow.metadata.input1[i].qscale;
-   } else {
-   j = i - MRVL_ML_NUM_INPUT_OUTPUT_1;
-   input_type = 
model->layer[0].glow.metadata.input2[j].input_type;
-   model_input_type = 
model->layer[0].glow.metadata.input2[j].model_input_type;
-   qscale = model->layer[0].glow.metadata.input2[j].qscale;
-   }
-
-   if (input_type == model_input_type) {
-   rte_memcpy(lcl_qbuffer, lcl_dbuffer, 
model->layer[0].info.input[i].sz_d);
-   } else {
-   switch 
(model->layer[0].glow.metadata.input1[i].model_input_type) {
-   case RTE_ML_IO_TYPE_INT8:
-   ret = rte_ml_io_float32_to_int8(
-   qscale, 
model->layer[0].info.input[i].nb_elements,
-   lcl_dbuffer, lcl_qbuffer);
-   break;
-   case RTE_ML_IO_TYPE_UINT8:
-   ret = rte_ml_io_float32_to_uint8(
-   qscale, 
model->layer[0].info.input[i].nb_elements,
-   lcl_dbuffer, lcl_qbuffer);
-   break;
-   case RTE_ML_IO_TYPE_INT16:
-   ret = rte_ml_io_float32_to_int16(
-   qscale, 
model->layer[0].info.input[i].nb_elements,
-   lcl_dbuffer, lcl_qbuffer);
-   break;
-   case RTE_ML_IO_TYPE_UINT16:
-   ret = rte_ml_io_float32_to_uint16(
-   qscale, 
model->layer[0].info.input[i].nb_elements,
-   lcl_dbuffer, lcl_qbuffer);
-   break;
-   case RTE_ML_IO_TYPE_FP16:
-   ret = rte_ml_io_float32_to_float16(
-   
model->layer[0].info.input[i].nb_elements, lcl_dbuffer,
-   lcl_qbuffer);
-   break;
-   default:
-   plt_err("Unsupported model_input_type[%u] : 
%u", i,
-   
model->layer[0].glow.metadata.input1[i].model_input_type);
-   ret = -ENOTSUP;
-   }
-   if (ret < 0)
-   return ret;
-   }
-
-   lcl_dbuffer += model->layer[0].info.input[i].sz_d;
-   lcl_qbuffer += model->layer[0].info.input[i].sz_q;
-   }
-
-   return 0;
-}
-
-int
-cn10k_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct 
rte_ml_buff_seg **qbuffer,
-  struct rte_ml_buff_seg **dbuffer)
-{
-   struct cnxk_ml_model *model;
-   uint8_t model_output_type;
-   uint8_t *lcl_qbuffer;
-   uint8_t *lcl_dbuffer;
-   
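
The per-tensor conversion now lives in a common helper in cnxk_ml_io.c,
which this series later reuses from the TVM runtime callbacks. A condensed
sketch of the wrapper loop built on that helper (the function name
quantize_all_inputs is hypothetical):

	static int
	quantize_all_inputs(struct cnxk_ml_io_info *info,
			    uint8_t *lcl_dbuffer, uint8_t *lcl_qbuffer)
	{
		uint32_t i;
		int ret;

		for (i = 0; i < info->nb_inputs; i++) {
			ret = cnxk_ml_io_quantize_single(&info->input[i],
							 lcl_dbuffer,
							 lcl_qbuffer);
			if (ret < 0)
				return ret;

			/* Advance by dequantized and quantized sizes. */
			lcl_dbuffer += info->input[i].sz_d;
			lcl_qbuffer += info->input[i].sz_q;
		}

		return 0;
	}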

[PATCH v8 22/34] ml/cnxk: fetch layer info and load TVM model

2023-10-22 Thread Srikanth Yalavarthi
Added support to fetch TVM model layer information and
update internal structures based on the layer information.
Set callback functions for layer load and unload, and
enabled model loading using the TVMDP library. Added support
to fetch the full metadata after model load.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_model.c | 11 +
 drivers/ml/cnxk/cn10k_ml_model.h |  2 +
 drivers/ml/cnxk/cn10k_ml_ops.c   |  7 ++-
 drivers/ml/cnxk/mvtvm_ml_model.c | 25 ++
 drivers/ml/cnxk/mvtvm_ml_model.h |  4 ++
 drivers/ml/cnxk/mvtvm_ml_ops.c   | 81 
 drivers/ml/cnxk/mvtvm_ml_stubs.c | 10 
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  3 ++
 8 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_model.c b/drivers/ml/cnxk/cn10k_ml_model.c
index af9d5a666f..0325cd54f1 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.c
+++ b/drivers/ml/cnxk/cn10k_ml_model.c
@@ -716,3 +716,14 @@ cn10k_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, 
struct cnxk_ml_layer *layer
cnxk_ml_print_line(fp, LINE_LEN);
fprintf(fp, "\n");
 }
+
+int
+cn10k_ml_model_get_layer_id(struct cnxk_ml_model *model, const char 
*layer_name, uint16_t *layer_id)
+{
+   if (model->type == ML_CNXK_MODEL_TYPE_TVM)
+   return mvtvm_ml_model_get_layer_id(model, layer_name, layer_id);
+
+   *layer_id = 0;
+
+   return 0;
+}
diff --git a/drivers/ml/cnxk/cn10k_ml_model.h b/drivers/ml/cnxk/cn10k_ml_model.h
index 45f2ed5fcf..6744175cd5 100644
--- a/drivers/ml/cnxk/cn10k_ml_model.h
+++ b/drivers/ml/cnxk/cn10k_ml_model.h
@@ -461,5 +461,7 @@ void cn10k_ml_model_info_set(struct cnxk_ml_dev 
*cnxk_mldev, struct cnxk_ml_mode
 struct cnxk_ml_io_info *io_info,
 struct cn10k_ml_model_metadata *metadata);
 void cn10k_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer 
*layer, FILE *fp);
+int cn10k_ml_model_get_layer_id(struct cnxk_ml_model *model, const char 
*layer_name,
+   uint16_t *layer_id);
 
 #endif /* _CN10K_ML_MODEL_H_ */
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index a471e98fbf..4191ccc840 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -576,7 +576,7 @@ cn10k_ml_layer_load(void *device, uint16_t model_id, const 
char *layer_name, uin
size_t layer_xstats_size;
uint8_t *base_dma_addr;
uint16_t scratch_pages;
-   uint16_t layer_id = 0;
+   uint16_t layer_id;
uint16_t wb_pages;
uint64_t mz_size;
uint16_t idx;
@@ -584,7 +584,6 @@ cn10k_ml_layer_load(void *device, uint16_t model_id, const 
char *layer_name, uin
int ret;
 
PLT_SET_USED(size);
-   PLT_SET_USED(layer_name);
 
cnxk_mldev = (struct cnxk_ml_dev *)device;
if (cnxk_mldev == NULL) {
@@ -598,6 +597,10 @@ cn10k_ml_layer_load(void *device, uint16_t model_id, const 
char *layer_name, uin
return -EINVAL;
}
 
+   ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id);
+   if (ret != 0)
+   return ret;
+
layer = &model->layer[layer_id];
 
ret = cn10k_ml_model_metadata_check(buffer, size);
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index 4c9a080c05..8536fd8927 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -110,3 +110,28 @@ mvtvm_ml_model_blob_parse(struct rte_ml_model_params 
*params, struct mvtvm_ml_mo
 
return -EINVAL;
 }
+
+int
+mvtvm_ml_model_get_layer_id(struct cnxk_ml_model *model, const char 
*layer_name, uint16_t *layer_id)
+{
+   uint16_t i;
+
+   for (i = 0; i < model->mvtvm.metadata.model.nb_layers; i++) {
+   if (strcmp(model->layer[i].name, layer_name) == 0)
+   break;
+   }
+
+   if (i == model->mvtvm.metadata.model.nb_layers) {
+   plt_err("Invalid layer name: %s", layer_name);
+   return -EINVAL;
+   }
+
+   if (model->layer[i].type != ML_CNXK_LAYER_TYPE_MRVL) {
+   plt_err("Invalid layer type, name: %s type: %d", layer_name, 
model->layer[i].type);
+   return -EINVAL;
+   }
+
+   *layer_id = i;
+
+   return 0;
+}
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.h b/drivers/ml/cnxk/mvtvm_ml_model.h
index b11b66f495..6cb2639876 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.h
+++ b/drivers/ml/cnxk/mvtvm_ml_model.h
@@ -11,6 +11,8 @@
 
 #include "cnxk_ml_io.h"
 
+struct cnxk_ml_model;
+
 /* Maximum number of objects per model */
 #define ML_MVTVM_MODEL_OBJECT_MAX 3
 
@@ -46,5 +48,7 @@ struct mvtvm_ml_model_data {
 enum cnxk_ml_model_type mvtvm_ml_model_type_get(struct rte_ml_model_params 
*params);
 int mvtvm_ml_model_blob_parse(struct rte_ml_model_params *params,
  struct mvtvm_ml_model_object *object);
+int mvtvm_ml_model_get_layer_id(struct cnxk_ml_mo
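
A short usage sketch for the lookup above (the layer name "tvm_layer_0" is
hypothetical; for Glow models the id always resolves to 0):

	uint16_t layer_id;
	int ret;

	ret = cn10k_ml_model_get_layer_id(model, "tvm_layer_0", &layer_id);
	if (ret != 0)
		return ret;
	layer = &model->layer[layer_id];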

[PATCH v8 16/34] ml/cnxk: update fast path functions

2023-10-22 Thread Srikanth Yalavarthi
Implemented cnxk layer fast-path functions and added support
for model-specific fast-path functions; the cnxk layer
functions invoke the model-specific ones.

Added support for model-specific poll handling functions and
updated the internal inference sync function. Dropped the use
of rte_ml_op as an argument and updated the function arguments
so the functions can be used as callbacks by the TVM HW runtime.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_dev.h  |   5 -
 drivers/ml/cnxk/cn10k_ml_ops.c  | 241 
 drivers/ml/cnxk/cn10k_ml_ops.h  |  13 +-
 drivers/ml/cnxk/cnxk_ml_model.h |  14 ++
 drivers/ml/cnxk/cnxk_ml_ops.c   | 128 +
 drivers/ml/cnxk/cnxk_ml_ops.h   |   7 +
 6 files changed, 216 insertions(+), 192 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h
index bde9d08901..94a94d996f 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.h
+++ b/drivers/ml/cnxk/cn10k_ml_dev.h
@@ -143,11 +143,6 @@ struct cn10k_ml_dev {
 
/* JCMD enqueue function handler */
bool (*ml_jcmdq_enqueue)(struct roc_ml *roc_ml, struct ml_job_cmd_s 
*job_cmd);
-
-   /* Poll handling function pointers */
-   void (*set_poll_addr)(struct cnxk_ml_req *req);
-   void (*set_poll_ptr)(struct cnxk_ml_req *req);
-   uint64_t (*get_poll_ptr)(struct cnxk_ml_req *req);
 };
 
 uint64_t cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw);
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 776ad60401..8116c8dedb 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -65,24 +65,12 @@ static const struct cn10k_ml_stype_db_driver {
{ML_DRIVER_ERR_FW_ERROR, "UNKNOWN FIRMWARE ERROR"},
 };
 
-static inline void
+__rte_hot void
 cn10k_ml_set_poll_addr(struct cnxk_ml_req *req)
 {
req->status = &req->cn10k_req.status;
 }
 
-static inline void
-cn10k_ml_set_poll_ptr(struct cnxk_ml_req *req)
-{
-   plt_write64(ML_CNXK_POLL_JOB_START, req->status);
-}
-
-static inline uint64_t
-cn10k_ml_get_poll_ptr(struct cnxk_ml_req *req)
-{
-   return plt_read64(req->status);
-}
-
 void
 cn10k_ml_qp_initialize(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_qp *qp)
 {
@@ -177,7 +165,7 @@ cn10k_ml_prep_sp_job_descriptor(struct cnxk_ml_dev 
*cnxk_mldev, struct cnxk_ml_l
 
 static __rte_always_inline void
 cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_req *req,
-   struct rte_ml_op *op)
+   uint16_t index, void *input, void *output, 
uint16_t nb_batches)
 {
struct cn10k_ml_dev *cn10k_mldev;
 
@@ -185,17 +173,17 @@ cn10k_ml_prep_fp_job_descriptor(struct cnxk_ml_dev 
*cnxk_mldev, struct cnxk_ml_r
 
req->cn10k_req.jd.hdr.jce.w0.u64 = 0;
req->cn10k_req.jd.hdr.jce.w1.u64 = PLT_U64_CAST(req->status);
-   req->cn10k_req.jd.hdr.model_id = op->model_id;
+   req->cn10k_req.jd.hdr.model_id = index;
req->cn10k_req.jd.hdr.job_type = ML_CN10K_JOB_TYPE_MODEL_RUN;
req->cn10k_req.jd.hdr.fp_flags = ML_FLAGS_POLL_COMPL;
req->cn10k_req.jd.hdr.sp_flags = 0x0;
req->cn10k_req.jd.hdr.result =
roc_ml_addr_ap2mlip(&cn10k_mldev->roc, &req->cn10k_req.result);
req->cn10k_req.jd.model_run.input_ddr_addr =
-   PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, 
op->input[0]->addr));
+   PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, input));
req->cn10k_req.jd.model_run.output_ddr_addr =
-   PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, 
op->output[0]->addr));
-   req->cn10k_req.jd.model_run.num_batches = op->nb_batches;
+   PLT_U64_CAST(roc_ml_addr_ap2mlip(&cn10k_mldev->roc, output));
+   req->cn10k_req.jd.model_run.num_batches = nb_batches;
 }
 
 static void
@@ -311,30 +299,15 @@ cn10k_ml_model_xstat_get(struct cnxk_ml_dev *cnxk_mldev, 
struct cnxk_ml_layer *l
 static int
 cn10k_ml_cache_model_data(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer 
*layer)
 {
-   struct rte_ml_buff_seg seg[2];
-   struct rte_ml_buff_seg *inp;
-   struct rte_ml_buff_seg *out;
-   struct rte_ml_op op;
-
char str[RTE_MEMZONE_NAMESIZE];
const struct plt_memzone *mz;
uint64_t isize = 0;
uint64_t osize = 0;
int ret = 0;
-   uint32_t i;
-
-   inp = &seg[0];
-   out = &seg[1];
 
/* Create input and output buffers. */
-   for (i = 0; i < layer->info.nb_inputs; i++)
-   isize += layer->info.input[i].sz_q;
-
-   for (i = 0; i < layer->info.nb_outputs; i++)
-   osize += layer->info.output[i].sz_q;
-
-   isize = layer->batch_size * isize;
-   osize = layer->batch_size * osize;
+   isize = layer->info.total_input_sz_q;
+   osize = layer->info.total_output_sz_q;
 
snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u", "ml_dummy_io", 
layer->index);
mz = plt_m
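
Dropping rte_ml_op from descriptor preparation means a job is described by
an index, two raw buffers and a batch count, which is exactly what a TVM
runtime callback can supply without constructing an op. Sketch of the new
call shape (variable names illustrative):

	cn10k_ml_prep_fp_job_descriptor(cnxk_mldev, req, layer->index,
					input, output, nb_batches);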

[PATCH v8 25/34] ml/cnxk: enable OCM check for multilayer TVM model

2023-10-22 Thread Srikanth Yalavarthi
From: Anup Prabhu 

Enabled a check of the OCM size requirement for multi-layer
TVM models. OCM scratch and WB page requirements are computed
for all layers during the load stage.

Signed-off-by: Anup Prabhu 
---
 drivers/ml/cnxk/cnxk_ml_ops.c | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index ce668e1eb6..d1471971e4 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1023,8 +1023,12 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct 
rte_ml_model_params *params, u
 
char str[RTE_MEMZONE_NAMESIZE];
const struct plt_memzone *mz;
+   uint16_t max_scratch_pages;
+   struct cn10k_ml_ocm *ocm;
uint64_t model_info_size;
+   uint16_t total_wb_pages;
uint16_t lcl_model_id;
+   uint16_t layer_id;
uint64_t mz_size;
bool found;
int ret;
@@ -1086,6 +1090,62 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct 
rte_ml_model_params *params, u
if (ret != 0)
goto error;
 
+   max_scratch_pages = 0;
+   total_wb_pages = 0;
+   layer_id = 0;
+
+   ocm = &cnxk_mldev->cn10k_mldev.ocm;
+
+   if (model->type == ML_CNXK_MODEL_TYPE_GLOW) {
+   total_wb_pages = total_wb_pages + 
model->layer[layer_id].glow.ocm_map.wb_pages;
+   max_scratch_pages = PLT_MAX(max_scratch_pages,
+   
model->layer[layer_id].glow.ocm_map.scratch_pages);
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+   } else {
+   for (layer_id = 0; layer_id < 
model->mvtvm.metadata.model.nb_layers; layer_id++) {
+   if (model->layer[layer_id].type == 
ML_CNXK_LAYER_TYPE_MRVL) {
+   total_wb_pages = total_wb_pages +
+
model->layer[layer_id].glow.ocm_map.wb_pages;
+   max_scratch_pages =
+   PLT_MAX(max_scratch_pages,
+   
model->layer[layer_id].glow.ocm_map.scratch_pages);
+   }
+   }
+#endif
+   }
+
+   if ((total_wb_pages + max_scratch_pages) > ocm->num_pages) {
+   plt_err("model_id = %u: total_wb_pages (%u) + scratch_pages 
(%u) >  %u\n",
+   lcl_model_id, total_wb_pages, max_scratch_pages, 
ocm->num_pages);
+
+   if (model->type == ML_CNXK_MODEL_TYPE_GLOW) {
+   plt_ml_dbg("layer_id = %u: wb_pages = %u, scratch_pages 
= %u\n", layer_id,
+  model->layer[layer_id].glow.ocm_map.wb_pages,
+  
model->layer[layer_id].glow.ocm_map.scratch_pages);
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+   } else {
+   for (layer_id = 0; layer_id < 
model->mvtvm.metadata.model.nb_layers;
+layer_id++) {
+   if (model->layer[layer_id].type == 
ML_CNXK_LAYER_TYPE_MRVL) {
+   plt_ml_dbg(
+   "layer_id = %u: wb_pages = %u, 
scratch_pages = %u\n",
+   layer_id,
+   
model->layer[layer_id].glow.ocm_map.wb_pages,
+   
model->layer[layer_id].glow.ocm_map.scratch_pages);
+   }
+   }
+#endif
+   }
+
+   if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
+   cn10k_ml_model_unload(cnxk_mldev, model);
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+   else
+   mvtvm_ml_model_unload(cnxk_mldev, model);
+#endif
+
+   return -ENOMEM;
+   }
plt_spinlock_init(&model->lock);
model->state = ML_CNXK_MODEL_STATE_LOADED;
cnxk_mldev->nb_models_loaded++;
-- 
2.42.0
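
As a worked example of the check above (hypothetical page counts): a TVM
model with two MRVL layers needing wb_pages of 8 and 12 and scratch_pages
of 4 and 6 requires (8 + 12) + max(4, 6) = 26 OCM pages. WB pages are
summed because each layer's weights and bias stay resident, while the
scratch area is reused across layers, so only the largest scratch
requirement counts.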



[PATCH v8 26/34] ml/cnxk: support start and stop for TVM models

2023-10-22 Thread Srikanth Yalavarthi
Added support to start and stop TVM models. TVM model
start invokes layer start for all Glow layers that are
part of the model; TVM model stop invokes layer stop
for the same layers.

Signed-off-by: Srikanth Yalavarthi 
Signed-off-by: Anup Prabhu 
---
 drivers/ml/cnxk/cn10k_ml_ops.c   | 16 ++
 drivers/ml/cnxk/cnxk_ml_ops.c| 14 +++--
 drivers/ml/cnxk/mvtvm_ml_ops.c   | 52 
 drivers/ml/cnxk/mvtvm_ml_ops.h   |  2 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.c | 18 +++
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  2 ++
 6 files changed, 96 insertions(+), 8 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index e7208391fd..2d308802cf 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -827,7 +827,7 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const 
char *layer_name)
struct cn10k_ml_ocm *ocm;
struct cnxk_ml_req *req;
 
-   uint16_t layer_id = 0;
+   uint16_t layer_id;
bool job_enqueued;
bool job_dequeued;
uint8_t num_tiles;
@@ -838,8 +838,6 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, const 
char *layer_name)
bool locked;
int ret = 0;
 
-   PLT_SET_USED(layer_name);
-
cnxk_mldev = (struct cnxk_ml_dev *)device;
if (cnxk_mldev == NULL) {
plt_err("Invalid device = %p", device);
@@ -852,6 +850,10 @@ cn10k_ml_layer_start(void *device, uint16_t model_id, 
const char *layer_name)
return -EINVAL;
}
 
+   ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id);
+   if (ret != 0)
+   return ret;
+
layer = &model->layer[layer_id];
cn10k_mldev = &cnxk_mldev->cn10k_mldev;
ocm = &cn10k_mldev->ocm;
@@ -1015,14 +1017,12 @@ cn10k_ml_layer_stop(void *device, uint16_t model_id, 
const char *layer_name)
struct cn10k_ml_ocm *ocm;
struct cnxk_ml_req *req;
 
-   uint16_t layer_id = 0;
+   uint16_t layer_id;
bool job_enqueued;
bool job_dequeued;
bool locked;
int ret = 0;
 
-   PLT_SET_USED(layer_name);
-
cnxk_mldev = (struct cnxk_ml_dev *)device;
if (cnxk_mldev == NULL) {
plt_err("Invalid device = %p", device);
@@ -1035,6 +1035,10 @@ cn10k_ml_layer_stop(void *device, uint16_t model_id, 
const char *layer_name)
return -EINVAL;
}
 
+   ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id);
+   if (ret != 0)
+   return ret;
+
layer = &model->layer[layer_id];
cn10k_mldev = &cnxk_mldev->cn10k_mldev;
ocm = &cn10k_mldev->ocm;
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index d1471971e4..c38c60bf76 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1216,7 +1216,12 @@ cnxk_ml_model_start(struct rte_ml_dev *dev, uint16_t 
model_id)
return -EINVAL;
}
 
-   return cn10k_ml_model_start(cnxk_mldev, model);
+   if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
+   return cn10k_ml_model_start(cnxk_mldev, model);
+
+   return mvtvm_ml_model_start(cnxk_mldev, model);
 }
 
 int
@@ -1236,7 +1241,12 @@ cnxk_ml_model_stop(struct rte_ml_dev *dev, uint16_t 
model_id)
return -EINVAL;
}
 
-   return cn10k_ml_model_stop(cnxk_mldev, model);
+   if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
+   return cn10k_ml_model_stop(cnxk_mldev, model);
+
+   return mvtvm_ml_model_stop(cnxk_mldev, model);
 }
 
 static int
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index 3847f9b6b9..323c7c6fb6 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -213,3 +213,55 @@ mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, 
struct cnxk_ml_model *mode
 
return plt_memzone_free(mz);
 }
+
+int
+mvtvm_ml_model_start(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model)
+{
+   struct cnxk_ml_layer *layer;
+
+   uint16_t layer_id = 0;
+   int ret = 0;
+
+next_layer:
+   layer = &model->layer[layer_id];
+   if (layer->type == ML_CNXK_LAYER_TYPE_MRVL) {
+   ret = cn10k_ml_layer_start(cnxk_mldev, model->model_id, 
layer->name);
+   if (ret != 0) {
+   plt_err("Layer start failed, model_id = %u, layer_name 
= %s, error = %d",
+   model->model_id, layer->name, ret);
+   return ret;
+   }
+   }
+   layer_id++;
+
+   if (layer_id < model->nb_layers)
+   goto next_layer;
+
+   return 0;
+}
+
+int
+mvtvm_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model)
+{
+   struct cnxk_ml_layer *layer;
+
+   uint16_t layer_id = 0;
+   
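
The goto-based walk in mvtvm_ml_model_start() is equivalent to a plain loop
over the layers; only MRVL (Glow) layers have device state to start, the
remaining layers are handled by the TVM runtime. A sketch of the loop
shape:

	for (layer_id = 0; layer_id < model->nb_layers; layer_id++) {
		layer = &model->layer[layer_id];
		if (layer->type != ML_CNXK_LAYER_TYPE_MRVL)
			continue;

		ret = cn10k_ml_layer_start(cnxk_mldev, model->model_id,
					   layer->name);
		if (ret != 0)
			return ret;
	}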

[PATCH v8 28/34] ml/cnxk: support device dump for TVM models

2023-10-22 Thread Srikanth Yalavarthi
Enabled support to print TVM model layer info.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cnxk_ml_model.c  |  7 +++-
 drivers/ml/cnxk/mvtvm_ml_model.c | 59 
 drivers/ml/cnxk/mvtvm_ml_model.h |  2 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.c |  8 +
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  2 ++
 5 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/drivers/ml/cnxk/cnxk_ml_model.c b/drivers/ml/cnxk/cnxk_ml_model.c
index 02f80410ec..ed6a1ed866 100644
--- a/drivers/ml/cnxk/cnxk_ml_model.c
+++ b/drivers/ml/cnxk/cnxk_ml_model.c
@@ -68,6 +68,8 @@ cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_model *model,
cnxk_ml_print_line(fp, LINE_LEN);
fprintf(fp, "%*s : %u\n", FIELD_LEN, "model_id", model->model_id);
fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", model->name);
+   fprintf(fp, "%*s : %d\n", FIELD_LEN, "type", model->type);
+   fprintf(fp, "%*s : %d\n", FIELD_LEN, "subtype", model->subtype);
fprintf(fp, "%*s : 0x%016lx\n", FIELD_LEN, "model", 
PLT_U64_CAST(model));
fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", model->batch_size);
fprintf(fp, "%*s : %u\n", FIELD_LEN, "nb_layers", model->nb_layers);
@@ -84,6 +86,9 @@ cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_model *model,
 
for (layer_id = 0; layer_id < model->nb_layers; layer_id++) {
layer = &model->layer[layer_id];
-   cn10k_ml_layer_print(cnxk_mldev, layer, fp);
+   if (layer->type == ML_CNXK_LAYER_TYPE_MRVL)
+   cn10k_ml_layer_print(cnxk_mldev, layer, fp);
+   else
+   mvtvm_ml_layer_print(cnxk_mldev, layer, fp);
}
 }
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index 650dd970bd..ffbcec8b80 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -13,6 +13,7 @@
 
 #include "cnxk_ml_dev.h"
 #include "cnxk_ml_model.h"
+#include "cnxk_ml_utils.h"
 
 /* Objects list */
 char mvtvm_object_list[ML_MVTVM_MODEL_OBJECT_MAX][RTE_ML_STR_MAX] = {"mod.so", 
"mod.json",
@@ -311,3 +312,61 @@ mvtvm_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, 
struct cnxk_ml_model *mo
cn10k_ml_model_info_set(cnxk_mldev, model, &model->mvtvm.info,
&model->layer[0].glow.metadata);
 }
+
+void
+mvtvm_ml_layer_print(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_layer 
*layer, FILE *fp)
+{
+   char str[STR_LEN];
+   uint8_t i;
+
+   /* Print debug info */
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, " Layer Information (Layer ID: %u, Name: %s)\n",
+   cnxk_mldev->index_map[layer->index].layer_id, layer->name);
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "layer_id",
+   cnxk_mldev->index_map[layer->index].layer_id);
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "name", layer->name);
+   fprintf(fp, "%*s : %d\n", FIELD_LEN, "type", layer->type);
+   fprintf(fp, "%*s : 0x%016lx\n", FIELD_LEN, "layer", 
PLT_U64_CAST(layer));
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "batch_size", layer->batch_size);
+
+   /* Print model state */
+   if (layer->state == ML_CNXK_LAYER_STATE_LOADED)
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "loaded");
+   if (layer->state == ML_CNXK_LAYER_STATE_JOB_ACTIVE)
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "job_active");
+   if (layer->state == ML_CNXK_LAYER_STATE_STARTED)
+   fprintf(fp, "%*s : %s\n", FIELD_LEN, "state", "started");
+
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_inputs", 
layer->info.nb_inputs);
+   fprintf(fp, "%*s : %u\n", FIELD_LEN, "num_outputs", 
layer->info.nb_outputs);
+   fprintf(fp, "\n");
+
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, "%8s  %16s  %12s\n", "input", "input_name", "input_type");
+   cnxk_ml_print_line(fp, LINE_LEN);
+   for (i = 0; i < layer->info.nb_inputs; i++) {
+   fprintf(fp, "%8u  ", i);
+   fprintf(fp, "%*s  ", 16, layer->info.input[i].name);
+   rte_ml_io_type_to_str(layer->info.input[i].qtype, str, STR_LEN);
+   fprintf(fp, "%*s  ", 12, str);
+   }
+   fprintf(fp, "\n");
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, "\n");
+
+   cnxk_ml_print_line(fp, LINE_LEN);
+   fprintf(fp, "%8s  %16s  %12s\n", "output", "output_name", 
"output_type");
+   cnxk_ml_print_line(fp, LINE_LEN);
+   for (i = 0; i < layer->info.nb_outputs; i++) {
+   fprintf(fp, "%8u  ", i);
+   fprintf(fp, "%*s  ", 16, layer->info.output[i].name);
+   rte_ml_io_type_to_str(layer->info.output[i].qtype, str, 
STR_LEN);
+   fprintf(fp, "%*s  ", 12, str);
+   fprintf(fp, "\n");
+   }
+   fprintf(fp, "\n");
+   cnxk_ml_print_line(fp, LINE_
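
The dump renders one row per tensor, for example (tensor names and types
below are hypothetical):

	   input        input_name   input_type
	----------------------------------------
	       0             data0         int8
	       1             data1       uint16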

[PATCH v8 20/34] ml/cnxk: add support to identify model type

2023-10-22 Thread Srikanth Yalavarthi
Enabled support to parse the model buffer to identify the
model type and model sub-type. Added basic checks for
Glow model type buffers.

Signed-off-by: Srikanth Yalavarthi 
Signed-off-by: Anup Prabhu 
---
 drivers/ml/cnxk/cnxk_ml_model.c  | 49 
 drivers/ml/cnxk/cnxk_ml_model.h  |  3 ++
 drivers/ml/cnxk/cnxk_ml_ops.c|  8 +
 drivers/ml/cnxk/meson.build  |  6 
 drivers/ml/cnxk/mvtvm_ml_model.c | 55 
 drivers/ml/cnxk/mvtvm_ml_model.h |  2 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.c |  9 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  1 +
 8 files changed, 133 insertions(+)
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_model.c

diff --git a/drivers/ml/cnxk/cnxk_ml_model.c b/drivers/ml/cnxk/cnxk_ml_model.c
index b069d4e3a5..02f80410ec 100644
--- a/drivers/ml/cnxk/cnxk_ml_model.c
+++ b/drivers/ml/cnxk/cnxk_ml_model.c
@@ -2,11 +2,60 @@
  * Copyright (c) 2023 Marvell.
  */
 
+#include 
 #include 
 
 #include "cnxk_ml_model.h"
 #include "cnxk_ml_utils.h"
 
+enum cnxk_ml_model_type
+cnxk_ml_model_get_type(struct rte_ml_model_params *params)
+{
+   struct cn10k_ml_model_metadata_header *metadata_header;
+   enum cnxk_ml_model_type type;
+   uint32_t payload_crc32c;
+   uint32_t header_crc32c;
+
+   type = mvtvm_ml_model_type_get(params);
+   if (type == ML_CNXK_MODEL_TYPE_TVM)
+   return ML_CNXK_MODEL_TYPE_TVM;
+   else if (type == ML_CNXK_MODEL_TYPE_INVALID)
+   return ML_CNXK_MODEL_TYPE_INVALID;
+
+   /* Check model magic string */
+   metadata_header = (struct cn10k_ml_model_metadata_header *)params->addr;
+   if (strncmp((char *)metadata_header->magic, MRVL_ML_MODEL_MAGIC_STRING, 
4) != 0) {
+   plt_err("Invalid Glow model, magic = %s", 
metadata_header->magic);
+   return ML_CNXK_MODEL_TYPE_INVALID;
+   }
+
+   /* Header CRC check */
+   if (metadata_header->header_crc32c != 0) {
+   header_crc32c = rte_hash_crc(
+   params->addr,
+   sizeof(struct cn10k_ml_model_metadata_header) - 
sizeof(uint32_t), 0);
+
+   if (header_crc32c != metadata_header->header_crc32c) {
+   plt_err("Invalid Glow model, Header CRC mismatch");
+   return ML_CNXK_MODEL_TYPE_INVALID;
+   }
+   }
+
+   /* Payload CRC check */
+   if (metadata_header->payload_crc32c != 0) {
+   payload_crc32c = rte_hash_crc(
+   PLT_PTR_ADD(params->addr, sizeof(struct 
cn10k_ml_model_metadata_header)),
+   params->size - sizeof(struct 
cn10k_ml_model_metadata_header), 0);
+
+   if (payload_crc32c != metadata_header->payload_crc32c) {
+   plt_err("Invalid Glow model, Payload CRC mismatch");
+   return ML_CNXK_MODEL_TYPE_INVALID;
+   }
+   }
+
+   return ML_CNXK_MODEL_TYPE_GLOW;
+}
+
 void
 cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model, FILE *fp)
 {
diff --git a/drivers/ml/cnxk/cnxk_ml_model.h b/drivers/ml/cnxk/cnxk_ml_model.h
index f100eca203..a2fced46a2 100644
--- a/drivers/ml/cnxk/cnxk_ml_model.h
+++ b/drivers/ml/cnxk/cnxk_ml_model.h
@@ -13,6 +13,8 @@
 
 #ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
 #include "mvtvm_ml_model.h"
+#else
+#include "mvtvm_ml_stubs.h"
 #endif
 
 #include "cnxk_ml_io.h"
@@ -184,6 +186,7 @@ struct cnxk_ml_model {
set_poll_addr_t set_poll_addr;
 };
 
+enum cnxk_ml_model_type cnxk_ml_model_get_type(struct rte_ml_model_params 
*params);
 void cnxk_ml_model_dump(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model, FILE *fp);
 
 #endif /* _CNXK_ML_MODEL_H_ */
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 96f87128f9..ebc78e36e9 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1018,6 +1018,7 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct 
rte_ml_model_params *params, u
 {
struct rte_ml_dev_info dev_info;
struct cnxk_ml_dev *cnxk_mldev;
+   enum cnxk_ml_model_type type;
struct cnxk_ml_model *model;
 
char str[RTE_MEMZONE_NAMESIZE];
@@ -1033,6 +1034,12 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct 
rte_ml_model_params *params, u
 
cnxk_mldev = dev->data->dev_private;
 
+   type = cnxk_ml_model_get_type(params);
+   if (type == ML_CNXK_MODEL_TYPE_INVALID) {
+   plt_err("Invalid / unsupported model type");
+   return -EINVAL;
+   }
+
/* Find model ID */
found = false;
for (lcl_model_id = 0; lcl_model_id < dev->data->nb_models; 
lcl_model_id++) {
@@ -1066,6 +1073,7 @@ cnxk_ml_model_load(struct rte_ml_dev *dev, struct 
rte_ml_model_params *params, u
 
model = mz->addr;
model->cnxk_mldev = cnxk_mldev;
+   model->type = type;
model->model_id = lcl_model_id;
model->info = PLT_PTR_ADD(
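
Note the layered probe above: mvtvm_ml_model_type_get() is consulted first,
and only non-TVM buffers fall through to the Glow checks. Both CRC32C
checks are opt-in, a zero CRC field skips the corresponding check, and the
coverage follows directly from the calls:

	header_crc32c  = crc32c(addr, sizeof(header) - sizeof(uint32_t))
	payload_crc32c = crc32c(addr + sizeof(header), size - sizeof(header))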

[PATCH v8 29/34] ml/cnxk: enable reporting model runtime as xstats

2023-10-22 Thread Srikanth Yalavarthi
Added model xstats entries to compute runtime latency.
Allocated internal resources for TVM model xstats.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c   |   9 +++
 drivers/ml/cnxk/cn10k_ml_ops.h   |   2 +
 drivers/ml/cnxk/cnxk_ml_ops.c| 131 +++
 drivers/ml/cnxk/cnxk_ml_ops.h|   1 +
 drivers/ml/cnxk/cnxk_ml_xstats.h |   7 ++
 drivers/ml/cnxk/mvtvm_ml_model.h |  24 ++
 drivers/ml/cnxk/mvtvm_ml_ops.c   |  96 +-
 drivers/ml/cnxk/mvtvm_ml_ops.h   |   8 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.c |  23 ++
 drivers/ml/cnxk/mvtvm_ml_stubs.h |   6 ++
 10 files changed, 289 insertions(+), 18 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 2d308802cf..0c67ce7b40 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -197,6 +197,15 @@ cn10k_ml_xstats_layer_name_update(struct cnxk_ml_dev 
*cnxk_mldev, uint16_t model
}
 }
 
+void
+cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_model *model,
+ uint16_t stat_id, uint16_t entry, char *suffix)
+{
+   snprintf(cnxk_mldev->xstats.entries[stat_id].map.name,
+sizeof(cnxk_mldev->xstats.entries[stat_id].map.name), 
"%s-%s-%s",
+model->glow.metadata.model.name, model_xstats[entry].name, 
suffix);
+}
+
#define ML_AVG_FOREACH_QP(cnxk_mldev, layer, qp_id, str, value, count)	\
	do {								\
		value = 0;						\
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 3d18303ed3..045e2e6cd2 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -331,6 +331,8 @@ int cn10k_ml_layer_start(void *device, uint16_t model_id, 
const char *layer_name
 int cn10k_ml_layer_stop(void *device, uint16_t model_id, const char 
*layer_name);
 
 /* xstats ops */
+void cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_model *model,
+  uint16_t stat_id, uint16_t entry, char 
*suffix);
 uint64_t cn10k_ml_model_xstat_get(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_layer *layer,
  enum cnxk_ml_xstats_type type);
 
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index c38c60bf76..2632d70d8c 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -138,7 +138,8 @@ cnxk_ml_xstats_init(struct cnxk_ml_dev *cnxk_mldev)
 
/* Allocate memory for xstats entries. Don't allocate during 
reconfigure */
nb_stats = RTE_DIM(device_xstats) +
-  RTE_DIM(layer_xstats) * ML_CNXK_MAX_MODELS * 
ML_CNXK_MODEL_MAX_LAYERS;
+  RTE_DIM(layer_xstats) * ML_CNXK_MAX_MODELS * 
ML_CNXK_MODEL_MAX_LAYERS +
+  RTE_DIM(model_xstats) * ML_CNXK_MAX_MODELS;
if (cnxk_mldev->xstats.entries == NULL)
cnxk_mldev->xstats.entries = rte_zmalloc(
"cnxk_ml_xstats", sizeof(struct cnxk_ml_xstats_entry) * 
nb_stats,
@@ -169,6 +170,25 @@ cnxk_ml_xstats_init(struct cnxk_ml_dev *cnxk_mldev)
for (model = 0; model < ML_CNXK_MAX_MODELS; model++) {
cnxk_mldev->xstats.offset_for_model[model] = stat_id;
 
+   for (i = 0; i < RTE_DIM(model_xstats); i++) {
+   cnxk_mldev->xstats.entries[stat_id].map.id = stat_id;
+   cnxk_mldev->xstats.entries[stat_id].mode = 
RTE_ML_DEV_XSTATS_MODEL;
+   cnxk_mldev->xstats.entries[stat_id].group = 
CNXK_ML_XSTATS_GROUP_MODEL;
+   cnxk_mldev->xstats.entries[stat_id].type = 
model_xstats[i].type;
+   cnxk_mldev->xstats.entries[stat_id].fn_id = 
CNXK_ML_XSTATS_FN_MODEL;
+   cnxk_mldev->xstats.entries[stat_id].obj_idx = model;
+   cnxk_mldev->xstats.entries[stat_id].layer_id = -1;
+   cnxk_mldev->xstats.entries[stat_id].reset_allowed =
+   model_xstats[i].reset_allowed;
+
+   /* Name of xstat is updated during model load */
+   snprintf(cnxk_mldev->xstats.entries[stat_id].map.name,
+
sizeof(cnxk_mldev->xstats.entries[stat_id].map.name),
+"Model-%u-%s", model, model_xstats[i].name);
+
+   stat_id++;
+   }
+
for (layer = 0; layer < ML_CNXK_MODEL_MAX_LAYERS; layer++) {
cnxk_mldev->xstats.offset_for_layer[model][layer] = 
stat_id;
 
@@ -195,7 +215,8 @@ cnxk_ml_xstats_init(struct cnxk_ml_dev *cnxk_mldev)
cnxk_mldev->xstats.count_per_layer[model][layer] = 
RTE_DIM(
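
A sketch of the resulting xstats index layout (sizes are the RTE_DIM() of
the respective tables):

	[ device_xstats ]
	[ model 0: model_xstats | layer 0: layer_xstats | ... ]
	[ model 1: model_xstats | layer 0: layer_xstats | ... ]
	...

	nb_stats = RTE_DIM(device_xstats) +
		   ML_CNXK_MAX_MODELS * (RTE_DIM(model_xstats) +
					 ML_CNXK_MODEL_MAX_LAYERS *
					 RTE_DIM(layer_xstats));

offset_for_model[] records where each model block begins; the "Model-%u-%s"
placeholder names are rewritten with the real model name by
cn10k_ml_xstat_model_name_set() at model load time.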

[PATCH v8 23/34] ml/cnxk: update internal info for TVM model

2023-10-22 Thread Srikanth Yalavarthi
Enabled updating internal I/O info structures for TVM models.
Static fields related to the model I/O are computed at load time.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cnxk_ml_ops.c|   4 ++
 drivers/ml/cnxk/mvtvm_ml_model.c | 111 +++
 drivers/ml/cnxk/mvtvm_ml_model.h |   2 +
 drivers/ml/cnxk/mvtvm_ml_ops.c   |   3 +
 drivers/ml/cnxk/mvtvm_ml_stubs.c |   9 +++
 drivers/ml/cnxk/mvtvm_ml_stubs.h |   1 +
 6 files changed, 130 insertions(+)

diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 85b37161d2..1565e521fd 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1244,6 +1244,8 @@ cnxk_ml_io_quantize(struct rte_ml_dev *dev, uint16_t 
model_id, struct rte_ml_buf
 
if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
info = cn10k_ml_model_io_info_get(model, 0);
+   else
+   info = mvtvm_ml_model_io_info_get(model, 0);
 
if (info == NULL)
return -EINVAL;
@@ -1296,6 +1298,8 @@ cnxk_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t 
model_id, struct rte_ml_b
 
if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
info = cn10k_ml_model_io_info_get(model, model->nb_layers - 1);
+   else
+   info = mvtvm_ml_model_io_info_get(model, model->nb_layers - 1);
 
if (info == NULL)
return -EINVAL;
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index 8536fd8927..b40b0a13af 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+
 #include 
 
 #include "cnxk_ml_model.h"
@@ -135,3 +137,112 @@ mvtvm_ml_model_get_layer_id(struct cnxk_ml_model *model, 
const char *layer_name,
 
return 0;
 }
+
+static enum rte_ml_io_type
+mvtvm_ml_io_type_map(uint8_t type)
+{
+   switch (type) {
+   case kDLInt:
+   return RTE_ML_IO_TYPE_INT32;
+   case kDLUInt:
+   return RTE_ML_IO_TYPE_UINT32;
+   case kDLFloat:
+   return RTE_ML_IO_TYPE_FP32;
+   case kDLBfloat:
+   return RTE_ML_IO_TYPE_BFLOAT16;
+   }
+
+   return RTE_ML_IO_TYPE_UNKNOWN;
+}
+
+void
+mvtvm_ml_model_io_info_set(struct cnxk_ml_model *model)
+{
+   struct tvmdp_model_metadata *metadata;
+   int32_t i;
+   int32_t j;
+
+   if (model->subtype == ML_CNXK_MODEL_SUBTYPE_TVM_MRVL)
+   goto tvm_mrvl_model;
+
+   metadata = &model->mvtvm.metadata;
+
+   /* Inputs, set for layer_id = 0 */
+   model->mvtvm.info.nb_inputs = metadata->model.num_input;
+   model->mvtvm.info.total_input_sz_d = 0;
+   model->mvtvm.info.total_input_sz_q = 0;
+   for (i = 0; i < metadata->model.num_input; i++) {
+   rte_strscpy(model->mvtvm.info.input[i].name, 
metadata->input[i].name,
+   TVMDP_NAME_STRLEN);
+   model->mvtvm.info.input[i].dtype =
+   mvtvm_ml_io_type_map(metadata->input[i].datatype.code);
+   model->mvtvm.info.input[i].qtype =
+   
mvtvm_ml_io_type_map(metadata->input[i].model_datatype.code);
+   model->mvtvm.info.input[i].nb_dims = metadata->input[i].ndim;
+
+   model->mvtvm.info.input[i].nb_elements = 1;
+   for (j = 0; j < metadata->input[i].ndim; j++) {
+   model->mvtvm.info.input[i].shape[j] = 
metadata->input[i].shape[j];
+   model->mvtvm.info.input[i].nb_elements *= 
metadata->input[i].shape[j];
+   }
+
+   model->mvtvm.info.input[i].sz_d =
+   model->mvtvm.info.input[i].nb_elements *
+   
rte_ml_io_type_size_get(model->mvtvm.info.input[i].dtype);
+   model->mvtvm.info.input[i].sz_q =
+   model->mvtvm.info.input[i].nb_elements *
+   
rte_ml_io_type_size_get(model->mvtvm.info.input[i].qtype);
+
+   model->mvtvm.info.total_input_sz_d += 
model->mvtvm.info.input[i].sz_d;
+   model->mvtvm.info.total_input_sz_q += 
model->mvtvm.info.input[i].sz_q;
+
+   plt_ml_dbg("model_id = %u, input[%u] - sz_d = %u sz_q = %u", 
model->model_id, i,
+  model->mvtvm.info.input[i].sz_d, 
model->mvtvm.info.input[i].sz_q);
+   }
+
+   /* Outputs, set for nb_layers - 1 */
+   model->mvtvm.info.nb_outputs = metadata->model.num_output;
+   model->mvtvm.info.total_output_sz_d = 0;
+   model->mvtvm.info.total_output_sz_q = 0;
+   for (i = 0; i < metadata->model.num_output; i++) {
+   rte_strscpy(model->mvtvm.info.output[i].name, 
metadata->output[i].name,
+   TVMDP_NAME_STRLEN);
+   model->mvtvm.info.output[i].dtype =
+   mvtvm_ml_io_type_map(metadata->output[i].datatype.code);
+   model->mvtvm.info.output[i].qtype =
+
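
mvtvm_ml_io_type_map() keys on the DLPack type-class code alone, while
DLPack's DLDataType also carries a width ({code, bits, lanes}), so kDLInt
maps to INT32 whatever the bit width. A stricter mapping would consult the
width too; a sketch, assuming the fixed-width rte_ml_io enums:

	static enum rte_ml_io_type
	mvtvm_ml_io_type_map_strict(DLDataType t)
	{
		if (t.code == kDLFloat && t.bits == 32)
			return RTE_ML_IO_TYPE_FP32;
		if (t.code == kDLFloat && t.bits == 16)
			return RTE_ML_IO_TYPE_FP16;
		if (t.code == kDLInt && t.bits == 8)
			return RTE_ML_IO_TYPE_INT8;
		if (t.code == kDLUInt && t.bits == 8)
			return RTE_ML_IO_TYPE_UINT8;

		return RTE_ML_IO_TYPE_UNKNOWN;
	}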

[PATCH v8 24/34] ml/cnxk: enable model unload in tvmdp library

2023-10-22 Thread Srikanth Yalavarthi
Enabled model unload using the external TVMDP library. Updated
the layer unload callback to support multiple layers.

Signed-off-by: Srikanth Yalavarthi 
Signed-off-by: Anup Prabhu 
---
 drivers/ml/cnxk/cn10k_ml_ops.c   |  8 +---
 drivers/ml/cnxk/cnxk_ml_ops.c|  7 +--
 drivers/ml/cnxk/mvtvm_ml_ops.c   | 28 
 drivers/ml/cnxk/mvtvm_ml_ops.h   |  1 +
 drivers/ml/cnxk/mvtvm_ml_stubs.c |  9 +
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  1 +
 6 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 4191ccc840..e7208391fd 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -780,11 +780,9 @@ cn10k_ml_layer_unload(void *device, uint16_t model_id, 
const char *layer_name)
struct cnxk_ml_layer *layer;
 
char str[RTE_MEMZONE_NAMESIZE];
-   uint16_t layer_id = 0;
+   uint16_t layer_id;
int ret;
 
-   PLT_SET_USED(layer_name);
-
cnxk_mldev = (struct cnxk_ml_dev *)device;
if (cnxk_mldev == NULL) {
plt_err("Invalid device = %p", device);
@@ -797,6 +795,10 @@ cn10k_ml_layer_unload(void *device, uint16_t model_id, 
const char *layer_name)
return -EINVAL;
}
 
+   ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id);
+   if (ret != 0)
+   return ret;
+
layer = &model->layer[layer_id];
 
snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u_%u", 
CN10K_ML_LAYER_MEMZONE_NAME,
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 1565e521fd..ce668e1eb6 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -1107,7 +1107,7 @@ cnxk_ml_model_unload(struct rte_ml_dev *dev, uint16_t 
model_id)
struct cnxk_ml_model *model;
 
char str[RTE_MEMZONE_NAMESIZE];
-   int ret;
+   int ret = 0;
 
if (dev == NULL)
return -EINVAL;
@@ -1125,7 +1125,10 @@ cnxk_ml_model_unload(struct rte_ml_dev *dev, uint16_t 
model_id)
return -EBUSY;
}
 
-   ret = cn10k_ml_model_unload(cnxk_mldev, model);
+   if (model->type == ML_CNXK_MODEL_TYPE_GLOW)
+   ret = cn10k_ml_model_unload(cnxk_mldev, model);
+   else
+   ret = mvtvm_ml_model_unload(cnxk_mldev, model);
if (ret != 0)
return ret;
 
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index e21bf2dc07..3847f9b6b9 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -185,3 +185,31 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
 
return ret;
 }
+
+int
+mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model)
+{
+   char str[RTE_MEMZONE_NAMESIZE];
+   const struct plt_memzone *mz;
+   int ret;
+
+   RTE_SET_USED(cnxk_mldev);
+
+   /* Unload model from TVMDP */
+   ret = tvmdp_model_unload(model->model_id);
+   if (ret != 0) {
+   plt_err("TVMDP: Model unload failed, model_id = %u, error = 
%d", model->model_id,
+   ret);
+   return ret;
+   }
+
+   snprintf(str, RTE_MEMZONE_NAMESIZE, "%s_%u", 
MVTVM_ML_MODEL_MEMZONE_NAME, model->model_id);
+   mz = rte_memzone_lookup(str);
+   if (mz == NULL) {
+   plt_err("Memzone lookup failed for TVM model: model_id = %u, mz 
= %s",
+   model->model_id, str);
+   return -EINVAL;
+   }
+
+   return plt_memzone_free(mz);
+}
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.h b/drivers/ml/cnxk/mvtvm_ml_ops.h
index 6607537599..770794fe7d 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.h
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.h
@@ -18,5 +18,6 @@ int mvtvm_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, 
const struct rte_ml_d
 int mvtvm_ml_dev_close(struct cnxk_ml_dev *cnxk_mldev);
 int mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *params,
struct cnxk_ml_model *model);
+int mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model);
 
 #endif /* _MVTVM_ML_OPS_H_ */
diff --git a/drivers/ml/cnxk/mvtvm_ml_stubs.c b/drivers/ml/cnxk/mvtvm_ml_stubs.c
index 80a9a90b4e..a17a76e41f 100644
--- a/drivers/ml/cnxk/mvtvm_ml_stubs.c
+++ b/drivers/ml/cnxk/mvtvm_ml_stubs.c
@@ -63,3 +63,12 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
 
return -EINVAL;
 }
+
+int
+mvtvm_ml_model_unload(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model)
+{
+   RTE_SET_USED(cnxk_mldev);
+   RTE_SET_USED(model);
+
+   return -EINVAL;
+}
diff --git a/drivers/ml/cnxk/mvtvm_ml_stubs.h b/drivers/ml/cnxk/mvtvm_ml_stubs.h
index 29f721072a..3776fb5369 100644
--- a/drivers/ml/cnxk/mvtvm_ml_stubs.h
+++ b/drivers/ml/cnxk/mvtvm_ml_stubs.h
@@ -15,6 +15,7 @@ int mvtvm_ml_dev_configu

[PATCH v8 31/34] ml/cnxk: add generic ML malloc and free callback

2023-10-22 Thread Srikanth Yalavarthi
Implemented generic ML malloc and free callbacks.

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 30 ++
 drivers/ml/cnxk/cn10k_ml_ops.h |  3 +++
 drivers/ml/cnxk/mvtvm_ml_ops.c |  2 ++
 3 files changed, 35 insertions(+)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 7802425c87..01b0a44caa 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1497,3 +1497,33 @@ cn10k_ml_io_free(void *device, uint16_t model_id, const 
char *layer_name)
 
return plt_memzone_free(mz);
 }
+
+int
+cn10k_ml_malloc(const char *name, size_t size, uint32_t align, void **addr)
+{
+   const struct plt_memzone *mz;
+
+   mz = plt_memzone_reserve_aligned(name, size, 0, align);
+   if (mz == NULL) {
+   plt_err("ml_malloc failed: Unable to allocate memory: name = 
%s", name);
+   return -ENOMEM;
+   }
+
+   *addr = mz->addr;
+
+   return 0;
+}
+
+int
+cn10k_ml_free(const char *name)
+{
+   const struct plt_memzone *mz;
+
+   mz = plt_memzone_lookup(name);
+   if (mz == NULL) {
+   plt_err("ml_free failed: Memzone not found: name = %s", name);
+   return -EINVAL;
+   }
+
+   return plt_memzone_free(mz);
+}
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 9c41c1c0b0..eb3e1c139c 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -333,6 +333,9 @@ int cn10k_ml_io_alloc(void *device, uint16_t model_id, 
const char *layer_name,
  uint64_t **input_qbuffer, uint64_t **output_qbuffer);
 int cn10k_ml_io_free(void *device, uint16_t model_id, const char *layer_name);
 
+int cn10k_ml_malloc(const char *name, size_t size, uint32_t align, void 
**addr);
+int cn10k_ml_free(const char *name);
+
 /* xstats ops */
 void cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_model *model,
   uint16_t stat_id, uint16_t entry, char 
*suffix);
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index a50b31ec6e..9d59e28661 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -234,6 +234,8 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
callback->tvmrt_glow_layer_unload = cn10k_ml_layer_unload;
callback->tvmrt_io_alloc = cn10k_ml_io_alloc;
callback->tvmrt_io_free = cn10k_ml_io_free;
+   callback->tvmrt_malloc = cn10k_ml_malloc;
+   callback->tvmrt_free = cn10k_ml_free;
} else {
callback = NULL;
}
-- 
2.42.0
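
Both callbacks are keyed purely by memzone name, so names passed through
tvmrt_malloc/tvmrt_free must be unique per live allocation. A usage sketch
(name and size are hypothetical):

	void *addr = NULL;

	if (cn10k_ml_malloc("ml_tvm_scratch_0", 4096, ML_CN10K_ALIGN_SIZE,
			    &addr) != 0)
		return -ENOMEM;

	/* ... use addr ... */

	cn10k_ml_free("ml_tvm_scratch_0");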



[PATCH v8 27/34] ml/cnxk: update internal TVM model info structure

2023-10-22 Thread Srikanth Yalavarthi
From: Prince Takkar 

Added support to update internal model info structure
for TVM models.

Signed-off-by: Prince Takkar 
Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/mvtvm_ml_model.c | 65 
 drivers/ml/cnxk/mvtvm_ml_model.h |  2 +
 drivers/ml/cnxk/mvtvm_ml_ops.c   |  3 ++
 3 files changed, 70 insertions(+)

diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index b40b0a13af..650dd970bd 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -11,6 +11,7 @@
 
 #include 
 
+#include "cnxk_ml_dev.h"
 #include "cnxk_ml_model.h"
 
 /* Objects list */
@@ -246,3 +247,67 @@ mvtvm_ml_model_io_info_get(struct cnxk_ml_model *model, 
uint16_t layer_id)
 
return &model->mvtvm.info;
 }
+
+void
+mvtvm_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model 
*model)
+{
+   struct tvmdp_model_metadata *metadata;
+   struct rte_ml_model_info *info;
+   struct rte_ml_io_info *output;
+   struct rte_ml_io_info *input;
+   uint8_t i;
+
+   info = PLT_PTR_CAST(model->info);
+   input = PLT_PTR_ADD(info, sizeof(struct rte_ml_model_info));
+   output = PLT_PTR_ADD(input, ML_CNXK_MODEL_MAX_INPUT_OUTPUT * 
sizeof(struct rte_ml_io_info));
+
+   /* Reset model info */
+   memset(info, 0, sizeof(struct rte_ml_model_info));
+
+   if (model->subtype == ML_CNXK_MODEL_SUBTYPE_TVM_MRVL)
+   goto tvm_mrvl_model;
+
+   metadata = &model->mvtvm.metadata;
+   rte_memcpy(info->name, metadata->model.name, TVMDP_NAME_STRLEN);
+   snprintf(info->version, RTE_ML_STR_MAX, "%u.%u.%u.%u", 
metadata->model.version[0],
+metadata->model.version[1], metadata->model.version[2],
+metadata->model.version[3]);
+   info->model_id = model->model_id;
+   info->device_id = cnxk_mldev->mldev->data->dev_id;
+   info->io_layout = RTE_ML_IO_LAYOUT_SPLIT;
+   info->min_batches = model->batch_size;
+   info->max_batches = model->batch_size;
+   info->nb_inputs = metadata->model.num_input;
+   info->input_info = input;
+   info->nb_outputs = metadata->model.num_output;
+   info->output_info = output;
+   info->wb_size = 0;
+
+   /* Set input info */
+   for (i = 0; i < info->nb_inputs; i++) {
+   rte_memcpy(input[i].name, metadata->input[i].name, 
MRVL_ML_INPUT_NAME_LEN);
+   input[i].nb_dims = metadata->input[i].ndim;
+   input[i].shape = &model->mvtvm.info.input[i].shape[0];
+   input[i].type = model->mvtvm.info.input[i].qtype;
+   input[i].nb_elements = model->mvtvm.info.input[i].nb_elements;
+   input[i].size = model->mvtvm.info.input[i].nb_elements *
+   
rte_ml_io_type_size_get(model->mvtvm.info.input[i].qtype);
+   }
+
+   /* Set output info */
+   for (i = 0; i < info->nb_outputs; i++) {
+   rte_memcpy(output[i].name, metadata->output[i].name, 
MRVL_ML_OUTPUT_NAME_LEN);
+   output[i].nb_dims = metadata->output[i].ndim;
+   output[i].shape = &model->mvtvm.info.output[i].shape[0];
+   output[i].type = model->mvtvm.info.output[i].qtype;
+   output[i].nb_elements = model->mvtvm.info.output[i].nb_elements;
+   output[i].size = model->mvtvm.info.output[i].nb_elements *
+
rte_ml_io_type_size_get(model->mvtvm.info.output[i].qtype);
+   }
+
+   return;
+
+tvm_mrvl_model:
+   cn10k_ml_model_info_set(cnxk_mldev, model, &model->mvtvm.info,
+   &model->layer[0].glow.metadata);
+}
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.h b/drivers/ml/cnxk/mvtvm_ml_model.h
index e86581bc6a..a1247ffbde 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.h
+++ b/drivers/ml/cnxk/mvtvm_ml_model.h
@@ -11,6 +11,7 @@
 
 #include "cnxk_ml_io.h"
 
+struct cnxk_ml_dev;
 struct cnxk_ml_model;
 
 /* Maximum number of objects per model */
@@ -52,5 +53,6 @@ int mvtvm_ml_model_get_layer_id(struct cnxk_ml_model *model, 
const char *layer_n
uint16_t *layer_id);
 void mvtvm_ml_model_io_info_set(struct cnxk_ml_model *model);
 struct cnxk_ml_io_info *mvtvm_ml_model_io_info_get(struct cnxk_ml_model 
*model, uint16_t layer_id);
+void mvtvm_ml_model_info_set(struct cnxk_ml_dev *cnxk_mldev, struct 
cnxk_ml_model *model);
 
 #endif /* _MVTVM_ML_MODEL_H_ */
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index 323c7c6fb6..c6872cd89a 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -178,6 +178,9 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
/* Update model I/O data */
mvtvm_ml_model_io_info_set(model);
 
+   /* Set model info */
+   mvtvm_ml_model_info_set(cnxk_mldev, model);
+
return 0;
 
 error:
-- 
2.42.0
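
The pointer arithmetic in mvtvm_ml_model_info_set() implies that the
model->info region is a single contiguous block, laid out as:

	struct rte_ml_model_info
	struct rte_ml_io_info input[ML_CNXK_MODEL_MAX_INPUT_OUTPUT]
	struct rte_ml_io_info output[ML_CNXK_MODEL_MAX_INPUT_OUTPUT]

info->input_info and info->output_info point into this block, so the whole
region can be handed to the application by rte_ml_model_info_get() without
further allocation.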



[PATCH v8 32/34] ml/cnxk: support quantize and dequantize callback

2023-10-22 Thread Srikanth Yalavarthi
From: Prince Takkar 

Added support for quantize and dequantize callback
functions for TVM models.

Signed-off-by: Prince Takkar 
---
 drivers/ml/cnxk/mvtvm_ml_ops.c | 129 +
 drivers/ml/cnxk/mvtvm_ml_ops.h |   4 +
 2 files changed, 133 insertions(+)

diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index 9d59e28661..39c8bf0f04 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -2,11 +2,15 @@
  * Copyright (c) 2023 Marvell.
  */
 
+#include 
+
 #include 
 #include 
 #include 
 #include 
 
+#include 
+
 #include "cnxk_ml_dev.h"
 #include "cnxk_ml_model.h"
 #include "cnxk_ml_ops.h"
@@ -236,6 +240,8 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct 
rte_ml_model_params *
callback->tvmrt_io_free = cn10k_ml_io_free;
callback->tvmrt_malloc = cn10k_ml_malloc;
callback->tvmrt_free = cn10k_ml_free;
+   callback->tvmrt_quantize = mvtvm_ml_io_quantize;
+   callback->tvmrt_dequantize = mvtvm_ml_io_dequantize;
} else {
callback = NULL;
}
@@ -366,3 +372,126 @@ mvtvm_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, 
struct cnxk_ml_model *model)
 
return 0;
 }
+
+int
+mvtvm_ml_io_quantize(void *device, uint16_t model_id, const char *layer_name,
+const DLTensor **deq_tensor, void *qbuffer)
+{
+   struct cnxk_ml_io_info *info = NULL;
+   struct cnxk_ml_dev *cnxk_mldev;
+   struct cnxk_ml_model *model;
+   uint16_t layer_id = 0;
+   uint8_t *lcl_dbuffer;
+   uint8_t *lcl_qbuffer;
+   uint32_t i;
+   int ret;
+
+#ifdef CNXK_ML_DEV_DEBUG
+   if ((device == NULL) || (deq_tensor == NULL) || (qbuffer == NULL))
+   return -EINVAL;
+#endif
+
+   cnxk_mldev = (struct cnxk_ml_dev *)device;
+
+   model = cnxk_mldev->mldev->data->models[model_id];
+#ifdef CNXK_ML_DEV_DEBUG
+   if (model == NULL) {
+   plt_err("Invalid model_id = %u", model_id);
+   return -EINVAL;
+   }
+#endif
+
+   /* Get layer id */
+   for (layer_id = 0; layer_id < model->mvtvm.metadata.model.nb_layers; 
layer_id++) {
+   if (strcmp(model->layer[layer_id].name, layer_name) == 0)
+   break;
+   }
+
+#ifdef CNXK_ML_DEV_DEBUG
+   if (layer_id == model->mvtvm.metadata.model.nb_layers) {
+   plt_err("Invalid layer name: %s", layer_name);
+   return -EINVAL;
+   }
+
+   if (model->layer[layer_id].type != ML_CNXK_LAYER_TYPE_MRVL) {
+   plt_err("Invalid layer name / type: %s", layer_name);
+   return -EINVAL;
+   }
+#endif
+
+   info = &model->layer[layer_id].info;
+   lcl_qbuffer = (uint8_t *)qbuffer;
+
+   for (i = 0; i < info->nb_inputs; i++) {
+   lcl_dbuffer = PLT_PTR_ADD(deq_tensor[i]->data, 
deq_tensor[i]->byte_offset);
+
+   ret = cnxk_ml_io_quantize_single(&info->input[i], lcl_dbuffer, 
lcl_qbuffer);
+   if (ret < 0)
+   return ret;
+
+   lcl_qbuffer += info->input[i].sz_q;
+   }
+
+   return 0;
+}
+
+int
+mvtvm_ml_io_dequantize(void *device, uint16_t model_id, const char 
*layer_name, void *qbuffer,
+  const DLTensor **deq_tensor)
+{
+   struct cnxk_ml_io_info *info = NULL;
+   struct cnxk_ml_dev *cnxk_mldev;
+   struct cnxk_ml_model *model;
+   uint16_t layer_id = 0;
+   uint8_t *lcl_dbuffer;
+   uint8_t *lcl_qbuffer;
+   uint32_t i;
+   int ret;
+
+#ifdef CNXK_ML_DEV_DEBUG
+   if ((device == NULL) || (deq_tensor == NULL) || (qbuffer == NULL))
+   return -EINVAL;
+#endif
+
+   cnxk_mldev = (struct cnxk_ml_dev *)device;
+
+   model = cnxk_mldev->mldev->data->models[model_id];
+#ifdef CNXK_ML_DEV_DEBUG
+   if (model == NULL) {
+   plt_err("Invalid model_id = %u", model_id);
+   return -EINVAL;
+   }
+#endif
+
+   for (layer_id = 0; layer_id < model->mvtvm.metadata.model.nb_layers; 
layer_id++) {
+   if (strcmp(model->layer[layer_id].name, layer_name) == 0)
+   break;
+   }
+
+#ifdef CNXK_ML_DEV_DEBUG
+   if (layer_id == model->mvtvm.metadata.model.nb_layers) {
+   plt_err("Invalid layer name: %s", layer_name);
+   return -EINVAL;
+   }
+
+   if (model->layer[layer_id].type != ML_CNXK_LAYER_TYPE_MRVL) {
+   plt_err("Invalid layer name / type: %s", layer_name);
+   return -EINVAL;
+   }
+#endif
+
+   info = &model->layer[layer_id].info;
+   lcl_qbuffer = (uint8_t *)qbuffer;
+
+   for (i = 0; i < info->nb_outputs; i++) {
+   lcl_dbuffer = PLT_PTR_ADD(deq_tensor[i]->data, deq_tensor[i]->byte_offset);
+
+   ret = cnxk_ml_io_dequantize_single(&info->output[i], lcl_qbuffer, lcl_dbuffer);
+   if (ret < 0)
+   return ret;
+
+   lcl_qbuffer += info->output[i].sz_q;
+   }
+
+   return 0;
+}

[PATCH v8 30/34] ml/cnxk: implement I/O alloc and free callbacks

2023-10-22 Thread Srikanth Yalavarthi
Implemented callback functions for I/O allocation and free
for Glow layers.
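
A minimal usage sketch pairing the two hooks, based on the signatures added
below; the device handle, model_id and the layer name "layer_0" are
placeholders, not values taken from the driver:

    #include "cn10k_ml_ops.h"

    /* Illustrative round trip: reserve, use and release quantized I/O. */
    static int
    example_io_roundtrip(void *device, uint16_t model_id)
    {
            uint64_t *input_qbuffer;
            uint64_t *output_qbuffer;
            int ret;

            /* Reserve aligned quantized input/output buffers for one layer. */
            ret = cn10k_ml_io_alloc(device, model_id, "layer_0",
                                    &input_qbuffer, &output_qbuffer);
            if (ret != 0)
                    return ret;

            /* ... run inference using the quantized buffers here ... */

            /* Free the backing memzone once the buffers are no longer used. */
            return cn10k_ml_io_free(device, model_id, "layer_0");
    }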

Signed-off-by: Srikanth Yalavarthi 
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 87 ++
 drivers/ml/cnxk/cn10k_ml_ops.h |  3 ++
 drivers/ml/cnxk/mvtvm_ml_ops.c |  2 +
 3 files changed, 92 insertions(+)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 0c67ce7b40..7802425c87 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1410,3 +1410,90 @@ cn10k_ml_inference_sync(void *device, uint16_t index, void *input, void *output,
 error_enqueue:
return ret;
 }
+
+int
+cn10k_ml_io_alloc(void *device, uint16_t model_id, const char *layer_name, uint64_t **input_qbuffer,
+ uint64_t **output_qbuffer)
+{
+   struct cnxk_ml_dev *cnxk_mldev;
+   struct cnxk_ml_model *model;
+   struct cnxk_ml_layer *layer;
+
+   char str[RTE_MEMZONE_NAMESIZE];
+   const struct plt_memzone *mz;
+   uint64_t output_size;
+   uint64_t input_size;
+   uint16_t layer_id;
+   int ret;
+
+   cnxk_mldev = (struct cnxk_ml_dev *)device;
+   if (cnxk_mldev == NULL) {
+   plt_err("Invalid device = %p", device);
+   return -EINVAL;
+   }
+
+   model = cnxk_mldev->mldev->data->models[model_id];
+   if (model == NULL) {
+   plt_err("Invalid model_id = %u", model_id);
+   return -EINVAL;
+   }
+
+   ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id);
+   if (ret != 0)
+   return ret;
+
+   layer = &model->layer[layer_id];
+   input_size = PLT_ALIGN_CEIL(layer->info.total_input_sz_q, ML_CN10K_ALIGN_SIZE);
+   output_size = PLT_ALIGN_CEIL(layer->info.total_output_sz_q, ML_CN10K_ALIGN_SIZE);
+
+   sprintf(str, "cn10k_ml_io_mz_%u_%u", model_id, layer_id);
+   mz = plt_memzone_reserve_aligned(str, input_size + output_size, 0, ML_CN10K_ALIGN_SIZE);
+   if (mz == NULL) {
+   plt_err("io_alloc failed: Unable to allocate memory: model_id = 
%u, layer_name = %s",
+   model_id, layer_name);
+   return -ENOMEM;
+   }
+
+   *input_qbuffer = mz->addr;
+   *output_qbuffer = PLT_PTR_ADD(mz->addr, input_size);
+
+   return 0;
+}
+
+int
+cn10k_ml_io_free(void *device, uint16_t model_id, const char *layer_name)
+{
+   struct cnxk_ml_dev *cnxk_mldev;
+   struct cnxk_ml_model *model;
+
+   char str[RTE_MEMZONE_NAMESIZE];
+   const struct plt_memzone *mz;
+   uint16_t layer_id;
+   int ret;
+
+   cnxk_mldev = (struct cnxk_ml_dev *)device;
+   if (cnxk_mldev == NULL) {
+   plt_err("Invalid device = %p", device);
+   return -EINVAL;
+   }
+
+   model = cnxk_mldev->mldev->data->models[model_id];
+   if (model == NULL) {
+   plt_err("Invalid model_id = %u", model_id);
+   return -EINVAL;
+   }
+
+   ret = cn10k_ml_model_get_layer_id(model, layer_name, &layer_id);
+   if (ret != 0)
+   return ret;
+
+   sprintf(str, "cn10k_ml_io_mz_%u_%u", model_id, layer_id);
+   mz = plt_memzone_lookup(str);
+   if (mz == NULL) {
+   plt_err("io_free failed: Memzone not found: model_id = %u, 
layer_name = %s",
+   model_id, layer_name);
+   return -EINVAL;
+   }
+
+   return plt_memzone_free(mz);
+}
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 045e2e6cd2..9c41c1c0b0 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -329,6 +329,9 @@ int cn10k_ml_layer_load(void *device, uint16_t model_id, const char *layer_name,
 int cn10k_ml_layer_unload(void *device, uint16_t model_id, const char *layer_name);
 int cn10k_ml_layer_start(void *device, uint16_t model_id, const char *layer_name);
 int cn10k_ml_layer_stop(void *device, uint16_t model_id, const char *layer_name);
+int cn10k_ml_io_alloc(void *device, uint16_t model_id, const char *layer_name,
+ uint64_t **input_qbuffer, uint64_t **output_qbuffer);
+int cn10k_ml_io_free(void *device, uint16_t model_id, const char *layer_name);
 
 /* xstats ops */
 void cn10k_ml_xstat_model_name_set(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
diff --git a/drivers/ml/cnxk/mvtvm_ml_ops.c b/drivers/ml/cnxk/mvtvm_ml_ops.c
index abfbae2b3a..a50b31ec6e 100644
--- a/drivers/ml/cnxk/mvtvm_ml_ops.c
+++ b/drivers/ml/cnxk/mvtvm_ml_ops.c
@@ -232,6 +234,8 @@ mvtvm_ml_model_load(struct cnxk_ml_dev *cnxk_mldev, struct rte_ml_model_params *
callback = &model->mvtvm.cb;
callback->tvmrt_glow_layer_load = cn10k_ml_layer_load;
callback->tvmrt_glow_layer_unload = cn10k_ml_layer_unload;
+   callback->tvmrt_io_alloc = cn10k_ml_io_alloc;
+   callback->tvmrt_io_free = cn10k_ml_io_free;
} else {
callback = NULL;
}

[PATCH v8 33/34] ml/cnxk: enable fast-path ops for TVM models

2023-10-22 Thread Srikanth Yalavarthi
From: Anup Prabhu 

Enable fast-path ops support for TVM models. Models of the
Hybrid and LLVM sub-types execute inference operations
through TVMDP library function calls.

For TVM MRVL model subtypes that have a single MRVL layer,
inference requests are enqueued directly to the hardware
by the driver.
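
A rough sketch of the dispatch rule described above; the function names and
the subtype constant below are illustrative assumptions, not the exact
driver symbols:

    /* Hypothetical backends, standing in for the driver's real enqueue paths. */
    static int example_hw_enqueue(struct cnxk_ml_dev *dev, struct rte_ml_op *op);
    static int example_tvmdp_enqueue(struct cnxk_ml_dev *dev, struct rte_ml_op *op);

    static int
    example_enqueue_dispatch(struct cnxk_ml_dev *dev, struct cnxk_ml_model *model,
                             struct rte_ml_op *op)
    {
            /* A TVM MRVL model has a single MRVL layer: go straight to HW. */
            if (model->subtype == ML_CNXK_MODEL_SUBTYPE_TVM_MRVL)
                    return example_hw_enqueue(dev, op);

            /* Hybrid and LLVM sub-types run through TVMDP library calls. */
            return example_tvmdp_enqueue(dev, op);
    }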

Signed-off-by: Anup Prabhu 
Signed-off-by: Srikanth Yalavarthi 
---
 doc/guides/rel_notes/release_23_11.rst |   3 +
 drivers/ml/cnxk/cn10k_ml_ops.c |   4 -
 drivers/ml/cnxk/cnxk_ml_io.h   |   6 ++
 drivers/ml/cnxk/cnxk_ml_ops.c  |   4 +
 drivers/ml/cnxk/cnxk_ml_ops.h  |   5 +
 drivers/ml/cnxk/mvtvm_ml_model.c   |  20 
 drivers/ml/cnxk/mvtvm_ml_model.h   |   6 ++
 drivers/ml/cnxk/mvtvm_ml_ops.c | 124 +
 drivers/ml/cnxk/mvtvm_ml_ops.h |  43 +
 9 files changed, 211 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/release_23_11.rst b/doc/guides/rel_notes/release_23_11.rst
index 0a6fc76a9d..5fcf2a1897 100644
--- a/doc/guides/rel_notes/release_23_11.rst
+++ b/doc/guides/rel_notes/release_23_11.rst
@@ -243,6 +243,9 @@ New Features
  Added dispatcher library whose purpose is to help decouple different
   parts (modules) of an eventdev-based application.
 
+* **Updated Marvell cnxk mldev driver.**
+
+  * Added support for models compiled using TVM framework.
 
 Removed Items
 -
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index 01b0a44caa..b9d30278c6 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -371,10 +371,6 @@ cn10k_ml_dev_configure(struct cnxk_ml_dev *cnxk_mldev, const struct rte_ml_dev_c
else
cn10k_mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_lf;
 
-   cnxk_mldev->mldev->enqueue_burst = cnxk_ml_enqueue_burst;
-   cnxk_mldev->mldev->dequeue_burst = cnxk_ml_dequeue_burst;
-   cnxk_mldev->mldev->op_error_get = cn10k_ml_op_error_get;
-
return 0;
 }
 
diff --git a/drivers/ml/cnxk/cnxk_ml_io.h b/drivers/ml/cnxk/cnxk_ml_io.h
index 5de166c252..6d5d25a7c9 100644
--- a/drivers/ml/cnxk/cnxk_ml_io.h
+++ b/drivers/ml/cnxk/cnxk_ml_io.h
@@ -47,6 +47,12 @@ struct cnxk_ml_io {
 
/* Scale */
float scale;
+
+   /* Dequantized offset */
+   uint32_t offset_d;
+
+   /* Quantized offset */
+   uint32_t offset_q;
 };
 
 /* Model / Layer IO structure */
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 2632d70d8c..bf266d4d6e 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -632,6 +632,10 @@ cnxk_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *co
cnxk_mldev->max_nb_layers =
	cnxk_mldev->cn10k_mldev.fw.req->cn10k_req.jd.fw_load.cap.s.max_models;
 
+   cnxk_mldev->mldev->enqueue_burst = cnxk_ml_enqueue_burst;
+   cnxk_mldev->mldev->dequeue_burst = cnxk_ml_dequeue_burst;
+   cnxk_mldev->mldev->op_error_get = cn10k_ml_op_error_get;
+
/* Allocate and initialize index_map */
if (cnxk_mldev->index_map == NULL) {
cnxk_mldev->index_map =
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.h b/drivers/ml/cnxk/cnxk_ml_ops.h
index ab32676b3e..7b49793a57 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.h
+++ b/drivers/ml/cnxk/cnxk_ml_ops.h
@@ -24,6 +24,11 @@ struct cnxk_ml_req {
union {
/* CN10K */
struct cn10k_ml_req cn10k_req;
+
+#ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM
+   /* MVTVM */
+   struct mvtvm_ml_req mvtvm_req;
+#endif
};
 
/* Address of status field */
diff --git a/drivers/ml/cnxk/mvtvm_ml_model.c b/drivers/ml/cnxk/mvtvm_ml_model.c
index ffbcec8b80..95bde6a9cb 100644
--- a/drivers/ml/cnxk/mvtvm_ml_model.c
+++ b/drivers/ml/cnxk/mvtvm_ml_model.c
@@ -198,6 +198,16 @@ mvtvm_ml_model_io_info_set(struct cnxk_ml_model *model)
model->mvtvm.info.total_input_sz_d += model->mvtvm.info.input[i].sz_d;
model->mvtvm.info.total_input_sz_q += model->mvtvm.info.input[i].sz_q;
 
+   model->mvtvm.info.input[i].offset_d = model->mvtvm.info.total_input_sz_d;
+   model->mvtvm.info.input[i].offset_q = model->mvtvm.info.total_input_sz_q;
+
+   model->mvtvm.input_tensor[i].device = metadata->input[i].device;
+   model->mvtvm.input_tensor[i].ndim = metadata->input[i].ndim;
+   model->mvtvm.input_tensor[i].dtype = metadata->input[i].datatype;
+   model->mvtvm.input_tensor[i].shape = metadata->input[i].shape;
+   model->mvtvm.input_tensor[i].strides = NULL;
+   model->mvtvm.input_tensor[i].byte_offset = model->mvtvm.info.input[i].offset_q;
+
plt_ml_dbg("model_id = %u, input[%u] - sz_d = %u sz_q = %u", model->model_id, i,
   model->mvtvm.info.input[i].sz_d, model->mvtvm.info.input[i].sz_q);
}
@@ -231,6 

[PATCH v8 34/34] ml/cnxk: enable creation of mvtvm virtual device

2023-10-22 Thread Srikanth Yalavarthi
Enable support to create an mvtvm virtual device on
systems without a PCI-based ML HW accelerator.
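
As a usage sketch, the vdev can be created from the EAL command line (shown
in the documentation update below) or programmatically. rte_vdev_init() is
the standard DPDK call; the max_qps device option is the one introduced by
this patch:

    #include <stdlib.h>

    #include <rte_bus_vdev.h>
    #include <rte_debug.h>

    /* Create the mvtvm ML vdev with 4 queue pairs. */
    if (rte_vdev_init("ml_mvtvm", "max_qps=4") != 0)
            rte_exit(EXIT_FAILURE, "Failed to create ml_mvtvm vdev\n");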

Signed-off-by: Srikanth Yalavarthi 
---
 doc/guides/mldevs/cnxk.rst   |  50 +++-
 drivers/ml/cnxk/cn10k_ml_dev.c   |   8 ++
 drivers/ml/cnxk/cn10k_ml_dev.h   |   3 +
 drivers/ml/cnxk/cnxk_ml_dev.c|   3 +
 drivers/ml/cnxk/cnxk_ml_dev.h|  21 
 drivers/ml/cnxk/cnxk_ml_ops.c|  82 +
 drivers/ml/cnxk/meson.build  |   1 +
 drivers/ml/cnxk/mvtvm_ml_dev.c   | 196 +++
 drivers/ml/cnxk/mvtvm_ml_dev.h   |  40 +++
 drivers/ml/cnxk/mvtvm_ml_ops.c   |  31 +
 drivers/ml/cnxk/mvtvm_ml_ops.h   |   2 +
 drivers/ml/cnxk/mvtvm_ml_stubs.c |  18 +++
 drivers/ml/cnxk/mvtvm_ml_stubs.h |   2 +
 13 files changed, 433 insertions(+), 24 deletions(-)
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_dev.c
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_dev.h

diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst
index a4d8903896..28e5b5b87f 100644
--- a/doc/guides/mldevs/cnxk.rst
+++ b/doc/guides/mldevs/cnxk.rst
@@ -239,6 +239,23 @@ Bind the ML PF device to the vfio_pci driver:
usertools/dpdk-devbind.py -u :00:10.0
usertools/dpdk-devbind.py -b vfio-pci :00:10.0
 
+VDEV support
+------------
+
+On platforms which don't support ML hardware acceleration through a PCI device,
+the Marvell ML CNXK PMD can execute inference operations on a vdev with ML
+models compiled using the Apache TVM framework.
+
+VDEV can be enabled by passing the EAL arguments
+
+.. code-block:: console
+
+   --vdev ml_mvtvm
+
+VDEV can also be used on platforms with an ML HW accelerator. However, to use
+VDEV in this case, the PCI device has to be unbound. When the PCI device is
+bound, creation of the vdev is skipped.
+
 
 Runtime Config Options
 ----------------------
@@ -249,6 +266,8 @@ Runtime Config Options
   The parameter ``fw_path`` can be used by the user
   to load ML firmware from a custom path.
 
+  This option is supported only on PCI HW accelerator.
+
   For example::
 
  -a :00:10.0,fw_path="/home/user/ml_fw.bin"
@@ -264,6 +283,8 @@ Runtime Config Options
  When enabled, firmware would mask the DPE non-fatal hardware errors as warnings.
  The parameter ``enable_dpe_warnings`` is used for this configuration.
 
+  This option is supported only on PCI HW accelerator.
+
   For example::
 
  -a :00:10.0,enable_dpe_warnings=0
@@ -280,11 +301,19 @@ Runtime Config Options
  Caching of model data improves the inferencing throughput / latency for the model.
   The parameter ``cache_model_data`` is used to enable data caching.
 
+  This option is supported on PCI HW accelerator and vdev.
+
   For example::
 
  -a :00:10.0,cache_model_data=0
 
-  With the above configuration, model data caching is disabled.
+  With the above configuration, model data caching is disabled on the HW accelerator.
+
+  For example::
+
+ --vdev ml_mvtvm,cache_model_data=0
+
+  With the above configuration, model data caching is disabled on vdev.
 
 
 **OCM allocation mode** (default ``lowest``)
@@ -300,6 +329,8 @@ Runtime Config Options
   ``largest``
Allocate OCM for the model from the slot with the largest amount of free space.
 
+  This option is supported only on PCI HW accelerator.
+
   For example::
 
  -a :00:10.0,ocm_alloc_mode=lowest
@@ -317,6 +348,8 @@ Runtime Config Options
   Supported page sizes by the driver are 1 KB, 2 KB, 4 KB, 8 KB and 16 KB.
   Default page size is 16 KB.
 
+  This option is supported only on PCI HW accelerator.
+
   For example::
 
  -a :00:10.0,ocm_page_size=8192
@@ -341,6 +374,8 @@ Runtime Config Options
 Enabling the spinlock version would remove restrictions on the number of queue-pairs
 that can be supported by the driver.
 
+   This option is supported only on PCI HW accelerator.
+
   For example::
 
  -a :00:10.0,hw_queue_lock=1
@@ -349,6 +384,19 @@ Runtime Config Options
   in the fast path enqueue burst operation.
 
 
+**Maximum queue pairs** (default ``1``)
+
+  VDEV supports additional EAL arguments to configure the maximum number of
+  queue-pairs on the ML device through the option ``max_qps``.
+
+  This option is supported only on vdev.
+
+  For example::
+
+ --vdev ml_mvtvm,max_qps=4
+
+  With the above configuration, 4 queue-pairs are created on the vdev.
+
 Debugging Options
 -----------------
 
diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c
index 91813e9d0a..41f3b7a95d 100644
--- a/drivers/ml/cnxk/cn10k_ml_dev.c
+++ b/drivers/ml/cnxk/cn10k_ml_dev.c
@@ -309,6 +309,12 @@ cn10k_ml_pci_probe(struct rte_pci_driver *pci_drv, struct rte_pci_device *pci_de
 
PLT_SET_USED(pci_drv);
 
+   if (cnxk_ml_dev_initialized == 1) {
+   plt_err("ML CNXK device already initialized!");
+   plt_err("Cannot initialize CN10K PCI dev");
+   return -EINVAL;
+   }
+
init_params = (struct rte_ml_dev_pmd_init_params){
  

[PATCH v8 18/34] ml/cnxk: support config and close of tvmdp library

2023-10-22 Thread Srikanth Yalavarthi
Added support to configure and close the TVMDP library based
on ML device configuration options.

Updated the meson build to enable Jansson, TVM runtime and the
TVMDP library as build dependencies.
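
A rough sketch of the configure/close wiring this message describes; the
tvmdp_configure()/tvmdp_close() entry points and their arguments are
assumptions drawn from the commit message, not a copy of the TVMDP API:

    #ifdef RTE_MLDEV_CNXK_ENABLE_MVTVM

    static int
    example_mvtvm_configure(uint16_t nb_models)
    {
            int ret;

            /* Hand the configured model count to the TVMDP library. */
            ret = tvmdp_configure(nb_models);
            if (ret != 0)
                    plt_err("TVMDP configuration failed, error = %d", ret);

            return ret;
    }

    static int
    example_mvtvm_close(void)
    {
            /* Release TVMDP library state when the ML device is closed. */
            return tvmdp_close();
    }

    #endif /* RTE_MLDEV_CNXK_ENABLE_MVTVM */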

Signed-off-by: Srikanth Yalavarthi 
---
 config/arm/arm64_cn10k_linux_gcc |   1 +
 config/arm/arm64_cn9k_linux_gcc  |   1 +
 doc/guides/mldevs/cnxk.rst   | 169 +++
 drivers/ml/cnxk/cnxk_ml_ops.c|   7 ++
 drivers/ml/cnxk/cnxk_ml_ops.h|   6 ++
 drivers/ml/cnxk/meson.build  |  58 +++
 drivers/ml/cnxk/mvtvm_ml_ops.c   |  41 
 drivers/ml/cnxk/mvtvm_ml_ops.h   |  19 
 drivers/ml/cnxk/mvtvm_ml_stubs.c |  26 +
 drivers/ml/cnxk/mvtvm_ml_stubs.h |  15 +++
 10 files changed, 343 insertions(+)
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_ops.c
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_ops.h
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_stubs.c
 create mode 100644 drivers/ml/cnxk/mvtvm_ml_stubs.h

diff --git a/config/arm/arm64_cn10k_linux_gcc b/config/arm/arm64_cn10k_linux_gcc
index 05d2d64cf2..fa904af5d0 100644
--- a/config/arm/arm64_cn10k_linux_gcc
+++ b/config/arm/arm64_cn10k_linux_gcc
@@ -5,6 +5,7 @@ ar = 'aarch64-linux-gnu-gcc-ar'
 strip = 'aarch64-linux-gnu-strip'
 pkgconfig = 'aarch64-linux-gnu-pkg-config'
 pcap-config = ''
+cmake = 'cmake'
 
 [host_machine]
 system = 'linux'
diff --git a/config/arm/arm64_cn9k_linux_gcc b/config/arm/arm64_cn9k_linux_gcc
index 7416454de0..646ce4b5d3 100644
--- a/config/arm/arm64_cn9k_linux_gcc
+++ b/config/arm/arm64_cn9k_linux_gcc
@@ -5,6 +5,7 @@ ar = 'aarch64-linux-gnu-gcc-ar'
 strip = 'aarch64-linux-gnu-strip'
 pkgconfig = 'aarch64-linux-gnu-pkg-config'
 pcap-config = ''
+cmake = 'cmake'
 
 [host_machine]
 system = 'linux'
diff --git a/doc/guides/mldevs/cnxk.rst b/doc/guides/mldevs/cnxk.rst
index 1834b1f905..a4d8903896 100644
--- a/doc/guides/mldevs/cnxk.rst
+++ b/doc/guides/mldevs/cnxk.rst
@@ -46,6 +46,175 @@ or cross-compiled on an x86 platform.
 
 Refer to :doc:`../platform/cnxk` for instructions to build your DPDK application.
 
+Compilation Prerequisites
+-------------------------
+
+This driver optionally requires external libraries to enable support for
+models compiled using the Apache TVM framework. The following dependencies
+are not part of DPDK and must be installed separately:
+
+- **Jansson**
+
+  This library enables support to parse and read JSON files.
+
+- **DLPack**
+
+  This library provides headers for open in-memory tensor structures.
+
+.. note::
+
+DPDK CNXK ML driver requires DLPack version 0.7
+
+.. code-block:: console
+
+git clone https://github.com/dmlc/dlpack.git
+cd dlpack
+git checkout v0.7 -b v0.7
+cmake -S ./ -B build \
+  -DCMAKE_INSTALL_PREFIX= \
+  -DBUILD_MOCK=OFF
+make -C build
+make -C build install
+
+*Cross-compiling for AArch64*
+
+.. code-block:: console
+
+git clone https://github.com/dmlc/dlpack.git
+cd dlpack
+git checkout v0.7 -b v0.7
+cmake -S ./ -B build \
+  -DCMAKE_INSTALL_PREFIX= \
+  -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
+  -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
+  -DBUILD_MOCK=OFF
+make -C build
+make -C build install
+
+- **DMLC**
+
+  This is a common bricks library for building scalable and portable distributed
+  machine learning.
+
+.. code-block:: console
+
+git clone https://github.com/dmlc/dmlc-core.git
+cd dmlc-core
+git checkout main
+cmake -S ./ -B build \
+  -DCMAKE_INSTALL_PREFIX= \
+  -DCMAKE_C_FLAGS="-fpermissive" \
+  -DCMAKE_CXX_FLAGS="-fpermissive" \
+  -DUSE_OPENMP=OFF
+make -C build
+make -C build install
+
+*Cross-compiling for AArch64*
+
+.. code-block:: console
+
+git clone https://github.com/dmlc/dmlc-core.git
+cd dmlc-core
+git checkout main
+cmake -S ./ -B build \
+  -DCMAKE_INSTALL_PREFIX= \
+  -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
+  -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
+  -DCMAKE_C_FLAGS="-fpermissive" \
+  -DCMAKE_CXX_FLAGS="-fpermissive" \
+  -DUSE_OPENMP=OFF
+make -C build
+make -C build install
+
+- **TVM**
+
+  Apache TVM provides runtime libraries used to execute models on CPU cores
+  or hardware accelerators.
+
+.. note::
+
+DPDK CNXK ML driver requires TVM version 0.11.0
+
+.. code-block:: console
+
+git clone https://github.com/apache/tvm.git
+cd tvm
+git checkout v0.11.0 -b v0.11.0
+git submodule update --init
+cmake -S ./ -B build \
+  -DCMAKE_INSTALL_PREFIX= \
+  -DBUILD_STATIC_RUNTIME=OFF
+make -C build
+make -C build install
+
+*Cross-compiling for AArch64*
+
+.. code-block:: console
+
+git clone https://github.com/apache/tvm.git
+cd tvm
+git checkout v0.11.0 -b v0.11.0
+git submodule update --init
+cmake -S ./ -B build \
+  -DCMAKE_INSTALL_PREFIX= \
+  -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc \
+  -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++ \
+  -DMACHINE_NAME=aarch64-linux-

[PATCH v2] common/qat: limit configuration to the primary process

2023-10-22 Thread Arkadiusz Kusztal
This change prevents certain configuration functions from being
called by the secondary process.
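
The mechanism involved is the standard EAL process-type check; a minimal
sketch of the guard follows (its exact placement inside the QAT
configuration path is an assumption, while the EAL calls are standard DPDK):

    #include <rte_eal.h>

    static int
    example_primary_only_config(void)
    {
            /* Configuration is owned by the primary process; secondary
             * processes skip it and attach to already-configured state.
             */
            if (rte_eal_process_type() != RTE_PROC_PRIMARY)
                    return 0;

            /* One-time device configuration would go here. */
            return 0;
    }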

Signed-off-by: Arkadiusz Kusztal 
---
v2:
- fixed incorrect function call
- rephrased comments

 drivers/common/qat/qat_device.c | 115 +++-
 drivers/common/qat/qat_device.h |   2 +
 2 files changed, 67 insertions(+), 50 deletions(-)

diff --git a/drivers/common/qat/qat_device.c b/drivers/common/qat/qat_device.c
index cbf1e6a988..b7bd2ade4a 100644
--- a/drivers/common/qat/qat_device.c
+++ b/drivers/common/qat/qat_device.c
@@ -13,6 +13,15 @@
 #include "adf_pf2vf_msg.h"
 #include "qat_pf2vf.h"
 
+#define NOT_NULL(arg, func, msg, ...)  \
+   do {\
+   if (arg == NULL) {  \
+   QAT_LOG(ERR,\
+   msg, ##__VA_ARGS__);\
+   func;   \
+   }   \
+   } while (0)
+
 /* Hardware device information per generation */
 struct qat_gen_hw_data qat_gen_config[QAT_N_GENS];
 struct qat_dev_hw_spec_funcs *qat_dev_hw_spec[QAT_N_GENS];
@@ -173,6 +182,29 @@ qat_dev_parse_cmd(const char *str, struct qat_dev_cmd_param
}
 }
 
+static enum qat_device_gen
+pick_gen(const struct rte_pci_device *pci_dev)
+{
+   switch (pci_dev->id.device_id) {
+   case 0x0443:
+   return QAT_GEN1;
+   case 0x37c9:
+   case 0x19e3:
+   case 0x6f55:
+   case 0x18ef:
+   return QAT_GEN2;
+   case 0x18a1:
+   return QAT_GEN3;
+   case 0x4941:
+   case 0x4943:
+   case 0x4945:
+   return QAT_GEN4;
+   default:
+   QAT_LOG(ERR, "Invalid dev_id, can't determine generation");
+   return QAT_N_GENS;
+   }
+}
+
 struct qat_pci_device *
 qat_pci_device_allocate(struct rte_pci_device *pci_dev,
struct qat_dev_cmd_param *qat_dev_cmd_param)
@@ -190,25 +222,8 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev,
rte_pci_device_name(&pci_dev->addr, name, sizeof(name));
snprintf(name+strlen(name), QAT_DEV_NAME_MAX_LEN-strlen(name), "_qat");
 
-   switch (pci_dev->id.device_id) {
-   case 0x0443:
-   qat_dev_gen = QAT_GEN1;
-   break;
-   case 0x37c9:
-   case 0x19e3:
-   case 0x6f55:
-   case 0x18ef:
-   qat_dev_gen = QAT_GEN2;
-   break;
-   case 0x18a1:
-   qat_dev_gen = QAT_GEN3;
-   break;
-   case 0x4941:
-   case 0x4943:
-   case 0x4945:
-   qat_dev_gen = QAT_GEN4;
-   break;
-   default:
+   qat_dev_gen = pick_gen(pci_dev);
+   if (qat_dev_gen == QAT_N_GENS) {
QAT_LOG(ERR, "Invalid dev_id, can't determine generation");
return NULL;
}
@@ -265,20 +280,15 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev,
qat_dev->dev_private = qat_dev + 1;
strlcpy(qat_dev->name, name, QAT_DEV_NAME_MAX_LEN);
qat_dev->qat_dev_id = qat_dev_id;
-   qat_pci_devs[qat_dev_id].pci_dev = pci_dev;
qat_dev->qat_dev_gen = qat_dev_gen;
 
ops_hw = qat_dev_hw_spec[qat_dev->qat_dev_gen];
-   if (ops_hw->qat_dev_get_misc_bar == NULL) {
-   QAT_LOG(ERR, "qat_dev_get_misc_bar function pointer not set");
-   rte_memzone_free(qat_dev_mz);
-   return NULL;
-   }
+   NOT_NULL(ops_hw->qat_dev_get_misc_bar, goto error,
+   "QAT internal error! qat_dev_get_misc_bar function not set");
if (ops_hw->qat_dev_get_misc_bar(&mem_resource, pci_dev) == 0) {
if (mem_resource->addr == NULL) {
QAT_LOG(ERR, "QAT cannot get access to VF misc bar");
-   rte_memzone_free(qat_dev_mz);
-   return NULL;
+   goto error;
}
qat_dev->misc_bar_io_addr = mem_resource->addr;
} else
@@ -291,22 +301,45 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev,
QAT_LOG(ERR,
"Cannot acquire ring configuration for QAT_%d",
qat_dev_id);
-   rte_memzone_free(qat_dev_mz);
-   return NULL;
+   goto error;
+   }
+   NOT_NULL(ops_hw->qat_dev_reset_ring_pairs, goto error,
+   "QAT internal error! Reset ring pairs function not set, gen : 
%d",
+   qat_dev_gen);
+   if (ops_hw->qat_dev_reset_ring_pairs(qat_dev)) {
+   QAT_LOG(ERR,
+   "Cannot reset ring pairs, does pf driver supports pf2vf 
comms?"
+   );
+   goto error;
}
+   NOT_NULL(ops_hw->qat_dev_get_slice_map, goto error,
+   "QAT internal error! Reset ring pairs function not set, gen : 
%d",
+   qat_dev_gen);
+ 

[PATCH v2] maintainers: update email address

2023-10-22 Thread Chenbo Xia
I left Intel and joined Nvidia, so update my email address.

Signed-off-by: Chenbo Xia 
Acked-by: Maxime Coquelin 
---
 .mailmap|  2 +-
 MAINTAINERS | 12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/.mailmap b/.mailmap
index 3f5bab26a8..d40b3ad6c0 100644
--- a/.mailmap
+++ b/.mailmap
@@ -213,7 +213,7 @@ Charles Brett 
 Charles Myers 
 Charles Stoll 
 Chas Williams <3ch...@gmail.com>  
-Chenbo Xia 
+Chenbo Xia  
 Chengchang Tang  
 Chengfeng Ye 
 Chenghu Yao 
diff --git a/MAINTAINERS b/MAINTAINERS
index 4083658697..b1c9495a00 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -50,7 +50,7 @@ T: git://dpdk.org/next/dpdk-next-net-mlx
 
 Next-virtio Tree
 M: Maxime Coquelin 
-M: Chenbo Xia 
+M: Chenbo Xia 
 T: git://dpdk.org/next/dpdk-next-virtio
 
 Next-crypto Tree
@@ -594,7 +594,7 @@ F: drivers/bus/dpaa/
 F: drivers/bus/fslmc/
 
 PCI bus driver
-M: Chenbo Xia 
+M: Chenbo Xia 
 M: Nipun Gupta 
 F: drivers/bus/pci/
 
@@ -983,7 +983,7 @@ F: doc/guides/nics/features/vmxnet3.ini
 
 Vhost-user
 M: Maxime Coquelin 
-M: Chenbo Xia 
+M: Chenbo Xia 
 T: git://dpdk.org/next/dpdk-next-virtio
 F: lib/vhost/
 F: doc/guides/prog_guide/vhost_lib.rst
@@ -997,7 +997,7 @@ F: doc/guides/sample_app_ug/vdpa.rst
 
 Vhost PMD
 M: Maxime Coquelin 
-M: Chenbo Xia 
+M: Chenbo Xia 
 T: git://dpdk.org/next/dpdk-next-virtio
 F: drivers/net/vhost/
 F: doc/guides/nics/vhost.rst
@@ -1005,7 +1005,7 @@ F: doc/guides/nics/features/vhost.ini
 
 Virtio PMD
 M: Maxime Coquelin 
-M: Chenbo Xia 
+M: Chenbo Xia 
 T: git://dpdk.org/next/dpdk-next-virtio
 F: drivers/net/virtio/
 F: doc/guides/nics/virtio.rst
@@ -1661,7 +1661,7 @@ F: app/test/test_rcu*
 F: doc/guides/prog_guide/rcu_lib.rst
 
 PCI
-M: Chenbo Xia 
+M: Chenbo Xia 
 M: Gaetan Rivet 
 F: lib/pci/
 
-- 
2.39.3 (Apple Git-145)



Re: [RFC PATCH v4 1/4] dts: code adjustments for sphinx

2023-10-22 Thread Juraj Linkeš


>
> My only nitpick comment would be on the name of the file common.py that
> only contain the MesonArgs class. Looks good otherwise

Could you elaborate a bit more, Yoan? The common.py module is supposed
to be extended with code common to all other modules in the
testbed_model package. Right now we only have MesonArgs which fits in
common.py, but we could also move something else into common.py. We
also could rename common.py to something else, but then the above
purpose would not be clear.

I'm finishing the docstrings soon so expect a new version where things
like these will be clearer. :-)