Re: [PATCH 1/2] net/virtio: propagate return value of called function

2023-03-30 Thread Andrew Rybchenko

On 3/28/23 06:14, Xia, Chenbo wrote:

-----Original Message-----
From: Boleslav Stankevich 
Sent: Wednesday, March 22, 2023 6:23 PM
To: dev@dpdk.org
Cc: Boleslav Stankevich ;
sta...@dpdk.org; Andrew Rybchenko ; Maxime
Coquelin ; Xia, Chenbo ;
David Marchand ; Hyong Youb Kim
; Harman Kalra 
Subject: [PATCH 1/2] net/virtio: propagate return value of called function

rte_intr_vec_list_alloc() may fail because of different reasons which
are indicated by different negative errno values.

Fixes: d61138d4f0e2 ("drivers: remove direct access to interrupt handle")
Cc: sta...@dpdk.org

Signed-off-by: Boleslav Stankevich 
Signed-off-by: Andrew Rybchenko 


I see Boleslav's email is updated in the mailmap file, but patchwork is still
complaining about it.

@Andrew & Maxime,

Do you know why?


My guess is that next-virtio had not been updated yet at that
moment. I don't know how to check it. Maybe just resend?

Andrew.
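
For readers skimming the archive, the change described in the commit message can be sketched as below. This is a minimal illustration, not the virtio patch itself: vec_list_alloc_stub() is a hypothetical stand-in for an allocator such as rte_intr_vec_list_alloc(), and the assumption that the old caller returned one fixed error code is made only for the sake of the example.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an allocator such as rte_intr_vec_list_alloc():
 * returns 0 on success or a negative errno value naming the failure.
 */
int
vec_list_alloc_stub(size_t nb_vectors)
{
	if (nb_vectors == 0)
		return -EINVAL;	/* nonsensical request */
	if (nb_vectors > 64)
		return -ENOMEM;	/* pretend the allocation failed */
	return 0;
}

/* Before (assumed): every failure is collapsed into one fixed code. */
int
setup_intr_collapsed(size_t nb_vectors)
{
	if (vec_list_alloc_stub(nb_vectors) != 0)
		return -ENOMEM;
	return 0;
}

/* After: the callee's return value reaches the caller unchanged. */
int
setup_intr_propagated(size_t nb_vectors)
{
	int ret = vec_list_alloc_stub(nb_vectors);

	if (ret < 0)
		return ret;
	return 0;
}
```

With the propagated variant a caller can tell an invalid request (-EINVAL) from an out-of-memory condition (-ENOMEM), which the collapsed variant hides.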



Re: [PATCH] common/sfc_efx/base: support link status change v2 events

2023-03-30 Thread Andrew Rybchenko

On 3/28/23 19:51, Ivan Malov wrote:

FW should send link status change events in either v1 or
v2 format depending on the preference which the driver
can express during CMD_DRV_ATTACH stage. At the moment,
libefx does not request v2, so v1 events must arrive.
However, FW does not honour this choice and always
sends v2 events. So teach libefx to parse such and
add v2 request to CMD_DRV_ATTACH, correspondingly.

Signed-off-by: Ivan Malov 
Reviewed-by: Andy Moreton 


Acked-by: Andrew Rybchenko 




RE: [PATCH v5 1/3] ethdev: add API for buffer recycle mode

2023-03-30 Thread Morten Brørup
> From: Feifei Wang [mailto:feifei.wa...@arm.com]
> Sent: Thursday, 30 March 2023 08.30
> 

[...]

> +/**
> + * @internal
> + * Rx routine for rte_eth_dev_buf_recycle().
> + * Refill Rx descriptors in buffer recycle mode.
> + *
> + * @note
> + * This API can only be called by rte_eth_dev_buf_recycle().
> + * Before calling this API, rte_eth_tx_buf_stash() should be
> + * called to stash Tx used buffers into Rx buffer ring.
> + *
> + * When this functionality is not implemented in the driver, the return
> + * buffer number is 0.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The index of the receive queue.
> + *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + *@param nb
> + *   The number of Rx descriptors to be refilled.
> + * @return
> + *   The number Rx descriptors correct to be refilled.
> + *   - ENODEV: bad port or queue (only if compiled with debug).

If you want errors reported by the return value, the function return type 
cannot be uint16_t.

> + */
> +static inline uint16_t rte_eth_rx_descriptors_refill(uint16_t port_id,
> + uint16_t queue_id, uint16_t nb)
> +{
> + struct rte_eth_fp_ops *p;
> + void *qd;
> +
> +#ifdef RTE_ETHDEV_DEBUG_RX
> + if (port_id >= RTE_MAX_ETHPORTS ||
> + queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> + RTE_ETHDEV_LOG(ERR,
> + "Invalid port_id=%u or queue_id=%u\n",
> + port_id, queue_id);
> + rte_errno = ENODEV;
> + return 0;

If p->rx_descriptors_refill() is likely to return 0, this function should not 
use 0 as return value to indicate errors.

> + }
> +#endif
> +
> + p = &rte_eth_fp_ops[port_id];
> + qd = p->rxq.data[queue_id];
> +
> +#ifdef RTE_ETHDEV_DEBUG_RX
> + if (!rte_eth_dev_is_valid_port(port_id)) {
> + RTE_ETHDEV_LOG(ERR, "Invalid Rx port_id=%u\n", port_id);
> + rte_errno = ENODEV;
> + return 0;
> +
> + if (qd == NULL) {
> + RTE_ETHDEV_LOG(ERR, "Invalid Rx queue_id=%u for port_id=%u\n",
> + queue_id, port_id);
> + rte_errno = ENODEV;
> + return 0;
> + }
> +#endif
> +
> + if (p->rx_descriptors_refill == NULL)
> + return 0;
> +
> + return p->rx_descriptors_refill(qd, nb);
> +}
> +
>  /**@{@name Rx hardware descriptor states
>   * @see rte_eth_rx_descriptor_status
>   */
> @@ -6483,6 +6597,122 @@ rte_eth_tx_buffer(uint16_t port_id, uint16_t queue_id,
>   return rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
>  }
> 
> +/**
> + * @internal
> + * Tx routine for rte_eth_dev_buf_recycle().
> + * Stash Tx used buffers into Rx buffer ring in buffer recycle mode.
> + *
> + * @note
> + * This API can only be called by rte_eth_dev_buf_recycle().
> + * After calling this API, rte_eth_rx_descriptors_refill() should be
> + * called to refill Rx ring descriptors.
> + *
> + * When this functionality is not implemented in the driver, the return
> + * buffer number is 0.
> + *
> + * @param port_id
> + *   The port identifier of the Ethernet device.
> + * @param queue_id
> + *   The index of the transmit queue.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @param rxq_buf_recycle_info
> + *   A pointer to a structure of Rx queue buffer ring information in buffer
> + *   recycle mode.
> + *
> + * @return
> + *   The number buffers correct to be filled in the Rx buffer ring.
> + *   - ENODEV: bad port or queue (only if compiled with debug).

If you want errors reported by the return value, the function return type 
cannot be uint16_t.

> + */
> +static inline uint16_t rte_eth_tx_buf_stash(uint16_t port_id, uint16_t
> queue_id,
> + struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info)
> +{
> + struct rte_eth_fp_ops *p;
> + void *qd;
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> + if (port_id >= RTE_MAX_ETHPORTS ||
> + queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> + RTE_ETHDEV_LOG(ERR,
> + "Invalid port_id=%u or queue_id=%u\n",
> + port_id, queue_id);
> + rte_errno = ENODEV;
> + return 0;

If p->tx_buf_stash() is likely to return 0, this function should not use 0 as 
return value to indicate errors.

> + }
> +#endif
> +
> + p = &rte_eth_fp_ops[port_id];
> + qd = p->txq.data[queue_id];
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> + if (!rte_eth_dev_is_valid_port(port_id)) {
> + RTE_ETHDEV_LOG(ERR, "Invalid Tx port_id=%u\n", port_id);
> + rte_errno = ENODEV;
> + return 0;
> +
> + if (qd == NULL) {
> + RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u for port_id=%u\n",
> + queue_id, port_id);
> + rte_errno = ENODEV;
> + return 0;

RE: [PATCH 2/2] net/gve: update copyright holders

2023-03-30 Thread Guo, Junfeng



> -----Original Message-----
> From: Thomas Monjalon 
> Sent: Wednesday, March 29, 2023 22:07
> To: Ferruh Yigit ; Zhang, Qi Z
> ; Wu, Jingjing ; Xing,
> Beilei ; Guo, Junfeng 
> Cc: dev@dpdk.org; Rushil Gupta ; Joshua
> Washington ; Jeroen de Borst
> 
> Subject: Re: [PATCH 2/2] net/gve: update copyright holders
> 
> 28/03/2023 11:35, Guo, Junfeng:
> > The background is that, in the past (DPDK 22.11) we didn't get the
> approval
> > of license from Google, thus chose the MIT License for the base code,
> and
> > BSD-3 License for GVE common code (without the files in /base folder).
> > We also left the copyright holder of base code just to Google Inc, and
> made
> > Intel as the copyright holder of GVE common code (without /base
> folder).
> >
> > Today we are working together for GVE dev and maintaining. And we
> got
> > the approval of BSD-3 License from Google for the base code.
> > Thus we decided to 1) switch the License of GVE base code from MIT to
> BSD-3;
> > 2) add Google LLC as one of the copyright holders for GVE common
> code.
> 
> Do you realize we had lengthy discussions in the Technical Board,
> the Governing Board, and with lawyers, just for this unneeded exception?
> 
> Now looking at the patches, there seem to be some big mistakes like
> removing some copyright. I don't understand how it can be taken so
> lightly.
> 
> I regret how fast we were, next time we will surely operate differently.
> If you want to improve the reputation of this driver,
> please ask other copyright holders to be more active and responsive.
> 

Really sorry for causing such severe trouble.

Yes, a lot of effort was spent in the Technical Board and the Governing
Board on this MIT exception. We really appreciate that.

About this patch set, it was a serious mistake on my part to switch the
MIT License directly for the code already upstreamed in the community;
that was the wrong way to do it.
In the past we upstreamed this driver under the MIT License, following
the kernel community's gve driver base code. And now we want to
use the code under the BSD-3 License (approved by Google).
So I suppose that the correct way may be to 1) first remove all the code
under the MIT License and 2) then add the new files under the BSD-3 License.

Please correct me if there is still any misunderstanding in my statement.
Thanks, Thomas, for pointing out my mistake. I'll be careful to fix this.

The copyright holder for the gve base code will stay unchanged. Google LLC
will be added as one of the copyright holders for the gve common code.
@Rushil Gupta Please also be more active and responsive in code
review and contribution in the community. Thanks!

> 



Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Bruce Richardson
On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> Hi,
> 
> While trying to port some code to VPP (which uses DPDK as the backend
> driver), I am running into a problem that calls to API's like
> rte_timer_subsystem_init, rte_hash_create are failing while allocation
> of memory.
> 
> This is presumably because VPP inits the EAL with the following arguments --
> 
> -in-memory --no-telemetry --file-prefix vpp
> 
> Is there something that can be done, e.g. passing some more params in
> the EAL initialization, which hopefully wouldn't break VPP but would
> also be friendly to the RTE timer and hash functions? That would
> be great, so I am requesting some advice here.
> 
Hi,

can you provide some more details on what the errors are that you are
receiving? Have you been able to dig a little deeper into what might be
causing the memory failures? The above flags alone are unlikely to cause
issues with hash or timer libraries, for example.

/Bruce


RE: release candidate 23.03-rc4

2023-03-30 Thread Xu, HailinX
> -----Original Message-----
> From: Thomas Monjalon 
> Sent: Wednesday, March 29, 2023 3:34 AM
> To: annou...@dpdk.org
> Subject: release candidate 23.03-rc4
> 
> A new DPDK release candidate is ready for testing:
>   https://git.dpdk.org/dpdk/tag/?id=v23.03-rc4
> 
> There are 42 new patches in this snapshot.
> 
> Release notes:
>   https://doc.dpdk.org/guides/rel_notes/release_23_03.html
> 
> This is the last release candidate.
> Only documentation should be updated before the release.
> 
> Reviews of deprecation notices are required:
> https://patches.dpdk.org/bundle/dmarchand/deprecation_notices
> 
> You may share some release validation results by replying to this message at
> dev@dpdk.org and by adding tested hardware in the release notes.
> 
> Please think about sharing your roadmap now for DPDK 23.07.
> 
> Thank you everyone
> 
Update on the test status for the Intel part. So far the dpdk23.03-rc4 test
execution rate is 90%, and no new issue has been found.

# Basic Intel(R) NIC testing
* Build or compile:
 * Build: cover the build test combination with the latest GCC/Clang versions
   and popular OS revisions such as Ubuntu20.04.5, Ubuntu22.04.1, Fedora37,
   RHEL8.6/9.1, etc.
  - All tests passed.
 * Compile: cover the CFLAGS (O0/O1/O2/O3) with popular OSes such as
   Ubuntu22.04.1 and RHEL8.6.
  - All tests passed.
* Meson test & Asan test:
  known issues:
  - https://bugs.dpdk.org/show_bug.cgi?id=1024 [dpdk-22.07][meson test]
    driver-tests/link_bonding_mode4_autotest bond handshake failed
    - Not fixed yet.
  - https://bugs.dpdk.org/show_bug.cgi?id=1107 [22.11-rc1][meson test]
    seqlock_autotest test failed.
    - Special issue with gcc 4.8.5.
* PF/VF(i40e, ixgbe): test scenarios including
  PF/VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
  - All tests done. No new issue is found.
* PF/VF(ice): test scenarios including Switch features/Package Management/Flow
  Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc.
  - Execution rate is 95%. No new issue is found.
* Intel NIC single core/NIC performance: test scenarios including PF/VF single
  core performance test, RFC2544 Zero packet loss performance test, etc.
  - All tests done. No new issue is found.
* Power and IPsec:
 * Power: test scenarios including bi-direction/Telemetry/Empty Poll
   Lib/Priority Base Frequency, etc.
  - All tests done. No new issue is found.
 * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test -
   QAT&SW/FIB library, etc.
  - Ongoing.

# Basic cryptodev and virtio testing
* Virtio: both function and performance tests are covered, such as
  PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing/VMware
  ESXi 8.0, etc.
  - All tests done. No new issue is found.
* Cryptodev:
 * Function test: test scenarios including Cryptodev API testing/CompressDev
   ISA-L/QAT/ZLIB PMD Testing/FIPS, etc.
  - Ongoing.
 * Performance test: test scenarios including Throughput Performance/Cryptodev
   Latency, etc.
  - Ongoing.

Regards,
Xu, Hailin


Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Prashant Upadhyaya
Hi,

The hash creation API throws the following error --
RING: Cannot reserve memory for tailq
HASH: memory allocation failed

The timer subsystem init api throws this error --
EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
memzone segments exceeds RTE_MAX_MEMZONE

I did check the code, and apparently the memzone and rte_zmalloc
related APIs are not able to allocate memory.

Regards
-Prashant

On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
 wrote:
>
> On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> > Hi,
> >
> > While trying to port some code to VPP (which uses DPDK as the backend
> > driver), I am running into a problem that calls to API's like
> > rte_timer_subsystem_init, rte_hash_create are failing while allocation
> > of memory.
> >
> > This is presumably because VPP inits the EAL with the following arguments --
> >
> > -in-memory --no-telemetry --file-prefix vpp
> >
> > Is there something that can be done, e.g. passing some more params in
> > the EAL initialization, which hopefully wouldn't break VPP but would
> > also be friendly to the RTE timer and hash functions? That would
> > be great, so I am requesting some advice here.
> >
> Hi,
>
> can you provide some more details on what the errors are that you are
> receiving? Have you been able to dig a little deeper into what might be
> causing the memory failures? The above flags alone are unlikely to cause
> issues with hash or timer libraries, for example.
>
> /Bruce


[Bug 1203] ice: cannot create 2 rte_flows with 2 actions, only with 1 action

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1203

Bug ID: 1203
   Summary: ice: cannot create 2 rte_flows with 2 actions, only
with 1 action
   Product: DPDK
   Version: 23.03
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: maxime.le...@6wind.com
  Target Milestone: ---

kernel driver: 1.10.1.2.2
firmware-version: 4.10 0x80015191 1.3310.0
COMMS DDP: 1.3.37
ICE OS Default Package version 1.3.30.0
testpmd cmdline: ./build/app/dpdk-testpmd --log-level=.*ice.*,debug
--legacy-mem -c 7 -a 17:00.0 -a :17:00.1  --  -i --nb-cores=2 --nb-ports=2
--total-num-mbufs=2048


RTE_FLOWS rules can be created
-----------------------------


With queue action:

testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
queue index 0 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end 
actions queue index 0 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #1 created

With mark action:

testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
mark id 1 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end 
actions mark id 1 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #1 created

RTE_FLOWS rules cannot be created
--------------------------------

with mark + queue action:

testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
mark id 1 / queue index 0 / end
ice_fdir_rx_parsing_enable(): FDIR processing on RX set to 1
ice_flow_create(): Succeeded to create (1) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end actions
 mark id 1 / queue index 0 / end
ice_fdir_cross_prof_conflict(): Failed to create profile for flow type 1 due to
conflict with existing rule of flow type 4.
ice_flow_create(): Failed to create flow
port_flow_complain(): Caught PMD error type 2 (flow rule (handle)): Profile
configure failed.: Invalid argument

with mark + passthru action:

testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
mark id 1 / passthru / end
ice_fdir_rx_parsing_enable(): FDIR processing on RX set to 1
ice_flow_create(): Succeeded to create (1) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end actions
 mark id 1 / passthru / end
ice_fdir_cross_prof_conflict(): Failed to create profile for flow type 1 due to
conflict with existing rule of flow type 4.
ice_flow_create(): Failed to create flow
port_flow_complain(): Caught PMD error type 2 (flow rule (handle)): Profile
configure failed.: Invalid argument

Question
--------

1. Do ice NICs support having several actions?
2. What is the difference between MARK and MARK+PASSTHRU?
   It seems to be the same:
   http://git.dpdk.org/dpdk/commit/?id=0f664f7d57268f9ab9bdef95f0d48b3ce5004a61
   In this case, there is no reason to support MARK and not MARK+PASSTHRU with
   2 flows.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 1204] ice: cannot create 2 rte_flows with MARK actions with dpdk 22.11.1, but can with dpdk 23.03.0-rc4

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1204

Bug ID: 1204
   Summary: ice: cannot create 2 rte_flows with MARK actions with
dpdk 22.11.1, but can with dpdk 23.03.0-rc4
   Product: DPDK
   Version: 22.11
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: maxime.le...@6wind.com
  Target Milestone: ---

kernel driver: 1.10.1.2.2
firmware-version: 4.10 0x80015191 1.3310.0
COMMS DDP: 1.3.37
ICE OS Default Package version 1.3.30.0
testpmd cmdline: ./build/app/dpdk-testpmd --log-level=.*ice.*,debug
--legacy-mem -c 7 -a :17:00.0 -a :17:00.1  --  -i --nb-cores=2
--nb-ports=2 --total-num-mbufs=2048


With dpdk 23.03.0-rc4
---------------------

testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
mark id 1 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end 
actions mark id 1 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #1 created

With dpdk 22.11.1
-----------------

testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
mark id 1 / end
ice_fdir_rx_parsing_enable(): FDIR processing on RX set to 1
ice_flow_create(): Succeeded to create (1) flow
Flow rule #0 created
testpmd>  flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end 
actions mark id 1 / end
ice_fdir_cross_prof_conflict(): Failed to create profile for flow type 1 due to
conflict with existing rule of flow type 4.
ice_flow_create(): Failed to create flow
port_flow_complain(): Caught PMD error type 2 (flow rule (handle)): Profile
configure failed.: Invalid argument

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: [PATCH v1] raw/ifpga: check afu device before unplug

2023-03-30 Thread Zhang, Tianfei



> -----Original Message-----
> From: Huang, Wei 
> Sent: Monday, March 27, 2023 5:42 AM
> To: dev@dpdk.org; tho...@monjalon.net; david.march...@redhat.com
> Cc: sta...@dpdk.org; Xu, Rosen ; Zhang, Tianfei
> ; Zhang, Qi Z ; Huang, Wei
> 
> Subject: [PATCH v1] raw/ifpga: check afu device before unplug
> 
> AFU device may be already unplugged in IFPGA bus cleanup process, unplug AFU
> device only when it exists.
> 
> Signed-off-by: Wei Huang 
> ---
>  drivers/raw/ifpga/ifpga_rawdev.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/raw/ifpga/ifpga_rawdev.c 
> b/drivers/raw/ifpga/ifpga_rawdev.c
> index 1020adc..0d43c87 100644
> --- a/drivers/raw/ifpga/ifpga_rawdev.c
> +++ b/drivers/raw/ifpga/ifpga_rawdev.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "base/opae_hw_api.h"
>  #include "base/opae_ifpga_hw_api.h"
> @@ -1832,12 +1833,19 @@ static int ifpga_rawdev_get_string_arg(const char *key
> __rte_unused,
>   return ret;
>  }
> 
> +static int cmp_dev_name(const struct rte_device *dev, const void
> +*_name) {
> + const char *name = _name;
> + return strcmp(dev->name, name);
> +}
> +
>  static int
>  ifpga_cfg_remove(struct rte_vdev_device *vdev)  {
>   struct rte_rawdev *rawdev = NULL;
>   struct ifpga_rawdev *ifpga_dev;
>   struct ifpga_vdev_args args;
> + struct rte_bus *bus;
>   char dev_name[RTE_RAWDEV_NAME_MAX_LEN];
>   const char *vdev_name = NULL;
>   char *tmp_vdev = NULL;
> @@ -1864,7 +1872,13 @@ static int ifpga_rawdev_get_string_arg(const char *key
> __rte_unused,
> 
>   snprintf(dev_name, RTE_RAWDEV_NAME_MAX_LEN, "%d|%s",
>   args.port, args.bdf);
> - ret = rte_eal_hotplug_remove(RTE_STR(IFPGA_BUS_NAME), dev_name);
> + bus = rte_bus_find_by_name(RTE_STR(IFPGA_BUS_NAME));
> + if (bus) {
> + if (bus->find_device(NULL, cmp_dev_name, dev_name)) {
> + ret =
> rte_eal_hotplug_remove(RTE_STR(IFPGA_BUS_NAME),
> + dev_name);
> + }
> + }
> 

It looks good to me.
Acked-by: Tianfei Zhang 
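
For reference, the "look the device up before unplugging it" pattern in the patch above can be sketched in isolation as follows. This is a toy model under stated assumptions: the toy_* types, the device table and the device names are hypothetical and are not the DPDK bus API; only the comparator contract (return 0 on a name match, as strcmp() does) is taken from the patch. Returning 0 when the device is already gone is also an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy device record; hypothetical, not a DPDK type. */
struct toy_device {
	const char *name;
	int plugged;
};

/* Comparator contract mirrors the patch: return 0 on a match. */
typedef int (*toy_dev_cmp_t)(const struct toy_device *dev, const void *data);

/* Hypothetical device table standing in for the bus device list. */
struct toy_device toy_bus[] = {
	{ "0|0000:17:00.0", 1 },
	{ "1|0000:17:00.1", 1 },
};

int
toy_cmp_dev_name(const struct toy_device *dev, const void *_name)
{
	const char *name = _name;

	return strcmp(dev->name, name);
}

/* Return the first plugged device the comparator matches, or NULL. */
struct toy_device *
toy_find_device(toy_dev_cmp_t cmp, const void *data)
{
	size_t i;

	for (i = 0; i < sizeof(toy_bus) / sizeof(toy_bus[0]); i++) {
		if (toy_bus[i].plugged && cmp(&toy_bus[i], data) == 0)
			return &toy_bus[i];
	}
	return NULL;
}

/* Unconditional removal: fails if the device is already gone. */
int
toy_hotplug_remove(const char *name)
{
	struct toy_device *dev = toy_find_device(toy_cmp_dev_name, name);

	if (dev == NULL)
		return -ENODEV;
	dev->plugged = 0;
	return 0;
}

/* Guarded removal: unplug only when the device still exists. */
int
toy_cfg_remove(const char *name)
{
	if (toy_find_device(toy_cmp_dev_name, name) == NULL)
		return 0;	/* already removed, e.g. by bus cleanup */
	return toy_hotplug_remove(name);
}
```

Calling toy_cfg_remove() twice on the same name succeeds both times, while a second direct toy_hotplug_remove() reports -ENODEV; that difference is the point of the check.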


RE: [PATCH] maintainers: update for FIPS validation

2023-03-30 Thread Dooley, Brian
Hi Gowrishankar,

> -----Original Message-----
> From: Gowrishankar Muthukrishnan 
> Sent: Wednesday 29 March 2023 12:01
> To: dev@dpdk.org
> Cc: jer...@marvell.com; ano...@marvell.com; Akhil Goyal
> ; Dooley, Brian ;
> Gowrishankar Muthukrishnan 
> Subject: [PATCH] maintainers: update for FIPS validation
> 
> Add co-maintainer for FIPS validation example.
> 
> Signed-off-by: Gowrishankar Muthukrishnan 
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 280058adfc..8df23e5099 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1809,6 +1809,7 @@ F: doc/guides/sample_app_ug/ethtool.rst
> 
>  FIPS validation example
>  M: Brian Dooley 
> +M: Gowrishankar Muthukrishnan 
>  F: examples/fips_validation/
>  F: doc/guides/sample_app_ug/fips_validation.rst
> 
> --
> 2.25.1

Acked-by: Brian Dooley 



Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Bruce Richardson
On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> Hi,
> 

FYI, when replying on list, it's best not to top-post, but put your replies
below the email snippet you are replying to.

> The hash creation API throws the following error --
> RING: Cannot reserve memory for tailq
> HASH: memory allocation failed
> 
> The timer subsystem init api throws this error --
> EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> memzone segments exceeds RTE_MAX_MEMZONE
> 

Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
file, so edit that and then rebuild DPDK. [If you are using the built-in
DPDK from VPP, you may need to do a patch for this, add it into the VPP
patches directory and then do a VPP rebuild.]
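
To illustrate why a compile-time cap can produce this error even when plenty of memory is free, here is a toy model. It assumes the limit bounds the number of zones rather than the number of bytes; TOY_MAX_MEMZONE, the table and the -ENOSPC return code are hypothetical choices for the sketch and not DPDK's actual allocator.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical cap; stands in for RTE_MAX_MEMZONE in rte_config.h. */
#define TOY_MAX_MEMZONE 4

/* Toy bookkeeping: only the number of reserved zones is tracked. */
struct toy_memzone_table {
	size_t count;
};

struct toy_memzone_table toy_table;

/*
 * Reserve one zone slot. Fails once the table is full, regardless of
 * how much memory is still available.
 */
int
toy_memzone_reserve(struct toy_memzone_table *t)
{
	if (t->count >= TOY_MAX_MEMZONE)
		return -ENOSPC;
	t->count++;
	return 0;
}
```

In this model, an application that already holds many zones leaves no slots for later users such as a timer or hash library, and raising the cap at build time is the only way to make room.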

Let's see if we can get rid of at least one of the error messages. :-)

/Bruce

> I did check the code and apparently the memzone and rte zmalloc
> related api's are not being able to allocate memory.
> 
> Regards
> -Prashant
> 
> On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
>  wrote:
> >
> > On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> > > Hi,
> > >
> > > While trying to port some code to VPP (which uses DPDK as the backend
> > > driver), I am running into a problem that calls to API's like
> > > rte_timer_subsystem_init, rte_hash_create are failing while allocation
> > > of memory.
> > >
> > > This is presumably because VPP inits the EAL with the following arguments 
> > > --
> > >
> > > -in-memory --no-telemetry --file-prefix vpp
> > >
> > > Is there something that can be done, e.g. passing some more params in
> > > the EAL initialization, which hopefully wouldn't break VPP but would
> > > also be friendly to the RTE timer and hash functions? That would
> > > be great, so I am requesting some advice here.
> > >
> > Hi,
> >
> > can you provide some more details on what the errors are that you are
> > receiving? Have you been able to dig a little deeper into what might be
> > causing the memory failures? The above flags alone are unlikely to cause
> > issues with hash or timer libraries, for example.
> >
> > /Bruce


RE: [PATCH v5 1/3] ethdev: add API for buffer recycle mode

2023-03-30 Thread Feifei Wang



> -----Original Message-----
> From: Morten Brørup 
> Sent: Thursday, March 30, 2023 3:19 PM
> To: Feifei Wang ; tho...@monjalon.net; Ferruh
> Yigit ; Andrew Rybchenko
> 
> Cc: dev@dpdk.org; konstantin.v.anan...@yandex.ru; nd ;
> Honnappa Nagarahalli ; Ruifeng Wang
> 
> Subject: RE: [PATCH v5 1/3] ethdev: add API for buffer recycle mode
> 
> > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > Sent: Thursday, 30 March 2023 08.30
> >
> 
> [...]
> 
> > +/**
> > + * @internal
> > + * Rx routine for rte_eth_dev_buf_recycle().
> > + * Refill Rx descriptors in buffer recycle mode.
> > + *
> > + * @note
> > + * This API can only be called by rte_eth_dev_buf_recycle().
> > + * Before calling this API, rte_eth_tx_buf_stash() should be
> > + * called to stash Tx used buffers into Rx buffer ring.
> > + *
> > + * When this functionality is not implemented in the driver, the
> > +return
> > + * buffer number is 0.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The index of the receive queue.
> > + *   The value must be in the range [0, nb_rx_queue - 1] previously
> supplied
> > + *   to rte_eth_dev_configure().
> > + *@param nb
> > + *   The number of Rx descriptors to be refilled.
> > + * @return
> > + *   The number Rx descriptors correct to be refilled.
> > + *   - ENODEV: bad port or queue (only if compiled with debug).
> 
> If you want errors reported by the return value, the function return type
> cannot be uint16_t.
Agree. Actually, in the code path, if an error happens, the function returns 0.
For this description line, I referred to the 'rte_eth_tx_prepare' notes. Maybe we
should delete this line.

> 
> > + */
> > +static inline uint16_t rte_eth_rx_descriptors_refill(uint16_t port_id,
> > +   uint16_t queue_id, uint16_t nb)
> > +{
> > +   struct rte_eth_fp_ops *p;
> > +   void *qd;
> > +
> > +#ifdef RTE_ETHDEV_DEBUG_RX
> > +   if (port_id >= RTE_MAX_ETHPORTS ||
> > +   queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> > +   RTE_ETHDEV_LOG(ERR,
> > +   "Invalid port_id=%u or queue_id=%u\n",
> > +   port_id, queue_id);
> > +   rte_errno = ENODEV;
> > +   return 0;
> 
> If p->rx_descriptors_refill() is likely to return 0, this function should not 
> use 0
> as return value to indicate errors.
However, looking at the DPDK code style in ethdev, most of the APIs are written
like this, for example 'rte_eth_rx/tx_burst' and 'rte_eth_tx_prepare'.

I'm also unsure what the return type should be, because I want
to both indicate errors and report the number of processed buffers.
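
The two options under discussion can be sketched side by side. This is only an illustration: refill_fn, refill_half() and dummy_queue are hypothetical, and the standard errno is used here as a stand-in for rte_errno.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver callback: descriptors refilled; may return 0. */
typedef uint16_t (*refill_fn)(void *queue, uint16_t nb);

/* Hypothetical queue object and callback, used only for illustration. */
int dummy_queue;

uint16_t
refill_half(void *queue, uint16_t nb)
{
	(void)queue;
	return nb / 2;
}

/* Option A: signed return; >= 0 is a count, < 0 is a negative errno. */
int
refill_signed(void *queue, refill_fn fn, uint16_t nb)
{
	if (queue == NULL)
		return -ENODEV;
	if (fn == NULL)
		return 0;	/* not implemented by the driver */
	return fn(queue, nb);
}

/*
 * Option B: keep uint16_t and report errors out of band. A return of 0
 * is ambiguous on its own, so the caller has to inspect errno.
 */
uint16_t
refill_unsigned(void *queue, refill_fn fn, uint16_t nb)
{
	errno = 0;
	if (queue == NULL) {
		errno = ENODEV;
		return 0;
	}
	if (fn == NULL)
		return 0;
	return fn(queue, nb);
}
```

Option A lets the caller separate "nothing refilled" from "bad queue" with a single comparison; option B keeps the existing burst-style signature at the cost of an extra errno check whenever 0 is returned.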

> 
> > +   }
> > +#endif
> > +
> > +   p = &rte_eth_fp_ops[port_id];
> > +   qd = p->rxq.data[queue_id];
> > +
> > +#ifdef RTE_ETHDEV_DEBUG_RX
> > +   if (!rte_eth_dev_is_valid_port(port_id)) {
> > +   RTE_ETHDEV_LOG(ERR, "Invalid Rx port_id=%u\n", port_id);
> > +   rte_errno = ENODEV;
> > +   return 0;
> > +
> > +   if (qd == NULL) {
> > +   RTE_ETHDEV_LOG(ERR, "Invalid Rx queue_id=%u for
> port_id=%u\n",
> > +   queue_id, port_id);
> > +   rte_errno = ENODEV;
> > +   return 0;
> > +   }
> > +#endif
> > +
> > +   if (p->rx_descriptors_refill == NULL)
> > +   return 0;
> > +
> > +   return p->rx_descriptors_refill(qd, nb); }
> > +
> >  /**@{@name Rx hardware descriptor states
> >   * @see rte_eth_rx_descriptor_status
> >   */
> > @@ -6483,6 +6597,122 @@ rte_eth_tx_buffer(uint16_t port_id, uint16_t
> queue_id,
> > return rte_eth_tx_buffer_flush(port_id, queue_id, buffer);  }
> >
> > +/**
> > + * @internal
> > + * Tx routine for rte_eth_dev_buf_recycle().
> > + * Stash Tx used buffers into Rx buffer ring in buffer recycle mode.
> > + *
> > + * @note
> > + * This API can only be called by rte_eth_dev_buf_recycle().
> > + * After calling this API, rte_eth_rx_descriptors_refill() should be
> > + * called to refill Rx ring descriptors.
> > + *
> > + * When this functionality is not implemented in the driver, the
> > +return
> > + * buffer number is 0.
> > + *
> > + * @param port_id
> > + *   The port identifier of the Ethernet device.
> > + * @param queue_id
> > + *   The index of the transmit queue.
> > + *   The value must be in the range [0, nb_tx_queue - 1] previously
> supplied
> > + *   to rte_eth_dev_configure().
> > + * @param rxq_buf_recycle_info
> > + *   A pointer to a structure of Rx queue buffer ring information in buffer
> > + *   recycle mode.
> > + *
> > + * @return
> > + *   The number buffers correct to be filled in the Rx buffer ring.
> > + *   - ENODEV: bad port or queue (only if compiled with debug).
> 
> If you want errors reported by the return value, the function return type
> cannot be uint16_t.
> 
> > + */
> > +static inline uint16_t rte_eth_tx_buf_stash(uint16_t port_id,
> > +uint16_t
> > queue_id,
> > +   struct rte_eth_rxq_buf_recycle_info *rxq_buf_recycle_info)
> {
> > +   struct rte_eth_fp_ops *p;
> > +   void *qd;
> > +
> > +#ifdef R

[PATCH] examples/ipsec-secgw: fix zero address in ethernet header

2023-03-30 Thread Rahul Bhansali
During port init, the src address stored in ethaddr_tbl is typecast,
which violates the strict-aliasing rule, and the updated source
address is not reflected in processed packets either.

Fixes: 6eb3ba0399 ("examples/ipsec-secgw: support poll mode NEON LPM lookup")

Signed-off-by: Rahul Bhansali 
---
 examples/ipsec-secgw/ipsec-secgw.c | 20 ++--
 examples/ipsec-secgw/ipsec-secgw.h |  2 +-
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec-secgw.c 
b/examples/ipsec-secgw/ipsec-secgw.c
index d2d9d85b4a..029749e522 100644
--- a/examples/ipsec-secgw/ipsec-secgw.c
+++ b/examples/ipsec-secgw/ipsec-secgw.c
@@ -99,10 +99,10 @@ uint32_t qp_desc_nb = 2048;
 #define MTU_TO_FRAMELEN(x) ((x) + RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN)
 
 struct ethaddr_info ethaddr_tbl[RTE_MAX_ETHPORTS] = {
-   { 0, ETHADDR(0x00, 0x16, 0x3e, 0x7e, 0x94, 0x9a) },
-   { 0, ETHADDR(0x00, 0x16, 0x3e, 0x22, 0xa1, 0xd9) },
-   { 0, ETHADDR(0x00, 0x16, 0x3e, 0x08, 0x69, 0x26) },
-   { 0, ETHADDR(0x00, 0x16, 0x3e, 0x49, 0x9e, 0xdd) }
+   { {{0}}, {{0x00, 0x16, 0x3e, 0x7e, 0x94, 0x9a}} },
+   { {{0}}, {{0x00, 0x16, 0x3e, 0x22, 0xa1, 0xd9}} },
+   { {{0}}, {{0x00, 0x16, 0x3e, 0x08, 0x69, 0x26}} },
+   { {{0}}, {{0x00, 0x16, 0x3e, 0x49, 0x9e, 0xdd}} }
 };
 
 struct offloads tx_offloads;
@@ -1427,9 +1427,8 @@ add_dst_ethaddr(uint16_t port, const struct 
rte_ether_addr *addr)
if (port >= RTE_DIM(ethaddr_tbl))
return -EINVAL;
 
-   ethaddr_tbl[port].dst = ETHADDR_TO_UINT64(addr);
-   rte_ether_addr_copy((struct rte_ether_addr *)&ethaddr_tbl[port].dst,
-   (struct rte_ether_addr *)(val_eth + port));
+   rte_ether_addr_copy(addr, &ethaddr_tbl[port].dst);
+   rte_ether_addr_copy(addr, (struct rte_ether_addr *)(val_eth + port));
return 0;
 }
 
@@ -1907,11 +1906,12 @@ port_init(uint16_t portid, uint64_t req_rx_offloads, 
uint64_t req_tx_offloads,
"Error getting MAC address (port %u): %s\n",
portid, rte_strerror(-ret));
 
-   ethaddr_tbl[portid].src = ETHADDR_TO_UINT64(&ethaddr);
+   rte_ether_addr_copy(&ethaddr, &ethaddr_tbl[portid].src);
 
-   rte_ether_addr_copy((struct rte_ether_addr *)ðaddr_tbl[portid].dst,
+   rte_ether_addr_copy(ðaddr_tbl[portid].dst,
(struct rte_ether_addr *)(val_eth + portid));
-   rte_ether_addr_copy((struct rte_ether_addr *)ðaddr_tbl[portid].src,
+
+   rte_ether_addr_copy(ðaddr_tbl[portid].src,
(struct rte_ether_addr *)(val_eth + portid) + 1);
 
print_ethaddr("Address: ", ðaddr);
diff --git a/examples/ipsec-secgw/ipsec-secgw.h b/examples/ipsec-secgw/ipsec-secgw.h
index 0e0012d058..53665adf03 100644
--- a/examples/ipsec-secgw/ipsec-secgw.h
+++ b/examples/ipsec-secgw/ipsec-secgw.h
@@ -84,7 +84,7 @@ struct ipsec_traffic_nb {
 
 /* port/source ethernet addr and destination ethernet addr */
 struct ethaddr_info {
-   uint64_t src, dst;
+   struct rte_ether_addr src, dst;
 };
 
 struct ipsec_spd_stats {
-- 
2.25.1
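
The idea behind the patch above can be sketched outside DPDK as follows. This is only an illustration under stated assumptions: struct mac_addr, struct mac_pair, mac_copy() and mac_pair_set_src() are hypothetical stand-ins for struct rte_ether_addr, struct ethaddr_info and rte_ether_addr_copy(), not the real definitions. It shows addresses kept as typed 6-byte objects and copied byte-wise, so no uint64_t storage is ever accessed through a cast pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_ether_addr: six address bytes. */
struct mac_addr {
    uint8_t bytes[6];
};

/* Mirrors the shape of the patched table entry: typed addresses, no uint64_t. */
struct mac_pair {
    struct mac_addr src, dst;
};

/* Copy byte-wise so no object is accessed through an incompatible type. */
static void mac_copy(const struct mac_addr *from, struct mac_addr *to)
{
    assert(from != NULL && to != NULL);
    memcpy(to->bytes, from->bytes, sizeof(to->bytes));
}

/* Store a source address; return 1 if the entry now holds that address. */
static int mac_pair_set_src(struct mac_pair *entry, const struct mac_addr *addr)
{
    mac_copy(addr, &entry->src);
    return memcmp(entry->src.bytes, addr->bytes, sizeof(addr->bytes)) == 0;
}
```

Because each address is exactly six bytes with no integer overlay, the value stored in the table is the value later copied into packet headers, which is the behaviour the commit message asks for.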



[Bug 1205] iavf: cannot create 2 rte_flows with E810 VF, but can with E810 PF

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1205

Bug ID: 1205
   Summary: iavf: cannot create 2 rte_flows with E810 VF, but can
with E810 PF
   Product: DPDK
   Version: 23.03
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: maxime.le...@6wind.com
  Target Milestone: ---

Environment
-

distribution for host/vm: Ubuntu 22.04.2 LTS, kernel 5.15.0-67-generic
kernel driver: 1.10.1.2.2
firmware-version: 4.10 0x80015191 1.3310.0
COMMS DDP: 1.3.37
ICE OS Default Package version 1.3.30.0
testpmd cmdline: ./build/app/dpdk-testpmd --log-level=.*ice.*,debug
--legacy-mem -c 7 -a 17:00.0 -a :17:00.1  --  -i --nb-cores=2 --nb-ports=2
--total-num-mbufs=2048
dpdk version: 23.03.0-rc4
NIC: Intel Corporation Ethernet Controller E810-C for QSFP 

With PF (ice pmd)
-

Working case, no sriov, no VM. 

ICE PMD is able to create the following flows:

./build/app/dpdk-testpmd --log-level=.*ice.*,debug --legacy-mem -c 7 -a 17:00.0
-a :17:00.1  --  -i --nb-cores=2 --nb-ports=2 --total-num-mbufs=2048
testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
queue index 0 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end 
actions queue index 0 / end
ice_flow_create(): Succeeded to create (2) flow
Flow rule #1 created


With VF (iavf pmd)
--

Non-working case: SR-IOV with a VM on the same device/hardware.

sriov devices:
 On PF : echo 1 > "/sys/bus/pci/devices/:17:00.0/sriov_numvfs" -> for VF
17.01.0
 On PF : echo 1 > "/sys/bus/pci/devices/:17:00.1/sriov_numvfs" -> for VF
17.11.0

QEMU ARGS: -device vfio-pci,host=:17:01.0,addr=04 -device
vfio-pci,host=:17:11.0,addr=05 


./build/app/dpdk-testpmd --log-level=.*iavf.*,debug -c 0x6   -a :00:04.0 -a
:00:05.0   --  -i  --total-num-mbufs=2048
testpmd> flow create 0 ingress pattern eth / ipv4 proto is 1  / end actions
queue index 0 / end
iavf_handle_virtchnl_msg(): adminq response is received, opcode = 47
iavf_fdir_add(): Succeed in adding rule request by PF
iavf_flow_create(): Succeeded to create (2) flow
Flow rule #0 created
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 22 / end 
actions queue index 0 / end
iavf_handle_virtchnl_msg(): adminq response is received, opcode = 47
iavf_fdir_add(): Failed to add rule request due to the rule is conflict with
existing rule
iavf_flow_create(): Failed to create flow
port_flow_complain(): Caught PMD error type 2 (flow rule (handle)): Failed to
create parser engine.: Invalid argument

Conclusion
--
IAVF is not able to create the second flow because the kernel driver
1.10.1.2.2 rejects the creation of the second flow. There is no such issue with
the ICE PMD of DPDK 23.03.0-rc4.

-- 
You are receiving this mail because:
You are the assignee for the bug.

malloc_heap: Possible Control Block Overwrite When Insufficient Space in Elem

2023-03-30 Thread wuchangsheng (C)
Hello,

I seem to have discovered a problem in the heap memory allocation and 
deallocation operations.

elem |<------ padsize ------>| new_elem

In the malloc_elem_alloc function, when padsize > cache-line (such as 64 bytes) 
and padsize < sizeof(struct malloc_elem), the initialization of new_elem will 
overwrite and damage the struct malloc_elem information of elem, while setting 
the state of new_elem to ELEM_PAD. When releasing new_elem in malloc_elem_free, 
it will be converted to elem using RTE_PTR_SUB(new_elem, new_elem->pad), but at 
this point, the struct malloc_elem information of elem is damaged.
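
The overlap described above can be illustrated with a small self-contained sketch. It does not use DPDK's struct malloc_elem: HDR_SIZE (128 bytes here) and the byte patterns are assumptions chosen only to model "a header written at elem + padsize when padsize is smaller than the header". Whether malloc_elem_alloc can actually reach this state is not established by the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_SIZE 128 /* assumed header size, not taken from DPDK */
#define BUF_SIZE 512

/*
 * Place a header-sized block at offset 0, write a second one at offset pad,
 * and report whether any byte of the first block was overwritten.
 */
static int first_header_clobbered(size_t pad)
{
    unsigned char buf[BUF_SIZE];
    size_t i;

    assert(pad > 0 && pad + HDR_SIZE <= BUF_SIZE);
    memset(buf, 0, sizeof(buf));
    memset(buf, 0xAA, HDR_SIZE);       /* stand-in for elem's header */
    memset(buf + pad, 0x55, HDR_SIZE); /* stand-in for new_elem's header */

    for (i = 0; i < HDR_SIZE; i++)
        if (buf[i] != 0xAA)
            return 1;
    return 0;
}
```

With these assumed sizes, a pad of 64 bytes damages the first header, while a pad of 128 bytes or more leaves it intact.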



Re: [dpdk-dev] [PATCH] doc: deprecation notice to remove LiquidIO ethdev driver

2023-03-30 Thread Jerin Jacob
On Thu, Mar 9, 2023 at 5:16 PM Ferruh Yigit  wrote:
>
> On 3/9/2023 7:07 AM, jer...@marvell.com wrote:
> > From: Jerin Jacob 
> >
> > The LiquidIO product line(drivers/net/liquidio) has been substituted with
> > CN9K/CN10K OCTEON product line smart NICs located in drivers/net/octeon_ep/.
> > DPDK v20.08 has categorized the LiquidIO driver as UNMAINTAINED
> > because of the absence of updates in the driver.
> > Due to the above reasons, the driver will be unavailable from DPDK 23.07.
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > b/doc/guides/rel_notes/deprecation.rst
> > index 872847e938..eb6c3aedd8 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -135,3 +135,9 @@ Deprecation Notices
> >Its removal has been postponed to let potential users report interest
> >in maintaining it.
> >In the absence of such interest, this library will be removed in DPDK 
> > 23.11.
> > +
> > +* net/liquidio: remove LiquidIO ethdev driver. The LiquidIO product line 
> > has been substituted
> > +  with CN9K/CN10K OCTEON product line smart NICs located in 
> > ``drivers/net/octeon_ep/``.
> > +  DPDK v20.08 has categorized the LiquidIO driver as UNMAINTAINED because 
> > of the absence of
> > +  updates in the driver. Due to the above reasons, the driver will be 
> > unavailable from DPDK 23.07.
> > +
>
> Acked-by: Ferruh Yigit 

Ping for merge.


Re: [dpdk-dev] [PATCH] doc: deprecation notice to remove net/bnx2x driver

2023-03-30 Thread Jerin Jacob
On Tue, Mar 21, 2023 at 4:11 PM Ferruh Yigit  wrote:
>
> On 3/17/2023 6:02 PM, Alok Prasad wrote:
> >> -Original Message-
> >> From: jer...@marvell.com 
> >> Sent: 17 March 2023 18:00
> >> To: dev@dpdk.org
> >> Cc: tho...@monjalon.net; david.march...@redhat.com; ferruh.yi...@amd.com; 
> >> andrew.rybche...@oktetlabs.ru; Alok Prasad
> >> ; Devendra Singh Rawat ; Jerin 
> >> Jacob Kollanukkaran 
> >> Subject: [dpdk-dev] [PATCH] doc: deprecation notice to remove net/bnx2x 
> >> driver
> >>
> >> From: Jerin Jacob 
> >>
> >> Starting from DPDK 23.07, the Marvell QLogic bnx2x driver
> >> will be removed. This decision has been made to alleviate the burden of
> >> maintaining a discontinued product.
> >>
> >> Signed-off-by: Jerin Jacob 
> >> ---
> >>  doc/guides/rel_notes/deprecation.rst | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/doc/guides/rel_notes/deprecation.rst 
> >> b/doc/guides/rel_notes/deprecation.rst
> >> index 872847e938..d3d8d0011c 100644
> >> --- a/doc/guides/rel_notes/deprecation.rst
> >> +++ b/doc/guides/rel_notes/deprecation.rst
> >> @@ -135,3 +135,6 @@ Deprecation Notices
> >>Its removal has been postponed to let potential users report interest
> >>in maintaining it.
> >>In the absence of such interest, this library will be removed in DPDK 
> >> 23.11.
> >> +
> >> +* net/bnx2x: Starting from DPDK 23.07, the Marvell QLogic bnx2x driver 
> >> will be removed.
> >> +  This decision has been made to alleviate the burden of maintaining a 
> >> discontinued product.
> >> --
> >> 2.40.0
> >
> > Thanks Jerin!
> >
> > Acked-by: Alok Prasad 
>
>
> Acked-by: Ferruh Yigit 

Ping for merge.


Re: [dpdk-web] [RFC PATCH] process: new library approval in principle

2023-03-30 Thread Jerin Jacob
On Wed, Mar 15, 2023 at 7:17 PM Jerin Jacob  wrote:
>
> On Fri, Mar 3, 2023 at 11:55 PM Thomas Monjalon  wrote:
> >
> > Thanks for formalizing our process.
>
> Thanks for the review.

Ping

>
> >
> > 13/02/2023 10:26, jer...@marvell.com:
> > > --- /dev/null
> > > +++ b/content/process/_index.md
> >
> > First question: is the website the best place for this process?
> >
> > Inside the code guides, we have a contributing section,
> > but I'm not sure it is a good fit for the decision process.
> >
> > In the website, you are creating a new page "process".
> > Is it what we want?
> > What about making it a sub-page of "Technical Board"?
>
> Since it is a process, I thought of keeping "process" page.
> No specific opinion on where to add it.
> If not other objections, Then I can add at
> doc/guides/contributing/new_library_policy.rst in DPDK repo.
> Let me know if you think better name or better place to keep the file
>
> >
> > > @@ -0,0 +1,33 @@
> > > 
> > > +title = "Process"
> > > +weight = "9"
> > > 
> > > +
> > > +## Process for new library approval in principle
> > > +
> > > +### Rational
> >
> > s/Rational/Rationale/
>
> Ack
>
> >
> > > +
> > > +Adding a new library to DPDK codebase with proper RFC and then full 
> > > patch-sets is
> > > +significant work and getting early approval-in-principle that a library 
> > > help DPDK contributors
> > > +avoid wasted effort if it is not suitable for various reasons.
> >
> > That's a long sentence we could split.
>
> OK Changing as:
>
> Adding a new library to DPDK codebase with proper RFC and full
> patch-sets is significant work.
>
> Getting early approval-in-principle that a library can help DPDK
> contributors avoid wasted effort
> if it is not suitable for various reasons
>
>
> >
> > > +
> > > +### Process
> > > +
> > > +1. When a contributor would like to add a new library to DPDK code base, 
> > > the contributor must send
> > > +the following items to DPDK mailing list for TB approval-in-principle.
> >
> > I think we can remove "code base".
>
> Ack
>
> >
> > TB should be explained: Technical Board.
>
> Ack
>
> >
> > > +
> > > +   - Purpose of the library.
> > > +   - Scope of the library.
> >
> > Not sure I understand the difference between Purpose and Scope.
>
> Purpose → The need for the library
> Scope → I meant the work scope associated with it.
>
> I will change "Scope of the library" to,
>
> - Scope of work: Outline the various additional tasks planned for this
> library, such as developing new test applications, adding new drivers,
> and updating existing applications.
>
> >
> > > +   - Any licensing constraints.
> > > +   - Justification for adding to DPDK.
> > > +   - Any other implementations of the same functionality in other 
> > > libs/products and how this version differs.
> >
> > libs/products -> libraries/projects
>
> Ack
>
> >
> > > +   - Public API specification header file as RFC
> > > +   - Optional and good to have.
> >
> > You mean providing API is optional at this stage?
>
> Yes. I think, TB can request if more clarity is needed as mentioned below.
> "TB may additionally request this collateral if needed to get more
> clarity on scope and purpose"
>
> >
> > > +   - TB may additionally request this collateral if needed to get 
> > > more clarity on scope and purpose.
> > > +
> > > +2. TB to schedule discussion on this in upcoming TB meeting along with 
> > > author. Based on the TB
> > > +schedule and/or author availability, TB may need maximum three TB 
> > > meeting slots.
> >
> > Better to translate the delay into weeks: 5 weeks?
>
> Ack
>
> >
> > > +
> > > +3. Based on mailing list and TB meeting discussions, TB to vote for 
> > > approval-in-principle and share
> > > +the decision in the mailing list.
> >
> > I think we should say here that it is safe to start working
> > on the implementation after this step,
> > but the patches will need to match usual quality criterias
> > to be effectively accepted.
>
> OK.
>
> I will add the following,
>
> 4.  Once TB approves the library in principle, it is safe to start
> working on its implementation.
> However, the patches will need to meet the usual quality criteria in
> order to be effectively accepted.
>
>
> >
> >


Should we try to be more graceful in library init on old Hardware?

2023-03-30 Thread Christian Ehrhardt
Hi,
I've recently received the kind of bug report I had been expecting for many years.
In fact I wondered if it would still come up, as each year made it less likely.
But it happened, and I got a crash report from someone using DPDK on
rather old pre-SSE4.2 hardware.
=> https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2009635/comments/9

The reporter was nice and tried the newer 22.11, but that is just as affected.

I understand that DPDK, as a project, has set this as the minimal
accepted hardware capability.
But some programs, in this case UHD, can do many other things and
merely link to DPDK (because they could be used with it), and as a
result they crash on loading. In theory other tools with DPDK support,
such as collectd, would be affected in the same way.

Example:
root@1bee22d20ca0:/# uhd_usrp_probe
Illegal instruction (core dumped)

(gdb) bt
#0 0x7f4b2d3a3374 in rte_srand () from
/lib/x86_64-linux-gnu/librte_eal.so.23
#1 0x7f4b2d3967ec in ?? () from /lib/x86_64-linux-gnu/librte_eal.so.23
#2 0x7f4b2e5d1fbe in call_init (l=,
argc=argc@entry=1, argv=argv@entry=0x7ffeabf5b488,
env=env@entry=0x7ffeabf5b498)
at ./elf/dl-init.c:70
#3 0x7f4b2e5d20a8 in call_init (env=0x7ffeabf5b498,
argv=0x7ffeabf5b488, argc=1, l=) at ./elf/dl-init.c:33
#4 _dl_init (main_map=0x7f4b2e6042e0, argc=1, argv=0x7ffeabf5b488,
env=0x7ffeabf5b498) at ./elf/dl-init.c:117
#5 0x7f4b2e5ea8b0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#6 0x0001 in ?? ()
#7 0x7ffeabf5c844 in ?? ()
#8 0x in ?? ()

Right now all we could do is:
a) say bad luck old hardware (not nice)
b) make super complex alternative builds with and without dpdk support
c) ask the DPDK project to work on non sse4.2 (unlikely and too late
in 2023 I guess)
d) Somehow make the initialization graceful (that is what this RFC is about)

If DPDK could ensure that the library loading paths are free of SSE4.2
instructions, then we could check the capabilities during the actual
initialization and return a proper error instead of crashing.
That way only real users of DPDK would need sufficiently new hardware.
And OTOH users of software that links against DPDK, but does not use it
in the current config, would suffer less.

WDYT?
Maybe this has already been discussed and I neither remembered nor found it?

-- 
Christian Ehrhardt
Senior Staff Engineer, Ubuntu Server
Canonical Ltd


Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Prashant Upadhyaya
On Thu, Mar 30, 2023 at 2:50 PM Bruce Richardson
 wrote:
>
> On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> > Hi,
> >
>
> FYI, when replying on list, it's best not to top-post, but put your replies
> below the email snippet you are replying to.
>
> > The hash creation API throws the following error --
> > RING: Cannot reserve memory for tailq
> > HASH: memory allocation failed
> >
> > The timer subsystem init api throws this error --
> > EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> > memzone segments exceeds RTE_MAX_MEMZONE
> >
>
> Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
> file, so edit that and then rebuild DPDK. [If you are using the built-in
> DPDK from VPP, you may need to do a patch for this, add it into the VPP
> patches directory and then do a VPP rebuild.]
>
> Let's see if we can get rid of at least one of the error messages. :-)
>
> /Bruce
>
> > I did check the code and apparently the memzone and rte zmalloc
> > related api's are not being able to allocate memory.
> >
> > Regards
> > -Prashant
> >
> > On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
> >  wrote:
> > >
> > > On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> > > > Hi,
> > > >
> > > > While trying to port some code to VPP (which uses DPDK as the backend
> > > > driver), I am running into a problem that calls to API's like
> > > > rte_timer_subsystem_init, rte_hash_create are failing while allocation
> > > > of memory.
> > > >
> > > > This is presumably because VPP inits the EAL with the following 
> > > > arguments --
> > > >
> > > > --in-memory --no-telemetry --file-prefix vpp
> > > >
> > > > Is  there is something that can be done eg. passing some more parms in
> > > > the EAL initialization which hopefully wouldn't break VPP but will
> > > > also be friendly to the RTE timer and hash functions too, that would
> > > > be great, so requesting some advice here.
> > > >
> > > Hi,
> > >
> > > can you provide some more details on what the errors are that you are
> > > receiving? Have you been able to dig a little deeper into what might be
> > > causing the memory failures? The above flags alone are unlikely to cause
> > > issues with hash or timer libraries, for example.
> > >
> > > /Bruce

Thanks Bruce, the error comes from the following function in
lib/eal/common/eal_common_memzone.c
memzone_reserve_aligned_thread_unsafe

The condition that triggers the error is:
if (arr->count >= arr->len)
So I printed both values inside this function, and got the following
output:

vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
[New Thread 0x7fffa67b6700 (LWP 14732)]
count: 0 len: 2560
count: 1 len: 2560
count: 2 len: 2560
[New Thread 0x7fffa5fb5700 (LWP 14733)]
[New Thread 0x7fffa5db4700 (LWP 14734)]
count: 3 len: 2560
count: 4 len: 2560
### This is where I call rte_timer_subsystem_init from my code. The
lines above must come from other code in VPP/EAL init; the line below
is surely due to my call to
rte_timer_subsystem_init
count: 0 len: 0

As you can see, both values are zero at that point. Is this
expected? I thought arr->len should have been non-zero.
I should add that the thread calling rte_timer_subsystem_init is
possibly different from the one that did the EAL init. Do you think
that might be a problem?
I have yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
the above first for any suggestions.
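
The two printed states can be checked against the quoted condition with a small sketch. struct zone_array and the helper names are hypothetical; only the comparison count >= len and the printed numbers come from the output above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the array bookkeeping read by the quoted check. */
struct zone_array {
    unsigned int count; /* entries currently in use */
    unsigned int len;   /* capacity of the array */
};

/* Same comparison as the quoted condition: non-zero means "limit reached". */
static int zone_limit_reached(const struct zone_array *arr)
{
    return arr->count >= arr->len;
}

/* Print one state in the style of the debug output above. */
static void zone_report(const struct zone_array *arr)
{
    printf("count: %u len: %u -> %s\n", arr->count, arr->len,
           zone_limit_reached(arr) ? "limit reached" : "ok");
}
```

With count 4 and len 2560 the check passes, while count 0 and len 0 trips it immediately, since 0 >= 0 holds. That suggests the failing call sees an empty or uninitialized array rather than a full one.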

Regards
-Prashant


Re: [PATCH 2/2] net/gve: update copyright holders

2023-03-30 Thread Thomas Monjalon
30/03/2023 09:20, Guo, Junfeng:
> From: Thomas Monjalon 
> > 28/03/2023 11:35, Guo, Junfeng:
> > > The background is that, in the past (DPDK 22.11) we didn't get the
> > approval
> > > of license from Google, thus chose the MIT License for the base code,
> > and
> > > BSD-3 License for GVE common code (without the files in /base folder).
> > > We also left the copyright holder of base code just to Google Inc, and
> > made
> > > Intel as the copyright holder of GVE common code (without /base
> > folder).
> > >
> > > Today we are working together for GVE dev and maintaining. And we
> > got
> > > the approval of BSD-3 License from Google for the base code.
> > > Thus we decided to 1) switch the License of GVE base code from MIT to
> > BSD-3;
> > > 2) add Google LLC as one of the copyright holders for GVE common
> > code.
> > 
> > Do you realize we had lengthy discussions in the Technical Board,
> > the Governing Board, and with lawyers, just for this unneeded exception?
> > 
> > Now looking at the patches, there seem to be some big mistakes like
> > removing some copyright. I don't understand how it can be taken so
> > lightly.
> > 
> > I regret how fast we were, next time we will surely operate differently.
> > If you want to improve the reputation of this driver,
> > please ask other copyright holders to be more active and responsive.
> > 
> 
> Really sorry for causing such severe trouble.
> 
> Yes, we did take lots of efforts in the Technical Board and the Governing
> Board about this MIT exception. We really appreciate that.
> 
> About this patch set, it is my severe mistake to switch the MIT License
> directly for the upstream-ed code in community, in the wrong way.
> In the past we upstream-ed this driver with MIT License followed from
> the kernel community's gve driver base code. And now we want to
> use the code with BSD-3 License (approved by Google). 
> So I suppose that the correct way may be 1) first remove all these code 
> under MIT License and 2) then add the new files under BSD-3 License.

Is the code under BSD different from the MIT code?
If it is the same code with a newly approved license, you can just change the license.

> Please correct me if there are still misunderstanding in my statement. 
> Thanks Thomas for pointing out my mistake. I'll be careful to fix this.
> 
> Copyright holder for the gve base code will stay unchanged. Google LLC 
> will be added as one of the copyright holders for the gve common code.
> @Rushil Gupta Please also be more active and responsive for the code
> review and contribution in the community. Thanks!





Re: Should we try to be more graceful in library init on old Hardware?

2023-03-30 Thread Bruce Richardson
On Thu, Mar 30, 2023 at 02:53:41PM +0200, Christian Ehrhardt wrote:
> Hi,
> I've recently gotten a kind of bug I was waiting for many years.
> In fact I wondered if it would still come up as each year  made it less 
> likely.
> But it happened and I got a crash report of someone using dpdk a
> rather old pre sse4.2 hardware.
> => https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2009635/comments/9
> 
> The reporter was nice and tried the newer 22.11, but that is just as affected.
> 
> I understand that DPDK, as a project, has set this as the minimal
> accepted hardware capability.
> But due to some programs - in this case UHD - being able to do many
> other things it might happen that UHD or any else just links to DPDK
> (as it could be used with it) and due to that runs into a crash when
> loading. In theory other tools like collectd which has dpdk support
> would be affected by the same.
> 
> Example:
> root@1bee22d20ca0:/# uhd_usrp_probe
> Illegal instruction (core dumped)
> 
> (gdb) bt
> #0 0x7f4b2d3a3374 in rte_srand () from
> /lib/x86_64-linux-gnu/librte_eal.so.23
> #1 0x7f4b2d3967ec in ?? () from /lib/x86_64-linux-gnu/librte_eal.so.23
> #2 0x7f4b2e5d1fbe in call_init (l=,
> argc=argc@entry=1, argv=argv@entry=0x7ffeabf5b488,
> env=env@entry=0x7ffeabf5b498)
> at ./elf/dl-init.c:70
> #3 0x7f4b2e5d20a8 in call_init (env=0x7ffeabf5b498,
> argv=0x7ffeabf5b488, argc=1, l=) at ./elf/dl-init.c:33
> #4 _dl_init (main_map=0x7f4b2e6042e0, argc=1, argv=0x7ffeabf5b488,
> env=0x7ffeabf5b498) at ./elf/dl-init.c:117
> #5 0x7f4b2e5ea8b0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> #6 0x0001 in ?? ()
> #7 0x7ffeabf5c844 in ?? ()
> #8 0x in ?? ()
> 
> Right now all we could do is:
> a) say bad luck old hardware (not nice)
> b) make super complex alternative builds with and without dpdk support
> c) ask the DPDK project to work on non sse4.2 (unlikely and too late
> in 2023 I guess)
> d) Somehow make the initialization graceful (that is what I'm RFC here)
> 
> If we could manage to get that DPDK to ensure the lib loading paths
> are SSE4.2 free.
> Then we could check the capabilities on the actual initialization and
> return a proper bad result instead of a crash.
> Due to that only real-users of DPDK would be required to have
> sufficiently new hardware.
> And OTOH users of software that links, but in the current config would
> not use DPDK would suffer less.
> 
> WDYT?
> Maybe it has been already discussed and I did neither remember nor find it?
> 
It certainly hasn't been discussed previously, but there is meant to be
support for this in EAL init itself. Almost the first function called
from eal_init() is "rte_cpu_is_supported()" [1] which checks the build-time
CPU flags against those of the current system.
Unfortunately, from the error message you are getting, that doesn't seem to
be working ok in the case of SSE4.2. It seems the compiler is inserting
SSE4 instructions before we even get to that point. :-(

Perhaps we need to move eal init to a new file, and compile it (and the
cpuflag checks) with very minimal CPU flags.
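
A minimal sketch of such an early check, assuming GCC or Clang on x86 and using the compiler builtins __builtin_cpu_init() and __builtin_cpu_supports() rather than DPDK's own rte_cpu_is_supported(). The names cpu_has_sse42() and graceful_init() are hypothetical. The check only helps if the code that runs it is itself compiled without SSE4.2 instructions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Return 1 if the running CPU reports SSE4.2, else 0. */
static int cpu_has_sse42(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.2") ? 1 : 0;
#else
    return 0; /* not x86 or unknown compiler: report "not available" */
#endif
}

/* Hypothetical init wrapper: return an error instead of crashing later. */
static int graceful_init(void)
{
    if (!cpu_has_sse42()) {
        fprintf(stderr, "init: CPU lacks SSE4.2, not initializing\n");
        return -ENOTSUP;
    }
    return 0;
}
```

A caller that merely links against the library could then treat -ENOTSUP as "feature unavailable" and continue without it.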

/Bruce


[1] http://git.dpdk.org/dpdk/tree/lib/eal/common/eal_common_cpuflags.c


Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Bruce Richardson
On Thu, Mar 30, 2023 at 06:42:58PM +0530, Prashant Upadhyaya wrote:
> On Thu, Mar 30, 2023 at 2:50 PM Bruce Richardson
>  wrote:
> >
> > On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> > > Hi,
> > >
> >
> > FYI, when replying on list, it's best not to top-post, but put your replies
> > below the email snippet you are replying to.
> >
> > > The hash creation API throws the following error --
> > > RING: Cannot reserve memory for tailq
> > > HASH: memory allocation failed
> > >
> > > The timer subsystem init api throws this error --
> > > EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> > > memzone segments exceeds RTE_MAX_MEMZONE
> > >
> >
> > Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
> > file, so edit that and then rebuild DPDK. [If you are using the built-in
> > DPDK from VPP, you may need to do a patch for this, add it into the VPP
> > patches directory and then do a VPP rebuild.]
> >
> > Let's see if we can get rid of at least one of the error messages. :-)
> >
> > /Bruce
> >
> > > I did check the code and apparently the memzone and rte zmalloc
> > > related api's are not being able to allocate memory.
> > >
> > > Regards
> > > -Prashant
> > >
> > > On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
> > >  wrote:
> > > >
> > > > On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> > > > > Hi,
> > > > >
> > > > > While trying to port some code to VPP (which uses DPDK as the backend
> > > > > driver), I am running into a problem that calls to API's like
> > > > > rte_timer_subsystem_init, rte_hash_create are failing while allocation
> > > > > of memory.
> > > > >
> > > > > This is presumably because VPP inits the EAL with the following 
> > > > > arguments --
> > > > >
> > > > > --in-memory --no-telemetry --file-prefix vpp
> > > > >
> > > > > Is  there is something that can be done eg. passing some more parms in
> > > > > the EAL initialization which hopefully wouldn't break VPP but will
> > > > > also be friendly to the RTE timer and hash functions too, that would
> > > > > be great, so requesting some advice here.
> > > > >
> > > > Hi,
> > > >
> > > > can you provide some more details on what the errors are that you are
> > > > receiving? Have you been able to dig a little deeper into what might be
> > > > causing the memory failures? The above flags alone are unlikely to cause
> > > > issues with hash or timer libraries, for example.
> > > >
> > > > /Bruce
> 
> Thanks Bruce, the error comes from the following function in
> lib/eal/common/eal_common_memzone.c
> memzone_reserve_aligned_thread_unsafe
> 
> The condition which spits out the error is the following
> if (arr->count >= arr->len)
> So I printed both of the above values inside this function, and the
> following output came
> 
> vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
> [New Thread 0x7fffa67b6700 (LWP 14732)]
> count: 0 len: 2560
> count: 1 len: 2560
> count: 2 len: 2560
> [New Thread 0x7fffa5fb5700 (LWP 14733)]
> [New Thread 0x7fffa5db4700 (LWP 14734)]
> count: 3 len: 2560
> count: 4 len: 2560
> ### this is the place where I call rte_timer_subsystem_init from my
> code, the above must be coming from any other code from VPP/EAL init,
> the line below is surely because of my call to
> rte_timer_subsystem_init
> count: 0 len: 0
> 
> So as you can see that both values are coming to be zero -- is this
> expected ? I thought the arr->len should have been non zero.
> I must add that the thread which is calling the
> rte_timer_subsystem_init is possibly different than the one which did
> the eal init, do you think that might be a problem...
> I am yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
> the above first for any suggestions.
> 
Given the lengths you printed above, increasing the MAX_MEMZONE will not
help things. Is the init call which is failing coming from a non-DPDK
thread?


Re: Should we try to be more graceful in library init on old Hardware?

2023-03-30 Thread Bruce Richardson
On Thu, Mar 30, 2023 at 02:15:42PM +0100, Bruce Richardson wrote:
> On Thu, Mar 30, 2023 at 02:53:41PM +0200, Christian Ehrhardt wrote:
> > Hi,
> > I've recently gotten a kind of bug I was waiting for many years.
> > In fact I wondered if it would still come up as each year  made it less 
> > likely.
> > But it happened and I got a crash report of someone using dpdk a
> > rather old pre sse4.2 hardware.
> > => https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2009635/comments/9
> > 
> > The reporter was nice and tried the newer 22.11, but that is just as 
> > affected.
> > 
> > I understand that DPDK, as a project, has set this as the minimal
> > accepted hardware capability.
> > But due to some programs - in this case UHD - being able to do many
> > other things it might happen that UHD or any else just links to DPDK
> > (as it could be used with it) and due to that runs into a crash when
> > loading. In theory other tools like collectd which has dpdk support
> > would be affected by the same.
> > 
> > Example:
> > root@1bee22d20ca0:/# uhd_usrp_probe
> > Illegal instruction (core dumped)
> > 
> > (gdb) bt
> > #0 0x7f4b2d3a3374 in rte_srand () from
> > /lib/x86_64-linux-gnu/librte_eal.so.23
> > #1 0x7f4b2d3967ec in ?? () from /lib/x86_64-linux-gnu/librte_eal.so.23
> > #2 0x7f4b2e5d1fbe in call_init (l=,
> > argc=argc@entry=1, argv=argv@entry=0x7ffeabf5b488,
> > env=env@entry=0x7ffeabf5b498)
> > at ./elf/dl-init.c:70
> > #3 0x7f4b2e5d20a8 in call_init (env=0x7ffeabf5b498,
> > argv=0x7ffeabf5b488, argc=1, l=) at ./elf/dl-init.c:33
> > #4 _dl_init (main_map=0x7f4b2e6042e0, argc=1, argv=0x7ffeabf5b488,
> > env=0x7ffeabf5b498) at ./elf/dl-init.c:117
> > #5 0x7f4b2e5ea8b0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> > #6 0x0001 in ?? ()
> > #7 0x7ffeabf5c844 in ?? ()
> > #8 0x in ?? ()
> > 
> > Right now all we could do is:
> > a) say bad luck old hardware (not nice)
> > b) make super complex alternative builds with and without dpdk support
> > c) ask the DPDK project to work on non sse4.2 (unlikely and too late
> > in 2023 I guess)
> > d) Somehow make the initialization graceful (that is what I'm RFC here)
> > 
> > If we could manage to get that DPDK to ensure the lib loading paths
> > are SSE4.2 free.
> > Then we could check the capabilities on the actual initialization and
> > return a proper bad result instead of a crash.
> > Due to that only real-users of DPDK would be required to have
> > sufficiently new hardware.
> > And OTOH users of software that links, but in the current config would
> > not use DPDK would suffer less.
> > 
> > WDYT?
> > Maybe it has been already discussed and I did neither remember nor find it?
> > 
> It certainly hasn't been discussed previously, but there is meant to be
> support for this in EAL init itself. Almost the first function called
> from eal_init() is "rte_cpu_is_supported()" [1] which checks the build-time
> CPU flags against those of the current system.
> Unfortunately, from the error message you are getting, that doesn't seem to
> be working ok in the case of SSE4.2. It seems the compiler is inserting
> SSE4 instructions before we even get to that point. :-(
> 
> Perhaps we need to move eal init to a new file, and compile it (and the
> cpuflag checks) with very minimal CPU flags.
> 

Following up to my own mail...

I believe we may be able to solve this more easily by using the "target"
attribute for those functions. For x86 builds I don't see why eal init
cannot be compiled for an earlier SSE version (march=core2, perhaps). It's
not a performance-sensitive function.

Thoughts?
/Bruce
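
As a rough illustration of the "target" attribute idea above (a sketch
only, not DPDK code): with GCC or Clang on x86, a single function can be
built for an older baseline and can check the CPU at run time before
anything else runs. The names BASELINE_CODE, cpu_has_required_flags and
init_sketch are made up for this example, and checking only SSE4.2 is an
assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* On x86, build the marked functions for an older baseline (core2 here,
 * as floated above) regardless of the -march used for the file. */
#if defined(__x86_64__) || defined(__i386__)
#define BASELINE_CODE __attribute__((target("arch=core2")))
#else
#define BASELINE_CODE
#endif

/* Hypothetical helper: does the running CPU have what the rest of the
 * library was built for? Only SSE4.2 is checked in this sketch. */
BASELINE_CODE
static bool cpu_has_required_flags(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	return __builtin_cpu_supports("sse4.2") != 0;
#else
	return true; /* nothing to check on other architectures here */
#endif
}

/* Hypothetical stand-in for an init entry point: report an error
 * instead of running into an illegal instruction later on. */
BASELINE_CODE
int init_sketch(void)
{
	if (!cpu_has_required_flags())
		return -ENOTSUP;
	/* ... the real, performance-sensitive setup would follow ... */
	return 0;
}
```

The attribute only covers the functions it is attached to, so anything
they call before the check passes would need the same treatment.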


Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Prashant Upadhyaya
On Thu, Mar 30, 2023 at 6:47 PM Bruce Richardson
 wrote:
>
> On Thu, Mar 30, 2023 at 06:42:58PM +0530, Prashant Upadhyaya wrote:
> > On Thu, Mar 30, 2023 at 2:50 PM Bruce Richardson
> >  wrote:
> > >
> > > On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> > > > Hi,
> > > >
> > >
> > > FYI, when replying on list, it's best not to top-post, but put your replies
> > > below the email snippet you are replying to.
> > >
> > > > The hash creation API throws the following error --
> > > > RING: Cannot reserve memory for tailq
> > > > HASH: memory allocation failed
> > > >
> > > > The timer subsystem init api throws this error --
> > > > EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> > > > memzone segments exceeds RTE_MAX_MEMZONE
> > > >
> > >
> > > Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
> > > file, so edit that and then rebuild DPDK. [If you are using the built-in
> > > DPDK from VPP, you may need to do a patch for this, add it into the VPP
> > > patches directory and then do a VPP rebuild.]
> > >
> > > Let's see if we can get rid of at least one of the error messages. :-)
> > >
> > > /Bruce
> > >
> > > > I did check the code and apparently the memzone and rte zmalloc
> > > > related APIs are not able to allocate memory.
> > > >
> > > > Regards
> > > > -Prashant
> > > >
> > > > On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
> > > >  wrote:
> > > > >
> > > > > On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> > > > > > Hi,
> > > > > >
> > > > > > While trying to port some code to VPP (which uses DPDK as the backend
> > > > > > driver), I am running into a problem: calls to APIs like
> > > > > > rte_timer_subsystem_init and rte_hash_create are failing while
> > > > > > allocating memory.
> > > > > >
> > > > > > This is presumably because VPP inits the EAL with the following arguments --
> > > > > >
> > > > > > --in-memory --no-telemetry --file-prefix vpp
> > > > > >
> > > > > > If there is something that can be done, e.g. passing some more params in
> > > > > > the EAL initialization, which hopefully wouldn't break VPP but will
> > > > > > also be friendly to the RTE timer and hash functions, that would
> > > > > > be great, so requesting some advice here.
> > > > > >
> > > > > Hi,
> > > > >
> > > > > can you provide some more details on what the errors are that you are
> > > > > receiving? Have you been able to dig a little deeper into what might be
> > > > > causing the memory failures? The above flags alone are unlikely to cause
> > > > > issues with hash or timer libraries, for example.
> > > > >
> > > > > /Bruce
> >
> > Thanks Bruce, the error comes from the following function in
> > lib/eal/common/eal_common_memzone.c
> > memzone_reserve_aligned_thread_unsafe
> >
> > The condition which spits out the error is the following
> > if (arr->count >= arr->len)
> > So I printed both of the above values inside this function, and the
> > following output came
> >
> > vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
> > [New Thread 0x7fffa67b6700 (LWP 14732)]
> > count: 0 len: 2560
> > count: 1 len: 2560
> > count: 2 len: 2560
> > [New Thread 0x7fffa5fb5700 (LWP 14733)]
> > [New Thread 0x7fffa5db4700 (LWP 14734)]
> > count: 3 len: 2560
> > count: 4 len: 2560
> > ### this is the place where I call rte_timer_subsystem_init from my
> > code, the above must be coming from any other code from VPP/EAL init,
> > the line below is surely because of my call to
> > rte_timer_subsystem_init
> > count: 0 len: 0
> >
> > So, as you can see, both values come out as zero -- is this
> > expected? I thought arr->len should have been non-zero.
> > I must add that the thread which is calling
> > rte_timer_subsystem_init is possibly different from the one which did
> > the EAL init; do you think that might be a problem?
> > I have yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
> > the above first for any suggestions.
> >
> Given the lengths you printed above, increasing the MAX_MEMZONE will not
> help things. Is the init call which is failing coming from a non-DPDK
> thread?

Likely yes, at the moment I am calling it from a CLI which I have added in VPP.
Assuming this is the case, do you foresee a problem ?
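
For reference, the condition quoted above can be modelled in isolation
to show why "count: 0 len: 0" trips the limit check even though nothing
has been reserved: 0 >= 0 is true, so an array that looks empty and
zero-length is treated as full. This is only a sketch; struct zone_array
and zone_array_is_full are made-up stand-ins, not the DPDK definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the array that tracks memzones; only the
 * two fields printed in the debug output above are modelled. */
struct zone_array {
	unsigned int count; /* entries in use */
	unsigned int len;   /* capacity */
};

/* Same shape as the quoted condition: there is no free slot once
 * count has reached len. */
static bool zone_array_is_full(const struct zone_array *arr)
{
	return arr->count >= arr->len;
}
```

With count 4 and len 2560 the check passes, while count 0 and len 0
reports "full", which matches the observation that raising the limit
would not change the outcome.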


Re: Regarding DPDK API's like rte_timer_subsystem_init/rte_hash_create etc. in VPP

2023-03-30 Thread Bruce Richardson
On Thu, Mar 30, 2023 at 07:07:23PM +0530, Prashant Upadhyaya wrote:
> On Thu, Mar 30, 2023 at 6:47 PM Bruce Richardson
>  wrote:
> >
> > On Thu, Mar 30, 2023 at 06:42:58PM +0530, Prashant Upadhyaya wrote:
> > > On Thu, Mar 30, 2023 at 2:50 PM Bruce Richardson
> > >  wrote:
> > > >
> > > > On Thu, Mar 30, 2023 at 01:57:52PM +0530, Prashant Upadhyaya wrote:
> > > > > Hi,
> > > > >
> > > >
> > > > FYI, when replying on list, it's best not to top-post, but put your replies
> > > > below the email snippet you are replying to.
> > > >
> > > > > The hash creation API throws the following error --
> > > > > RING: Cannot reserve memory for tailq
> > > > > HASH: memory allocation failed
> > > > >
> > > > > The timer subsystem init api throws this error --
> > > > > EAL: memzone_reserve_aligned_thread_unsafe(): Number of requested
> > > > > memzone segments exceeds RTE_MAX_MEMZONE
> > > > >
> > > >
> > > > Can you try increasing RTE_MAX_MEMZONE? It's defined in DPDK's rte_config.h
> > > > file, so edit that and then rebuild DPDK. [If you are using the built-in
> > > > DPDK from VPP, you may need to do a patch for this, add it into the VPP
> > > > patches directory and then do a VPP rebuild.]
> > > >
> > > > Let's see if we can get rid of at least one of the error messages. :-)
> > > >
> > > > /Bruce
> > > >
> > > > > I did check the code and apparently the memzone and rte zmalloc
> > > > > related APIs are not able to allocate memory.
> > > > >
> > > > > Regards
> > > > > -Prashant
> > > > >
> > > > > On Thu, Mar 30, 2023 at 1:30 PM Bruce Richardson
> > > > >  wrote:
> > > > > >
> > > > > > On Thu, Mar 30, 2023 at 10:30:24AM +0530, Prashant Upadhyaya wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > While trying to port some code to VPP (which uses DPDK as the backend
> > > > > > > driver), I am running into a problem: calls to APIs like
> > > > > > > rte_timer_subsystem_init and rte_hash_create are failing while
> > > > > > > allocating memory.
> > > > > > >
> > > > > > > This is presumably because VPP inits the EAL with the following arguments --
> > > > > > >
> > > > > > > --in-memory --no-telemetry --file-prefix vpp
> > > > > > >
> > > > > > > If there is something that can be done, e.g. passing some more params in
> > > > > > > the EAL initialization, which hopefully wouldn't break VPP but will
> > > > > > > also be friendly to the RTE timer and hash functions, that would
> > > > > > > be great, so requesting some advice here.
> > > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > can you provide some more details on what the errors are that you are
> > > > > > receiving? Have you been able to dig a little deeper into what might be
> > > > > > causing the memory failures? The above flags alone are unlikely to cause
> > > > > > issues with hash or timer libraries, for example.
> > > > > >
> > > > > > /Bruce
> > >
> > > Thanks Bruce, the error comes from the following function in
> > > lib/eal/common/eal_common_memzone.c
> > > memzone_reserve_aligned_thread_unsafe
> > >
> > > The condition which spits out the error is the following
> > > if (arr->count >= arr->len)
> > > So I printed both of the above values inside this function, and the
> > > following output came
> > >
> > > vpp[14728]: dpdk: EAL init args: --in-memory --no-telemetry --file-prefix vpp
> > > [New Thread 0x7fffa67b6700 (LWP 14732)]
> > > count: 0 len: 2560
> > > count: 1 len: 2560
> > > count: 2 len: 2560
> > > [New Thread 0x7fffa5fb5700 (LWP 14733)]
> > > [New Thread 0x7fffa5db4700 (LWP 14734)]
> > > count: 3 len: 2560
> > > count: 4 len: 2560
> > > ### this is the place where I call rte_timer_subsystem_init from my
> > > code, the above must be coming from any other code from VPP/EAL init,
> > > the line below is surely because of my call to
> > > rte_timer_subsystem_init
> > > count: 0 len: 0
> > >
> > > So, as you can see, both values come out as zero -- is this
> > > expected? I thought arr->len should have been non-zero.
> > > I must add that the thread which is calling
> > > rte_timer_subsystem_init is possibly different from the one which did
> > > the EAL init; do you think that might be a problem?
> > > I have yet to increase the value of RTE_MAX_MEMZONE, but wanted to share
> > > the above first for any suggestions.
> > >
> > Given the lengths you printed above, increasing the MAX_MEMZONE will not
> > help things. Is the init call which is failing coming from a non-DPDK
> > thread?
> 
> Likely yes, at the moment I am calling it from a CLI which I have added in VPP.
> Assuming this is the case, do you foresee a problem?

Could well be a possible cause, yes. With non-DPDK threads, the memory NUMA
node/socket-id entries could be invalid, and cause the DPDK memory
allocation to look for memory heaps on non-existent NUMA nodes.
Can you try using rte_thread_register API in

Re: Should we try to be more graceful in library init on old Hardware?

2023-03-30 Thread Dmitry Kozlyuk
2023-03-30 14:28 (UTC+0100), Bruce Richardson:
> On Thu, Mar 30, 2023 at 02:15:42PM +0100, Bruce Richardson wrote:
> > On Thu, Mar 30, 2023 at 02:53:41PM +0200, Christian Ehrhardt wrote:  
> > > Hi,
> > > I've recently gotten a kind of bug I was waiting for many years.
> > > > In fact I wondered if it would still come up, as each year made it less likely.
> > > > But it happened and I got a crash report from someone using DPDK on
> > > > rather old pre-SSE4.2 hardware.
> > > > => https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2009635/comments/9
> > > >
> > > > The reporter was nice and tried the newer 22.11, but that is just as affected.
> > > 
> > > I understand that DPDK, as a project, has set this as the minimal
> > > accepted hardware capability.
> > > But due to some programs - in this case UHD - being able to do many
> > > other things it might happen that UHD or any else just links to DPDK
> > > (as it could be used with it) and due to that runs into a crash when
> > > loading. In theory other tools like collectd which has dpdk support
> > > would be affected by the same.
> > > 
> > > Example:
> > > root@1bee22d20ca0:/# uhd_usrp_probe
> > > Illegal instruction (core dumped)
> > > 
> > > (gdb) bt
> > > #0 0x7f4b2d3a3374 in rte_srand () from
> > > /lib/x86_64-linux-gnu/librte_eal.so.23
> > > #1 0x7f4b2d3967ec in ?? () from /lib/x86_64-linux-gnu/librte_eal.so.23
> > > #2 0x7f4b2e5d1fbe in call_init (l=,
> > > argc=argc@entry=1, argv=argv@entry=0x7ffeabf5b488,
> > > env=env@entry=0x7ffeabf5b498)
> > > at ./elf/dl-init.c:70
> > > #3 0x7f4b2e5d20a8 in call_init (env=0x7ffeabf5b498,
> > > argv=0x7ffeabf5b488, argc=1, l=) at ./elf/dl-init.c:33
> > > #4 _dl_init (main_map=0x7f4b2e6042e0, argc=1, argv=0x7ffeabf5b488,
> > > env=0x7ffeabf5b498) at ./elf/dl-init.c:117
> > > > #5 0x7f4b2e5ea8b0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> > > #6 0x0001 in ?? ()
> > > #7 0x7ffeabf5c844 in ?? ()
> > > #8 0x in ?? ()
> > > 
> > > Right now all we could do is:
> > > a) say bad luck old hardware (not nice)
> > > b) make super complex alternative builds with and without dpdk support
> > > c) ask the DPDK project to work on non sse4.2 (unlikely and too late
> > > in 2023 I guess)
> > > d) Somehow make the initialization graceful (that is what I'm RFC here)
> > > 
> > > If we could manage to get that DPDK to ensure the lib loading paths
> > > are SSE4.2 free.
> > > Then we could check the capabilities on the actual initialization and
> > > return a proper bad result instead of a crash.
> > > Due to that only real-users of DPDK would be required to have
> > > sufficiently new hardware.
> > > And OTOH users of software that links, but in the current config would
> > > not use DPDK would suffer less.
> > > 
> > > WDYT?
> > > > Maybe it has been already discussed and I did neither remember nor find it?
> > >   
> > It certainly hasn't been discussed previously, but there is meant to be
> > support for this in EAL init itself. Almost the first function called
> > from eal_init() is "rte_cpu_is_supported()" [1] which checks the build-time
> > CPU flags against those of the current system.
> > Unfortunately, from the error message you are getting, that doesn't seem to
> > be working ok in the case of SSE4.2. It seems the compiler is inserting
> > SSE4 instructions before we even get to that point. :-(
> > 
> > Perhaps we need to move eal init to a new file, and compile it (and the
> > cpuflag checks) with very minimal CPU flags.
> >   
> 
> Following up to my own mail...
> 
> I believe we may be able to solve this easier by maybe using the "target"
> attribute for those functions. For x86 builds I don't see why eal init
> cannot be compiled for an earlier SSE version, (march=core2, perhaps). It's
> not a performance-sensitive function.
> 
> Thoughts?
> /Bruce

The error originates from some RTE_INIT() routine called on library load.
They can also be augmented with the "target" attribute
and a check before calling the actual code supplied by the DPDK developer.
The latter is needed because we can't ensure (systematically)
that this code doesn't call some external function that uses SSE4.2.
As for rte_eal_init(), I think the check there is enough with one big "if":
main() must also be compiled for the generic CPU to get there.
So app developers can't be completely freed from thinking about this.
BTW, rte_cpu_is_supported() itself is not protected
against being compiled into unsupported instructions :)
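
To make the constructor idea above concrete, here is a minimal sketch
(GCC/Clang, not the actual RTE_INIT() definition): the constructor
wrapper is built without SSE4.2 and only calls the developer-supplied
body after a run-time check. GUARDED_INIT, NO_SSE42, cpu_is_new_enough
and demo_state are hypothetical names, and the choice of "no-sse4.2" as
the attribute value is an assumption.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#define NO_SSE42 __attribute__((target("no-sse4.2")))
#else
#define NO_SSE42
#endif

/* Run-time check, itself built without SSE4.2 on x86. */
static NO_SSE42 bool cpu_is_new_enough(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init(); /* needed when called from a constructor */
	return __builtin_cpu_supports("sse4.2") != 0;
#else
	return true;
#endif
}

/* Hypothetical wrapper: declares the body, defines a constructor that
 * checks the CPU first, then opens the body definition for the user. */
#define GUARDED_INIT(name)                                      \
	static void name##_body(void);                          \
	static NO_SSE42 __attribute__((constructor))            \
	void name(void)                                         \
	{                                                       \
		if (cpu_is_new_enough())                        \
			name##_body();                          \
	}                                                       \
	static void name##_body(void)

static int demo_state; /* stays 0 when the CPU check fails */

GUARDED_INIT(demo_init)
{
	demo_state = 1; /* code that may freely use newer instructions */
}
```

On an unsupported CPU the body is simply skipped at load time, so a
later explicit init call still has to report the failure.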


RE: [PATCH v5 1/3] ethdev: add API for buffer recycle mode

2023-03-30 Thread Morten Brørup
> From: Feifei Wang [mailto:feifei.wa...@arm.com]
> Sent: Thursday, 30 March 2023 11.31
> 
> > From: Morten Brørup 
> > Sent: Thursday, March 30, 2023 3:19 PM
> >
> > > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > > Sent: Thursday, 30 March 2023 08.30
> > >
> >
> > [...]
> >
> > > +/**
> > > + * @internal
> > > + * Rx routine for rte_eth_dev_buf_recycle().
> > > + * Refill Rx descriptors in buffer recycle mode.
> > > + *
> > > + * @note
> > > + * This API can only be called by rte_eth_dev_buf_recycle().
> > > + * Before calling this API, rte_eth_tx_buf_stash() should be
> > > + * called to stash Tx used buffers into Rx buffer ring.
> > > + *
> > > + * When this functionality is not implemented in the driver, the return
> > > + * buffer number is 0.
> > > + *
> > > + * @param port_id
> > > + *   The port identifier of the Ethernet device.
> > > + * @param queue_id
> > > + *   The index of the receive queue.
> > > + *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
> > > + *   to rte_eth_dev_configure().
> > > + * @param nb
> > > + *   The number of Rx descriptors to be refilled.
> > > + * @return
> > > + *   The number of Rx descriptors correctly refilled.
> > > + *   - ENODEV: bad port or queue (only if compiled with debug).
> >
> > If you want errors reported by the return value, the function return type
> > cannot be uint16_t.
> Agree. Actually, in the code path, if errors happen, the function will return 0.
> For this description line, I referred to the 'rte_eth_tx_prepare' notes. Maybe we
> should delete this line.
> 
> >
> > > + */
> > > +static inline uint16_t rte_eth_rx_descriptors_refill(uint16_t port_id,
> > > + uint16_t queue_id, uint16_t nb)
> > > +{
> > > + struct rte_eth_fp_ops *p;
> > > + void *qd;
> > > +
> > > +#ifdef RTE_ETHDEV_DEBUG_RX
> > > + if (port_id >= RTE_MAX_ETHPORTS ||
> > > + queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> > > + RTE_ETHDEV_LOG(ERR,
> > > + "Invalid port_id=%u or queue_id=%u\n",
> > > + port_id, queue_id);
> > > + rte_errno = ENODEV;
> > > + return 0;
> >
> > If p->rx_descriptors_refill() is likely to return 0, this function should not use 0
> > as return value to indicate errors.
> However, following the DPDK code style in ethdev, most of the APIs are written like this.
> For example, 'rte_eth_rx/tx_burst', 'rte_eth_tx_prepare'.
> 
> I'm also unsure what the return type should be, since I want
> to indicate errors and show the processed buffer number.

OK. Thanks for the references.

Looking at rte_eth_rx/tx_burst(), you could follow the same conventions here, i.e.:
- Use uint16_t as return type.
- Return 0 on error.
- Do not set rte_errno.
- Remove the "ENODEV" line from the @return description.
- Use RTE_ETHDEV_LOG(ERR,...) as the only method to indicate errors.

I now see that you follow the convention of rte_eth_tx_prepare(). This is also 
perfectly fine; then you just need to update the description of @return to 
mention that the error value is set in rte_errno if a value less than 'nb' is 
returned.
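
A minimal sketch of the second convention (a count is returned, and a
value below 'nb' means an error whose code the caller can read back).
This is not ethdev code: refill_prepare_style and MAX_PORTS are made-up
names, and plain errno stands in for rte_errno.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical limit for this sketch */

/* Returns how many of the 'nb' descriptors were handled. A return
 * value below 'nb' signals an error, and errno then holds the code. */
static uint16_t refill_prepare_style(uint16_t port_id, uint16_t nb)
{
	if (port_id >= MAX_PORTS) {
		errno = ENODEV;
		return 0;
	}
	/* A real driver callback could stop early; this one never does. */
	return nb;
}
```

The caller only looks at errno when the returned count is smaller than
what it asked for.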

> 
> >
> > > + }
> > > +#endif
> > > +
> > > + p = &rte_eth_fp_ops[port_id];
> > > + qd = p->rxq.data[queue_id];
> > > +
> > > +#ifdef RTE_ETHDEV_DEBUG_RX
> > > + if (!rte_eth_dev_is_valid_port(port_id)) {
> > > + RTE_ETHDEV_LOG(ERR, "Invalid Rx port_id=%u\n", port_id);
> > > + rte_errno = ENODEV;
> > > + return 0;
> > > + }
> > > +
> > > + if (qd == NULL) {
> > > + RTE_ETHDEV_LOG(ERR, "Invalid Rx queue_id=%u for port_id=%u\n",
> > > + queue_id, port_id);
> > > + rte_errno = ENODEV;
> > > + return 0;
> > > + }
> > > +#endif
> > > +
> > > + if (p->rx_descriptors_refill == NULL)
> > > + return 0;
> > > +
> > > + return p->rx_descriptors_refill(qd, nb);
> > > +}

When does p->rx_descriptors_refill() return anything other than 'nb'?

If p->rx_descriptors_refill() always succeeds (and thus always returns 'nb'), 
you could make its return type void. And thus, you could also make the return 
type of rte_eth_rx_descriptors_refill() void.

> > > +
> > >  /**@{@name Rx hardware descriptor states
> > >   * @see rte_eth_rx_descriptor_status
> > >   */
> > > @@ -6483,6 +6597,122 @@ rte_eth_tx_buffer(uint16_t port_id, uint16_t queue_id,
> > >   return rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
> > >  }
> > >
> > > +/**
> > > + * @internal
> > > + * Tx routine for rte_eth_dev_buf_recycle().
> > > + * Stash Tx used buffers into Rx buffer ring in buffer recycle mode.
> > > + *
> > > + * @note
> > > + * This API can only be called by rte_eth_dev_buf_recycle().
> > > + * After calling this API, rte_eth_rx_descriptors_refill() should be
> > > + * called to refill Rx ring descriptors.
> > > + *
> > > + * When this functionality is not implemented in the driver, the return
> > > + * buffer number is 0.
> > > + *
> > > + * @param port_id
> > > + *   The port identifier of the Ethernet device.
> > > + * @param queue_i

Re: [PATCH 2/2] net/gve: update copyright holders

2023-03-30 Thread Rushil Gupta
We were just trying to comply with the BSD license to get
rid of the exception. You have the MIT license for control path/admin-queue
code. Since the admin-queue path is similar across Linux, FreeBSD and DPDK, the
code is similar but not exactly the same.
We are about to upstream driver code to FreeBSD under the BSD license as well,
so you will see this code under the BSD license soon. I will consult the
lawyers on my end as well.

On Thu, Mar 30, 2023 at 6:14 AM Thomas Monjalon  wrote:

> 30/03/2023 09:20, Guo, Junfeng:
> > From: Thomas Monjalon 
> > > 28/03/2023 11:35, Guo, Junfeng:
> > > > The background is that, in the past (DPDK 22.11) we didn't get the approval
> > > > of license from Google, thus chose the MIT License for the base code, and
> > > > BSD-3 License for GVE common code (without the files in /base folder).
> > > > We also left the copyright holder of base code just to Google Inc, and made
> > > > Intel as the copyright holder of GVE common code (without /base folder).
> > > >
> > > > Today we are working together for GVE dev and maintaining. And we got
> > > > the approval of BSD-3 License from Google for the base code.
> > > > Thus we decided to 1) switch the License of GVE base code from MIT to BSD-3;
> > > > 2) add Google LLC as one of the copyright holders for GVE common code.
> > >
> > > Do you realize we had lengthy discussions in the Technical Board,
> > > the Governing Board, and with lawyers, just for this unneeded exception?
> > >
> > > Now looking at the patches, there seem to be some big mistakes like
> > > removing some copyright. I don't understand how it can be taken so
> > > lightly.
> > >
> > > I regret how fast we were, next time we will surely operate differently.
> > > If you want to improve the reputation of this driver,
> > > please ask other copyright holders to be more active and responsive.
> > >
> >
> > Really sorry for causing such severe trouble.
> >
> > Yes, we did take lots of efforts in the Technical Board and the Governing
> > Board about this MIT exception. We really appreciate that.
> >
> > About this patch set, it is my severe mistake to switch the MIT License
> > directly for the upstream-ed code in community, in the wrong way.
> > In the past we upstream-ed this driver with MIT License followed from
> > the kernel community's gve driver base code. And now we want to
> > use the code with BSD-3 License (approved by Google).
> > So I suppose that the correct way may be 1) first remove all these code
> > under MIT License and 2) then add the new files under BSD-3 License.
>
> Is the code under BSD different from the MIT code?
> If it is the same with a new approved license, you can just change the
> license.
>
> > Please correct me if there are still misunderstanding in my statement.
> > Thanks Thomas for pointing out my mistake. I'll be careful to fix this.
> >
> > Copyright holder for the gve base code will stay unchanged. Google LLC
> > will be added as one of the copyright holders for the gve common code.
> > @Rushil Gupta Please also be more active and responsive for the code
> > review and contribution in the community. Thanks!
>
>
>
>


Re: [PATCH 2/2] net/gve: update copyright holders

2023-03-30 Thread Rushil Gupta
On Thu, Mar 30, 2023 at 6:14 AM Thomas Monjalon  wrote:

> 30/03/2023 09:20, Guo, Junfeng:
> > From: Thomas Monjalon 
> > > 28/03/2023 11:35, Guo, Junfeng:
> > > > The background is that, in the past (DPDK 22.11) we didn't get the approval
> > > > of license from Google, thus chose the MIT License for the base code, and
> > > > BSD-3 License for GVE common code (without the files in /base folder).
> > > > We also left the copyright holder of base code just to Google Inc, and made
> > > > Intel as the copyright holder of GVE common code (without /base folder).
> > > >
> > > > Today we are working together for GVE dev and maintaining. And we got
> > > > the approval of BSD-3 License from Google for the base code.
> > > > Thus we decided to 1) switch the License of GVE base code from MIT to BSD-3;
> > > > 2) add Google LLC as one of the copyright holders for GVE common code.
> > >
> > > Do you realize we had lengthy discussions in the Technical Board,
> > > the Governing Board, and with lawyers, just for this unneeded exception?
> > >
> > > Now looking at the patches, there seem to be some big mistakes like
> > > removing some copyright. I don't understand how it can be taken so
> > > lightly.
> > >
> > > I regret how fast we were, next time we will surely operate differently.
> > > If you want to improve the reputation of this driver,
> > > please ask other copyright holders to be more active and responsive.
> > >
> >
> > Really sorry for causing such severe trouble.
> >
> > Yes, we did take lots of efforts in the Technical Board and the Governing
> > Board about this MIT exception. We really appreciate that.
> >
> > About this patch set, it is my severe mistake to switch the MIT License
> > directly for the upstream-ed code in community, in the wrong way.
> > In the past we upstream-ed this driver with MIT License followed from
> > the kernel community's gve driver base code. And now we want to
> > use the code with BSD-3 License (approved by Google).
> > So I suppose that the correct way may be 1) first remove all these code
> > under MIT License and 2) then add the new files under BSD-3 License.
>
> Is the code under BSD different from the MIT code?
> If it is the same with a new approved license, you can just change the
> license.
>
> > Please correct me if there are still misunderstanding in my statement.
> > Thanks Thomas for pointing out my mistake. I'll be careful to fix this.
> >
> > Copyright holder for the gve base code will stay unchanged. Google LLC
> > will be added as one of the copyright holders for the gve common code.
> > @Rushil Gupta Please also be more active and responsive for the code
> > review and contribution in the community. Thanks!
>
>
>
We were just trying to comply with the BSD license to get
rid of the exception. You have the MIT license for control path/admin-queue
code. Since the admin-queue path is similar across Linux, FreeBSD and DPDK, the
code is similar but not exactly the same.
We are about to upstream driver code to FreeBSD under the BSD license as well,
so you will see this code under the BSD license soon. I will consult the
lawyers on my end as well.


RE: [PATCH v5 1/3] ethdev: add API for buffer recycle mode

2023-03-30 Thread Morten Brørup
> From: Morten Brørup
> Sent: Thursday, 30 March 2023 17.15
> 
> > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > Sent: Thursday, 30 March 2023 11.31
> >
> > > From: Morten Brørup 
> > > Sent: Thursday, March 30, 2023 3:19 PM
> > >
> > > > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > > > Sent: Thursday, 30 March 2023 08.30
> > > >
> > >
> > > [...]
> > >
> > > > +/**
> > > > + * @internal
> > > > + * Rx routine for rte_eth_dev_buf_recycle().
> > > > + * Refill Rx descriptors in buffer recycle mode.
> > > > + *
> > > > + * @note
> > > > + * This API can only be called by rte_eth_dev_buf_recycle().
> > > > + * Before calling this API, rte_eth_tx_buf_stash() should be
> > > > + * called to stash Tx used buffers into Rx buffer ring.
> > > > + *
> > > > + * When this functionality is not implemented in the driver, the return
> > > > + * buffer number is 0.
> > > > + *
> > > > + * @param port_id
> > > > + *   The port identifier of the Ethernet device.
> > > > + * @param queue_id
> > > > + *   The index of the receive queue.
> > > > + *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
> > > > + *   to rte_eth_dev_configure().
> > > > + * @param nb
> > > > + *   The number of Rx descriptors to be refilled.
> > > > + * @return
> > > > + *   The number of Rx descriptors correctly refilled.
> > > > + *   - ENODEV: bad port or queue (only if compiled with debug).
> > >
> > > If you want errors reported by the return value, the function return type
> > > cannot be uint16_t.
> > Agree. Actually, in the code path, if errors happen, the function will return 0.
> > For this description line, I referred to the 'rte_eth_tx_prepare' notes. Maybe we
> > should delete this line.
> >
> > >
> > > > + */
> > > > +static inline uint16_t rte_eth_rx_descriptors_refill(uint16_t port_id,
> > > > +   uint16_t queue_id, uint16_t nb)
> > > > +{
> > > > +   struct rte_eth_fp_ops *p;
> > > > +   void *qd;
> > > > +
> > > > +#ifdef RTE_ETHDEV_DEBUG_RX
> > > > +   if (port_id >= RTE_MAX_ETHPORTS ||
> > > > +   queue_id >= RTE_MAX_QUEUES_PER_PORT) {
> > > > +   RTE_ETHDEV_LOG(ERR,
> > > > +   "Invalid port_id=%u or queue_id=%u\n",
> > > > +   port_id, queue_id);
> > > > +   rte_errno = ENODEV;
> > > > +   return 0;
> > >
> > > If p->rx_descriptors_refill() is likely to return 0, this function should not use 0
> > > as return value to indicate errors.
> > However, following the DPDK code style in ethdev, most of the APIs are written like this.
> > For example, 'rte_eth_rx/tx_burst', 'rte_eth_tx_prepare'.
> >
> > I'm also unsure what the return type should be, since I want
> > to indicate errors and show the processed buffer number.
> 
> OK. Thanks for the references.
> 
> Looking at rte_eth_rx/tx_burst(), you could follow the same conventions here,
> i.e.:
> - Use uint16_t as return type.
> - Return 0 on error.
> - Do not set rte_errno.
> - Remove the "ENODEV" line from the @return description.
> - Use RTE_ETHDEV_LOG(ERR,...) as the only method to indicate errors.
> 
> I now see that you follow the convention of rte_eth_tx_prepare(). This is also
> perfectly fine; then you just need to update the description of @return to
> mention that the error value is set in rte_errno if a value less than 'nb' is
> returned.

After further consideration, I have changed my mind:

The primary purpose of rte_eth_tx_prepare() is to test if a packet burst is 
valid, so the ability to return an error value is a natural requirement.

This is not the purpose of your functions. The purpose of your functions resembles
that of rte_eth_rx/tx_burst(), where there is no requirement to return an error value.
So you should follow the convention of rte_eth_rx/tx_burst(), as I just
suggested.
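
A minimal sketch of that burst-style convention: uint16_t return, 0 on
error, no error variable set, and a log message as the only error
indication. This is not ethdev code; port_ops, ports, refill_all,
refill_burst_style and MAX_PORTS are made-up names, and fprintf stands
in for RTE_ETHDEV_LOG.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_PORTS 4 /* hypothetical limit for this sketch */

typedef uint16_t (*refill_fn)(void *queue, uint16_t nb);

/* Hypothetical per-port table of driver callbacks. */
struct port_ops {
	refill_fn refill; /* NULL when the driver lacks the feature */
	void *queue;
};

static struct port_ops ports[MAX_PORTS];

/* Example driver callback that always refills everything requested. */
static uint16_t refill_all(void *queue, uint16_t nb)
{
	(void)queue;
	return nb;
}

/* Burst-style wrapper: errors are logged and reported as 0 only. */
static uint16_t refill_burst_style(uint16_t port_id, uint16_t nb)
{
	if (port_id >= MAX_PORTS) {
		fprintf(stderr, "Invalid port_id=%u\n", (unsigned int)port_id);
		return 0;
	}
	if (ports[port_id].refill == NULL)
		return 0; /* not implemented in the driver */
	return ports[port_id].refill(ports[port_id].queue, nb);
}
```

The caller cannot tell "error" from "nothing refilled" here, which is
the same trade-off rte_eth_rx/tx_burst() makes.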

> 
> >
> > >
> > > > +   }
> > > > +#endif
> > > > +
> > > > +   p = &rte_eth_fp_ops[port_id];
> > > > +   qd = p->rxq.data[queue_id];
> > > > +
> > > > +#ifdef RTE_ETHDEV_DEBUG_RX
> > > > +   if (!rte_eth_dev_is_valid_port(port_id)) {
> > > > +   RTE_ETHDEV_LOG(ERR, "Invalid Rx port_id=%u\n", port_id);
> > > > +   rte_errno = ENODEV;
> > > > +   return 0;
> > > > +   }
> > > > +
> > > > +   if (qd == NULL) {
> > > > +   RTE_ETHDEV_LOG(ERR, "Invalid Rx queue_id=%u for port_id=%u\n",
> > > > +   queue_id, port_id);
> > > > +   rte_errno = ENODEV;
> > > > +   return 0;
> > > > +   }
> > > > +#endif
> > > > +
> > > > +   if (p->rx_descriptors_refill == NULL)
> > > > +   return 0;
> > > > +
> > > > +   return p->rx_descriptors_refill(qd, nb);
> > > > +}
> 
> When does p->rx_descriptors_refill() return anything else than 'nb'?
> 
> If p->rx_descriptors_refill() always succeeds (and thus always returns 'nb'),
> you could make its return type void. And thus, you could also make the return
> type of rte_eth_rx_descriptors_refill

[PATCH v1] doc: update release notes for 23.03

2023-03-30 Thread John McNamara
Fix grammar, spelling and formatting of DPDK 23.03 release notes.

Signed-off-by: John McNamara 
---

* Minor fixes/changes only.


 doc/guides/rel_notes/release_23_03.rst | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index b93903447d..a31d34f5f5 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -71,7 +71,7 @@ New Features
 * **Added platform bus support.**
 
   A platform bus provides a way to use Linux platform devices which
-  are compatible with vfio-platform kernel driver.
+  are compatible with the vfio-platform kernel driver.
 
 * **Added ARM support for power monitor in the power management library.**
 
@@ -80,6 +80,9 @@ New Features
 
 * **Added Ethernet link speed for 400 Gb/s.**
 
+Added Ethernet link speed for 400 Gb/s since there are some devices already
+supporting that speed and it is well standardized in IEEE.
+
 * **Added support for mapping a queue with an aggregated port.**
 
   * Introduced new function ``rte_eth_dev_count_aggr_ports()``
@@ -88,6 +91,7 @@ New Features
 to map a Tx queue with an aggregated port of the DPDK port.
   * Added Rx affinity flow matching of an aggregated port.
 
+
 * **Added flow matching of IPv6 routing extension.**
 
   Added ``RTE_FLOW_ITEM_TYPE_IPV6_ROUTING_EXT``
@@ -113,7 +117,7 @@ New Features
 
 * **Added cross-port indirect action in asynchronous flow API.**
 
-  * Allowed to share indirect actions between ports by passing
+  * Enabled the ability to share indirect actions between ports by passing
 the flag ``RTE_FLOW_PORT_FLAG_SHARE_INDIRECT`` to ``rte_flow_configure()``.
   * Added ``host_port_id`` in ``rte_flow_port_attr`` structure
 to reference the port hosting the shared objects.
@@ -215,14 +219,14 @@ New Features
 
 * **Updated the eventdev reconfigure logic for service based adapters.**
 
-  * eventdev reconfig logic is enhanced to increment the
+  * The eventdev reconfigure logic was enhanced to increment the
 ``rte_event_dev_config::nb_single_link_event_port_queues`` parameter
 if event port config is of type ``RTE_EVENT_PORT_CFG_SINGLE_LINK``.
   * With this change, the application no longer needs to account for the
 ``rte_event_dev_config::nb_single_link_event_port_queues`` parameter
 required for eth_rx, eth_tx, crypto and timer eventdev adapters.
 
-* **Added pcap trace support in graph library.**
+* **Added PCAP trace support in graph library.**
 
   * Added support to capture packets at each graph node with packet metadata 
and
 node name.
@@ -263,8 +267,8 @@ API Changes
 
 * The telemetry command ``/eal/heap_info`` is fixed to print ``Heap_id``.
 
-* The experimental function ``rte_pcapng_copy`` was updated to support comment
-  section in enhanced packet block in the pcapng library.
+* The experimental function ``rte_pcapng_copy`` was updated to support a comment
+  section in enhanced packet block in the PcapNG library.
 
 * The experimental structures ``struct rte_graph_param``, ``struct rte_graph``
   and ``struct graph`` were updated to support pcap trace in the graph library.
-- 
2.31.1



Re: [PATCH v1] doc: update release notes for 23.03

2023-03-30 Thread Zhang, Fan

On 3/30/2023 5:09 PM, John McNamara wrote:

Fix grammar, spelling and formatting of DPDK 23.03 release notes.

Signed-off-by: John McNamara 
---

* Minor fixes/changes only.


  doc/guides/rel_notes/release_23_03.rst | 16 ++--
  1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/doc/guides/rel_notes/release_23_03.rst 
b/doc/guides/rel_notes/release_23_03.rst
index b93903447d..a31d34f5f5 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -71,7 +71,7 @@ New Features
  * **Added platform bus support.**
  
A platform bus provides a way to use Linux platform devices which

-  are compatible with vfio-platform kernel driver.
+  are compatible with the do  vfio-platform kernel driver.


Hi John,

Looks like there is a double-spacing problem between "do" and
"vfio-platform". Also, I suppose "the vfio-platform" is sufficient.


Other than that,

Acked-by: Fan Zhang 



[Bug 1206] Multiple large memory block allocations using rte_malloc can lead to memory out-of-bounds issues.

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1206

Bug ID: 1206
   Summary: Multiple large memory block allocations using
rte_malloc can lead to memory out-of-bounds issues.
   Product: DPDK
   Version: 21.11
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: major
  Priority: Normal
 Component: core
  Assignee: dev@dpdk.org
  Reporter: killerst...@gmail.com
  Target Milestone: ---

[root@localhost bin]# lscpu
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):8
On-line CPU(s) list:   0-7
Thread(s) per core:2
Core(s) per socket:4
Socket(s): 1
NUMA node(s):  1
Vendor ID: GenuineIntel
CPU family:6
Model: 58
Model name:Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Stepping:  9
CPU MHz:   3700.073
CPU max MHz:   3900.
CPU min MHz:   1600.
BogoMIPS:  6784.24
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  8192K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer
aes xsave avx f16c rdrand lahf_lm epb ssbd ibrs ibpb tpr_shadow vnmi
flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
[root@localhost bin]# 

The pdpe1gb CPU flag (1 GB hugepages) is not supported.



There are many free 2M HugePages.

HugePages_Total:6656
HugePages_Free: 5682
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:  236476 kB
DirectMap2M:33228800 kB




Test code:

char *t_mem1;
char *t_mem2;
int t_size = 1024 * 1024 * 1024;

t_mem1 = rte_malloc(NULL, t_size, RTE_CACHE_LINE_SIZE);
t_mem2 = rte_malloc(NULL, t_size, RTE_CACHE_LINE_SIZE);
printf("rte_malloc1 t_mem1=%p \n", t_mem1);
printf("rte_malloc1 t_mem2=%p \n", t_mem2);

memset(t_mem1, 0, t_size);
memset(t_mem2, 1, t_size);

int t_i;
for(t_i=0;t_i

[RFC 0/4] add frequency adjustment support for PTP

2023-03-30 Thread Simei Su
[RFC 1/4] ethdev: add frequency adjustment API.
[RFC 2/4] net/ice: add frequency adjustment support for PTP.
[RFC 3/4] examples/ptpclient: refine application.
[RFC 4/4] examples/ptpclient: add frequency adjustment support.

Simei Su (4):
  ethdev: add frequency adjustment API
  net/ice: add frequency adjustment support for PTP
  examples/ptpclient: refine application
  examples/ptpclient: add frequency adjustment support

 drivers/net/ice/ice_ethdev.c | 111 +---
 examples/ptpclient/ptpclient.c   | 222 +--
 lib/ethdev/ethdev_driver.h   |   5 +
 lib/ethdev/ethdev_trace.h|   9 ++
 lib/ethdev/ethdev_trace_points.c |   3 +
 lib/ethdev/rte_ethdev.c  |  18 
 lib/ethdev/rte_ethdev.h  |  19 
 7 files changed, 317 insertions(+), 70 deletions(-)

-- 
2.9.5



[RFC 1/4] ethdev: add frequency adjustment API

2023-03-30 Thread Simei Su
This patch adds a frequency adjustment API for high-accuracy PTP.

Signed-off-by: Simei Su 
---
 lib/ethdev/ethdev_driver.h   |  5 +
 lib/ethdev/ethdev_trace.h|  9 +
 lib/ethdev/ethdev_trace_points.c |  3 +++
 lib/ethdev/rte_ethdev.c  | 18 ++
 lib/ethdev/rte_ethdev.h  | 19 +++
 5 files changed, 54 insertions(+)

diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 2c9d615..b1451d2 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -633,6 +633,9 @@ typedef int (*eth_timesync_read_tx_timestamp_t)(struct 
rte_eth_dev *dev,
 /** @internal Function used to adjust the device clock. */
 typedef int (*eth_timesync_adjust_time)(struct rte_eth_dev *dev, int64_t);
 
+/** @internal Function used to adjust the clock frequency. */
+typedef int (*eth_timesync_adjust_freq)(struct rte_eth_dev *dev, int64_t);
+
 /** @internal Function used to get time from the device clock. */
 typedef int (*eth_timesync_read_time)(struct rte_eth_dev *dev,
  struct timespec *timestamp);
@@ -1344,6 +1347,8 @@ struct eth_dev_ops {
eth_timesync_read_tx_timestamp_t timesync_read_tx_timestamp;
/** Adjust the device clock */
eth_timesync_adjust_time   timesync_adjust_time;
+   /** Adjust the clock frequency */
+   eth_timesync_adjust_freq   timesync_adjust_freq;
/** Get the device clock time */
eth_timesync_read_time timesync_read_time;
/** Set the device clock time */
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 3dc7d02..d92554b 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -2196,6 +2196,15 @@ RTE_TRACE_POINT_FP(
rte_trace_point_emit_int(ret);
 )
 
+/* Called in loop in examples/ptpclient */
+RTE_TRACE_POINT_FP(
+   rte_eth_trace_timesync_adjust_freq,
+   RTE_TRACE_POINT_ARGS(uint16_t port_id, int64_t ppm, int ret),
+   rte_trace_point_emit_u16(port_id);
+   rte_trace_point_emit_i64(ppm);
+   rte_trace_point_emit_int(ret);
+)
+
 /* Called in loop in app/test-flow-perf */
 RTE_TRACE_POINT_FP(
rte_flow_trace_create,
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 61010ca..c01b5d3 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -406,6 +406,9 @@ 
RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_read_tx_timestamp,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_adjust_time,
lib.ethdev.timesync_adjust_time)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_adjust_freq,
+   lib.ethdev.timesync_adjust_freq)
+
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_timesync_read_time,
lib.ethdev.timesync_read_time)
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 4d03255..f5934bb 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6017,6 +6017,24 @@ rte_eth_timesync_adjust_time(uint16_t port_id, int64_t 
delta)
 }
 
 int
+rte_eth_timesync_adjust_freq(uint16_t port_id, int64_t ppm)
+{
+   struct rte_eth_dev *dev;
+   int ret;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+   dev = &rte_eth_devices[port_id];
+
+   if (*dev->dev_ops->timesync_adjust_freq == NULL)
+   return -ENOTSUP;
+   ret = eth_err(port_id, (*dev->dev_ops->timesync_adjust_freq)(dev, ppm));
+
+   rte_eth_trace_timesync_adjust_freq(port_id, ppm, ret);
+
+   return ret;
+}
+
+int
 rte_eth_timesync_read_time(uint16_t port_id, struct timespec *timestamp)
 {
struct rte_eth_dev *dev;
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 99fe9e2..9737461 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -5102,6 +5102,25 @@ int rte_eth_timesync_read_tx_timestamp(uint16_t port_id,
 int rte_eth_timesync_adjust_time(uint16_t port_id, int64_t delta);
 
 /**
+ * Adjust the clock increment rate on an Ethernet device.
+ *
+ * This is usually used in conjunction with other Ethdev timesync functions to
+ * synchronize the device time using the IEEE1588/802.1AS protocol.
+ *
+ * @param port_id
+ *  The port identifier of the Ethernet device.
+ * @param ppm
+ *  Parts per million with 16-bit fractional field
+ *
+ * @return
+ *   - 0: Success.
+ *   - -ENODEV: The port ID is invalid.
+ *   - -EIO: if device is removed.
+ *   - -ENOTSUP: The function is not supported by the Ethernet driver.
+ */
+int rte_eth_timesync_adjust_freq(uint16_t port_id, int64_t ppm);
+
+/**
  * Read the time from the timesync clock on an Ethernet device.
  *
  * This is usually used in conjunction with other Ethdev timesync functions to
-- 
2.9.5
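
The @param text above describes ppm as "parts per million with 16-bit
fractional field". A minimal sketch of that encoding follows, assuming it
matches the scaled-ppm convention used by the Linux PTP clock interface
(value = ppm * 2^16). The helper names are hypothetical and are not part of
this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not part of the patch. They assume the "ppm"
 * argument is parts per million scaled by 2^16 (16-bit binary fraction).
 */

/* Convert parts per billion to scaled ppm. */
static inline int64_t
ppb_to_scaled_ppm(int64_t ppb)
{
	/* Guard the multiplication below against int64_t overflow. */
	assert(ppb > INT64_MIN / 65536 && ppb < INT64_MAX / 65536);

	/* 1 ppm = 1000 ppb; multiply instead of shifting so that
	 * negative values stay well defined. */
	return (ppb * 65536) / 1000;
}

/* Convert scaled ppm back to parts per billion. */
static inline int64_t
scaled_ppm_to_ppb(int64_t scaled_ppm)
{
	assert(scaled_ppm > INT64_MIN / 1000 && scaled_ppm < INT64_MAX / 1000);

	return (scaled_ppm * 1000) / 65536;
}
```

Under that assumption, a servo that computes a correction in ppb would pass
ppb_to_scaled_ppm(ppb) as the ppm argument of rte_eth_timesync_adjust_freq().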



[RFC 2/4] net/ice: add frequency adjustment support for PTP

2023-03-30 Thread Simei Su
Add ice support for new ethdev API to adjust frequency for IEEE1588
PTP. Also, this patch reworks code for converting software update
to hardware update.

Signed-off-by: Simei Su 
---
 drivers/net/ice/ice_ethdev.c | 111 ---
 1 file changed, 72 insertions(+), 39 deletions(-)

diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index 9a88cf9..fa4d840 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -158,6 +158,7 @@ static int ice_timesync_read_rx_timestamp(struct 
rte_eth_dev *dev,
 static int ice_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
  struct timespec *timestamp);
 static int ice_timesync_adjust_time(struct rte_eth_dev *dev, int64_t delta);
+static int ice_timesync_adjust_freq(struct rte_eth_dev *dev, int64_t ppm);
 static int ice_timesync_read_time(struct rte_eth_dev *dev,
  struct timespec *timestamp);
 static int ice_timesync_write_time(struct rte_eth_dev *dev,
@@ -274,6 +275,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = {
.timesync_read_rx_timestamp   = ice_timesync_read_rx_timestamp,
.timesync_read_tx_timestamp   = ice_timesync_read_tx_timestamp,
.timesync_adjust_time = ice_timesync_adjust_time,
+   .timesync_adjust_freq = ice_timesync_adjust_freq,
.timesync_read_time   = ice_timesync_read_time,
.timesync_write_time  = ice_timesync_write_time,
.timesync_disable = ice_timesync_disable,
@@ -5840,23 +5842,6 @@ ice_timesync_enable(struct rte_eth_dev *dev)
}
}
 
-   /* Initialize cycle counters for system time/RX/TX timestamp */
-   memset(&ad->systime_tc, 0, sizeof(struct rte_timecounter));
-   memset(&ad->rx_tstamp_tc, 0, sizeof(struct rte_timecounter));
-   memset(&ad->tx_tstamp_tc, 0, sizeof(struct rte_timecounter));
-
-   ad->systime_tc.cc_mask = ICE_CYCLECOUNTER_MASK;
-   ad->systime_tc.cc_shift = 0;
-   ad->systime_tc.nsec_mask = 0;
-
-   ad->rx_tstamp_tc.cc_mask = ICE_CYCLECOUNTER_MASK;
-   ad->rx_tstamp_tc.cc_shift = 0;
-   ad->rx_tstamp_tc.nsec_mask = 0;
-
-   ad->tx_tstamp_tc.cc_mask = ICE_CYCLECOUNTER_MASK;
-   ad->tx_tstamp_tc.cc_shift = 0;
-   ad->tx_tstamp_tc.nsec_mask = 0;
-
ad->ptp_ena = 1;
 
return 0;
@@ -5871,14 +5856,13 @@ ice_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
struct ice_rx_queue *rxq;
uint32_t ts_high;
-   uint64_t ts_ns, ns;
+   uint64_t ts_ns;
 
rxq = dev->data->rx_queues[flags];
 
ts_high = rxq->time_high;
ts_ns = ice_tstamp_convert_32b_64b(hw, ad, 1, ts_high);
-   ns = rte_timecounter_update(&ad->rx_tstamp_tc, ts_ns);
-   *timestamp = rte_ns_to_timespec(ns);
+   *timestamp = rte_ns_to_timespec(ts_ns);
 
return 0;
 }
@@ -5891,7 +5875,7 @@ ice_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
struct ice_adapter *ad =
ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
uint8_t lport;
-   uint64_t ts_ns, ns, tstamp;
+   uint64_t ts_ns, tstamp;
const uint64_t mask = 0x;
int ret;
 
@@ -5904,8 +5888,7 @@ ice_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
}
 
ts_ns = ice_tstamp_convert_32b_64b(hw, ad, 1, (tstamp >> 8) & mask);
-   ns = rte_timecounter_update(&ad->tx_tstamp_tc, ts_ns);
-   *timestamp = rte_ns_to_timespec(ns);
+   *timestamp = rte_ns_to_timespec(ts_ns);
 
return 0;
 }
@@ -5913,12 +5896,66 @@ ice_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
 static int
 ice_timesync_adjust_time(struct rte_eth_dev *dev, int64_t delta)
 {
-   struct ice_adapter *ad =
-   ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+   struct ice_hw *hw = ICE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint8_t tmr_idx = hw->func_caps.ts_func_info.tmr_index_assoc;
+   uint32_t lo, lo2, hi;
+   uint64_t time, ns;
+   int ret;
+
+   if (delta > INT32_MAX || delta < INT32_MIN) {
+   lo = ICE_READ_REG(hw, GLTSYN_TIME_L(tmr_idx));
+   hi = ICE_READ_REG(hw, GLTSYN_TIME_H(tmr_idx));
+   lo2 = ICE_READ_REG(hw, GLTSYN_TIME_L(tmr_idx));
+
+   if (lo2 < lo) {
+   lo = ICE_READ_REG(hw, GLTSYN_TIME_L(tmr_idx));
+   hi = ICE_READ_REG(hw, GLTSYN_TIME_H(tmr_idx));
+   }
+
+   time = ((uint64_t)hi << 32) | lo;
+   ns = time + delta;
+
+   return ice_ptp_init_time(hw, ns);
+   }
+
+   ret = ice_ptp_adj_clock(hw, delta, true);
+   if (ret)
+   return -1;
+
+   return 0;
+}
 
-   ad->systime_tc.nsec += delta;
-   ad->rx_tstamp_tc.nsec += delta;
-   ad->tx_t

[RFC 3/4] examples/ptpclient: refine application

2023-03-30 Thread Simei Su
This patch reworks code to split delay request message parsing
from follow up message parsing.

Signed-off-by: Simei Su 
Signed-off-by: Wenjun Wu 
---
 examples/ptpclient/ptpclient.c | 48 --
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/examples/ptpclient/ptpclient.c b/examples/ptpclient/ptpclient.c
index cdf2da6..74a1bf5 100644
--- a/examples/ptpclient/ptpclient.c
+++ b/examples/ptpclient/ptpclient.c
@@ -382,21 +382,11 @@ parse_sync(struct ptpv2_data_slave_ordinary *ptp_data, 
uint16_t rx_tstamp_idx)
 static void
 parse_fup(struct ptpv2_data_slave_ordinary *ptp_data)
 {
-   struct rte_ether_hdr *eth_hdr;
-   struct rte_ether_addr eth_addr;
struct ptp_header *ptp_hdr;
-   struct clock_id *client_clkid;
struct ptp_message *ptp_msg;
-   struct delay_req_msg *req_msg;
-   struct rte_mbuf *created_pkt;
struct tstamp *origin_tstamp;
-   struct rte_ether_addr eth_multicast = ether_multicast;
-   size_t pkt_size;
-   int wait_us;
struct rte_mbuf *m = ptp_data->m;
-   int ret;
 
-   eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
ptp_hdr = (struct ptp_header *)(rte_pktmbuf_mtod(m, char *)
+ sizeof(struct rte_ether_hdr));
if (memcmp(&ptp_data->master_clock_id,
@@ -413,6 +403,26 @@ parse_fup(struct ptpv2_data_slave_ordinary *ptp_data)
ptp_data->tstamp1.tv_sec =
((uint64_t)ntohl(origin_tstamp->sec_lsb)) |
(((uint64_t)ntohs(origin_tstamp->sec_msb)) << 32);
+}
+
+static void
+send_delay_request(struct ptpv2_data_slave_ordinary *ptp_data)
+{
+   struct rte_ether_hdr *eth_hdr;
+   struct rte_ether_addr eth_addr;
+   struct ptp_header *ptp_hdr;
+   struct clock_id *client_clkid;
+   struct delay_req_msg *req_msg;
+   struct rte_mbuf *created_pkt;
+   struct rte_ether_addr eth_multicast = ether_multicast;
+   size_t pkt_size;
+   int wait_us;
+   struct rte_mbuf *m = ptp_data->m;
+   int ret;
+
+   eth_hdr = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
+   ptp_hdr = (struct ptp_header *)(rte_pktmbuf_mtod(m, char *)
+   + sizeof(struct rte_ether_hdr));
 
if (ptp_data->seqID_FOLLOWUP == ptp_data->seqID_SYNC) {
ret = rte_eth_macaddr_get(ptp_data->portid, &eth_addr);
@@ -550,12 +560,6 @@ parse_drsp(struct ptpv2_data_slave_ordinary *ptp_data)
((uint64_t)ntohl(rx_tstamp->sec_lsb)) |
(((uint64_t)ntohs(rx_tstamp->sec_msb)) << 32);
 
-   /* Evaluate the delta for adjustment. */
-   ptp_data->delta = delta_eval(ptp_data);
-
-   rte_eth_timesync_adjust_time(ptp_data->portid,
-ptp_data->delta);
-
ptp_data->current_ptp_port = ptp_data->portid;
 
/* Update kernel time if enabled in app parameters. */
@@ -568,6 +572,16 @@ parse_drsp(struct ptpv2_data_slave_ordinary *ptp_data)
}
 }
 
+static void
+ptp_adjust_time(struct ptpv2_data_slave_ordinary *ptp_data)
+{
+   /* Evaluate the delta for adjustment. */
+   ptp_data->delta = delta_eval(ptp_data);
+
+   rte_eth_timesync_adjust_time(ptp_data->portid,
+ptp_data->delta);
+}
+
 /* This function processes PTP packets, implementing slave PTP IEEE1588 L2
  * functionality.
  */
@@ -594,9 +608,11 @@ parse_ptp_frames(uint16_t portid, struct rte_mbuf *m) {
break;
case FOLLOW_UP:
parse_fup(&ptp_data);
+   send_delay_request(&ptp_data);
break;
case DELAY_RESP:
parse_drsp(&ptp_data);
+   ptp_adjust_time(&ptp_data);
print_clock_info(&ptp_data);
break;
default:
-- 
2.9.5



[RFC 4/4] examples/ptpclient: add frequency adjustment support

2023-03-30 Thread Simei Su
This patch adds a PI servo algorithm to support the frequency
adjustment API for IEEE1588 PTP.

For example, the command for starting ptpclient with the PI algorithm is:
./build/examples/dpdk-ptpclient -a :81:00.0 -c 1 -n 3 -- -T 0 -p 0x1
--controller=pi

Signed-off-by: Simei Su 
Signed-off-by: Wenjun Wu 
---
 examples/ptpclient/ptpclient.c | 178 +
 1 file changed, 161 insertions(+), 17 deletions(-)

diff --git a/examples/ptpclient/ptpclient.c b/examples/ptpclient/ptpclient.c
index 74a1bf5..3d074af 100644
--- a/examples/ptpclient/ptpclient.c
+++ b/examples/ptpclient/ptpclient.c
@@ -43,6 +43,28 @@
 #define KERNEL_TIME_ADJUST_LIMIT  2
 #define PTP_PROTOCOL 0x88F7
 
+#define KP 0.7
+#define KI 0.3
+
+enum servo_state {
+   SERVO_UNLOCKED,
+   SERVO_JUMP,
+   SERVO_LOCKED,
+};
+
+struct pi_servo {
+   double offset[2];
+   double local[2];
+   double drift;
+   int count;
+};
+
+enum controller_mode {
+   MODE_NONE,
+   MODE_PI,
+   MAX_ALL
+} mode;
+
 struct rte_mempool *mbuf_pool;
 uint32_t ptp_enabled_port_mask;
 uint8_t ptp_enabled_port_nb;
@@ -132,6 +154,9 @@ struct ptpv2_data_slave_ordinary {
uint8_t ptpset;
uint8_t kernel_time_set;
uint16_t current_ptp_port;
+   int64_t master_offset;
+   int64_t path_delay;
+   struct pi_servo *servo;
 };
 
 static struct ptpv2_data_slave_ordinary ptp_data;
@@ -290,36 +315,44 @@ print_clock_info(struct ptpv2_data_slave_ordinary 
*ptp_data)
ptp_data->tstamp3.tv_sec,
(ptp_data->tstamp3.tv_nsec));
 
-   printf("\nT4 - Master Clock.  %lds %ldns ",
+   printf("\nT4 - Master Clock.  %lds %ldns\n",
ptp_data->tstamp4.tv_sec,
(ptp_data->tstamp4.tv_nsec));
 
-   printf("\nDelta between master and slave clocks:%"PRId64"ns\n",
+   if (mode == MODE_NONE) {
+   printf("\nDelta between master and slave clocks:%"PRId64"ns\n",
ptp_data->delta);
 
-   clock_gettime(CLOCK_REALTIME, &sys_time);
-   rte_eth_timesync_read_time(ptp_data->current_ptp_port, &net_time);
+   clock_gettime(CLOCK_REALTIME, &sys_time);
+   rte_eth_timesync_read_time(ptp_data->current_ptp_port,
+  &net_time);
 
-   time_t ts = net_time.tv_sec;
+   time_t ts = net_time.tv_sec;
 
-   printf("\n\nComparison between Linux kernel Time and PTP:");
+   printf("\n\nComparison between Linux kernel Time and PTP:");
 
-   printf("\nCurrent PTP Time: %.24s %.9ld ns",
+   printf("\nCurrent PTP Time: %.24s %.9ld ns",
ctime(&ts), net_time.tv_nsec);
 
-   nsec = (int64_t)timespec64_to_ns(&net_time) -
+   nsec = (int64_t)timespec64_to_ns(&net_time) -
(int64_t)timespec64_to_ns(&sys_time);
-   ptp_data->new_adj = ns_to_timeval(nsec);
+   ptp_data->new_adj = ns_to_timeval(nsec);
+
+   gettimeofday(&ptp_data->new_adj, NULL);
 
-   gettimeofday(&ptp_data->new_adj, NULL);
+   time_t tp = ptp_data->new_adj.tv_sec;
 
-   time_t tp = ptp_data->new_adj.tv_sec;
+   printf("\nCurrent SYS Time: %.24s %.6ld ns",
+   ctime(&tp), ptp_data->new_adj.tv_usec);
 
-   printf("\nCurrent SYS Time: %.24s %.6ld ns",
-   ctime(&tp), ptp_data->new_adj.tv_usec);
+   printf("\nDelta between PTP and Linux Kernel time:%"PRId64"ns\n",
+   nsec);
+   }
 
-   printf("\nDelta between PTP and Linux Kernel time:%"PRId64"ns\n",
-   nsec);
+   if (mode == MODE_PI) {
+   printf("path delay: %"PRId64"ns\n", ptp_data->path_delay);
+   printf("master offset: %"PRId64"ns\n", ptp_data->master_offset);
+   }
 
printf("[Ctrl+C to quit]\n");
 
@@ -405,6 +438,76 @@ parse_fup(struct ptpv2_data_slave_ordinary *ptp_data)
(((uint64_t)ntohs(origin_tstamp->sec_msb)) << 32);
 }
 
+static double
+pi_sample(struct pi_servo *s, double offset, double local_ts,
+ enum servo_state *state)
+{
+   double ppb = 0.0;
+
+   switch (s->count) {
+   case 0:
+   s->offset[0] = offset;
+   s->local[0] = local_ts;
+   *state = SERVO_UNLOCKED;
+   s->count = 1;
+   break;
+   case 1:
+   s->offset[1] = offset;
+   s->local[1] = local_ts;
+   *state = SERVO_UNLOCKED;
+   s->count = 2;
+   break;
+   case 2:
+   s->drift += (s->offset[1] - s->offset[0]) /
+   (s->local[1] - s->local[0]);
+   *state = SERVO_UNLOCKED;
+   s->count = 3;
+   break;
+   case 3:
+   *state = SERVO_JUMP;
+   
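
pi_sample() is cut off at this point in the archive, so the locked-state
update is not visible. For readers unfamiliar with the technique, the
following is a generic proportional-integral step, not the patch's code. The
struct and function names are hypothetical, and the sign of the returned
correction is an assumption. The 0.7 and 0.3 gains in the usage note are the
KP and KI values defined by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic PI controller state (not the patch's pi_servo). */
struct pi_ctrl {
	double kp;       /* proportional gain */
	double ki;       /* integral gain */
	double integral; /* accumulated integral term */
};

/*
 * One PI step: returns a frequency correction for the measured offset.
 * The integral term is accumulated after the output is computed, so the
 * current sample contributes once through kp and once through ki.
 */
static double
pi_step(struct pi_ctrl *c, double offset_ns)
{
	double ki_term;
	double out;

	assert(c != NULL);

	ki_term = c->ki * offset_ns;
	out = c->kp * offset_ns + c->integral + ki_term;
	c->integral += ki_term;

	return out;
}
```

With kp = 0.7 and ki = 0.3, a first sample with an offset of 100 ns yields a
correction of about 100 and leaves about 30 in the integral term, which is
still applied on the next sample even if the offset has dropped to zero.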

RE: [PATCH 2/2] net/gve: update copyright holders

2023-03-30 Thread Guo, Junfeng



> -Original Message-
> From: Thomas Monjalon 
> Sent: Thursday, March 30, 2023 21:14
> To: Ferruh Yigit ; Zhang, Qi Z
> ; Wu, Jingjing ; Xing,
> Beilei ; Rushil Gupta ; Guo,
> Junfeng 
> Cc: dev@dpdk.org; Joshua Washington ; Jeroen
> de Borst 
> Subject: Re: [PATCH 2/2] net/gve: update copyright holders
> 
> 30/03/2023 09:20, Guo, Junfeng:
> > From: Thomas Monjalon 
> > > 28/03/2023 11:35, Guo, Junfeng:
> > > > The background is that, in the past (DPDK 22.11) we didn't get the
> > > approval
> > > > of license from Google, thus chose the MIT License for the base
> code,
> > > and
> > > > BSD-3 License for GVE common code (without the files in /base
> folder).
> > > > We also left the copyright holder of base code just to Google Inc,
> and
> > > made
> > > > Intel as the copyright holder of GVE common code (without /base
> > > folder).
> > > >
> > > > Today we are working together for GVE dev and maintaining. And
> we
> > > got
> > > > the approval of BSD-3 License from Google for the base code.
> > > > Thus we decided to 1) switch the License of GVE base code from MIT
> to
> > > BSD-3;
> > > > 2) add Google LLC as one of the copyright holders for GVE common
> > > code.
> > >
> > > Do you realize we had lengthy discussions in the Technical Board,
> > > the Governing Board, and with lawyers, just for this unneeded
> exception?
> > >
> > > Now looking at the patches, there seem to be some big mistakes like
> > > removing some copyright. I don't understand how it can be taken so
> > > lightly.
> > >
> > > I regret how fast we were, next time we will surely operate differently.
> > > If you want to improve the reputation of this driver,
> > > please ask other copyright holders to be more active and responsive.
> > >
> >
> > Really sorry for causing such severe trouble.
> >
> > Yes, we did take lots of efforts in the Technical Board and the Governing
> > Board about this MIT exception. We really appreciate that.
> >
> > About this patch set, it is my severe mistake to switch the MIT License
> > directly for the upstream-ed code in community, in the wrong way.
> > In the past we upstream-ed this driver with MIT License followed from
> > the kernel community's gve driver base code. And now we want to
> > use the code with BSD-3 License (approved by Google).
> > So I suppose that the correct way may be 1) first remove all these code
> > under MIT License and 2) then add the new files under BSD-3 License.
> 
> Is the code under BSD different from the MIT code?
> If it is the same with a new approved license, you can just change the
> license.

For this patch set, the code lines remain unchanged.
We want to use BSD-licensed source files to replace the MIT-licensed ones.
This patch set mainly serves a license-related purpose.

You can check the latest v4 patch set:
https://patchwork.dpdk.org/project/dpdk/list/?series=27570&state=*

Sorry about the misleading titles and statements in this patch set, which
gave the impression that the license/copyright was changed without due
consideration.

As Rushil replied, Google is about to upstream driver code to FreeBSD
under the BSD license as well, so we will see this code under the BSD
license soon. He will consult the lawyers on his end as well.

Thanks

> 
> > Please correct me if there are still misunderstanding in my statement.
> > Thanks Thomas for pointing out my mistake. I'll be careful to fix this.
> >
> > Copyright holder for the gve base code will stay unchanged. Google LLC
> > will be added as one of the copyright holders for the gve common code.
> > @Rushil Gupta Please also be more active and responsive for the code
> > review and contribution in the community. Thanks!
> 
> 



[PATCH v5 00/15] graph enhancement for multi-core dispatch

2023-03-30 Thread Zhirun Yan
V5:
Fix CI build issues about dynamically update doc.

V4:
Fix CI build issues about undefined reference of sched apis.
Remove inline for model setting.

V3:
Fix CI build issues about TLS and typo.

V2:
Use git mv to keep git history.
Use TLS for per-thread local storage.
Change model name to mcore dispatch.
Change API with specific mode name.
Split big patch.
Fix CI issues.
Rebase l3fwd-graph example.
Update doc and maintainers files.


Currently, rte_graph supports only the RTC (Run-To-Completion) model, in
which a graph runs entirely on a single core.
RTC is one of the typical models of packet processing. Others, such as
Pipeline or Hybrid, are not supported.

The patch set introduces a 'multicore dispatch' model selection, which
is a self-reacting scheme based on core affinity.
The new model enables a cross-core dispatching mechanism that employs a
scheduling work-queue to dispatch streams to the other worker cores
associated with the destination node. When the core flavor of the
destination node is the default 'current', the stream continues to
execute as normal.

Example:
3-node graph targets 3-core budget

RTC:
Graph: node-0 -> node-1 -> node-2 @Core0.

+ - - - - - - - - - - - - - - - - - - - - - +
'                Core #0/1/2                '
'                                           '
' +--------+     +---------+     +--------+ '
' | Node-0 | --> | Node-1  | --> | Node-2 | '
' +--------+     +---------+     +--------+ '
'                                           '
+ - - - - - - - - - - - - - - - - - - - - - +

Dispatch:

Graph topo: node-0 -> node-1; node-1 -> node-2; node-2 -> node-3.
Config graph: node-0 @Core0; node-1/3 @Core1; node-2 @Core2.

.. code-block:: diff

+ - - - - - -+     +- - - - - - - - - - - - - +     + - - - - - -+
'  Core #0   '     '          Core #1         '     '  Core #2   '
'            '     '                          '     '            '
' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
' | Node-0 | - - - ->| Node-1 |    | Node-3 |<- - - - | Node-2 | '
' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
'            '     '     |                    '     '      ^     '
+ - - - - - -+     +- - -|- - - - - - - - - - +     + - - -|- - -+
                         |                                 |
                         + - - - - - - - - - - - - - - - - +
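
The dispatch configuration above can be illustrated with a small
single-threaded simulation. This is only a sketch of the idea (run a node
inline when the next node shares the current core, otherwise hand the stream
to the owning core's work-queue). It does not use the rte_graph API, and every
name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_CORES 4
#define SIM_QUEUE_LEN 16

struct sim_node {
	int core; /* worker core this node is bound to */
	int next; /* index of the next node, or -1 at the end of the chain */
};

/* Per-core work-queue of pending node indexes. */
struct sim_wq {
	int pending[SIM_QUEUE_LEN];
	size_t head, tail;
};

static int
sim_wq_push(struct sim_wq *q, int node)
{
	if (q->tail - q->head == SIM_QUEUE_LEN)
		return -1; /* queue full */
	q->pending[q->tail++ % SIM_QUEUE_LEN] = node;
	return 0;
}

static int
sim_wq_pop(struct sim_wq *q)
{
	if (q->head == q->tail)
		return -1; /* queue empty */
	return q->pending[q->head++ % SIM_QUEUE_LEN];
}

/*
 * Walk an acyclic chain of nodes starting at "start". ran_on[i] receives
 * the core that executed node i. Returns the number of cross-core
 * hand-offs, or -1 if a work-queue overflowed.
 */
static int
sim_dispatch(const struct sim_node *nodes, int start, int *ran_on)
{
	struct sim_wq wq[SIM_MAX_CORES] = {0};
	int handoffs = 0;
	int busy = 1;

	assert(nodes[start].core >= 0 && nodes[start].core < SIM_MAX_CORES);

	if (sim_wq_push(&wq[nodes[start].core], start) != 0)
		return -1;

	while (busy) {
		busy = 0;
		for (int core = 0; core < SIM_MAX_CORES; core++) {
			int n;

			while ((n = sim_wq_pop(&wq[core])) >= 0) {
				busy = 1;
				while (n >= 0) {
					int next = nodes[n].next;

					ran_on[n] = core;
					if (next >= 0 && nodes[next].core != core) {
						/* Destination lives on another core. */
						if (sim_wq_push(&wq[nodes[next].core], next) != 0)
							return -1;
						handoffs++;
						next = -1;
					}
					n = next;
				}
			}
		}
	}

	return handoffs;
}
```

For the configuration in the diagram (node-0 on core 0, node-1 and node-3 on
core 1, node-2 on core 2), the walk produces three hand-offs. When all nodes
share one core, it degenerates to the RTC case with none.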


The patch set has been broken down as follows:

1. Split graph worker into common and default model part.
2. Inline graph node processing to make it reusable.
3. Add set/get APIs to choose worker model.
4. Introduce core affinity API to make a node run on a specific worker core
  (only used in the new model).
5. Introduce graph affinity API to bind one graph with specific worker
  core.
6. Introduce graph clone API.
7. Introduce stream moving with scheduler work-queue in patch 8~12.
8. Add stats for new models.
9. Abstract default graph config process and integrate new model into
  example/l3fwd-graph. Add new parameters for model choosing.

The new worker model can be run as follows:
./dpdk-l3fwd-graph -l 8,9,10,11 -n 4 -- -p 0x1 --config="(0,0,9)" -P
--model="dispatch"

References:
https://static.sched.com/hosted_files/dpdkuserspace22/a6/graph%20introduce%20remote%20dispatch%20for%20mult-core%20scaling.pdf



Zhirun Yan (15):
  graph: rename rte_graph_work as common
  graph: split graph worker into common and default model
  graph: move node process into inline function
  graph: add get/set graph worker model APIs
  graph: introduce graph node core affinity API
  graph: introduce graph bind unbind API
  graph: introduce graph clone API for other worker core
  graph: add struct for stream moving between cores
  graph: introduce stream moving cross cores
  graph: enable create and destroy graph scheduling workqueue
  graph: introduce graph walk by cross-core dispatch
  graph: enable graph multicore dispatch scheduler model
  graph: add stats for cross-core dispatching
  examples/l3fwd-graph: introduce multicore dispatch worker model
  doc: update multicore dispatch model in graph guides

 MAINTAINERS  |   1 +
 doc/guides/prog_guide/graph_lib.rst  |  59 ++-
 examples/l3fwd-graph/main.c  | 236 +---
 lib/graph/graph.c| 179 +
 lib/graph/graph_debug.c  |   6 +
 lib/graph/graph_populate.c   |   1 +
 lib/graph/graph_private.h|  44 +++
 lib/graph/graph_stats.c  |  74 +++-
 lib/graph/meson.build|   4 +-
 lib/graph/node.c |   1 +
 lib/graph/rte_graph.h|  44 +++
 lib/graph/rte_graph_model_dispatch.c | 179 +
 lib/graph/rte_graph_model_dispatch.h | 122 ++
 lib/graph/rte_graph_model_rtc.h  |  45 +++
 lib/graph/rte_graph_worker.c |  54 +++
 lib/graph/rte_graph_worker.h | 498 +
 lib/graph/rte_graph_worker_common.h  | 539 +++
 lib/graph/version.map  

[PATCH v5 01/15] graph: rename rte_graph_work as common

2023-03-30 Thread Zhirun Yan
Rename rte_graph_worker.h to rte_graph_worker_common.h to support
multiple graph worker models.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 MAINTAINERS | 1 +
 lib/graph/graph_pcap.c  | 2 +-
 lib/graph/graph_private.h   | 2 +-
 lib/graph/meson.build   | 2 +-
 lib/graph/{rte_graph_worker.h => rte_graph_worker_common.h} | 6 +++---
 5 files changed, 7 insertions(+), 6 deletions(-)
 rename lib/graph/{rte_graph_worker.h => rte_graph_worker_common.h} (99%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 280058adfc..9d9467dd00 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1714,6 +1714,7 @@ F: doc/guides/prog_guide/bpf_lib.rst
 Graph - EXPERIMENTAL
 M: Jerin Jacob 
 M: Kiran Kumar K 
+M: Zhirun Yan 
 F: lib/graph/
 F: doc/guides/prog_guide/graph_lib.rst
 F: app/test/test_graph*
diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index 6c43330029..8a220370fa 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include "rte_graph_worker.h"
+#include "rte_graph_worker_common.h"
 
 #include "graph_pcap_private.h"
 
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 7d1b30b8ac..f08dbc7e9d 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -12,7 +12,7 @@
 #include 
 
 #include "rte_graph.h"
-#include "rte_graph_worker.h"
+#include "rte_graph_worker_common.h"
 
 extern int rte_graph_logtype;
 
diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index 3526d1b5d4..4e2b612ad3 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -16,6 +16,6 @@ sources = files(
 'graph_populate.c',
 'graph_pcap.c',
 )
-headers = files('rte_graph.h', 'rte_graph_worker.h')
+headers = files('rte_graph.h', 'rte_graph_worker_common.h')
 
 deps += ['eal', 'pcapng']
diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker_common.h
similarity index 99%
rename from lib/graph/rte_graph_worker.h
rename to lib/graph/rte_graph_worker_common.h
index 438595b15c..0bad2938f3 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -2,8 +2,8 @@
  * Copyright(C) 2020 Marvell International Ltd.
  */
 
-#ifndef _RTE_GRAPH_WORKER_H_
-#define _RTE_GRAPH_WORKER_H_
+#ifndef _RTE_GRAPH_WORKER_COMMON_H_
+#define _RTE_GRAPH_WORKER_COMMON_H_
 
 /**
  * @file rte_graph_worker.h
@@ -518,4 +518,4 @@ rte_node_next_stream_move(struct rte_graph *graph, struct 
rte_node *src,
 }
 #endif
 
-#endif /* _RTE_GRAPH_WORKER_H_ */
+#endif /* _RTE_GRAPH_WORKER_COMMON_H_ */
-- 
2.37.2



[PATCH v5 03/15] graph: move node process into inline function

2023-03-30 Thread Zhirun Yan
Node processing is a single reusable block, so move the code into an inline
function.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_model_rtc.h | 20 ++---
 lib/graph/rte_graph_worker_common.h | 33 +
 2 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h
index 665560f831..0dcb7151e9 100644
--- a/lib/graph/rte_graph_model_rtc.h
+++ b/lib/graph/rte_graph_model_rtc.h
@@ -20,9 +20,6 @@ rte_graph_walk_rtc(struct rte_graph *graph)
const rte_node_t mask = graph->cir_mask;
uint32_t head = graph->head;
struct rte_node *node;
-   uint64_t start;
-   uint16_t rc;
-   void **objs;
 
/*
 * Walk on the source node(s) ((cir_start - head) -> cir_start) and then
@@ -41,21 +38,8 @@ rte_graph_walk_rtc(struct rte_graph *graph)
 */
while (likely(head != graph->tail)) {
node = (struct rte_node *)RTE_PTR_ADD(graph, 
cir_start[(int32_t)head++]);
-   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
-   objs = node->objs;
-   rte_prefetch0(objs);
-
-   if (rte_graph_has_stats_feature()) {
-   start = rte_rdtsc();
-   rc = node->process(graph, node, objs, node->idx);
-   node->total_cycles += rte_rdtsc() - start;
-   node->total_calls++;
-   node->total_objs += rc;
-   } else {
-   node->process(graph, node, objs, node->idx);
-   }
-   node->idx = 0;
-   head = likely((int32_t)head > 0) ? head & mask : head;
+   __rte_node_process(graph, node);
+   head = likely((int32_t)head > 0) ? head & mask : head;
}
graph->tail = 0;
 }
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index b58f8f6947..41428974db 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -130,6 +130,39 @@ void __rte_node_stream_alloc_size(struct rte_graph *graph,
 
 /* Fast path helper functions */
 
+/**
+ * @internal
+ *
+ * Process a given node: invoke its process callback and update its stats.
+ *
+ * @param graph
+ *   Pointer to the graph object.
+ * @param node
+ *   Pointer to the node object to be processed.
+ */
+static __rte_always_inline void
+__rte_node_process(struct rte_graph *graph, struct rte_node *node)
+{
+   uint64_t start;
+   uint16_t rc;
+   void **objs;
+
+   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
+   objs = node->objs;
+   rte_prefetch0(objs);
+
+   if (rte_graph_has_stats_feature()) {
+   start = rte_rdtsc();
+   rc = node->process(graph, node, objs, node->idx);
+   node->total_cycles += rte_rdtsc() - start;
+   node->total_calls++;
+   node->total_objs += rc;
+   } else {
+   node->process(graph, node, objs, node->idx);
+   }
+   node->idx = 0;
+}
+
 /**
  * @internal
  *
-- 
2.37.2
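
For readers following the series, here is a minimal standalone sketch of
what the factored-out helper does: call the node's process callback,
account cycles/calls/objects when stats are enabled, then reset the
pending count. It does not use DPDK; struct sketch_node, sketch_rdtsc()
and sketch_stats_enabled are hypothetical stand-ins for struct rte_node,
rte_rdtsc() and rte_graph_has_stats_feature().

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct rte_node; only the fields the sketch needs. */
struct sketch_node {
	uint16_t (*process)(struct sketch_node *node, void **objs,
			    uint16_t nb_objs);
	void *objs[8];
	uint16_t idx;		/* number of pending objects */
	uint64_t total_cycles;
	uint64_t total_calls;
	uint64_t total_objs;
};

/* Hypothetical cycle counter standing in for rte_rdtsc(). */
static uint64_t sketch_cycles;

static uint64_t
sketch_rdtsc(void)
{
	return sketch_cycles += 10;
}

/* Hypothetical switch standing in for rte_graph_has_stats_feature(). */
static bool sketch_stats_enabled = true;

/* Run one node and, if enabled, account its stats. */
static inline void
sketch_node_process(struct sketch_node *node)
{
	uint64_t start;
	uint16_t rc;

	if (sketch_stats_enabled) {
		start = sketch_rdtsc();
		rc = node->process(node, node->objs, node->idx);
		node->total_cycles += sketch_rdtsc() - start;
		node->total_calls++;
		node->total_objs += rc;
	} else {
		node->process(node, node->objs, node->idx);
	}
	node->idx = 0;	/* all pending objects were handed to process() */
}

/* Example callback: reports every object it was given as processed. */
static uint16_t
sketch_consume_all(struct sketch_node *node, void **objs, uint16_t nb_objs)
{
	(void)node;
	(void)objs;
	return nb_objs;
}
```

Keeping the stats branch inside the helper means any other walk
implementation that calls it gets the same accounting without
duplicating the block.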



[PATCH v5 02/15] graph: split graph worker into common and default model

2023-03-30 Thread Zhirun Yan
To support multiple graph worker models, split the graph worker into common
and default parts. Name the current walk function rte_graph_walk_rtc, because
the default model is RTC (run-to-completion).

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph_pcap.c  |  2 +-
 lib/graph/graph_private.h   |  2 +-
 lib/graph/meson.build   |  2 +-
 lib/graph/rte_graph_model_rtc.h | 61 +
 lib/graph/rte_graph_worker.h| 34 
 lib/graph/rte_graph_worker_common.h | 57 ---
 6 files changed, 98 insertions(+), 60 deletions(-)
 create mode 100644 lib/graph/rte_graph_model_rtc.h
 create mode 100644 lib/graph/rte_graph_worker.h

diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c
index 8a220370fa..6c43330029 100644
--- a/lib/graph/graph_pcap.c
+++ b/lib/graph/graph_pcap.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include "rte_graph_worker_common.h"
+#include "rte_graph_worker.h"
 
 #include "graph_pcap_private.h"
 
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index f08dbc7e9d..7d1b30b8ac 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -12,7 +12,7 @@
 #include 
 
 #include "rte_graph.h"
-#include "rte_graph_worker_common.h"
+#include "rte_graph_worker.h"
 
 extern int rte_graph_logtype;
 
diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index 4e2b612ad3..3526d1b5d4 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -16,6 +16,6 @@ sources = files(
 'graph_populate.c',
 'graph_pcap.c',
 )
-headers = files('rte_graph.h', 'rte_graph_worker_common.h')
+headers = files('rte_graph.h', 'rte_graph_worker.h')
 
 deps += ['eal', 'pcapng']
diff --git a/lib/graph/rte_graph_model_rtc.h b/lib/graph/rte_graph_model_rtc.h
new file mode 100644
index 00..665560f831
--- /dev/null
+++ b/lib/graph/rte_graph_model_rtc.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Intel Corporation
+ */
+
+#include "rte_graph_worker_common.h"
+
+/**
+ * Perform graph walk on the circular buffer and invoke the process function
+ * of the nodes and collect the stats.
+ *
+ * @param graph
+ *   Graph pointer returned from rte_graph_lookup function.
+ *
+ * @see rte_graph_lookup()
+ */
+static inline void
+rte_graph_walk_rtc(struct rte_graph *graph)
+{
+   const rte_graph_off_t *cir_start = graph->cir_start;
+   const rte_node_t mask = graph->cir_mask;
+   uint32_t head = graph->head;
+   struct rte_node *node;
+   uint64_t start;
+   uint16_t rc;
+   void **objs;
+
+   /*
+* Walk on the source node(s) ((cir_start - head) -> cir_start) and then
+* on the pending streams (cir_start -> (cir_start + mask) -> cir_start)
+* in a circular buffer fashion.
+*
+*  +-+ <= cir_start - head [number of source nodes]
+*  | |
+*  | ... | <= source nodes
+*  | |
+*  +-+ <= cir_start [head = 0] [tail = 0]
+*  | |
+*  | ... | <= pending streams
+*  | |
+*  +-+ <= cir_start + mask
+*/
+   while (likely(head != graph->tail)) {
+   node = (struct rte_node *)RTE_PTR_ADD(graph, 
cir_start[(int32_t)head++]);
+   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
+   objs = node->objs;
+   rte_prefetch0(objs);
+
+   if (rte_graph_has_stats_feature()) {
+   start = rte_rdtsc();
+   rc = node->process(graph, node, objs, node->idx);
+   node->total_cycles += rte_rdtsc() - start;
+   node->total_calls++;
+   node->total_objs += rc;
+   } else {
+   node->process(graph, node, objs, node->idx);
+   }
+   node->idx = 0;
+   head = likely((int32_t)head > 0) ? head & mask : head;
+   }
+   graph->tail = 0;
+}
diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
new file mode 100644
index 00..7ea18ba80a
--- /dev/null
+++ b/lib/graph/rte_graph_worker.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Intel Corporation
+ */
+
+#ifndef _RTE_GRAPH_WORKER_H_
+#define _RTE_GRAPH_WORKER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "rte_graph_model_rtc.h"
+
+/**
+ * Perform graph walk on the circular buffer and invoke the process function
+ * of the nodes and collect the stats.
+ *
+ * @param graph
+ *   Graph pointer returned from rte_graph_lookup function.
+ *
+ * @see rte_graph_lookup()
+ */
+__rte_experimental
+static inline void
+rte_graph_walk(struct rte_graph *graph)
+{
+   rte_graph_walk_rtc(graph);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GRAPH_WORKER_H_ */
diff --git a/lib/graph/
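
As a reading aid for the walk loop in rte_graph_walk_rtc(), a minimal
standalone sketch of the head/tail arithmetic follows. It is not DPDK
code: struct sketch_graph, the integer node ids and sketch_visit() are
hypothetical, and the tail update in sketch_enqueue() is an assumption
about how pending streams get queued, since that part is not shown in
this patch. Source nodes sit at negative indices before cir_start, and
head is only masked once it has moved into the pending-stream ring.

```c
#include <stdint.h>

#define SKETCH_NB_SRC  2
#define SKETCH_RING_SZ 4	/* must be a power of two */

struct sketch_graph {
	int slots[SKETCH_NB_SRC + SKETCH_RING_SZ]; /* sources, then ring */
	int *cir_start;		/* points at slots[SKETCH_NB_SRC] */
	uint32_t cir_mask;	/* SKETCH_RING_SZ - 1 */
	int32_t head;		/* -SKETCH_NB_SRC: start at first source */
	uint32_t tail;
	int visited[16];	/* visit order, for inspection */
	unsigned int nb_visited;
};

/* Assumed enqueue: append a node id to the pending ring. */
static void
sketch_enqueue(struct sketch_graph *g, int id)
{
	uint32_t tail = g->tail;

	g->cir_start[tail++] = id;
	g->tail = tail & g->cir_mask;
}

/* Record the visit; ids below 20 act as sources that queue id + 10. */
static void
sketch_visit(struct sketch_graph *g, int id)
{
	g->visited[g->nb_visited++] = id;
	if (id < 20)
		sketch_enqueue(g, id + 10);
}

static void
sketch_walk(struct sketch_graph *g)
{
	uint32_t head = (uint32_t)g->head;

	while (head != g->tail) {
		int id = g->cir_start[(int32_t)head++];

		sketch_visit(g, id);
		/* Wrap only once head is inside the pending ring. */
		head = (int32_t)head > 0 ? head & g->cir_mask : head;
	}
	g->tail = 0;
}
```

With two sources 10 and 11, the walk visits 10, 11, 20, 21: both
sources first, then the streams they queued.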

[PATCH v5 04/15] graph: add get/set graph worker model APIs

2023-03-30 Thread Zhirun Yan
Add new get/set APIs for the graph worker model. The configured value
determines which worker model is used.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/meson.build   |  1 +
 lib/graph/rte_graph_worker.c| 54 +
 lib/graph/rte_graph_worker_common.h | 19 ++
 lib/graph/version.map   |  3 ++
 4 files changed, 77 insertions(+)
 create mode 100644 lib/graph/rte_graph_worker.c

diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index 3526d1b5d4..9fab8243da 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -15,6 +15,7 @@ sources = files(
 'graph_stats.c',
 'graph_populate.c',
 'graph_pcap.c',
+'rte_graph_worker.c',
 )
 headers = files('rte_graph.h', 'rte_graph_worker.h')
 
diff --git a/lib/graph/rte_graph_worker.c b/lib/graph/rte_graph_worker.c
new file mode 100644
index 00..cabc101262
--- /dev/null
+++ b/lib/graph/rte_graph_worker.c
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Intel Corporation
+ */
+
+#include "rte_graph_worker_common.h"
+
+RTE_DEFINE_PER_LCORE(enum rte_graph_worker_model, worker_model) = 
RTE_GRAPH_MODEL_DEFAULT;
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ * Set the graph worker model
+ *
+ * @note This function does not perform any locking, and is only safe to call
+ *    before the graph is running.
+ *
+ * @param model
+ *   The graph worker model to set.
+ *
+ * @return
+ *   0 on success, -1 otherwise.
+ */
+int
+rte_graph_worker_model_set(enum rte_graph_worker_model model)
+{
+   if (model >= RTE_GRAPH_MODEL_LIST_END)
+   goto fail;
+
+   RTE_PER_LCORE(worker_model) = model;
+   return 0;
+
+fail:
+   RTE_PER_LCORE(worker_model) = RTE_GRAPH_MODEL_DEFAULT;
+   return -1;
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Get the graph worker model
+ *
+ * @note
+ *   Takes no arguments; the model is read from a per-lcore variable.
+ *
+ * @return
+ *   Graph worker model on success.
+ */
+inline
+enum rte_graph_worker_model
+rte_graph_worker_model_get(void)
+{
+   return RTE_PER_LCORE(worker_model);
+}
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index 41428974db..1526da6e2c 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -95,6 +96,16 @@ struct rte_node {
struct rte_node *nodes[] __rte_cache_min_aligned; /**< Next nodes. */
 } __rte_cache_aligned;
 
+/** Graph worker models */
+enum rte_graph_worker_model {
+   RTE_GRAPH_MODEL_DEFAULT,
+   RTE_GRAPH_MODEL_RTC = RTE_GRAPH_MODEL_DEFAULT,
+   RTE_GRAPH_MODEL_MCORE_DISPATCH,
+   RTE_GRAPH_MODEL_LIST_END
+};
+
+RTE_DECLARE_PER_LCORE(enum rte_graph_worker_model, worker_model);
+
 /**
  * @internal
  *
@@ -490,6 +501,14 @@ rte_node_next_stream_move(struct rte_graph *graph, struct 
rte_node *src,
}
 }
 
+__rte_experimental
+enum rte_graph_worker_model
+rte_graph_worker_model_get(void);
+
+__rte_experimental
+int
+rte_graph_worker_model_set(enum rte_graph_worker_model model);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/graph/version.map b/lib/graph/version.map
index 13b838752d..eea73ec9ca 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -43,5 +43,8 @@ EXPERIMENTAL {
rte_node_next_stream_put;
rte_node_next_stream_move;
 
+   rte_graph_worker_model_set;
+   rte_graph_worker_model_get;
+
local: *;
 };
-- 
2.37.2
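
A minimal standalone sketch of the set/get behaviour described above,
for illustration only. It assumes C11 _Thread_local as a stand-in for
RTE_DEFINE_PER_LCORE(), and all sketch_* names are hypothetical. As in
the patch, an out-of-range model is rejected and the stored value falls
back to the default.

```c
enum sketch_worker_model {
	SKETCH_MODEL_DEFAULT,
	SKETCH_MODEL_RTC = SKETCH_MODEL_DEFAULT,
	SKETCH_MODEL_MCORE_DISPATCH,
	SKETCH_MODEL_LIST_END
};

/* _Thread_local stands in for the per-lcore variable of the patch. */
static _Thread_local enum sketch_worker_model sketch_model =
	SKETCH_MODEL_DEFAULT;

/* Returns 0 on success, -1 (and resets to default) on a bad model. */
static int
sketch_worker_model_set(enum sketch_worker_model model)
{
	if (model >= SKETCH_MODEL_LIST_END) {
		sketch_model = SKETCH_MODEL_DEFAULT;
		return -1;
	}

	sketch_model = model;
	return 0;
}

static enum sketch_worker_model
sketch_worker_model_get(void)
{
	return sketch_model;
}
```

One consequence worth keeping in mind: because the variable in this
sketch is per thread, a value set on one thread is not visible to a
getter running on another thread.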



[PATCH v5 05/15] graph: introduce graph node core affinity API

2023-03-30 Thread Zhirun Yan
Add lcore_id to the node to hold the affinity core id, and implement
rte_graph_model_dispatch_lcore_affinity_set to bind a node to a specific
lcore.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph_private.h|  1 +
 lib/graph/meson.build|  1 +
 lib/graph/node.c |  1 +
 lib/graph/rte_graph_model_dispatch.c | 31 
 lib/graph/rte_graph_model_dispatch.h | 43 
 lib/graph/version.map|  2 ++
 6 files changed, 79 insertions(+)
 create mode 100644 lib/graph/rte_graph_model_dispatch.c
 create mode 100644 lib/graph/rte_graph_model_dispatch.h

diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 7d1b30b8ac..409eed3284 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -50,6 +50,7 @@ struct node {
STAILQ_ENTRY(node) next;  /**< Next node in the list. */
char name[RTE_NODE_NAMESIZE]; /**< Name of the node. */
uint64_t flags;   /**< Node configuration flag. */
+   unsigned int lcore_id;/**< Node runs on the Lcore ID */
rte_node_process_t process;   /**< Node process function. */
rte_node_init_t init; /**< Node init function. */
rte_node_fini_t fini; /**< Node fini function. */
diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index 9fab8243da..c729d984b6 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -16,6 +16,7 @@ sources = files(
 'graph_populate.c',
 'graph_pcap.c',
 'rte_graph_worker.c',
+'rte_graph_model_dispatch.c',
 )
 headers = files('rte_graph.h', 'rte_graph_worker.h')
 
diff --git a/lib/graph/node.c b/lib/graph/node.c
index 149414dcd9..339b4a0da5 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -100,6 +100,7 @@ __rte_node_register(const struct rte_node_register *reg)
goto free;
}
 
+   node->lcore_id = RTE_MAX_LCORE;
node->id = node_id++;
 
/* Add the node at tail */
diff --git a/lib/graph/rte_graph_model_dispatch.c 
b/lib/graph/rte_graph_model_dispatch.c
new file mode 100644
index 00..4a2f99496d
--- /dev/null
+++ b/lib/graph/rte_graph_model_dispatch.c
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Intel Corporation
+ */
+
+#include "graph_private.h"
+#include "rte_graph_model_dispatch.h"
+
+int
+rte_graph_model_dispatch_lcore_affinity_set(const char *name, unsigned int 
lcore_id)
+{
+   struct node *node;
+   int ret = -EINVAL;
+
+   if (lcore_id >= RTE_MAX_LCORE)
+   return ret;
+
+   graph_spinlock_lock();
+
+   STAILQ_FOREACH(node, node_list_head_get(), next) {
+   if (strncmp(node->name, name, RTE_NODE_NAMESIZE) == 0) {
+   node->lcore_id = lcore_id;
+   ret = 0;
+   break;
+   }
+   }
+
+   graph_spinlock_unlock();
+
+   return ret;
+}
+
diff --git a/lib/graph/rte_graph_model_dispatch.h 
b/lib/graph/rte_graph_model_dispatch.h
new file mode 100644
index 00..179624e972
--- /dev/null
+++ b/lib/graph/rte_graph_model_dispatch.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Intel Corporation
+ */
+
+#ifndef _RTE_GRAPH_MODEL_DISPATCH_H_
+#define _RTE_GRAPH_MODEL_DISPATCH_H_
+
+/**
+ * @file rte_graph_model_dispatch.h
+ *
+ * @warning
+ * @b EXPERIMENTAL:
+ * All functions in this file may be changed or removed without prior notice.
+ *
+ * This API allows setting the core affinity of a node.
+ */
+#include "rte_graph_worker_common.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Set the lcore affinity of the node.
+ *
+ * @param name
+ *   Valid node name. In the case of the cloned node, the name will be
+ * "parent node name" + "-" + name.
+ * @param lcore_id
+ *   The lcore ID value.
+ *
+ * @return
+ *   0 on success, error otherwise.
+ */
+__rte_experimental
+int rte_graph_model_dispatch_lcore_affinity_set(const char *name,
+   unsigned int lcore_id);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_GRAPH_MODEL_DISPATCH_H_ */
diff --git a/lib/graph/version.map b/lib/graph/version.map
index eea73ec9ca..1f090be74e 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -46,5 +46,7 @@ EXPERIMENTAL {
rte_graph_worker_model_set;
rte_graph_worker_model_get;
 
+   rte_graph_model_dispatch_lcore_affinity_set;
+
local: *;
 };
-- 
2.37.2
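
To illustrate the lookup-by-name logic of the affinity setter, here is
a minimal standalone sketch. It is not the DPDK implementation: the
node table, the node names and the SKETCH_* limits are hypothetical,
the spinlock taken in the patch is left out, and the NULL check on the
name is an addition of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_LCORE 128	/* stand-in for RTE_MAX_LCORE */
#define SKETCH_NAMESIZE  64	/* stand-in for RTE_NODE_NAMESIZE */

struct sketch_reg_node {
	char name[SKETCH_NAMESIZE];
	unsigned int lcore_id;	/* SKETCH_MAX_LCORE means "no affinity" */
};

/* Hypothetical registered nodes, both without affinity initially. */
static struct sketch_reg_node sketch_nodes[] = {
	{ "ethdev_rx", SKETCH_MAX_LCORE },
	{ "ip4_lookup", SKETCH_MAX_LCORE },
};

/* Returns 0 if the node was found and updated, -EINVAL otherwise. */
static int
sketch_affinity_set(const char *name, unsigned int lcore_id)
{
	size_t i;

	if (name == NULL || lcore_id >= SKETCH_MAX_LCORE)
		return -EINVAL;

	for (i = 0; i < sizeof(sketch_nodes) / sizeof(sketch_nodes[0]); i++) {
		if (strncmp(sketch_nodes[i].name, name,
			    SKETCH_NAMESIZE) == 0) {
			sketch_nodes[i].lcore_id = lcore_id;
			return 0;
		}
	}

	return -EINVAL;	/* no node with that name */
}
```

As in the patch, an unknown node name and an out-of-range lcore id
yield the same error value, so a caller cannot tell the two apart.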



[PATCH v5 06/15] graph: introduce graph bind unbind API

2023-03-30 Thread Zhirun Yan
Add lcore_id to the graph to hold the id of the core the graph runs on.
Add bind/unbind APIs to set/unset this affinity attribute. lcore_id is set
to RTE_MAX_LCORE by default, which means the attribute is disabled.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c | 59 +++
 lib/graph/graph_private.h |  2 ++
 lib/graph/rte_graph.h | 22 +++
 lib/graph/version.map |  2 ++
 4 files changed, 85 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index a839a2803b..b39a99aac6 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -254,6 +254,64 @@ graph_mem_fixup_secondary(struct rte_graph *graph)
return graph_mem_fixup_node_ctx(graph);
 }
 
+static __rte_always_inline bool
+graph_src_node_avail(struct graph *graph)
+{
+   struct graph_node *graph_node;
+
+   STAILQ_FOREACH(graph_node, &graph->node_list, next)
+   if ((graph_node->node->flags & RTE_NODE_SOURCE_F) &&
+   (graph_node->node->lcore_id == RTE_MAX_LCORE ||
+graph->lcore_id == graph_node->node->lcore_id))
+   return true;
+
+   return false;
+}
+
+int
+rte_graph_model_dispatch_core_bind(rte_graph_t id, int lcore)
+{
+   struct graph *graph;
+
+   GRAPH_ID_CHECK(id);
+   if (!rte_lcore_is_enabled(lcore))
+   SET_ERR_JMP(ENOLINK, fail,
+   "lcore %d not enabled\n",
+   lcore);
+
+   STAILQ_FOREACH(graph, &graph_list, next)
+   if (graph->id == id)
+   break;
+
+   graph->lcore_id = lcore;
+   graph->socket = rte_lcore_to_socket_id(lcore);
+
+   /* check the availability of source node */
+   if (!graph_src_node_avail(graph))
+   graph->graph->head = 0;
+
+   return 0;
+
+fail:
+   return -rte_errno;
+}
+
+void
+rte_graph_model_dispatch_core_unbind(rte_graph_t id)
+{
+   struct graph *graph;
+
+   GRAPH_ID_CHECK(id);
+   STAILQ_FOREACH(graph, &graph_list, next)
+   if (graph->id == id)
+   break;
+
+   graph->lcore_id = RTE_MAX_LCORE;
+
+fail:
+   return;
+}
+
 struct rte_graph *
 rte_graph_lookup(const char *name)
 {
@@ -340,6 +398,7 @@ rte_graph_create(const char *name, struct rte_graph_param 
*prm)
graph->src_node_count = src_node_count;
graph->node_count = graph_nodes_count(graph);
graph->id = graph_id;
+   graph->lcore_id = RTE_MAX_LCORE;
graph->num_pkt_to_capture = prm->num_pkt_to_capture;
if (prm->pcap_filename)
rte_strscpy(graph->pcap_filename, prm->pcap_filename, 
RTE_GRAPH_PCAP_FILE_SZ);
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 409eed3284..ad1d058945 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -98,6 +98,8 @@ struct graph {
/**< Circular buffer mask for wrap around. */
rte_graph_t id;
/**< Graph identifier. */
+   unsigned int lcore_id;
+       /**< Lcore identifier on which the graph prefers to run. */
size_t mem_sz;
/**< Memory size of the graph. */
int socket;
diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h
index c9a77297fc..c523809d1f 100644
--- a/lib/graph/rte_graph.h
+++ b/lib/graph/rte_graph.h
@@ -285,6 +285,28 @@ char *rte_graph_id_to_name(rte_graph_t id);
 __rte_experimental
 int rte_graph_export(const char *name, FILE *f);
 
+/**
+ * Bind graph to a specific lcore
+ *
+ * @param id
+ *   Graph id to get the pointer of graph object
+ * @param lcore
+ *   The lcore on which the graph will run
+ * @return
+ *   0 on success, error otherwise.
+ */
+__rte_experimental
+int rte_graph_model_dispatch_core_bind(rte_graph_t id, int lcore);
+
+/**
+ * Unbind graph from its lcore
+ *
+ * @param id
+ *   Graph id to get the pointer of graph object
+ */
+__rte_experimental
+void rte_graph_model_dispatch_core_unbind(rte_graph_t id);
+
 /**
  * Get graph object from its name.
  *
diff --git a/lib/graph/version.map b/lib/graph/version.map
index 1f090be74e..7de6f08f59 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -18,6 +18,8 @@ EXPERIMENTAL {
rte_graph_node_get_by_name;
rte_graph_obj_dump;
rte_graph_walk;
+   rte_graph_model_dispatch_core_bind;
+   rte_graph_model_dispatch_core_unbind;
 
rte_graph_cluster_stats_create;
rte_graph_cluster_stats_destroy;
-- 
2.37.2
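
One detail in the bind/unbind code above: STAILQ_FOREACH leaves its
iterator NULL when no list entry matches, and the graph pointer is then
dereferenced. Whether GRAPH_ID_CHECK already rules that case out is not
visible from this patch. The following standalone sketch shows a
variant that handles a missing id explicitly. It is only an
illustration: the list type, the function names and the -ENOENT return
value are assumptions, not part of the series.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_LCORE 128	/* stand-in for RTE_MAX_LCORE */

struct sketch_graph {
	unsigned int id;
	unsigned int lcore_id;	/* SKETCH_MAX_LCORE means "not bound" */
	struct sketch_graph *next;
};

/* Returns the graph with the given id, or NULL if there is none. */
static struct sketch_graph *
sketch_graph_find(struct sketch_graph *head, unsigned int id)
{
	struct sketch_graph *g;

	for (g = head; g != NULL; g = g->next)
		if (g->id == id)
			return g;

	return NULL;
}

static int
sketch_core_bind(struct sketch_graph *head, unsigned int id,
		 unsigned int lcore)
{
	struct sketch_graph *g;

	if (lcore >= SKETCH_MAX_LCORE)
		return -EINVAL;

	g = sketch_graph_find(head, id);
	if (g == NULL)
		return -ENOENT;	/* id passed no lookup: nothing to bind */

	g->lcore_id = lcore;
	return 0;
}

static void
sketch_core_unbind(struct sketch_graph *head, unsigned int id)
{
	struct sketch_graph *g = sketch_graph_find(head, id);

	if (g != NULL)
		g->lcore_id = SKETCH_MAX_LCORE;
}
```

Factoring the lookup into one helper keeps the not-found handling in a
single place for both bind and unbind.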



[PATCH v5 07/15] graph: introduce graph clone API for other worker core

2023-03-30 Thread Zhirun Yan
This patch adds a graph API for cloning the graph object for a specified
worker core. The cloned graph also clones all nodes of its parent.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c | 110 ++
 lib/graph/graph_private.h |   2 +
 lib/graph/rte_graph.h |  20 +++
 lib/graph/version.map |   1 +
 4 files changed, 133 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index b39a99aac6..90eaad0378 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -398,6 +398,7 @@ rte_graph_create(const char *name, struct rte_graph_param 
*prm)
graph->src_node_count = src_node_count;
graph->node_count = graph_nodes_count(graph);
graph->id = graph_id;
+   graph->parent_id = RTE_GRAPH_ID_INVALID;
graph->lcore_id = RTE_MAX_LCORE;
graph->num_pkt_to_capture = prm->num_pkt_to_capture;
if (prm->pcap_filename)
@@ -462,6 +463,115 @@ rte_graph_destroy(rte_graph_t id)
return rc;
 }
 
+static int
+clone_name(struct graph *graph, struct graph *parent_graph, const char *name)
+{
+   ssize_t sz, rc;
+
+#define SZ RTE_GRAPH_NAMESIZE
+   rc = rte_strscpy(graph->name, parent_graph->name, SZ);
+   if (rc < 0)
+   goto fail;
+   sz = rc;
+   rc = rte_strscpy(graph->name + sz, "-", RTE_MAX((int16_t)(SZ - sz), 0));
+   if (rc < 0)
+   goto fail;
+   sz += rc;
+   sz = rte_strscpy(graph->name + sz, name, RTE_MAX((int16_t)(SZ - sz), 
0));
+   if (sz < 0)
+   goto fail;
+
+   return 0;
+fail:
+   rte_errno = E2BIG;
+   return -rte_errno;
+}
+
+static rte_graph_t
+graph_clone(struct graph *parent_graph, const char *name)
+{
+   struct graph_node *graph_node;
+   struct graph *graph;
+
+   graph_spinlock_lock();
+
+       /* Don't allow cloning a graph that is itself a clone */
+   if (parent_graph->parent_id != RTE_GRAPH_ID_INVALID)
+   SET_ERR_JMP(EEXIST, fail, "A cloned graph is not allowed to be 
cloned");
+
+   /* Create graph object */
+   graph = calloc(1, sizeof(*graph));
+   if (graph == NULL)
+   SET_ERR_JMP(ENOMEM, fail, "Failed to calloc cloned graph 
object");
+
+       /* Name the new graph: parent_graph->name + "-" + name */
+   if (clone_name(graph, parent_graph, name))
+   goto free;
+
+   /* Check for existence of duplicate graph */
+   if (rte_graph_from_name(graph->name) != RTE_GRAPH_ID_INVALID)
+   SET_ERR_JMP(EEXIST, free, "Found duplicate graph %s",
+   graph->name);
+
+       /* Clone nodes from the parent graph first */
+   STAILQ_INIT(&graph->node_list);
+   STAILQ_FOREACH(graph_node, &parent_graph->node_list, next) {
+   if (graph_node_add(graph, graph_node->node))
+   goto graph_cleanup;
+   }
+
+   /* Just update adjacency list of all nodes in the graph */
+   if (graph_adjacency_list_update(graph))
+   goto graph_cleanup;
+
+   /* Initialize the graph object */
+   graph->src_node_count = parent_graph->src_node_count;
+   graph->node_count = parent_graph->node_count;
+   graph->parent_id = parent_graph->id;
+   graph->lcore_id = parent_graph->lcore_id;
+   graph->socket = parent_graph->socket;
+   graph->id = graph_id;
+
+   /* Allocate the Graph fast path memory and populate the data */
+   if (graph_fp_mem_create(graph))
+   goto graph_cleanup;
+
+   /* Call init() of the all the nodes in the graph */
+   if (graph_node_init(graph))
+   goto graph_mem_destroy;
+
+   /* All good, Lets add the graph to the list */
+   graph_id++;
+   STAILQ_INSERT_TAIL(&graph_list, graph, next);
+
+   graph_spinlock_unlock();
+   return graph->id;
+
+graph_mem_destroy:
+   graph_fp_mem_destroy(graph);
+graph_cleanup:
+   graph_cleanup(graph);
+free:
+   free(graph);
+fail:
+   graph_spinlock_unlock();
+   return RTE_GRAPH_ID_INVALID;
+}
+
+rte_graph_t
+rte_graph_clone(rte_graph_t id, const char *name)
+{
+   struct graph *graph;
+
+   GRAPH_ID_CHECK(id);
+   STAILQ_FOREACH(graph, &graph_list, next)
+   if (graph->id == id)
+   return graph_clone(graph, name);
+
+fail:
+   return RTE_GRAPH_ID_INVALID;
+}
+
 rte_graph_t
 rte_graph_from_name(const char *name)
 {
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index ad1d058945..d28a5af93e 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -98,6 +98,8 @@ struct graph {
/**< Circular buffer mask for wrap around. */
rte_graph_t id;
/**< Graph identifier. */
+   rte_graph_t parent_id;
+   /**< Parent graph identifier. */
unsigned int lcore_id;
/**< Lcore identifier where the graph prefer to run on. *
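
The naming rule used by clone_name() above (parent name, a "-", then
the caller's name, failing with E2BIG when the result does not fit) can
be sketched in standalone C as follows. This sketch swaps the three
rte_strscpy() calls for a single snprintf(); sketch_clone_name() and
its parameters are hypothetical names for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Compose "<parent>-<name>" into dst.
 * Returns 0 on success, -E2BIG if the result would be truncated.
 */
static int
sketch_clone_name(char *dst, size_t dst_sz, const char *parent,
		  const char *name)
{
	int n;

	n = snprintf(dst, dst_sz, "%s-%s", parent, name);
	if (n < 0 || (size_t)n >= dst_sz)
		return -E2BIG;

	return 0;
}
```

For example, parent "worker" and name "c1" give "worker-c1", which
needs a buffer of at least 10 bytes including the terminator.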

[PATCH v5 08/15] graph: add struct for stream moving between cores

2023-03-30 Thread Zhirun Yan
Add graph_sched_wq_node to hold graph scheduling workqueue
node.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c   |  1 +
 lib/graph/graph_populate.c  |  1 +
 lib/graph/graph_private.h   | 12 
 lib/graph/rte_graph_worker_common.h | 21 +
 4 files changed, 35 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index 90eaad0378..dd3d69dbf7 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -284,6 +284,7 @@ rte_graph_model_dispatch_core_bind(rte_graph_t id, int 
lcore)
break;
 
graph->lcore_id = lcore;
+   graph->graph->lcore_id = graph->lcore_id;
graph->socket = rte_lcore_to_socket_id(lcore);
 
/* check the availability of source node */
diff --git a/lib/graph/graph_populate.c b/lib/graph/graph_populate.c
index 2c0844ce92..7dcf1420c1 100644
--- a/lib/graph/graph_populate.c
+++ b/lib/graph/graph_populate.c
@@ -89,6 +89,7 @@ graph_nodes_populate(struct graph *_graph)
}
node->id = graph_node->node->id;
node->parent_id = pid;
+   node->lcore_id = graph_node->node->lcore_id;
nb_edges = graph_node->node->nb_edges;
node->nb_edges = nb_edges;
off += sizeof(struct rte_node);
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index d28a5af93e..b66b18ebbc 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -60,6 +60,18 @@ struct node {
char next_nodes[][RTE_NODE_NAMESIZE]; /**< Names of next nodes. */
 };
 
+/**
+ * @internal
+ *
+ * Structure that holds the graph scheduling workqueue node stream.
+ * Used for mcore dispatch model.
+ */
+struct graph_sched_wq_node {
+   rte_graph_off_t node_off;
+   uint16_t nb_objs;
+   void *objs[RTE_GRAPH_BURST_SIZE];
+} __rte_cache_aligned;
+
 /**
  * @internal
  *
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index 1526da6e2c..dc0a0b5554 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -30,6 +30,13 @@
 extern "C" {
 #endif
 
+/**
+ * @internal
+ *
+ * Singly-linked list head for graph schedule run-queue.
+ */
+SLIST_HEAD(rte_graph_rq_head, rte_graph);
+
 /**
  * @internal
  *
@@ -41,6 +48,15 @@ struct rte_graph {
uint32_t cir_mask;   /**< Circular buffer wrap around mask. */
rte_node_t nb_nodes; /**< Number of nodes in the graph. */
rte_graph_off_t *cir_start;  /**< Pointer to circular buffer. */
+   /* Graph schedule */
+   struct rte_graph_rq_head *rq __rte_cache_aligned; /* The run-queue */
+   struct rte_graph_rq_head rq_head; /* The head for run-queue list */
+
+   SLIST_ENTRY(rte_graph) rq_next;   /* The next for run-queue list */
+   unsigned int lcore_id;  /**< The graph running Lcore. */
+   struct rte_ring *wq;/**< The work-queue for pending streams. */
+   struct rte_mempool *mp; /**< The mempool for scheduling streams. */
+   /* Graph schedule area */
rte_graph_off_t nodes_start; /**< Offset at which node memory starts. */
rte_graph_t id; /**< Graph identifier. */
int socket; /**< Socket ID where memory is allocated. */
@@ -74,6 +90,11 @@ struct rte_node {
/** Original process function when pcap is enabled. */
rte_node_process_t original_process;
 
+   RTE_STD_C11
+   union {
+   /* Fast schedule area for mcore dispatch model */
+   unsigned int lcore_id;  /**< Node running lcore. */
+   };
/* Fast path area  */
 #define RTE_NODE_CTX_SZ 16
uint8_t ctx[RTE_NODE_CTX_SZ] __rte_cache_aligned; /**< Node Context. */
-- 
2.37.2



[PATCH v5 09/15] graph: introduce stream moving cross cores

2023-03-30 Thread Zhirun Yan
This patch introduces the key functions that allow a worker thread to
enqueue streams of objects and move them to the next nodes across
different cores.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph_private.h|  27 +
 lib/graph/meson.build|   2 +-
 lib/graph/rte_graph_model_dispatch.c | 145 +++
 lib/graph/rte_graph_model_dispatch.h |  37 +++
 lib/graph/version.map|   2 +
 5 files changed, 212 insertions(+), 1 deletion(-)

diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index b66b18ebbc..e1a2a4bfd8 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -366,4 +366,31 @@ void graph_dump(FILE *f, struct graph *g);
  */
 void node_dump(FILE *f, struct node *n);
 
+/**
+ * @internal
+ *
+ * Create the graph schedule work queue. All cloned graphs attached to the
+ * parent graph MUST be destroyed together due to the fast schedule design.
+ *
+ * @param _graph
+ *   The graph object
+ * @param _parent_graph
+ *   The parent graph object which holds the run-queue head.
+ *
+ * @return
+ *   - 0: Success.
+ *   - <0: Graph schedule work queue related error.
+ */
+int graph_sched_wq_create(struct graph *_graph, struct graph *_parent_graph);
+
+/**
+ * @internal
+ *
+ * Destroy the graph schedule work queue.
+ *
+ * @param _graph
+ *   The graph object
+ */
+void graph_sched_wq_destroy(struct graph *_graph);
+
 #endif /* _RTE_GRAPH_PRIVATE_H_ */
diff --git a/lib/graph/meson.build b/lib/graph/meson.build
index c729d984b6..e21affa280 100644
--- a/lib/graph/meson.build
+++ b/lib/graph/meson.build
@@ -20,4 +20,4 @@ sources = files(
 )
 headers = files('rte_graph.h', 'rte_graph_worker.h')
 
-deps += ['eal', 'pcapng']
+deps += ['eal', 'pcapng', 'mempool', 'ring']
diff --git a/lib/graph/rte_graph_model_dispatch.c 
b/lib/graph/rte_graph_model_dispatch.c
index 4a2f99496d..a300fefb85 100644
--- a/lib/graph/rte_graph_model_dispatch.c
+++ b/lib/graph/rte_graph_model_dispatch.c
@@ -5,6 +5,151 @@
 #include "graph_private.h"
 #include "rte_graph_model_dispatch.h"
 
+int
+graph_sched_wq_create(struct graph *_graph, struct graph *_parent_graph)
+{
+   struct rte_graph *parent_graph = _parent_graph->graph;
+   struct rte_graph *graph = _graph->graph;
+   unsigned int wq_size;
+
+   wq_size = GRAPH_SCHED_WQ_SIZE(graph->nb_nodes);
+   wq_size = rte_align32pow2(wq_size + 1);
+
+   graph->wq = rte_ring_create(graph->name, wq_size, graph->socket,
+   RING_F_SC_DEQ);
+   if (graph->wq == NULL)
+   SET_ERR_JMP(EIO, fail, "Failed to allocate graph WQ");
+
+   graph->mp = rte_mempool_create(graph->name, wq_size,
+  sizeof(struct graph_sched_wq_node),
+  0, 0, NULL, NULL, NULL, NULL,
+  graph->socket, MEMPOOL_F_SP_PUT);
+   if (graph->mp == NULL)
+   SET_ERR_JMP(EIO, fail_mp,
+   "Failed to allocate graph WQ schedule entry");
+
+   graph->lcore_id = _graph->lcore_id;
+
+   if (parent_graph->rq == NULL) {
+   parent_graph->rq = &parent_graph->rq_head;
+   SLIST_INIT(parent_graph->rq);
+   }
+
+   graph->rq = parent_graph->rq;
+   SLIST_INSERT_HEAD(graph->rq, graph, rq_next);
+
+   return 0;
+
+fail_mp:
+   rte_ring_free(graph->wq);
+   graph->wq = NULL;
+fail:
+   return -rte_errno;
+}
+
+void
+graph_sched_wq_destroy(struct graph *_graph)
+{
+   struct rte_graph *graph = _graph->graph;
+
+   if (graph == NULL)
+   return;
+
+   rte_ring_free(graph->wq);
+   graph->wq = NULL;
+
+   rte_mempool_free(graph->mp);
+   graph->mp = NULL;
+}
+
+static __rte_always_inline bool
+__graph_sched_node_enqueue(struct rte_node *node, struct rte_graph *graph)
+{
+   struct graph_sched_wq_node *wq_node;
+   uint16_t off = 0;
+   uint16_t size;
+
+submit_again:
+   if (rte_mempool_get(graph->mp, (void **)&wq_node) < 0)
+   goto fallback;
+
+   size = RTE_MIN(node->idx, RTE_DIM(wq_node->objs));
+   wq_node->node_off = node->off;
+   wq_node->nb_objs = size;
+   rte_memcpy(wq_node->objs, &node->objs[off], size * sizeof(void *));
+
+   while (rte_ring_mp_enqueue_bulk_elem(graph->wq, (void *)&wq_node,
+ sizeof(wq_node), 1, NULL) == 0)
+   rte_pause();
+
+   off += size;
+   node->idx -= size;
+   if (node->idx > 0)
+   goto submit_again;
+
+   return true;
+
+fallback:
+   if (off != 0)
+   memmove(&node->objs[0], &node->objs[off],
+   node->idx * sizeof(void *));
+
+   return false;
+}
+
+bool __rte_noinline
+__rte_graph_sched_node_enqueue(struct rte_node *node,
+  struct rte_graph_rq_hea
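
For readers following __graph_sched_node_enqueue() above, here is a minimal
standalone sketch of the same chunk-and-fall-back pattern in plain C, without
DPDK. The names chunk, sink, CHUNK_OBJS, MAX_CHUNKS and enqueue_chunked() are
hypothetical, and a bounded array stands in for the mempool and ring, so this
only illustrates the control flow and is not the patch's implementation.
Unlike the patch, the sketch does nothing when no objects are pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_OBJS 4 /* hypothetical; stands in for RTE_DIM(wq_node->objs) */
#define MAX_CHUNKS 8 /* hypothetical upper bound of the toy sink */

/* One work-queue entry: a bounded batch of object pointers. */
struct chunk {
	uint16_t nb_objs;
	void *objs[CHUNK_OBJS];
};

/* Toy stand-in for mempool + ring: hands out at most 'limit' chunks. */
struct sink {
	struct chunk chunks[MAX_CHUNKS];
	unsigned int used;
	unsigned int limit;
};

/*
 * Move *idx pending objects from objs[] into the sink, CHUNK_OBJS at a time.
 * Returns true when everything was handed off. When the sink runs out of
 * chunks, the leftover objects are compacted to the front of objs[] and
 * false is returned, so the caller can process them locally.
 */
static bool
enqueue_chunked(void **objs, uint16_t *idx, struct sink *s)
{
	uint16_t off = 0;

	assert(s->limit <= MAX_CHUNKS);

	while (*idx > 0) {
		struct chunk *c;
		uint16_t size;

		if (s->used == s->limit) {
			/* Same idea as the fallback label in the patch. */
			if (off != 0)
				memmove(&objs[0], &objs[off],
					*idx * sizeof(void *));
			return false;
		}

		c = &s->chunks[s->used++];
		size = *idx < CHUNK_OBJS ? *idx : CHUNK_OBJS;
		c->nb_objs = size;
		memcpy(c->objs, &objs[off], size * sizeof(void *));

		off += size;
		*idx -= size;
	}

	return true;
}
```

The memmove() matters because the caller keeps indexing the remaining objects
from slot 0 after a partial hand-off.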

[PATCH v5 10/15] graph: enable create and destroy graph scheduling workqueue

2023-03-30 Thread Zhirun Yan
This patch adds creation and destruction of the scheduling workqueue to
the common graph operations.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index dd3d69dbf7..1f1ee9b622 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -443,6 +443,10 @@ rte_graph_destroy(rte_graph_t id)
while (graph != NULL) {
tmp = STAILQ_NEXT(graph, next);
if (graph->id == id) {
+   /* Destroy the schedule work queue, if any */
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_MCORE_DISPATCH)
+   graph_sched_wq_destroy(graph);
+
/* Call fini() of the all the nodes in the graph */
graph_node_fini(graph);
/* Destroy graph fast path memory */
@@ -537,6 +541,11 @@ graph_clone(struct graph *parent_graph, const char *name)
if (graph_fp_mem_create(graph))
goto graph_cleanup;
 
+   /* Create the graph schedule work queue */
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_MCORE_DISPATCH &&
+   graph_sched_wq_create(graph, parent_graph))
+   goto graph_mem_destroy;
+
/* Call init() of the all the nodes in the graph */
if (graph_node_init(graph))
goto graph_mem_destroy;
-- 
2.37.2
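
A note on the ring sizing in graph_sched_wq_create(), which graph_clone() now
calls: the work-queue size is passed through rte_align32pow2(wq_size + 1).
The sketch below reproduces that arithmetic in plain C. The helper names
align32pow2() and ring_count_for() are hypothetical. Reading the "+ 1" as
compensation for the slot an rte_ring keeps unused (usable size is count - 1
unless RING_F_EXACT_SZ is set) is my assumption and is not stated in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Round up to the next power of two; valid for 1 <= x <= 2^31. */
static uint32_t
align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/*
 * Slot count for a ring that must hold at least 'capacity' entries when
 * one slot always stays unused.
 */
static uint32_t
ring_count_for(uint32_t capacity)
{
	assert(capacity < (UINT32_C(1) << 31));
	return align32pow2(capacity + 1);
}
```

With this rounding, a requested capacity of 8 yields 16 slots, while 7
yields 8 slots, of which 7 are usable.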



[PATCH v5 11/15] graph: introduce graph walk by cross-core dispatch

2023-03-30 Thread Zhirun Yan
This patch introduces the task scheduler mechanism to enable dispatching
tasks to other worker cores. Currently, there is only a local work
queue for one graph to walk. We introduce a scheduler work queue in
each worker core for dispatching tasks. The walk processes the
scheduler work queue first, then handles the local work queue.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_model_dispatch.h | 42 
 1 file changed, 42 insertions(+)

diff --git a/lib/graph/rte_graph_model_dispatch.h b/lib/graph/rte_graph_model_dispatch.h
index 18fa7ce0ab..65b2cc6d87 100644
--- a/lib/graph/rte_graph_model_dispatch.h
+++ b/lib/graph/rte_graph_model_dispatch.h
@@ -73,6 +73,48 @@ __rte_experimental
 int rte_graph_model_dispatch_lcore_affinity_set(const char *name,
unsigned int lcore_id);
 
+/**
+ * Perform graph walk on the circular buffer and invoke the process function
+ * of the nodes and collect the stats.
+ *
+ * @param graph
+ *   Graph pointer returned from rte_graph_lookup function.
+ *
+ * @see rte_graph_lookup()
+ */
+__rte_experimental
+static inline void
+rte_graph_walk_mcore_dispatch(struct rte_graph *graph)
+{
+   const rte_graph_off_t *cir_start = graph->cir_start;
+   const rte_node_t mask = graph->cir_mask;
+   uint32_t head = graph->head;
+   struct rte_node *node;
+
+   if (graph->wq != NULL)
+   __rte_graph_sched_wq_process(graph);
+
+   while (likely(head != graph->tail)) {
+   node = (struct rte_node *)RTE_PTR_ADD(graph, cir_start[(int32_t)head++]);
+
+   /* skip the src nodes which are not bound to the current worker */
+   if ((int32_t)head < 0 && node->lcore_id != graph->lcore_id)
+   continue;
+
+   /* Schedule the node until all task/objs are done */
+   if (node->lcore_id != RTE_MAX_LCORE &&
+   graph->lcore_id != node->lcore_id && graph->rq != NULL &&
+   __rte_graph_sched_node_enqueue(node, graph->rq))
+   continue;
+
+   __rte_node_process(graph, node);
+
+   head = likely((int32_t)head > 0) ? head & mask : head;
+   }
+
+   graph->tail = 0;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
2.37.2
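
The per-node decision made inside rte_graph_walk_mcore_dispatch() can be
restated as a small pure function, sketched below in plain C. All names
(walk_action, classify_node, MAX_LCORE) are hypothetical. The is_src flag
stands in for the sign test on head, and enqueue_ok stands in for the result
of __rte_graph_sched_node_enqueue(), which the patch only calls when the
preceding conditions hold. This is a reading aid, not the library code.

```c
#include <stdbool.h>

#define MAX_LCORE 128u /* hypothetical stand-in for RTE_MAX_LCORE (no affinity) */

enum walk_action {
	WALK_SKIP,     /* source node owned by another core: ignore it here */
	WALK_DISPATCH, /* node was handed to its owner core's work queue */
	WALK_PROCESS,  /* run the node's process function on this core */
};

static enum walk_action
classify_node(bool is_src, unsigned int node_lcore, unsigned int graph_lcore,
	      bool has_rq, bool enqueue_ok)
{
	/* Source nodes run only on the core they are bound to. */
	if (is_src && node_lcore != graph_lcore)
		return WALK_SKIP;

	/* A node bound to another core is dispatched if that succeeds. */
	if (node_lcore != MAX_LCORE && node_lcore != graph_lcore &&
	    has_rq && enqueue_ok)
		return WALK_DISPATCH;

	/* Unbound nodes, local nodes and failed dispatches run locally. */
	return WALK_PROCESS;
}
```

Note that a failed enqueue is not an error in this scheme: the node is simply
processed by the current core.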



[PATCH v5 12/15] graph: enable graph multicore dispatch scheduler model

2023-03-30 Thread Zhirun Yan
This patch enables choosing the new scheduler model.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/rte_graph_worker.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h
index 7ea18ba80a..d608c7513e 100644
--- a/lib/graph/rte_graph_worker.h
+++ b/lib/graph/rte_graph_worker.h
@@ -10,6 +10,7 @@ extern "C" {
 #endif
 
 #include "rte_graph_model_rtc.h"
+#include "rte_graph_model_dispatch.h"
 
 /**
  * Perform graph walk on the circular buffer and invoke the process function
@@ -24,7 +25,13 @@ __rte_experimental
 static inline void
 rte_graph_walk(struct rte_graph *graph)
 {
-   rte_graph_walk_rtc(graph);
+   int model = rte_graph_worker_model_get();
+
+   if (model == RTE_GRAPH_MODEL_DEFAULT ||
+   model == RTE_GRAPH_MODEL_RTC)
+   rte_graph_walk_rtc(graph);
+   else if (model == RTE_GRAPH_MODEL_MCORE_DISPATCH)
+   rte_graph_walk_mcore_dispatch(graph);
 }
 
 #ifdef __cplusplus
-- 
2.37.2



[PATCH v5 13/15] graph: add stats for cross-core dispatching

2023-03-30 Thread Zhirun Yan
Add stats for cross-core dispatching scheduler if stats collection is
enabled.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 lib/graph/graph_debug.c  |  6 +++
 lib/graph/graph_stats.c  | 74 +---
 lib/graph/rte_graph.h|  2 +
 lib/graph/rte_graph_model_dispatch.c |  3 ++
 lib/graph/rte_graph_worker_common.h  |  2 +
 5 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/lib/graph/graph_debug.c b/lib/graph/graph_debug.c
index b84412f5dd..7dcf07b080 100644
--- a/lib/graph/graph_debug.c
+++ b/lib/graph/graph_debug.c
@@ -74,6 +74,12 @@ rte_graph_obj_dump(FILE *f, struct rte_graph *g, bool all)
fprintf(f, "   size=%d\n", n->size);
fprintf(f, "   idx=%d\n", n->idx);
fprintf(f, "   total_objs=%" PRId64 "\n", n->total_objs);
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_MCORE_DISPATCH) {
+   fprintf(f, "   total_sched_objs=%" PRId64 "\n",
+   n->total_sched_objs);
+   fprintf(f, "   total_sched_fail=%" PRId64 "\n",
+   n->total_sched_fail);
+   }
fprintf(f, "   total_calls=%" PRId64 "\n", n->total_calls);
for (i = 0; i < n->nb_edges; i++)
fprintf(f, "  edge[%d] <%s>\n", i,
diff --git a/lib/graph/graph_stats.c b/lib/graph/graph_stats.c
index c0140ba922..aa22cc403c 100644
--- a/lib/graph/graph_stats.c
+++ b/lib/graph/graph_stats.c
@@ -40,13 +40,19 @@ struct rte_graph_cluster_stats {
struct cluster_node clusters[];
 } __rte_cache_aligned;
 
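+#define boarder_model_dispatch()                                              \
+   fprintf(f, "+-------------------------------+---------------+" \
+              "---------------+---------------+---------------+---------------+" \
+              "---------------+---------------+-" \
+              "----------+\n")
+
 #define boarder()                                                              \
    fprintf(f, "+-------------------------------+---------------+--------" \
               "-------+---------------+---------------+---------------+-" \
               "----------+\n")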
 
 static inline void
-print_banner(FILE *f)
+print_banner_default(FILE *f)
 {
boarder();
fprintf(f, "%-32s%-16s%-16s%-16s%-16s%-16s%-16s\n", "|Node", "|calls",
@@ -55,6 +61,27 @@ print_banner(FILE *f)
boarder();
 }
 
+static inline void
+print_banner_dispatch(FILE *f)
+{
+   boarder_model_dispatch();
+   fprintf(f, "%-32s%-16s%-16s%-16s%-16s%-16s%-16s%-16s%-16s\n",
+   "|Node", "|calls",
+   "|objs", "|sched objs", "|sched fail",
+   "|realloc_count", "|objs/call", "|objs/sec(10E6)",
+   "|cycles/call|");
+   boarder_model_dispatch();
+}
+
+static inline void
+print_banner(FILE *f)
+{
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_MCORE_DISPATCH)
+   print_banner_dispatch(f);
+   else
+   print_banner_default(f);
+}
+
 static inline void
 print_node(FILE *f, const struct rte_graph_cluster_node_stats *stat)
 {
@@ -76,11 +103,21 @@ print_node(FILE *f, const struct rte_graph_cluster_node_stats *stat)
objs_per_sec = ts_per_hz ? (objs - prev_objs) / ts_per_hz : 0;
objs_per_sec /= 100;
 
-   fprintf(f,
-   "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64
-   "|%-15.3f|%-15.6f|%-11.4f|\n",
-   stat->name, calls, objs, stat->realloc_count, objs_per_call,
-   objs_per_sec, cycles_per_call);
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_MCORE_DISPATCH) {
+   fprintf(f,
+   "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64
+   "|%-15" PRIu64 "|%-15" PRIu64
+   "|%-15.3f|%-15.6f|%-11.4f|\n",
+   stat->name, calls, objs, stat->sched_objs,
+   stat->sched_fail, stat->realloc_count, objs_per_call,
+   objs_per_sec, cycles_per_call);
+   } else {
+   fprintf(f,
+   "|%-31s|%-15" PRIu64 "|%-15" PRIu64 "|%-15" PRIu64
+   "|%-15.3f|%-15.6f|%-11.4f|\n",
+   stat->name, calls, objs, stat->realloc_count, objs_per_call,
+   objs_per_sec, cycles_per_call);
+   }
 }
 
 static int
@@ -88,13 +125,20 @@ graph_cluster_stats_cb(bool is_first, bool is_last, void *cookie,
   const struct rte_graph_cluster_node_stats *stat)
 {
FILE *f = cookie;
+   int model;
+
+   model = rte_graph_worker_model_get();
 
if (unlikely(is_first))
print_banner(f);
if (stat->objs)
print_node(f, stat);
- 

[PATCH v5 14/15] examples/l3fwd-graph: introduce multicore dispatch worker model

2023-03-30 Thread Zhirun Yan
Add a new parameter "model" to choose the dispatch or rtc worker model.
In the dispatch model, nodes are assigned affinity to the worker cores in turn.

Note:
The current implementation supports only one RX node for the remote model.

./dpdk-l3fwd-graph  -l 8,9,10,11 -n 4 -- -p 0x1 --config="(0,0,9)" -P
--model="dispatch"

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 examples/l3fwd-graph/main.c | 236 +---
 1 file changed, 194 insertions(+), 42 deletions(-)

diff --git a/examples/l3fwd-graph/main.c b/examples/l3fwd-graph/main.c
index 5feeab4f0f..7078ed4c77 100644
--- a/examples/l3fwd-graph/main.c
+++ b/examples/l3fwd-graph/main.c
@@ -55,6 +55,9 @@
 
 #define NB_SOCKETS 8
 
+/* Graph module */
+#define WORKER_MODEL_RTC "rtc"
+#define WORKER_MODEL_MCORE_DISPATCH "dispatch"
 /* Static global variables used within this file. */
 static uint16_t nb_rxd = RX_DESC_DEFAULT;
 static uint16_t nb_txd = TX_DESC_DEFAULT;
@@ -88,6 +91,10 @@ struct lcore_rx_queue {
char node_name[RTE_NODE_NAMESIZE];
 };
 
+struct model_conf {
+   enum rte_graph_worker_model model;
+};
+
 /* Lcore conf */
 struct lcore_conf {
uint16_t n_rx_queue;
@@ -153,6 +160,19 @@ static struct ipv4_l3fwd_lpm_route ipv4_l3fwd_lpm_route_array[] = {
{RTE_IPV4(198, 18, 6, 0), 24, 6}, {RTE_IPV4(198, 18, 7, 0), 24, 7},
 };
 
+static int
+check_worker_model_params(void)
+{
+   if (rte_graph_worker_model_get() == RTE_GRAPH_MODEL_MCORE_DISPATCH &&
+   nb_lcore_params > 1) {
+   printf("Exceeded max number of lcore params for remote model: %hu\n",
+  nb_lcore_params);
+   return -1;
+   }
+
+   return 0;
+}
+
 static int
 check_lcore_params(void)
 {
@@ -276,6 +296,7 @@ print_usage(const char *prgname)
"  --eth-dest=X,MM:MM:MM:MM:MM:MM: Ethernet destination for "
"port X\n"
"  --max-pkt-len PKTLEN: maximum packet length in decimal (64-9600)\n"
+   "  --model NAME: walking model name, dispatch or rtc (by default)\n"
"  --no-numa: Disable numa awareness\n"
"  --per-port-pool: Use separate buffer pool per port\n"
"  --pcap-enable: Enables pcap capture\n"
@@ -318,6 +339,20 @@ parse_max_pkt_len(const char *pktlen)
return len;
 }
 
+static int
+parse_worker_model(const char *model)
+{
+   if (strcmp(model, WORKER_MODEL_MCORE_DISPATCH) == 0) {
+   rte_graph_worker_model_set(RTE_GRAPH_MODEL_MCORE_DISPATCH);
+   return RTE_GRAPH_MODEL_MCORE_DISPATCH;
+   } else if (strcmp(model, WORKER_MODEL_RTC) == 0)
+   return RTE_GRAPH_MODEL_RTC;
+
+   rte_exit(EXIT_FAILURE, "Invalid worker model: %s", model);
+
+   return RTE_GRAPH_MODEL_LIST_END;
+}
+
 static int
 parse_portmask(const char *portmask)
 {
@@ -434,6 +469,8 @@ static const char short_options[] = "p:" /* portmask */
 #define CMD_LINE_OPT_PCAP_ENABLE   "pcap-enable"
 #define CMD_LINE_OPT_NUM_PKT_CAP   "pcap-num-cap"
 #define CMD_LINE_OPT_PCAP_FILENAME "pcap-file-name"
+#define CMD_LINE_OPT_WORKER_MODEL  "model"
+
 enum {
/* Long options mapped to a short option */
 
@@ -449,6 +486,7 @@ enum {
CMD_LINE_OPT_PARSE_PCAP_ENABLE,
CMD_LINE_OPT_PARSE_NUM_PKT_CAP,
CMD_LINE_OPT_PCAP_FILENAME_CAP,
+   CMD_LINE_OPT_WORKER_MODEL_TYPE,
 };
 
 static const struct option lgopts[] = {
@@ -460,6 +498,7 @@ static const struct option lgopts[] = {
{CMD_LINE_OPT_PCAP_ENABLE, 0, 0, CMD_LINE_OPT_PARSE_PCAP_ENABLE},
{CMD_LINE_OPT_NUM_PKT_CAP, 1, 0, CMD_LINE_OPT_PARSE_NUM_PKT_CAP},
{CMD_LINE_OPT_PCAP_FILENAME, 1, 0, CMD_LINE_OPT_PCAP_FILENAME_CAP},
+   {CMD_LINE_OPT_WORKER_MODEL, 1, 0, CMD_LINE_OPT_WORKER_MODEL_TYPE},
{NULL, 0, 0, 0},
 };
 
@@ -551,6 +590,11 @@ parse_args(int argc, char **argv)
printf("Pcap file name: %s\n", pcap_filename);
break;
 
+   case CMD_LINE_OPT_WORKER_MODEL_TYPE:
+   printf("Use new worker model: %s\n", optarg);
+   parse_worker_model(optarg);
+   break;
+
default:
print_usage(prgname);
return -1;
@@ -726,15 +770,15 @@ print_stats(void)
 static int
 graph_main_loop(void *conf)
 {
+   struct model_conf *mconf = conf;
struct lcore_conf *qconf;
struct rte_graph *graph;
uint32_t lcore_id;
 
-   RTE_SET_USED(conf);
-
lcore_id = rte_lcore_id();
qconf = &lcore_conf[lcore_id];
graph = qconf->graph;
+   rte_graph_worker_model_set(mconf->model);
 
if (!graph) {
RTE_LOG(INFO, L3FWD_GRAPH, "Lcore %u has nothing to do\n",
@@ -788,6 +832,139 @@ config_port_max_pkt_len(struct rte_eth_conf *conf,
return 0;
 }
 
+static void
+graph_config_mcore_dispatch(struct rte_graph_param gra
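
As a side note on parse_worker_model() in the hunk above, which sets the model
only in the "dispatch" branch and exits the process on bad input: the sketch
below shows one possible side-effect-free variant in plain C. The enum
worker_model and its values are hypothetical stand-ins for
enum rte_graph_worker_model, and the caller would decide what to do with
MODEL_INVALID. This is a suggestion under those assumptions, not what the
patch does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for enum rte_graph_worker_model. */
enum worker_model {
	MODEL_RTC,
	MODEL_MCORE_DISPATCH,
	MODEL_INVALID,
};

/* Map the --model argument to a model value without side effects. */
static enum worker_model
parse_worker_model(const char *name)
{
	if (name == NULL)
		return MODEL_INVALID;
	if (strcmp(name, "dispatch") == 0)
		return MODEL_MCORE_DISPATCH;
	if (strcmp(name, "rtc") == 0)
		return MODEL_RTC;
	return MODEL_INVALID;
}
```

Keeping the parser pure lets the option handler apply the result in one place
and print the usage text on MODEL_INVALID.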

[PATCH v5 15/15] doc: update multicore dispatch model in graph guides

2023-03-30 Thread Zhirun Yan
Update the graph documentation to introduce the new multicore dispatch model.

Signed-off-by: Haiyue Wang 
Signed-off-by: Cunming Liang 
Signed-off-by: Zhirun Yan 
---
 doc/guides/prog_guide/graph_lib.rst | 59 +++--
 1 file changed, 55 insertions(+), 4 deletions(-)

diff --git a/doc/guides/prog_guide/graph_lib.rst b/doc/guides/prog_guide/graph_lib.rst
index 1cfdc86433..72e26f3a5a 100644
--- a/doc/guides/prog_guide/graph_lib.rst
+++ b/doc/guides/prog_guide/graph_lib.rst
@@ -189,14 +189,65 @@ In the above example, A graph object will be created with ethdev Rx
 node of port 0 and queue 0, all ipv4* nodes in the system,
 and ethdev tx node of all ports.
 
-Multicore graph processing
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-In the current graph library implementation, specifically,
-``rte_graph_walk()`` and ``rte_node_enqueue*`` fast path API functions
+Graph model choosing
+~~~~~~~~~~~~~~~~~~~~
+Currently, there are two different walking models. Use
+``rte_graph_worker_model_set()`` to set the walking model.
+
+RTC (Run-To-Completion)
+^^^^^^^^^^^^^^^^^^^^^^^
+This is the default graph walking model. Specifically,
+``rte_graph_walk_rtc()`` and ``rte_node_enqueue*`` fast path API functions
 are designed to work on single-core to have better performance.
 The fast path API works on graph object, So the multi-core graph
 processing strategy would be to create graph object PER WORKER.
 
+Example:
+
+Graph: node-0 -> node-1 -> node-2 @Core0.
+
+.. code-block:: diff
+
+    + - - - - - - - - - - - - - - - - - - - - - +
+    '                  Core #0                  '
+    '                                           '
+    ' +--------+     +---------+     +--------+ '
+    ' | Node-0 | --> | Node-1  | --> | Node-2 | '
+    ' +--------+     +---------+     +--------+ '
+    '                                           '
+    + - - - - - - - - - - - - - - - - - - - - - +
+
+Dispatch model
+^^^^^^^^^^^^^^
+The dispatch model enables a cross-core dispatching mechanism which employs
+a scheduling work-queue to dispatch streams to the other worker cores that
+are associated with the destination node.
+
+Use ``rte_graph_model_dispatch_lcore_affinity_set()`` to set lcore affinity
+with the node.
+Each worker core will have its own copy of the graph. Use ``rte_graph_clone()``
+to clone the graph for each worker and use
+``rte_graph_model_dispatch_core_bind()`` to bind the graph to the worker core.
+
+Example:
+
+Graph topo: node-0 -> node-1; node-1 -> node-2; node-2 -> node-3.
+Config graph: node-0 @Core0; node-1/3 @Core1; node-2 @Core2.
+
+.. code-block:: diff
+
+    + - - - - - -+     +- - - - - - - - - - - - - +     + - - - - - -+
+    '  Core #0   '     '          Core #1         '     '  Core #2   '
+    '            '     '                          '     '            '
+    ' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
+    ' | Node-0 | - - - ->| Node-1 |    | Node-3 |<- - - - | Node-2 | '
+    ' +--------+ '     ' +--------+    +--------+ '     ' +--------+ '
+    '            '     '     |                    '     '      ^     '
+    + - - - - - -+     +- - -|- - - - - - - - - - +     + - - -|- - -+
+                             |                                 |
+                             + - - - - - - - - - - - - - - - - +
+
+
 In fast path
 ~~~~~~~~~~~~
 Typical fast-path code looks like below, where the application
-- 
2.37.2



[Bug 1207] testpmd: No probed ethernet devices when running testpmd with Intel E810XXVDA2G1P5

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1207

Bug ID: 1207
   Summary: testpmd: No probed ethernet devices when running
testpmd with Intel E810XXVDA2G1P5
   Product: DPDK
   Version: 22.11
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: major
  Priority: Normal
 Component: testpmd
  Assignee: dev@dpdk.org
  Reporter: wang.j...@nokia-sbell.com
  Target Milestone: ---

Created attachment 248
  --> https://bugs.dpdk.org/attachment.cgi?id=248&action=edit
Test logs and debug logs

Hi DPDK support,
When I install DPDK and run testpmd, "No probed ethernet devices" is always
returned and no packets can be forwarded. The test NIC is an Intel E810XXVDA2G1P5.

All test logs are attached for details.

Please check what may cause the issue and what I can try to solve
it.

Thanks a lot.

BR,
wangjing

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 1208] testpmd: No probed ethernet devices when running testpmd with Intel E810XXVDA2G1P5

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1208

Bug ID: 1208
   Summary: testpmd: No probed ethernet devices when running
testpmd with Intel E810XXVDA2G1P5
   Product: DPDK
   Version: 22.11
  Hardware: x86
OS: Linux
Status: UNCONFIRMED
  Severity: major
  Priority: Normal
 Component: testpmd
  Assignee: dev@dpdk.org
  Reporter: wang.j...@nokia-sbell.com
  Target Milestone: ---

Hi DPDK support,
When I install DPDK and run testpmd, "No probed ethernet devices" is always
returned and no packets can be forwarded. The test NIC is an Intel E810XXVDA2G1P5.

All test logs are attached for details.

Please check what may cause the issue and what I can try to solve
it.

Thanks a lot.

BR,
wangjing

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 1208] testpmd: No probed ethernet devices when running testpmd with Intel E810XXVDA2G1P5

2023-03-30 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1208

David Marchand (david.march...@redhat.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |david.march...@redhat.com
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from David Marchand (david.march...@redhat.com) ---
.

*** This bug has been marked as a duplicate of bug 1207 ***

-- 
You are receiving this mail because:
You are the assignee for the bug.