[PATCH] net/ixgbe: fix RSS redirection table configuration for E610

2024-11-25 Thread Yuan Wang
Add labels to get the correct table size and register address for RETA.

Fixes: 316637762a5f ("net/ixgbe/base: enable E610 device")

Signed-off-by: Yuan Wang 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index eb431889c3..bed4374ad1 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -7238,10 +7238,12 @@ ixgbe_reta_size_get(enum ixgbe_mac_type mac_type) {
case ixgbe_mac_X550:
case ixgbe_mac_X550EM_x:
case ixgbe_mac_X550EM_a:
+   case ixgbe_mac_E610:
return RTE_ETH_RSS_RETA_SIZE_512;
case ixgbe_mac_X550_vf:
case ixgbe_mac_X550EM_x_vf:
case ixgbe_mac_X550EM_a_vf:
+   case ixgbe_mac_E610_vf:
return RTE_ETH_RSS_RETA_SIZE_64;
case ixgbe_mac_X540_vf:
case ixgbe_mac_82599_vf:
@@ -7257,6 +7259,7 @@ ixgbe_reta_reg_get(enum ixgbe_mac_type mac_type, uint16_t 
reta_idx) {
case ixgbe_mac_X550:
case ixgbe_mac_X550EM_x:
case ixgbe_mac_X550EM_a:
+   case ixgbe_mac_E610:
if (reta_idx < RTE_ETH_RSS_RETA_SIZE_128)
return IXGBE_RETA(reta_idx >> 2);
else
@@ -7264,6 +7267,7 @@ ixgbe_reta_reg_get(enum ixgbe_mac_type mac_type, uint16_t 
reta_idx) {
case ixgbe_mac_X550_vf:
case ixgbe_mac_X550EM_x_vf:
case ixgbe_mac_X550EM_a_vf:
+   case ixgbe_mac_E610_vf:
return IXGBE_VFRETA(reta_idx >> 2);
default:
return IXGBE_RETA(reta_idx >> 2);
-- 
2.43.0



Re: [PATCH v1 0/3] Bugfixes

2024-11-25 Thread Serhii Iliushyk
>On 24.11.2024, 20:13, "Stephen Hemminger" wrote:
>
>
>On Wed, 20 Nov 2024 19:02:54 +0100
>Serhii Iliushyk mailto:sil-...@napatech.com>> wrote:
>
>
>> This patch set provides further fixes:
>> 
>> Feature DSCP:
>> The issue with the modification of the DSCP field
>> for IPV4 and IPV6 is fixed by adding
>> copying the DSCP value to the flow handler.
>> 
>> Feature MTU:
>> Supplement existing toggle-like macros
>> with respective opposites for the correct enablement of the MTU
>> 
>> The header rte_pmd_ntnic.h was moved
>> to the root of ntnic PMD. The correct place for this header.
>> 
>> Danylo Vodopianov (1):
>> net/ntnic: fix action modify field DSCP
>> 
>> Oleksandr Kolomeiets (2):
>> net/ntnic: add supplementary macros
>> net/ntnic: move headers to driver's root
>> 
>> drivers/net/ntnic/https://linkprotect.cudasvc.com/url?a=https%3a%2f%2fmeson.build&c=E,1,KPLdWALh1y_hzTNAJ-J55jj5ck-Z4ano3LwTmgN8N-LYP4Z64ZnvMaYCxWcittpHHlIqOaUZJ7LdJ3eCxvocGIh_ND_7k1sv_6QlT0HMJxMZDQ,,&typo=1
>>  
>> 
>>  | 3 +++
>> .../nthw/flow_api/profile_inline/flow_api_profile_inline.c | 4 
>> .../nthw/flow_api/profile_inline/flow_api_profile_inline.h | 3 +++
>> drivers/net/ntnic/{nthw => }/rte_pmd_ntnic.h | 0
>> 4 files changed, 10 insertions(+)
>> rename drivers/net/ntnic/{nthw => }/rte_pmd_ntnic.h (100%)
>> 
>
>
>Applied to next-net
>
>

Hi Stephen,

I would like to know if these patches will be included in the RC4 for DPDK 
24.11. Their current state on the patchwork is "under review."
I want to know the exact state due to the coming code freeze date for RC4. 
Please let me know the details.

Also, I'm wondering about this patch set: 
https://patches.dpdk.org/project/dpdk/list/?series=34016
May It be included in the RC4?

It is essential for us to have these fixes included in the DPDK 24.11.

Thanks,
Serhii



Re: [PATCH] vhost: fix read vs write lock mismatch

2024-11-25 Thread Maxime Coquelin




On 11/25/24 17:28, David Marchand wrote:

On Mon, Nov 25, 2024 at 5:20 PM Maxime Coquelin
 wrote:

On 11/25/24 12:14, David Marchand wrote:

On Mon, Nov 18, 2024 at 5:24 PM Stephen Hemminger
 wrote:


If lock is acquired for write, it must be released for write
or a deadlock is likely.

Bugzilla ID: 1582
Fixes: 9fc93a1e2320 ("vhost: fix virtqueue access check in datapath")
Cc: david.march...@redhat.com
Cc: sta...@dpdk.org

Signed-off-by: Stephen Hemminger 
---
   lib/vhost/virtio_net.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 298a5dae74..d764d4bc6a 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -2538,7 +2538,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, struct 
vhost_virtqueue *vq,

  if (unlikely(!vq->access_ok)) {
  vhost_user_iotlb_rd_unlock(vq);
-   rte_rwlock_read_unlock(&vq->access_lock);
+   rte_rwlock_write_unlock(&vq->access_lock);


A write lock is taken earlier, because virtio_dev_rx_async_submit_*
need it for access to vq->async (as opposed to the sync code that only
takes read lock).

Here, no need to release/take again all locks.
A simpler fix is to directly call vring_translate(dev, vq).




Ok, so both solutions are correct.

David's one is more optimized, but this is a corner case in the async
datapath so not really critical.

On the other hand, Stephen's solution keeps the same pattern as the
other datapaths.


Ok, it avoids having a special case, and I prefer limiting special cases too.

I'll send a RFC for new lock annotations (for catching such bug) for
the next release.


Great, thanks for that!





I'd go with Stephen's solution, but would change the commit title to:

vhost: fix deadlock in Rx async path

With this change:

Reviewed-by: Maxime Coquelin 


Yes, the title needed some work.




Ack, will change while applying.

Thanks,
Maxime



Re: [PATCH] vhost: fix read vs write lock mismatch

2024-11-25 Thread Stephen Hemminger
On Mon, 25 Nov 2024 17:31:26 +0100
Maxime Coquelin  wrote:

> On 11/25/24 17:28, David Marchand wrote:
> > On Mon, Nov 25, 2024 at 5:20 PM Maxime Coquelin
> >  wrote:  
> >> On 11/25/24 12:14, David Marchand wrote:  
> >>> On Mon, Nov 18, 2024 at 5:24 PM Stephen Hemminger
> >>>  wrote:  
> 
>  If lock is acquired for write, it must be released for write
>  or a deadlock is likely.
> 
>  Bugzilla ID: 1582
>  Fixes: 9fc93a1e2320 ("vhost: fix virtqueue access check in datapath")
>  Cc: david.march...@redhat.com
>  Cc: sta...@dpdk.org
> 
>  Signed-off-by: Stephen Hemminger 
>  ---
> lib/vhost/virtio_net.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
>  diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
>  index 298a5dae74..d764d4bc6a 100644
>  --- a/lib/vhost/virtio_net.c
>  +++ b/lib/vhost/virtio_net.c
>  @@ -2538,7 +2538,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, 
>  struct vhost_virtqueue *vq,
> 
>    if (unlikely(!vq->access_ok)) {
>    vhost_user_iotlb_rd_unlock(vq);
>  -   rte_rwlock_read_unlock(&vq->access_lock);
>  +   rte_rwlock_write_unlock(&vq->access_lock);  
> >>>
> >>> A write lock is taken earlier, because virtio_dev_rx_async_submit_*
> >>> need it for access to vq->async (as opposed to the sync code that only
> >>> takes read lock).
> >>>
> >>> Here, no need to release/take again all locks.
> >>> A simpler fix is to directly call vring_translate(dev, vq).
> >>>
> >>>  
> >>
> >> Ok, so both solutions are correct.
> >>
> >> David's one is more optimized, but this is a corner case in the async
> >> datapath so not really critical.
> >>
> >> On the other hand, Stephen's solution keeps the same pattern as the
> >> other datapaths.  
> > 
> > Ok, it avoids having a special case, and I prefer limiting special cases 
> > too.
> > 
> > I'll send a RFC for new lock annotations (for catching such bug) for
> > the next release.  
> 
> Great, thanks for that!
> 
> >   
> >>
> >> I'd go with Stephen's solution, but would change the commit title to:
> >>
> >> vhost: fix deadlock in Rx async path
> >>
> >> With this change:
> >>
> >> Reviewed-by: Maxime Coquelin   
> > 
> > Yes, the title needed some work.
> > 
> >   
> 
> Ack, will change while applying.

Yes better title would be great.


Re: [PATCH v3 3/3] net/hns3: fix Rx packet without CRC data

2024-11-25 Thread Stephen Hemminger
On Fri, 19 Jul 2024 17:04:15 +0800
Jie Hai  wrote:

> +static inline void
> +hns3_recalculate_crc(struct rte_mbuf *m)
> +{
> + char *append_data;
> + uint32_t crc;
> +
> + crc = rte_net_crc_calc(rte_pktmbuf_mtod(m, void *),
> +m->data_len, RTE_NET_CRC32_ETH);
> +
> + /*
> +  * The hns3 driver requires that mbuf size must be at least 512B.
> +  * When CRC is stripped by hardware, the pkt_len must be less than
> +  * or equal to 60B. Therefore, the space of the mbuf is enough
> +  * to insert the CRC.
> +  *
> +  * In addition, after CRC is stripped by hardware, pkt_len and data_len
> +  * do not contain the CRC length. Therefore, after CRC data is appended
> +  * by PMD again, both pkt_len and data_len add the CRC length.
> +  */
> + append_data = rte_pktmbuf_append(m, RTE_NET_CRC32_ETH);
> + /* The CRC data is binary data and does not care about the byte order. 
> */
> + rte_memcpy(append_data, (void *)&crc, RTE_NET_CRC32_ETH);
> +}
> +

There is a bug here. If there is no space at end of mbuf (tailroom) then
rte_pktmbuf_append will return NULL. This would only happen if mbuf pool
was sized such that there was space of a full size packet without CRC
and then the code to add kept CRC was executed.

You need to check for NULL return and figure out some kind of unwind
path such as making it a receive error and dropping the packet.


Re: [PATCH] examples/l3fwd: fix Tx performance deteriorate

2024-11-25 Thread Stephen Hemminger
On Fri, 22 Nov 2024 15:13:36 +0800
Jie Hai  wrote:

> The application send packets only when the buffer is full, or the
> buffer is empty and the number of packets to be sent extends half
> of the buffer.
> 
> The change of MAX_PKT_BURST increases TX buffer size, while the
> default size of local cache on each lcore is 256, which not greater
> than the limit of transmitting. That would make the mbuf not on the
> local cache be frequently used and the performance deteriorates.
> 
> This problem can be solved by making the TX threshold smaller than
> the local cache size. For example, use the '--mbcache' parameter to
> make the local cache greater. This patch optimizes the default
> performance by lowering TX threshold.
> 
> Fixes: d5c4897ecfb2 ("examples/l3fwd: add option to set Rx burst size")
> 
> Signed-off-by: Jie Hai 

Do the other variants of l3fwd have the same problem?


[PATCH] net/mlx5: fix Rx queue control deref

2024-11-25 Thread Bing Zhao
When the Rx queue is shared, only the control structure is shared and
the private structure of each Rx queue is still independent. During
the port stop stage, the hardware resource will be released, and the
memory will be freed in the device close stage. Then the control
structure reference count should be decreased when freeing a private
structure.

In the previous implementation, the decreasing action was wrongly
put inside the owners list empty condition. Indeed, they should be
in the same level. And since the reference count was set to 1 after
the 1st queue is created, when checking the value, it should be
subtracted firstly and then check the value.

With this commit, the reference calculation and condition checking
will be corrected. The shared Rx queues' control structures will be
freed successlly to avoid the crash in the port restarting.

Fixes: 3c9a82fa6edc ("net/mlx5: fix Rx queue control management")
Cc: sta...@dpdk.org

Signed-off-by: Bing Zhao 
Acked-by: Dariusz Sosnowski 
---
 drivers/net/mlx5/mlx5_rx.h  |  2 +-
 drivers/net/mlx5/mlx5_rxq.c | 12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index da7c448948..1a6f174c40 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -157,7 +157,7 @@ struct mlx5_rxq_ctrl {
bool is_hairpin; /* Whether RxQ type is Hairpin. */
unsigned int socket; /* CPU socket ID for allocations. */
LIST_ENTRY(mlx5_rxq_ctrl) share_entry; /* Entry in shared RXQ list. */
-   RTE_ATOMIC(uint32_t) ctrl_ref; /* Reference counter. */
+   RTE_ATOMIC(int32_t) ctrl_ref; /* Reference counter. */
uint32_t share_group; /* Group ID of shared RXQ. */
uint16_t share_qid; /* Shared RxQ ID in group. */
unsigned int started:1; /* Whether (shared) RXQ has been started. */
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 0737f60272..126b1970e6 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -2268,6 +2268,7 @@ mlx5_rxq_release(struct rte_eth_dev *dev, uint16_t idx)
struct mlx5_rxq_priv *rxq;
struct mlx5_rxq_ctrl *rxq_ctrl;
uint32_t refcnt;
+   int32_t ctrl_ref;
 
if (priv->rxq_privs == NULL)
return 0;
@@ -2293,15 +2294,14 @@ mlx5_rxq_release(struct rte_eth_dev *dev, uint16_t idx)
}
} else { /* Refcnt zero, closing device. */
LIST_REMOVE(rxq, owner_entry);
-   if (LIST_EMPTY(&rxq_ctrl->owners)) {
+   ctrl_ref = rte_atomic_fetch_sub_explicit(&rxq_ctrl->ctrl_ref, 1,
+
rte_memory_order_relaxed) - 1;
+   if (ctrl_ref == 1 && LIST_EMPTY(&rxq_ctrl->owners)) {
if (!rxq_ctrl->is_hairpin)
mlx5_mr_btree_free
(&rxq_ctrl->rxq.mr_ctrl.cache_bh);
-   if (rte_atomic_fetch_sub_explicit(&rxq_ctrl->ctrl_ref, 
1,
-   rte_memory_order_relaxed) == 1) {
-   LIST_REMOVE(rxq_ctrl, share_entry);
-   mlx5_free(rxq_ctrl);
-   }
+   LIST_REMOVE(rxq_ctrl, share_entry);
+   mlx5_free(rxq_ctrl);
}
dev->data->rx_queues[idx] = NULL;
mlx5_free(rxq);
-- 
2.34.1



Re: Doubts in JumboFrames and stats_checks tests in DTS.

2024-11-25 Thread Patrick Robb
Hi Bharati,

Thanks, here is the meeting info. I'll see you tomorrow!

---

Patrick Robb is inviting you to a scheduled Zoom meeting.

Topic: Bharati & Patrick DTS Discussion
Time: Nov 26, 2024 11:00 AM Eastern Time (US and Canada)

Join from PC, Mac, Linux, iOS or Android: https://unh.zoom.us/j/92634291594

Keyboard shortcuts are available to navigate this Zoom meeting or webinar:
https://support.zoom.us/hc/en-us/articles/205683899-Hot-Keys-and-Keyboard-for-Zoom

Or iPhone one-tap:  13017158592,92634291594# or 13052241968,92634291594#

Or Telephone:
Dial: +1 301 715 8592 (US Toll)
Meeting ID: 926 3429 1594
International numbers available: https://unh.zoom.us/u/aHJatyofh

Or a H.323/SIP room system:
H.323: rc.unh.edu or 162.255.37.11 (US West) or 162.255.36.11 (US East)
Meeting ID: 926 3429 1594
SIP: 92634291...@zoomcrc.com

TROUBLESHOOTING STEPS:

Audio Echo In A Meeting:
https://support.zoom.us/hc/en-us/articles/202050538-Audio-Echo-In-A-Meeting

Want to Join a Test Meeting?: https://zoom.us/test

On Mon, Nov 25, 2024 at 12:37 PM Bharati Bhole - Geminus <
c_bhara...@xsightlabs.com> wrote:

> Hi Patrick,
>
> 11/26 16:00 UTC works for me.
> Please let me know which link to join.
>
> Thanks,
> Bharati.
> --
> *From:* Patrick Robb 
> *Sent:* Monday, November 25, 2024 9:27:29 PM
> *To:* Bharati Bhole - Geminus 
> *Cc:* d...@dpdk.org ; Nicholas Pratte ;
> Dean Marx ; Paul Szczepanek ;
> Luca Vizzarro ; NBU-Contact-Thomas Monjalon
> (EXTERNAL) ; dev 
> *Subject:* Re: Doubts in JumboFrames and stats_checks tests in DTS.
>
> Hi Bharati,
>
> It might be easiest to address your questions over a video conference call
> instead of email. Would this be okay?
>
> I am free tomorrow 11/26 16:00-18:00 UTC, or Wednesday 11/27 14:00-16:00
> UTC and 20:00-22:00 UTC. Or I have other availability if none of these work.
>
> On Mon, Nov 25, 2024 at 5:45 AM Bharati Bhole - Geminus <
> c_bhara...@xsightlabs.com> wrote:
>
> Hi Patrik,
>
> I used site - https://dpdk.org/git/dpdk to clone the DPDK code. I tried
> to go through the DTS/README.md file.
>
> This file says, it uses docker container for dev as well as test
> execution. But I did not find any steps for setting up the test environment
> for it.
>
> I tried to look for the steps at
> https://doc.dpdk.org/guides/tools/dts.html but its not there.
> Can you please point me to the document for the execution steps?
>
> Thanks,
> Bharati.
>
> --
> *From:* Patrick Robb 
> *Sent:* 22 November 2024 10:29 PM
> *To:* Bharati Bhole - Geminus 
> *Cc:* d...@dpdk.org ; Nicholas Pratte ;
> Dean Marx ; Paul Szczepanek ;
> Luca Vizzarro ; NBU-Contact-Thomas Monjalon
> (EXTERNAL) ; dev 
> *Subject:* Re: Doubts in JumboFrames and stats_checks tests in DTS.
>
> Hi Bharati,
>
> Welcome to the DTS mailing list. I will try to provide some answers based
> on my experience running DTS at the DPDK Community Lab at UNH. I will also
> flag that this "legacy" version of DTS is deprecated and getting minimal
> maintenance. The majority of the current efforts for DTS are directed
> towards the rewrite which exists within the /dts dir of the DPDK repo:
> https://git.dpdk.org/dpdk/tree/dts
>
> With that being said, of course the legacy repo is still useful and I
> encourage you to use it, so I will provide some comments inline below:
>
> On Fri, Nov 22, 2024 at 9:43 AM Bharati Bhole - Geminus <
> c_bhara...@xsightlabs.com> wrote:
>
> Hi,
>
> I am Bharati Bhole. I am a new member of DTS mailing list.
> I have recently started working on DTS for my company and facing some
> issues/failures while running the DTS.
> Please help me with understanding the test cases and expected behaviours.
>
> I am trying to understand the DTS behaviour for following TCs:
>
> 1. JumboFrames :
>
>1. When the test set the max_pkt_len for testpmd and calculate the
>expected acceptable packet size, does it consider NICs supporting 2 VLANS?
>(In case of MTU update test, I have seen that 2 VLANs NIC are being
>considered while calculating acceptable packets size but in JumboFrames I
>dont see it).
>
>
> No, 2 VLANs is not properly accounted for in the Jumboframes testsuite.
> And, this is actually highly topical, as this is an ongoing point of
> discussion in rewriting jumboframes and mtu_update for the new DTS
> framework (the testcases are getting combined into 1 testsuite).  I will
> paste the function from mtu_update of legacy DTS which you may be referring
> to:
>
> --
>
> def send_packet_of_size_to_port(self, port_id: int, pktsize: int):
>
> # The packet total size include ethernet header, ip header, and
> payload.
> # ethernet header length is 18 bytes, ip standard header length is
> 20 bytes.
> # pktlen = pktsize - ETHER_HEADER_LEN
> if self.kdriver in ["igb", "igc", "ixgbe"]:
> max_pktlen = pktsize + ETHER_HEADER_LEN + VLAN
>

Re: [PATCH v5 01/16] eal: provide pack start macro for MSVC

2024-11-25 Thread Andre Muezerie
On Fri, Nov 22, 2024 at 09:13:38AM +0100, Morten Brørup wrote:
> > From: Andre Muezerie [mailto:andre...@linux.microsoft.com]
> > Sent: Friday, 22 November 2024 01.12
> > 
> > On Thu, Nov 21, 2024 at 09:51:36PM +0100, Thomas Monjalon wrote:
> > > 21/11/2024 20:39, Andre Muezerie:
> > > > On Tue, Nov 19, 2024 at 09:32:07AM +0100, Morten Brørup wrote:
> > > > > > From: Andre Muezerie [mailto:andre...@linux.microsoft.com]
> > > > > > Sent: Tuesday, 19 November 2024 05.35
> > > > > >
> > > > > > From: Tyler Retzlaff 
> > > > > >
> > > > > > MSVC struct packing is not compatible with GCC. Provide a macro
> > that
> > > > > > can be used to push existing pack value and sets packing to 1-
> > byte.
> > > > > > The existing __rte_packed macro is then used to restore the
> > pack value
> > > > > > prior to the push.
> > > > > >
> > > > > > Instead of providing macros exclusively for MSVC and for GCC
> > the
> > > > > > existing macro is deliberately utilized to trigger a warning if
> > no
> > > > > > existing packing has been pushed allowing easy identification
> > of
> > > > > > locations where the __rte_msvc_pack is missing.
> > > > > >
> > > > > > Signed-off-by: Tyler Retzlaff 
> > > > > > ---
> > > > > >  lib/eal/include/rte_common.h | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/lib/eal/include/rte_common.h
> > > > > > b/lib/eal/include/rte_common.h
> > > > > > index 4d299f2b36..409890863e 100644
> > > > > > --- a/lib/eal/include/rte_common.h
> > > > > > +++ b/lib/eal/include/rte_common.h
> > > > > > @@ -103,8 +103,10 @@ typedef uint16_t unaligned_uint16_t;
> > > > > >   * Force a structure to be packed
> > > > > >   */
> > > > > >  #ifdef RTE_TOOLCHAIN_MSVC
> > > > > > -#define __rte_packed
> > > > > > +#define __rte_msvc_pack __pragma(pack(push, 1))
> > > > > > +#define __rte_packed __pragma(pack(pop))
> > > > > >  #else
> > > > > > +#define __rte_msvc_pack
> > > > > >  #define __rte_packed __attribute__((__packed__))
> > > > > >  #endif
> > > > > >
> > > > > > --
> > > > > > 2.47.0.vfs.0.3
> > > > >
> > > > > Before proceeding with this, can we please discuss the
> > alternative, proposed here:
> > > > >
> > https://inbox.dpdk.org/dev/CAJFAV8yStgiBbe+Nkt9mC30r0+ZP64_kGuRHOzqd90R
> > d2hx...@mail.gmail.com/
> > > > >
> > > > > The definition of the packing macro in OVS, for reference:
> > > > >
> > https://github.com/openvswitch/ovs/blob/main/include/openvswitch/compil
> > er.h#L209
> > > > >
> > > > > The current solution requires __rte_packed to be placed at the
> > end of a structure, although __attribute__((packed)) is normally
> > allowed at the beginning (between the "struct" tag and the name of the
> > structure), which introduces a high risk of contributors placing it
> > "incorrectly", thus causing errors.
> > > > >
> > > > > I have a strong preference for an __RTE_PACKED(decl) variant.
> > > > >
> > > > > Here's a third alternative:
> > > > > #ifdef RTE_TOOLCHAIN_MSVC
> > > > > #define __rte_msvc_pack_begin __pragma(pack(push, 1))
> > > > > #define __rte_msvc_pack_end   __pragma(pack(pop))
> > > > > #else
> > > > > #define __rte_msvc_pack_begin
> > > > > #define __rte_msvc_pack_end
> > > > > #endif
> > > > >
> > > > > The third alternative is also problematic, e.g. if a contributor
> > forgets the _end after the structure declaration, or adds another
> > structure declaration before the _end.
> > > > >
> > > > > -Morten
> > > >
> > > > I looked at the suggestions made and I liked the one having a
> > __RTE_PACKED macro
> > > > the most.
> > > >
> > > > Advantages:
> > > > - Can be placed in front of the struct, or even in the middle. Good
> > for readability.
> > > > - Does not require a different macro to be placed at the end of the
> > structure as was
> > > >   proposed in V5 series.
> > > > - Works well in 99% of the cases.
> > > >
> > > > Problems can arise when compiler directives are present in the
> > struct, as they
> > > > become arguments for __RTE_PACKED macro. This is not portable.
> > > > I've seen two situations in the DPDK code:
> > > >
> > > > 1) #defines mentioned in the struct. In this situation we can just
> > move the
> > > >#define out of the struct.
> 
> No problem.
> 
> > > >
> > > > 2) #if/#ifdef/#elif mentioned in the struct.
> > > > This is a somewhat common pattern in structs where fields change
> > based on
> > > > endianness.
> > > > Example:
> > > >
> > > > /**
> > > >  * IPv4 Header
> > > >  */
> > > > struct __rte_aligned(2) rte_ipv4_hdr {
> > > > __extension__
> > > > union {
> > > > uint8_t version_ihl;/**< version and header length 
> > > > */
> > > > struct {
> > > > #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > uint8_t ihl:4; /**< header length */
> > > > uint8_t version:4; /**< version */
> > > > #elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > uint8_t version:4; /**< vers

Re: [PATCH] vhost: fix read vs write lock mismatch

2024-11-25 Thread David Marchand
On Mon, Nov 18, 2024 at 5:24 PM Stephen Hemminger
 wrote:
>
> If lock is acquired for write, it must be released for write
> or a deadlock is likely.
>
> Bugzilla ID: 1582
> Fixes: 9fc93a1e2320 ("vhost: fix virtqueue access check in datapath")
> Cc: david.march...@redhat.com
> Cc: sta...@dpdk.org
>
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/vhost/virtio_net.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 298a5dae74..d764d4bc6a 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -2538,7 +2538,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, 
> struct vhost_virtqueue *vq,
>
> if (unlikely(!vq->access_ok)) {
> vhost_user_iotlb_rd_unlock(vq);
> -   rte_rwlock_read_unlock(&vq->access_lock);
> +   rte_rwlock_write_unlock(&vq->access_lock);

A write lock is taken earlier, because virtio_dev_rx_async_submit_*
need it for access to vq->async (as opposed to the sync code that only
takes read lock).

Here, no need to release/take again all locks.
A simpler fix is to directly call vring_translate(dev, vq).


-- 
David Marchand



[PATCH v2 2/3] net/i40e: initialize PTP to system time

2024-11-25 Thread Anatoly Burakov
Currently, i40e driver initializes PTP timestamp to 0. This is different
from what kernel driver does (which initializes it to system time).

Align the DPDK driver to kernel driver by setting PTP timestamp to system
time when enabling PTP.

Note that i40e driver always uses zero-based timestamps for PTP, so we
would only ever update the internal timecounter and not the actual NIC
registers.

Signed-off-by: Anatoly Burakov 
---
 drivers/net/i40e/i40e_ethdev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index ca128c7556..30dcdc68a8 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -10556,6 +10556,9 @@ i40e_timesync_enable(struct rte_eth_dev *dev)
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
uint32_t tsync_ctl_l;
uint32_t tsync_ctl_h;
+   struct timespec ts;
+
+   memset(&ts, 0, sizeof(struct timespec));
 
/* Stop the timesync system time. */
I40E_WRITE_REG(hw, I40E_PRTTSYN_INC_L, 0x0);
@@ -10585,6 +10588,9 @@ i40e_timesync_enable(struct rte_eth_dev *dev)
I40E_WRITE_REG(hw, I40E_PRTTSYN_CTL0, tsync_ctl_l);
I40E_WRITE_REG(hw, I40E_PRTTSYN_CTL1, tsync_ctl_h);
 
+   /* i40e uses zero-based timestamping so only adjust timecounter */
+   i40e_timesync_write_time(dev, &ts);
+
return 0;
 }
 
-- 
2.43.5



[PATCH v2 1/3] net/ixgbe: initialize PTP to system time

2024-11-25 Thread Anatoly Burakov
Currently, ixgbe driver initializes PTP timestamp to 0. This is different
from what kernel driver does (which initializes it to system time).

Align the DPDK driver to kernel driver by setting PTP timestamp to system
time when enabling PTP.

Note that ixgbe driver always uses zero-based timestamps for PTP, so we
would only ever update the internal timecounter and not the actual NIC
registers.

Implementation note: in order to get access to clock_gettime on MinGW, we
have to use rte_os_shim.h header, which provides a wrapper around that
function. However, what it *also* provides is wrapper macros around various
other OS-related functions such as read(). Due to one of the mailbox ops
in base code being called "read", MinGW will misinterpret a call into
that op as an attempt to call read() the OS function, and produce a
compile error. This is being worked around by using parentheses around
access to the read op.

Signed-off-by: Anatoly Burakov 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index d02d1e43a3..277fdbc9f2 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -35,6 +35,7 @@
 #ifdef RTE_LIB_SECURITY
 #include 
 #endif
+#include 
 
 #include "ixgbe_logs.h"
 #include "base/ixgbe_api.h"
@@ -4168,7 +4169,12 @@ ixgbevf_check_link(struct ixgbe_hw *hw, ixgbe_link_speed 
*speed,
/* if the read failed it could just be a mailbox collision, best wait
 * until we are called again and don't report an error
 */
-   if (mbx->ops[0].read(hw, &in_msg, 1, 0))
+
+   /*
+* on MinGW, the read op call is interpreted as call into read() macro,
+* so avoid calling it directly.
+*/
+   if ((mbx->ops[0].read)(hw, &in_msg, 1, 0))
goto out;
 
if (!(in_msg & IXGBE_VT_MSGTYPE_CTS)) {
@@ -6924,6 +6930,12 @@ ixgbe_timesync_enable(struct rte_eth_dev *dev)
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
uint32_t tsync_ctl;
uint32_t tsauxc;
+   struct timespec ts;
+
+   memset(&ts, 0, sizeof(struct timespec));
+
+   /* get current system time */
+   clock_gettime(CLOCK_REALTIME, &ts);
 
/* Stop the timesync system time. */
IXGBE_WRITE_REG(hw, IXGBE_TIMINCA, 0x0);
@@ -6956,6 +6968,9 @@ ixgbe_timesync_enable(struct rte_eth_dev *dev)
 
IXGBE_WRITE_FLUSH(hw);
 
+   /* ixgbe uses zero-based timestamping so only adjust timecounter */
+   ixgbe_timesync_write_time(dev, &ts);
+
return 0;
 }
 
-- 
2.43.5



[PATCH v2 3/3] net/e1000: initialize PTP to system time

2024-11-25 Thread Anatoly Burakov
Currently, e1000 driver initializes PTP timestamp to 0. This is different
from what kernel driver does (which initializes it to system time).

Align the DPDK driver to kernel driver by setting PTP timestamp to system
time when enabling PTP.

Note that e1000 driver always uses zero-based timestamps for PTP, so we
would only ever update the internal timecounter and not the actual NIC
registers.

Signed-off-by: Anatoly Burakov 
---
 drivers/net/e1000/igb_ethdev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index d3a9181874..c695f44c4c 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -4817,6 +4817,9 @@ igb_timesync_enable(struct rte_eth_dev *dev)
struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
uint32_t tsync_ctl;
uint32_t tsauxc;
+   struct timespec ts;
+
+   memset(&ts, 0, sizeof(struct timespec));
 
/* Stop the timesync system time. */
E1000_WRITE_REG(hw, E1000_TIMINCA, 0x0);
@@ -4861,6 +4864,9 @@ igb_timesync_enable(struct rte_eth_dev *dev)
tsync_ctl |= E1000_TSYNCTXCTL_ENABLED;
E1000_WRITE_REG(hw, E1000_TSYNCTXCTL, tsync_ctl);
 
+   /* e1000 uses zero-based timestamping so only adjust timecounter */
+   igb_timesync_write_time(dev, &ts);
+
return 0;
 }
 
-- 
2.43.5



Re: Doubts in JumboFrames and stats_checks tests in DTS.

2024-11-25 Thread Patrick Robb
Hi Bharati,

It might be easiest to address your questions over a video conference call
instead of email. Would this be okay?

I am free tomorrow 11/26 16:00-18:00 UTC, or Wednesday 11/27 14:00-16:00
UTC and 20:00-22:00 UTC. Or I have other availability if none of these work.

On Mon, Nov 25, 2024 at 5:45 AM Bharati Bhole - Geminus <
c_bhara...@xsightlabs.com> wrote:

> Hi Patrik,
>
> I used site - https://dpdk.org/git/dpdk to clone the DPDK code. I tried
> to go through the DTS/README.md file.
>
> This file says, it uses docker container for dev as well as test
> execution. But I did not find any steps for setting up the test environment
> for it.
>
> I tried to look for the steps at
> https://doc.dpdk.org/guides/tools/dts.html but its not there.
> Can you please point me to the document for the execution steps?
>
> Thanks,
> Bharati.
>
> --
> *From:* Patrick Robb 
> *Sent:* 22 November 2024 10:29 PM
> *To:* Bharati Bhole - Geminus 
> *Cc:* d...@dpdk.org ; Nicholas Pratte ;
> Dean Marx ; Paul Szczepanek ;
> Luca Vizzarro ; NBU-Contact-Thomas Monjalon
> (EXTERNAL) ; dev 
> *Subject:* Re: Doubts in JumboFrames and stats_checks tests in DTS.
>
> Hi Bharati,
>
> Welcome to the DTS mailing list. I will try to provide some answers based
> on my experience running DTS at the DPDK Community Lab at UNH. I will also
> flag that this "legacy" version of DTS is deprecated and getting minimal
> maintenance. The majority of the current efforts for DTS are directed
> towards the rewrite which exists within the /dts dir of the DPDK repo:
> https://git.dpdk.org/dpdk/tree/dts
>
> With that being said, of course the legacy repo is still useful and I
> encourage you to use it, so I will provide some comments inline below:
>
> On Fri, Nov 22, 2024 at 9:43 AM Bharati Bhole - Geminus <
> c_bhara...@xsightlabs.com> wrote:
>
> Hi,
>
> I am Bharati Bhole. I am a new member of DTS mailing list.
> I have recently started working on DTS for my company and facing some
> issues/failures while running the DTS.
> Please help me with understanding the test cases and expected behaviours.
>
> I am trying to understand the DTS behaviour for following TCs:
>
> 1. JumboFrames :
>
>1. When the test set the max_pkt_len for testpmd and calculate the
>expected acceptable packet size, does it consider NICs supporting 2 VLANS?
>(In case of MTU update test, I have seen that 2 VLANs NIC are being
>considered while calculating acceptable packets size but in JumboFrames I
>dont see it).
>
>
> No, 2 VLANs is not properly accounted for in the Jumboframes testsuite.
> And, this is actually highly topical, as this is an ongoing point of
> discussion in rewriting jumboframes and mtu_update for the new DTS
> framework (the testcases are getting combined into 1 testsuite).  I will
> paste the function from mtu_update of legacy DTS which you may be referring
> to:
>
> --
>
> def send_packet_of_size_to_port(self, port_id: int, pktsize: int):
>
> # The packet total size include ethernet header, ip header, and
> payload.
> # ethernet header length is 18 bytes, ip standard header length is
> 20 bytes.
> # pktlen = pktsize - ETHER_HEADER_LEN
> if self.kdriver in ["igb", "igc", "ixgbe"]:
> max_pktlen = pktsize + ETHER_HEADER_LEN + VLAN
> padding = max_pktlen - IP_HEADER_LEN - ETHER_HEADER_LEN - VLAN
> else:
> max_pktlen = pktsize + ETHER_HEADER_LEN + VLAN * 2
> padding = max_pktlen - IP_HEADER_LEN - ETHER_HEADER_LEN
> out = self.send_scapy_packet(
> port_id,
> f'Ether(dst=dutmac,
> src="52:00:00:00:00:00")/IP()/Raw(load="\x50"*{padding})',
>
> --
>
> One difference between legacy DTS and the "new" DTS is that in legacy DTS
> a master list of devices/drivers was maintained, and there were an endless
> amount of conditions like this where a device list would be checked, and
> then some behavior modified based on that list. Because this strategy leads
> to bugs, it's unresponsive to changes in driver code, hard to maintain, and
> for other reasons, we are no longer follow this approach in new DTS. Now,
> if we want to toggle different behavior (like determine max_pkt_len for a
> given MTU for a given device) that needs to be accomplished by querying
> testpmd for device info (there are various testpmd runtime commands for
> this). And, in situations where testpmd doesn't expose the information we
> need for checking device behavior in a particular testsuite - testpmd needs
> to be updated to allow for this.
>
> I am CC'ing Nick who is the person writing the new jumboframes + MTU
> testsuite, which (work in progress) is on patchwork here:
> https://patchwork.dpdk.org/project/dpdk/patch/20240726141307.14410-3-npra...@iol.unh.edu/
>
> Nick, maybe you can include the mailing list threads Thomas linke you, and
> explain your current understanding of h

DPDK CI Testing Meeting schedule updates

2024-11-25 Thread Patrick Robb
Hello,

I have a couple reminders/updates for the schedule of the upcoming DPDK CI
Testing meetings.

1. The meeting which was scheduled to happen this Thursday, November 28, is
cancelled due to the American Thanksgiving Holiday. To "make up" for this,
I will add a brief CI Labs discussion topic to the DTS meeting happening
next Thursday, November 5.

2. There is a conflict between the November 12 DPDK Ci Meeting, and a joint
Governing Board and Tech Board meeting about 2025 objectives. I am moving
the DPDK CI meeting back 1 hour to 15:00 UTC to avoid this conflict.
Ensuing meetings will return to the normal 14:00 UTC timeslot.

3. The December 26 DPDK CI meeting is cancelled due to Winter Holidays.

Since a number of CI meetings are being affected due to randomness, and the
DTS meetings are relatively unaffected, I will be sure to leave some space
for CI lab discussion during those DTS meetings.

Thanks.


Re: [PATCH v6 06/14] net/af_xdp: use rte strerror

2024-11-25 Thread Maryam Tahhan

The rte_errno may be an RTE-specific error code,
use rte_strerror() instead of strerror().

Signed-off-by: Dengdui Huang 
---
  drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 814398ba4b..c74107d46a 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -2518,7 +2518,7 @@ rte_pmd_af_xdp_probe(struct rte_vdev_device *dev)
ret = rte_mp_action_register(ETH_AF_XDP_MP_KEY, 
afxdp_mp_send_fds);
if (ret < 0 && rte_errno != ENOTSUP) {
AF_XDP_LOG_LINE(ERR, "%s: Failed to register multi-process 
IPC callback: %s",
-  name, strerror(rte_errno));
+  name, rte_strerror(rte_errno));
return -1;
}
}


Acked-by: Maryam Tahhan 



Re: [PATCH] vhost: fix read vs write lock mismatch

2024-11-25 Thread Maxime Coquelin




On 11/25/24 12:14, David Marchand wrote:

On Mon, Nov 18, 2024 at 5:24 PM Stephen Hemminger
 wrote:


If lock is acquired for write, it must be released for write
or a deadlock is likely.

Bugzilla ID: 1582
Fixes: 9fc93a1e2320 ("vhost: fix virtqueue access check in datapath")
Cc: david.march...@redhat.com
Cc: sta...@dpdk.org

Signed-off-by: Stephen Hemminger 
---
  lib/vhost/virtio_net.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 298a5dae74..d764d4bc6a 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -2538,7 +2538,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, struct 
vhost_virtqueue *vq,

 if (unlikely(!vq->access_ok)) {
 vhost_user_iotlb_rd_unlock(vq);
-   rte_rwlock_read_unlock(&vq->access_lock);
+   rte_rwlock_write_unlock(&vq->access_lock);


A write lock is taken earlier, because virtio_dev_rx_async_submit_*
need it for access to vq->async (as opposed to the sync code that only
takes read lock).

Here, no need to release/take again all locks.
A simpler fix is to directly call vring_translate(dev, vq).




Ok, so both solutions are correct.

David's one is more optimized, but this is a corner case in the async
datapath so not really critical.

On the other hand, Stephen's solution keeps the same pattern as the
other datapaths.

I'd go with Stephen's solution, but would change the commit title to:

vhost: fix deadlock in Rx async path

With this change:

Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



Re: [RFC PATCH 00/21] Reduce code duplication across Intel NIC drivers

2024-11-25 Thread David Marchand
Hello Bruce,

On Fri, Nov 22, 2024 at 1:54 PM Bruce Richardson
 wrote:
>
> This RFC attempts to reduce the amount of code duplication across a
> number of Intel NIC drivers, specifically: ixgbe, i40e, iavf, and ice.

Thanks for starting this effort!

>
> The first patch extract a function from the Rx side, otherwise the
> majority of the changes are on the Tx side, leading to a converged Tx
> queue structure across the 4 drivers, and a large number of common
> functions.
>
> Open question:
> * How should common code across drivers within a single device class be
>   managed?
>   - For now, I've created an "intel_eth" folder within the "common"
> driver directory, thinking about it after, it  implies to me that
> it is common across driver classes.
>   - Would it be better to create an "intel_common" directory within the
> "net" folder?

common/ drivers currently host code that is device class agnostic,
like providing helpers to talk with hw.
No common/ driver has a dependency on some device class library.

This series adds code that is not built into a library so there is no
need to express dependencies in meson.
But if the need arises, could it become a problem? (adding a
dependency to lib/ethdev to some drivers/common/xx/).


For now, I prefer the second proposition and have this code hosted in
drivers/net/.


-- 
David Marchand



Re: [PATCH] vhost: fix read vs write lock mismatch

2024-11-25 Thread David Marchand
On Mon, Nov 25, 2024 at 5:20 PM Maxime Coquelin
 wrote:
> On 11/25/24 12:14, David Marchand wrote:
> > On Mon, Nov 18, 2024 at 5:24 PM Stephen Hemminger
> >  wrote:
> >>
> >> If lock is acquired for write, it must be released for write
> >> or a deadlock is likely.
> >>
> >> Bugzilla ID: 1582
> >> Fixes: 9fc93a1e2320 ("vhost: fix virtqueue access check in datapath")
> >> Cc: david.march...@redhat.com
> >> Cc: sta...@dpdk.org
> >>
> >> Signed-off-by: Stephen Hemminger 
> >> ---
> >>   lib/vhost/virtio_net.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> >> index 298a5dae74..d764d4bc6a 100644
> >> --- a/lib/vhost/virtio_net.c
> >> +++ b/lib/vhost/virtio_net.c
> >> @@ -2538,7 +2538,7 @@ virtio_dev_rx_async_submit(struct virtio_net *dev, 
> >> struct vhost_virtqueue *vq,
> >>
> >>  if (unlikely(!vq->access_ok)) {
> >>  vhost_user_iotlb_rd_unlock(vq);
> >> -   rte_rwlock_read_unlock(&vq->access_lock);
> >> +   rte_rwlock_write_unlock(&vq->access_lock);
> >
> > A write lock is taken earlier, because virtio_dev_rx_async_submit_*
> > need it for access to vq->async (as opposed to the sync code that only
> > takes read lock).
> >
> > Here, no need to release/take again all locks.
> > A simpler fix is to directly call vring_translate(dev, vq).
> >
> >
>
> Ok, so both solutions are correct.
>
> David's one is more optimized, but this is a corner case in the async
> datapath so not really critical.
>
> On the other hand, Stephen's solution keeps the same pattern as the
> other datapaths.

Ok, it avoids having a special case, and I prefer limiting special cases too.

I'll send a RFC for new lock annotations (for catching such bug) for
the next release.


>
> I'd go with Stephen's solution, but would change the commit title to:
>
> vhost: fix deadlock in Rx async path
>
> With this change:
>
> Reviewed-by: Maxime Coquelin 

Yes, the title needed some work.


-- 
David Marchand



Re: [RFC PATCH 00/21] Reduce code duplication across Intel NIC drivers

2024-11-25 Thread Bruce Richardson
On Mon, Nov 25, 2024 at 05:25:47PM +0100, David Marchand wrote:
> Hello Bruce,
> 
> On Fri, Nov 22, 2024 at 1:54 PM Bruce Richardson
>  wrote:
> >
> > This RFC attempts to reduce the amount of code duplication across a
> > number of Intel NIC drivers, specifically: ixgbe, i40e, iavf, and ice.
> 
> Thanks for starting this effort!
> 
> >
> > The first patch extract a function from the Rx side, otherwise the
> > majority of the changes are on the Tx side, leading to a converged Tx
> > queue structure across the 4 drivers, and a large number of common
> > functions.
> >
> > Open question:
> > * How should common code across drivers within a single device class be
> >   managed?
> >   - For now, I've created an "intel_eth" folder within the "common"
> > driver directory, thinking about it after, it  implies to me that
> > it is common across driver classes.
> >   - Would it be better to create an "intel_common" directory within the
> > "net" folder?
> 
> common/ drivers currently host code that is device class agnostic,
> like providing helpers to talk with hw.
> No common/ driver has a dependency on some device class library.
> 
> This series adds code that is not built into a library so there is no
> need to express dependencies in meson.
> But if the need arises, could it become a problem? (adding a
> dependency to lib/ethdev to some drivers/common/xx/).
> 
> 
> For now, I prefer the second proposition and have this code hosted in
> drivers/net/.
> 
Thanks for the feedback. While when I started this prototyping I felt that
common was the right place for it, at this point I'm now tending towards
this second location - keeping it in net.
Any other thoughts on the relative merits of the various locations?

/Bruce


[PATCH] net/ixgbe: add support for new device

2024-11-25 Thread Zhichao Zeng
Add support for loopback_mode for new device.

Signed-off-by: Zhichao Zeng 
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 0d42fd8a3b..7d16eb9df7 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -5359,7 +5359,8 @@ ixgbe_check_supported_loopback_mode(struct rte_eth_dev 
*dev)
 hw->mac.type == ixgbe_mac_X540 ||
 hw->mac.type == ixgbe_mac_X550 ||
 hw->mac.type == ixgbe_mac_X550EM_x ||
-hw->mac.type == ixgbe_mac_X550EM_a)
+hw->mac.type == ixgbe_mac_X550EM_a ||
+hw->mac.type == ixgbe_mac_E610)
return 0;
 
return -ENOTSUP;
-- 
2.34.1



Re: [PATCH v3 3/3] net/hns3: fix Rx packet without CRC data

2024-11-25 Thread Stephen Hemminger
On Tue, 26 Nov 2024 10:40:27 +0800
huangdengdui  wrote:

> On 2024/11/26 1:45, Stephen Hemminger wrote:
> > On Fri, 19 Jul 2024 17:04:15 +0800
> > Jie Hai  wrote:
> >   
> >> +static inline void
> >> +hns3_recalculate_crc(struct rte_mbuf *m)
> >> +{
> >> +  char *append_data;
> >> +  uint32_t crc;
> >> +
> >> +  crc = rte_net_crc_calc(rte_pktmbuf_mtod(m, void *),
> >> + m->data_len, RTE_NET_CRC32_ETH);
> >> +
> >> +  /*
> >> +   * The hns3 driver requires that mbuf size must be at least 512B.
> >> +   * When CRC is stripped by hardware, the pkt_len must be less than
> >> +   * or equal to 60B. Therefore, the space of the mbuf is enough
> >> +   * to insert the CRC.
> >> +   *
> >> +   * In addition, after CRC is stripped by hardware, pkt_len and data_len
> >> +   * do not contain the CRC length. Therefore, after CRC data is appended
> >> +   * by PMD again, both pkt_len and data_len add the CRC length.
> >> +   */
> >> +  append_data = rte_pktmbuf_append(m, RTE_NET_CRC32_ETH);
> >> +  /* The CRC data is binary data and does not care about the byte order. 
> >> */
> >> +  rte_memcpy(append_data, (void *)&crc, RTE_NET_CRC32_ETH);
> >> +}
> >> +  
> > 
> > There is a bug here. If there is no space at end of mbuf (tailroom) then
> > rte_pktmbuf_append will return NULL. This would only happen if mbuf pool
> > was sized such that there was space of a full size packet without CRC
> > and then the code to add kept CRC was executed.
> > 
> > You need to check for NULL return and figure out some kind of unwind
> > path such as making it a receive error and dropping the packet.  
> 
> This situation has been described in the comments.
> The hns3 driver requires that mbuf size must be at least 512B.[1]
> When CRC needs to be recalculated, the packet length must be less than 60.
> So the space of the mbuf must be enough to insert the CRC.
> [1]:https://elixir.bootlin.com/dpdk/v23.11/source/drivers/net/hns3/hns3_rxtx.c#L1723

Ok, but maybe add an RTE_ASSERT() to be safe.


Re: [PATCH v3 3/3] net/hns3: fix Rx packet without CRC data

2024-11-25 Thread huangdengdui


On 2024/11/26 1:45, Stephen Hemminger wrote:
> On Fri, 19 Jul 2024 17:04:15 +0800
> Jie Hai  wrote:
> 
>> +static inline void
>> +hns3_recalculate_crc(struct rte_mbuf *m)
>> +{
>> +char *append_data;
>> +uint32_t crc;
>> +
>> +crc = rte_net_crc_calc(rte_pktmbuf_mtod(m, void *),
>> +   m->data_len, RTE_NET_CRC32_ETH);
>> +
>> +/*
>> + * The hns3 driver requires that mbuf size must be at least 512B.
>> + * When CRC is stripped by hardware, the pkt_len must be less than
>> + * or equal to 60B. Therefore, the space of the mbuf is enough
>> + * to insert the CRC.
>> + *
>> + * In addition, after CRC is stripped by hardware, pkt_len and data_len
>> + * do not contain the CRC length. Therefore, after CRC data is appended
>> + * by PMD again, both pkt_len and data_len add the CRC length.
>> + */
>> +append_data = rte_pktmbuf_append(m, RTE_NET_CRC32_ETH);
>> +/* The CRC data is binary data and does not care about the byte order. 
>> */
>> +rte_memcpy(append_data, (void *)&crc, RTE_NET_CRC32_ETH);
>> +}
>> +
> 
> There is a bug here. If there is no space at end of mbuf (tailroom) then
> rte_pktmbuf_append will return NULL. This would only happen if mbuf pool
> was sized such that there was space of a full size packet without CRC
> and then the code to add kept CRC was executed.
> 
> You need to check for NULL return and figure out some kind of unwind
> path such as making it a receive error and dropping the packet.

This situation has been described in the comments.
The hns3 driver requires that mbuf size must be at least 512B.[1]
When CRC needs to be recalculated, the packet length must be less than 60.
So the space of the mbuf must be enough to insert the CRC.
[1]:https://elixir.bootlin.com/dpdk/v23.11/source/drivers/net/hns3/hns3_rxtx.c#L1723


Re: [PATCH] examples/l3fwd: fix Tx performance deteriorate

2024-11-25 Thread Jie Hai

Hi, konstantin.ananyev,

That sounds better, will send V2。

Thanks,
Jie Hai



In  commit:
examples/l3fwd: add option to set Rx burst size
you introduced new global
uint32_t nb_pkt_per_burst;
Why not to use it for both (rx and tx) paths?
Or if necessary introduce another one for tx, so we'll have:
uint32_t nb_rx_pkt_per_burst, nb_tx_pkt_per_burst,;
To me that is much better then create some hardcoded
and implicit thresholds.


Fixes: d5c4897ecfb2 ("examples/l3fwd: add option to set Rx burst size")

Signed-off-by: Jie Hai 
---
  examples/l3fwd/l3fwd.h| 8 +---
  examples/l3fwd/l3fwd_common.h | 6 +++---
  2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 0cce3406ee7d..a01fecd51261 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -35,7 +35,7 @@
  /*
   * Try to avoid TX buffering if we have at least MAX_TX_BURST packets to send.
   */
-#defineMAX_TX_BURST  (MAX_PKT_BURST / 2)
+#defineMAX_TX_BURST DEFAULT_PKT_BURST

  #define NB_SOCKETS8

@@ -57,6 +57,8 @@
  #define L3FWD_HASH_ENTRIES(1024*1024*1)
  #endif

+static_assert(MAX_TX_BURST <= MAX_PKT_BURST, "MAX_TX_BURST should be at most 
MAX_PKT_BURST");
+
  struct parm_cfg {
const char *rule_ipv4_name;
const char *rule_ipv6_name;
@@ -152,8 +154,8 @@ send_single_packet(struct lcore_conf *qconf,
len++;

/* enough pkts to be sent */
-   if (unlikely(len == MAX_PKT_BURST)) {
-   send_burst(qconf, MAX_PKT_BURST, port);
+   if (unlikely(len == MAX_TX_BURST)) {
+   send_burst(qconf, MAX_TX_BURST, port);
len = 0;
}

diff --git a/examples/l3fwd/l3fwd_common.h b/examples/l3fwd/l3fwd_common.h
index d94e5f135791..3f504dc0a552 100644
--- a/examples/l3fwd/l3fwd_common.h
+++ b/examples/l3fwd/l3fwd_common.h
@@ -71,7 +71,7 @@ send_packetsx4(struct lcore_conf *qconf, uint16_t port, 
struct rte_mbuf *m[],
 * If TX buffer for that queue is empty, and we have enough packets,
 * then send them straightway.
 */
-   if (num >= MAX_TX_BURST && len == 0) {
+   if (num >= MAX_TX_BURST / 2 && len == 0) {
n = rte_eth_tx_burst(port, qconf->tx_queue_id[port], m, num);
if (unlikely(n < num)) {
do {
@@ -112,9 +112,9 @@ send_packetsx4(struct lcore_conf *qconf, uint16_t port, 
struct rte_mbuf *m[],
len += n;

/* enough pkts to be sent */
-   if (unlikely(len == MAX_PKT_BURST)) {
+   if (unlikely(len > MAX_TX_BURST)) {

-   send_burst(qconf, MAX_PKT_BURST, port);
+   send_burst(qconf, len, port);

/* copy rest of the packets into the TX buffer. */
len = num - n;
--
2.22.0




Re: [PATCH v3 1/3] ethdev: add description for KEEP CRC offload

2024-11-25 Thread Jie Hai



Hi, Stephen Hemminger

Thanks for your review.
I will add doc and fix on drivers in the next version.
The test will be done later.

On 2024/11/23 1:10, Stephen Hemminger wrote:

On Fri, 19 Jul 2024 17:04:13 +0800
Jie Hai  wrote:


From: Dengdui Huang 

The data exceeds the pkt_len in mbuf is inavailable for user.

   unavailable


When KEEP CRC offload is enabled, CRC field length should be
included in the pkt_len in mbuf. However, almost of drivers
supported KEEP CRC feature didn't add the CRC data length to
pkt_len. So it is very necessary to add comments for this.


All drivers must do the same thing, or this is a serious bug
in the drivers. Just changing a comment is not going to be helpful.

To fix this right:
  1. Do a test with one of the original drivers in DPDK that has this
 feature. I would suggest ixgbe, mlx5 or bnxt.
I can test it on ixgbe and mlx5.
  2. Add a test to the PMD tests that validates this (if there is not
 one already).


Maybe later and not come with this patchset.

  3. Put the documentation in a place where it shows up in user documentation.
 Either in doxygen comment or in doc/guides/nics


Will add in the next version.

  4. Verify that all devices conform to the desired behavior

I can help, but only have some old mlx5 cards to test here.

Thanks.

Just putting comment in ethdev.h is not enough.



Thanks,
Jie Hai





Re: [PATCH] doc: reword glossary

2024-11-25 Thread Stephen Hemminger
On Thu, 21 Nov 2024 18:26:45 -0800
Nandini Persad  wrote:

> I added additional reference links and definitions to many
> of the terms in the glossary. Please feel free to provide
> feedback to ensure my definitions suit the proper context
> in the DPDK community.
> 
> Signed-off-by: Nandini Persad 

Rather than putting links after the text, they can be embedded
which solves lots of the reference warnings.

Please only link to neutral third party places like IETF (RFC's), Wikipedia, 
etc.
Not vendor sites.

I added a few more links, and there are many more that could go here.
HPET, IPv4 (RFC), IPv6 (RFC), LAN (wikipedia), MTU, NIC, Out of Order execution,
PCI (PCI standard), PHY (wikipedia), PIE (RFC), QoS (various RFC's),
RCU (wikipedia), RED(RFC), RSS (wikipedia) , SLA, srTCM, Traffic Class,
TLB, TSC, TUN/TAP, VLAN, WRED, ...

diff --git a/doc/guides/prog_guide/glossary.rst 
b/doc/guides/prog_guide/glossary.rst
index 9f85e46437..d832d4c0be 100644
--- a/doc/guides/prog_guide/glossary.rst
+++ b/doc/guides/prog_guide/glossary.rst
@@ -6,35 +6,35 @@ Glossary
 
 
 ACL
-   An access control list (ACL) is a set of rules that define who can access a 
resource and what actions they can perform.
-   `ACL Link 
`_
+   An `access control list (ACL) 
`_
+   is a set of rules that define who can access a resource and what actions 
they can perform.
 
 API
Application Programming Interface
 
 ASLR
-   Linux* kernel Address-Space Layout Randomization
-   A computer security technique that protects against buffer overflow attacks 
by randomizing the location of executables in memory in Linux.
-   `ASLR Link 
`_
+   `Address-Space Layout Randomization (ASLR) 
`_
+   is a computer security technique that protects against buffer overflow 
attacks by randomizing the location of
+   executables in memory.
 
 BSD
-   Berkeley Software Distribution is a Unix-like operating system.
+   `Berkeley Software Distribution (BSD) 
`_
+   is an version of Unix™ operating system.
 
 Clr
Clear
 
 CIDR
-   Classless Inter-Domain Routing
-   A method of assigning IP address that improves data routing efficiency on 
the internet and is used in IPv4 and IPv6.
-   `RFC Link `_
+   `Classless Inter-Domain Routing (CIDR) 
`_
+   is a method of assigning IP address that improves data routing efficiency 
on the internet and is used in IPv4 and IPv6.
 
 Control Plane
-   A Control Plane is a key concept in networking that refers to the part of a 
network system
+   A `Control Plane `_ is a 
concept in networking that refers to the part of the system
responsible for managing and making decisions about where and how data 
packets are forwarded within a network.
 
 Core
-   A core may include several lcores or threads if the processor supports 
simultaneous multithreading (SMT).
-   `Simultaneous Multithreading 
`_
+   A core may include several lcores or threads if the processor supports
+   `simultaneous multithreading (SMT) 
`_
 
 Core Components
A set of libraries provided by DPDK which are used by nearly all 
applications and
@@ -49,8 +49,9 @@ CRC
 
 Data Plane
In contrast to the control plane, which is responsible for setting up and 
managing data connections,
-   the data plane in a network architecture includes the layers involved when 
processing and forwarding
-   data packets between communicating endpoints. These layers must be highly 
optimized to achieve good performance.
+   the `data plane `_ in a network 
architecture includes the
+   layers involved when processing and forwarding data packets between 
communicating endpoints.
+   These layers must be highly optimized to achieve good performance.
 
 DIMM
Dual In-line Memory Module
@@ -58,23 +59,22 @@ DIMM
circuit board that connect it directly to the computer motherboard.
 
 Doxygen
-   A documentation generator used in the DPDK to generate the API reference.
-   `Doxygen Link `_
+   `Doxygen `_ is a
+   documentation generator used in the DPDK to generate the API reference.
 
 DPDK
Data Plane Development Kit
 
 DRAM
-   Dynamic Random Access Memory
-   A type of random access memory (RAM) that is used in computers to 
temporarily store information.
-   `Link 

[PATCH v5 3/3] test/graph: fix graph autotest second run test failure

2024-11-25 Thread kirankumark
From: Kiran Kumar K 

The graph autotest second run test is failing due to the
node name is already present in the node list. Adding changes
to free nodes at the time of test cleanup.

Fixes: 6b89650418fe ("test/graph: add functional tests")

Signed-off-by: Kiran Kumar K 
---
 app/test/test_graph.c | 95 ---
 1 file changed, 90 insertions(+), 5 deletions(-)

diff --git a/app/test/test_graph.c b/app/test/test_graph.c
index 2840a25b13..635b8dd527 100644
--- a/app/test/test_graph.c
+++ b/app/test/test_graph.c
@@ -68,6 +68,8 @@ static void *mbuf_p[MAX_NODES + 1][MBUFF_SIZE];
 static rte_graph_t graph_id;
 static uint64_t obj_stats[MAX_NODES + 1];
 static uint64_t fn_calls[MAX_NODES + 1];
+static uint32_t dummy_nodes_id[MAX_NODES];
+static uint32_t dummy_nodes_id_count;
 
 const char *node_patterns[] = {
"test_node_source1",   "test_node00",
@@ -541,6 +543,66 @@ test_lookup_functions(void)
return 0;
 }
 
+static int
+test_node_id(void)
+{
+   uint32_t node_id, odummy_id, dummy_id, dummy_id1;
+
+   node_id = rte_node_from_name("test_node00");
+
+   dummy_id = rte_node_clone(node_id, "test_node_id00");
+   if (rte_node_is_invalid(dummy_id)) {
+   printf("Got invalid id when clone\n");
+   return -1;
+   }
+
+   dummy_id1 = rte_node_clone(node_id, "test_node_id01");
+   if (rte_node_is_invalid(dummy_id1)) {
+   printf("Got invalid id when clone\n");
+   return -1;
+   }
+
+   /* Expect next node id to be node_id + 1 */
+   if ((dummy_id + 1) != dummy_id1) {
+   printf("Node id didn't match, expected = %d got = %d\n",
+  dummy_id+1, dummy_id1);
+   return -1;
+   }
+
+   odummy_id = dummy_id;
+   /* Free one of the cloned node */
+   if (rte_node_free(dummy_id)) {
+   printf("Failed to free node\n");
+   return -1;
+   }
+
+   /* Clone again, should get the same id, that is freed */
+   dummy_id = rte_node_clone(node_id, "test_node_id00");
+   if (rte_node_is_invalid(dummy_id)) {
+   printf("Got invalid id when clone\n");
+   return -1;
+   }
+
+   if (dummy_id != odummy_id) {
+   printf("Node id didn't match, expected = %d got = %d\n",
+  odummy_id, dummy_id);
+   return -1;
+   }
+
+   /* Free the node */
+   if (rte_node_free(dummy_id)) {
+   printf("Failed to free node\n");
+   return -1;
+   }
+
+   if (rte_node_free(dummy_id1)) {
+   printf("Failed to free node\n");
+   return -1;
+   }
+
+   return 0;
+}
+
 static int
 test_node_clone(void)
 {
@@ -551,11 +613,12 @@ test_node_clone(void)
node_id = rte_node_from_name("test_node00");
tm->test_node[0].idx = node_id;
 
-   dummy_id = rte_node_clone(node_id, "test_node00");
-   if (rte_node_is_invalid(dummy_id)) {
+   dummy_nodes_id[dummy_nodes_id_count] = rte_node_clone(node_id, 
"test_node00");
+   if (rte_node_is_invalid(dummy_nodes_id[dummy_nodes_id_count])) {
printf("Got invalid id when clone, Expecting fail\n");
return -1;
}
+   dummy_nodes_id_count++;
 
/* Clone with same name, should fail */
dummy_id = rte_node_clone(node_id, "test_node00");
@@ -635,15 +698,15 @@ test_create_graph(void)
.nb_node_patterns = 6,
.node_patterns = node_patterns_dummy,
};
-   uint32_t dummy_node_id;
uint32_t node_id;
 
node_id = rte_node_from_name("test_node00");
-   dummy_node_id = rte_node_clone(node_id, "dummy_node");
-   if (rte_node_is_invalid(dummy_node_id)) {
+   dummy_nodes_id[dummy_nodes_id_count] = rte_node_clone(node_id, 
"dummy_node");
+   if (rte_node_is_invalid(dummy_nodes_id[dummy_nodes_id_count])) {
printf("Got invalid node id\n");
return -1;
}
+   dummy_nodes_id_count++;
 
graph_id = rte_graph_create("worker0", &gconf);
if (graph_id != RTE_GRAPH_ID_INVALID) {
@@ -1026,17 +1089,39 @@ graph_setup(void)
}
printf("test_node_clone: pass\n");
 
+   if (test_node_id()) {
+   printf("test_node_id: fail\n");
+   return -1;
+   }
+   printf("test_node_id: pass\n");
+
return 0;
 }
 
 static void
 graph_teardown(void)
 {
+   uint32_t i;
int id;
 
id = rte_graph_destroy(rte_graph_from_name("worker0"));
if (id)
printf("Graph Destroy failed\n");
+
+   for (i = 1; i < MAX_NODES; i++) {
+   if (rte_node_free(test_main.test_node[i].idx)) {
+   printf("Node free failed\n");
+   return;
+   }
+   }
+
+   for (i = 0; i < dummy_nodes_id_count; i++) {
+   if (rte_node_free(dummy_no

[PATCH v5 2/3] graph: add support for node free API

2024-11-25 Thread kirankumark
From: Kiran Kumar K 

Add support for rte_node_free API to free the node and its
memory, if node is not part of any of the created graphs.

Signed-off-by: Kiran Kumar K 
---
v5:
- Fix review comments.

 lib/graph/graph.c | 16 
 lib/graph/graph_private.h | 13 +
 lib/graph/node.c  | 25 +
 lib/graph/rte_graph.h | 15 +++
 lib/graph/version.map |  3 +++
 5 files changed, 72 insertions(+)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index dff8e690a8..bad00c8fc0 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -20,6 +20,22 @@
 static struct graph_head graph_list = STAILQ_HEAD_INITIALIZER(graph_list);
 static rte_spinlock_t graph_lock = RTE_SPINLOCK_INITIALIZER;

+bool
+graph_is_node_active_in_graph(struct node *node)
+{
+   struct graph *graph;
+
+   STAILQ_FOREACH(graph, &graph_list, next) {
+   struct graph_node *graph_node;
+
+   STAILQ_FOREACH(graph_node, &graph->node_list, next)
+   if (graph_node->node == node)
+   return 1;
+   }
+
+   return 0;
+}
+
 /* Private functions */
 static struct graph *
 graph_from_id(rte_graph_t id)
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index fdaf5649b8..6eecba4f78 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -440,4 +440,17 @@ int graph_sched_wq_create(struct graph *_graph, struct 
graph *_parent_graph,
  */
 void graph_sched_wq_destroy(struct graph *_graph);

+/**
+ * @internal
+ *
+ * Check if given node present in any of the created graphs.
+ *
+ * @param node
+ *   The node object
+ *
+ * @return
+ *   0 if not present in any graph, else return 1.
+ */
+bool graph_is_node_active_in_graph(struct node *_node);
+
 #endif /* _RTE_GRAPH_PRIVATE_H_ */
diff --git a/lib/graph/node.c b/lib/graph/node.c
index 5ff8dfcd55..5aaf787d5e 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -481,3 +481,28 @@ rte_node_max_count(void)
}
return node_id;
 }
+
+int
+rte_node_free(rte_node_t id)
+{
+   struct node *node;
+   int rc = -1;
+
+   if (node_from_id(id) == NULL)
+   goto fail;
+
+   graph_spinlock_lock();
+   STAILQ_FOREACH(node, &node_list, next) {
+   if (id == node->id) {
+   if (!graph_is_node_active_in_graph(node)) {
+   STAILQ_REMOVE(&node_list, node, node, next);
+   free(node);
+   rc = 0;
+   }
+   break;
+   }
+   }
+   graph_spinlock_unlock();
+fail:
+   return rc;
+}
diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h
index f5e575dbed..097d0dc9d5 100644
--- a/lib/graph/rte_graph.h
+++ b/lib/graph/rte_graph.h
@@ -22,6 +22,7 @@
 #include 

 #include 
+#include 

 #ifdef __cplusplus
 extern "C" {
@@ -661,6 +662,20 @@ rte_node_is_invalid(rte_node_t id)
return (id == RTE_NODE_ID_INVALID);
 }

+/**
+ * Release the memory allocated for a node created using RTE_NODE_REGISTER or 
rte_node_clone,
+ * if it is not linked to any graphs.
+ *
+ * @param id
+ *   Node id to check.
+ *
+ * @return
+ *   - 0: Success.
+ *   -<0: Failure.
+ */
+__rte_experimental
+int rte_node_free(rte_node_t id);
+
 /**
  * Test the validity of edge id.
  *
diff --git a/lib/graph/version.map b/lib/graph/version.map
index 44fadc00fd..a793ea1d8e 100644
--- a/lib/graph/version.map
+++ b/lib/graph/version.map
@@ -58,4 +58,7 @@ EXPERIMENTAL {

# added in 24.11
rte_node_xstat_increment;
+
+   # added in 25.03
+   rte_node_free;
 };
--
2.43.0



[PATCH v5 1/3] graph: avoid global node ID counter

2024-11-25 Thread kirankumark
From: Kiran Kumar K 

The node id is determined based on a global variable that is
incremented every time a node is created. Adding changes to
remove the global counter. Make sure that the node list is always
ordered by increasing node ids. When creating a new node, pick a free
id which is not allocated.

Signed-off-by: Kiran Kumar K 
---
 lib/graph/graph_populate.c | 11 +++--
 lib/graph/graph_private.h  |  8 
 lib/graph/node.c   | 86 --
 3 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/lib/graph/graph_populate.c b/lib/graph/graph_populate.c
index 1e6b08319e..026daecb21 100644
--- a/lib/graph/graph_populate.c
+++ b/lib/graph/graph_populate.c
@@ -78,7 +78,6 @@ graph_nodes_populate(struct graph *_graph)
struct rte_graph *graph = _graph->graph;
struct graph_node *graph_node;
rte_edge_t count, nb_edges;
-   const char *parent;
rte_node_t pid;
 
STAILQ_FOREACH(graph_node, &_graph->node_list, next) {
@@ -94,8 +93,14 @@ graph_nodes_populate(struct graph *_graph)
memcpy(node->name, graph_node->node->name, RTE_GRAPH_NAMESIZE);
pid = graph_node->node->parent_id;
if (pid != RTE_NODE_ID_INVALID) { /* Cloned node */
-   parent = rte_node_id_to_name(pid);
-   memcpy(node->parent, parent, RTE_GRAPH_NAMESIZE);
+   struct node *pnode;
+
+   STAILQ_FOREACH(pnode, node_list_head_get(), next) {
+   if (pnode->id == pid) {
+   memcpy(node->parent, pnode->name, 
RTE_GRAPH_NAMESIZE);
+   break;
+   }
+   }
}
node->id = graph_node->node->id;
node->parent_id = pid;
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index da48d73587..fdaf5649b8 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -29,14 +29,6 @@ extern int rte_graph_logtype;
 #define graph_info(...) GRAPH_LOG(INFO, __VA_ARGS__)
 #define graph_dbg(...) GRAPH_LOG(DEBUG, __VA_ARGS__)
 
-#define ID_CHECK(id, id_max)   
\
-   do {   \
-   if ((id) >= (id_max)) {\
-   rte_errno = EINVAL;\
-   goto fail; \
-   }  \
-   } while (0)
-
 #define SET_ERR_JMP(err, where, fmt, ...)  
\
do {   \
graph_err(fmt, ##__VA_ARGS__); \
diff --git a/lib/graph/node.c b/lib/graph/node.c
index 63db629da8..5ff8dfcd55 100644
--- a/lib/graph/node.c
+++ b/lib/graph/node.c
@@ -15,9 +15,56 @@
 #include "graph_private.h"
 
 static struct node_head node_list = STAILQ_HEAD_INITIALIZER(node_list);
-static rte_node_t node_id;
 
-#define NODE_ID_CHECK(id) ID_CHECK(id, node_id)
+static struct node *
+node_from_id(rte_node_t id)
+{
+   struct node *node = NULL;
+
+   graph_spinlock_lock();
+   rte_errno = EINVAL;
+   STAILQ_FOREACH(node, &node_list, next) {
+   if (node->id == id) {
+   rte_errno = 0;
+   goto exit;
+   }
+   }
+exit:
+   graph_spinlock_unlock();
+   return node;
+}
+
+static rte_node_t
+next_next_free_id(void)
+{
+   struct node *node;
+   rte_node_t id = 0;
+
+   STAILQ_FOREACH(node, &node_list, next) {
+   if (id < node->id)
+   break;
+   id = node->id + 1;
+   }
+   return id;
+}
+
+static void
+node_insert_ordered(struct node *node)
+{
+   struct node *after, *g;
+
+   after = NULL;
+   STAILQ_FOREACH(g, &node_list, next) {
+   if (g->id < node->id)
+   after = g;
+   else if (g->id > node->id)
+   break;
+   }
+   if (after == NULL)
+   STAILQ_INSERT_HEAD(&node_list, node, next);
+   else
+   STAILQ_INSERT_AFTER(&node_list, after, node, next);
+}
 
 /* Private functions */
 struct node_head *
@@ -116,10 +163,10 @@ __rte_node_register(const struct rte_node_register *reg)
}
 
node->lcore_id = RTE_MAX_LCORE;
-   node->id = node_id++;
+   node->id = next_next_free_id();
 
-   /* Add the node at tail */
-   STAILQ_INSERT_TAIL(&node_list, node, next);
+   /* Add the node in ordered list */
+   node_insert_ordered(node);
graph_spinlock_unlock();
 
return node->id;
@@ -194,7 +241,9 @@ rte_node_clone(rt