date:20220711

Minutes of Technical Board Meeting, 2022-06-29

2022-07-11 Thread Thomas Monjalon

Members Attending: 9/10
- Aaron Conole
- Bruce Richardson
- Honnappa Nagarahalli
- Jerin Jacob
- Kevin Traynor
- Konstantin Ananyev
- Maxime Coquelin
- Stephen Hemminger
- Thomas Monjalon (Chair)

NOTE: The Technical Board meetings take place every second Wednesday
on https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.
Agenda and minutes can be found at http://core.dpdk.org/techboard/minutes

NOTE: Next meeting will be on Wednesday 2022-07-13 @3pm UTC,
and will be chaired by Aaron.


1/ Actions from last meeting:
- Thomas to send an update of the security process
- Bruce to send a deprecation notice for the KNI example
- Looking for volunteers to start a doc intro based on the white paper
  (note: we should include a reference to the white paper)

2/ Event in France:
- Discussed last details about the CFP, should be ready soon.
- There are attendee fees to avoid cancellations,
  and early bird price to avoid late registrations.
- Hackathon will happen on September 6 with several topics
  (static analysis, code coverage, coccinelle, vhost vulnerabilities)
- Open discussions with the Technical Board should happen
  at the end of September 6 or 7 (agenda to be finalized)

3/ Technical writer hiring process:
- Will ask what could be a doc audit to an agency
- Position is still open

4/ Servers backup:
- Asking Linux Foundation for hosting of backups.
- Access should be granted to a group (techboard members?).

RE: [PATCH] vhost: fix unnecessary dirty page logging

2022-07-11 Thread He, Xingguang

Hi,

> -Original Message-
> From: Ding, Xuan 
> Sent: Friday, July 8, 2022 4:09 PM
> To: Maxime Coquelin ; Xia, Chenbo
> 
> Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> ; Yang, YvonneX ;
> Jiang, Cheng1 
> Subject: RE: [PATCH] vhost: fix unnecessary dirty page logging
> 
> 
> 
> > -Original Message-
> > From: Maxime Coquelin 
> > Sent: 2022年7月8日 15:58
> > To: Ding, Xuan ; Xia, Chenbo
> > 
> > Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> > ; Yang, YvonneX ;
> > Jiang,
> > Cheng1 
> > Subject: Re: [PATCH] vhost: fix unnecessary dirty page logging
> >
> > Hi,
> >
> > On 7/8/22 09:04, Ding, Xuan wrote:
> > > Hi,
> > >
> > >> -Original Message-
> > >> From: Maxime Coquelin 
> > >> Sent: 2022年7月7日 19:31
> > >> To: Xia, Chenbo ; Ding, Xuan
> > >> 
> > >> Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> > >> ; Yang, YvonneX ;
> > >> Jiang,
> > >> Cheng1 
> > >> Subject: Re: [PATCH] vhost: fix unnecessary dirty page logging
> > >>
> > >>
> > >>
> > >> On 7/7/22 11:51, Xia, Chenbo wrote:
> >  -Original Message-
> >  From: Ding, Xuan 
> >  Sent: Thursday, July 7, 2022 2:55 PM
> >  To: maxime.coque...@redhat.com; Xia, Chenbo
> >  
> >  Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> >  ; Yang, YvonneX
> ;
> >  Jiang,
> >  Cheng1 ; Ding, Xuan 
> >  Subject: [PATCH] vhost: fix unnecessary dirty page logging
> > 
> >  From: Xuan Ding 
> > 
> >  The dirty page logging is only required in vhost enqueue
> >  direction for live migration. This patch removes the unnecessary
> >  dirty page logging in vhost dequeue direction. Otherwise, it will
> >  result in a performance drop. Some if-else judgements are also
> >  optimized to improve
> > >> performance.
> > 
> >  Fixes: 6d823bb302c7 ("vhost: prepare sync for descriptor to mbuf
> >  refactoring")
> >  Fixes: b6eee3e83402 ("vhost: fix sync dequeue offload")
> > 
> >  Signed-off-by: Xuan Ding 
> >  ---
> > >>>
> > >>> Reviewed-by: Chenbo Xia 
> > >>>
> > >>> Although it's late in release, we can consider to merge this as I
> > >>> see it impacts the performance by 5%:
> > >>>
> > >>
> >
> http://inbox.dpdk.org/dev/BYAPR11MB2711F13CDA2B0A4535A6591EFE839
> @B
> > >> YAPR
> > >>> 11MB2711.namprd11.prod.outlook.com/T/#t
> > >>
> > >> Yes, I raised we need it in -rc4 at today's Release status meeting.
> > >> I'll review it today.
> > >>
> > >>> But also, it will be good to know the performance issue is solved
> > >>> by sharing the test results.
> > >>
> > >> Yes, Intel performance results would be appreciated.
> > >
> > > This fix patch is for the issue reported at 22.07-rc3. The
> > > refactoring patch
> > brings a 3%~5% perf drop in vhost sync path.
> > > With fix patch, the perf drop introduced by refactoring is solved.
> >
> > Good.
> >
> > > However, the testing result shows there still exists ~5% packed ring
> > > perf drop
> > compared with 22.03.
> > > We find the improving checksum offload patch series in 22.07 may
> > > contribute
> > to the packed ring perf drop.
> > > Because we always do checksum checks in PMD.
> > >
> > > Could you help to double check this patch series, is it as expected?
> > > Your assistance is really appreciated.
> >
> > I will look again, but if your analysis is right, there's not much we
> > can do. The series is filling gaps in the checksum offload support.

What is the result of your investigation? As dpdk-22.07 will be released soon, 
I'd like to know your advice about this issue. Thanks.

> 
> Yes, I think so.
> 
> >
> > I will not have time to work on it for v22.07, we're too late in the
> > cycle. What surprises me is that only packed ring is impacted.
> 
> From the testing results, the packed ring is more sensitive to the patch 
> series,
> while split ring shows a much lower drop in performance.
> 
> Thanks for your support!
> 

Regards,
Xingguang

> Regards,
> Xuan
> 
> >
> > Regards,
> > Maxime
> >
> > >
> > > Regards,
> > > Xuan
> > >
> > >>
> > >>> Thanks,
> > >>> Chenbo
> > >>>
> > >>
> > >> Thanks,
> > >> Maxime
> > >

Re: [PATCH] common/sfc_efx/base: remove VQ index check during VQ start

2022-07-11 Thread Andrew Rybchenko


Hi Vijay,

On 7/8/22 12:17, Srivastava, Vijay wrote:


Acked-by: Vijay Srivastava 


Please, send a patch to MAINTAINERS to update your E-mail
address from Xilinx to AMD domain.

Andrew.

Re: [PATCH] common/sfc_efx/base: remove VQ index check during VQ start

2022-07-11 Thread David Marchand

Hello,

On Mon, Jul 11, 2022 at 10:26 AM Andrew Rybchenko
 wrote:
>
> On 7/8/22 12:17, Srivastava, Vijay wrote:
>
> > Acked-by: Vijay Srivastava 
>
> Please, send a patch to MAINTAINERS to update your E-mail
> address from Xilinx to AMD domain.

Please register this mail address to the mailing list, too.
Your recent messages from @amd.com were waiting in the moderation queue.

-- 
David Marchand

Re: [PATCH] common/sfc_efx/base: remove VQ index check during VQ start

2022-07-11 Thread Andrew Rybchenko


On 7/8/22 10:37, abhimanyu.sa...@xilinx.com wrote:

From: Abhimanyu Saini 

The used/avail queue indexes are not bound by queue
size, because the descriptor entry index is calculated
by a simple modulo between queue index and queue_size


"is calculated" is a bit vague since looking at the code
I've failed to find the place where modulo operation is
done. Don't we need to apply it these values are put
into MCDI message?



So, do not check initial used and avail queue indexes
against queue size because it is possible for these
indexes to be greater than queue size in the
following cases:
1) The queue is created to be migrated into, or
2) The client issues a qstop/qstart after running datapath

Fixes: 4dda72dbdeab3 ("common/sfc_efx/base: add base virtio support for vDPA")
Cc: sta...@dpdk.org

Signed-off-by: Abhimanyu Saini 
---
  drivers/common/sfc_efx/base/rhead_virtio.c | 12 +---
  1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/common/sfc_efx/base/rhead_virtio.c 
b/drivers/common/sfc_efx/base/rhead_virtio.c
index 335cb74..7f08717 100644
--- a/drivers/common/sfc_efx/base/rhead_virtio.c
+++ b/drivers/common/sfc_efx/base/rhead_virtio.c
@@ -47,14 +47,6 @@
goto fail2;
}
  
-	if (evvdp != NULL) {

-   if ((evvdp->evvd_vq_cidx > evvcp->evvc_vq_size) ||
-   (evvdp->evvd_vq_pidx > evvcp->evvc_vq_size)) {
-   rc = EINVAL;
-   goto fail3;
-   }
-   }
-
req.emr_cmd = MC_CMD_VIRTIO_INIT_QUEUE;
req.emr_in_buf = payload;
req.emr_in_length = MC_CMD_VIRTIO_INIT_QUEUE_REQ_LEN;
@@ -116,15 +108,13 @@
  
  	if (req.emr_rc != 0) {

rc = req.emr_rc;
-   goto fail4;
+   goto fail3;
}
  
  	evvp->evv_vi_index = vi_index;
  
  	return (0);
  
-fail4:

-   EFSYS_PROBE(fail4);
  fail3:
EFSYS_PROBE(fail3);
  fail2:

Re: [RFC v2 v2 00/29] Bus and device cleanup for 22.11

2022-07-11 Thread Bruce Richardson

On Sat, Jul 09, 2022 at 10:26:15AM +0200, David Marchand wrote:
> This is a PoC for hiding the rte_bus, rte_driver and rte_device objects.
> And mark associated driver only API as internal.
> 
> A good amount of the patches are preparation work on rte_bus.h,
> rte_dev.h, rte_devargs.h and rte_eal.h headers, removing dependencies
> between them. This is something I had in store for some time, maybe I
> should have dropped it from the PoC, but I think those cleanups are
> worth it in any case.
> 
> Then PCI bus specific handling are moved from unit tests and examples,
> though there is still a special case left in testpmd that may require a
> new API, to be discussed.
> 
> After this series, driver-only API headers for registering to buses are
> not exported anymore, unless the enable_driver_sdk meson option is
> selected.
> 
> New accessors for rte_bus, rte_driver and rte_device have been added,
> marked with an experimental tag though we may declare them as stable
> right away so that users can switch to them directly. That's also
> something to agree on.
> 
Yes, I think we need to make them stable.

> I simplified my series and switched to only update "external" users,
> like app/ and examples/ files.
> We need some checkpatch new checks to be sure we won't get some
> driver-only headers included in these areas. That's something I'll work
> on in the non RFC series.
> 
> "Internal" users are simply using the internal headers. That helps
> greatly reducing the size of the changes.
> 
> Disclaimer: again, in this v2, this series is a bit rushed (I brute forced
> compilation tests in GHA so that it passes between patches, but there still
> may be something broken...).
> Not surprisingly, the ABI check in the CI is expected to fail.
> 
> 
> Comments welcome.
>
This is great cleanup. Thanks for all the work on it.

/Bruce

RE: [PATCH v3] vhost: fix deadlock when message handling failed

2022-07-11 Thread Pei, Andy

HI ALL,

I see that in function vhost_user_msg_handler.
We use "ret" to store both vhost msg return code like 
"RTE_VHOST_MSG_RESULT_XXX" and function return value.
I wonder if it is better to use two different variable to make it easy to read.

> -Original Message-
> From: Maxime Coquelin 
> Sent: Tuesday, May 17, 2022 9:24 PM
> To: Ma, WenwuX ; Xia, Chenbo
> ; dev@dpdk.org
> Cc: Hu, Jiayu ; Wang, Yinan ; He,
> Xingguang ; sta...@dpdk.org
> Subject: Re: [PATCH v3] vhost: fix deadlock when message handling failed
> 
> 
> 
> On 5/7/22 15:27, Wenwu Ma wrote:
> > In vhost_user_msg_handler(), if vhost message handling failed, we
> > should check whether the queue is locked and release the lock before
> > returning. Or, it will cause a deadlock later.
> >
> > Fixes: 7f31d4ea05ca ("vhost: fix lock on device readiness
> > notification")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Wenwu Ma 
> > ---
> >   lib/vhost/vhost_user.c | 10 ++
> >   1 file changed, 6 insertions(+), 4 deletions(-)
> >
> 
> Applied to dpdk-next-virtio/main.
> 
> Thanks,
> Maxime

RE: [PATCH v3] vhost: fix deadlock when message handling failed

2022-07-11 Thread Xia, Chenbo

> -Original Message-
> From: Pei, Andy 
> Sent: Monday, July 11, 2022 4:42 PM
> To: Maxime Coquelin ; Ma, WenwuX
> ; Xia, Chenbo ; dev@dpdk.org
> Cc: Hu, Jiayu ; Wang, Yinan ; He,
> Xingguang ; sta...@dpdk.org
> Subject: RE: [PATCH v3] vhost: fix deadlock when message handling failed
> 
> HI ALL,
> 
> I see that in function vhost_user_msg_handler.
> We use "ret" to store both vhost msg return code like
> "RTE_VHOST_MSG_RESULT_XXX" and function return value.
> I wonder if it is better to use two different variable to make it easy to
> read.

By saying 'function return value', you mean which function? Can you
elaborate more?

Thanks,
Chenbo

> 
> > -Original Message-
> > From: Maxime Coquelin 
> > Sent: Tuesday, May 17, 2022 9:24 PM
> > To: Ma, WenwuX ; Xia, Chenbo
> > ; dev@dpdk.org
> > Cc: Hu, Jiayu ; Wang, Yinan ;
> He,
> > Xingguang ; sta...@dpdk.org
> > Subject: Re: [PATCH v3] vhost: fix deadlock when message handling failed
> >
> >
> >
> > On 5/7/22 15:27, Wenwu Ma wrote:
> > > In vhost_user_msg_handler(), if vhost message handling failed, we
> > > should check whether the queue is locked and release the lock before
> > > returning. Or, it will cause a deadlock later.
> > >
> > > Fixes: 7f31d4ea05ca ("vhost: fix lock on device readiness
> > > notification")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Wenwu Ma 
> > > ---
> > >   lib/vhost/vhost_user.c | 10 ++
> > >   1 file changed, 6 insertions(+), 4 deletions(-)
> > >
> >
> > Applied to dpdk-next-virtio/main.
> >
> > Thanks,
> > Maxime

RE: [PATCH v3] vhost: fix deadlock when message handling failed

2022-07-11 Thread Pei, Andy

HI Chenbo,

See below example

1.
ret = read_vhost_message(dev, fd, &ctx);
if (ret <= 0) {
if (ret < 0)
VHOST_LOG_CONFIG(dev->ifname, ERR, "vhost read message 
failed\n");
else
VHOST_LOG_CONFIG(dev->ifname, INFO, "vhost peer 
closed\n");

return -1;
}

2.
ret = vhost_user_check_and_alloc_queue_pair(dev, &ctx);
if (ret < 0) {
VHOST_LOG_CONFIG(dev->ifname, ERR, "failed to alloc queue\n");
return -1;
}


3. ret = -1  
} else if (ret == RTE_VHOST_MSG_RESULT_ERR) {
VHOST_LOG_CONFIG(dev->ifname, ERR, "vhost message handling 
failed.\n");
ret = -1;
goto unlock;
}

I mean ret is used to check the function is returned successfully or not while 
it is also used to store vhost msg return code like "RTE_VHOST_MSG_RESULT_XXX".




> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, July 11, 2022 4:54 PM
> To: Pei, Andy ; Maxime Coquelin
> ; Ma, WenwuX ;
> dev@dpdk.org
> Cc: Hu, Jiayu ; Wang, Yinan ; He,
> Xingguang ; sta...@dpdk.org
> Subject: RE: [PATCH v3] vhost: fix deadlock when message handling failed
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Monday, July 11, 2022 4:42 PM
> > To: Maxime Coquelin ; Ma, WenwuX
> > ; Xia, Chenbo ;
> > dev@dpdk.org
> > Cc: Hu, Jiayu ; Wang, Yinan
> > ; He, Xingguang ;
> > sta...@dpdk.org
> > Subject: RE: [PATCH v3] vhost: fix deadlock when message handling
> > failed
> >
> > HI ALL,
> >
> > I see that in function vhost_user_msg_handler.
> > We use "ret" to store both vhost msg return code like
> > "RTE_VHOST_MSG_RESULT_XXX" and function return value.
> > I wonder if it is better to use two different variable to make it easy
> > to read.
> 
> By saying 'function return value', you mean which function? Can you elaborate
> more?
> 
> Thanks,
> Chenbo
> 
> >
> > > -Original Message-
> > > From: Maxime Coquelin 
> > > Sent: Tuesday, May 17, 2022 9:24 PM
> > > To: Ma, WenwuX ; Xia, Chenbo
> > > ; dev@dpdk.org
> > > Cc: Hu, Jiayu ; Wang, Yinan
> > > ;
> > He,
> > > Xingguang ; sta...@dpdk.org
> > > Subject: Re: [PATCH v3] vhost: fix deadlock when message handling
> > > failed
> > >
> > >
> > >
> > > On 5/7/22 15:27, Wenwu Ma wrote:
> > > > In vhost_user_msg_handler(), if vhost message handling failed, we
> > > > should check whether the queue is locked and release the lock
> > > > before returning. Or, it will cause a deadlock later.
> > > >
> > > > Fixes: 7f31d4ea05ca ("vhost: fix lock on device readiness
> > > > notification")
> > > > Cc: sta...@dpdk.org
> > > >
> > > > Signed-off-by: Wenwu Ma 
> > > > ---
> > > >   lib/vhost/vhost_user.c | 10 ++
> > > >   1 file changed, 6 insertions(+), 4 deletions(-)
> > > >
> > >
> > > Applied to dpdk-next-virtio/main.
> > >
> > > Thanks,
> > > Maxime

RE: [PATCH v3] vhost: fix deadlock when message handling failed

2022-07-11 Thread Xia, Chenbo

Hi Andy,

> -Original Message-
> From: Pei, Andy 
> Sent: Monday, July 11, 2022 5:03 PM
> To: Xia, Chenbo ; Maxime Coquelin
> ; Ma, WenwuX ;
> dev@dpdk.org
> Cc: Hu, Jiayu ; Wang, Yinan ; He,
> Xingguang ; sta...@dpdk.org
> Subject: RE: [PATCH v3] vhost: fix deadlock when message handling failed
> 
> HI Chenbo,
> 
> See below example
> 
> 1.
>   ret = read_vhost_message(dev, fd, &ctx);
>   if (ret <= 0) {
>   if (ret < 0)
>   VHOST_LOG_CONFIG(dev->ifname, ERR, "vhost read message
> failed\n");
>   else
>   VHOST_LOG_CONFIG(dev->ifname, INFO, "vhost peer
> closed\n");
> 
>   return -1;
>   }
> 
> 2.
>   ret = vhost_user_check_and_alloc_queue_pair(dev, &ctx);
>   if (ret < 0) {
>   VHOST_LOG_CONFIG(dev->ifname, ERR, "failed to alloc queue\n");
>   return -1;
>   }
> 
> 
> 3. ret = -1
>   } else if (ret == RTE_VHOST_MSG_RESULT_ERR) {
>   VHOST_LOG_CONFIG(dev->ifname, ERR, "vhost message handling
> failed.\n");
>   ret = -1;
>   goto unlock;
>   }
> 
> I mean ret is used to check the function is returned successfully or not
> while it is also used to store vhost msg return code like
> "RTE_VHOST_MSG_RESULT_XXX".

It makes sense to make it two, one int for return value, one enum
rte_vhost_msg_result only for message handling. IMHO, this could improve 
readability.

Thanks,
Chenbo

> 
> 
> 
> 
> > -Original Message-
> > From: Xia, Chenbo 
> > Sent: Monday, July 11, 2022 4:54 PM
> > To: Pei, Andy ; Maxime Coquelin
> > ; Ma, WenwuX ;
> > dev@dpdk.org
> > Cc: Hu, Jiayu ; Wang, Yinan ;
> He,
> > Xingguang ; sta...@dpdk.org
> > Subject: RE: [PATCH v3] vhost: fix deadlock when message handling failed
> >
> > > -Original Message-
> > > From: Pei, Andy 
> > > Sent: Monday, July 11, 2022 4:42 PM
> > > To: Maxime Coquelin ; Ma, WenwuX
> > > ; Xia, Chenbo ;
> > > dev@dpdk.org
> > > Cc: Hu, Jiayu ; Wang, Yinan
> > > ; He, Xingguang ;
> > > sta...@dpdk.org
> > > Subject: RE: [PATCH v3] vhost: fix deadlock when message handling
> > > failed
> > >
> > > HI ALL,
> > >
> > > I see that in function vhost_user_msg_handler.
> > > We use "ret" to store both vhost msg return code like
> > > "RTE_VHOST_MSG_RESULT_XXX" and function return value.
> > > I wonder if it is better to use two different variable to make it easy
> > > to read.
> >
> > By saying 'function return value', you mean which function? Can you
> elaborate
> > more?
> >
> > Thanks,
> > Chenbo
> >
> > >
> > > > -Original Message-
> > > > From: Maxime Coquelin 
> > > > Sent: Tuesday, May 17, 2022 9:24 PM
> > > > To: Ma, WenwuX ; Xia, Chenbo
> > > > ; dev@dpdk.org
> > > > Cc: Hu, Jiayu ; Wang, Yinan
> > > > ;
> > > He,
> > > > Xingguang ; sta...@dpdk.org
> > > > Subject: Re: [PATCH v3] vhost: fix deadlock when message handling
> > > > failed
> > > >
> > > >
> > > >
> > > > On 5/7/22 15:27, Wenwu Ma wrote:
> > > > > In vhost_user_msg_handler(), if vhost message handling failed, we
> > > > > should check whether the queue is locked and release the lock
> > > > > before returning. Or, it will cause a deadlock later.
> > > > >
> > > > > Fixes: 7f31d4ea05ca ("vhost: fix lock on device readiness
> > > > > notification")
> > > > > Cc: sta...@dpdk.org
> > > > >
> > > > > Signed-off-by: Wenwu Ma 
> > > > > ---
> > > > >   lib/vhost/vhost_user.c | 10 ++
> > > > >   1 file changed, 6 insertions(+), 4 deletions(-)
> > > > >
> > > >
> > > > Applied to dpdk-next-virtio/main.
> > > >
> > > > Thanks,
> > > > Maxime

Re: [PATCH v2] ip_frag: add IPv4 fragment copy packet API

2022-07-11 Thread Konstantin Ananyev


11/07/2022 00:35, Konstantin Ananyev пишет:

 > Some NIC drivers support MBUF_FAST_FREE(Device supports optimization
 > for fast release of mbufs. When set application must guarantee that
 > per-queue all mbufs comes from the same mempool and has refcnt = 1)
 > offload. In order to adapt to this offload function, add this API.
 > Add some test data for this API.
 >
 > Signed-off-by: Huichao Cai 
 > ---

...

 > --- a/lib/ip_frag/rte_ip_frag.h
 > +++ b/lib/ip_frag/rte_ip_frag.h
 > @@ -179,6 +179,32 @@ int32_t rte_ipv4_fragment_packet(struct rte_mbuf
 > *pkt_in,
 >   struct rte_mempool *pool_indirect);
 >
 >   /**
 > + * IPv4 fragmentation by copy.
 > + *
 > + * This function implements the fragmentation of IPv4 packets by copy.
 > + *
 > + * @param pkt_in
 > + *   The input packet.
 > + * @param pkts_out
 > + *   Array storing the output fragments.
 > + * @param nb_pkts_out
 > + *   The size of the array that stores the output fragments.
 > + * @param mtu_size
 > + *   Size in bytes of the Maximum Transfer Unit (MTU) for the outgoing
 > IPv4
 > + *   datagrams. This value includes the size of the IPv4 header.
 > + * @return
 > + *   Upon successful completion - number of output fragments placed
 > + *   in the pkts_out array.
 > + *   Otherwise - (-1) * errno.
 > + */
 > +__rte_experimental
 > +int32_t
 > +rte_ipv4_fragment_copy_packet(struct rte_mbuf *pkt_in,
 > +    struct rte_mbuf **pkts_out,
 > +    uint16_t nb_pkts_out,
 > +    uint16_t mtu_size);
 > +
 > +/**
 >    * This function implements reassembly of fragmented IPv4 packets.
 >    * Incoming mbufs should have its l2_len/l3_len fields setup 
correctly.

 >    *
 > diff --git a/lib/ip_frag/rte_ipv4_fragmentation.c
 > b/lib/ip_frag/rte_ipv4_fragmentation.c
 > index a562424..9e050cc 100644
 > --- a/lib/ip_frag/rte_ipv4_fragmentation.c
 > +++ b/lib/ip_frag/rte_ipv4_fragmentation.c
 > @@ -262,3 +262,168 @@ static inline uint16_t
 > __create_ipopt_frag_hdr(uint8_t *iph,
 >
 >   return out_pkt_pos;
 >   }
 > +
 > +/**
 > + * IPv4 fragmentation by copy.
 > + *
 > + * This function implements the fragmentation of IPv4 packets by copy.
 > + *
 > + * @param pkt_in
 > + *   The input packet.
 > + * @param pkts_out
 > + *   Array storing the output fragments.
 > + * @param nb_pkts_out
 > + *   The size of the array that stores the output fragments.
 > + * @param mtu_size
 > + *   Size in bytes of the Maximum Transfer Unit (MTU) for the outgoing
 > IPv4
 > + *   datagrams. This value includes the size of the IPv4 header.
 > + * @return
 > + *   Upon successful completion - number of output fragments placed
 > + *   in the pkts_out array.
 > + *   Otherwise - (-1) * errno.
 > + */
 > +int32_t
 > +rte_ipv4_fragment_copy_packet(struct rte_mbuf *pkt_in,
 > +    struct rte_mbuf **pkts_out,
 > +    uint16_t nb_pkts_out,
 > +    uint16_t mtu_size)



Forgot to mention, new API has to be experimental.


 > +{
 > +    struct rte_mbuf *in_seg = NULL;
 > +    struct rte_ipv4_hdr *in_hdr;
 > +    uint32_t out_pkt_pos, in_seg_data_pos;
 > +    uint32_t more_in_segs;
 > +    uint16_t fragment_offset, flag_offset, frag_size, header_len;
 > +    uint16_t frag_bytes_remaining;
 > +    uint8_t ipopt_frag_hdr[IPV4_HDR_MAX_LEN];
 > +    uint16_t ipopt_len;
 > +
 > +    /*
 > + * Formal parameter checking.
 > + */
 > +    if (unlikely(pkt_in == NULL) || unlikely(pkts_out == NULL) ||
 > +    unlikely(nb_pkts_out == 0) ||
 > +    unlikely(pkt_in->pool == NULL) ||
 > +    unlikely(mtu_size < RTE_ETHER_MIN_MTU))
 > +    return -EINVAL;
 > +
 > +    in_hdr = rte_pktmbuf_mtod(pkt_in, struct rte_ipv4_hdr *);
 > +    header_len = (in_hdr->version_ihl & RTE_IPV4_HDR_IHL_MASK) *
 > +    RTE_IPV4_IHL_MULTIPLIER;
 > +
 > +    /* Check IP header length */
 > +    if (unlikely(pkt_in->data_len < header_len) ||
 > +    unlikely(mtu_size < header_len))
 > +    return -EINVAL;
 > +
 > +    /*
 > + * Ensure the IP payload length of all fragments is aligned to a
 > + * multiple of 8 bytes as per RFC791 section 2.3.
 > + */
 > +    frag_size = RTE_ALIGN_FLOOR((mtu_size - header_len),
 > +    IPV4_HDR_FO_ALIGN);
 > +
 > +    flag_offset = rte_cpu_to_be_16(in_hdr->fragment_offset);
 > +
 > +    /* If Don't Fragment flag is set */
 > +    if (unlikely((flag_offset & IPV4_HDR_DF_MASK) != 0))
 > +    return -ENOTSUP;
 > +
 > +    /* Check that pkts_out is big enough to hold all fragments */
 > +    if (unlikely(frag_size * nb_pkts_out <
 > +    (uint16_t)(pkt_in->pkt_len - header_len)))
 > +    return -EINVAL;
 > +
 > +    in_seg = pkt_in;
 > +    in_seg_data_pos = header_len;
 > +    out_pkt_pos = 0;
 > +    fragment_offset = 0;
 > +
 > +    ipopt_len = header_len - sizeof(struct rte_ipv4_hdr);
 > +    if (unlikely(ipopt_len > RTE_IPV4_HDR_OPT_MAX_LEN))
 > +    return -EINVAL;
 > +
 > +    more_in_segs = 1;
 > +    while (likely(more_in_segs)) {
 > +    struct rte_mbuf *out_pkt = NULL;
 > +    uint32_t

Re: Minutes of Technical Board Meeting, 2022-06-29

2022-07-11 Thread Konstantin Ananyev




Ack


Members Attending: 9/10
- Aaron Conole
- Bruce Richardson
- Honnappa Nagarahalli
- Jerin Jacob
- Kevin Traynor
- Konstantin Ananyev
- Maxime Coquelin
- Stephen Hemminger
- Thomas Monjalon (Chair)

NOTE: The Technical Board meetings take place every second Wednesday
on https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.
Agenda and minutes can be found at http://core.dpdk.org/techboard/minutes

NOTE: Next meeting will be on Wednesday 2022-07-13 @3pm UTC,
and will be chaired by Aaron.


1/ Actions from last meeting:
- Thomas to send an update of the security process
- Bruce to send a deprecation notice for the KNI example
- Looking for volunteers to start a doc intro based on the white paper
  (note: we should include a reference to the white paper)

2/ Event in France:
- Discussed last details about the CFP, should be ready soon.
- There are attendee fees to avoid cancellations,
  and early bird price to avoid late registrations.
- Hackathon will happen on September 6 with several topics
  (static analysis, code coverage, coccinelle, vhost vulnerabilities)
- Open discussions with the Technical Board should happen
  at the end of September 6 or 7 (agenda to be finalized)

3/ Technical writer hiring process:
- Will ask what could be a doc audit to an agency
- Position is still open

4/ Servers backup:
- Asking Linux Foundation for hosting of backups.
- Access should be granted to a group (techboard members?).

RE: release candidate 22.07-rc3

2022-07-11 Thread Jiang, YuX

> -Original Message-
> From: Jiang, YuX
> Sent: Thursday, July 7, 2022 5:07 PM
> To: Thomas Monjalon ; dev (dev@dpdk.org)
> 
> Cc: Devlin, Michelle ; Mcnamara, John
> ; Ferruh Yigit 
> Subject: RE: release candidate 22.07-rc3
>
> > -Original Message-
> > From: Thomas Monjalon 
> > Sent: Wednesday, July 6, 2022 4:36 AM
> > To: annou...@dpdk.org
> > Subject: release candidate 22.07-rc3
> >
> > A new DPDK release candidate is ready for testing:
> > https://git.dpdk.org/dpdk/tag/?id=v22.07-rc3
> >
> > There are 82 new patches in this snapshot.
> >
> > Release notes:
> > https://doc.dpdk.org/guides/rel_notes/release_22_07.html
> >
> > Please test and report issues on bugs.dpdk.org.
> > You may share some release validation results by replying to this
> > message at dev@dpdk.org.
> >
> > DPDK 22.07-rc4 should be the last chance for bug fixes and doc updates.
> > Do not forget to review pending patches for examples and documentation.
> >
> > Please think about sharing your roadmap now for DPDK 22.11.
> >
> > Thank you everyone
> >
>
> Update the test status for Intel part. Till now dpdk22.07-rc3 test execution
> rate is 65%, find 2 new bugs, No.2 is critical bug.
> No.1, "Bug 1048 - [dpdk-22.07rc3] i40e_rss_input/flow_query: RSS types is
> not displayed tests failed". Only display failed, there is no fix yet.
> No.2, [dpdk-22.07rc3]The performance of X710 drop about 45% when testing
> nic_single_core: has fix and verify passed.
> http://patchwork.dpdk.org/project/dpdk/patch/20220707162602.2123584-1-
> kevinx@intel.com/  ---Need push to merge.
>
> Known critical bugs:
> 1, The performance of virtio PVP multi_paths has a drop(~about 5%)
> between DPDK22.07-rc2 and DPDK-22.03 split ring has fix,
> http://patchwork.dpdk.org/project/dpdk/patch/20220707065513.66458-1-
> xuan.d...@intel.com/  ---Need verify then need push to merge.
> packed ring has no fix, Intel Dev is investigating.
> 2, Bug 1044 - [dpdk-22.07] ice_iavf_fdir: the ipv4/6 gtpu_eh rule not take
> effect
> has fix,
> http://patches.dpdk.org/project/dpdk/patch/20220706025626.1070737-1-
> wenxuanx...@intel.com/   ---Need push to merge.
>
> # Basic Intel(R) NIC testing
> * Build or compile:
>  *Build: cover the build test combination with latest GCC/Clang version and
> the popular OS revision such as Ubuntu20.04, Ubuntu22.04, Fedora36,
> RHEL8.6 etc.
>   - All test passed.
>
> * PF/VF(i40e, ixgbe): test scenarios including PF/VF-
> RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
>   - Execution rate is 80%. Find 1 new bug: "Bug 1048 - [dpdk-22.07rc3]
> i40e_rss_input/flow_query: RSS types is not displayed tests failed". There is
> no fix yet.
>
> * PF/VF(ice): test scenarios including Switch features/Package
> Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible
> Descriptor, etc.
>   - Execution rate is 80%. No new bug is found.
>
> * Intel NIC single core/NIC performance: test scenarios including PF/VF single
> core performance test, RFC2544 Zero packet loss performance test, etc.
>   - Execution rate is 80%. Find 1 critical bug:[dpdk-22.07rc3]The
> performance of X710 drop about 45% when testing nic_single_core: has fix
> and verify passed.
>
> # Basic virtio testing
> * Virtio: both function and performance test are covered. Such as
> PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf
> testing/VMAWARE ESXI 7.0u3, etc.
>   - Execution rate is 80%. No new bug is found.
>
> Best regards,
> Yu Jiang

Update the test status for Intel part. Till now dpdk22.07-rc3 test is almost 
finished.
Till now, find 5 new bugs, Bug2 about perf drop is critical bug.
Bug1, "Bug 1048 - [dpdk-22.07rc3] i40e_rss_input/flow_query: RSS types is not 
displayed tests failed": has fix and verify passed.
https://patchwork.dpdk.org/project/dpdk/patch/20220708014159.13499-1-lihuis...@huawei.com/
 ---wait for merge to main branch
Bug2, [dpdk-22.07rc3]The performance of X710 drop about 45% when testing 
nic_single_core: has fix and verify passed.
http://patchwork.dpdk.org/project/dpdk/patch/20220707170434.2159759-1-kevinx@intel.com/
  ---wait for merge to main branch

Known critical bugs:
Bug3, The performance of virtio PVP multi_paths has a drop between 
DPDK22.07-rc2 and DPDK-22.03
split ring has fix, 
http://patchwork.dpdk.org/project/dpdk/patch/20220707065513.66458-1-xuan.d...@intel.com/
  --wait for merge to main branch
packed ring has no fix, Intel Dev is investigating.
Bug4, Bug 1044 - [dpdk-22.07] ice_iavf_fdir: the ipv4/6 gtpu_eh rule not take 
effect
has fix, 
http://patches.dpdk.org/project/dpdk/patch/20220706025626.1070737-1-wenxuanx...@intel.com/
   --wait for merge to main branch

# Basic Intel(R) NIC testing
* Build or compile:
 *Build: cover the build test combination with latest GCC/Clang version and the 
popular OS revision such as Ubuntu20.04, Ubuntu22.04, Fedora36, RHEL8.6 etc.
  - All test passed.

* PF/VF(i40e, ixgbe): test scenarios including 
PF/VF-RTE_FLOW/TSO/Jumboframe/checksum

Re: Minutes of Technical Board Meeting, 2022-06-29

2022-07-11 Thread Konstantin Ananyev






Ack


Ups, sorry, wrong thread :(
Lack of coffee :)




Members Attending: 9/10
- Aaron Conole
- Bruce Richardson
- Honnappa Nagarahalli
- Jerin Jacob
- Kevin Traynor
- Konstantin Ananyev
- Maxime Coquelin
- Stephen Hemminger
- Thomas Monjalon (Chair)

NOTE: The Technical Board meetings take place every second Wednesday
on https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.
Agenda and minutes can be found at http://core.dpdk.org/techboard/minutes

NOTE: Next meeting will be on Wednesday 2022-07-13 @3pm UTC,
and will be chaired by Aaron.


1/ Actions from last meeting:
- Thomas to send an update of the security process
- Bruce to send a deprecation notice for the KNI example
- Looking for volunteers to start a doc intro based on the white 
paper

  (note: we should include a reference to the white paper)

2/ Event in France:
- Discussed last details about the CFP, should be ready soon.
- There are attendee fees to avoid cancellations,
  and early bird price to avoid late registrations.
- Hackathon will happen on September 6 with several topics
  (static analysis, code coverage, coccinelle, vhost vulnerabilities)
- Open discussions with the Technical Board should happen
  at the end of September 6 or 7 (agenda to be finalized)

3/ Technical writer hiring process:
- Will ask what could be a doc audit to an agency
- Position is still open

4/ Servers backup:
- Asking Linux Foundation for hosting of backups.
- Access should be granted to a group (techboard members?).

Re: [PATCH v2 1/2] app/test: add cksum performance test

2022-07-11 Thread Olivier Matz

Hi Mattias,

Please see few comments below.

On Fri, Jul 08, 2022 at 02:56:07PM +0200, Mattias Rönnblom wrote:
> Add performance test for the rte_raw_cksum() function, which delegates
> the actual work to __rte_raw_cksum(), which in turn is used by other
> functions in need of Internet checksum calculation.
> 
> Signed-off-by: Mattias Rönnblom 
> 
> ---
> 
> v2:
>   * Added __rte_unused to unused volatile variable, to keep the Intel
> compiler happy.
> ---
>  MAINTAINERS|   1 +
>  app/test/meson.build   |   1 +
>  app/test/test_cksum_perf.c | 118 +
>  3 files changed, 120 insertions(+)
>  create mode 100644 app/test/test_cksum_perf.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c923712946..2a4c99e05a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1414,6 +1414,7 @@ Network headers
>  M: Olivier Matz 
>  F: lib/net/
>  F: app/test/test_cksum.c
> +F: app/test/test_cksum_perf.c
>  
>  Packet CRC
>  M: Jasvinder Singh 
> diff --git a/app/test/meson.build b/app/test/meson.build
> index 431c5bd318..191db03d1d 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -18,6 +18,7 @@ test_sources = files(
>  'test_bpf.c',
>  'test_byteorder.c',
>  'test_cksum.c',
> +'test_cksum_perf.c',
>  'test_cmdline.c',
>  'test_cmdline_cirbuf.c',
>  'test_cmdline_etheraddr.c',
> diff --git a/app/test/test_cksum_perf.c b/app/test/test_cksum_perf.c
> new file mode 100644
> index 00..bff73cb3bb
> --- /dev/null
> +++ b/app/test/test_cksum_perf.c
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Ericsson AB
> + */
> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "test.h"
> +
> +#define NUM_BLOCKS (10)
> +#define ITERATIONS (100)

Parenthesis can be safely removed

> +
> +static const size_t data_sizes[] = { 20, 21, 100, 101, 1500, 1501 };
> +
> +static __rte_noinline uint16_t
> +do_rte_raw_cksum(const void *buf, size_t len)
> +{
> + return rte_raw_cksum(buf, len);
> +}

I don't understand the need to have this wrapper, especially marked
__rte_noinline. What is the objective?

Note that when I remove the __rte_noinline, the performance is better
for size 20 and 21.

> +
> +static void
> +init_block(void *buf, size_t len)

Can buf be a (char *) instead?
It would avoid a cast below.

> +{
> + size_t i;
> +
> + for (i = 0; i < len; i++)
> + ((char *)buf)[i] = (uint8_t)rte_rand();
> +}
> +
> +static int
> +test_cksum_perf_size_alignment(size_t block_size, bool aligned)
> +{
> + char *data[NUM_BLOCKS];
> + char *blocks[NUM_BLOCKS];
> + unsigned int i;
> + uint64_t start;
> + uint64_t end;
> + /* Floating point to handle low (pseudo-)TSC frequencies */
> + double block_latency;
> + double byte_latency;
> + volatile __rte_unused uint64_t sum = 0;
> +
> + for (i = 0; i < NUM_BLOCKS; i++) {
> + data[i] = rte_malloc(NULL, block_size + 1, 0);
> +
> + if (data[i] == NULL) {
> + printf("Failed to allocate memory for block\n");
> + return TEST_FAILED;
> + }
> +
> + init_block(data[i], block_size + 1);
> +
> + blocks[i] = aligned ? data[i] : data[i] + 1;
> + }
> +
> + start = rte_rdtsc();
> +
> + for (i = 0; i < ITERATIONS; i++) {
> + unsigned int j;
> + for (j = 0; j < NUM_BLOCKS; j++)
> + sum += do_rte_raw_cksum(blocks[j], block_size);
> + }
> +
> + end = rte_rdtsc();
> +
> + block_latency = (end - start) / (double)(ITERATIONS * NUM_BLOCKS);
> + byte_latency = block_latency / block_size;
> +
> + printf("%-9s %10zd %19.1f %16.2f\n", aligned ? "Aligned" : "Unaligned",
> +block_size, block_latency, byte_latency);

When I run the test on my dev machine, I get the following results,
which are quite reproductible:

Aligned   20   10.4  0.52 (range is 0.48 - 0.52)
Unaligned 207.9  0.39 (range is 0.39 - 0.40)
...

If I increase the number of iterations, the first results
change significantly:

Aligned   208.2  0.42 (range is 0.41 - 0.42)
Unaligned 208.0  0.40 (always this value)

To have more precise tests with small size, would it make sense to
target a test time instead of an iteration count? Something like
this:

#define ITERATIONS 100
uint64_t iterations = 0;

...

do {
for (i = 0; i < ITERATIONS; i++) {
unsigned int j;
for (j = 0; j < NUM_BLOCKS; j++)
sum += do_rte_raw_cksum(blocks[j], block_size);
}
iterations += ITERATIONS;
end = rte_rdtsc();
} while ((end - start) < rte_get_tsc_hz());

Re: [PATCH v2 2/2] net: have checksum routines accept unaligned data

2022-07-11 Thread Olivier Matz

Hi,

On Fri, Jul 08, 2022 at 02:56:08PM +0200, Mattias Rönnblom wrote:
> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
> data through an uint16_t pointer, which allowed the compiler to assume
> the data was 16-bit aligned. This in turn would, with certain
> architectures and compiler flag combinations, result in code with SIMD
> load or store instructions with restrictions on data alignment.
> 
> This patch keeps the old algorithm, but data is read using memcpy()
> instead of direct pointer access, forcing the compiler to always
> generate code that handles unaligned input. The __may_alias__ GCC
> attribute is no longer needed.
> 
> The data on which the Internet checksum functions operates are almost
> always 16-bit aligned, but there are exceptions. In particular, the
> PDCP protocol header may (literally) have an odd size.
> 
> Performance impact seems to range from none to a very slight
> regression.
> 
> Bugzilla ID: 1035
> Cc: sta...@dpdk.org
> 
> ---

Using memcpy() looks to be a good solution fix the issue, while avoiding a
branch and the __may_alias__.

I just have one minor comment below.

> 
> v2:
>   * Simplified the odd-length conditional (Morten Brørup).
> 
> Reviewed-by: Morten Brørup 
> 
> Signed-off-by: Mattias Rönnblom 
> ---
>  lib/net/rte_ip.h | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index b502481670..a0334d931e 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -160,18 +160,21 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
>  static inline uint32_t
>  __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>  {
> - /* extend strict-aliasing rules */
> - typedef uint16_t __attribute__((__may_alias__)) u16_p;
> - const u16_p *u16_buf = (const u16_p *)buf;
> - const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> + const void *end;
>  
> - for (; u16_buf != end; ++u16_buf)
> - sum += *u16_buf;
> + for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) * sizeof(uint16_t));

What do you think about this form:

for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));

This also has the good property to solve the debate about the
spaces around the '/' :)


> +  buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
> + uint16_t v;
> +
> + memcpy(&v, buf, sizeof(uint16_t));
> + sum += v;
> + }
>  
>   /* if length is odd, keeping it byte order independent */
>   if (unlikely(len % 2)) {
>   uint16_t left = 0;
> - *(unsigned char *)&left = *(const unsigned char *)end;
> +
> + memcpy(&left, end, 1);
>   sum += left;
>   }
>  
> -- 
> 2.25.1
>

RE: [PATCH v9 2/4] ethdev: introduce protocol hdr based buffer split

2022-07-11 Thread Ding, Xuan

Hi Thomas,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Thursday, July 7, 2022 5:08 PM
> To: Wu, WenxuanX 
> Cc: andrew.rybche...@oktetlabs.ru; Li, Xiaoyun ;
> ferruh.yi...@xilinx.com; Singh, Aman Deep ;
> dev@dpdk.org; Zhang, Yuying ; Zhang, Qi Z
> ; jerinjac...@gmail.com;
> step...@networkplumber.org; Wu, WenxuanX ;
> Ding, Xuan ; Wang, YuanX ;
> Ray Kinsella 
> Subject: Re: [PATCH v9 2/4] ethdev: introduce protocol hdr based buffer split
> 
> 13/06/2022 12:25, wenxuanx...@intel.com:
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1176,6 +1176,9 @@ struct rte_eth_txmode {
> >   *   specified in the first array element, the second buffer, from the
> >   *   pool in the second element, and so on.
> >   *
> > + * - The proto_hdrs in the elements define the split position of
> > + *   received packets.
> > + *
> >   * - The offsets from the segment description elements specify
> >   *   the data offset from the buffer beginning except the first mbuf.
> >   *   The first segment offset is added with RTE_PKTMBUF_HEADROOM.
> > @@ -1197,12 +1200,21 @@ struct rte_eth_txmode {
> >   * - pool from the last valid element
> >   * - the buffer size from this pool
> >   * - zero offset
> > + *
> > + * - Length based buffer split:
> > + * - mp, length, offset should be configured.
> > + * - The proto_hdr field should not be configured.
> > + *
> > + * - Protocol header based buffer split:
> > + * - mp, offset, proto_hdr should be configured.
> > + * - The length field should not be configured.
> >   */
> >  struct rte_eth_rxseg_split {
> > struct rte_mempool *mp; /**< Memory pool to allocate segment
> from. */
> > uint16_t length; /**< Segment data length, configures split point. */
> > uint16_t offset; /**< Data offset from beginning of mbuf data buffer.
> */
> > -   uint32_t reserved; /**< Reserved field. */
> > +   /**< Supported ptypes mask of a specific pmd, configures split point.
> */
> 
> The doxygen syntax is wrong: remove the "<" which is for post-comment.

Thanks for your catch.

> 
> > +   uint32_t proto_hdr;
> >  };
> 
> How do we know it is a length or buffer split?
> Is it based on checking some 0 value?

Yes, as Andrew suggests, we introduced the API rte_eth_supported_hdrs_get() in 
v9.
It will report the driver supported protocol headers to be split.
If the API returns ENOTSUP, it means driver supports length based buffer split.

Of course, no matter what kind of buffer split it is, we need to check
RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT first.

Thanks,
Xuan

>

[PATCH] doc: add IP ECN header rewrite to mlx5 notes

2022-07-11 Thread Jiawei Wang

This patch updates the mlx5 notes for IPv4/IPv6 ECN field
rewrite offload.

Fixes: c4e442fa4c45 ("ethdev: add IPv4/IPv6 ECN header rewrite action")
Fixes: 097d84a42a87 ("common/mlx5: check ECN modification capability")
Fixes: 76d575612277 ("net/mlx5: support modifying ECN field")

Signed-off-by: Jiawei Wang 
Acked-by: Asaf Penso 
---
 doc/guides/nics/mlx5.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index a7f9851d16..eb16669b3f 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1355,6 +1355,11 @@ Supported hardware offloads
| | | | rdma-core 24  | | rdma-core 24  |
| | | | ConnectX-5| | ConnectX-5|
+---+-+-+
+   | | Header rewrite  | | DPDK 22.07| | DPDK 22.07|
+   | | (ipv4_ecn)  | | OFED 5.6-2| | OFED 5.6-2|
+   | |  ipv6_ecn)  | | rdma-core 41  | | rdma-core 41  |
+   | | | | ConnectX-5| | ConnectX-5|
+   +---+-+-+
| Jump  | | DPDK 19.05| | DPDK 19.02|
|   | | OFED 4.7-1| | OFED 4.7-1|
|   | | rdma-core 24  | | N/A   |
-- 
2.18.1

RE: [PATCH v9 0/4] add an api to support proto based buffer split

2022-07-11 Thread Ding, Xuan

Hi,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Thursday, July 7, 2022 5:10 PM
> To: Wu, WenxuanX ; Ding, Xuan
> 
> Cc: andrew.rybche...@oktetlabs.ru; dev@dpdk.org; Li, Xiaoyun
> ; ferruh.yi...@xilinx.com; dev@dpdk.org; Zhang,
> Yuying ; Zhang, Qi Z ;
> jerinjac...@gmail.com; step...@networkplumber.org; Zhang, Qi Z
> ; Richardson, Bruce ;
> Mcnamara, John 
> Subject: Re: [PATCH v9 0/4] add an api to support proto based buffer split
> 
> 21/06/2022 10:56, Ding, Xuan:
> > This protocol based buffer split patch series have been updated to v9.
> > Sincerely thank you for the effort you put into this series.
> >
> > Hope to know your considerations about this series now.
> > Do you think is it possible to get in 22.07? Or there are still some 
> > critical
> gaps need to be solved?
> > Because we don't hope the same thing happens in 22.11.
> 
> My quick comment, I think you must better care about all details.
> Precise explanations are very important.
> It is more encouraging to review when we see the author tried hard to avoid
> any confusion or approximation.

Thanks a lot to all the reviewers for the effort on this series.
So far, although we tried to make the explanation or documentation detailed,
there must have some places that are not clearly explained or the doc is not 
good enough.

We will do more self-checks and try to refine where it might not be clear.
Your comments are welcome.

Regards,
Xuan

> 
>

Re: [PATCH v9 2/4] ethdev: introduce protocol hdr based buffer split

2022-07-11 Thread Thomas Monjalon

11/07/2022 11:54, Ding, Xuan:
> From: Thomas Monjalon 
> > 13/06/2022 12:25, wenxuanx...@intel.com:
> > > --- a/lib/ethdev/rte_ethdev.h
> > > +++ b/lib/ethdev/rte_ethdev.h
> > > @@ -1176,6 +1176,9 @@ struct rte_eth_txmode {
> > >   *   specified in the first array element, the second buffer, from the
> > >   *   pool in the second element, and so on.
> > >   *
> > > + * - The proto_hdrs in the elements define the split position of
> > > + *   received packets.
> > > + *
> > >   * - The offsets from the segment description elements specify
> > >   *   the data offset from the buffer beginning except the first mbuf.
> > >   *   The first segment offset is added with RTE_PKTMBUF_HEADROOM.
> > > @@ -1197,12 +1200,21 @@ struct rte_eth_txmode {
> > >   * - pool from the last valid element
> > >   * - the buffer size from this pool
> > >   * - zero offset
> > > + *
> > > + * - Length based buffer split:
> > > + * - mp, length, offset should be configured.
> > > + * - The proto_hdr field should not be configured.
> > > + *
> > > + * - Protocol header based buffer split:
> > > + * - mp, offset, proto_hdr should be configured.
> > > + * - The length field should not be configured.
> > >   */
> > >  struct rte_eth_rxseg_split {
> > >   struct rte_mempool *mp; /**< Memory pool to allocate segment
> > from. */
> > >   uint16_t length; /**< Segment data length, configures split point. */
> > >   uint16_t offset; /**< Data offset from beginning of mbuf data buffer.
> > */
> > > - uint32_t reserved; /**< Reserved field. */
> > > + /**< Supported ptypes mask of a specific pmd, configures split point.
> > */
> > 
> > The doxygen syntax is wrong: remove the "<" which is for post-comment.
> 
> Thanks for your catch.
> 
> > 
> > > + uint32_t proto_hdr;
> > >  };
> > 
> > How do we know it is a length or buffer split?
> > Is it based on checking some 0 value?
> 
> Yes, as Andrew suggests, we introduced the API rte_eth_supported_hdrs_get() 
> in v9.
> It will report the driver supported protocol headers to be split.
> If the API returns ENOTSUP, it means driver supports length based buffer 
> split.
> 
> Of course, no matter what kind of buffer split it is, we need to check
> RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT first.

So you need to talk about RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in the comment of 
this struct.

Re: [PATCH v2 1/2] app/test: add cksum performance test

2022-07-11 Thread Mattias Rönnblom

On 2022-07-11 11:47, Olivier Matz wrote:
> Hi Mattias,
> 
> Please see few comments below.
> 
> On Fri, Jul 08, 2022 at 02:56:07PM +0200, Mattias Rönnblom wrote:
>> Add performance test for the rte_raw_cksum() function, which delegates
>> the actual work to __rte_raw_cksum(), which in turn is used by other
>> functions in need of Internet checksum calculation.
>>
>> Signed-off-by: Mattias Rönnblom 
>>
>> ---
>>
>> v2:
>>* Added __rte_unused to unused volatile variable, to keep the Intel
>>  compiler happy.
>> ---
>>   MAINTAINERS|   1 +
>>   app/test/meson.build   |   1 +
>>   app/test/test_cksum_perf.c | 118 +
>>   3 files changed, 120 insertions(+)
>>   create mode 100644 app/test/test_cksum_perf.c
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index c923712946..2a4c99e05a 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -1414,6 +1414,7 @@ Network headers
>>   M: Olivier Matz 
>>   F: lib/net/
>>   F: app/test/test_cksum.c
>> +F: app/test/test_cksum_perf.c
>>   
>>   Packet CRC
>>   M: Jasvinder Singh 
>> diff --git a/app/test/meson.build b/app/test/meson.build
>> index 431c5bd318..191db03d1d 100644
>> --- a/app/test/meson.build
>> +++ b/app/test/meson.build
>> @@ -18,6 +18,7 @@ test_sources = files(
>>   'test_bpf.c',
>>   'test_byteorder.c',
>>   'test_cksum.c',
>> +'test_cksum_perf.c',
>>   'test_cmdline.c',
>>   'test_cmdline_cirbuf.c',
>>   'test_cmdline_etheraddr.c',
>> diff --git a/app/test/test_cksum_perf.c b/app/test/test_cksum_perf.c
>> new file mode 100644
>> index 00..bff73cb3bb
>> --- /dev/null
>> +++ b/app/test/test_cksum_perf.c
>> @@ -0,0 +1,118 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright(c) 2022 Ericsson AB
>> + */
>> +
>> +#include 
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "test.h"
>> +
>> +#define NUM_BLOCKS (10)
>> +#define ITERATIONS (100)
> 
> Parenthesis can be safely removed
> 
>> +
>> +static const size_t data_sizes[] = { 20, 21, 100, 101, 1500, 1501 };
>> +
>> +static __rte_noinline uint16_t
>> +do_rte_raw_cksum(const void *buf, size_t len)
>> +{
>> +return rte_raw_cksum(buf, len);
>> +}
> 
> I don't understand the need to have this wrapper, especially marked
> __rte_noinline. What is the objective?
> 

The intention is to disallow the compiler to perform unrolling and 
integrating/interleave one cksum operating with the next buffer's in a 
way that wouldn't be feasable in a real application.

It will result in an overestimation of the cost for small cksums, so 
it's still misleading, but in another direction. :)

> Note that when I remove the __rte_noinline, the performance is better
> for size 20 and 21.
> 
>> +
>> +static void
>> +init_block(void *buf, size_t len)
> 
> Can buf be a (char *) instead?
> It would avoid a cast below.
> 

Yes.

>> +{
>> +size_t i;
>> +
>> +for (i = 0; i < len; i++)
>> +((char *)buf)[i] = (uint8_t)rte_rand();
>> +}
>> +
>> +static int
>> +test_cksum_perf_size_alignment(size_t block_size, bool aligned)
>> +{
>> +char *data[NUM_BLOCKS];
>> +char *blocks[NUM_BLOCKS];
>> +unsigned int i;
>> +uint64_t start;
>> +uint64_t end;
>> +/* Floating point to handle low (pseudo-)TSC frequencies */
>> +double block_latency;
>> +double byte_latency;
>> +volatile __rte_unused uint64_t sum = 0;
>> +
>> +for (i = 0; i < NUM_BLOCKS; i++) {
>> +data[i] = rte_malloc(NULL, block_size + 1, 0);
>> +
>> +if (data[i] == NULL) {
>> +printf("Failed to allocate memory for block\n");
>> +return TEST_FAILED;
>> +}
>> +
>> +init_block(data[i], block_size + 1);
>> +
>> +blocks[i] = aligned ? data[i] : data[i] + 1;
>> +}
>> +
>> +start = rte_rdtsc();
>> +
>> +for (i = 0; i < ITERATIONS; i++) {
>> +unsigned int j;
>> +for (j = 0; j < NUM_BLOCKS; j++)
>> +sum += do_rte_raw_cksum(blocks[j], block_size);
>> +}
>> +
>> +end = rte_rdtsc();
>> +
>> +block_latency = (end - start) / (double)(ITERATIONS * NUM_BLOCKS);
>> +byte_latency = block_latency / block_size;
>> +
>> +printf("%-9s %10zd %19.1f %16.2f\n", aligned ? "Aligned" : "Unaligned",
>> +   block_size, block_latency, byte_latency);
> 
> When I run the test on my dev machine, I get the following results,
> which are quite reproductible:
> 
> Aligned   20   10.4  0.52 (range is 0.48 - 0.52)
> Unaligned 207.9  0.39 (range is 0.39 - 0.40)
> ...
> 
> If I increase the number of iterations, the first results
> change significantly:
> 
> Aligned   208.2  0.42 (range is 0.41 - 0.42)
> Unaligned 208.0  0.40 (always this value)


I suspect you have frequency scaling enabled on your system. This is 
gen

RE: [PATCH] doc: add deprecation for restrictions in telemetry naming

2022-07-11 Thread Power, Ciara




> -Original Message-
> From: Richardson, Bruce 
> Sent: Thursday 7 July 2022 14:40
> To: dev@dpdk.org
> Cc: Richardson, Bruce ;
> m...@smartsharesystems.com; step...@networkplumber.org; Power, Ciara
> 
> Subject: [PATCH] doc: add deprecation for restrictions in telemetry naming
> 
> Following discussion on-list [1], we will look to limited the allowed 
> characters in
> names for items in telemetry. This will simplify the escaping needed for json
> output, or any future output formats. The lists will initially be minimal, 
> since
> expansion to allow more characters can be done without affecting
> compatibility, while reducing the set cannot.
> 
> Cc: m...@smartsharesystems.com
> Cc: step...@networkplumber.org
> Cc: ciara.po...@intel.com
> 
> Signed-off-by: Bruce Richardson 
> 
> [1] http://inbox.dpdk.org/dev/20220623164245.561371-1-
> bruce.richard...@intel.com/#r
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 4e5b23c53d..9366690ec5 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -119,6 +119,12 @@ Deprecation Notices
>  * metrics: The function ``rte_metrics_init`` will have a non-void return
>in order to notify errors instead of calling ``rte_exit``.
> 
> +* telemetry: The allowed characters in names for dictionary values will
> +be limited to
> +  alphanumeric characters and a small subset of additional printable 
> characters.
> +  This will ensure that all dictionary parameter names can be output
> +without escaping
> +  in json - or in any future output format used. Names for the
> +telemetry commands will
> +  be similarly limited.
> +
>  * raw/ioat: The ``ioat`` rawdev driver has been deprecated, since it's
>functionality is provided through the new ``dmadev`` infrastructure.
>To continue to use hardware previously supported by the ``ioat`` rawdev
> driver,
> --
> 2.34.1

 Acked-by: Ciara Power

Re: [PATCH] doc: add deprecation for restrictions in telemetry naming

2022-07-11 Thread David Marchand

On Thu, Jul 7, 2022 at 3:39 PM Bruce Richardson
 wrote:
>
> Following discussion on-list [1], we will look to limited the allowed
> characters in names for items in telemetry. This will simplify the
> escaping needed for json output, or any future output formats. The lists
> will initially be minimal, since expansion to allow more characters can
> be done without affecting compatibility, while reducing the set cannot.
>
> Cc: m...@smartsharesystems.com
> Cc: step...@networkplumber.org
> Cc: ciara.po...@intel.com
>
> Signed-off-by: Bruce Richardson 
Acked-by: David Marchand 


-- 
David Marchand

Re: [PATCH] doc: add IP ECN header rewrite to mlx5 notes

2022-07-11 Thread David Marchand

On Mon, Jul 11, 2022 at 11:59 AM Jiawei Wang  wrote:
>
> This patch updates the mlx5 notes for IPv4/IPv6 ECN field
> rewrite offload.
>
> Fixes: c4e442fa4c45 ("ethdev: add IPv4/IPv6 ECN header rewrite action")
> Fixes: 097d84a42a87 ("common/mlx5: check ECN modification capability")
> Fixes: 76d575612277 ("net/mlx5: support modifying ECN field")
>
> Signed-off-by: Jiawei Wang 
> Acked-by: Asaf Penso 
> ---
>  doc/guides/nics/mlx5.rst | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index a7f9851d16..eb16669b3f 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -1355,6 +1355,11 @@ Supported hardware offloads
> | | | | rdma-core 24  | | rdma-core 24  |
> | | | | ConnectX-5| | ConnectX-5|
> +---+-+-+
> +   | | Header rewrite  | | DPDK 22.07| | DPDK 22.07|
> +   | | (ipv4_ecn)  | | OFED 5.6-2| | OFED 5.6-2|
> +   | |  ipv6_ecn)  | | rdma-core 41  | | rdma-core 41  |

I guess the first ) is a typo.
Did you mean "(ipv4_ecn / ipv6_ecn)" ?


> +   | | | | ConnectX-5| | ConnectX-5|
> +   +---+-+-+


-- 
David Marchand

Re: [PATCH v2 2/2] net: have checksum routines accept unaligned data

2022-07-11 Thread Mattias Rönnblom

On 2022-07-11 11:53, Olivier Matz wrote:
> Hi,
> 
> On Fri, Jul 08, 2022 at 02:56:08PM +0200, Mattias Rönnblom wrote:
>> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
>> data through an uint16_t pointer, which allowed the compiler to assume
>> the data was 16-bit aligned. This in turn would, with certain
>> architectures and compiler flag combinations, result in code with SIMD
>> load or store instructions with restrictions on data alignment.
>>
>> This patch keeps the old algorithm, but data is read using memcpy()
>> instead of direct pointer access, forcing the compiler to always
>> generate code that handles unaligned input. The __may_alias__ GCC
>> attribute is no longer needed.
>>
>> The data on which the Internet checksum functions operates are almost
>> always 16-bit aligned, but there are exceptions. In particular, the
>> PDCP protocol header may (literally) have an odd size.
>>
>> Performance impact seems to range from none to a very slight
>> regression.
>>
>> Bugzilla ID: 1035
>> Cc: sta...@dpdk.org
>>
>> ---
> 
> Using memcpy() looks to be a good solution fix the issue, while avoiding a
> branch and the __may_alias__.
> 
> I just have one minor comment below.
> 
>>
>> v2:
>>* Simplified the odd-length conditional (Morten Brørup).
>>
>> Reviewed-by: Morten Brørup 
>>
>> Signed-off-by: Mattias Rönnblom 
>> ---
>>   lib/net/rte_ip.h | 17 ++---
>>   1 file changed, 10 insertions(+), 7 deletions(-)
>>
>> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
>> index b502481670..a0334d931e 100644
>> --- a/lib/net/rte_ip.h
>> +++ b/lib/net/rte_ip.h
>> @@ -160,18 +160,21 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
>>   static inline uint32_t
>>   __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
>>   {
>> -/* extend strict-aliasing rules */
>> -typedef uint16_t __attribute__((__may_alias__)) u16_p;
>> -const u16_p *u16_buf = (const u16_p *)buf;
>> -const u16_p *end = u16_buf + len / sizeof(*u16_buf);
>> +const void *end;
>>   
>> -for (; u16_buf != end; ++u16_buf)
>> -sum += *u16_buf;
>> +for (end = RTE_PTR_ADD(buf, (len/sizeof(uint16_t)) * sizeof(uint16_t));
> 
> What do you think about this form:
> 
>   for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));
> 
> This also has the good property to solve the debate about the
> spaces around the '/' :)
> 

Shorter, more readable. Looks good to me.

Thanks.

> 
>> + buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
>> +uint16_t v;
>> +
>> +memcpy(&v, buf, sizeof(uint16_t));
>> +sum += v;
>> +}
>>   
>>  /* if length is odd, keeping it byte order independent */
>>  if (unlikely(len % 2)) {
>>  uint16_t left = 0;
>> -*(unsigned char *)&left = *(const unsigned char *)end;
>> +
>> +memcpy(&left, end, 1);
>>  sum += left;
>>  }
>>   
>> -- 
>> 2.25.1
>>

Re: [PATCH] doc: add deprecation for restrictions in telemetry naming

2022-07-11 Thread Bruce Richardson

On Fri, Jul 08, 2022 at 12:06:31AM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > Sent: Thursday, 7 July 2022 15.40
> > 
> > Following discussion on-list [1], we will look to limited the allowed
> > characters in names for items in telemetry. This will simplify the
> > escaping needed for json output, or any future output formats. The
> > lists
> > will initially be minimal, since expansion to allow more characters can
> > be done without affecting compatibility, while reducing the set cannot.
> > 
> > Cc: m...@smartsharesystems.com
> > Cc: step...@networkplumber.org
> > Cc: ciara.po...@intel.com
> > 
> > Signed-off-by: Bruce Richardson 
> > 
> > [1] http://inbox.dpdk.org/dev/20220623164245.561371-1-
> > bruce.richard...@intel.com/#r
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index 4e5b23c53d..9366690ec5 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -119,6 +119,12 @@ Deprecation Notices
> >  * metrics: The function ``rte_metrics_init`` will have a non-void
> > return
> >in order to notify errors instead of calling ``rte_exit``.
> > 
> > +* telemetry: The allowed characters in names for dictionary values
> > will be limited to
> > +  alphanumeric characters and a small subset of additional printable
> > characters.
> > +  This will ensure that all dictionary parameter names can be output
> > without escaping
> > +  in json - or in any future output format used. Names for the
> 
> json -> JSON
> 
Capital idea! (pun very much intended :-) )

> > telemetry commands will
> > +  be similarly limited.
> 
> Perhaps also add a comment about parameters to telemetry commands, for 
> completeness.
> 

I was not intending to impose restrictions on the parameters themselves
since currently they are not output as part of any json. However, now you
have got me thinking that perhaps we should look to scan parameters for
invalid characters before we hand them over to the individual functions.
That would allow the possibility of including parameters in any replies in
a future format.

Was this what you had in mind, or any other thoughts?

/Bruce

[PATCH v2 1/2] test/service: add perf measurements for with stats mode

2022-07-11 Thread Harry van Haaren

This commit improves the performance reporting of the service
cores polling loop to show both with and without statistics
collection modes. Collecting cycle statistics is costly, due
to calls to rte_rdtsc() per service iteration.

Reported-by: Mattias Rönnblom 
Suggested-by: Honnappa Nagarahalli 
Suggested-by: Morten Brørup 
Signed-off-by: Harry van Haaren 

---

This is split out as a seperate patch from the fix to allow
measuring the before/after of the service stats atomic fixup.
---
 app/test/test_service_cores.c | 36 ---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index ced6ed0081..7415b6b686 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -777,6 +777,22 @@ service_run_on_app_core_func(void *arg)
return rte_service_run_iter_on_app_lcore(*delay_service_id, 1);
 }
 
+static float
+service_app_lcore_perf_measure(uint32_t id)
+{
+   /* Performance test: call in a loop, and measure tsc() */
+   const uint32_t perf_iters = (1 << 12);
+   uint64_t start = rte_rdtsc();
+   uint32_t i;
+   for (i = 0; i < perf_iters; i++) {
+   int err = service_run_on_app_core_func(&id);
+   TEST_ASSERT_EQUAL(0, err, "perf test: returned run failure");
+   }
+   uint64_t end = rte_rdtsc();
+
+   return (end - start)/(float)perf_iters;
+}
+
 static int
 service_app_lcore_poll_impl(const int mt_safe)
 {
@@ -828,17 +844,15 @@ service_app_lcore_poll_impl(const int mt_safe)
"MT Unsafe: App core1 didn't return -EBUSY");
}
 
-   /* Performance test: call in a loop, and measure tsc() */
-   const uint32_t perf_iters = (1 << 12);
-   uint64_t start = rte_rdtsc();
-   uint32_t i;
-   for (i = 0; i < perf_iters; i++) {
-   int err = service_run_on_app_core_func(&id);
-   TEST_ASSERT_EQUAL(0, err, "perf test: returned run failure");
-   }
-   uint64_t end = rte_rdtsc();
-   printf("perf test for %s: %0.1f cycles per call\n", mt_safe ?
-   "MT Safe" : "MT Unsafe", (end - start)/(float)perf_iters);
+   /* Measure performance of no-stats and with-stats. */
+   float cyc_no_stats = service_app_lcore_perf_measure(id);
+
+   TEST_ASSERT_EQUAL(0, rte_service_set_stats_enable(id, 1),
+   "failed to enable stats for service.");
+   float cyc_with_stats = service_app_lcore_perf_measure(id);
+
+   printf("perf test for %s, no stats: %0.1f, with stats %0.1f 
cycles/call\n",
+   mt_safe ? "MT Safe" : "MT Unsafe", cyc_no_stats, 
cyc_with_stats);
 
unregister_all();
return TEST_SUCCESS;
-- 
2.32.0

[PATCH v2 2/2] service: fix potential stats race-condition on MT services

2022-07-11 Thread Harry van Haaren

This commit fixes a potential racey-add that could occur if
multiple service-lcores were executing the same MT-safe service
at the same time, with service statistics collection enabled.

Because multiple threads can run and execute the service, the
stats values can have multiple writer threads, resulting in the
requirement of using atomic addition for correctness.

Note that when a MT unsafe service is executed, a spinlock is
held, so the stats increments are protected. This fact is used
to avoid executing atomic add instructions when not required.
Regular reads and increments are used, and only the store is
specified as atomic, reducing perf impact on e.g. x86 arch.

This patch causes a 1.25x increase in cycle-cost for polling a
MT safe service when statistics are enabled. No change was seen
for MT unsafe services, or when statistics are disabled.

Reported-by: Mattias Rönnblom 
Suggested-by: Honnappa Nagarahalli 
Suggested-by: Morten Brørup 
Suggested-by: Bruce Richardson 
Signed-off-by: Harry van Haaren 

---

v2 (Thanks Honnappa, Morten, Bruce & Mattias for discussion):
- Improved handling of stat stores to ensure they're atomic by
  using __atomic_store_n() with regular loads/increments.
- Added BUILD_BUG_ON alignment checks for the uint64_t stats
  variables, tested with __rte_packed to ensure build breaks
  if not aligned naturally.

---
 lib/eal/common/rte_service.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index d2b7275ac0..90d12032f0 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -54,6 +54,9 @@ struct rte_service_spec_impl {
uint64_t cycles_spent;
 } __rte_cache_aligned;
 
+/* Mask used to ensure uint64_t 8 byte vars are naturally aligned. */
+#define RTE_SERVICE_STAT_ALIGN_MASK (8 - 1)
+
 /* the internal values of a service core */
 struct core_state {
/* map of services IDs are run on this core */
@@ -359,13 +362,29 @@ service_runner_do_callback(struct rte_service_spec_impl 
*s,
 {
void *userdata = s->spec.callback_userdata;
 
+   /* Ensure the atomically stored variables are naturally aligned,
+* as required for regular loads to be atomic.
+*/
+   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, calls)
+   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
+   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, cycles_spent)
+   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
+
if (service_stats_enabled(s)) {
uint64_t start = rte_rdtsc();
s->spec.callback(userdata);
uint64_t end = rte_rdtsc();
-   s->cycles_spent += end - start;
+   uint64_t cycles = end - start;
cs->calls_per_service[service_idx]++;
-   s->calls++;
+   if (service_mt_safe(s)) {
+   __atomic_fetch_add(&s->cycles_spent, cycles, 
__ATOMIC_RELAXED);
+   __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
+   } else {
+   uint64_t cycles_new = s->cycles_spent + cycles;
+   uint64_t calls_new = s->calls++;
+   __atomic_store_n(&s->cycles_spent, cycles_new, 
__ATOMIC_RELAXED);
+   __atomic_store_n(&s->calls, calls_new, 
__ATOMIC_RELAXED);
+   }
} else
s->spec.callback(userdata);
 }
-- 
2.32.0

Re: [PATCH v2 1/2] app/test: add cksum performance test

2022-07-11 Thread Olivier Matz

On Mon, Jul 11, 2022 at 10:42:37AM +, Mattias Rönnblom wrote:
> On 2022-07-11 11:47, Olivier Matz wrote:
> > Hi Mattias,
> > 
> > Please see few comments below.
> > 
> > On Fri, Jul 08, 2022 at 02:56:07PM +0200, Mattias Rönnblom wrote:
> >> Add performance test for the rte_raw_cksum() function, which delegates
> >> the actual work to __rte_raw_cksum(), which in turn is used by other
> >> functions in need of Internet checksum calculation.
> >>
> >> Signed-off-by: Mattias Rönnblom 
> >>
> >> ---
> >>
> >> v2:
> >>* Added __rte_unused to unused volatile variable, to keep the Intel
> >>  compiler happy.
> >> ---
> >>   MAINTAINERS|   1 +
> >>   app/test/meson.build   |   1 +
> >>   app/test/test_cksum_perf.c | 118 +
> >>   3 files changed, 120 insertions(+)
> >>   create mode 100644 app/test/test_cksum_perf.c
> >>
> >> diff --git a/MAINTAINERS b/MAINTAINERS
> >> index c923712946..2a4c99e05a 100644
> >> --- a/MAINTAINERS
> >> +++ b/MAINTAINERS
> >> @@ -1414,6 +1414,7 @@ Network headers
> >>   M: Olivier Matz 
> >>   F: lib/net/
> >>   F: app/test/test_cksum.c
> >> +F: app/test/test_cksum_perf.c
> >>   
> >>   Packet CRC
> >>   M: Jasvinder Singh 
> >> diff --git a/app/test/meson.build b/app/test/meson.build
> >> index 431c5bd318..191db03d1d 100644
> >> --- a/app/test/meson.build
> >> +++ b/app/test/meson.build
> >> @@ -18,6 +18,7 @@ test_sources = files(
> >>   'test_bpf.c',
> >>   'test_byteorder.c',
> >>   'test_cksum.c',
> >> +'test_cksum_perf.c',
> >>   'test_cmdline.c',
> >>   'test_cmdline_cirbuf.c',
> >>   'test_cmdline_etheraddr.c',
> >> diff --git a/app/test/test_cksum_perf.c b/app/test/test_cksum_perf.c
> >> new file mode 100644
> >> index 00..bff73cb3bb
> >> --- /dev/null
> >> +++ b/app/test/test_cksum_perf.c
> >> @@ -0,0 +1,118 @@
> >> +/* SPDX-License-Identifier: BSD-3-Clause
> >> + * Copyright(c) 2022 Ericsson AB
> >> + */
> >> +
> >> +#include 
> >> +
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +
> >> +#include "test.h"
> >> +
> >> +#define NUM_BLOCKS (10)
> >> +#define ITERATIONS (100)
> > 
> > Parenthesis can be safely removed
> > 
> >> +
> >> +static const size_t data_sizes[] = { 20, 21, 100, 101, 1500, 1501 };
> >> +
> >> +static __rte_noinline uint16_t
> >> +do_rte_raw_cksum(const void *buf, size_t len)
> >> +{
> >> +  return rte_raw_cksum(buf, len);
> >> +}
> > 
> > I don't understand the need to have this wrapper, especially marked
> > __rte_noinline. What is the objective?
> > 
> 
> The intention is to disallow the compiler to perform unrolling and 
> integrating/interleave one cksum operating with the next buffer's in a 
> way that wouldn't be feasable in a real application.
> 
> It will result in an overestimation of the cost for small cksums, so 
> it's still misleading, but in another direction. :)

OK, got it. I think it's fine like you did then.

> 
> > Note that when I remove the __rte_noinline, the performance is better
> > for size 20 and 21.
> > 
> >> +
> >> +static void
> >> +init_block(void *buf, size_t len)
> > 
> > Can buf be a (char *) instead?
> > It would avoid a cast below.
> > 
> 
> Yes.
> 
> >> +{
> >> +  size_t i;
> >> +
> >> +  for (i = 0; i < len; i++)
> >> +  ((char *)buf)[i] = (uint8_t)rte_rand();
> >> +}
> >> +
> >> +static int
> >> +test_cksum_perf_size_alignment(size_t block_size, bool aligned)
> >> +{
> >> +  char *data[NUM_BLOCKS];
> >> +  char *blocks[NUM_BLOCKS];
> >> +  unsigned int i;
> >> +  uint64_t start;
> >> +  uint64_t end;
> >> +  /* Floating point to handle low (pseudo-)TSC frequencies */
> >> +  double block_latency;
> >> +  double byte_latency;
> >> +  volatile __rte_unused uint64_t sum = 0;
> >> +
> >> +  for (i = 0; i < NUM_BLOCKS; i++) {
> >> +  data[i] = rte_malloc(NULL, block_size + 1, 0);
> >> +
> >> +  if (data[i] == NULL) {
> >> +  printf("Failed to allocate memory for block\n");
> >> +  return TEST_FAILED;
> >> +  }
> >> +
> >> +  init_block(data[i], block_size + 1);
> >> +
> >> +  blocks[i] = aligned ? data[i] : data[i] + 1;
> >> +  }
> >> +
> >> +  start = rte_rdtsc();
> >> +
> >> +  for (i = 0; i < ITERATIONS; i++) {
> >> +  unsigned int j;
> >> +  for (j = 0; j < NUM_BLOCKS; j++)
> >> +  sum += do_rte_raw_cksum(blocks[j], block_size);
> >> +  }
> >> +
> >> +  end = rte_rdtsc();
> >> +
> >> +  block_latency = (end - start) / (double)(ITERATIONS * NUM_BLOCKS);
> >> +  byte_latency = block_latency / block_size;
> >> +
> >> +  printf("%-9s %10zd %19.1f %16.2f\n", aligned ? "Aligned" : "Unaligned",
> >> + block_size, block_latency, byte_latency);
> > 
> > When I run the test on my dev machine, I get the following results,
> > which are quite reproductible:
> > 
> > Aligned   20   10.4  0.52 (range is 0.48 - 0.52)
> > Unaligne

RE: [PATCH] doc: add deprecation for restrictions in telemetry naming

2022-07-11 Thread Morten Brørup

> From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> Sent: Monday, 11 July 2022 12.54
> 
> On Fri, Jul 08, 2022 at 12:06:31AM +0200, Morten Brørup wrote:
> > > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > > Sent: Thursday, 7 July 2022 15.40
> > >
> > > Following discussion on-list [1], we will look to limited the
> allowed
> > > characters in names for items in telemetry. This will simplify the
> > > escaping needed for json output, or any future output formats. The
> > > lists
> > > will initially be minimal, since expansion to allow more characters
> can
> > > be done without affecting compatibility, while reducing the set
> cannot.
> > >
> > > Cc: m...@smartsharesystems.com
> > > Cc: step...@networkplumber.org
> > > Cc: ciara.po...@intel.com
> > >
> > > Signed-off-by: Bruce Richardson 
> > >
> > > [1] http://inbox.dpdk.org/dev/20220623164245.561371-1-
> > > bruce.richard...@intel.com/#r
> > > ---
> > >  doc/guides/rel_notes/deprecation.rst | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/doc/guides/rel_notes/deprecation.rst
> > > b/doc/guides/rel_notes/deprecation.rst
> > > index 4e5b23c53d..9366690ec5 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -119,6 +119,12 @@ Deprecation Notices
> > >  * metrics: The function ``rte_metrics_init`` will have a non-void
> > > return
> > >in order to notify errors instead of calling ``rte_exit``.
> > >
> > > +* telemetry: The allowed characters in names for dictionary values
> > > will be limited to
> > > +  alphanumeric characters and a small subset of additional
> printable
> > > characters.
> > > +  This will ensure that all dictionary parameter names can be
> output
> > > without escaping
> > > +  in json - or in any future output format used. Names for the
> >
> > json -> JSON
> >
> Capital idea! (pun very much intended :-) )
> 
> > > telemetry commands will
> > > +  be similarly limited.
> >
> > Perhaps also add a comment about parameters to telemetry commands,
> for completeness.
> >

I intentionally phrased this comment vaguely, to see what you had been thinking 
about this. And it certainly had the desired effect. :-)

> 
> I was not intending to impose restrictions on the parameters themselves
> since currently they are not output as part of any json. However, now
> you
> have got me thinking that perhaps we should look to scan parameters for
> invalid characters before we hand them over to the individual
> functions.
> That would allow the possibility of including parameters in any replies
> in
> a future format.
> 
> Was this what you had in mind, or any other thoughts?

Not exactly what I had in mind...

Your patch adds that "Names for the telemetry commands will be similarly 
limited.". This is input, not output. So you need to describe what restrictions 
are imposed on input.

The input commands and format don't follow any structured standard; command 
names, hierarchy and parameter names are individually chosen by each developer, 
and parameters are just a bunch of param=value with no types or limits to the 
values.

Also, the input is not JSON formatted, but - without looking deeply into the 
telemetry library - I suppose it might be URL encoded, where e.g. space is 
encoded as "%20" and '&' is encoded as "%26".

I think we should just leave the input without restrictions. Changing it would 
require a major overhaul to provide any significant improvement, e.g. attaching 
types to the parameters, so their values are not just BLOBs.

I don't strongly oppose to limiting the input command names; but we shouldn't 
impose any limit on what follows the command. So I'm proposing to explicitly 
mention that we don't impose any input limits beyond the command names.

Or we could provide input restrictions and parsing/formatting in a separate 
patch set.

Re: [dpdk-dev] [PATCH v2 0/7] Removal of PCI bus ABIs

2022-07-11 Thread Thomas Monjalon

9 months have passed. Do you have any news from SPDK side?


14/10/2021 08:41, Thomas Monjalon:
> 14/10/2021 00:48, Walker, Benjamin:
> > > From: Thomas Monjalon 
> >  
> > > Yes I think we need to agree on functions to keep as-is for compatibility.
> > > Waiting for your input please.
> > 
> > We've added a task to our backlog to propose a stable ABI for out of tree 
> > drivers here. It's not as simple as just keeping a couple of the existing 
> > functions - we're currently manipulating structures directly. We'll need to 
> > work a bit to design the simplest possible set of functions that we need.
> 
> OK thanks

[PATCH v3 2/2] net: have checksum routines accept unaligned data

2022-07-11 Thread Mattias Rönnblom

__rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
data through an uint16_t pointer, which allowed the compiler to assume
the data was 16-bit aligned. This in turn would, with certain
architectures and compiler flag combinations, result in code with SIMD
load or store instructions with restrictions on data alignment.

This patch keeps the old algorithm, but data is read using memcpy()
instead of direct pointer access, forcing the compiler to always
generate code that handles unaligned input. The __may_alias__ GCC
attribute is no longer needed.

The data on which the Internet checksum functions operates are almost
always 16-bit aligned, but there are exceptions. In particular, the
PDCP protocol header may (literally) have an odd size.

Performance impact seems to range from none to a very slight
regression.

Bugzilla ID: 1035
Cc: sta...@dpdk.org

---

v3:
  * Use RTE_ALIGN_FLOOR() in the pointer arithmetic (Olivier Matz).
v2:
  * Simplified the odd-length conditional (Morten Brørup).

Reviewed-by: Morten Brørup 

Signed-off-by: Mattias Rönnblom 
---
 lib/net/rte_ip.h | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
index b502481670..ecd250e9be 100644
--- a/lib/net/rte_ip.h
+++ b/lib/net/rte_ip.h
@@ -160,18 +160,21 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
 static inline uint32_t
 __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
 {
-   /* extend strict-aliasing rules */
-   typedef uint16_t __attribute__((__may_alias__)) u16_p;
-   const u16_p *u16_buf = (const u16_p *)buf;
-   const u16_p *end = u16_buf + len / sizeof(*u16_buf);
+   const void *end;
 
-   for (; u16_buf != end; ++u16_buf)
-   sum += *u16_buf;
+   for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));
+buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
+   uint16_t v;
+
+   memcpy(&v, buf, sizeof(uint16_t));
+   sum += v;
+   }
 
/* if length is odd, keeping it byte order independent */
if (unlikely(len % 2)) {
uint16_t left = 0;
-   *(unsigned char *)&left = *(const unsigned char *)end;
+
+   memcpy(&left, end, 1);
sum += left;
}
 
-- 
2.25.1

[PATCH v3 1/2] app/test: add cksum performance test

2022-07-11 Thread Mattias Rönnblom

Add performance test for the rte_raw_cksum() function, which delegates
the actual work to __rte_raw_cksum(), which in turn is used by other
functions in need of Internet checksum calculation.

Signed-off-by: Mattias Rönnblom 

---

v3:
  * Changed init function buffer parameter type, to avoid cast.
  * Code formatting improved.
v2:
  * Added __rte_unused to unused volatile variable, to keep the Intel
compiler happy.
---
 MAINTAINERS|   1 +
 app/test/meson.build   |   1 +
 app/test/test_cksum_perf.c | 117 +
 3 files changed, 119 insertions(+)
 create mode 100644 app/test/test_cksum_perf.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c923712946..2a4c99e05a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1414,6 +1414,7 @@ Network headers
 M: Olivier Matz 
 F: lib/net/
 F: app/test/test_cksum.c
+F: app/test/test_cksum_perf.c
 
 Packet CRC
 M: Jasvinder Singh 
diff --git a/app/test/meson.build b/app/test/meson.build
index 431c5bd318..191db03d1d 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -18,6 +18,7 @@ test_sources = files(
 'test_bpf.c',
 'test_byteorder.c',
 'test_cksum.c',
+'test_cksum_perf.c',
 'test_cmdline.c',
 'test_cmdline_cirbuf.c',
 'test_cmdline_etheraddr.c',
diff --git a/app/test/test_cksum_perf.c b/app/test/test_cksum_perf.c
new file mode 100644
index 00..1f296cae34
--- /dev/null
+++ b/app/test/test_cksum_perf.c
@@ -0,0 +1,117 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Ericsson AB
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define NUM_BLOCKS 10
+#define ITERATIONS 100
+
+static const size_t data_sizes[] = { 20, 21, 100, 101, 1500, 1501 };
+
+static __rte_noinline uint16_t
+do_rte_raw_cksum(const void *buf, size_t len)
+{
+   return rte_raw_cksum(buf, len);
+}
+
+static void
+init_block(char *buf, size_t len)
+{
+   size_t i;
+
+   for (i = 0; i < len; i++)
+   buf[i] = (char)rte_rand();
+}
+
+static int
+test_cksum_perf_size_alignment(size_t block_size, bool aligned)
+{
+   char *data[NUM_BLOCKS];
+   char *blocks[NUM_BLOCKS];
+   unsigned int i;
+   uint64_t start;
+   uint64_t end;
+   /* Floating point to handle low (pseudo-)TSC frequencies */
+   double block_latency;
+   double byte_latency;
+   volatile __rte_unused uint64_t sum = 0;
+
+   for (i = 0; i < NUM_BLOCKS; i++) {
+   data[i] = rte_malloc(NULL, block_size + 1, 0);
+
+   if (data[i] == NULL) {
+   printf("Failed to allocate memory for block\n");
+   return TEST_FAILED;
+   }
+
+   init_block(data[i], block_size + 1);
+
+   blocks[i] = aligned ? data[i] : data[i] + 1;
+   }
+
+   start = rte_rdtsc();
+
+   for (i = 0; i < ITERATIONS; i++) {
+   unsigned int j;
+   for (j = 0; j < NUM_BLOCKS; j++)
+   sum += do_rte_raw_cksum(blocks[j], block_size);
+   }
+
+   end = rte_rdtsc();
+
+   block_latency = (end - start) / (double)(ITERATIONS * NUM_BLOCKS);
+   byte_latency = block_latency / block_size;
+
+   printf("%-9s %10zd %19.1f %16.2f\n", aligned ? "Aligned" : "Unaligned",
+  block_size, block_latency, byte_latency);
+
+   for (i = 0; i < NUM_BLOCKS; i++)
+   rte_free(data[i]);
+
+   return TEST_SUCCESS;
+}
+
+static int
+test_cksum_perf_size(size_t block_size)
+{
+   int rc;
+
+   rc = test_cksum_perf_size_alignment(block_size, true);
+   if (rc != TEST_SUCCESS)
+   return rc;
+
+   rc = test_cksum_perf_size_alignment(block_size, false);
+
+   return rc;
+}
+
+static int
+test_cksum_perf(void)
+{
+   uint16_t i;
+
+   printf("### rte_raw_cksum() performance ###\n");
+   printf("Alignment  Block sizeTSC cycles/block  TSC cycles/byte\n");
+
+   for (i = 0; i < RTE_DIM(data_sizes); i++) {
+   int rc;
+
+   rc = test_cksum_perf_size(data_sizes[i]);
+   if (rc != TEST_SUCCESS)
+   return rc;
+   }
+
+   return TEST_SUCCESS;
+}
+
+
+REGISTER_TEST_COMMAND(cksum_perf_autotest, test_cksum_perf);
-- 
2.25.1

Re: [PATCH] doc: announce marking device and driver objects as internal

2022-07-11 Thread Thomas Monjalon

11/07/2022 04:16, Xia, Chenbo:
> From: David Marchand 
> > rte_driver and rte_device are unnecessarily exposed in the public API/ABI.
> > Announce that they will be made opaque in the public API and mark
> > associated API as internal.
> > This impacts all bus, as their driver registration mechanism will be
> > made internal.
> > 
> > Note: the PCI bus had a similar deprecation notice that we can remove as
> > the new one is more generic.
[...]
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > +* drivers: As a followup on the work on the ``rte_bus`` object, the
> > +  ``rte_driver`` and ``rte_device`` objects (and as a domino effect,
> > their
> > +  bus-specific counterparts) will be made opaque in DPDK 22.11.
> > +  Registering a driver on a bus will be marked as an internal API:
> > +  external users may still register their drivers using the bus specific
> > +  driver header (see ``enable_driver_sdk`` meson option).
> > +
> 
> Cc SPDK folks
> 
> Thanks for your work! My only concern is using enable_driver_sdk may not be
> a good way for SPDK based on the discussion.
> http://patchwork.dpdk.org/project/dpdk/cover/20210918022443.12719-1-chenbo@intel.com/

Quick summary:
- Symbols exported by DPDK by default are supposed to be for applications.
- Option enable_driver_sdk allow installing headers to build drivers.
- Driver interface is not part of the stable ABI, i.e. no compat guarantee.
- SPDK is building drivers on top of DPDK bus drivers (PCI for now).
- SPDK was asked to check whether anything else is required.

There is no decision about ABI guarantee because we lack feedbacks.
After this cleanup, the option enable_driver_sdk would be required
when working with buses, like SPDK.
Is there a real need to build against distro-provided package?
Is there a way to provide bus drivers SDK without messing with app SDK?


> But overall this idea makes sense, so:
> 
> Acked-by: Chenbo Xia 

I like the cleanup as well. It will allow more enhancements in future.
I hope and believe we can accomodate SDK needs when it will be clear.

Acked-by: Thomas Monjalon

Re: [PATCH] event/dsw: fix migration bug

2022-07-11 Thread Jerin Jacob

On Thu, Jul 7, 2022 at 5:13 PM Mattias Rönnblom  wrote:
>
> From: Mattias Rönnblom 
>
> Fix bug in flow migration, which under certain conditions causes
> reordering and violation of atomicity guarantees.
>
> The issue occurs when the processing of a flow (on an atomic queue)
> has resulted in events enqueued to a flow currently being migrated,
> and the former (producer) flow is also selected for migration. The
> events are buffered ("paused") on the originating port, and released
> (forwarded) when the migration has completed. However, at the time of
> "unpausing" the latter (consumer) flow, processing of the producer
> flow on the port to which it was migrated may have already produced
> events, for the same paused flow. This constitutes a race condition,
> and depending on which port wins, reordering may have been introduced.
>
> This patch forbids migration when a port has paused events, since
> those events may have been the result of processing a to-be-migrated
> flow.
>
> This patch also disallows processing events pertaining to a flow under
> migration, for the same reason. A new buffer is introduced, which
> holds such not-yet-processed events dequeued from the port's input
> ring. Such events are forwarded to the target port as a part of the
> migration process.
>
> The 'forwarding' migration state is eliminated, and instead background
> processing is only performed if there are no unreleased events on the
> port.
>
> The bug is primarily triggered in situations where multiple flows are
> migrated as one transaction, but may occur even if only a single flow
> is migrated (e.g., with older DSW versions, which does not support
> multi-flow migration).
>
> Fixes: f6257b22e767 ("event/dsw: add load balancing")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Mattias Rönnblom 


+ @Thomas Monjalon  to merge to the main tree directly.

Re: [PATCH v3 1/4] doc/howto: rework section on virtio-user as exception path

2022-07-11 Thread Thomas Monjalon

10/06/2022 17:35, Bruce Richardson:
> This patch extensively reworks the howto guide on using virtio-user for
> exception packets. Changes include:
> 
> * rename "exceptional path" to "exception path"
> * remove references to uio and just reference vfio-pci
> * simplify testpmd command-lines, giving a basic usage example first
>   before adding on detail about checksum or TSO parameters
> * give a complete working example showing traffic flowing through the
>   whole system from a testpmd loopback using the created TAP netdev
> * replace use of "ifconfig" with Linux standard "ip" command
> * other general rewording.
> 
> CC: sta...@dpdk.org

I don't think we should keep this for backport.
This is an enhancement.
Anyway new developments are probably not looking in old branches.

Re: [PATCH v3 1/4] doc/howto: rework section on virtio-user as exception path

2022-07-11 Thread Bruce Richardson

On Mon, Jul 11, 2022 at 03:10:56PM +0200, Thomas Monjalon wrote:
> 10/06/2022 17:35, Bruce Richardson:
> > This patch extensively reworks the howto guide on using virtio-user for
> > exception packets. Changes include:
> > 
> > * rename "exceptional path" to "exception path"
> > * remove references to uio and just reference vfio-pci
> > * simplify testpmd command-lines, giving a basic usage example first
> >   before adding on detail about checksum or TSO parameters
> > * give a complete working example showing traffic flowing through the
> >   whole system from a testpmd loopback using the created TAP netdev
> > * replace use of "ifconfig" with Linux standard "ip" command
> > * other general rewording.
> > 
> > CC: sta...@dpdk.org
> 
> I don't think we should keep this for backport.
> This is an enhancement.
> Anyway new developments are probably not looking in old branches.
> 
Sure, that's fine with me.

[PATCH v3 1/2] test/service: add perf measurements for with stats mode

2022-07-11 Thread Harry van Haaren

This commit improves the performance reporting of the service
cores polling loop to show both with and without statistics
collection modes. Collecting cycle statistics is costly, due
to calls to rte_rdtsc() per service iteration.

Reported-by: Mattias Rönnblom 
Suggested-by: Honnappa Nagarahalli 
Suggested-by: Morten Brørup 
Signed-off-by: Harry van Haaren 

---

This is split out as a seperate patch from the fix to allow
measuring the before/after of the service stats atomic fixup.
---
 app/test/test_service_cores.c | 36 ---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index ced6ed0081..7415b6b686 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -777,6 +777,22 @@ service_run_on_app_core_func(void *arg)
return rte_service_run_iter_on_app_lcore(*delay_service_id, 1);
 }
 
+static float
+service_app_lcore_perf_measure(uint32_t id)
+{
+   /* Performance test: call in a loop, and measure tsc() */
+   const uint32_t perf_iters = (1 << 12);
+   uint64_t start = rte_rdtsc();
+   uint32_t i;
+   for (i = 0; i < perf_iters; i++) {
+   int err = service_run_on_app_core_func(&id);
+   TEST_ASSERT_EQUAL(0, err, "perf test: returned run failure");
+   }
+   uint64_t end = rte_rdtsc();
+
+   return (end - start)/(float)perf_iters;
+}
+
 static int
 service_app_lcore_poll_impl(const int mt_safe)
 {
@@ -828,17 +844,15 @@ service_app_lcore_poll_impl(const int mt_safe)
"MT Unsafe: App core1 didn't return -EBUSY");
}
 
-   /* Performance test: call in a loop, and measure tsc() */
-   const uint32_t perf_iters = (1 << 12);
-   uint64_t start = rte_rdtsc();
-   uint32_t i;
-   for (i = 0; i < perf_iters; i++) {
-   int err = service_run_on_app_core_func(&id);
-   TEST_ASSERT_EQUAL(0, err, "perf test: returned run failure");
-   }
-   uint64_t end = rte_rdtsc();
-   printf("perf test for %s: %0.1f cycles per call\n", mt_safe ?
-   "MT Safe" : "MT Unsafe", (end - start)/(float)perf_iters);
+   /* Measure performance of no-stats and with-stats. */
+   float cyc_no_stats = service_app_lcore_perf_measure(id);
+
+   TEST_ASSERT_EQUAL(0, rte_service_set_stats_enable(id, 1),
+   "failed to enable stats for service.");
+   float cyc_with_stats = service_app_lcore_perf_measure(id);
+
+   printf("perf test for %s, no stats: %0.1f, with stats %0.1f 
cycles/call\n",
+   mt_safe ? "MT Safe" : "MT Unsafe", cyc_no_stats, 
cyc_with_stats);
 
unregister_all();
return TEST_SUCCESS;
-- 
2.32.0

[PATCH v3 2/2] service: fix potential stats race-condition on MT services

2022-07-11 Thread Harry van Haaren

This commit fixes a potential racey-add that could occur if
multiple service-lcores were executing the same MT-safe service
at the same time, with service statistics collection enabled.

Because multiple threads can run and execute the service, the
stats values can have multiple writer threads, resulting in the
requirement of using atomic addition for correctness.

Note that when a MT unsafe service is executed, a spinlock is
held, so the stats increments are protected. This fact is used
to avoid executing atomic add instructions when not required.
Regular reads and increments are used, and only the store is
specified as atomic, reducing perf impact on e.g. x86 arch.

This patch causes a 1.25x increase in cycle-cost for polling a
MT safe service when statistics are enabled. No change was seen
for MT unsafe services, or when statistics are disabled.

Reported-by: Mattias Rönnblom 
Suggested-by: Honnappa Nagarahalli 
Suggested-by: Morten Brørup 
Suggested-by: Bruce Richardson 
Signed-off-by: Harry van Haaren 

---

v3:
- Fix 32-bit build, by forcing natural alignment of uint64_t in
  the struct that contains it, using __rte_aligned(8) macro.
- Note: I'm seeing a checkpatch "avoid externs in .c files" warning,
  but it doesn't make sense to me, so perhaps its a false-positive..?

v2 (Thanks Honnappa, Morten, Bruce & Mattias for discussion):
- Improved handling of stat stores to ensure they're atomic by
  using __atomic_store_n() with regular loads/increments.
- Added BUILD_BUG_ON alignment checks for the uint64_t stats
  variables, tested with __rte_packed to ensure build breaks.
---
 lib/eal/common/rte_service.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index d2b7275ac0..94cb056196 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -50,10 +50,17 @@ struct rte_service_spec_impl {
 * on currently.
 */
uint32_t num_mapped_cores;
-   uint64_t calls;
-   uint64_t cycles_spent;
+
+   /* 32-bit builds won't naturally align a uint64_t, so force alignment,
+* allowing regular reads to be atomic.
+*/
+   uint64_t calls __rte_aligned(8);
+   uint64_t cycles_spent __rte_aligned(8);
 } __rte_cache_aligned;
 
+/* Mask used to ensure uint64_t 8 byte vars are naturally aligned. */
+#define RTE_SERVICE_STAT_ALIGN_MASK (8 - 1)
+
 /* the internal values of a service core */
 struct core_state {
/* map of services IDs are run on this core */
@@ -359,13 +366,29 @@ service_runner_do_callback(struct rte_service_spec_impl 
*s,
 {
void *userdata = s->spec.callback_userdata;
 
+   /* Ensure the atomically stored variables are naturally aligned,
+* as required for regular loads to be atomic.
+*/
+   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, calls)
+   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
+   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, cycles_spent)
+   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
+
if (service_stats_enabled(s)) {
uint64_t start = rte_rdtsc();
s->spec.callback(userdata);
uint64_t end = rte_rdtsc();
-   s->cycles_spent += end - start;
+   uint64_t cycles = end - start;
cs->calls_per_service[service_idx]++;
-   s->calls++;
+   if (service_mt_safe(s)) {
+   __atomic_fetch_add(&s->cycles_spent, cycles, 
__ATOMIC_RELAXED);
+   __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
+   } else {
+   uint64_t cycles_new = s->cycles_spent + cycles;
+   uint64_t calls_new = s->calls++;
+   __atomic_store_n(&s->cycles_spent, cycles_new, 
__ATOMIC_RELAXED);
+   __atomic_store_n(&s->calls, calls_new, 
__ATOMIC_RELAXED);
+   }
} else
s->spec.callback(userdata);
 }
-- 
2.32.0

Re: [PATCH v3 1/2] app/test: add cksum performance test

2022-07-11 Thread Olivier Matz

On Mon, Jul 11, 2022 at 02:11:31PM +0200, Mattias Rönnblom wrote:
> Add performance test for the rte_raw_cksum() function, which delegates
> the actual work to __rte_raw_cksum(), which in turn is used by other
> functions in need of Internet checksum calculation.
> 
> Signed-off-by: Mattias Rönnblom 

Acked-by: Olivier Matz 

Thank you!

Re: [PATCH v3 2/2] net: have checksum routines accept unaligned data

2022-07-11 Thread Olivier Matz

On Mon, Jul 11, 2022 at 02:11:32PM +0200, Mattias Rönnblom wrote:
> __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
> data through an uint16_t pointer, which allowed the compiler to assume
> the data was 16-bit aligned. This in turn would, with certain
> architectures and compiler flag combinations, result in code with SIMD
> load or store instructions with restrictions on data alignment.
> 
> This patch keeps the old algorithm, but data is read using memcpy()
> instead of direct pointer access, forcing the compiler to always
> generate code that handles unaligned input. The __may_alias__ GCC
> attribute is no longer needed.
> 
> The data on which the Internet checksum functions operates are almost
> always 16-bit aligned, but there are exceptions. In particular, the
> PDCP protocol header may (literally) have an odd size.
> 
> Performance impact seems to range from none to a very slight
> regression.
> 
> Bugzilla ID: 1035
> Cc: sta...@dpdk.org

Fixes: 6006818cfb26 ("net: new checksum functions")

> ---
> 
> v3:
>   * Use RTE_ALIGN_FLOOR() in the pointer arithmetic (Olivier Matz).
> v2:
>   * Simplified the odd-length conditional (Morten Brørup).
> 
> Reviewed-by: Morten Brørup 
> 
> Signed-off-by: Mattias Rönnblom 

Acked-by: Olivier Matz 

Thank you!

Re: [dpdk-dev] [PATCH v2] doc: propose correction rte_{bsf, fls} inline functions type use

2022-07-11 Thread Jerin Jacob

On Thu, Nov 11, 2021 at 6:11 PM Morten Brørup  
wrote:
>
> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > Sent: Thursday, 11 November 2021 12.55
> >
> > 11/11/2021 05:15, Tyler Retzlaff:
> > > On Tue, Oct 26, 2021 at 09:45:20AM +0200, Morten Brørup wrote:
> > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Thomas
> > Monjalon
> > > > > Sent: Monday, 25 October 2021 21.14
> > > > >
> > > > > 15/03/2021 20:34, Tyler Retzlaff:
> > > > > > The proposal has resulted from request to review [1] the
> > following
> > > > > > functions where there appeared to be inconsistency in return
> > type
> > > > > > or parameter type selections for the following inline
> > functions.
> > > > > >
> > > > > > rte_bsf32()
> > > > > > rte_bsf32_safe()
> > > > > > rte_bsf64()
> > > > > > rte_bsf64_safe()
> > > > > > rte_fls_u32()
> > > > > > rte_fls_u64()
> > > > > > rte_log2_u32()
> > > > > > rte_log2_u64()
> > > > > >
> > > > > > [1] http://mails.dpdk.org/archives/dev/2021-March/201590.html
> > > > > >
> > > > > > Signed-off-by: Tyler Retzlaff 


Acked-by: Jerin Jacob 



> > > > > > ---
> > > > > > --- a/doc/guides/rel_notes/deprecation.rst
> > > > > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > > > > +* eal: Fix inline function return and parameter types for
> > > > > rte_{bsf,fls}
> > > > > > +  inline functions to be consistent.
> > > > > > +  Change ``rte_bsf32_safe`` parameter ``v`` from ``uint64_t``
> > to
> > > > > ``uint32_t``.
> > > > > > +  Change ``rte_bsf64`` return type to  ``uint32_t`` instead of
> > > > > ``int``.
> > > > > > +  Change ``rte_fls_u32`` return type to ``uint32_t`` instead
> > of
> > > > > ``int``.
> > > > > > +  Change ``rte_fls_u64`` return type to ``uint32_t`` instead
> > of
> > > > > ``int``.
> > > > >
> > > > > It seems we completely forgot this.
> > > > > How critical is it?
> > > >
> > >
> > > our organization as a matter of internal security policy requires
> > these
> > > sorts of things to be fixed. while i didn't see any bugs in the dpdk
> > > code there is an opportunity for users of these functions to
> > > accidentally write code that is prone to integer and buffer overflow
> > > class bugs.
> > >
> > > there is no urgency, but why leave things sloppy? though i do wish
> > this
> > > had been responded to in a more timely manner 7 months for something
> > > that should have almost been rubber stamped.
> >
> > It's difficult to be on all topics.
> > The best way to avoid such miss is to ping when you see no progress.
> >
> > So what's next?
> > They are only inline functions, right? so no ABI breakage.
> > Is it going to require any change on application-side? I guess no.
> > Is it acceptable in 21.11-rc3? maybe too late?
> > Is it acceptable in 22.02?
>
> If Microsoft (represented by Tyler in this case) considers this a bug, I 
> would prefer getting it into 21.11 - especially because it is an LTS release.
>
> -Morten
>

Re: [PATCH v3 2/2] doc: announce KNI deprecation

2022-07-11 Thread Jerin Jacob

On Wed, Nov 24, 2021 at 10:48 PM Ferruh Yigit  wrote:
>
> Announce the KNI kernel module move to out of dpdk repo and announce
> long term plan to deprecate the KNI.
>
> Signed-off-by: Ferruh Yigit 

Acked-by: Jerin Jacob 


> ---
> Cc: Olivier Matz Olivier Matz 
> Cc: David Marchand David Marchand 
> Cc: Stephen Hemminger Stephen Hemminger 
> Cc: Elad Nachman 
> Cc: Igor Ryzhov 
> Cc: Dan Gora 
>
> Dates are not discussed before, the patch aims to trigger a discussion
> for the dates.
> ---
>  doc/guides/prog_guide/kernel_nic_interface.rst | 2 ++
>  doc/guides/rel_notes/deprecation.rst   | 6 ++
>  2 files changed, 8 insertions(+)
>
> diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst 
> b/doc/guides/prog_guide/kernel_nic_interface.rst
> index f5a8b7c0782c..d1c5ccd0851d 100644
> --- a/doc/guides/prog_guide/kernel_nic_interface.rst
> +++ b/doc/guides/prog_guide/kernel_nic_interface.rst
> @@ -7,6 +7,8 @@ Kernel NIC Interface
>  
>
>  .. Note::
> +   KNI kernel module will be moved from main git repository to `dpdk-kmods 
> `_ repository.
> +   There is a long term plan to deprecate the KNI. See 
> :doc:`../rel_notes/deprecation`
>
> :ref:`virtio_user_as_exceptional_path` alternative is the preferred way 
> for
> interfacing with the Linux network stack as it is an in-kernel solution 
> and
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 2262b8de6093..f20852504319 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -48,6 +48,12 @@ Deprecation Notices
>in the header will not be considered as ABI anymore. This change is 
> inspired
>by the RFC https://patchwork.dpdk.org/project/dpdk/list/?series=17176.
>
> +* kni: KNI kernel module will be moved to `dpdk-kmods 
> `_
> +  repository by the `DPDK technical board decision
> +  `_, on 
> v22.11.
> +* kni: will be depreciated, will remove all kni lib, kernel module and 
> example code
> +  on v23.11.
> +
>  * lib: will fix extending some enum/define breaking the ABI. There are 
> multiple
>samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which 
> is
>used by iterators, and arrays holding these values are sized with this
> --
> 2.31.1
>

Re: [PATCH] doc: announce name change of stop flush callback

2022-07-11 Thread Jerin Jacob

On Fri, Jun 24, 2022 at 2:40 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> Stop flush callback is missing `rte_` prefix and might conflict with
> application declarations.
>
> Signed-off-by: Pavan Nikhilesh 

Acked-by: Jerin Jacob 

+ @Gage Eads


> ---
>  doc/guides/rel_notes/deprecation.rst | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 4e5b23c53d..a3c4f4ac09 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -125,3 +125,6 @@ Deprecation Notices
>applications should be updated to use the ``dmadev`` library instead,
>with the underlying HW-functionality being provided by the ``ioat`` or
>``idxd`` dma drivers
> +
> +* eventdev: The function pointer declaration ``eventdev_stop_flush_t`` will 
> be
> +  renamed to ``rte_eventdev_stop_flush_t`` in DPDK 22.11.
> --
> 2.25.1
>

[PATCH v2 1/8] dts: add basic logging facility

2022-07-11 Thread Juraj Linkeš

The logging module provides loggers distinguished by two attributes,
a custom format and a verbosity switch. The loggers log to both console
and more verbosely to files.

Signed-off-by: Juraj Linkeš 
---
 dts/framework/logger.py | 118 
 1 file changed, 118 insertions(+)
 create mode 100644 dts/framework/logger.py

diff --git a/dts/framework/logger.py b/dts/framework/logger.py
new file mode 100644
index 00..2c026ee5eb
--- /dev/null
+++ b/dts/framework/logger.py
@@ -0,0 +1,118 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+
+import logging
+import os.path
+from typing import TypedDict
+
+"""
+DTS logger module with several log level. DTS framework and TestSuite log
+will saved into different log files.
+"""
+verbose = False
+date_fmt = "%d/%m/%Y %H:%M:%S"
+stream_fmt = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+
+
+class LoggerDictType(TypedDict):
+logger: "DTSLOG"
+name: str
+node: str
+
+
+# List for saving all using loggers
+global Loggers
+Loggers: list[LoggerDictType] = []
+
+
+def set_verbose() -> None:
+global verbose
+verbose = True
+
+
+class DTSLOG(logging.LoggerAdapter):
+"""
+DTS log class for framework and testsuite.
+"""
+node: str
+logger: logging.Logger
+sh: logging.StreamHandler
+fh: logging.FileHandler
+verbose_handler: logging.FileHandler
+
+def __init__(self, logger: logging.Logger, node: str = "suite"):
+global log_dir
+
+self.logger = logger
+self.logger.setLevel(1)  # 1 means log everything
+
+self.node = node
+
+# add handler to emit to stdout
+sh = logging.StreamHandler()
+sh.setFormatter(logging.Formatter())
+sh.setFormatter(logging.Formatter(stream_fmt, date_fmt))
+
+sh.setLevel(logging.DEBUG)  # file handler default level
+global verbose
+if verbose is True:
+sh.setLevel(logging.DEBUG)
+else:
+sh.setLevel(logging.INFO)  # console handler defaultlevel
+
+self.logger.addHandler(sh)
+self.sh = sh
+
+if not os.path.exists("output"):
+os.mkdir("output")
+
+fh = logging.FileHandler(f"output/{node}.log")
+fh.setFormatter(logging.Formatter(
+fmt='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
+datefmt=date_fmt))
+
+fh.setLevel(1)  # We want all the logs we can get in the file
+self.logger.addHandler(fh)
+self.fh = fh
+
+# This outputs EVERYTHING, intended for post-mortem debugging
+# Also optimized for processing via AWK (awk -F '|' ...)
+verbose_handler = logging.FileHandler(f"output/{node}.verbose.log")
+verbose_handler.setFormatter(logging.Formatter(
+
fmt='%(asctime)s|%(name)s|%(levelname)s|%(pathname)s|%(lineno)d|%(funcName)s|'
+'%(process)d|%(thread)d|%(threadName)s|%(message)s',
+datefmt=date_fmt))
+
+verbose_handler.setLevel(1)  # We want all the logs we can get in the 
file
+self.logger.addHandler(verbose_handler)
+self.verbose_handler = verbose_handler
+
+super(DTSLOG, self).__init__(self.logger, dict(node=self.node))
+
+def logger_exit(self) -> None:
+"""
+Remove stream handler and logfile handler.
+"""
+for handler in (self.sh, self.fh, self.verbose_handler):
+handler.flush()
+self.logger.removeHandler(handler)
+
+
+def getLogger(name: str, node: str = "suite") -> DTSLOG:
+"""
+Get logger handler and if there's no handler for specified Node will 
create one.
+"""
+global Loggers
+# return saved logger
+logger: LoggerDictType
+for logger in Loggers:
+if logger["name"] == name and logger["node"] == node:
+return logger["logger"]
+
+# return new logger
+dts_logger: DTSLOG = DTSLOG(logging.getLogger(name), node)
+Loggers.append({"logger": dts_logger, "name": name, "node": node})
+return dts_logger
-- 
2.30.2

[PATCH v2 0/8] ssh connection to a node

2022-07-11 Thread Juraj Linkeš

All the necessary code needed to connect to a node in a topology with
some extras, such as basic logging, per-node locks and some extra
useful methods.

To run the code, modify the config file, conf.yaml and execute ./main.py
from the root dts folder. Here's an example config:
executions:
  - system_under_test: "SUT 1"
nodes:
  - name: "SUT 1"
hostname: 127.0.0.1
user: root
password: mypw

The code only connects to a node. You'll see logs emitted to console
saying where DTS connected.

There's no documentation, as there's not much to document - the above is
basically all there is to it, as far as user docs go. We'll add some
real docs when there's enough functionality to document, when the
HelloWorld testcases is in (point 4 in our roadmap below). What will be
documented later is runtime dependencies and how to set up the DTS
control node environment.

This is our current roadmap:
1. Review this patchset and do the rest of the items in parallel, if
possible.
2. Rework the parallel lock mechanism using something simpler using
existing Python tools.
3. Add tools which aid with DTS control node environment setup and
document their usage.
4. We have extracted the code needed to run the most basic testcase,
HelloWorld, which runs the DPDK Hello World application. We'll split
this along logical/functional boundaries and send after 1 is done. In
fact, 1 is part of 4.
5. Once we have 4 applied, we'll planning on adding a basic functional
testcase - pf_smoke. This send a bit of traffic, so the big addition is
the software traffic generator, Scapy.
6. After 5, we'll add a basic performance testcase which doesn't use
Scapy, but Trex or Ixia instead.
7. This is far in the future, but at this point we should have all of
the core functionality in place. What then remains is adding the rest of
the testcases.

v2:
Reworked config parser.
Extended logging to log to files as well.
Added type annotations to most of the code.

Juraj Linkeš (7):
  dts: add basic logging facility
  dts: add ssh pexpect library
  dts: add locks for parallel node connections
  dts: add ssh connection extension
  dts: add Node base class
  dts: add dts workflow module
  dts: add dts executable script

Owen Hilyard (1):
  dts: add config parser module

 dts/conf.yaml  |   7 +
 dts/framework/config/__init__.py   | 116 +++
 dts/framework/config/conf_yaml_schema.json |  73 +++
 dts/framework/dts.py   |  76 
 dts/framework/exception.py |  75 +++
 dts/framework/logger.py| 118 +++
 dts/framework/node.py  | 108 +++
 dts/framework/settings.py  |  43 +
 dts/framework/ssh_connection.py|  70 +++
 dts/framework/ssh_pexpect.py   | 215 +
 dts/framework/utils.py | 130 +
 dts/main.py|  47 +
 12 files changed, 1078 insertions(+)
 create mode 100644 dts/conf.yaml
 create mode 100644 dts/framework/config/__init__.py
 create mode 100644 dts/framework/config/conf_yaml_schema.json
 create mode 100644 dts/framework/dts.py
 create mode 100644 dts/framework/exception.py
 create mode 100644 dts/framework/logger.py
 create mode 100644 dts/framework/node.py
 create mode 100644 dts/framework/settings.py
 create mode 100644 dts/framework/ssh_connection.py
 create mode 100644 dts/framework/ssh_pexpect.py
 create mode 100644 dts/framework/utils.py
 create mode 100755 dts/main.py

-- 
2.30.2

[PATCH v2 2/8] dts: add ssh pexpect library

2022-07-11 Thread Juraj Linkeš

The library uses the pexpect python library and implements connection to
a node and two ways to interact with the node:
1. Send a string with specified prompt which will be matched after
   the string has been sent to the node.
2. Send a command to be executed. No prompt is specified here.

Signed-off-by: Juraj Linkeš 
---
 dts/framework/exception.py   |  60 ++
 dts/framework/ssh_pexpect.py | 207 +++
 dts/framework/utils.py   |  11 ++
 3 files changed, 278 insertions(+)
 create mode 100644 dts/framework/exception.py
 create mode 100644 dts/framework/ssh_pexpect.py
 create mode 100644 dts/framework/utils.py

diff --git a/dts/framework/exception.py b/dts/framework/exception.py
new file mode 100644
index 00..54f9f189a4
--- /dev/null
+++ b/dts/framework/exception.py
@@ -0,0 +1,60 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+
+"""
+User-defined exceptions used across the framework.
+"""
+
+
+class TimeoutException(Exception):
+
+"""
+Command execution timeout.
+"""
+
+command: str
+output: str
+
+def __init__(self, command: str, output: str):
+self.command = command
+self.output = output
+
+def __str__(self) -> str:
+return f"TIMEOUT on {self.command}"
+
+def get_output(self) -> str:
+return self.output
+
+
+class SSHConnectionException(Exception):
+
+"""
+SSH connection error.
+"""
+
+host: str
+
+def __init__(self, host: str):
+self.host = host
+
+def __str__(self) -> str:
+return f"Error trying to connect with {self.host}"
+
+
+class SSHSessionDeadException(Exception):
+
+"""
+SSH session is not alive.
+It can no longer be used.
+"""
+
+host: str
+
+def __init__(self, host: str):
+self.host = host
+
+def __str__(self) -> str:
+return f"SSH session with {self.host} has died"
diff --git a/dts/framework/ssh_pexpect.py b/dts/framework/ssh_pexpect.py
new file mode 100644
index 00..c73c1048a4
--- /dev/null
+++ b/dts/framework/ssh_pexpect.py
@@ -0,0 +1,207 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+
+import time
+from typing import Optional
+
+from pexpect import pxssh
+
+from .exception import (SSHConnectionException, SSHSessionDeadException,
+TimeoutException)
+from .logger import DTSLOG
+from .utils import GREEN, RED
+
+"""
+Module handles ssh sessions to TG and SUT.
+Implements send_expect function to send commands and get output data.
+"""
+
+
+class SSHPexpect:
+username: str
+password: str
+node: str
+logger: DTSLOG
+magic_prompt: str
+
+def __init__(
+self,
+node: str,
+username: str,
+password: Optional[str],
+logger: DTSLOG,
+):
+self.magic_prompt = "MAGIC PROMPT"
+self.logger = logger
+
+self.node = node
+self.username = username
+self.password = password or ""
+self.logger.info(f"ssh {self.username}@{self.node}")
+
+self._connect_host()
+
+def _connect_host(self) -> None:
+"""
+Create connection to assigned node.
+"""
+retry_times = 10
+try:
+if ":" in self.node:
+while retry_times:
+self.ip = self.node.split(":")[0]
+self.port = int(self.node.split(":")[1])
+self.session = pxssh.pxssh(encoding="utf-8")
+try:
+self.session.login(
+self.ip,
+self.username,
+self.password,
+original_prompt="[$#>]",
+port=self.port,
+login_timeout=20,
+password_regex=r"(?i)(?:password:)|(?:passphrase 
for key)|(?i)(password for .+:)",
+)
+except Exception as e:
+print(e)
+time.sleep(2)
+retry_times -= 1
+print("retry %d times connecting..." % (10 - 
retry_times))
+else:
+break
+else:
+raise Exception("connect to %s:%s failed" % (self.ip, 
self.port))
+else:
+self.session = pxssh.pxssh(encoding="utf-8")
+self.session.login(
+self.node,
+self.username,
+self.password,
+original_prompt="[$#>]",
+password_regex=r"(?i)(?:password:)|(?:passphrase for 
key)|(?i)(password for .+:)",
+

[PATCH v2 3/8] dts: add locks for parallel node connections

2022-07-11 Thread Juraj Linkeš

Each lock is held per node. The lock assures that multiple connections
to the same node don't execute anything at the same time, removing the
possibility of race conditions.

Signed-off-by: Juraj Linkeš 
---
 dts/framework/ssh_pexpect.py | 14 --
 dts/framework/utils.py   | 88 
 2 files changed, 99 insertions(+), 3 deletions(-)

diff --git a/dts/framework/ssh_pexpect.py b/dts/framework/ssh_pexpect.py
index c73c1048a4..01ebd1c010 100644
--- a/dts/framework/ssh_pexpect.py
+++ b/dts/framework/ssh_pexpect.py
@@ -12,7 +12,7 @@
 from .exception import (SSHConnectionException, SSHSessionDeadException,
 TimeoutException)
 from .logger import DTSLOG
-from .utils import GREEN, RED
+from .utils import GREEN, RED, parallel_lock
 
 """
 Module handles ssh sessions to TG and SUT.
@@ -33,6 +33,7 @@ def __init__(
 username: str,
 password: Optional[str],
 logger: DTSLOG,
+sut_id: int,
 ):
 self.magic_prompt = "MAGIC PROMPT"
 self.logger = logger
@@ -42,11 +43,18 @@ def __init__(
 self.password = password or ""
 self.logger.info(f"ssh {self.username}@{self.node}")
 
-self._connect_host()
+self._connect_host(sut_id=sut_id)
 
-def _connect_host(self) -> None:
+@parallel_lock(num=8)
+def _connect_host(self, sut_id: int = 0) -> None:
 """
 Create connection to assigned node.
+Parameter sut_id will be used in parallel_lock thus can assure
+isolated locks for each node.
+Parallel ssh connections are limited to MaxStartups option in SSHD
+configuration file. By default concurrent number is 10, so default
+threads number is limited to 8 which less than 10. Lock number can
+be modified along with MaxStartups value.
 """
 retry_times = 10
 try:
diff --git a/dts/framework/utils.py b/dts/framework/utils.py
index 7036843dd7..a637c4641e 100644
--- a/dts/framework/utils.py
+++ b/dts/framework/utils.py
@@ -1,7 +1,95 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
 #
 
+import threading
+from functools import wraps
+from typing import Any, Callable, TypeVar
+
+locks_info: list[dict[str, Any]] = list()
+
+T = TypeVar("T")
+
+
+def parallel_lock(num: int = 1) -> Callable[[Callable[..., T]], Callable[..., 
T]]:
+"""
+Wrapper function for protect parallel threads, allow multiple threads
+share one lock. Locks are created based on function name. Thread locks are
+separated between SUTs according to argument 'sut_id'.
+Parameter:
+num: Number of parallel threads for the lock
+"""
+global locks_info
+
+def decorate(func: Callable[..., T]) -> Callable[..., T]:
+# mypy does not know how to handle the types of this function, so Any 
is required
+@wraps(func)
+def wrapper(*args: Any, **kwargs: Any) -> T:
+if "sut_id" in kwargs:
+sut_id = kwargs["sut_id"]
+else:
+sut_id = 0
+
+# in case function arguments is not correct
+if sut_id >= len(locks_info):
+sut_id = 0
+
+lock_info = locks_info[sut_id]
+uplock = lock_info["update_lock"]
+
+name = func.__name__
+uplock.acquire()
+
+if name not in lock_info:
+lock_info[name] = dict()
+lock_info[name]["lock"] = threading.RLock()
+lock_info[name]["current_thread"] = 1
+else:
+lock_info[name]["current_thread"] += 1
+
+lock = lock_info[name]["lock"]
+
+# make sure when owned global lock, should also own update lock
+if lock_info[name]["current_thread"] >= num:
+if lock._is_owned():
+print(
+RED(
+f"SUT{sut_id:d} {threading.current_thread().name} 
waiting for func lock {func.__name__}"
+)
+)
+lock.acquire()
+else:
+uplock.release()
+
+try:
+ret = func(*args, **kwargs)
+except Exception as e:
+if not uplock._is_owned():
+uplock.acquire()
+
+if lock._is_owned():
+lock.release()
+lock_info[name]["current_thread"] = 0
+uplock.release()
+raise e
+
+if not uplock._is_owned():
+uplock.acquire()
+
+if lock._is_owned():
+lock.release()
+lock_info[name]["current_thread"] = 0
+
+uplock.release()
+
+return ret
+
+return wrapper
+
+return decorate
+
 
 def RED(text: str) -> str:
 retu

[PATCH v2 4/8] dts: add ssh connection extension

2022-07-11 Thread Juraj Linkeš

The class adds logging and history records to existing pexpect methods.

Signed-off-by: Juraj Linkeš 
---
 dts/framework/ssh_connection.py | 70 +
 1 file changed, 70 insertions(+)
 create mode 100644 dts/framework/ssh_connection.py

diff --git a/dts/framework/ssh_connection.py b/dts/framework/ssh_connection.py
new file mode 100644
index 00..0ffde0bc58
--- /dev/null
+++ b/dts/framework/ssh_connection.py
@@ -0,0 +1,70 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+import dataclasses
+from typing import Any, Optional
+
+from .logger import DTSLOG
+from .ssh_pexpect import SSHPexpect
+
+
+@dataclasses.dataclass(slots=True, frozen=True)
+class HistoryRecord:
+command: str
+name: str
+output: str | int
+
+
+class SSHConnection(object):
+"""
+Module for create session to node.
+"""
+
+name: str
+history: list[HistoryRecord]
+logger: DTSLOG
+session: SSHPexpect | Any
+
+def __init__(
+self,
+node: str,
+session_name: str,
+logger: DTSLOG,
+username: str,
+password: Optional[str] = "",
+sut_id: int = 0,
+):
+self.session = SSHPexpect(node, username, password, logger, 
sut_id=sut_id)
+self.name = session_name
+self.history = []
+self.logger = logger
+
+def send_expect(
+self, cmds: str, expected: str, timeout: int = 15, verify: bool = False
+) -> str | int:
+self.logger.info(cmds)
+out = self.session.send_expect(cmds, expected, timeout, verify)
+if isinstance(out, str):
+self.logger.debug(out.replace(cmds, ""))
+self.history.append(HistoryRecord(command=cmds, name=self.name, 
output=out))
+return out
+
+def send_command(self, cmds: str, timeout: float = 1) -> str:
+self.logger.info(cmds)
+out = self.session.send_command(cmds, timeout)
+self.logger.debug(out.replace(cmds, ""))
+self.history.append(HistoryRecord(command=cmds, name=self.name, 
output=out))
+return out
+
+def get_session_before(self, timeout: float = 15) -> str:
+out = self.session.get_session_before(timeout)
+self.logger.debug(out)
+return out
+
+def close(self, force: bool = False) -> None:
+if getattr(self, "logger", None):
+self.logger.logger_exit()
+
+self.session.close(force)
-- 
2.30.2

[PATCH v2 5/8] dts: add config parser module

2022-07-11 Thread Juraj Linkeš

From: Owen Hilyard 

The configuration is split into two parts, one defining the parameters
of the test run and the other defining the topology to be used.

The format of the configuration is YAML. It is validated according to a
json schema which also servers as detailed documentation of the various
configuration fields. This means that the complete set of allowed values
are tied to the schema as a source of truth. This enables making changes
to parts of DTS that interface with config files without a high risk of
breaking someone's configuration.

This configuration system uses immutable objects to represent the
configuration, making IDE/LSP autocomplete work properly.

Signed-off-by: Owen Hilyard 
Signed-off-by: Juraj Linkeš 
---
 dts/conf.yaml  |   7 ++
 dts/framework/config/__init__.py   | 116 +
 dts/framework/config/conf_yaml_schema.json |  73 +
 dts/framework/exception.py |  15 +++
 dts/framework/settings.py  |  40 +++
 5 files changed, 251 insertions(+)
 create mode 100644 dts/conf.yaml
 create mode 100644 dts/framework/config/__init__.py
 create mode 100644 dts/framework/config/conf_yaml_schema.json
 create mode 100644 dts/framework/settings.py

diff --git a/dts/conf.yaml b/dts/conf.yaml
new file mode 100644
index 00..11b5f53a66
--- /dev/null
+++ b/dts/conf.yaml
@@ -0,0 +1,7 @@
+executions:
+  - system_under_test: "SUT 1"
+nodes:
+  - name: "SUT 1"
+hostname: "SUT IP Address or hostname"
+user: root
+password: "leave blank to use SSH keys"
diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py
new file mode 100644
index 00..511e70c9a5
--- /dev/null
+++ b/dts/framework/config/__init__.py
@@ -0,0 +1,116 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2021 Intel Corporation
+# Copyright(c) 2022 University of New Hampshire
+#
+
+"""
+Generic port and topology nodes configuration file load function
+"""
+import json
+import os.path
+import pathlib
+from dataclasses import dataclass
+from enum import Enum, auto, unique
+from typing import Any, Optional
+
+import warlock
+import yaml
+
+from framework.settings import get_config_file_path
+
+
+class StrEnum(Enum):
+@staticmethod
+def _generate_next_value_(
+name: str, start: int, count: int, last_values: object
+) -> str:
+return name
+
+
+@unique
+class NodeType(StrEnum):
+physical = auto()
+virtual = auto()
+
+
+# Slots enables some optimizations, by pre-allocating space for the defined
+# attributes in the underlying data structure.
+#
+# Frozen makes the object immutable. This enables further optimizations,
+# and makes it thread safe should we every want to move in that direction.
+@dataclass(slots=True, frozen=True)
+class NodeConfiguration:
+name: str
+hostname: str
+user: str
+password: Optional[str]
+
+@staticmethod
+def from_dict(d: dict) -> "NodeConfiguration":
+return NodeConfiguration(
+name=d["name"],
+hostname=d["hostname"],
+user=d["user"],
+password=d.get("password"),
+)
+
+
+@dataclass(slots=True, frozen=True)
+class ExecutionConfiguration:
+system_under_test: str
+
+@staticmethod
+def from_dict(d: dict) -> "ExecutionConfiguration":
+return ExecutionConfiguration(
+system_under_test=d["system_under_test"],
+)
+
+
+@dataclass(slots=True, frozen=True)
+class Configuration:
+executions: list[ExecutionConfiguration]
+nodes: list[NodeConfiguration]
+
+@staticmethod
+def from_dict(d: dict) -> "Configuration":
+executions: list[ExecutionConfiguration] = list(
+map(ExecutionConfiguration.from_dict, d["executions"])
+)
+nodes: list[NodeConfiguration] = list(
+map(NodeConfiguration.from_dict, d["nodes"])
+)
+assert len(nodes) > 0, "There must be a node to test"
+
+for i, n1 in enumerate(nodes):
+for j, n2 in enumerate(nodes):
+if i != j:
+assert n1.name == n2.name, "Duplicate node names are not 
allowed"
+
+node_names = {node.name for node in nodes}
+for execution in executions:
+assert (
+execution.system_under_test in node_names
+), f"Unknown SUT {execution.system_under_test} in execution"
+
+return Configuration(executions=executions, nodes=nodes)
+
+
+def load_config(conf_file_path: str) -> Configuration:
+"""
+Loads the configuration file and the configuration file schema,
+validates the configuration file, and creates a configuration object.
+"""
+conf_file_path: str = get_config_file_path(conf_file_path)
+with open(conf_file_path, "r") as f:
+config_data = yaml.safe_load(f)
+schema_path = os.path.join(
+pathlib.Path(__file__).parent.resolve(), "conf_yaml_schema.json"
+

[PATCH v2 6/8] dts: add Node base class

2022-07-11 Thread Juraj Linkeš

The base class implements basic node management methods - connect and
execute commands.

Signed-off-by: Juraj Linkeš 
---
 dts/framework/node.py | 108 ++
 dts/framework/settings.py |   3 ++
 2 files changed, 111 insertions(+)
 create mode 100644 dts/framework/node.py

diff --git a/dts/framework/node.py b/dts/framework/node.py
new file mode 100644
index 00..09f567f32f
--- /dev/null
+++ b/dts/framework/node.py
@@ -0,0 +1,108 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+
+from typing import Optional
+
+from .config import NodeConfiguration
+from .logger import DTSLOG, getLogger
+from .settings import TIMEOUT
+from .ssh_connection import SSHConnection
+
+"""
+A node is a generic host that DTS connects to and manages.
+"""
+
+
+class Node(object):
+"""
+Basic module for node management. This module implements methods that
+manage a node, such as information gathering (of CPU/PCI/NIC) and
+environment setup.
+"""
+
+_config: NodeConfiguration
+name: str
+sut_id: int
+logger: DTSLOG
+session: SSHConnection
+
+def __init__(self, node_config: NodeConfiguration, sut_id: int = 0):
+self._config = node_config
+self.name = node_config.name
+self.sut_id = sut_id
+
+self.logger = getLogger(self.name)
+self.logger.info(f"Created node: {self.name}")
+self.session = SSHConnection(
+self.get_ip_address(),
+self.name,
+self.logger,
+self.get_username(),
+self.get_password(),
+self.sut_id,
+)
+
+def get_ip_address(self) -> str:
+"""
+Get SUT's ip address.
+"""
+return self._config.hostname
+
+def get_password(self) -> Optional[str]:
+"""
+Get SUT's login password.
+"""
+return self._config.password
+
+def get_username(self) -> str:
+"""
+Get SUT's login username.
+"""
+return self._config.user
+
+def send_expect(
+self,
+command: str,
+expected: str,
+timeout: int = TIMEOUT,
+verify: bool = False,
+trim_whitespace: bool = True,
+) -> str | int:
+"""
+Send commands to node and return string before expected string. If
+there's no expected string found before timeout, TimeoutException will
+be raised.
+
+By default, it will trim the whitespace from the expected string. This
+behavior can be turned off via the trim_whitespace argument.
+"""
+
+if trim_whitespace:
+expected = expected.strip()
+
+return self.session.send_expect(command, expected, timeout, verify)
+
+def send_command(self, cmds: str, timeout: float = TIMEOUT) -> str:
+"""
+Send commands to node and return string before timeout.
+"""
+
+return self.session.send_command(cmds, timeout)
+
+def close(self) -> None:
+"""
+Close ssh session of SUT.
+"""
+if self.session:
+self.session.close()
+self.session = None
+
+def node_exit(self) -> None:
+"""
+Recover all resource before node exit
+"""
+self.close()
+self.logger.logger_exit()
diff --git a/dts/framework/settings.py b/dts/framework/settings.py
index ca3cbd71b1..f5028e88c9 100644
--- a/dts/framework/settings.py
+++ b/dts/framework/settings.py
@@ -7,6 +7,9 @@
 import os
 import re
 
+# Default session timeout.
+TIMEOUT: int = 15
+
 DEFAULT_CONFIG_FILE_PATH: str = "./conf.yaml"
 
 # DTS global environment variables
-- 
2.30.2

[PATCH v2 7/8] dts: add dts workflow module

2022-07-11 Thread Juraj Linkeš

The module implements methods needed to run DTS. It handles the creation
of objects and eventually the whole DTS workflow, such as running node
setups, test gathering, setup and execution and various cleanups.

Signed-off-by: Juraj Linkeš 
---
 dts/framework/dts.py   | 76 ++
 dts/framework/utils.py | 35 +--
 2 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 dts/framework/dts.py

diff --git a/dts/framework/dts.py b/dts/framework/dts.py
new file mode 100644
index 00..52ec0638df
--- /dev/null
+++ b/dts/framework/dts.py
@@ -0,0 +1,76 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2019 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+
+import sys
+from typing import Iterable, Optional
+
+import framework.logger as logger
+
+from .config import Configuration, load_config
+from .logger import getLogger
+from .node import Node
+from .utils import check_dts_python_version, create_parallel_locks
+
+log_handler: Optional[logger.DTSLOG] = None
+
+
+def dts_nodes_exit(nodes: Iterable[Node]) -> None:
+"""
+Call SUT and TG exit function after execution finished
+"""
+for node in nodes:
+node.node_exit()
+
+
+def run_all(
+config_file,
+verbose,
+) -> None:
+"""
+Main process of DTS, it will run all test suites in the config file.
+"""
+
+global log_handler
+
+# check the python version of the server that run dts
+check_dts_python_version()
+
+# init log_handler handler
+if verbose is True:
+logger.set_verbose()
+
+log_handler = getLogger("dts")
+
+# parse input config file
+config: Configuration = load_config(config_file)
+
+# init global lock
+create_parallel_locks(len(config.nodes))
+
+nodes = []
+try:
+nodes = [
+Node(node_config, sut_id=i)
+for i, node_config in enumerate(config.nodes)
+]
+dts_nodes_exit(nodes)
+finally:
+quit_execution(nodes)
+
+
+def quit_execution(nodes: Iterable[Node]) -> None:
+"""
+Close session to SUT and TG before quit.
+Return exit status when failure occurred.
+"""
+for node in nodes:
+# close all session
+node.node_exit()
+
+if log_handler is not None:
+log_handler.info("DTS ended")
+
+sys.exit(0)
diff --git a/dts/framework/utils.py b/dts/framework/utils.py
index a637c4641e..1f7f28d0c5 100644
--- a/dts/framework/utils.py
+++ b/dts/framework/utils.py
@@ -4,15 +4,29 @@
 # Copyright(c) 2022 University of New Hampshire
 #
 
+import sys
 import threading
 from functools import wraps
 from typing import Any, Callable, TypeVar
 
 locks_info: list[dict[str, Any]] = list()
 
-T = TypeVar("T")
+
+def create_parallel_locks(num_suts: int) -> None:
+"""
+Create thread lock dictionary based on SUTs number
+"""
+global locks_info
+
+locks_info = list()
+for _ in range(num_suts):
+lock_info = dict()
+lock_info["update_lock"] = threading.RLock()
+locks_info.append(lock_info)
 
 
+T = TypeVar("T")
+
 def parallel_lock(num: int = 1) -> Callable[[Callable[..., T]], Callable[..., 
T]]:
 """
 Wrapper function for protect parallel threads, allow multiple threads
@@ -21,7 +35,6 @@ def parallel_lock(num: int = 1) -> Callable[[Callable[..., 
T]], Callable[..., T]
 Parameter:
 num: Number of parallel threads for the lock
 """
-global locks_info
 
 def decorate(func: Callable[..., T]) -> Callable[..., T]:
 # mypy does not know how to handle the types of this function, so Any 
is required
@@ -97,3 +110,21 @@ def RED(text: str) -> str:
 
 def GREEN(text: str) -> str:
 return f"\u001B[32;1m{str(text)}\u001B[0m"
+
+
+def check_dts_python_version() -> None:
+if (
+sys.version_info.major < 3
+or (sys.version_info.major == 3 and sys.version_info.minor < 10)
+):
+print(
+RED(
+(
+"WARNING: Dts running node python version is lower than 
python 3.10, "
+"it is deprecated for use in DTS, "
+"and will not work in future releases."
+)
+),
+file=sys.stderr,
+)
+print(RED("Please use Python >= 3.10 instead"), file=sys.stderr)
\ No newline at end of file
-- 
2.30.2

[PATCH v2 8/8] dts: add dts executable script

2022-07-11 Thread Juraj Linkeš

The script is an interface to run DTS with standard argument parser.

Signed-off-by: Juraj Linkeš 
---
 dts/main.py | 47 +++
 1 file changed, 47 insertions(+)
 create mode 100755 dts/main.py

diff --git a/dts/main.py b/dts/main.py
new file mode 100755
index 00..e228604b52
--- /dev/null
+++ b/dts/main.py
@@ -0,0 +1,47 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+# Copyright(c) 2022 PANTHEON.tech s.r.o.
+# Copyright(c) 2022 University of New Hampshire
+#
+
+"""
+A test framework for testing DPDK.
+"""
+
+import argparse
+import logging
+
+from framework import dts
+from framework.settings import DEFAULT_CONFIG_FILE_PATH
+
+
+def main() -> None:
+# Read cmd-line args
+parser = argparse.ArgumentParser(description="DPDK test framework.")
+
+parser.add_argument(
+"--config-file",
+default=DEFAULT_CONFIG_FILE_PATH,
+help="configuration file that describes the test cases, SUTs and 
targets",
+)
+
+parser.add_argument(
+"-v",
+"--verbose",
+action="store_true",
+help="enable verbose output, all message output on screen",
+)
+
+args = parser.parse_args()
+
+dts.run_all(
+args.config_file,
+args.verbose,
+)
+
+
+# Main program begins here
+if __name__ == "__main__":
+logging.raiseExceptions = True
+main()
-- 
2.30.2

Re: [PATCH] doc: announce change in event vector structure

2022-07-11 Thread Jerin Jacob

On Fri, Jun 24, 2022 at 2:42 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> The field `*u64s` in the structure `rte_event_vector` will
> be replaced with `u64s`.
>
> Signed-off-by: Pavan Nikhilesh 


Acked-by: Jerin Jacob 


> ---
>  doc/guides/rel_notes/deprecation.rst | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index a3c4f4ac09..6b50ec4865 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -128,3 +128,6 @@ Deprecation Notices
>
>  * eventdev: The function pointer declaration ``eventdev_stop_flush_t`` will 
> be
>renamed to ``rte_eventdev_stop_flush_t`` in DPDK 22.11.
> +
> +* eventdev: The element ``*u64s`` in the structure ``rte_event_vector`` is
> +  deprecated and will be replaced with ``u64s`` in DPDK 22.11.
> --
> 2.25.1
>

Re: [PATCH] doc: announce change to cryptodev cb function prototype

2022-07-11 Thread Jerin Jacob

On Wed, Jul 6, 2022 at 2:41 PM Zhang, Roy Fan  wrote:
>
> > -Original Message-
> > From: Srujana Challa 
> > Sent: Friday, June 24, 2022 2:45 PM
> > To: gak...@marvell.com; Zhang, Roy Fan 
> > Cc: dev@dpdk.org; jer...@marvell.com; ndabilpu...@marvell.com;
> > ano...@marvell.com
> > Subject: [PATCH] doc: announce change to cryptodev cb function prototype

> > --
> > 2.25.1
> Acked-by: Fan Zhang 


Acked-by: Jerin Jacob

Re: [PATCH v2 1/1] doc: announce addition of new ipsec event subtypes

2022-07-11 Thread Jerin Jacob

On Wed, Jul 6, 2022 at 3:19 PM Zhang, Roy Fan  wrote:
>
> > -Original Message-
> > From: Vamsi Attunuru 
> > Sent: Monday, June 27, 2022 6:30 AM
> > To: dev@dpdk.org
> > Cc: jer...@marvell.com; vattun...@marvell.com; gak...@marvell.com;
> > ferruh.yi...@intel.com; tho...@monjalon.net; ano...@marvell.com;
> > m...@ashroe.eu
> > Subject: [PATCH v2 1/1] doc: announce addition of new ipsec event subtypes
> >
> > New event subtypes need to be added for notifying expiry events
> > upon reaching IPsec SA soft packet expiry and hard packet/byte
> > expiry limits. This would be added in DPDK 22.11.
> >
> > Signed-off-by: Vamsi Attunuru 
> > Acked-by: Akhil Goyal 
> > ---
> > More details on new event subtype proposal discussion are in below email
> > thread.
> > https://patches.dpdk.org/project/dpdk/patch/20220416192530.173895-8-
> > gak...@marvell.com/
> >
> > v2: rephrase title and git log.
> > ---
> Acked-by: Fan Zhang 

Acked-by: Jerin Jacob

Re: [PATCH 1/2] doc: add enqueue depth for new event type

2022-07-11 Thread Jerin Jacob

On Mon, Jun 27, 2022 at 3:29 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> A new field ``max_event_port_enqueue_new_burst`` will be added to the
> structure ``rte_event_dev_info``. The field defines the max enqueue
> burst size of new events (OP_NEW) supported by the underlying event
> device.
>
> Signed-off-by: Pavan Nikhilesh 

Acked-by: Jerin Jacob 


> ---
>  doc/guides/rel_notes/deprecation.rst | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 4e5b23c53d..071317e8e3 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -125,3 +125,8 @@ Deprecation Notices
>applications should be updated to use the ``dmadev`` library instead,
>with the underlying HW-functionality being provided by the ``ioat`` or
>``idxd`` dma drivers
> +
> +* eventdev: The structure ``rte_event_dev_info`` will be extended to include 
> the
> +  max enqueue burst size of new events supported by the underlying event 
> device.
> +  A new field ``max_event_port_enqueue_new_burst`` will be added to the 
> structure
> +  ``rte_event_dev_info`` in DPDK 22.11.
> --
> 2.25.1
>

Re: [PATCH] doc: add event timer expiry drop stat

2022-07-11 Thread Jerin Jacob

On Fri, Jul 1, 2022 at 12:43 AM Carrillo, Erik G
 wrote:
>
> > -Original Message-
> > From: Naga Harish K, S V 
> > Sent: Monday, June 27, 2022 10:40 AM
> > To: m...@ashroe.eu; jer...@marvell.com; pbhagavat...@marvell.com;
> > sthot...@marvell.com; Carrillo, Erik G 
> > Cc: dev@dpdk.org
> > Subject: [PATCH] doc: add event timer expiry drop stat
> >
> > The structure ``rte_event_timer_adapter_stats`` will be extended by adding
> > a new field, ``evtim_drop_count``. This stat will represent the number of
> > times an event timer expiry is dropped by the event timer adapter.
> >
> > Signed-off-by: Naga Harish K S V 
> Reviewed-by: Erik Gabriel Carrillo 

Acked-by: Jerin Jacob

Re: [PATCH] doc: announce changes to event vector structure

2022-07-11 Thread Jerin Jacob

On Tue, Jun 28, 2022 at 5:01 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> The structure ``rte_event_vector`` will be modified to include
> ``elem_offset:12`` bits taken from ``rsvd:15``.
> The ``elem_offset`` defines the offset into the vector array from
> which valid elements are present.
>
> Signed-off-by: Pavan Nikhilesh 


Acked-by: Jerin Jacob 


> ---
>  doc/guides/rel_notes/deprecation.rst | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 4e5b23c53d..d7933629f2 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -125,3 +125,10 @@ Deprecation Notices
>applications should be updated to use the ``dmadev`` library instead,
>with the underlying HW-functionality being provided by the ``ioat`` or
>``idxd`` dma drivers
> +
> +* eventdev: The structure ``rte_event_vector`` will be modified to include
> +  ``elem_offset:12`` bits taken from ``rsvd:15``. The ``elem_offset`` defines
> +  the offset into the vector array from which valid elements are present.
> +  The difference between ``rte_event_vector::nb_elem`` and
> +  ``rte_event_vector::elem_offset`` gives the number of valid elements left 
> to
> +  process from the ``rte_event_vector::elem_offset``.
> --
> 2.25.1
>

Re: [PATCH] doc: announce change in crypto adapter queue add

2022-07-11 Thread Jerin Jacob

On Tue, Jun 28, 2022 at 6:12 PM Akhil Goyal  wrote:
>
> > Subject: [PATCH] doc: announce change in crypto adapter queue add
> >
> > The function `rte_event_crypto_adapter_queue_pair_add` will accept
> > `rte_event_crypto_adapter_queue_conf` argument instead of `rte_event`.
> >
> > Signed-off-by: Volodymyr Fialko 
> Acked-by: Akhil Goyal 

Acked-by: Jerin Jacob 


>

Re: [PATCH] doc: announce support for MACsec in rte_security

2022-07-11 Thread Jerin Jacob

On Wed, Jun 29, 2022 at 1:10 PM Zhang, Roy Fan  wrote:
>
> > -Original Message-
> > From: Akhil Goyal 
> > Sent: Tuesday, June 28, 2022 8:08 PM
> > To: dev@dpdk.org
> > Cc: tho...@monjalon.net; david.march...@redhat.com;
> > hemant.agra...@nxp.com; ano...@marvell.com;
> > konstantin.v.anan...@yandex.ru; ferruh.yi...@xilinx.com;
> > andrew.rybche...@oktetlabs.ru; ndabilpu...@marvell.com;
> > vattun...@marvell.com; ma...@nvidia.com; Zhang, Roy Fan
> > ; jer...@marvell.com; jiawe...@trustnetic.com;
> > Yang, Qiming ; Akhil Goyal 
> > Subject: [PATCH] doc: announce support for MACsec in rte_security
> >
> > MACsec support is planned for DPDK 22.11, which would
> > result in ABI breakage in some of the rte_security structures.
> > This patch is to give deprecation notice for the affected structures.
> >
> > Signed-off-by: Akhil Goyal 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index 4e5b23c53d..1c3bf54d72 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -116,6 +116,11 @@ Deprecation Notices
> >pointer for the private data to the application which can be attached
> >to the packet while enqueuing.
> >
> > +* security: MACsec support is planned to be added in DPDK 22.11 which
> > would
> > +  result in updates to structures ``rte_security_macsec_xform``,
> > +  ``rte_security_macsec_stats`` and security capability structure
> > +  ``rte_security_capability`` to accomodate MACsec capabilities.
> > +
> >  * metrics: The function ``rte_metrics_init`` will have a non-void return
> >in order to notify errors instead of calling ``rte_exit``.
> >
> > --
> > 2.25.1
> Acked-by: Fan Zhang 

Acked-by: Jerin Jacob

Re: [PATCH] doc: announce marking device and driver objects as internal

2022-07-11 Thread Jerin Jacob

On Sun, Jul 10, 2022 at 11:48 AM David Marchand
 wrote:
>
> rte_driver and rte_device are unnecessarily exposed in the public API/ABI.
> Announce that they will be made opaque in the public API and mark
> associated API as internal.
> This impacts all bus, as their driver registration mechanism will be
> made internal.
>
> Note: the PCI bus had a similar deprecation notice that we can remove as
> the new one is more generic.
>
> Signed-off-by: David Marchand 

Acked-by: Jerin Jacob 

> ---
>  doc/guides/rel_notes/deprecation.rst | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index a9fd6676be..b9cc267b30 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -38,6 +38,13 @@ Deprecation Notices
>external users may still register their bus using a new driver header (see
>``enable_driver_sdk`` meson option).
>
> +* drivers: As a followup on the work on the ``rte_bus`` object, the
> +  ``rte_driver`` and ``rte_device`` objects (and as a domino effect, their
> +  bus-specific counterparts) will be made opaque in DPDK 22.11.
> +  Registering a driver on a bus will be marked as an internal API:
> +  external users may still register their drivers using the bus specific
> +  driver header (see ``enable_driver_sdk`` meson option).
> +
>  * mempool: Helper macro ``MEMPOOL_HEADER_SIZE()`` is deprecated and will
>be removed in DPDK 22.11. The replacement macro
>``RTE_MEMPOOL_HEADER_SIZE()`` is internal only.
> @@ -49,11 +56,6 @@ Deprecation Notices
>  * mempool: The mempool API macros ``MEMPOOL_PG_*`` are deprecated and
>will be removed in DPDK 22.11.
>
> -* pci: To reduce unnecessary ABIs exposed by DPDK bus driver, "rte_bus_pci.h"
> -  will be made internal in 21.11 and macros/data structures/functions defined
> -  in the header will not be considered as ABI anymore. This change is 
> inspired
> -  by the RFC https://patchwork.dpdk.org/project/dpdk/list/?series=17176.
> -
>  * lib: will fix extending some enum/define breaking the ABI. There are 
> multiple
>samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which 
> is
>used by iterators, and arrays holding these values are sized with this
> --
> 2.36.1
>

Re: [PATCH] doc: announce rename of octeontx_ep driver

2022-07-11 Thread Jerin Jacob

On Fri, Jul 1, 2022 at 1:40 PM Veerasenareddy Burru  wrote:
>
> To enable single unified driver to support current OcteonTx and
> future Octeon PCI endpoint NICs, octeontx_ep driver is renamed to
> octeon_ep to reflect common driver for all Octeon based
> PCI endpoint NICs.
>
> Signed-off-by: Veerasenareddy Burru 


Acked-by: Jerin Jacob 

> ---
>  doc/guides/rel_notes/deprecation.rst | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index 4e5b23c53d..d50e68aed4 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -125,3 +125,12 @@ Deprecation Notices
>applications should be updated to use the ``dmadev`` library instead,
>with the underlying HW-functionality being provided by the ``ioat`` or
>``idxd`` dma drivers
> +
> +* drivers/octeontx_ep: rename octeontx_ep drivers
> +
> +  Current driver name "octeontx_ep" was to support OcteonTX line of products.
> +  This is renamed to apply for all Octeon EP products:
> +  OcteonTX + future Octeon chipsets.
> +  This deprecation notice is to do following actions in DPDK v22.11 version.
> +
> +  #. Rename ``drivers/net/octeontx_ep/`` to ``drivers/net/octeon_ep/``
> --
> 2.36.0
>

Re: [PATCH v2] net/iavf: fix gtpu extension flow error

2022-07-11 Thread Thomas Monjalon

07/07/2022 09:13, Zhang, Qi Z:
> From: Yang, Qiming 
> > From: Wu, WenxuanX 
> > > From: Wenxuan Wu 
> > >
> > > Due to the change of struct rte_gtp_psc_generic_hdr, kernel driver can
> > > not handle gtp_psc properly, we introduce a new structure to fix this
> > > gap between kernel driver and struct rte_gtp_psc_generic_hdr.
> > >
> > > Fixes: d5eb3e600d9e ("net/iavf: support flow director basic rule")
> > > Cc: simei...@intel.com
> > 
> > This line should be delete before patch sent.
> > 
> > > Cc: sta...@dpdk.com

it is dpdk.org!

> > >
> > > Signed-off-by: Wenxuan Wu 
> > 
> > Acked-by: Qiming Yang  

no uppercase or double space in emails please

> 
> Applied to dpdk-next-net-intel.
> 
> Thanks
> Qi

[PATCH v1 0/4] add uncore api to be called through l3fwd-power

2022-07-11 Thread tadhgkearney

This is targeting 22.11 and aims to add an API to DPDK to allow uncore 
frequency adjustment.
This will be called through the l3fwd-power app, and gives the ability to set 
the min, max and specific frequency index that you want the uncore to be at.

tadhgkearney (4):
  power: add uncore api to power library
  l3fwd-power: add option to call uncore api
  test/power: add unit tests for uncore api
  config: add uncore defines for x86

 app/test/meson.build  |   2 +
 app/test/test_power_uncore.c  | 245 +++
 config/x86/meson.build|   2 +
 doc/guides/prog_guide/power_man.rst   |  27 ++
 .../sample_app_ug/l3_forward_power_man.rst|  28 ++
 examples/l3fwd-power/main.c   | 154 ++-
 lib/power/meson.build |   2 +
 lib/power/rte_power_uncore.c  | 401 ++
 lib/power/rte_power_uncore.h  | 159 +++
 lib/power/version.map |   7 +
 10 files changed, 1018 insertions(+), 1 deletion(-)
 create mode 100644 app/test/test_power_uncore.c
 create mode 100644 lib/power/rte_power_uncore.c
 create mode 100644 lib/power/rte_power_uncore.h

-- 
2.25.1

[PATCH v1 1/4] power: add uncore api to power library

2022-07-11 Thread tadhgkearney

Add api to allow uncore frequency adjustment. This is done through manipulating 
related sysfs entries to adjust the min/max uncore values.
Seven api's are being added that are all public and experimental. Power man 
docs updated to reflect this.

Signed-off-by: tadhgkearney 
---
 doc/guides/prog_guide/power_man.rst |  27 ++
 lib/power/meson.build   |   2 +
 lib/power/rte_power_uncore.c| 401 
 lib/power/rte_power_uncore.h| 151 +++
 lib/power/version.map   |   7 +
 5 files changed, 588 insertions(+)
 create mode 100644 lib/power/rte_power_uncore.c
 create mode 100644 lib/power/rte_power_uncore.h

diff --git a/doc/guides/prog_guide/power_man.rst 
b/doc/guides/prog_guide/power_man.rst
index 98cfd3c1f3..a512bca0b6 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -276,6 +276,33 @@ API Overview for Ethernet PMD Power Management
 * **Set Scaling Max Freq**: Set the maximum frequency (kHz) to be used in 
Frequency
   Scaling mode.
 
+Uncore API
+--
+
+Abstract
+
+Up to 60% power saving can be achieved by reducing the uncore frequency to its 
lowest value. 
+With later kernels, there is now a sysfs entry to allow adjustment of uncore 
frequency. 
+This manipulates the contest of MSR 0x620, which sets min/max of the uncore 
for the SKU.
+
+
+API Overview for Uncore
+~~~
+* **Uncore Power Init**: Initialise uncore power, populate frequency array and 
record 
+  original min & max for pkg & die.
+
+* **Uncore Power Exit**: Exit uncore power, restoring original min & max for 
pkg & die.
+
+* **Get Uncore Power Freq**: Get current uncore freq index for pkg & die.
+
+* **Set Uncore Power Freq**: Set min & max uncore freq index for pkg & die 
(min and max will be the same).
+
+* **Uncore Power Max**: Set max uncore freq index for pkg & die.
+
+* **Uncore Power Min**: Set min uncore freq index for pkg & die.
+
+* **Get Num Freqs**: Get the number of frequencies in the index array.
+
 References
 --
 
diff --git a/lib/power/meson.build b/lib/power/meson.build
index ba8d66074b..80cdeb72d4 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,12 +21,14 @@ sources = files(
 'rte_power.c',
 'rte_power_empty_poll.c',
 'rte_power_pmd_mgmt.c',
+'rte_power_uncore.c',
 )
 headers = files(
 'rte_power.h',
 'rte_power_empty_poll.h',
 'rte_power_pmd_mgmt.h',
 'rte_power_guest_channel.h',
+'rte_power_uncore.h',
 )
 if cc.has_argument('-Wno-cast-qual')
 cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
new file mode 100644
index 00..5fd8313b03
--- /dev/null
+++ b/lib/power/rte_power_uncore.c
@@ -0,0 +1,401 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include 
+
+#include 
+
+#include "rte_power_uncore.h"
+#include "power_common.h"
+
+#define BUS_FREQ 10
+
+#define POWER_GOVERNOR_PERF "performance"
+#define POWER_UNCORE_SYSFILE_MAX_FREQ \
+
"/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/max_freq_khz"
+#define POWER_UNCORE_SYSFILE_MIN_FREQ  \
+
"/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/min_freq_khz"
+#define POWER_UNCORE_SYSFILE_BASE_MAX_FREQ \
+
"/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/initial_max_freq_khz"
+#define POWER_UNCORE_SYSFILE_BASE_MIN_FREQ  \
+
"/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/initial_min_freq_khz"
+
+
+struct uncore_power_info {
+unsigned int die;/**< Core die id */
+unsigned int pkg;/**< Package id */
+uint32_t freqs[RTE_MAX_UNCORE_FREQS];/**< Frequency array */  
+uint32_t nb_freqs;   /**< Number of available freqs */
+FILE *f_cur_min; /**< FD of scaling_min */
+FILE *f_cur_max; /**< FD of scaling_max */
+FILE *f_base_min;/**< FD of initial min */
+FILE *f_base_max;/**< FD of initial max */
+int cur_idx; /**< Freq index in freqs array */
+uint32_t init_max_freq;  /**< Initial max frequency */
+uint32_t init_min_freq;  /**< Initial min frequency */
+} __rte_cache_aligned;
+
+
+static struct uncore_power_info 
uncore_info[RTE_MAX_NUMA_NODES][RTE_MAX_NUMA_DIE];
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+uint32_t target_uncore_freq, curr_max_freq;
+int ret;
+
+if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+RTE_LOG(ERR, POWER, "Invalid uncore frequency index %u, which "
+"should be less than %u\n", idx, ui->nb_freqs);
+return -1;
+}
+
+target_uncore_freq = ui->freqs[idx];
+
+if (fprin

[PATCH v1 2/4] l3fwd-power: add option to call uncore api

2022-07-11 Thread tadhgkearney

Add option for setting uncore frequency min/max/index, through uncore api.
This will be set for each package and die on the SKU. On exit, uncore frequency 
will be reverted back to previous frequency.

Signed-off-by: tadhgkearney 
---
 .../sample_app_ug/l3_forward_power_man.rst|  28 
 examples/l3fwd-power/main.c   | 154 +-
 2 files changed, 181 insertions(+), 1 deletion(-)

diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst 
b/doc/guides/sample_app_ug/l3_forward_power_man.rst
index 8f6d906200..7b9d99ef54 100644
--- a/doc/guides/sample_app_ug/l3_forward_power_man.rst
+++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst
@@ -97,6 +97,12 @@ where,
 *   -P: Sets all ports to promiscuous mode so that packets are accepted 
regardless of the packet's Ethernet MAC destination address.
 Without this option, only packets with the Ethernet MAC destination 
address set to the Ethernet address of the port are accepted.
 
+*   -u: optional, sets uncore frequency to minimum value.
+
+*   -U: optional, sets uncore frequency to maximum value.
+
+*   -i (frequency index): optional, sets uncore frequency to frequency index 
value, by setting min and max values to be the same.
+
 *   --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues 
from which ports are mapped to which cores.
 
 *   --max-pkt-len: optional, maximum packet length in decimal (64-9600)
@@ -364,3 +370,25 @@ in the DPDK Programmer's Guide for more details on PMD 
power management.
 .. code-block:: console
 
 .//examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f 
--config="(0,0,2),(0,1,3)" --pmd-mgmt=scale
+
+Setting Uncore Values
+-
+
+Uncore frequency can be adjusted through manipulating related sysfs entries to 
adjust the minimum and maximum uncore values. 
+This will be set for each package and die on the SKU. Three options are 
available for setting uncore frequency:
+
+``-u``
+  This will set uncore to minimum frequency possible.
+
+``-U``
+  This will set uncore to maximum frequency possible.
+
+``-i``
+  This will allow you to set the specific uncore frequency index that you 
want, by setting
+  minimum and maximum values to be the same. Frequency index's are set 
10Hz apart from 
+  maximum to minimum.
+  Frequency index values are in decending order, ie, index 0 is maximum 
frequency index.
+
+.. code-block:: console
+
+.//examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f 
--config="(0,0,2),(0,1,3)" -i 1
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 887c6eae3f..98a2e8428a 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -15,6 +15,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -47,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "perf_core.h"
 #include "main.h"
@@ -71,6 +74,7 @@
 
 #ifndef APP_LOOKUP_METHOD
 #define APP_LOOKUP_METHOD APP_LOOKUP_LPM
+
 #endif
 
 #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
@@ -134,9 +138,14 @@
 
 #define NUM_TELSTATS RTE_DIM(telstats_strings)
 
+#define UNCORE_FREQUENCY_DIR "/sys/devices/system/cpu/intel_uncore_frequency"
+
 static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
 static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
 
+/* Max number of nodes times dies available on uncore */
+#define MAX_DIE_NODES (RTE_MAX_NUMA_DIE * RTE_MAX_NUMA_NODES)
+
 /* ethernet addresses of ports */
 static struct rte_ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
 
@@ -145,6 +154,8 @@ static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
 
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
+/* if uncore frequency was enabled without errors */
+static int enabled_uncore = 0;
 /* Ports set in promiscuous mode off by default. */
 static int promiscuous_on = 0;
 /* NUMA is enabled by default. */
@@ -165,6 +176,13 @@ struct telstats_name {
char name[RTE_ETH_XSTATS_NAME_SIZE];
 };
 
+struct uncore_info{
+   unsigned int pkg;
+   unsigned int die;
+};
+
+struct uncore_info ui[MAX_DIE_NODES];
+
 /* telemetry stats to be reported */
 const struct telstats_name telstats_strings[] = {
{"empty_poll"},
@@ -1616,6 +1634,9 @@ print_usage(const char *prgname)
"  [--max-pkt-len PKTLEN]\n"
"  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
"  -P: enable promiscuous mode\n"
+   "  -u: set min frequency for uncore\n"
+   "  -U: set max frequency for uncore\n"
+   "  -i (frequency index): set frequency index for uncore\n"
"  --config (port,queue,lcore): rx queues configuration\n"
"  --high-perf-cores CORELIST: list of high performance cores\n"
"  --perf-config: similar as config, cores specified as indices"
@@ -1672,6 +1693,106 @@ static int parse_max_pkt_len(const char *pktlen)
return len;
 }
 
+static int
+parse_uncore_dir(void)
+{
+

[PATCH v1 3/4] test/power: add unit tests for uncore api

2022-07-11 Thread tadhgkearney

Add basic unit tests covering all seven uncore api's.

Signed-off-by: tadhgkearney 
---
 app/test/meson.build |   2 +
 app/test/test_power_uncore.c | 245 +++
 2 files changed, 247 insertions(+)
 create mode 100644 app/test/test_power_uncore.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 431c5bd318..3ab360ecee 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -100,6 +100,7 @@ test_sources = files(
 'test_power.c',
 'test_power_cpufreq.c',
 'test_power_kvm_vm.c',
+'test_power_uncore.c',
 'test_prefetch.c',
 'test_rand_perf.c',
 'test_rawdev.c',
@@ -240,6 +241,7 @@ fast_tests = [
 ['power_cpufreq_autotest', false, true],
 ['power_autotest', true, true],
 ['power_kvm_vm_autotest', false, true],
+['power_uncore_autotest', true, true],
 ['reorder_autotest', true, true],
 ['service_autotest', true, true],
 ['thash_autotest', true, true],
diff --git a/app/test/test_power_uncore.c b/app/test/test_power_uncore.c
new file mode 100644
index 00..80d1ae4025
--- /dev/null
+++ b/app/test/test_power_uncore.c
@@ -0,0 +1,245 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2022 Intel Corporation
+ */
+
+#include "test.h"
+
+#ifndef RTE_LIB_POWER
+
+static int
+test_power_uncore(void)
+{
+printf("Power management library not supported, skipping test\n");
+return TEST_SKIPPED;
+}
+
+#else
+#include 
+#include 
+
+#define VALID_PKG 0
+#define VALID_DIE 0
+#define INVALID_PKG (RTE_MAX_NUMA_NODES + 1)
+#define INVALID_DIE (RTE_MAX_NUMA_DIE + 1)
+#define VALID_INDEX 1
+#define INVALID_INDEX (RTE_MAX_UNCORE_FREQS + 1)
+
+static int check_power_uncore_init(void)
+{
+int ret;
+
+/* Test initialisation of uncore configuration*/
+ret = rte_power_uncore_init(VALID_PKG, VALID_DIE);
+if(ret < 0){
+printf("Cannot initialise uncore power management for pkg %u die %u, 
this "
+"may occur if environment is not configured "
+"correctly(APCI cpufreq) or operating in another valid "
+"Power management environment\n", VALID_PKG, VALID_DIE);
+return -1;
+}
+
+/* Unsuccessful Test */
+ret = rte_power_uncore_init(INVALID_PKG, INVALID_DIE);
+if (ret == 0) {
+printf("Unexpectedly was able to initialise uncore power managment "
+"for pkg %u die %u\n", INVALID_PKG, INVALID_DIE);
+return -1;
+}
+
+return 0;
+}
+
+static int
+check_power_get_uncore_freq(void)
+{
+int ret;
+
+/* Successfully get uncore freq */
+ret = rte_power_get_uncore_freq(VALID_PKG, VALID_DIE);
+if (ret < 0) {
+printf("Failed to get uncore frequency for pkg %u die %u\n",
+VALID_PKG, VALID_DIE);
+return -1;
+}
+
+/* Unsuccessful Test */
+ret = rte_power_get_uncore_freq(INVALID_PKG, INVALID_DIE);
+if (ret >= 0) {
+printf("Unexpectedly got invalid uncore frequency for pkg %u die %u\n",
+INVALID_PKG, INVALID_DIE);
+return -1;
+}
+
+return 0;
+}
+
+static int
+check_power_set_uncore_freq(void)
+{
+int ret;
+
+/* Successfully set uncore freq */
+ret = rte_power_set_uncore_freq(VALID_PKG, VALID_DIE, VALID_INDEX);
+if (ret < 0) {
+printf("Failed to set uncore frequency for pkg %u die %u index %u\n",
+VALID_PKG, VALID_DIE, VALID_INDEX);
+return -1;
+}
+
+/* Try to unsuccesfully set invlaid uncore freq index */
+ret = rte_power_set_uncore_freq(VALID_PKG, VALID_DIE, INVALID_INDEX);
+if (ret == 0) {
+printf("Unexpectedly set invalid uncore index for pkg %u die %u index 
%u\n",
+VALID_PKG, VALID_DIE, INVALID_INDEX);
+return -1;
+}
+
+/* Unsuccessful Test */
+ret = rte_power_set_uncore_freq(INVALID_PKG, INVALID_DIE, VALID_INDEX);
+if (ret == 0) {
+printf("Unexpectedly set invalid uncore frequency for pkg %u die %u 
index %u\n",
+INVALID_PKG, INVALID_DIE, VALID_INDEX);
+return -1;
+}
+
+return 0;
+}
+
+static int
+check_power_uncore_freq_max(void)
+{
+int ret;
+
+/* Successfully get max uncore freq */
+ret = rte_power_uncore_freq_max(VALID_PKG, VALID_DIE);
+if (ret < 0) {
+printf("Failed to set max uncore frequency for pkg %u die %u\n",
+VALID_PKG, VALID_DIE);
+return -1;
+}
+
+/* Unsuccessful Test */
+ret = rte_power_uncore_freq_max(INVALID_PKG, INVALID_DIE);
+if (ret == 0) {
+printf("Unexpectedly set invalid max uncore frequency for pkg %u die 
%u\n",
+INVALID_PKG, INVALID_DIE);
+return -1;
+}
+
+return 0;
+}
+
+static int
+check_power_uncore_freq_min(void)
+{
+int ret;
+
+/* Successfully get min uncore freq */
+ret =

[PATCH v1 4/4] config: add uncore defines for x86

2022-07-11 Thread tadhgkearney

Add global define for max numa die and max uncore freqs.

Signed-off-by: tadhgkearney 
---
 config/x86/meson.build | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/config/x86/meson.build b/config/x86/meson.build
index 54345c4da3..f81103d8e3 100644
--- a/config/x86/meson.build
+++ b/config/x86/meson.build
@@ -73,3 +73,5 @@ endif
 dpdk_conf.set('RTE_CACHE_LINE_SIZE', 64)
 dpdk_conf.set('RTE_MAX_LCORE', 128)
 dpdk_conf.set('RTE_MAX_NUMA_NODES', 32)
+dpdk_conf.set('RTE_MAX_NUMA_DIE', 1)
+dpdk_conf.set('RTE_MAX_UNCORE_FREQS', 32)
-- 
2.25.1

Re: [PATCH v1 1/4] power: add uncore api to power library

2022-07-11 Thread Stephen Hemminger

On Mon, 11 Jul 2022 16:22:57 +
tadhgkearney  wrote:

> +FILE *f_cur_min; /**< FD of scaling_min */
> +FILE *f_cur_max; /**< FD of scaling_max */
> +FILE *f_base_min;/**< FD of initial min */
> +FILE *f_base_max;/**< FD of initial max */

Do you need to hold these extra FD's open?
Also wasteful to use stdio for simple control sysfs like this.

Re: [PATCH v4] net/i40e: restore disable double VLAN by default

2022-07-11 Thread Thomas Monjalon

07/07/2022 19:04, Kevin Liu:
> +Vlan related Features miss when FW >= 8.4
> +~
> +
> +If FW version >= 8.4, there'll be some Vlan related issues:
> +1. TCI input set for QinQ  is invalid.
> +2. Fail to configure TPID for QinQ.
> +3. Need to enable QinQ before enabling Vlan filter.
> +4. Fail to strip outer Vlan.

Did you check the HTML rendering?
A blank line is missing before the list.

Also you could use #. for automatic list numbering.

I'll fix when pulling.

RE: [PATCH v1 0/4] add uncore api to be called through l3fwd-power

2022-07-11 Thread Pattan, Reshma




> -Original Message-
> From: Kearney, Tadhg 
> Sent: Monday, July 11, 2022 5:23 PM
> To: dev@dpdk.org
> Cc: dave.h...@intel.com; Burakov, Anatoly ;
> Pattan, Reshma ; Kearney, Tadhg
> 
> Subject: [PATCH v1 0/4] add uncore api to be called through l3fwd-power
> 

The below patch  should be the 1st patch in the series. As the macros in this 
patch are used by rest of the patches in the series.

[PATCH v1 4/4] config: add uncore defines for x86 

Thanks,
Reshma

[dpdk-dev v1] crypto/openssl: fix of dstlen passed in HMAC

2022-07-11 Thread Kai Ji

This fix of dstlen passed in OpenSSL 3.0 lib EVP MAC final routine.

Fixes: 75adf1eae44f ("crypto/openssl: update HMAC routine with 3.0 EVP API")

Signed-off-by: Kai Ji 
---
 drivers/crypto/openssl/rte_openssl_pmd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c 
b/drivers/crypto/openssl/rte_openssl_pmd.c
index e01dacc98d..5658b9db66 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd.c
@@ -1395,7 +1395,7 @@ process_openssl_auth_hmac(struct rte_mbuf *mbuf_src, 
uint8_t *dst, int offset,
}
 
 process_auth_final:
-   if (EVP_MAC_final(ctx, dst, &dstlen, sizeof(dst)) != 1)
+   if (EVP_MAC_final(ctx, dst, &dstlen, DIGEST_LENGTH_MAX) != 1)
goto process_auth_err;
 
EVP_MAC_CTX_free(ctx);
-- 
2.17.1

[dpdk-dev v1] crypto/qat: Enable OpenSSL legacy provider in session

2022-07-11 Thread Kai Ji

Some cryptographic algorithms such as MD and DES are now considered legacy
and not enabled by default in OpenSSL 3.0. Load up lagacy provider as MD5
DES are needed in QAT session pre-computes and secure session creation.

Fixes: 3227bc7138f5 ("crypto/qat: use intel-ipsec-mb for partial hash and AES")

Signed-off-by: Kai Ji 
---
 drivers/crypto/qat/qat_sym_session.c | 53 
 1 file changed, 39 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/qat/qat_sym_session.c 
b/drivers/crypto/qat/qat_sym_session.c
index b30396487e..42164cc6c6 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -30,6 +30,35 @@
 
 #if (OPENSSL_VERSION_NUMBER >= 0x3000L)
 #include 
+
+static OSSL_PROVIDER * legacy_lib;
+static OSSL_PROVIDER *default_lib;
+
+/* Some cryptographic algorithms such as MD and DES are now considered legacy
+ * and not enabled by default in OpenSSL 3.0. Load up lagacy provider as MD5
+ * DES are needed in QAT pre-computes and secure session creation.
+ */
+static int ossl_legacy_provider_load(void)
+{
+   /* Load Multiple providers into the default (NULL) library context */
+   legacy_lib = OSSL_PROVIDER_load(NULL, "legacy");
+   if (legacy_lib == NULL)
+   return -EINVAL;
+
+   default_lib = OSSL_PROVIDER_load(NULL, "default");
+   if (default_lib == NULL) {
+   OSSL_PROVIDER_unload(legacy_lib);
+   return  -EINVAL;
+   }
+
+   return 0;
+}
+
+static void ossl_legacy_provider_unload(void)
+{
+   OSSL_PROVIDER_unload(legacy_lib);
+   OSSL_PROVIDER_unload(default_lib);
+}
 #endif
 
 extern int qat_ipsec_mb_lib;
@@ -485,19 +514,8 @@ qat_sym_session_configure(struct rte_cryptodev *dev,
}
 
 #if (OPENSSL_VERSION_NUMBER >= 0x3000L)
-   OSSL_PROVIDER *legacy;
-   OSSL_PROVIDER *deflt;
-
-   /* Load Multiple providers into the default (NULL) library context */
-   legacy = OSSL_PROVIDER_load(NULL, "legacy");
-   if (legacy == NULL)
+   if (ossl_legacy_provider_load())
return -EINVAL;
-
-   deflt = OSSL_PROVIDER_load(NULL, "default");
-   if (deflt == NULL) {
-   OSSL_PROVIDER_unload(legacy);
-   return  -EINVAL;
-   }
 #endif
ret = qat_sym_session_set_parameters(dev, xform, sess_private_data);
if (ret != 0) {
@@ -513,8 +531,7 @@ qat_sym_session_configure(struct rte_cryptodev *dev,
sess_private_data);
 
 # if (OPENSSL_VERSION_NUMBER >= 0x3000L)
-   OSSL_PROVIDER_unload(legacy);
-   OSSL_PROVIDER_unload(deflt);
+   ossl_legacy_provider_unload();
 # endif
return 0;
 }
@@ -2606,6 +2623,10 @@ qat_security_session_create(void *dev,
return -ENOMEM;
}
 
+#if (OPENSSL_VERSION_NUMBER >= 0x3000L)
+   if (ossl_legacy_provider_load())
+   return -EINVAL;
+#endif
ret = qat_sec_session_set_docsis_parameters(cdev, conf,
sess_private_data);
if (ret != 0) {
@@ -2639,6 +2660,10 @@ qat_security_session_destroy(void *dev __rte_unused,
set_sec_session_private_data(sess, NULL);
rte_mempool_put(sess_mp, sess_priv);
}
+
+# if (OPENSSL_VERSION_NUMBER >= 0x3000L)
+   ossl_legacy_provider_unload();
+# endif
return 0;
 }
 #endif
-- 
2.17.1

Re: [PATCH] doc: add IP ECN header rewrite to mlx5 notes

2022-07-11 Thread Thomas Monjalon

11/07/2022 12:49, David Marchand:
> On Mon, Jul 11, 2022 at 11:59 AM Jiawei Wang  wrote:
> >
> > This patch updates the mlx5 notes for IPv4/IPv6 ECN field
> > rewrite offload.
> >
> > Fixes: c4e442fa4c45 ("ethdev: add IPv4/IPv6 ECN header rewrite action")
> > Fixes: 097d84a42a87 ("common/mlx5: check ECN modification capability")
> > Fixes: 76d575612277 ("net/mlx5: support modifying ECN field")
> >
> > Signed-off-by: Jiawei Wang 
> > Acked-by: Asaf Penso 
> > ---
> > +   | | Header rewrite  | | DPDK 22.07| | DPDK 22.07|
> > +   | | (ipv4_ecn)  | | OFED 5.6-2| | OFED 5.6-2|
> > +   | |  ipv6_ecn)  | | rdma-core 41  | | rdma-core 41  |
> 
> I guess the first ) is a typo.
> Did you mean "(ipv4_ecn / ipv6_ecn)" ?

Good catch.
Fixed and applied, thanks.

[Bug 1051] l2fwd-event: signal unsafe usage

2022-07-11 Thread bugzilla

https://bugs.dpdk.org/show_bug.cgi?id=1051

Bug ID: 1051
   Summary: l2fwd-event: signal unsafe usage
   Product: DPDK
   Version: 21.11
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: minor
  Priority: Normal
 Component: examples
  Assignee: dev@dpdk.org
  Reporter: step...@networkplumber.org
  Target Milestone: ---

The signal_handler in l2fwd-event uses several functions that are not
async-signal-safe.

The call to rte_mempool_lookup() in signal_handler could deadlock if
interrupted during another lookup.

Also updating variable in signal handler should use __atomic_store

This is not a bug found by inspection. In practice the code is safe.
But examples are used by users to build applications, should be following best
practices.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [PATCH] config: set pkgconfig for ppc64le

2022-07-11 Thread Thomas Monjalon

+ PPC maintainer

07/07/2022 13:41, Ali Alnubani:
> Meson fails to detect the dependencies that are included
> in PKG_CONFIG_PATH and built for ppc64le if binaries.pkgconfig
> is not set in the ppc64le cross-file for Ubuntu.
> 
> This fixes the issue by setting binaries.pkgconfig to the
> binary provided by the package 'pkg-config-powerpc64le-linux-gnu'.
> 
> Signed-off-by: Ali Alnubani 
> ---
>  config/ppc/ppc64le-power8-linux-gcc-ubuntu | 1 +

Why not adding the same for ppc64le-power8-linux-gcc?

> +pkgconfig = 'powerpc64le-linux-gnu-pkg-config'

Re: [PATCH v2 0/2] DLB2: cq_weight fixes

2022-07-11 Thread Thomas Monjalon

08/07/2022 17:02, McDaniel, Timothy:
> From: Thomas Monjalon 
> > 06/07/2022 23:46, Timothy McDaniel:
> > > This patch series contains the following:
> > > - fix coverity issue 379234
> > > - improve error reporting for cq_weight feature
> > >
> > > Changes since V1
> > > - fixed a repeated word in the commit message of patch 1
> > >
> > > Timothy McDaniel (2):
> > >   event/dlb2: fix cq_weight array overflow
> > >   event/dlb2: improve cq_weight error messages
> > 
> > Should we merge those patches in 22.07-rc4? No risk?
> 
> I see no risk in merging these changes. I retested with them, and one fixes a 
> coverity issue.

Applied, thanks.

RE: [PATCH v2 0/2] DLB2: cq_weight fixes

2022-07-11 Thread McDaniel, Timothy




> -Original Message-
> From: Thomas Monjalon 
> Sent: Monday, July 11, 2022 3:31 PM
> To: McDaniel, Timothy 
> Cc: sta...@dpdk.org; dev@dpdk.org; jer...@marvell.com;
> jerinjac...@gmail.com; sta...@dpdk.org
> Subject: Re: [PATCH v2 0/2] DLB2: cq_weight fixes
> 
> 08/07/2022 17:02, McDaniel, Timothy:
> > From: Thomas Monjalon 
> > > 06/07/2022 23:46, Timothy McDaniel:
> > > > This patch series contains the following:
> > > > - fix coverity issue 379234
> > > > - improve error reporting for cq_weight feature
> > > >
> > > > Changes since V1
> > > > - fixed a repeated word in the commit message of patch 1
> > > >
> > > > Timothy McDaniel (2):
> > > >   event/dlb2: fix cq_weight array overflow
> > > >   event/dlb2: improve cq_weight error messages
> > >
> > > Should we merge those patches in 22.07-rc4? No risk?
> >
> > I see no risk in merging these changes. I retested with them, and one fixes 
> > a
> coverity issue.
> 
> Applied, thanks.
> 

Thank you, Thomas!

--Tim

Re: [PATCH] event/dsw: fix migration bug

2022-07-11 Thread Thomas Monjalon

11/07/2022 14:45, Jerin Jacob:
> On Thu, Jul 7, 2022 at 5:13 PM Mattias Rönnblom  wrote:
> >
> > From: Mattias Rönnblom 
> >
> > Fix bug in flow migration, which under certain conditions causes
> > reordering and violation of atomicity guarantees.
> >
> > The issue occurs when the processing of a flow (on an atomic queue)
> > has resulted in events enqueued to a flow currently being migrated,
> > and the former (producer) flow is also selected for migration. The
> > events are buffered ("paused") on the originating port, and released
> > (forwarded) when the migration has completed. However, at the time of
> > "unpausing" the latter (consumer) flow, processing of the producer
> > flow on the port to which it was migrated may have already produced
> > events, for the same paused flow. This constitutes a race condition,
> > and depending on which port wins, reordering may have been introduced.
> >
> > This patch forbids migration when a port has paused events, since
> > those events may have been the result of processing a to-be-migrated
> > flow.
> >
> > This patch also disallows processing events pertaining to a flow under
> > migration, for the same reason. A new buffer is introduced, which
> > holds such not-yet-processed events dequeued from the port's input
> > ring. Such events are forwarded to the target port as a part of the
> > migration process.
> >
> > The 'forwarding' migration state is eliminated, and instead background
> > processing is only performed if there are no unreleased events on the
> > port.
> >
> > The bug is primarily triggered in situations where multiple flows are
> > migrated as one transaction, but may occur even if only a single flow
> > is migrated (e.g., with older DSW versions, which does not support
> > multi-flow migration).
> >
> > Fixes: f6257b22e767 ("event/dsw: add load balancing")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Mattias Rönnblom 
> 
> + @Thomas Monjalon  to merge to the main tree directly.

No sorry,
it would not be serious to merge such a big patch in the last minute.
And it is not a new bug.

Re: [PATCH v3] doc: update Linux core isolation guide

2022-07-11 Thread Thomas Monjalon

I appreciate you work on the doc.
Please take extra care of the rendering, thanks.


17/05/2022 21:59, pbhagavat...@marvell.com:
> From: Pavan Nikhilesh 
> 
> Update Linux core isolation guide to include isolation from
> timers, rcu processing and IRQs.
> 
> Signed-off-by: Pavan Nikhilesh 
> ---
[...]
> +.. Note::
> +
> + More detailed information about the above parameters can be found at
> + https://www.kernel.org/doc/html/latest/timers/no_hz.html
> + https://www.kernel.org/doc/html/latest/core-api/irq/
> + 
> https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html

These links will appear on the same line.

> +
> +
> +For more fine grained control over resource management and performance 
> tuning one can look
> +into ``Linux cgroups``.
> +
> +Cpusets using cgroups::
> +
> +   https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/cpusets.html
> +
> +Systemd (CPUAffinity)::
> +
> +   https://www.freedesktop.org/software/systemd/man/systemd.exec.html
> +
> +Also, see::
> +
> +   https://man7.org/linux/man-pages/man7/cpuset.7.html
> +   https://www.suse.com/c/cpu-isolation-practical-example-part-5/
> +   https://www.rcannings.com/systemd-core-isolation/

These links are not clickable.
Please give them name to click on.

Re: [dpdk-dev v1] doc/multi-process: fixed grammar and rephrasing

2022-07-11 Thread Thomas Monjalon

Anyone to review?

Please could you go a step further and remove one useless header level,
fix links, enclose code with double backticks and other basic stuff?
Thanks


01/06/2022 11:57, Kai Ji:
> Update and rephrasing some sentences, small improvements
> made to the multi-process sample application user guide
> 
> Fixes: d0dff9ba445e ("doc: sample application user guide")
> Cc: bernard.iremon...@intel.com
> 
> Signed-off-by: Kai Ji 
> ---
>  doc/guides/sample_app_ug/multi_process.rst | 67 +++---
>  1 file changed, 33 insertions(+), 34 deletions(-)
> 
> diff --git a/doc/guides/sample_app_ug/multi_process.rst 
> b/doc/guides/sample_app_ug/multi_process.rst
> index c53331def3..e2a311a426 100644
> --- a/doc/guides/sample_app_ug/multi_process.rst
> +++ b/doc/guides/sample_app_ug/multi_process.rst
> @@ -1,5 +1,5 @@
>  ..  SPDX-License-Identifier: BSD-3-Clause
> -Copyright(c) 2010-2014 Intel Corporation.
> +Copyright(c) 2010-2022 Intel Corporation.
>  
>  .. _multi_process_app:
>  
> @@ -111,7 +111,7 @@ How the Application Works
>  The core of this example application is based on using two queues and a 
> single memory pool in shared memory.
>  These three objects are created at startup by the primary process,
>  since the secondary process cannot create objects in memory as it cannot 
> reserve memory zones,
> -and the secondary process then uses lookup functions to attach to these 
> objects as it starts up.
> +thus the secondary process uses lookup functions to attach to these objects 
> as it starts up.
>  
>  .. literalinclude:: ../../../examples/multi_process/simple_mp/main.c
>  :language: c
> @@ -119,25 +119,25 @@ and the secondary process then uses lookup functions to 
> attach to these objects
>  :end-before: >8 End of ring structure.
>  :dedent: 1
>  
> -Note, however, that the named ring structure used as send_ring in the 
> primary process is the recv_ring in the secondary process.
> +Note, that the named ring structure used as send_ring in the primary process 
> is the recv_ring in the secondary process.
>  
>  Once the rings and memory pools are all available in both the primary and 
> secondary processes,
>  the application simply dedicates two threads to sending and receiving 
> messages respectively.
> -The receive thread simply dequeues any messages on the receive ring, prints 
> them,
> -and frees the buffer space used by the messages back to the memory pool.
> -The send thread makes use of the command-prompt library to interactively 
> request user input for messages to send.
> -Once a send command is issued by the user, a buffer is allocated from the 
> memory pool, filled in with the message contents,
> -then enqueued on the appropriate rte_ring.
> +The receiver thread simply dequeues any messages on the receive ring and 
> prints out in terminal,
> +then the buffer space used by the messages is released back to the memory 
> pool.
> +The sender thread makes use of the command-prompt library to interactively 
> request user input for messages to send.
> +Once a send command is issued, the message contents are put into a buffer 
> that was allocated from the memory pool,
> +which is then enqueued on the appropriate rte_ring.
>  
>  Symmetric Multi-process Example
>  ~~~
>  
> -The second example of DPDK multi-process support demonstrates how a set of 
> processes can run in parallel,
> -with each process performing the same set of packet- processing operations.
> -(Since each process is identical in functionality to the others,
> -we refer to this as symmetric multi-processing, to differentiate it from 
> asymmetric multi- processing -
> -such as a client-server mode of operation seen in the next example,
> -where different processes perform different tasks, yet co-operate to form a 
> packet-processing system.)
> +The second DPDK multi-process example demonstrates how a set of processes 
> can run in parallel,
> +where each process is performing the same set of packet-processing 
> operations.
> +(As each process is identical in functionality to the others,
> +we refer to this as symmetric multi-processing. In the asymmetric 
> multi-processing example,
> +the different client-server mode processes perform different tasks,
> +yet co-operate to form a packet-processing system.)
>  The following diagram shows the data-flow through the application, using two 
> processes.
>  
>  .. _figure_sym_multi_proc_app:
> @@ -155,9 +155,8 @@ Similarly, each process writes outgoing packets to a 
> different TX queue on each
>  Running the Application
>  ^^^
>  
> -As with the simple_mp example, the first instance of the symmetric_mp 
> process must be run as the primary instance,
> -though with a number of other application- specific parameters also provided 
> after the EAL arguments.
> -These additional parameters are:
> +The first instance of the symmetric_mp process must be run as the primary 
> instance,
>

Re: [PATCH 1/1] doc: expand description of no-huge and PMD issue

2022-07-11 Thread Thomas Monjalon

13/06/2022 11:31, David Marchand:
> On Thu, Jun 9, 2022 at 3:40 PM Stanislaw Kardach  wrote:
> >
> > Add more details to the description of a known issue of PMDs not being
> > usable when --no-huge EAL command line parameter is used. The issue
> > actually happens whenever there is a need for physical addresses, even
> > when there is no PMD attached.
> >
> > Signed-off-by: Stanislaw Kardach 
> 
> Anatoly, Dmitry, opinions?
> Thanks.

Why nobody is reviewing doc?

Re: [PATCH v2] doc/eal: add signal safety warning

2022-07-11 Thread Thomas Monjalon

05/07/2022 22:44, Stephen Hemminger:
> The DPDK is not designed to be used from a signal handler.
> Add a notice in the documentation describing this limitation,
> similar to Linux signal-safety manual page.
> 
> Bugzilla ID: 1030
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Chengwen Feng 
> ---
>  doc/guides/prog_guide/env_abstraction_layer.rst | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
> b/doc/guides/prog_guide/env_abstraction_layer.rst
> index 67842ae27207..de7ee92bba39 100644
> --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> @@ -818,6 +818,21 @@ Known Issues
>  
>The debug statistics of rte_ring, rte_mempool and rte_timer are not 
> supported in an unregistered non-EAL pthread.
>  
> ++ signal safety
> +
> +  The DPDK library is not designed to be async-signal-safe.
> +  Except where explicitly stated otherwise [#]_, the DPDK functions are 
> nonreentrant and are unsafe to call from a signal handler.
> +
> +.. [#] Only the function ``rte_dump_stack()`` can safely be called from 
> signal handler in this version of DPDK.

Really? Are you sure?

Note: the use of [#] is probably limited to a single usage in the page?

> +
> +.. note::
> +  The kinds of issues that make DPDK functions unsafe can be understood when
> +  one considers that much of the code in DPDK uses locks and other shared
> +  resources. If a device driver holding a ``rte_spinlock`` is interrupted
> +  by a signal and control operation is then performed that would acquire
> +  the same lock, a deadlock would result.

I find this note quite confusing.

Re: [PATCH v1] docs: update l3fwd sample app docs

2022-07-11 Thread Thomas Monjalon

That's sad to see such patch is not reviewed.


03/06/2022 13:11, Sean Morrissey:
> The L3FWD sample app doc contains outdated static code
> snippets and references to functions which no longer
> exist. This patch updates this doc to use up to date
> references.
> 
> Signed-off-by: Sean Morrissey 
> ---
>  doc/guides/sample_app_ug/l3_forward.rst | 87 +++--
>  examples/l3fwd/l3fwd_em_hlm_sse.h   |  2 +
>  examples/l3fwd/l3fwd_lpm.c  |  2 +
>  3 files changed, 28 insertions(+), 63 deletions(-)

Re: [PATCH v2] doc/eal: add signal safety warning

2022-07-11 Thread Stephen Hemminger

On Mon, 11 Jul 2022 23:15:26 +0200
Thomas Monjalon  wrote:

> 05/07/2022 22:44, Stephen Hemminger:
> > The DPDK is not designed to be used from a signal handler.
> > Add a notice in the documentation describing this limitation,
> > similar to Linux signal-safety manual page.
> > 
> > Bugzilla ID: 1030
> > Signed-off-by: Stephen Hemminger 
> > Acked-by: Tyler Retzlaff 
> > Acked-by: Chengwen Feng 
> > ---
> >  doc/guides/prog_guide/env_abstraction_layer.rst | 15 +++
> >  1 file changed, 15 insertions(+)
> > 
> > diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
> > b/doc/guides/prog_guide/env_abstraction_layer.rst
> > index 67842ae27207..de7ee92bba39 100644
> > --- a/doc/guides/prog_guide/env_abstraction_layer.rst
> > +++ b/doc/guides/prog_guide/env_abstraction_layer.rst
> > @@ -818,6 +818,21 @@ Known Issues
> >  
> >The debug statistics of rte_ring, rte_mempool and rte_timer are not 
> > supported in an unregistered non-EAL pthread.
> >  
> > ++ signal safety
> > +
> > +  The DPDK library is not designed to be async-signal-safe.
> > +  Except where explicitly stated otherwise [#]_, the DPDK functions are 
> > nonreentrant and are unsafe to call from a signal handler.
> > +
> > +.. [#] Only the function ``rte_dump_stack()`` can safely be called from 
> > signal handler in this version of DPDK.  
> 
> Really? Are you sure?

There are some trivial ones that are signal safe that are pure functions.
Like the bitfield and string functions.

But any function that matters such as rte_log, rte_mbuf, rte_mempool,
and any driver function are going to be problematic.

> 
> Note: the use of [#] is probably limited to a single usage in the page?

Isn't it supposed to autonumber?

> 
> > +
> > +.. note::
> > +  The kinds of issues that make DPDK functions unsafe can be understood 
> > when
> > +  one considers that much of the code in DPDK uses locks and other shared
> > +  resources. If a device driver holding a ``rte_spinlock`` is interrupted
> > +  by a signal and control operation is then performed that would acquire
> > +  the same lock, a deadlock would result.  
> 
> I find this note quite confusing.

I based it off what signal-safety says. Yes it really is that bad.

Re: [PATCH] doc: update doc for NVIDIA devices

2022-07-11 Thread Thomas Monjalon

07/07/2022 16:26, Raslan Darawsheh:
> This updates the doc to include new supported devices like ConnectX7,
> and updates the description of older ones.
> 
> Signed-off-by: Raslan Darawsheh 

Completed with an all-in move to NVIDIA trademark,
and applied, thanks.

Signed-off-by: Thomas Monjalon

Re: [PATCH] doc: add tested platforms with NVIDIA NICs

2022-07-11 Thread Thomas Monjalon

07/07/2022 15:46, Raslan Darawsheh:
> Add tested platforms with NVIDIA NICs to the 22.07 release notes.
> 
> Signed-off-by: Raslan Darawsheh 

Renamed all devices to NVIDIA and applied, thanks.

Re: [PATCH V1] doc: add tested Intel platforms with Intel NICs

2022-07-11 Thread Thomas Monjalon

08/07/2022 11:16, Lingli Chen:
> Add tested Intel platforms with Intel NICs to v22.07 release note.
> 
> Signed-off-by: Lingli Chen 
> ---
> index b40d6f9d84..abb787afc6 100644
> --- a/doc/guides/rel_notes/release_22_07.rst
> +++ b/doc/guides/rel_notes/release_22_07.rst
> @@ -356,3 +356,107 @@ Tested Platforms
> This section is a comment. Do not overwrite or remove it.
> Also, make sure to start the actual text at the margin.
> ===
> +
> +   * Intel\ |reg| platforms with Intel\ |reg| NICs combinations

Please read above:
"Also, make sure to start the actual text at the margin."

Removed the extra indent and applied, thanks.

release candidate 22.07-rc4

2022-07-11 Thread Thomas Monjalon

A new DPDK release candidate is ready for testing:
https://git.dpdk.org/dpdk/tag/?id=v22.07-rc4

There are 22 new patches in this snapshot.

Release notes:
https://doc.dpdk.org/guides/rel_notes/release_22_07.html

This is the last release candidate.
Only documentation should be updated before the release.

Reviews of deprecation notices are required:
https://patches.dpdk.org/bundle/dmarchand/deprecation_notices/

You may share some release validation results
by replying to this message at dev@dpdk.org
and by adding tested hardware in the release notes.

Please think about sharing your roadmap now for DPDK 22.11.

Thank you everyone

[PATCH v3] doc/eal: add signal safety warning

2022-07-11 Thread Stephen Hemminger

The DPDK is not designed to be used from a signal handler.
Add a notice in the documentation describing this limitation,
similar to Linux signal-safety manual page.

Bugzilla ID: 1030
Signed-off-by: Stephen Hemminger 
Acked-by: Tyler Retzlaff 
Acked-by: Chengwen Feng 
---
 doc/guides/prog_guide/env_abstraction_layer.rst | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
b/doc/guides/prog_guide/env_abstraction_layer.rst
index 67842ae27207..de7ee92bba39 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -818,6 +818,21 @@ Known Issues
 
   The debug statistics of rte_ring, rte_mempool and rte_timer are not 
supported in an unregistered non-EAL pthread.
 
++ signal safety
+
+  The DPDK library is not designed to be async-signal-safe.
+  Except where explicitly stated otherwise [#]_, the DPDK functions are 
nonreentrant and are unsafe to call from a signal handler.
+
+.. [#] Only the function ``rte_dump_stack()`` can safely be called from signal 
handler in this version of DPDK.
+
+.. note::
+  The kinds of issues that make DPDK functions unsafe can be understood when
+  one considers that much of the code in DPDK uses locks and other shared
+  resources. If a device driver holding a ``rte_spinlock`` is interrupted
+  by a signal and control operation is then performed that would acquire
+  the same lock, a deadlock would result.
+
+
 cgroup control
 ~~
 
-- 
2.35.1

[PATCH v4] doc/eal: add signal safety warning

2022-07-11 Thread Stephen Hemminger

The DPDK is not designed to be used from a signal handler.
Add a notice in the documentation describing this limitation,
similar to Linux signal-safety manual page.

Bugzilla ID: 1030
Signed-off-by: Stephen Hemminger 
Acked-by: Tyler Retzlaff 
Acked-by: Chengwen Feng 
---
v4 - add more functions and clarify

v3 - mistake (ignore it)

 .../prog_guide/env_abstraction_layer.rst  | 70 +++
 1 file changed, 70 insertions(+)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
b/doc/guides/prog_guide/env_abstraction_layer.rst
index 67842ae27207..35fbebe1be04 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -818,6 +818,76 @@ Known Issues
 
   The debug statistics of rte_ring, rte_mempool and rte_timer are not 
supported in an unregistered non-EAL pthread.
 
+Signal Safety
+~
+
+  The Posix API defines an async-signal-safe function as one that can be safely
+  called from with a signal handler. Many DPDK functions are non-reentrant and
+  therefore are unsafe to call from a signal handler.
+
+  The kinds of issues that make DPDK functions unsafe can be understood when
+  one considers that much of the code in DPDK uses locks and other shared
+  resources. For example, calling ``rte_mempool_lookup()`` from a signal
+  would deadlock if the signal happened during previous call ``rte_mempool``
+  routines.
+
+  Other functions are not signal safe because they use one or more
+  library routines that are not themselves signal safe.
+  For example, calling ``rte_panic()`` is not safe in a signal handler
+  because it uses ``rte_log()`` and ``rte_log()`` calls the
+  ``syslog()`` library function which is in the list of
+  signal safe functions in
+  `Signal-Safety manual page 
`_.
+
+  The set of functions that are expected to be async-signal-safe in DPDK
+  is shown in the following table. The functions not otherwise noted
+  are not async-signal-safe.
+
+.. csv-table:: **Signal Safe Functions**
+   :header: "Function"
+   :widths: 32
+
+   rte_dump_stack
+   rte_eal_get_lcore_state
+   rte_eal_get_runtime_dir
+   rte_eal_has_hugepages
+   rte_eal_has_pci
+   rte_eal_lcore_role
+   rte_eal_process_type
+   rte_eal_using_phys_addrs
+   rte_get_hpet_cycles
+   rte_get_hpet_hz
+   rte_get_main_lcore
+   rte_get_next_lcore
+   rte_get_tsc_hz
+   rte_hypervisor_get
+   rte_hypervisor_get_name
+   rte_lcore_count
+   rte_lcore_cpuset
+   rte_lcore_has_role
+   rte_lcore_index
+   rte_lcore_is_enabled
+   rte_lcore_to_cpu_id
+   rte_lcore_to_socket_id
+   rte_log_get_global_level
+   rte_log_get_level
+   rte_memory_get_nchannel
+   rte_memory_get_nrank
+   rte_reciprocal_value
+   rte_reciprocal_value_u64
+   rte_socket_count
+   rte_socket_id
+   rte_socket_id_by_idx
+   rte_strerror
+   rte_strscpy
+   rte_strsplit
+   rte_sys_gettid
+   rte_uuid_compare
+   rte_uuid_is_null
+   rte_uuid_parse
+   rte_uuid_unparse
+
+
 cgroup control
 ~~
 
-- 
2.35.1

Re: [PATCH] doc: announce marking device and driver objects as internal

2022-07-11 Thread Hemant Agrawal




On 7/10/2022 11:47 AM, David Marchand wrote:

rte_driver and rte_device are unnecessarily exposed in the public API/ABI.
Announce that they will be made opaque in the public API and mark
associated API as internal.
This impacts all bus, as their driver registration mechanism will be
made internal.

Note: the PCI bus had a similar deprecation notice that we can remove as
the new one is more generic.

Signed-off-by: David Marchand 


Acked-by:  Hemant Agrawal

RE: [PATCH] common/sfc_efx/base: remove VQ index check during VQ start

2022-07-11 Thread Saini, Abhimanyu

[AMD Official Use Only - General]

On 7/8/22 10:37, abhimanyu.sa...@xilinx.com wrote:
> > From: Abhimanyu Saini 
> >
> > The used/avail queue indexes are not bound by queue
> > size, because the descriptor entry index is calculated
> > by a simple modulo between queue index and queue_size
> 
> "is calculated" is a bit vague since looking at the code
> I've failed to find the place where modulo operation is
> done. Don't we need to apply it these values are put
> into MCDI message?
> 

The values are added to the MCDI as is and,
the modulo is performed by the hardware.

I can append the commit message to reflect the same?

> >
> > So, do not check initial used and avail queue indexes
> > against queue size because it is possible for these
> > indexes to be greater than queue size in the
> > following cases:
> > 1) The queue is created to be migrated into, or
> > 2) The client issues a qstop/qstart after running datapath
> >
> > Fixes: 4dda72dbdeab3 ("common/sfc_efx/base: add base virtio support for
> vDPA")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Abhimanyu Saini 
> > ---
> >   drivers/common/sfc_efx/base/rhead_virtio.c | 12 +---
> >   1 file changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/drivers/common/sfc_efx/base/rhead_virtio.c
> b/drivers/common/sfc_efx/base/rhead_virtio.c
> > index 335cb74..7f08717 100644
> > --- a/drivers/common/sfc_efx/base/rhead_virtio.c
> > +++ b/drivers/common/sfc_efx/base/rhead_virtio.c
> > @@ -47,14 +47,6 @@
> >   goto fail2;
> >   }
> >
> > - if (evvdp != NULL) {
> > - if ((evvdp->evvd_vq_cidx > evvcp->evvc_vq_size) ||
> > - (evvdp->evvd_vq_pidx > evvcp->evvc_vq_size)) {
> > - rc = EINVAL;
> > - goto fail3;
> > - }
> > - }
> > -
> >   req.emr_cmd = MC_CMD_VIRTIO_INIT_QUEUE;
> >   req.emr_in_buf = payload;
> >   req.emr_in_length = MC_CMD_VIRTIO_INIT_QUEUE_REQ_LEN;
> > @@ -116,15 +108,13 @@
> >
> >   if (req.emr_rc != 0) {
> >   rc = req.emr_rc;
> > - goto fail4;
> > + goto fail3;
> >   }
> >
> >   evvp->evv_vi_index = vi_index;
> >
> >   return (0);
> >
> > -fail4:
> > - EFSYS_PROBE(fail4);
> >   fail3:
> >   EFSYS_PROBE(fail3);
> >   fail2:

RE: [EXT] Re: [PATCH] doc: announce change in crypto adapter queue add

2022-07-11 Thread Akhil Goyal

> On Tue, Jun 28, 2022 at 6:12 PM Akhil Goyal  wrote:
> >
> > > Subject: [PATCH] doc: announce change in crypto adapter queue add
> > >
> > > The function `rte_event_crypto_adapter_queue_pair_add` will accept
> > > `rte_event_crypto_adapter_queue_conf` argument instead of `rte_event`.
> > >
> > > Signed-off-by: Volodymyr Fialko 
> > Acked-by: Akhil Goyal 
> 
> Acked-by: Jerin Jacob 
Hi Abhinandan,

Could you please Ack this deprecation notice.

Regards,
Akhil

DPDK 19.11.3 with multi processes and external physical memory: unable to receive traffic in the secondary processes

2022-07-11 Thread Asaf Sinai

Hi,

We run DPDK 19.11.3 with multi processes.
When using external memory for DPDK (instead of huge pages), we are unable to 
receive traffic in the secondary processes.


  *   We have external physical memory regions (excluded from Linux):

[ULP-NG]# cat /proc/cmdline
console=ttyS0,19200 isolcpus=1-127 smp_affinity=1 root=/dev/ram0 rwmode=r 
net.ifnames=0 biosdevname=0 systemd.show_status=0
memmap=0x100!0x6000 memmap=0x9000!0x8 
memmap=0x9000!0x18 quiet cloud-init=disabled


  *   We make all DPDK initializations in the primary process, including 
mapping the mentioned memory regions via "/dev/mem".

After these memory mapping, we register the external memory, as described in 
http://doc.dpdk.org/guides/prog_guide/env_abstraction_layer.html#support-for-externally-allocated-memory,
 and allocate memory pools:



...

/* Map physical to virtual */

int memFd = open("/dev/mem", O_RDWR);

extMemInfo[i].pageSize   = pPagesInfo->maxPageSize;

extMemInfo[i].memRegSize = 0x8000;

extMemInfo[i].memRegAddr = mmap(NULL, extMemInfo[i].memRegSize, PROT_READ | 
PROT_WRITE,

 MAP_SHARED, memFd, memRegPhysAddr[i]);



/* Create heap */

sprintf(extMemInfo[i].heapName, "extMemHeapSocket_%u", i);

rv = rte_malloc_heap_create(extMemInfo[i].heapName);



/* Save heap socket ID */

rv = rte_malloc_heap_get_socket(extMemInfo[i].heapName);

extMemInfo[i].socketId = rv;



/* Add memory region to heap */

rv = rte_malloc_heap_memory_add(extMemInfo[i].heapName, 
extMemInfo[i].memRegAddr,

extMemInfo[i].memRegSize, NULL, 0, 
extMemInfo[i].pageSize);

...

/* Allocate memory pool */

memPool = rte_mempool_create(poolName, nbufs, DPDK_MBUF_SIZE, poolCacheSize,

 sizeof(struct rte_pktmbuf_pool_private),

 rte_pktmbuf_pool_init, NULL,

 und_pktmbuf_init, NULL, extMemInfo[i].socketId, 
DPDK_NO_FLAGS);

...



Please note, that during calls to "rte_malloc_heap_memory_add", we see the 
following warnings:


EAL: WARNING! Base virtual address hint (0x2101005000 != 0x74627000) not 
respected!
EAL:This may cause issues with mapping memory into secondary processes
EAL: WARNING! Base virtual address hint (0x210100b000 != 0x74619000) not 
respected!
EAL:This may cause issues with mapping memory into secondary processes



And after executing "rte_mempool_create", physical addresses of the allocated 
memory zones are bad:


[Jul 12 12:02:17] [1875] extMemConfig: heapName=extMemHeapSocket_0, 
socketId=256, memRegAddr=0x7fff5800, memRegSize=2415919104, pageSize=2097152
[Jul 12 12:02:18] [1875] memZone: name=MP_ndPool_0, socket_id=256, 
vaddr=0x7fffe7e7bec0-0x7fffe7c0, paddr=0x-0x1840ff, 
len=1589504, hugepage_sz=2MB



  *   In the next step, we spawn the secondary processes from the primary one, 
using fork().

This way, all DPDK data and memory mappings are the same on all processes. For 
example:

Mapping in primary process:
cat /proc/1875/maps
...
7ffec800-7fff5800 rw-s 18 00:06 5
/dev/mem
7fff5800-7fffe800 rw-s 8 00:06 5 
/dev/mem
...

Mapping in secondary process:
cat /proc/2676/maps
...
7ffec800-7fff5800 rw-s 18 00:06 5
/dev/mem
7fff5800-7fffe800 rw-s 8 00:06 5 
/dev/mem
...


  *   Unfortunately, when traffic is received by the secondary processes, we 
see the following printout on the primary process:

i40e_dev_alarm_handler(): ICR0: malicious programming detected


  *   Additionally, we saw no example of using external physical shared memory 
on secondary processes.

DPDK test applications show usage of anonymous private memory on the primary 
process only.

What are we missing?
Can you please advise?

Thanks,
Asaf

[Radware]
Asaf Sinai
ND SW Engineer
Email: asa...@radware.com
T:+972-72-3917050
M:+972-50-6518541
F:+972-3-6488662
[Check out the latest and greatest from 
Radware]

www.radware.com

[Blog]  
[https://www.radware.com/images/signature/linkedin.jpg] 
 
[https://www.radware.com/images/signature/twitter.jpg] 
   [youtube] 

Confidentiality note: This message, and any attachments to it, contains 
privileged/confidential information of RADWARE Ltd./RADWARE Inc. and may not be 
disclosed, used, copied, or transmitted in any form or by any means without 
prior written permission from RADWARE. If you are not the intended recipient, 
delete the message and any attachments from your system without reading or 
copying it, and kindly notify the sender by e-mail. Thank you.
P Please consider your environmental responsibility before printi

96 matches

Mail list logo