Re: [dpdk-dev] [RFC] net: make eCPRI header host network order

2020-11-28 Thread Bing Zhao
Hi Haiyue & Ferruh,

PSB

> -Original Message-
> From: Wang, Haiyue 
> Sent: Saturday, November 28, 2020 1:31 PM
> To: Bing Zhao ; Yigit, Ferruh
> ; Olivier Matz 
> Cc: dev@dpdk.org; Stephen Hemminger 
> Subject: RE: [RFC] net: make eCPRI header host network order
> 
> External email: Use caution opening links or attachments
> 
> 
> Hi Bing,
> 
> > -Original Message-
> > From: Bing Zhao 
> > Sent: Saturday, November 28, 2020 11:18
> > To: Yigit, Ferruh ; Olivier Matz
> > 
> > Cc: dev@dpdk.org; Wang, Haiyue ; Stephen
> > Hemminger 
> > Subject: RE: [RFC] net: make eCPRI header host network order
> >
> > Hi Ferruh & Haiyue,
> > Have you checked other headers? Like:
> > rte_ipv4_hdr
> > rte_ipv6_hdr
> > rte_tcp_hdr
> > ...
> >
> > Also
> >   [ITEM_UDP_SRC] = {
> >   .name = "src",
> >   .help = "UDP source port",
> >   .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED),
> item_param),
> >   .args = ARGS(ARGS_ENTRY_HTON(struct
> rte_flow_item_udp,
> >hdr.src_port)),
> >   },
> >   [ITEM_UDP_DST] = {
> >   .name = "dst",
> >   .help = "UDP destination port",
> >   .next = NEXT(item_udp, NEXT_ENTRY(UNSIGNED),
> item_param),
> >   .args = ARGS(ARGS_ENTRY_HTON(struct
> rte_flow_item_udp,
> >hdr.dst_port)),
> >   },
> >
> > Or did I get sth. wrong?
> >
> 
> The original design is not wrong. ;-)
> 
> Since it is defined in librte_net, people will think it is just
> union for network order quick access like 'struct rte_gre_hdr', but
> in fact the bit field here is something like auxiliary data
> structure, we have to translate the whole 4 byte from network order
> to host order for accessing one bit field member, otherwise it will
> be wrong.

I also checked the definitions in the file " rte_higig.h", and I got your point.
Yes, I agree.
Indeed, in my original RFC (in the link)
http://inbox.dpdk.org/dev/1591717358-194133-1-git-send-email-bi...@mellanox.com/
I used the same definition method as you suggested.

Then during the implementation, since some old compilers gave some warning to 
the uint8_t or uint16_t bit fields, I decided to use uint32_t bit fields to 
make them happy 😊. Then I noticed the common header took the entire 4 bytes 
that could be treated as an u32, so I also moved the sequence of the type and 
size members. And yes, the header usage in the host SW is missed then.
And for an ingress packet, after swapping the u32 in little endian host, the 
correct value of each field could be got, but the offset of each field is wrong 
then.
I personally prefer to Ferruh's method which remaining u32 bit fields. Or else 
some instruction should be added to get rid of the warnings.

I also checked the Linux code, "/usr/include/linux/ip.h"
Like the IP header, one definition is using u8 with bit fields.

In BSD socket file "/usr/include/netinet/ip.h"
It uses "unsigned int" bit fields. Since the following is an u8 and it will be 
aligned naturally.
Maybe we could also use this favor in "/usr/include/netinet/ip.h".

> 
> Like:
> 
> struct rte_ecpri_common_hdr *eh;
> uint8_t pkt[4];
> 
> pkt[0] = 0x10;
> pkt[1] = 0x03;
> pkt[2] = 0x00;
> pkt[3] = 0x18;
> 
> eh = (struct rte_ecpri_common_hdr *)pkt;
> 
> printf("eCPRI: 0x%08x, revision = %u, type = %u size = %u\n",
> eh->u32, eh->revision, eh->type, ntohs(eh->size));
> 
> eCPRI: 0x18000310, revision = 1, type = 0 size = 4099
> 
> But in fact it should be:
> 
> eCPRI: 0x18000310, revision = 1, type = 3 size = 24
> 
> 
> After the enhancement (new revision from RFCv1):
> 
> struct rte_ecpri_common_hdr {
> union {
> rte_be32_t u32; /**< 4B common
> header in BE */
> struct {
> #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> uint16_t c:1;   /**< Concatenation
> Indicator */
> uint16_t res:3; /**< Reserved */
> uint16_t revision:4;/**< Protocol
> Revision */
> uint16_t type:8;/**< Message Type */
> #elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> uint16_t revision:4;/**< Protocol
> Revision */
> uint16_t res:3; /**< Reserved */
> uint16_t c:1;   /**< Concatenation
> Indicator */
> uint16_t type:8;/**< Message Type */
> #endif
> rte_be16_t size;/**< Payload Size */
> };
> };
> };
> 

So Ferruh, would you also please move the
+   uint32_t type:8;
out of the "#if" macro?
And since the size field should be in big-endian.
How about to use rte_be32_t to indicate this?

Regarding the commit message:
"
Other protocol structs are in the host b

[dpdk-dev] [PATCH] net/virtio-user: fix error run close(0)

2020-11-28 Thread Jiawei Zhu
From: Jiawei Zhu 

When i < VIRTIO_MAX_VIRTQUEUES and j == i,
dev->callfds[i] and dev->kickfds[i] are default 0.
So it will close(0), close the standard input (stdin).

Fixes: e6e7ad8b3024 ("net/virtio-user: move eventfd open/close into 
init/uninit")
Cc: sta...@dpdk.org

Signed-off-by: Jiawei Zhu 
---
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 053f026..1bfd223 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -284,7 +284,7 @@ int virtio_user_stop_device(struct virtio_user_dev *dev)
}
 
if (i < VIRTIO_MAX_VIRTQUEUES) {
-   for (j = 0; j <= i; ++j) {
+   for (j = 0; j < i; ++j) {
close(dev->callfds[j]);
close(dev->kickfds[j]);
}
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH 1/3] MAINTAINERS remove experimental tag from vdev_netvsc

2020-11-28 Thread Stephen Hemminger
On Tue, 06 Oct 2020 00:41:55 +0200
Thomas Monjalon  wrote:

> 15/09/2020 16:47, Stephen Hemminger:
> > Ferruh Yigit  wrote:  
> > > On 9/15/2020 3:03 AM, Stephen Hemminger wrote:  
> > > > Vdev_netvsc has been around for several years. It no longer needs
> > > > to be marked experimental.
> > > > 
> > > > Signed-off-by: Stephen Hemminger 
> > > > ---
> > > > -Microsoft vdev_netvsc - EXPERIMENTAL
> > > > +Microsoft vdev_netvsc
> > > >  M: Matan Azrad 
> > > >  F: drivers/net/vdev_netvsc/
> > > >  F: doc/guides/nics/vdev_netvsc.rst  
> > > 
> > > As far as I remember 'vdev_netvsc' was interim solution until 'netvsc'
> > > was ready. In this patchset 'netvsc' is also becoming mature.
> > > 
> > > Wouldn't be easier to keep 'vdev_netvsc' experimental to be able to
> > > remove it soon?  
> > 
> > Let me discuss with Long Li and management.
> > Maybe replace EXPERIMENTAL with DEPRECATED in 20.11.  
> 
> Would be strange to switch from experimental to deprecated :)
> 
> +Cc Matan
> 
> I think you still need this platform driver (with failsafe and tap)
> in case you need rte_flow. Or is it well supported with netvsc PMD?

This needs more discussion. Netvsc PMD does not support rte_flow because
there is not a good/complete implementation of rte_flow library in pure
software form. It might be possible using the BPF stuff that tap supports.

I do not know of anybody who is using vdev_netvsc/failsafe with rte_flow.
Part of the problem is that the TAP implementation of rte_flow supports
a much smaller subset of features than the VF device (MLX).



Re: [dpdk-dev] [PATCH] eal/windows: vfprintf build warning with clang

2020-11-28 Thread Dmitry Kozlyuk
On Fri, 27 Nov 2020 12:07:11 +, Nick Connolly wrote:
> [...]
> +#ifdef __clang__
> +#pragma clang diagnostic push
> +#pragma clang diagnostic ignored "-Wformat-nonliteral"
> +#endif
> [...]

How about a more safe approach?

diff --git a/lib/librte_eal/windows/eal_lcore.c 
b/lib/librte_eal/windows/eal_lcore.c
index d5ff721e0..ebcd3474e 100644
--- a/lib/librte_eal/windows/eal_lcore.c
+++ b/lib/librte_eal/windows/eal_lcore.c
@@ -37,6 +37,7 @@ struct cpu_map {
 static struct cpu_map cpu_map = { 0 };
 
 /* eal_create_cpu_map() is called before logging is initialized */
+__rte_format_printf(1, 2)
 static void
 log_early(const char *format, ...)
 {


Re: [dpdk-dev] [PATCH] eal/windows: vfprintf build warning with clang

2020-11-28 Thread Nick Connolly

Looks good to me!

On 28/11/2020 21:11, Dmitry Kozlyuk wrote:

On Fri, 27 Nov 2020 12:07:11 +, Nick Connolly wrote:

[...]
+#ifdef __clang__
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wformat-nonliteral"
+#endif
[...]

How about a more safe approach?

diff --git a/lib/librte_eal/windows/eal_lcore.c 
b/lib/librte_eal/windows/eal_lcore.c
index d5ff721e0..ebcd3474e 100644
--- a/lib/librte_eal/windows/eal_lcore.c
+++ b/lib/librte_eal/windows/eal_lcore.c
@@ -37,6 +37,7 @@ struct cpu_map {
  static struct cpu_map cpu_map = { 0 };
  
  /* eal_create_cpu_map() is called before logging is initialized */

+__rte_format_printf(1, 2)
  static void
  log_early(const char *format, ...)
  {




Re: [dpdk-dev] [PATCH v2] app/testpmd: fix testpmd flows left before port stop.

2020-11-28 Thread Gregory Etelson
Hello Andrew,

> On 11/26/20 7:43 PM, Gregory Etelson wrote:
> > According to RTE flow user guide, PMD will not keep flow rules after
> > port stop. Application resources that refer to flow rules become
> > obsolete after port stop and must not be used.
> > Testpmd maintains linked list of active flows for each port. Entries
> > in that list are allocated dynamically and must be explicitly released
> > to prevent memory leak.
> > The patch releases testpmd port flow_list that holds remaining flows
> > before port is stopped.
> >
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Gregory Etelson 
> > ---
> >  app/test-pmd/testpmd.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> > 33fc0fddf5..0bb192b2f5 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -2806,6 +2806,9 @@ stop_port(portid_t pid)
> >   }
> >   }
> >
> > + if (port->flow_list)
> > + port_flow_flush(pi);
> > +
> >   if (rte_eth_dev_stop(pi) != 0)
> >   RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for
> port %u\n",
> >   pi);
> >
> 
> port_flow_flush() does rte_flow_flush() which is not strictly required.
> Description sounds like we should cleanup testpmd lists only.

You are right, call to rte_flow_flush() is not required, if testpmd calls 
port_flow_flush()
as part of port stop sequence, because PMD will remove flows in that scenario.
port_flow_flush() has a general implementation. It destroys all flows without
port state consideration - current or future. In this form, port_flow_flush() 
can be
called from any testpmd scenario that needs flows destruction.

Regards,
Gregory