Re: [dpdk-dev] [PATCH v3] net/pcap: rx_iface_in stream type support

2018-07-01 Thread Ido Goshen


> -----Original Message-----
> From: Ferruh Yigit 
> Sent: Wednesday, June 27, 2018 4:59 PM
> To: Ido Goshen ; Bruce Richardson
> ; John McNamara
> ; Marko Kovacevic
> 
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3] net/pcap: rx_iface_in stream type support
> 
> On 6/27/2018 1:04 PM, ido goshen wrote:
> > From: ido g 
> >
> > Support rx of in-direction packets only.
> > Useful for apps that also tx to eth_pcap ports, in order to not see
> > them echoed back in as rx when the out direction is also captured.
> 
> Can you please add your command, which was in previous mails, on how to
> reproduce the issue of capturing transferred packets in the Rx path; for future.

[idog] I think one can just use the new doc example below (the one w/o the _in 
option) but I can add it in the commit log too...
> 
> And overall looks good, there are a few syntax comments below.
> 
> >
> > Signed-off-by: ido g 
> > ---
> > v3:
> > * merge to updated dpdk-next-net code
> > * pcap_ring doc update
> >
> > v2:
> > * clean checkpatch warning
> >
> >  doc/guides/nics/pcap_ring.rst   | 25 ++-
> >  drivers/net/pcap/rte_eth_pcap.c | 45
> > ++---
> >  2 files changed, 66 insertions(+), 4 deletions(-)
> >
> > diff --git a/doc/guides/nics/pcap_ring.rst
> > b/doc/guides/nics/pcap_ring.rst index 7fd063c..6282be6 100644
> > --- a/doc/guides/nics/pcap_ring.rst
> > +++ b/doc/guides/nics/pcap_ring.rst
> > @@ -71,11 +71,19 @@ The different stream types are:
> >  tx_pcap=/path/to/file.pcap
> >
> >  *   rx_iface: Defines a reception stream based on a network interface name.
> > -The driver reads packets coming from the given interface using the 
> > Linux
> kernel driver for that interface.
> > +The driver reads packets from the given interface using the Linux 
> > kernel
> driver for that interface.
> > +The driver captures both the incoming and outgoing packets on that
> interface.
> 
> This is only true if the tx_iface parameter is given for that interface,
> right? It can be good to clarify this, to not confuse people. I am for keeping
> the first sentence, and adding a note about this special case, something like
> (feel free to update):
> 

[idog] No, this is true regardless of what the other params are.
i.e.
In case rx_iface is given, the dpdk app will see not only packets coming into
that iface (e.g. echo requests) but also what Linux apps are sending out of
that iface (e.g. echo replies).
In case rx_iface_in is given, it will see only incoming traffic (only the echo
requests).
Giving tx_iface with the same iface just exposes that behavior and makes it
worse, because it will also capture back what the dpdk app is sending to that
iface, not only what Linux sends.
Therefore I think the documentation is correct.

> "
> The driver reads packets coming from the given interface using the Linux
> kernel driver for that interface.
> When the tx_iface argument is given for the same interface, Tx packets are also captured.
> "
> 
> >  The value is an interface name.
> >
> >  rx_iface=eth0
> >
> > +*   rx_iface_in: Defines a reception stream based on a network interface
> name.
> > +The driver reads packets from the given interface using the Linux 
> > kernel
> driver for that interface.
> > +The driver captures only the incoming packets on that interface.
> 
> Again I am for keeping "... reads packets coming from the given interface ..."
> and clarify the difference in next sentences specific to tx_iface usage.
> 
> > +The value is an interface name.
> > +
> > +rx_iface_in=eth0
> > +
> >  *   tx_iface: Defines a transmission stream based on a network interface
> name.
> >  The driver sends packets to the given interface using the Linux kernel
> driver for that interface.
> >  The value is an interface name.
> > @@ -122,6 +130,21 @@ Forward packets through two network interfaces:
> >  $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
> >  --vdev 'net_pcap0,iface=eth0' --vdev='net_pcap1;iface=eth1'
> >
> > +Enable 2 tx queues on a network interface:
> > +
> > +.. code-block:: console
> > +
> > +$RTE_TARGET/app/testpmd -l 0-3 -n 4 \
> > +--vdev 'net_pcap0,rx_iface=eth1,tx_iface=eth1,tx_iface=eth1' \
> > +-- --txq 2
> > +
> > +Read only incoming packets from a network interface:
> 
> This title is confusing, the sample is not only for "read only incoming
> packets", it does Tx also :). I understand what you mean, but I believe it
> would be better to clarify this.

[idog] Would this make it clearer?
"Read only incoming packets from a network interface and write them back to 
that network interface:"

> 
> > +
> > +.. code-block:: console
> > +
> > +$RTE_TARGET/app/testpmd -l 0-3 -n 4 \
> > +--vdev 'net_pcap0,rx_iface_in=eth1,tx_iface=eth1'
> > +
> >  Using libpcap-based PMD with the testpmd Application
> > 
> >
> > diff --git a/drivers/net/pcap/rte_eth_pcap.c
> > b/drivers/net/pcap/rte_eth_pcap.c index b21930b

Re: [dpdk-dev] [PATCH V4 8/9] app/testpmd: show example to handle hot unplug

2018-07-01 Thread Matan Azrad
Hi Jeff

A good advance, thank you, but as I said in the previous version, this patch
introduces a bug and the next one fixes it.
Patch 9 should come before patch 8, while this patch just adds 1 more option
for EAL hotplug.

Please see 1 more comment below.

From: Jeff Guo
> Use testpmd as an example to show how an application smoothly handles failure
> when a device is being hot-unplugged. If the app has enabled the device event
> monitor and registered the hot plug event's callback before running, the
> callback will be called once the app detects the removal event. It will first
> stop the packet forwarding, then stop the port, close the port, and finally
> detach the port to remove the device from the device lists.
> 
> Signed-off-by: Jeff Guo 
> ---
> v3->v4:
> remove some unused code
> ---
>  app/test-pmd/testpmd.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> 24c1998..42ed196 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -2196,6 +2196,9 @@ static void
>  eth_dev_event_callback(char *device_name, enum rte_dev_event_type type,
>__rte_unused void *arg)
>  {
> + uint16_t port_id;
> + int ret;
> +
>   if (type >= RTE_DEV_EVENT_MAX) {
>   fprintf(stderr, "%s called upon invalid event %d\n",
>   __func__, type);
> @@ -2206,9 +2209,12 @@ eth_dev_event_callback(char *device_name, enum
> rte_dev_event_type type,
>   case RTE_DEV_EVENT_REMOVE:
>   RTE_LOG(ERR, EAL, "The device: %s has been removed!\n",
>   device_name);
> - /* TODO: After finish failure handle, begin to stop
> -  * packet forward, stop port, close port, detach port.
> -  */
> + ret = rte_eth_dev_get_port_by_name(device_name, &port_id);

As you probably know, one rte_device may be associated with more than one
ethdev port, so the ethdev port name can be different from the rte_device name.
Looks like we need a new ethdev API to get all the ports associated with one
rte_device.

> + if (ret) {
> + printf("can not get port by device %s!\n",
> device_name);
> + return;
> + }
> + rmv_event_callback((void *)(intptr_t)port_id);
>   break;
>   case RTE_DEV_EVENT_ADD:
>   RTE_LOG(ERR, EAL, "The device: %s has been added!\n",
> @@ -2736,7 +2742,6 @@ main(int argc, char** argv)
>   return -1;
>   }
>   eth_dev_event_callback_register();
> -
>   }
> 
>   if (start_port(RTE_PORT_ALL) != 0)
> --
> 2.7.4



Re: [dpdk-dev] [PATCH v5 03/15] vhost: vring address setup for packed queues

2018-07-01 Thread Maxime Coquelin




On 06/29/2018 05:59 PM, Tiwei Bie wrote:

@@ -888,7 +914,8 @@ vhost_user_set_mem_table(struct virtio_net **pdev, struct 
VhostUserMsg *pmsg)
  static int
  vq_is_ready(struct vhost_virtqueue *vq)
  {
-   return vq && vq->desc && vq->avail && vq->used &&
+   return vq &&
+  (vq->desc_packed || (vq->desc && vq->avail && vq->used)) &&
   vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
   vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;


It seems that the check is wrong here as desc_packed and desc are in a
union. We may have to check whether packed ring has been negotiated.


[dpdk-dev] [PATCH v4] net/pcap: rx_iface_in stream type support

2018-07-01 Thread ido goshen
From: ido g 

Support rx of in-direction packets only.
Useful for apps that also tx to eth_pcap ports, in order to not see them
echoed back in as rx when the out direction is also captured.

Example:
In case of using rx_iface and sending a *single* packet to eth1,
it will loop forever: when it is sent to tx_iface=eth1,
it will be captured again on rx_iface=eth1, and so on
  $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
--vdev 'net_pcap0,rx_iface=eth1,tx_iface=eth1'
  …
  -- Forward statistics for port 0  
  RX-packets: 758RX-dropped: 0 RX-total: 758
  TX-packets: 758TX-dropped: 0 TX-total: 758
  --
While if using rx_iface_in, it will not be captured on the way out and
will be forwarded only once
  $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
--vdev 'net_pcap0,rx_iface_in=eth1,tx_iface=eth1'
  …
  -- Forward statistics for port 0  
  RX-packets: 1  RX-dropped: 0 RX-total: 1
  TX-packets: 1  TX-dropped: 0 TX-total: 1
  --

Signed-off-by: ido g 
---
v4:
* fix order of rx_iface and rx_iface_in mix
* reword pcap_ring doc example
* cosmetic code alignments
* add example commands in commit log

v3:
* merge to updated dpdk-next-net code
* pcap_ring doc update

v2:
* clean checkpatch warning

 doc/guides/nics/pcap_ring.rst   | 25 +++-
 drivers/net/pcap/rte_eth_pcap.c | 51 +
 2 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
index 7fd063c..879e543 100644
--- a/doc/guides/nics/pcap_ring.rst
+++ b/doc/guides/nics/pcap_ring.rst
@@ -71,11 +71,19 @@ The different stream types are:
 tx_pcap=/path/to/file.pcap
 
 *   rx_iface: Defines a reception stream based on a network interface name.
-The driver reads packets coming from the given interface using the Linux 
kernel driver for that interface.
+The driver reads packets from the given interface using the Linux kernel 
driver for that interface.
+The driver captures both the incoming and outgoing packets on that 
interface.
 The value is an interface name.
 
 rx_iface=eth0
 
+*   rx_iface_in: Defines a reception stream based on a network interface name.
+The driver reads packets from the given interface using the Linux kernel 
driver for that interface.
+The driver captures only the incoming packets on that interface.
+The value is an interface name.
+
+rx_iface_in=eth0
+
 *   tx_iface: Defines a transmission stream based on a network interface name.
 The driver sends packets to the given interface using the Linux kernel 
driver for that interface.
 The value is an interface name.
@@ -122,6 +130,21 @@ Forward packets through two network interfaces:
 $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
 --vdev 'net_pcap0,iface=eth0' --vdev='net_pcap1;iface=eth1'
 
+Enable 2 tx queues on a network interface:
+
+.. code-block:: console
+
+$RTE_TARGET/app/testpmd -l 0-3 -n 4 \
+--vdev 'net_pcap0,rx_iface=eth1,tx_iface=eth1,tx_iface=eth1' \
+-- --txq 2
+
+Read only incoming packets from a network interface and write them back to the 
same network interface:
+
+.. code-block:: console
+
+$RTE_TARGET/app/testpmd -l 0-3 -n 4 \
+--vdev 'net_pcap0,rx_iface_in=eth1,tx_iface=eth1'
+
 Using libpcap-based PMD with the testpmd Application
 
 
diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index b21930b..0a89b24 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -26,6 +26,7 @@
 #define ETH_PCAP_RX_PCAP_ARG  "rx_pcap"
 #define ETH_PCAP_TX_PCAP_ARG  "tx_pcap"
 #define ETH_PCAP_RX_IFACE_ARG "rx_iface"
+#define ETH_PCAP_RX_IFACE_IN_ARG "rx_iface_in"
 #define ETH_PCAP_TX_IFACE_ARG "tx_iface"
 #define ETH_PCAP_IFACE_ARG"iface"
 
@@ -83,6 +84,7 @@ struct pmd_devargs {
ETH_PCAP_RX_PCAP_ARG,
ETH_PCAP_TX_PCAP_ARG,
ETH_PCAP_RX_IFACE_ARG,
+   ETH_PCAP_RX_IFACE_IN_ARG,
ETH_PCAP_TX_IFACE_ARG,
ETH_PCAP_IFACE_ARG,
NULL
@@ -739,6 +741,21 @@ struct pmd_devargs {
 }
 
 static inline int
+set_iface_direction(const char *iface, pcap_t *pcap,
+   pcap_direction_t direction)
+{
+   const char *direction_str = (direction == PCAP_D_IN) ? "IN" : "OUT";
+   if (pcap_setdirection(pcap, direction) < 0) {
+   PMD_LOG(ERR, "Setting %s pcap direction %s failed - %s\n",
+   iface, direction_str, pcap_geterr(pcap));
+   return -1;
+   }
+   PMD_LOG(INFO, "Setting %s pcap direction %s\n",
+   iface, direction_str);
+   return 0;
+}
+
+static inline int
 open_

[dpdk-dev] 17.05 --> 17.11, minimum hash table key size

2018-07-01 Thread Bly, Mike
Hello,

We are in the process of migrating our design from DPDK 17.05 to 17.11 and we
ran into a small problem. Within our design, we have some hash tables with
4-byte keys. While going through the changes done in 17.11, we found there was
an added key_size check, which now requires key_size >= 8 bytes (see
check_params_create() in rte_table_hash_ext.c). We are not seeing any other
options, so I was hoping someone could advise on how to support a 4-byte hash
key size in 17.11 and on a go-forward basis.

Regards,
Mike


Re: [dpdk-dev] [PATCH] net/mlx4: refinements to Rx packet type report

2018-07-01 Thread Shahaf Shuler
Thursday, June 28, 2018 3:40 PM, Adrien Mazarguil:
> Subject: Re: [dpdk-dev] [PATCH] net/mlx4: refinements to Rx packet type
> report
> 
> On Thu, Jun 28, 2018 at 09:30:28AM +0300, Moti Haimovsky wrote:
> > This commit refines the Rx Packet type flags reported by the PMD for
> > each packet being received in order to make the report more accurate.
> >
> > Signed-off-by: Moti Haimovsky 
> 
> Patch looks good, thanks.
> 
> Acked-by: Adrien Mazarguil 

Applied to next-net-mlx, thanks. 

> 
> --
> Adrien Mazarguil
> 6WIND


[dpdk-dev] [PATCH] net/thunderx: add support for Rx VLAN offload

2018-07-01 Thread Pavan Nikhilesh
From: "Kudurumalla, Rakesh" 

This feature is used to offload stripping of the vlan header from received
packets and to update the vlan_tci field in the mbuf when the
DEV_RX_OFFLOAD_VLAN_STRIP & ETH_VLAN_STRIP_MASK flags are set.

Signed-off-by: Rakesh Kudurumalla 
Signed-off-by: Pavan Nikhilesh 
---
 drivers/net/thunderx/base/nicvf_hw.c |  1 +
 drivers/net/thunderx/nicvf_ethdev.c  | 59 +--
 drivers/net/thunderx/nicvf_rxtx.c| 70 
 drivers/net/thunderx/nicvf_rxtx.h| 15 --
 drivers/net/thunderx/nicvf_struct.h  |  1 +
 5 files changed, 119 insertions(+), 27 deletions(-)

diff --git a/drivers/net/thunderx/base/nicvf_hw.c 
b/drivers/net/thunderx/base/nicvf_hw.c
index b07a2937d..5b1abe201 100644
--- a/drivers/net/thunderx/base/nicvf_hw.c
+++ b/drivers/net/thunderx/base/nicvf_hw.c
@@ -699,6 +699,7 @@ nicvf_vlan_hw_strip(struct nicvf *nic, bool enable)
else
val &= ~((STRIP_SECOND_VLAN | STRIP_FIRST_VLAN) << 25);
 
+   nic->vlan_strip = enable;
nicvf_reg_write(nic, NIC_VNIC_RQ_GEN_CFG, val);
 }
 
diff --git a/drivers/net/thunderx/nicvf_ethdev.c 
b/drivers/net/thunderx/nicvf_ethdev.c
index 76fed9f99..4f58b2e33 100644
--- a/drivers/net/thunderx/nicvf_ethdev.c
+++ b/drivers/net/thunderx/nicvf_ethdev.c
@@ -52,6 +52,8 @@ static void nicvf_dev_stop(struct rte_eth_dev *dev);
 static void nicvf_dev_stop_cleanup(struct rte_eth_dev *dev, bool cleanup);
 static void nicvf_vf_stop(struct rte_eth_dev *dev, struct nicvf *nic,
  bool cleanup);
+static int nicvf_vlan_offload_config(struct rte_eth_dev *dev, int mask);
+static int nicvf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 
 RTE_INIT(nicvf_init_log);
 static void
@@ -357,11 +359,9 @@ nicvf_dev_supported_ptypes_get(struct rte_eth_dev *dev)
}
 
memcpy((char *)ptypes + copied, &ptypes_end, sizeof(ptypes_end));
-   if (dev->rx_pkt_burst == nicvf_recv_pkts ||
-   dev->rx_pkt_burst == nicvf_recv_pkts_multiseg)
-   return ptypes;
 
-   return NULL;
+   /* All Ptypes are supported in all Rx functions. */
+   return ptypes;
 }
 
 static void
@@ -918,13 +918,18 @@ nicvf_set_tx_function(struct rte_eth_dev *dev)
 static void
 nicvf_set_rx_function(struct rte_eth_dev *dev)
 {
-   if (dev->data->scattered_rx) {
-   PMD_DRV_LOG(DEBUG, "Using multi-segment rx callback");
-   dev->rx_pkt_burst = nicvf_recv_pkts_multiseg;
-   } else {
-   PMD_DRV_LOG(DEBUG, "Using single-segment rx callback");
-   dev->rx_pkt_burst = nicvf_recv_pkts;
-   }
+   struct nicvf *nic = nicvf_pmd_priv(dev);
+
+   const eth_rx_burst_t rx_burst_func[2][2] = {
+   /* [NORMAL/SCATTER] [VLAN_STRIP/NO_VLAN_STRIP] */
+   [0][0] = nicvf_recv_pkts_no_offload,
+   [0][1] = nicvf_recv_pkts_vlan_strip,
+   [1][0] = nicvf_recv_pkts_multiseg_no_offload,
+   [1][1] = nicvf_recv_pkts_multiseg_vlan_strip,
+   };
+
+   dev->rx_pkt_burst =
+   rx_burst_func[dev->data->scattered_rx][nic->vlan_strip];
 }
 
 static int
@@ -1469,7 +1474,7 @@ nicvf_vf_start(struct rte_eth_dev *dev, struct nicvf 
*nic, uint32_t rbdrsz)
struct rte_mbuf *mbuf;
uint16_t rx_start, rx_end;
uint16_t tx_start, tx_end;
-   bool vlan_strip;
+   int mask;
 
PMD_INIT_FUNC_TRACE();
 
@@ -1590,9 +1595,9 @@ nicvf_vf_start(struct rte_eth_dev *dev, struct nicvf 
*nic, uint32_t rbdrsz)
 nic->rbdr->tail, nb_rbdr_desc, nic->vf_id);
 
/* Configure VLAN Strip */
-   vlan_strip = !!(dev->data->dev_conf.rxmode.offloads &
-   DEV_RX_OFFLOAD_VLAN_STRIP);
-   nicvf_vlan_hw_strip(nic, vlan_strip);
+   mask = ETH_VLAN_STRIP_MASK | ETH_VLAN_FILTER_MASK |
+   ETH_VLAN_EXTEND_MASK;
+   ret = nicvf_vlan_offload_config(dev, mask);
 
/* Based on the packet type(IPv4 or IPv6), the nicvf HW aligns L3 data
 * to the 64bit memory address.
@@ -1983,6 +1988,7 @@ static const struct eth_dev_ops nicvf_eth_dev_ops = {
.dev_infos_get= nicvf_dev_info_get,
.dev_supported_ptypes_get = nicvf_dev_supported_ptypes_get,
.mtu_set  = nicvf_dev_set_mtu,
+   .vlan_offload_set = nicvf_vlan_offload_set,
.reta_update  = nicvf_dev_reta_update,
.reta_query   = nicvf_dev_reta_query,
.rss_hash_update  = nicvf_dev_rss_hash_update,
@@ -1999,6 +2005,29 @@ static const struct eth_dev_ops nicvf_eth_dev_ops = {
.get_reg  = nicvf_dev_get_regs,
 };
 
+static int
+nicvf_vlan_offload_config(struct rte_eth_dev *dev, int mask)
+{
+   struct rte_eth_rxmode *rxmode;
+   struct nicvf *nic = nicvf_pmd_priv(dev);
+   rxmode = &dev->data->dev_conf.rxmode;
+   if (mask & ETH_VLAN_STRIP_MASK) {
+   if (rxmode->offloads & DEV_RX_OFFLOAD_VLAN_

Re: [dpdk-dev] [PATCH v6] net/fm10k: add support for check descriptor status APIs

2018-07-01 Thread Zhao1, Wei
Hi, Ferruh

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, June 29, 2018 7:04 PM
> To: Zhao1, Wei ; dev@dpdk.org
> Cc: Zhang, Qi Z 
> Subject: Re: [dpdk-dev] [PATCH v6] net/fm10k: add support for check
> descriptor status APIs
> 
> On 6/29/2018 2:48 AM, Wei Zhao wrote:
> > rte_eth_rx_descritpr_status and rte_eth_tx_descriptor_status are
> > supported by fm10K.
> >
> > Signed-off-by: Wei Zhao 
> >
> > ---
> >
> > v2:
> > -fix DD check error in tx descriptor
> >
> > v3:
> > -fix DD check index error
> >
> > v4:
> > -fix error in RS bit list poll
> >
> > v5:
> > -rebase code to branch and delete useless variable
> >
> > v6:
> > -change release note
> > ---
> >  doc/guides/rel_notes/release_18_08.rst |  6 +++
> >  drivers/net/fm10k/fm10k.h  |  7 +++
> >  drivers/net/fm10k/fm10k_ethdev.c   |  2 +
> >  drivers/net/fm10k/fm10k_rxtx.c | 78
> ++
> 
> Can you please update fm10k*.ini files to announce newly added "Rx
> descriptor status" & "Tx descriptor status" features?

Ok, thank you. I will commit new patch.


[dpdk-dev] [PATCH v4 2/5] eventdev: improve err handling for Rx adapter queue add/del

2018-07-01 Thread Nikhil Rao
The new WRR sequence applicable after queue add/del is set
up after setting the new queue state, so a memory allocation
failure will leave behind an incorrect state.

This change separates the memory sizing + allocation for the
Rx poll and WRR array from calculation of the WRR sequence.
If there is a memory allocation failure, existing Rx queue
configuration remains unchanged.

Signed-off-by: Nikhil Rao 
---
 lib/librte_eventdev/rte_event_eth_rx_adapter.c | 418 ++---
 1 file changed, 302 insertions(+), 116 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c 
b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index 9361d48..926f83a 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -109,10 +109,16 @@ struct eth_device_info {
 * rx_adapter_stop callback needs to be invoked
 */
uint8_t dev_rx_started;
-   /* If nb_dev_queues > 0, the start callback will
+   /* Number of queues added for this device */
+   uint16_t nb_dev_queues;
+   /* If nb_rx_poll > 0, the start callback will
 * be invoked if not already invoked
 */
-   uint16_t nb_dev_queues;
+   uint16_t nb_rx_poll;
+   /* sum(wrr(q)) for all queues within the device
+* useful when deleting all device queues
+*/
+   uint32_t wrr_len;
 };
 
 /* Per Rx queue */
@@ -188,13 +194,170 @@ static uint16_t rxa_gcd_u16(uint16_t a, uint16_t b)
}
 }
 
-/* Precalculate WRR polling sequence for all queues in rx_adapter */
+static inline int
+rxa_polled_queue(struct eth_device_info *dev_info,
+   int rx_queue_id)
+{
+   struct eth_rx_queue_info *queue_info;
+
+   queue_info = &dev_info->rx_queue[rx_queue_id];
+   return !dev_info->internal_event_port &&
+   dev_info->rx_queue &&
+   queue_info->queue_enabled && queue_info->wt != 0;
+}
+
+/* Calculate size of the eth_rx_poll and wrr_sched arrays
+ * after deleting poll mode rx queues
+ */
+static void
+rxa_calc_nb_post_poll_del(struct rte_event_eth_rx_adapter *rx_adapter,
+   struct eth_device_info *dev_info,
+   int rx_queue_id,
+   uint32_t *nb_rx_poll,
+   uint32_t *nb_wrr)
+{
+   uint32_t poll_diff;
+   uint32_t wrr_len_diff;
+
+   if (rx_queue_id == -1) {
+   poll_diff = dev_info->nb_rx_poll;
+   wrr_len_diff = dev_info->wrr_len;
+   } else {
+   poll_diff = rxa_polled_queue(dev_info, rx_queue_id);
+   wrr_len_diff = poll_diff ? dev_info->rx_queue[rx_queue_id].wt :
+   0;
+   }
+
+   *nb_rx_poll = rx_adapter->num_rx_polled - poll_diff;
+   *nb_wrr = rx_adapter->wrr_len - wrr_len_diff;
+}
+
+/* Calculate nb_rx_* after adding poll mode rx queues
+ */
+static void
+rxa_calc_nb_post_add_poll(struct rte_event_eth_rx_adapter *rx_adapter,
+   struct eth_device_info *dev_info,
+   int rx_queue_id,
+   uint16_t wt,
+   uint32_t *nb_rx_poll,
+   uint32_t *nb_wrr)
+{
+   uint32_t poll_diff;
+   uint32_t wrr_len_diff;
+
+   if (rx_queue_id == -1) {
+   poll_diff = dev_info->dev->data->nb_rx_queues -
+   dev_info->nb_rx_poll;
+   wrr_len_diff = wt*dev_info->dev->data->nb_rx_queues
+   - dev_info->wrr_len;
+   } else {
+   poll_diff = !rxa_polled_queue(dev_info, rx_queue_id);
+   wrr_len_diff = rxa_polled_queue(dev_info, rx_queue_id) ?
+   wt - dev_info->rx_queue[rx_queue_id].wt :
+   wt;
+   }
+
+   *nb_rx_poll = rx_adapter->num_rx_polled + poll_diff;
+   *nb_wrr = rx_adapter->wrr_len + wrr_len_diff;
+}
+
+/* Calculate nb_rx_* after adding rx_queue_id */
+static void
+rxa_calc_nb_post_add(struct rte_event_eth_rx_adapter *rx_adapter,
+   struct eth_device_info *dev_info,
+   int rx_queue_id,
+   uint16_t wt,
+   uint32_t *nb_rx_poll,
+   uint32_t *nb_wrr)
+{
+   rxa_calc_nb_post_add_poll(rx_adapter, dev_info, rx_queue_id,
+   wt, nb_rx_poll, nb_wrr);
+}
+
+/* Calculate nb_rx_* after deleting rx_queue_id */
+static void
+rxa_calc_nb_post_del(struct rte_event_eth_rx_adapter *rx_adapter,
+   struct eth_device_info *dev_info,
+   int rx_queue_id,
+   uint32_t *nb_rx_poll,
+   uint32_t *nb_wrr)
+{
+   rxa_calc_nb_post_poll_del(rx_adapter, dev_info, rx_queue_id, nb_rx_poll,
+   nb_wrr);
+}
+
+/*
+ * Allocate the rx_poll array
+ */
+static struct eth_rx_poll_entry *
+rxa_alloc_poll(struct rte_event_eth_rx_adapter *rx_adapter,
+   uint32_t nu

[dpdk-dev] [PATCH v4 0/5] eventdev: add interrupt driven queues to Rx adapter

2018-07-01 Thread Nikhil Rao
This patch series adds support for interrupt driven queues in the
ethernet Rx adapter. The first 3 patches prepare the code to
handle both poll and interrupt driven Rx queues, the 4th patch
has code changes specific to interrupt driven queues, and
the final patch has test code.

Changelog:
v3->v4:

* Fix FreeBSD build breakage.

v2->v3:

* Fix shared build breakage.

* Fix FreeBSD build breakage.

* Reduce epoll maxevents parameter by 1, since thread wakeup
  uses pthread_cancel as opposed to an exit message through a
  file monitored by epoll_wait().

* Check intr_handle before access, it is NULL when zero Rx queue
  interrupts are configured.

* Remove thread_stop flag; in the event of a pthread_cancel, it is
  not possible to check this flag, as the thread stack is unwound without
  returning to rxa_intr_thread.

v1->v2:

* Move rte_service_component_runstate_set such that it
  is called only when cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT
  is false. (Jerin Jacob)

* Fix meson build. (Jerin Jacob)

* Replace calls to pthread_* with rte_ctrl_thread_create().
  (Jerin Jacob)

* Move adapter test code to separate patch. (Jerin Jacob)

Note: I haven't removed the note about devices created after
rte_event_eth_rx_adapter_create; will fix in a separate patch.


Nikhil Rao (5):
  eventdev: standardize Rx adapter internal function names
  eventdev: improve err handling for Rx adapter queue add/del
  eventdev: move Rx adapter eth Rx to separate function
  eventdev: add interrupt driven queues to Rx adapter
  eventdev: add Rx adapter tests for interrupt driven queues

 config/rte_config.h|1 +
 lib/librte_eventdev/rte_event_eth_rx_adapter.h |5 +-
 lib/librte_eventdev/rte_event_eth_rx_adapter.c | 1526 +---
 test/test/test_event_eth_rx_adapter.c  |  261 +++-
 .../prog_guide/event_ethernet_rx_adapter.rst   |   24 +
 config/common_base |1 +
 lib/librte_eventdev/Makefile   |9 +-
 7 files changed, 1588 insertions(+), 239 deletions(-)

-- 
1.8.3.1



[dpdk-dev] [PATCH v4 5/5] eventdev: add Rx adapter tests for interrupt driven queues

2018-07-01 Thread Nikhil Rao
Add tests for queue add and delete; the add/delete calls
also switch queues between poll and interrupt mode.

Signed-off-by: Nikhil Rao 
---
 test/test/test_event_eth_rx_adapter.c | 261 +++---
 1 file changed, 242 insertions(+), 19 deletions(-)

diff --git a/test/test/test_event_eth_rx_adapter.c 
b/test/test/test_event_eth_rx_adapter.c
index d432731..2337e54 100644
--- a/test/test/test_event_eth_rx_adapter.c
+++ b/test/test/test_event_eth_rx_adapter.c
@@ -25,28 +25,17 @@ struct event_eth_rx_adapter_test_params {
struct rte_mempool *mp;
uint16_t rx_rings, tx_rings;
uint32_t caps;
+   int rx_intr_port_inited;
+   uint16_t rx_intr_port;
 };
 
 static struct event_eth_rx_adapter_test_params default_params;
 
 static inline int
-port_init(uint8_t port, struct rte_mempool *mp)
+port_init_common(uint8_t port, const struct rte_eth_conf *port_conf,
+   struct rte_mempool *mp)
 {
-   static const struct rte_eth_conf port_conf_default = {
-   .rxmode = {
-   .mq_mode = ETH_MQ_RX_RSS,
-   .max_rx_pkt_len = ETHER_MAX_LEN
-   },
-   .rx_adv_conf = {
-   .rss_conf = {
-   .rss_hf = ETH_RSS_IP |
- ETH_RSS_TCP |
- ETH_RSS_UDP,
-   }
-   }
-   };
const uint16_t rx_ring_size = 512, tx_ring_size = 512;
-   struct rte_eth_conf port_conf = port_conf_default;
int retval;
uint16_t q;
struct rte_eth_dev_info dev_info;
@@ -54,7 +43,7 @@ struct event_eth_rx_adapter_test_params {
if (!rte_eth_dev_is_valid_port(port))
return -1;
 
-   retval = rte_eth_dev_configure(port, 0, 0, &port_conf);
+   retval = rte_eth_dev_configure(port, 0, 0, port_conf);
 
rte_eth_dev_info_get(port, &dev_info);
 
@@ -64,7 +53,7 @@ struct event_eth_rx_adapter_test_params {
 
/* Configure the Ethernet device. */
retval = rte_eth_dev_configure(port, default_params.rx_rings,
-   default_params.tx_rings, &port_conf);
+   default_params.tx_rings, port_conf);
if (retval != 0)
return retval;
 
@@ -104,6 +93,77 @@ struct event_eth_rx_adapter_test_params {
return 0;
 }
 
+static inline int
+port_init_rx_intr(uint8_t port, struct rte_mempool *mp)
+{
+   static const struct rte_eth_conf port_conf_default = {
+   .rxmode = {
+   .mq_mode = ETH_MQ_RX_RSS,
+   .max_rx_pkt_len = ETHER_MAX_LEN
+   },
+   .intr_conf = {
+   .rxq = 1,
+   },
+   };
+
+   return port_init_common(port, &port_conf_default, mp);
+}
+
+static inline int
+port_init(uint8_t port, struct rte_mempool *mp)
+{
+   static const struct rte_eth_conf port_conf_default = {
+   .rxmode = {
+   .mq_mode = ETH_MQ_RX_RSS,
+   .max_rx_pkt_len = ETHER_MAX_LEN
+   },
+   .rx_adv_conf = {
+   .rss_conf = {
+   .rss_hf = ETH_RSS_IP |
+   ETH_RSS_TCP |
+   ETH_RSS_UDP,
+   }
+   }
+   };
+
+   return port_init_common(port, &port_conf_default, mp);
+}
+
+static int
+init_port_rx_intr(int num_ports)
+{
+   int retval;
+   uint16_t portid;
+   int err;
+
+   default_params.mp = rte_pktmbuf_pool_create("packet_pool",
+  NB_MBUFS,
+  MBUF_CACHE_SIZE,
+  MBUF_PRIV_SIZE,
+  RTE_MBUF_DEFAULT_BUF_SIZE,
+  rte_socket_id());
+   if (!default_params.mp)
+   return -ENOMEM;
+
+   RTE_ETH_FOREACH_DEV(portid) {
+   retval = port_init_rx_intr(portid, default_params.mp);
+   if (retval)
+   continue;
+   err = rte_event_eth_rx_adapter_caps_get(TEST_DEV_ID, portid,
+   &default_params.caps);
+   if (err)
+   continue;
+   if (!(default_params.caps &
+   RTE_EVENT_ETH_RX_ADAPTER_CAP_INTERNAL_PORT)) {
+   default_params.rx_intr_port_inited = 1;
+   default_params.rx_intr_port = portid;
+   return 0;
+   }
+   rte_eth_dev_stop(portid);
+   }
+   return 0;
+}
+
 static int
 init_ports(int num_ports)
 {
@@ -181,6 +241,57 @@ struct event_eth_rx_adapter_test_params {

[dpdk-dev] [PATCH v4 4/5] eventdev: add interrupt driven queues to Rx adapter

2018-07-01 Thread Nikhil Rao
Add support for interrupt driven queues when the eth device is
configured for rxq interrupts and the servicing weight for the
queue is configured to be zero.

An interrupt driven packet received counter has been added to
rte_event_eth_rx_adapter_stats.
Signed-off-by: Nikhil Rao 
---
 config/rte_config.h|   1 +
 lib/librte_eventdev/rte_event_eth_rx_adapter.h |   5 +-
 lib/librte_eventdev/rte_event_eth_rx_adapter.c | 940 -
 .../prog_guide/event_ethernet_rx_adapter.rst   |  24 +
 config/common_base |   1 +
 lib/librte_eventdev/Makefile   |   9 +-
 6 files changed, 950 insertions(+), 30 deletions(-)

diff --git a/config/rte_config.h b/config/rte_config.h
index a1d0175..ec88f14 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -64,6 +64,7 @@
 #define RTE_EVENT_MAX_DEVS 16
 #define RTE_EVENT_MAX_QUEUES_PER_DEV 64
 #define RTE_EVENT_TIMER_ADAPTER_NUM_MAX 32
+#define RTE_EVENT_ETH_INTR_RING_SIZE 1024
 #define RTE_EVENT_CRYPTO_ADAPTER_MAX_INSTANCE 32
 
 /* rawdev defines */
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h 
b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index 307b2b5..97f25e9 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -64,8 +64,7 @@
  * the service function ID of the adapter in this case.
  *
  * Note:
- * 1) Interrupt driven receive queues are currently unimplemented.
- * 2) Devices created after an instance of rte_event_eth_rx_adapter_create
+ * 1) Devices created after an instance of rte_event_eth_rx_adapter_create
  *  should be added to a new instance of the rx adapter.
  */
 
@@ -199,6 +198,8 @@ struct rte_event_eth_rx_adapter_stats {
 * block cycles can be used to compute the percentage of
 * cycles the service is blocked by the event device.
 */
+   uint64_t rx_intr_packets;
+   /**< Received packet count for interrupt mode Rx queues */
 };
 
 /**
diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c 
b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index 8fe037f..42dd7f8 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -2,6 +2,11 @@
  * Copyright(c) 2017 Intel Corporation.
  * All rights reserved.
  */
+#if defined(LINUX)
+#include 
+#endif
+#include 
+
 #include 
 #include 
 #include 
@@ -11,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rte_eventdev.h"
 #include "rte_eventdev_pmd.h"
@@ -24,6 +30,22 @@
 #define ETH_RX_ADAPTER_MEM_NAME_LEN 32
 
 #define RSS_KEY_SIZE   40
+/* value written to intr thread pipe to signal thread exit */
+#define ETH_BRIDGE_INTR_THREAD_EXIT 1
+/* Sentinel value to detect an uninitialized file descriptor */
+#define INIT_FD -1
+
+/*
+ * Used to store port and queue ID of interrupting Rx queue
+ */
+union queue_data {
+   RTE_STD_C11
+   void *ptr;
+   struct {
+   uint16_t port;
+   uint16_t queue;
+   };
+};
 
 /*
  * There is an instance of this struct per polled Rx queue added to the
@@ -75,6 +97,30 @@ struct rte_event_eth_rx_adapter {
uint16_t enq_block_count;
/* Block start ts */
uint64_t rx_enq_block_start_ts;
+   /* epoll fd used to wait for Rx interrupts */
+   int epd;
+   /* Num of interrupt driven Rx queues */
+   uint32_t num_rx_intr;
+   /* Used to send <port, queue> of interrupting Rx queues from
+* the interrupt thread to the Rx thread
+*/
+   struct rte_ring *intr_ring;
+   /* Rx Queue data (dev id, queue id) for the last non-empty
+* queue polled
+*/
+   union queue_data qd;
+   /* queue_data is valid */
+   int qd_valid;
+   /* Interrupt ring lock, synchronizes Rx thread
+* and interrupt thread
+*/
+   rte_spinlock_t intr_ring_lock;
+   /* event array passed to rte_epoll_wait */
+   struct rte_epoll_event *epoll_events;
+   /* Count of interrupt vectors in use */
+   uint32_t num_intr_vec;
+   /* Thread blocked on Rx interrupts */
+   pthread_t rx_intr_thread;
/* Configuration callback for rte_service configuration */
rte_event_eth_rx_adapter_conf_cb conf_cb;
/* Configuration callback argument */
@@ -93,6 +139,8 @@ struct rte_event_eth_rx_adapter {
uint32_t service_id;
/* Adapter started flag */
uint8_t rxa_started;
+   /* Adapter ID */
+   uint8_t id;
 } __rte_cache_aligned;
 
 /* Per eth device */
@@ -111,19 +159,40 @@ struct eth_device_info {
uint8_t dev_rx_started;
/* Number of queues added for this device */
uint16_t nb_dev_queues;
-   /* If nb_rx_poll > 0, the start callback will
+   /* Number of poll based queues
+* If nb_rx_poll > 0, the start callback will
 * be invoked if not already invoked
 */
uint16_t nb

[dpdk-dev] [PATCH v4 1/5] eventdev: standardize Rx adapter internal function names

2018-07-01 Thread Nikhil Rao
Add a common prefix to function names and rename a few to better match
their functionality.

Signed-off-by: Nikhil Rao 
Acked-by: Jerin Jacob 
---
 lib/librte_eventdev/rte_event_eth_rx_adapter.c | 167 -
 1 file changed, 80 insertions(+), 87 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c 
b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index ce1f62d..9361d48 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -129,30 +129,30 @@ struct eth_rx_queue_info {
 static struct rte_event_eth_rx_adapter **event_eth_rx_adapter;
 
 static inline int
-valid_id(uint8_t id)
+rxa_validate_id(uint8_t id)
 {
return id < RTE_EVENT_ETH_RX_ADAPTER_MAX_INSTANCE;
 }
 
 #define RTE_EVENT_ETH_RX_ADAPTER_ID_VALID_OR_ERR_RET(id, retval) do { \
-   if (!valid_id(id)) { \
+   if (!rxa_validate_id(id)) { \
RTE_EDEV_LOG_ERR("Invalid eth Rx adapter id = %d\n", id); \
return retval; \
} \
 } while (0)
 
 static inline int
-sw_rx_adapter_queue_count(struct rte_event_eth_rx_adapter *rx_adapter)
+rxa_sw_adapter_queue_count(struct rte_event_eth_rx_adapter *rx_adapter)
 {
return rx_adapter->num_rx_polled;
 }
 
 /* Greatest common divisor */
-static uint16_t gcd_u16(uint16_t a, uint16_t b)
+static uint16_t rxa_gcd_u16(uint16_t a, uint16_t b)
 {
uint16_t r = a % b;
 
-   return r ? gcd_u16(b, r) : b;
+   return r ? rxa_gcd_u16(b, r) : b;
 }
 
 /* Returns the next queue in the polling sequence
@@ -160,7 +160,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
  * http://kb.linuxvirtualserver.org/wiki/Weighted_Round-Robin_Scheduling
  */
 static int
-wrr_next(struct rte_event_eth_rx_adapter *rx_adapter,
+rxa_wrr_next(struct rte_event_eth_rx_adapter *rx_adapter,
 unsigned int n, int *cw,
 struct eth_rx_poll_entry *eth_rx_poll, uint16_t max_wt,
 uint16_t gcd, int prev)
@@ -190,7 +190,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
 
 /* Precalculate WRR polling sequence for all queues in rx_adapter */
 static int
-eth_poll_wrr_calc(struct rte_event_eth_rx_adapter *rx_adapter)
+rxa_calc_wrr_sequence(struct rte_event_eth_rx_adapter *rx_adapter)
 {
uint16_t d;
uint16_t q;
@@ -239,7 +239,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
rx_poll[poll_q].eth_rx_qid = q;
max_wrr_pos += wt;
max_wt = RTE_MAX(max_wt, wt);
-   gcd = (gcd) ? gcd_u16(gcd, wt) : wt;
+   gcd = (gcd) ? rxa_gcd_u16(gcd, wt) : wt;
poll_q++;
}
}
@@ -259,7 +259,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
int prev = -1;
int cw = -1;
for (i = 0; i < max_wrr_pos; i++) {
-   rx_wrr[i] = wrr_next(rx_adapter, poll_q, &cw,
+   rx_wrr[i] = rxa_wrr_next(rx_adapter, poll_q, &cw,
 rx_poll, max_wt, gcd, prev);
prev = rx_wrr[i];
}
@@ -276,7 +276,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
 }
 
 static inline void
-mtoip(struct rte_mbuf *m, struct ipv4_hdr **ipv4_hdr,
+rxa_mtoip(struct rte_mbuf *m, struct ipv4_hdr **ipv4_hdr,
struct ipv6_hdr **ipv6_hdr)
 {
struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
@@ -315,7 +315,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
 
 /* Calculate RSS hash for IPv4/6 */
 static inline uint32_t
-do_softrss(struct rte_mbuf *m, const uint8_t *rss_key_be)
+rxa_do_softrss(struct rte_mbuf *m, const uint8_t *rss_key_be)
 {
uint32_t input_len;
void *tuple;
@@ -324,7 +324,7 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
struct ipv4_hdr *ipv4_hdr;
struct ipv6_hdr *ipv6_hdr;
 
-   mtoip(m, &ipv4_hdr, &ipv6_hdr);
+   rxa_mtoip(m, &ipv4_hdr, &ipv6_hdr);
 
if (ipv4_hdr) {
ipv4_tuple.src_addr = rte_be_to_cpu_32(ipv4_hdr->src_addr);
@@ -343,13 +343,13 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
 }
 
 static inline int
-rx_enq_blocked(struct rte_event_eth_rx_adapter *rx_adapter)
+rxa_enq_blocked(struct rte_event_eth_rx_adapter *rx_adapter)
 {
return !!rx_adapter->enq_block_count;
 }
 
 static inline void
-rx_enq_block_start_ts(struct rte_event_eth_rx_adapter *rx_adapter)
+rxa_enq_block_start_ts(struct rte_event_eth_rx_adapter *rx_adapter)
 {
if (rx_adapter->rx_enq_block_start_ts)
return;
@@ -362,13 +362,13 @@ static uint16_t gcd_u16(uint16_t a, uint16_t b)
 }
 
 static inline void
-rx_enq_block_end_ts(struct rte_event_eth_rx_adapter *rx_adapter,
+rxa_enq_block_end_ts(struct rte_event_eth_rx_adapter *rx_adapter,
struct rte_event_eth_rx_adapter_stats *stats)
 {
if (unlikely(

[dpdk-dev] [PATCH v4 3/5] eventdev: move Rx adapter eth Rx to separate function

2018-07-01 Thread Nikhil Rao
Create a separate function that handles eth receive and
enqueue to event buffer. This function will also be called for
interrupt driven receive queues.

Signed-off-by: Nikhil Rao 
Acked-by: Jerin Jacob 
---
 lib/librte_eventdev/rte_event_eth_rx_adapter.c | 67 ++
 1 file changed, 47 insertions(+), 20 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c 
b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
index 926f83a..8fe037f 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.c
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c
@@ -616,6 +616,45 @@ static uint16_t rxa_gcd_u16(uint16_t a, uint16_t b)
}
 }
 
+/* Enqueue packets from <port, queue> to event buffer */
+static inline uint32_t
+rxa_eth_rx(struct rte_event_eth_rx_adapter *rx_adapter,
+   uint16_t port_id,
+   uint16_t queue_id,
+   uint32_t rx_count,
+   uint32_t max_rx)
+{
+   struct rte_mbuf *mbufs[BATCH_SIZE];
+   struct rte_eth_event_enqueue_buffer *buf =
+   &rx_adapter->event_enqueue_buffer;
+   struct rte_event_eth_rx_adapter_stats *stats =
+   &rx_adapter->stats;
+   uint16_t n;
+   uint32_t nb_rx = 0;
+
+   /* Don't do a batch dequeue from the rx queue if there isn't
+* enough space in the enqueue buffer.
+*/
+   while (BATCH_SIZE <= (RTE_DIM(buf->events) - buf->count)) {
+   if (buf->count >= BATCH_SIZE)
+   rxa_flush_event_buffer(rx_adapter);
+
+   stats->rx_poll_count++;
+   n = rte_eth_rx_burst(port_id, queue_id, mbufs, BATCH_SIZE);
+   if (unlikely(!n))
+   break;
+   rxa_buffer_mbufs(rx_adapter, port_id, queue_id, mbufs, n);
+   nb_rx += n;
+   if (rx_count + nb_rx > max_rx)
+   break;
+   }
+
+   if (buf->count >= BATCH_SIZE)
+   rxa_flush_event_buffer(rx_adapter);
+
+   return nb_rx;
+}
+
 /*
  * Polls receive queues added to the event adapter and enqueues received
  * packets to the event device.
@@ -633,17 +672,16 @@ static uint16_t rxa_gcd_u16(uint16_t a, uint16_t b)
 rxa_poll(struct rte_event_eth_rx_adapter *rx_adapter)
 {
uint32_t num_queue;
-   uint16_t n;
uint32_t nb_rx = 0;
-   struct rte_mbuf *mbufs[BATCH_SIZE];
struct rte_eth_event_enqueue_buffer *buf;
uint32_t wrr_pos;
uint32_t max_nb_rx;
+   struct rte_event_eth_rx_adapter_stats *stats;
 
wrr_pos = rx_adapter->wrr_pos;
max_nb_rx = rx_adapter->max_nb_rx;
buf = &rx_adapter->event_enqueue_buffer;
-   struct rte_event_eth_rx_adapter_stats *stats = &rx_adapter->stats;
+   stats = &rx_adapter->stats;
 
/* Iterate through a WRR sequence */
for (num_queue = 0; num_queue < rx_adapter->wrr_len; num_queue++) {
@@ -658,32 +696,21 @@ static uint16_t rxa_gcd_u16(uint16_t a, uint16_t b)
rxa_flush_event_buffer(rx_adapter);
if (BATCH_SIZE > (ETH_EVENT_BUFFER_SIZE - buf->count)) {
rx_adapter->wrr_pos = wrr_pos;
-   return;
+   break;
}
 
-   stats->rx_poll_count++;
-   n = rte_eth_rx_burst(d, qid, mbufs, BATCH_SIZE);
-
-   if (n) {
-   stats->rx_packets += n;
-   /* The check before rte_eth_rx_burst() ensures that
-* all n mbufs can be buffered
-*/
-   rxa_buffer_mbufs(rx_adapter, d, qid, mbufs, n);
-   nb_rx += n;
-   if (nb_rx > max_nb_rx) {
-   rx_adapter->wrr_pos =
+   nb_rx += rxa_eth_rx(rx_adapter, d, qid, nb_rx, max_nb_rx);
+   if (nb_rx > max_nb_rx) {
+   rx_adapter->wrr_pos =
(wrr_pos + 1) % rx_adapter->wrr_len;
-   break;
-   }
+   break;
}
 
if (++wrr_pos == rx_adapter->wrr_len)
wrr_pos = 0;
}
 
-   if (buf->count >= BATCH_SIZE)
-   rxa_flush_event_buffer(rx_adapter);
+   stats->rx_packets += nb_rx;
 }
 
 static int
-- 
1.8.3.1



[dpdk-dev] [Bug 67] multi_process/l2fwd_fork failed to compile

2018-07-01 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=67

Bug ID: 67
   Summary: multi_process/l2fwd_fork failed to compile
   Product: DPDK
   Version: 18.05
  Hardware: All
OS: All
Status: CONFIRMED
  Severity: normal
  Priority: Normal
 Component: examples
  Assignee: dev@dpdk.org
  Reporter: wangl...@infoch.cn
  Target Milestone: ---

CC main.o
/root/dpdk-18.05/examples/multi_process/l2fwd_fork/main.c: In function ‘main’:
/root/dpdk-18.05/examples/multi_process/l2fwd_fork/main.c:1043:33: error:
‘dev_info’ undeclared (first use in this function)
   rte_eth_dev_info_get(portid, &dev_info);
 ^
/root/dpdk-18.05/examples/multi_process/l2fwd_fork/main.c:1043:33: note: each
undeclared identifier is reported only once for each function it appears in
/root/dpdk-18.05/examples/multi_process/l2fwd_fork/main.c:1077:11: error:
‘struct rte_eth_txconf’ has no member named ‘tx_offloads’
   txq_conf.tx_offloads = local_port_conf.txmode.offloads;
   ^
make[1]: *** [main.o] Error 1
make: *** [all] Error 2

-- 
You are receiving this mail because:
You are the assignee for the bug.

[dpdk-dev] [PATCH] eal: fix device be attached twice

2018-07-01 Thread Qi Zhang
If an already attached PCI device is attached again, it will cause
rte_pci_device->device.name to be corrupted, due to an unexpected
rte_devargs_remove.

Fixes: 7e8b26650146 ("eal: fix hotplug add / remove")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang 
---
 lib/librte_eal/common/eal_common_dev.c | 21 +++--
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index 61cb3b162..14c5f05fa 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -42,18 +42,6 @@ static struct dev_event_cb_list dev_event_cbs;
 /* spinlock for device callbacks */
 static rte_spinlock_t dev_event_lock = RTE_SPINLOCK_INITIALIZER;
 
-static int cmp_detached_dev_name(const struct rte_device *dev,
-   const void *_name)
-{
-   const char *name = _name;
-
-   /* skip attached devices */
-   if (dev->driver != NULL)
-   return 1;
-
-   return strcmp(dev->name, name);
-}
-
 static int cmp_dev_name(const struct rte_device *dev, const void *_name)
 {
const char *name = _name;
@@ -151,14 +139,19 @@ int __rte_experimental rte_eal_hotplug_add(const char 
*busname, const char *devn
if (ret)
goto err_devarg;
 
-   dev = bus->find_device(NULL, cmp_detached_dev_name, devname);
+   dev = bus->find_device(NULL, cmp_dev_name, devname);
if (dev == NULL) {
-   RTE_LOG(ERR, EAL, "Cannot find unplugged device (%s)\n",
+   RTE_LOG(ERR, EAL, "Cannot find device (%s)\n",
devname);
ret = -ENODEV;
goto err_devarg;
}
 
+   if (dev->driver != NULL) {
+   RTE_LOG(ERR, EAL, "Device is already plugged\n");
+   return -EEXIST;
+   }
+
ret = bus->plug(dev);
if (ret) {
RTE_LOG(ERR, EAL, "Driver cannot attach the device (%s)\n",
-- 
2.13.6



[dpdk-dev] [PATCH] net/ixgbe: fix missing NULL point check

2018-07-01 Thread Qi Zhang
Add a missing NULL pointer check in ixgbe_pf_host_uninit, otherwise it
may cause a segmentation fault when detaching a device.

Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports")
Cc: sta...@dpdk.org

Signed-off-by: Qi Zhang 
---
 drivers/net/ixgbe/ixgbe_pf.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_pf.c b/drivers/net/ixgbe/ixgbe_pf.c
index 4d199c8..73f0e43 100644
--- a/drivers/net/ixgbe/ixgbe_pf.c
+++ b/drivers/net/ixgbe/ixgbe_pf.c
@@ -128,21 +128,24 @@ void ixgbe_pf_host_uninit(struct rte_eth_dev *eth_dev)
 
PMD_INIT_FUNC_TRACE();
 
-   vfinfo = IXGBE_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
-
RTE_ETH_DEV_SRIOV(eth_dev).active = 0;
RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 0;
RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx = 0;
RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx = 0;
 
-   ret = rte_eth_switch_domain_free((*vfinfo)->switch_domain_id);
-   if (ret)
-   PMD_INIT_LOG(WARNING, "failed to free switch domain: %d", ret);
-
vf_num = dev_num_vf(eth_dev);
if (vf_num == 0)
return;
 
+   vfinfo = IXGBE_DEV_PRIVATE_TO_P_VFDATA(eth_dev->data->dev_private);
+
+   if (*vfinfo == NULL)
+   return;
+
+   ret = rte_eth_switch_domain_free((*vfinfo)->switch_domain_id);
+   if (ret)
+   PMD_INIT_LOG(WARNING, "failed to free switch domain: %d", ret);
+
rte_free(*vfinfo);
*vfinfo = NULL;
 }
-- 
2.5.5



[dpdk-dev] [PATCH] net/mlx5: activate Verbs cleanup on removal

2018-07-01 Thread Matan Azrad
Starting from rdma-core v19 / Mellanox OFED 4.4, the Verbs resources
cleanup is properly activated in the plug-out process when the
MLX5_DEVICE_FATAL_CLEANUP environment variable is set to 1.

Set the aforementioned variable to 1.

Signed-off-by: Matan Azrad 
---
 drivers/net/mlx5/mlx5.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index f0e6ed7..d081bdd 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1409,6 +1409,11 @@
/* Match the size of Rx completion entry to the size of a cacheline. */
if (RTE_CACHE_LINE_SIZE == 128)
setenv("MLX5_CQE_SIZE", "128", 0);
+   /*
+* MLX5_DEVICE_FATAL_CLEANUP tells ibv_destroy functions to
+* cleanup all the Verbs resources even when the device was removed.
+*/
+   setenv("MLX5_DEVICE_FATAL_CLEANUP", "1", 1);
 #ifdef RTE_LIBRTE_MLX5_DLOPEN_DEPS
if (mlx5_glue_init())
return;
-- 
1.9.5



[dpdk-dev] [PATCH] net/i40e: fix link speed issue

2018-07-01 Thread Xiaoyun Li
When the link needs to go up, I40E_AQ_PHY_AN_ENABLED is always set in
DPDK, so all speeds are always advertised. This causes the speed config
to never take effect.

This patch fixes the issue by only allowing the available speeds to be
set. If the link needs to go up and the requested speed is not
supported, a warning is printed and the default available speeds are
set. When the link needs to go down, the link speed field is set to
non-zero to avoid a link-down issue when binding back to the kernel
driver.

Fixes: ca7e599d4506 ("net/i40e: fix link management")
Fixes: 1bb8f661168d ("net/i40e: fix link down and negotiation")
Cc: sta...@dpdk.org

Signed-off-by: Xiaoyun Li 
---
 drivers/net/i40e/i40e_ethdev.c | 58 ++
 1 file changed, 36 insertions(+), 22 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 13c5d32..272a975 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -2026,27 +2026,38 @@ i40e_phy_conf_link(struct i40e_hw *hw,
struct i40e_aq_get_phy_abilities_resp phy_ab;
struct i40e_aq_set_phy_config phy_conf;
enum i40e_aq_phy_type cnt;
+   uint8_t avail_speed;
uint32_t phy_type_mask = 0;
 
const uint8_t mask = I40E_AQ_PHY_FLAG_PAUSE_TX |
I40E_AQ_PHY_FLAG_PAUSE_RX |
I40E_AQ_PHY_FLAG_PAUSE_RX |
I40E_AQ_PHY_FLAG_LOW_POWER;
-   const uint8_t advt = I40E_LINK_SPEED_40GB |
-   I40E_LINK_SPEED_25GB |
-   I40E_LINK_SPEED_10GB |
-   I40E_LINK_SPEED_1GB |
-   I40E_LINK_SPEED_100MB;
int ret = -ENOTSUP;
 
+   /* To get phy capabilities of available speeds. */
+   status = i40e_aq_get_phy_capabilities(hw, false, true, &phy_ab,
+ NULL);
+   if (status) {
+   PMD_DRV_LOG(ERR, "Failed to get PHY capabilities: %d\n",
+   status);
+   return ret;
+   }
+   avail_speed = phy_ab.link_speed;
 
+   /* To get the current phy config. */
status = i40e_aq_get_phy_capabilities(hw, false, false, &phy_ab,
  NULL);
-   if (status)
+   if (status) {
+   PMD_DRV_LOG(ERR, "Failed to get the current PHY config: %d\n",
+   status);
return ret;
+   }
 
-   /* If link already up, no need to set up again */
-   if (is_up && phy_ab.phy_type != 0)
+   /* If link needs to go up and its speed values are OK, no need
+* to set up again.
+*/
+   if (is_up && phy_ab.phy_type != 0 && phy_ab.link_speed != 0)
return I40E_SUCCESS;
 
memset(&phy_conf, 0, sizeof(phy_conf));
@@ -2055,15 +2066,17 @@ i40e_phy_conf_link(struct i40e_hw *hw,
abilities &= ~mask;
abilities |= phy_ab.abilities & mask;
 
-   /* update ablities and speed */
-   if (abilities & I40E_AQ_PHY_AN_ENABLED)
-   phy_conf.link_speed = advt;
-   else
-   phy_conf.link_speed = is_up ? force_speed : phy_ab.link_speed;
-
phy_conf.abilities = abilities;
 
-
+   /* If link needs to go up, but the force speed is not supported,
+* Warn users and config the default available speeds.
+*/
+   if (is_up && !(force_speed & avail_speed)) {
+   PMD_DRV_LOG(WARNING, "Invalid speed setting, set to default!\n");
+   phy_conf.link_speed = avail_speed;
+   } else {
+   phy_conf.link_speed = is_up ? force_speed : avail_speed;
+   }
 
/* To enable link, phy_type mask needs to include each type */
for (cnt = I40E_PHY_TYPE_SGMII; cnt < I40E_PHY_TYPE_MAX; cnt++)
@@ -2099,6 +2112,14 @@ i40e_apply_link_speed(struct rte_eth_dev *dev)
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
struct rte_eth_conf *conf = &dev->data->dev_conf;
 
+   if (conf->link_speeds == ETH_LINK_SPEED_AUTONEG) {
+   conf->link_speeds = ETH_LINK_SPEED_40G |
+   ETH_LINK_SPEED_25G |
+   ETH_LINK_SPEED_20G |
+   ETH_LINK_SPEED_10G |
+   ETH_LINK_SPEED_1G |
+   ETH_LINK_SPEED_100M;
+   }
speed = i40e_parse_link_speeds(conf->link_speeds);
abilities |= I40E_AQ_PHY_ENABLE_ATOMIC_LINK;
if (!(conf->link_speeds & ETH_LINK_SPEED_FIXED))
@@ -2220,13 +2241,6 @@ i40e_dev_start(struct rte_eth_dev *dev)
}
 
/* Apply link configure */
-   if (dev->data->dev_conf.link_speeds & ~(ETH_LINK_SPEED_100M |
-   ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G |
-   ETH_LINK_SPEED_20G | ETH_LINK_SPEED_25G |
-   ETH_LINK_SPEED_40G)) {
-  

[dpdk-dev] [PATCH v8 00/19] enable hotplug on multi-process

2018-07-01 Thread Qi Zhang
v8:
- update rte_eal_version.map due to new API added.
- minor reword on release note.
- minor fix on commit log and code style.

NOTE:
  Some issues which are not related to this patchset are expected when
  playing with the hotplug_mp sample, as below.

- Attaching a PCI device twice may cause the device to become
  undetachable; the fix below is required:
  https://patches.dpdk.org/patch/42030/

- an ixgbe device can't be detached; the fix below is required:
  https://patches.dpdk.org/patch/42031/

v7:
- update rte_ethdev_version.map for new APIs.
- improve code readability in __handle_secondary_request by use goto.
- add comments to explain why need to call rte_eal_alarm_set.
- add error log when process_mp_init_callbacks failed.
- reword release notes base on Anatoly's suggestion.
- add back previous "Acked-by" and "Reviewed-by" in commit log.

  NOTE: the current patchset depends on the IPC fix below, otherwise it
  may not be able to attach a shared vdev.
  https://patches.dpdk.org/patch/41647/

v6:
- remove bus->scan_one, since ABI break is not necessary.
- remove patch for failsafe PMD since it will not support secondary.
- fix wrong implementation on ixgbe.
- add rte_eth_dev_release_port_private into rte_eth_dev_pci_generic_remove for
  the secondary process, so we don't need to patch a PMD if it uses the
  default remove function.
- add release notes update.
- agreed to use strdup(peer) as a workaround for replying to a sync
  request in a separate thread.

v5:
- since we will keep the mp thread separate from the interrupt thread,
  it is not necessary to use a temporary thread; we use
  rte_eal_alarm_set instead.
- remove the change in rte_eth_dev_release_port, since there is a better
  way to prevent rte_eth_dev_release_port be called after
  rte_eth_dev_release_port_private.
- fix the issue that lock does not take effect on secondary due to
  previous re-work
- fix the issue when the first attached device is a private device from
  secondary. (patch 8/24)
- workaround for replying to a sync request in a separate thread; this
  is still open and under discussion, as below.

v4:
- since the mp thread will be merged into the interrupt thread, the v3
  fix for the sync IPC deadlock will not work. The new version enables
  a mechanism to invoke an mp action callback in a temporary thread to
  avoid the IPC deadlock; with this, the secondary-to-primary request
  implementation is also simplified, since we can use a sync request
  directly in a separate thread.

v3:
- enable mp init callback register to help non-eal module to initialize
  mp channel during rte_eal_init
- fix attaching a shared device from secondary.
  1) deadlock due to sync IPC being invoked in rte_malloc in the
     primary process when handling a secondary request to attach a
     device; the solution is for the primary process to issue shared
     device attach/detach in the interrupt thread.
  2) returned port_id not correct.
- check nb_sent and nb_received in sync IPC.
- fix memory leak during error handling at attach_on_secondary.
- improve clean_lock_callback to only lock/unlock spinlock once
- improve error code return in check-reply during async IPC.
- remove rte_ prefix of internal function in ethdev_mp.c
- sample code improvement.
  1) rename sample to "hotplug_mp", and move to example/multi-process.
  2) cleanup header include.
  3) call rte_eal_cleanup before exit.

v2:
- rename rte_ethdev_mp.* to ethdev_mp.*
- rename rte_ethdev_lock.* to ethdev_lock.*
- move internal function to ethdev_private.h
- separate rte_eth_dev_[un]lock into rte_eth_dev_[un]lock and
  rte_eth_dev_[un]lock_with_callback
- lock callbacks will be removed automatically after device is detached.
- add experimental tag for all new APIs.
- fix coding style issue.
- fix wrong license header in sample code.
- fix spelling 
- fix meson.build.
- improve comments. 

Background:
===

Currently a secondary process will only sync ethdev data from the
primary process at the init stage, but it will not be aware if a
device is attached/detached in the primary process at runtime.

There is a requirement from applications that take the
primary-secondary process model: the primary process works as a
resource management process that creates/destroys virtual devices at
runtime, while the secondary processes deal with the network stuff
using these devices.

Solution:
=

So the original intention was to fix this gap, but beyond that the
patch set provides a more comprehensive solution to handle different
hotplug cases in a multi-process situation. It covers the scenarios
below:

1. Attach a share device from primary
2. Detach a share device from primary
3. Attach a share device from secondary
4. Detach a share device from secondary
5. Attach a private device from secondary
6. Detach a private device from secondary
7. Detach a share device from secondary privately
8. Attach a share device from secondary privately

In the primary-secondary process model, we assume ethernet devices are
shared by default. That means attaching or detaching a device in any
process will broa

[dpdk-dev] [PATCH v8 05/19] ethdev: support attach or detach share device from secondary

2018-07-01 Thread Qi Zhang
This patch covers the multi-process hotplug case when a shared device
attach/detach request is issued from a secondary process.

device attach on secondary:
a) secondary sends a sync request to primary.
b) primary receives the request and attaches the new device; if that
   fails, goto i).
c) primary forwards the attach sync request to all secondaries.
d) each secondary receives the request, attaches the device and sends
   a reply.
e) primary checks the replies; if all succeed, goto j).
f) primary sends an attach rollback sync request to all secondaries.
g) each secondary receives the request, detaches the device and sends
   a reply.
h) primary receives the replies and detaches the device as the
   rollback action.
i) send a fail reply to the secondary, goto k).
j) send a success reply to the secondary.
k) the requesting secondary receives the reply of step a) and returns.

device detach on secondary:
a) secondary sends a sync request to primary.
b) primary receives the request and performs a pre-detach check; if
   the device is locked, goto j).
c) primary sends a pre-detach sync request to all secondaries.
d) each secondary performs the pre-detach check and sends a reply.
e) primary checks the replies; if any fail, goto j).
f) primary sends a detach sync request to all secondaries.
g) each secondary detaches the device and sends a reply.
h) primary detaches the device.
i) send a success reply to the secondary, goto k).
j) send a fail reply to the secondary.
k) the requesting secondary receives the reply of step a) and returns.

Signed-off-by: Qi Zhang 
Reviewed-by: Anatoly Burakov 
---
 lib/librte_ethdev/ethdev_mp.c | 179 --
 1 file changed, 173 insertions(+), 6 deletions(-)

diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c
index 1d148cd5e..8d13da591 100644
--- a/lib/librte_ethdev/ethdev_mp.c
+++ b/lib/librte_ethdev/ethdev_mp.c
@@ -5,8 +5,44 @@
 #include 
 
 #include "rte_ethdev_driver.h"
+
 #include "ethdev_mp.h"
 #include "ethdev_lock.h"
+#include "ethdev_private.h"
+
+/**
+ *
+ * secondary to primary request.
+ * start from function eth_dev_request_to_primary.
+ *
+ * device attach on secondary:
+ * a) secondary send sync request to primary
+ * b) primary receive the request and attach the new device thread,
+ *if failed goto i).
+ * c) primary forward attach request to all secondary as sync request
+ * d) secondary receive request and attach device and send reply.
+ * e) primary check the reply if all success go to j).
+ * f) primary send attach rollback sync request to all secondary.
+ * g) secondary receive the request and detach device and send reply.
+ * h) primary receive the reply and detach device as rollback action.
+ * i) send fail sync reply to secondary, goto k).
+ * j) send success sync reply to secondary.
+ * k) secondary process receive reply of step a) and return.
+ *
+ * device detach on secondary:
+ * a) secondary send detach sync request to primary
+ * b) primary receive the request and perform pre-detach check, if device
+ *is locked, goto j).
+ * c) primary send pre-detach sync request to all secondary.
+ * d) secondary perform pre-detach check and send reply.
+ * e) primary check the reply if any fail goto j).
+ * f) primary send detach sync request to all secondary
+ * g) secondary detach the device and send reply
+ * h) primary detach the device.
+ * i) send success sync reply to secondary, goto k).
+ * j) send fail sync reply to secondary.
+ * k) secondary process receive reply of step a) and return.
+ */
 
 #define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
 
@@ -84,11 +120,122 @@ static int attach_on_secondary(const char *devargs, 
uint16_t port_id)
 }
 
 static int
-handle_secondary_request(const struct rte_mp_msg *msg, const void *peer)
+send_response_to_secondary(const struct eth_dev_mp_req *req,
+   int result,
+   const void *peer)
+{
+   struct rte_mp_msg mp_resp;
+   struct eth_dev_mp_req *resp =
+   (struct eth_dev_mp_req *)mp_resp.param;
+   int ret;
+
+   memset(&mp_resp, 0, sizeof(mp_resp));
+   mp_resp.len_param = sizeof(*resp);
+   strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST);
+   memcpy(resp, req, sizeof(*req));
+   resp->result = result;
+
+   ret = rte_mp_reply(&mp_resp, peer);
+   if (ret)
+   ethdev_log(ERR, "failed to send response to secondary\n");
+
+   return ret;
+}
+
+int eth_dev_request_to_secondary(struct eth_dev_mp_req *req);
+
+static void
+__handle_secondary_request(void *param)
+{
+   struct mp_reply_bundle *bundle = param;
+   const struct rte_mp_msg *msg = &bundle->msg;
+   const struct eth_dev_mp_req *req =
+   (const struct eth_dev_mp_req *)msg->param;
+   struct eth_dev_mp_req tmp_req;
+   uint16_t port_id;
+   int ret = 0;
+
+   tmp_req = *req;
+
+   if (req->t == REQ_TYPE_ATTACH) {
+   ret = do_eth_dev_attach(req->devargs, &port_id);
+   if (ret)
+   goto finish;
+
+   tmp_req.port_id = port_id;
+   ret = eth_dev_request_to_secondar

[dpdk-dev] [PATCH v8 02/19] eal: enable multi process init callback

2018-07-01 Thread Qi Zhang
Introduce a new API, rte_eal_register_mp_init, that helps to register
a callback function which will be invoked right after the
multi-process channel is established (rte_mp_channel_init). Typically
the API will be used by other modules that want their mp channel
action callbacks to be registered automatically during rte_eal_init.

Signed-off-by: Qi Zhang 
Acked-by: Anatoly Burakov 
---
 lib/librte_eal/common/eal_common_proc.c | 57 +++--
 lib/librte_eal/common/eal_private.h |  5 +++
 lib/librte_eal/common/include/rte_eal.h | 34 
 lib/librte_eal/linuxapp/eal/eal.c   |  2 ++
 lib/librte_eal/rte_eal_version.map  |  1 +
 5 files changed, 97 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c 
b/lib/librte_eal/common/eal_common_proc.c
index f010ef59e..f6d7c83e4 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -619,11 +619,47 @@ unlink_sockets(const char *filter)
return 0;
 }
 
+struct mp_init_entry {
+   TAILQ_ENTRY(mp_init_entry) next;
+   rte_eal_mp_init_callback_t callback;
+};
+
+TAILQ_HEAD(mp_init_entry_list, mp_init_entry);
+static struct mp_init_entry_list mp_init_entry_list =
+   TAILQ_HEAD_INITIALIZER(mp_init_entry_list);
+
+static int process_mp_init_callbacks(void)
+{
+   struct mp_init_entry *entry;
+   int ret;
+
+   TAILQ_FOREACH(entry, &mp_init_entry_list, next) {
+   ret = entry->callback();
+   if (ret)
+   return ret;
+   }
+   return 0;
+}
+
+int __rte_experimental
+rte_eal_register_mp_init(rte_eal_mp_init_callback_t callback)
+{
+   struct mp_init_entry *entry = calloc(1, sizeof(struct mp_init_entry));
+
+   if (entry == NULL)
+   return -ENOMEM;
+
+   entry->callback = callback;
+   TAILQ_INSERT_TAIL(&mp_init_entry_list, entry, next);
+
+   return 0;
+}
+
 int
 rte_mp_channel_init(void)
 {
char path[PATH_MAX];
-   int dir_fd;
+   int dir_fd, ret;
pthread_t mp_handle_tid, async_reply_handle_tid;
 
/* create filter path */
@@ -686,7 +722,24 @@ rte_mp_channel_init(void)
flock(dir_fd, LOCK_UN);
close(dir_fd);
 
-   return 0;
+   ret = process_mp_init_callbacks();
+   if (ret)
+   RTE_LOG(ERR, EAL, "failed to process mp init callbacks\n");
+
+   return ret;
+}
+
+void rte_mp_init_callback_cleanup(void)
+{
+   struct mp_init_entry *entry;
+
+   while (!TAILQ_EMPTY(&mp_init_entry_list)) {
+   TAILQ_FOREACH(entry, &mp_init_entry_list, next) {
+   TAILQ_REMOVE(&mp_init_entry_list, entry, next);
+   free(entry);
+   break;
+   }
+   }
 }
 
 /**
diff --git a/lib/librte_eal/common/eal_private.h 
b/lib/librte_eal/common/eal_private.h
index bdadc4d50..bc230ee23 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -247,6 +247,11 @@ struct rte_bus *rte_bus_find_by_device_name(const char 
*str);
 int rte_mp_channel_init(void);
 
 /**
+ * Cleanup all mp channel init callbacks.
+ */
+void rte_mp_init_callback_cleanup(void);
+
+/**
  * Internal Executes all the user application registered callbacks for
  * the specific device. It is for DPDK internal user only. User
  * application should not call it directly.
diff --git a/lib/librte_eal/common/include/rte_eal.h 
b/lib/librte_eal/common/include/rte_eal.h
index 8de5d69e8..506f17f34 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -512,6 +512,40 @@ __rte_deprecated
 const char *
 rte_eal_mbuf_default_mempool_ops(void);
 
+/**
+ * Callback function invoked right after the multi-process channel is
+ * established. A typical implementation registers mp channel action
+ * callbacks.
+ *
+ * @return
+ *  - 0 on success.
+ *  - (<0) on failure.
+ */
+typedef int (*rte_eal_mp_init_callback_t)(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Register a callback function that will be invoked right after the
+ * multi-process channel is established (rte_mp_channel_init). Typically
+ * it is used by modules that want their mp channel action callbacks
+ * registered automatically during rte_eal_init.
+ *
+ * @note
+ *   This function only takes effect when called before rte_eal_init,
+ *   and all registered callbacks will be cleared during rte_eal_cleanup.
+ *
+ * @param callback
+ *   Function to be called at that moment.
+ *
+ * @return
+ *  - 0 on success.
+ *  - (<0) on failure.
+ */
+int __rte_experimental
+rte_eal_register_mp_init(rte_eal_mp_init_callback_t callback);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 8655b8691..45cccff7e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal

[dpdk-dev] [PATCH v8 01/19] ethdev: add function to release port in local process

2018-07-01 Thread Qi Zhang
Add driver API rte_eth_dev_release_port_private to support releasing
an ethdev on a secondary process only: just the local state is set to
unused, while shared data is left untouched so the primary process can
still use it.

Signed-off-by: Qi Zhang 
Reviewed-by: Andrew Rybchenko 
Acked-by: Remy Horton 
---
 lib/librte_ethdev/rte_ethdev.c| 12 
 lib/librte_ethdev/rte_ethdev_driver.h | 13 +
 lib/librte_ethdev/rte_ethdev_pci.h|  3 +++
 3 files changed, 28 insertions(+)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index a9977df97..52a97694c 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -359,6 +359,18 @@ rte_eth_dev_attach_secondary(const char *name)
 }
 
 int
+rte_eth_dev_release_port_private(struct rte_eth_dev *eth_dev)
+{
+   if (eth_dev == NULL)
+   return -EINVAL;
+
+   _rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
+   eth_dev->state = RTE_ETH_DEV_UNUSED;
+
+   return 0;
+}
+
+int
 rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
 {
if (eth_dev == NULL)
diff --git a/lib/librte_ethdev/rte_ethdev_driver.h 
b/lib/librte_ethdev/rte_ethdev_driver.h
index c9c825e3f..49c27223d 100644
--- a/lib/librte_ethdev/rte_ethdev_driver.h
+++ b/lib/librte_ethdev/rte_ethdev_driver.h
@@ -70,6 +70,19 @@ int rte_eth_dev_release_port(struct rte_eth_dev *eth_dev);
 
 /**
  * @internal
+ * Release the specified ethdev port in the local process only: set the
+ * ethdev state to unused without resetting shared data, since other
+ * processes may still be using it. Typically called by a secondary process.
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eth_dev_release_port_private(struct rte_eth_dev *eth_dev);
+
+/**
+ * @internal
  * Release device queues and clear its configuration to force the user
  * application to reconfigure it. It is for internal use only.
  *
diff --git a/lib/librte_ethdev/rte_ethdev_pci.h 
b/lib/librte_ethdev/rte_ethdev_pci.h
index 2cfd37274..eeb944146 100644
--- a/lib/librte_ethdev/rte_ethdev_pci.h
+++ b/lib/librte_ethdev/rte_ethdev_pci.h
@@ -197,6 +197,9 @@ rte_eth_dev_pci_generic_remove(struct rte_pci_device 
*pci_dev,
if (!eth_dev)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(eth_dev);
+
if (dev_uninit) {
ret = dev_uninit(eth_dev);
if (ret)
-- 
2.13.6



[dpdk-dev] [PATCH v8 06/19] ethdev: support attach private device as first

2018-07-01 Thread Qi Zhang
When attaching a private device from a secondary process as the first
device, we need to make sure rte_eth_dev_shared_data is initialized.
This patch adds the necessary IPC for the secondary to ask the primary
to do the initialization.

Signed-off-by: Qi Zhang 
---
 lib/librte_ethdev/ethdev_mp.c  |  2 ++
 lib/librte_ethdev/ethdev_mp.h  |  1 +
 lib/librte_ethdev/ethdev_private.h |  3 +++
 lib/librte_ethdev/rte_ethdev.c | 31 ---
 4 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c
index 8d13da591..28f89dba9 100644
--- a/lib/librte_ethdev/ethdev_mp.c
+++ b/lib/librte_ethdev/ethdev_mp.c
@@ -189,6 +189,8 @@ __handle_secondary_request(void *param)
} else {
ret = tmp_req.result;
}
+   } else if (req->t == REQ_TYPE_SHARE_DATA_PREPARE) {
+   eth_dev_shared_data_prepare();
} else {
ethdev_log(ERR, "unsupported secondary to primary request\n");
ret = -ENOTSUP;
diff --git a/lib/librte_ethdev/ethdev_mp.h b/lib/librte_ethdev/ethdev_mp.h
index 40be46c89..61fc381da 100644
--- a/lib/librte_ethdev/ethdev_mp.h
+++ b/lib/librte_ethdev/ethdev_mp.h
@@ -15,6 +15,7 @@ enum eth_dev_req_type {
REQ_TYPE_PRE_DETACH,
REQ_TYPE_DETACH,
REQ_TYPE_ATTACH_ROLLBACK,
+   REQ_TYPE_SHARE_DATA_PREPARE,
 };
 
 struct eth_dev_mp_req {
diff --git a/lib/librte_ethdev/ethdev_private.h 
b/lib/librte_ethdev/ethdev_private.h
index 981e7de8a..005d63afc 100644
--- a/lib/librte_ethdev/ethdev_private.h
+++ b/lib/librte_ethdev/ethdev_private.h
@@ -36,4 +36,7 @@ int do_eth_dev_attach(const char *devargs, uint16_t *port_id);
  */
 int do_eth_dev_detach(uint16_t port_id);
 
+/* Prepare shared data for multi-process */
+void eth_dev_shared_data_prepare(void);
+
 #endif
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 7d89d9f95..408a49f44 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -199,11 +199,14 @@ rte_eth_find_next(uint16_t port_id)
return port_id;
 }
 
-static void
-rte_eth_dev_shared_data_prepare(void)
+void
+eth_dev_shared_data_prepare(void)
 {
const unsigned flags = 0;
const struct rte_memzone *mz;
+   struct eth_dev_mp_req req;
+
+   memset(&req, 0, sizeof(req));
 
rte_spinlock_lock(&rte_eth_shared_data_lock);
 
@@ -215,6 +218,12 @@ rte_eth_dev_shared_data_prepare(void)
rte_socket_id(), flags);
} else
mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
+   /* if secondary attach a private device first */
+   if (mz == NULL && rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   req.t = REQ_TYPE_SHARE_DATA_PREPARE;
+   eth_dev_request_to_primary(&req);
+   mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
+   }
if (mz == NULL)
rte_panic("Cannot allocate ethdev shared data\n");
 
@@ -255,7 +264,7 @@ rte_eth_dev_allocated(const char *name)
 {
struct rte_eth_dev *ethdev;
 
-   rte_eth_dev_shared_data_prepare();
+   eth_dev_shared_data_prepare();
 
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -300,7 +309,7 @@ rte_eth_dev_allocate(const char *name)
uint16_t port_id;
struct rte_eth_dev *eth_dev = NULL;
 
-   rte_eth_dev_shared_data_prepare();
+   eth_dev_shared_data_prepare();
 
/* Synchronize port creation between primary and secondary threads. */
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
@@ -339,7 +348,7 @@ rte_eth_dev_attach_secondary(const char *name)
uint16_t i;
struct rte_eth_dev *eth_dev = NULL;
 
-   rte_eth_dev_shared_data_prepare();
+   eth_dev_shared_data_prepare();
 
/* Synchronize port attachment to primary port creation and release. */
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
@@ -379,7 +388,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
if (eth_dev == NULL)
return -EINVAL;
 
-   rte_eth_dev_shared_data_prepare();
+   eth_dev_shared_data_prepare();
 
_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
 
@@ -433,7 +442,7 @@ rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t 
owner_id)
 int __rte_experimental
 rte_eth_dev_owner_new(uint64_t *owner_id)
 {
-   rte_eth_dev_shared_data_prepare();
+   eth_dev_shared_data_prepare();
 
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -488,7 +497,7 @@ rte_eth_dev_owner_set(const uint16_t port_id,
 {
int ret;
 
-   rte_eth_dev_shared_data_prepare();
+   eth_dev_shared_data_prepare();
 
rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
@@ -505,7 +514,7 @@ rte_eth_dev_owner_u

[dpdk-dev] [PATCH v8 03/19] ethdev: enable hotplug on multi-process

2018-07-01 Thread Qi Zhang
This series introduces a solution to handle different hotplug cases in
a multi-process situation. It includes the scenarios below:

1. Attach a share device from primary
2. Detach a share device from primary
3. Attach a share device from secondary
4. Detach a share device from secondary
5. Attach a private device from secondary
6. Detach a private device from secondary
7. Detach a share device from secondary privately
8. Attach a share device from secondary privately

In the primary-secondary process model, we assume a device is shared by
default: attaching or detaching a device on any process is broadcast to
all other processes through the mp channel, so device information stays
synchronized across processes.

Any failure during the attach process would leave processes in an
inconsistent state, so a proper rollback action must be taken. It is
also not safe to detach a shared device while another process is still
using it, so a handshake mechanism is introduced.

This patch covers the implementation of cases 1, 2, 5, 6, 7 and 8.
Cases 3 and 4, as well as the handshake mechanism, will be implemented
in a separate patch.

Scenario for Case 1, 2:

attach device
a) primary attach the new device if failed goto h).
b) primary send attach sync request to all secondary.
c) secondary receive request and attach device and send reply.
d) primary check the reply if all success go to i).
e) primary send attach rollback sync request to all secondary.
f) secondary receive the request and detach device and send reply.
g) primary receive the reply and detach device as rollback action.
h) attach fail
i) attach success

detach device
a) primary perform pre-detach check, if device is locked, goto i).
b) primary send pre-detach sync request to all secondary.
c) secondary perform pre-detach check and send reply.
d) primary check the reply if any fail goto i).
e) primary send detach sync request to all secondary
f) secondary detach the device and send reply (assume no fail)
g) primary detach the device.
h) detach success
i) detach failed

Case 5, 6:
A secondary process can attach a private device that is visible only to
itself; in this case no IPC is involved. So far the primary process is
not allowed to have private devices.

Case 7, 8:
A secondary process can also temporarily detach a shared device
"privately" and attach it back later; this does not impact other
processes either.

APIs changes:

rte_eth_dev_attach and rte_eth_dev_detach are extended to support
shared device attach/detach in the primary-secondary process model;
they are called in cases 1, 2, 3 and 4.

New APIs rte_eth_dev_attach_private and rte_eth_dev_detach_private are
introduced to cover cases 5, 6, 7 and 8; they can only be invoked in a
secondary process.

Signed-off-by: Qi Zhang 
---
 lib/librte_ethdev/Makefile   |   1 +
 lib/librte_ethdev/ethdev_mp.c| 261 +++
 lib/librte_ethdev/ethdev_mp.h|  41 +
 lib/librte_ethdev/ethdev_private.h   |  39 +
 lib/librte_ethdev/meson.build|   1 +
 lib/librte_ethdev/rte_ethdev.c   | 210 +++--
 lib/librte_ethdev/rte_ethdev.h   |  45 ++
 lib/librte_ethdev/rte_ethdev_core.h  |   5 +
 lib/librte_ethdev/rte_ethdev_version.map |   2 +
 9 files changed, 588 insertions(+), 17 deletions(-)
 create mode 100644 lib/librte_ethdev/ethdev_mp.c
 create mode 100644 lib/librte_ethdev/ethdev_mp.h
 create mode 100644 lib/librte_ethdev/ethdev_private.h

diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
index c2f2f7d82..d0a059b83 100644
--- a/lib/librte_ethdev/Makefile
+++ b/lib/librte_ethdev/Makefile
@@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map
 LIBABIVER := 9
 
 SRCS-y += rte_ethdev.c
+SRCS-y += ethdev_mp.c
 SRCS-y += rte_flow.c
 SRCS-y += rte_tm.c
 SRCS-y += rte_mtr.c
diff --git a/lib/librte_ethdev/ethdev_mp.c b/lib/librte_ethdev/ethdev_mp.c
new file mode 100644
index 0..0f9d8990d
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_mp.c
@@ -0,0 +1,261 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2018 Intel Corporation
+ */
+#include 
+#include 
+
+#include "rte_ethdev_driver.h"
+#include "ethdev_mp.h"
+
+#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */
+
+struct mp_reply_bundle {
+   struct rte_mp_msg msg;
+   void *peer;
+};
+
+static int detach_on_secondary(uint16_t port_id)
+{
+   struct rte_device *dev;
+   struct rte_bus *bus;
+   int ret = 0;
+
+   if (rte_eth_devices[port_id].state == RTE_ETH_DEV_UNUSED) {
+   ethdev_log(ERR, "detach on secondary: invalid port %d\n",
+  port_id);
+   return -ENODEV;
+   }
+
+   dev = rte_eth_devices[port_id].device;
+   if (dev == NULL)
+   return -EINVAL;
+
+   bus = rte_bus_find_by_device(dev);
+   if (bus == NULL)
+   return -ENOENT;
+
+   ret = rte_eal_hotplug_remove(bus->name, dev->name);
+   if (ret) {
+   ethdev_log(ERR, "failed to h

[dpdk-dev] [PATCH v8 07/19] net/i40e: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/i40e/i40e_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 13c5d3296..7d1f98422 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -678,6 +678,8 @@ static int eth_i40e_pci_remove(struct rte_pci_device 
*pci_dev)
if (!ethdev)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
 
if (ethdev->data->dev_flags & RTE_ETH_DEV_REPRESENTOR)
return rte_eth_dev_destroy(ethdev, i40e_vf_representor_uninit);
-- 
2.13.6



[dpdk-dev] [PATCH v8 04/19] ethdev: introduce device lock

2018-07-01 Thread Qi Zhang
Introduce APIs rte_eth_dev_lock and rte_eth_dev_unlock to let an
application lock or unlock a specific ethdev. A locked device can't be
detached, which helps the application prevent unexpected device
detaching, especially in a multi-process environment.

Also introduce the new APIs rte_eth_dev_lock_with_callback and
rte_eth_dev_unlock_with_callback to let the application register a
callback function which will be invoked before a device is detached;
the return value of the callback decides whether the detach continues.
This lets the application do condition checks at runtime.

Signed-off-by: Qi Zhang 
Reviewed-by: Anatoly Burakov 
---
 lib/librte_ethdev/Makefile   |   1 +
 lib/librte_ethdev/ethdev_lock.c  | 140 +++
 lib/librte_ethdev/ethdev_lock.h  |  31 +++
 lib/librte_ethdev/ethdev_mp.c|   3 +-
 lib/librte_ethdev/meson.build|   1 +
 lib/librte_ethdev/rte_ethdev.c   |  60 -
 lib/librte_ethdev/rte_ethdev.h   | 124 +++
 lib/librte_ethdev/rte_ethdev_version.map |   2 +
 8 files changed, 360 insertions(+), 2 deletions(-)
 create mode 100644 lib/librte_ethdev/ethdev_lock.c
 create mode 100644 lib/librte_ethdev/ethdev_lock.h

diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile
index d0a059b83..62bef03fc 100644
--- a/lib/librte_ethdev/Makefile
+++ b/lib/librte_ethdev/Makefile
@@ -20,6 +20,7 @@ LIBABIVER := 9
 
 SRCS-y += rte_ethdev.c
 SRCS-y += ethdev_mp.c
+SRCS-y += ethdev_lock.c
 SRCS-y += rte_flow.c
 SRCS-y += rte_tm.c
 SRCS-y += rte_mtr.c
diff --git a/lib/librte_ethdev/ethdev_lock.c b/lib/librte_ethdev/ethdev_lock.c
new file mode 100644
index 0..6379519e3
--- /dev/null
+++ b/lib/librte_ethdev/ethdev_lock.c
@@ -0,0 +1,140 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+#include "ethdev_lock.h"
+
+struct lock_entry {
+   TAILQ_ENTRY(lock_entry) next;
+   rte_eth_dev_lock_callback_t callback;
+   uint16_t port_id;
+   void *user_args;
+   int ref_count;
+};
+
+TAILQ_HEAD(lock_entry_list, lock_entry);
+static struct lock_entry_list lock_entry_list =
+   TAILQ_HEAD_INITIALIZER(lock_entry_list);
+static rte_spinlock_t lock_entry_lock = RTE_SPINLOCK_INITIALIZER;
+
+int
+register_lock_callback(uint16_t port_id,
+   rte_eth_dev_lock_callback_t callback,
+   void *user_args)
+{
+   struct lock_entry *le;
+
+   rte_spinlock_lock(&lock_entry_lock);
+
+   TAILQ_FOREACH(le, &lock_entry_list, next) {
+   if (le->port_id == port_id &&
+   le->callback == callback &&
+   le->user_args == user_args)
+   break;
+   }
+
+   if (le == NULL) {
+   le = calloc(1, sizeof(struct lock_entry));
+   if (le == NULL) {
+   rte_spinlock_unlock(&lock_entry_lock);
+   return -ENOMEM;
+   }
+   le->callback = callback;
+   le->port_id = port_id;
+   le->user_args = user_args;
+   TAILQ_INSERT_TAIL(&lock_entry_list, le, next);
+   }
+   le->ref_count++;
+
+   rte_spinlock_unlock(&lock_entry_lock);
+   return 0;
+}
+
+int
+unregister_lock_callback(uint16_t port_id,
+   rte_eth_dev_lock_callback_t callback,
+   void *user_args)
+{
+   struct lock_entry *le;
+   int ret = 0;
+
+   rte_spinlock_lock(&lock_entry_lock);
+
+   TAILQ_FOREACH(le, &lock_entry_list, next) {
+   if (le->port_id == port_id &&
+   le->callback == callback &&
+   le->user_args == user_args)
+   break;
+   }
+
+   if (le != NULL) {
+   le->ref_count--;
+   if (le->ref_count == 0) {
+   TAILQ_REMOVE(&lock_entry_list, le, next);
+   free(le);
+   }
+   } else {
+   ret = -ENOENT;
+   }
+
+   rte_spinlock_unlock(&lock_entry_lock);
+   return ret;
+}
+
+static int clean_lock_callback_one(uint16_t port_id)
+{
+   struct lock_entry *le;
+   int ret = 0;
+
+   TAILQ_FOREACH(le, &lock_entry_list, next) {
+   if (le->port_id == port_id)
+   break;
+   }
+
+   if (le != NULL) {
+   le->ref_count--;
+   if (le->ref_count == 0) {
+   TAILQ_REMOVE(&lock_entry_list, le, next);
+   free(le);
+   }
+   } else {
+   ret = -ENOENT;
+   }
+
+   return ret;
+
+}
+
+void clean_lock_callback(uint16_t port_id)
+{
+   int ret;
+
+   rte_spinlock_lock(&lock_entry_lock);
+
+   for (;;) {
+   ret = clean_lock_callback_one(port_id);
+   if (ret == -ENOENT)
+   break;

[dpdk-dev] [PATCH v8 08/19] net/ixgbe: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 87d2ad090..161a15f05 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1792,6 +1792,9 @@ static int eth_ixgbe_pci_remove(struct rte_pci_device 
*pci_dev)
if (!ethdev)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return rte_eth_dev_release_port_private(ethdev);
+
if (ethdev->data->dev_flags & RTE_ETH_DEV_REPRESENTOR)
return rte_eth_dev_destroy(ethdev, ixgbe_vf_representor_uninit);
else
-- 
2.13.6



[dpdk-dev] [PATCH v8 11/19] net/kni: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/kni/rte_eth_kni.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/kni/rte_eth_kni.c b/drivers/net/kni/rte_eth_kni.c
index ab63ea427..e5679c76a 100644
--- a/drivers/net/kni/rte_eth_kni.c
+++ b/drivers/net/kni/rte_eth_kni.c
@@ -419,6 +419,7 @@ eth_kni_probe(struct rte_vdev_device *vdev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = ð_kni_ops;
+   eth_dev->device = &vdev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -463,6 +464,16 @@ eth_kni_remove(struct rte_vdev_device *vdev)
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(vdev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
eth_kni_dev_stop(eth_dev);
 
internals = eth_dev->data->dev_private;
-- 
2.13.6



[dpdk-dev] [PATCH v8 09/19] net/af_packet: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index ea47abbf8..33ac19de8 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -935,6 +935,7 @@ rte_pmd_af_packet_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -986,6 +987,16 @@ rte_pmd_af_packet_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
internals = eth_dev->data->dev_private;
for (q = 0; q < internals->nb_queues; q++) {
rte_free(internals->rx_queue[q].rd);
-- 
2.13.6



[dpdk-dev] [PATCH v8 12/19] net/null: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/null/rte_eth_null.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c
index 1d2e6b9e9..2f040729b 100644
--- a/drivers/net/null/rte_eth_null.c
+++ b/drivers/net/null/rte_eth_null.c
@@ -623,6 +623,7 @@ rte_pmd_null_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -667,18 +668,31 @@ static int
 rte_pmd_null_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
+   const char *name;
 
if (!dev)
return -EINVAL;
 
+   name = rte_vdev_device_name(dev);
+
PMD_LOG(INFO, "Closing null ethdev on numa socket %u",
rte_socket_id());
 
/* find the ethdev entry */
-   eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+   eth_dev = rte_eth_dev_allocated(name);
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
rte_free(eth_dev->data->dev_private);
 
rte_eth_dev_release_port(eth_dev);
-- 
2.13.6



[dpdk-dev] [PATCH v8 10/19] net/bonding: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index f155ff779..da45ba9ba 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -3062,6 +3062,7 @@ bond_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &default_dev_ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -3168,6 +3169,16 @@ bond_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
RTE_ASSERT(eth_dev->device == &dev->device);
 
internals = eth_dev->data->dev_private;
-- 
2.13.6



[dpdk-dev] [PATCH v8 13/19] net/octeontx: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/octeontx/octeontx_ethdev.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/octeontx/octeontx_ethdev.c 
b/drivers/net/octeontx/octeontx_ethdev.c
index 1eb453b21..497bacdc6 100644
--- a/drivers/net/octeontx/octeontx_ethdev.c
+++ b/drivers/net/octeontx/octeontx_ethdev.c
@@ -1016,6 +1016,7 @@ octeontx_create(struct rte_vdev_device *dev, int port, 
uint8_t evdev,
 
eth_dev->tx_pkt_burst = octeontx_xmit_pkts;
eth_dev->rx_pkt_burst = octeontx_recv_pkts;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1138,6 +1139,18 @@ octeontx_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0) {
+   rte_eth_dev_release_port_private(eth_dev);
+   continue;
+   }
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
nic = octeontx_pmd_priv(eth_dev);
rte_event_dev_stop(nic->evdev);
PMD_INIT_LOG(INFO, "Closing octeontx device %s", octtx_name);
@@ -1148,6 +1161,9 @@ octeontx_remove(struct rte_vdev_device *dev)
rte_event_dev_close(nic->evdev);
}
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
/* Free FC resource */
octeontx_pko_fc_free();
 
-- 
2.13.6



[dpdk-dev] [PATCH v8 15/19] net/softnic: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/softnic/rte_eth_softnic.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/net/softnic/rte_eth_softnic.c 
b/drivers/net/softnic/rte_eth_softnic.c
index 6b3c13e5c..a45a7b0dd 100644
--- a/drivers/net/softnic/rte_eth_softnic.c
+++ b/drivers/net/softnic/rte_eth_softnic.c
@@ -750,6 +750,7 @@ pmd_probe(struct rte_vdev_device *vdev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &pmd_ops;
+   eth_dev->device = &vdev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -803,17 +804,29 @@ pmd_remove(struct rte_vdev_device *vdev)
 {
struct rte_eth_dev *dev = NULL;
struct pmd_internals *p;
+   const char *name;
 
if (!vdev)
return -EINVAL;
 
-   PMD_LOG(INFO, "Removing device \"%s\"",
-   rte_vdev_device_name(vdev));
+   name = rte_vdev_device_name(vdev);
+   PMD_LOG(INFO, "Removing device \"%s\"", name);
 
/* Find the ethdev entry */
-   dev = rte_eth_dev_allocated(rte_vdev_device_name(vdev));
+   dev = rte_eth_dev_allocated(name);
if (dev == NULL)
return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(vdev)) == 0)
+   return rte_eth_dev_release_port_private(dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
p = dev->data->dev_private;
 
/* Free device data structures*/
-- 
2.13.6



[dpdk-dev] [PATCH v8 14/19] net/pcap: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/pcap/rte_eth_pcap.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/pcap/rte_eth_pcap.c b/drivers/net/pcap/rte_eth_pcap.c
index 6bd4a7d79..6cc20c2b2 100644
--- a/drivers/net/pcap/rte_eth_pcap.c
+++ b/drivers/net/pcap/rte_eth_pcap.c
@@ -925,6 +925,7 @@ pmd_pcap_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1016,6 +1017,7 @@ static int
 pmd_pcap_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
+   const char *name;
 
PMD_LOG(INFO, "Closing pcap ethdev on numa socket %d",
rte_socket_id());
@@ -1023,11 +1025,22 @@ pmd_pcap_remove(struct rte_vdev_device *dev)
if (!dev)
return -1;
 
+   name = rte_vdev_device_name(dev);
/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+   eth_dev = rte_eth_dev_allocated(name);
if (eth_dev == NULL)
return -1;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
rte_free(eth_dev->data->dev_private);
 
rte_eth_dev_release_port(eth_dev);
-- 
2.13.6



[dpdk-dev] [PATCH v8 17/19] net/vhost: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
---
 drivers/net/vhost/rte_eth_vhost.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index ba9d768a0..f773711b4 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1353,6 +1353,7 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1435,6 +1436,16 @@ rte_pmd_vhost_remove(struct rte_vdev_device *dev)
if (eth_dev == NULL)
return -ENODEV;
 
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
+
eth_dev_close(eth_dev);
 
rte_free(vring_states[eth_dev->data->port_id]);
-- 
2.13.6



[dpdk-dev] [PATCH v8 16/19] net/tap: enable port detach on secondary process

2018-07-01 Thread Qi Zhang
Previously, detaching a port on a secondary process would mess up the
primary process and prevent the same device from being attached again.
By taking advantage of rte_eth_dev_release_port_private, we can support
this with a minor change.

Signed-off-by: Qi Zhang 
Acked-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index df396bfde..bb5f20b01 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1759,6 +1759,7 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
}
/* TODO: request info from primary to set up Rx and Tx */
eth_dev->dev_ops = &ops;
+   eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
@@ -1827,12 +1828,24 @@ rte_pmd_tap_remove(struct rte_vdev_device *dev)
 {
struct rte_eth_dev *eth_dev = NULL;
struct pmd_internals *internals;
+   const char *name;
int i;
 
+   name = rte_vdev_device_name(dev);
/* find the ethdev entry */
-   eth_dev = rte_eth_dev_allocated(rte_vdev_device_name(dev));
+   eth_dev = rte_eth_dev_allocated(name);
if (!eth_dev)
-   return 0;
+   return -ENODEV;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   /* detach device on local process only */
+   if (strlen(rte_vdev_device_args(dev)) == 0)
+   return rte_eth_dev_release_port_private(eth_dev);
+   /**
+* else this is a private device for current process
+* so continue with normal detach scenario
+*/
+   }
 
internals = eth_dev->data->dev_private;
 
-- 
2.13.6



[dpdk-dev] [PATCH v8 18/19] examples/multi_process: add hotplug sample

2018-07-01 Thread Qi Zhang
The sample code demonstrates device (ethdev only) management in a
multi-process environment. A user can attach/detach a device on the
primary process and see it synced on secondary processes automatically;
a user can also lock a device to prevent it from being detached, or
unlock it to go back to the default behaviour.

How to start?
./hotplug_mp --proc-type=auto

Command Line Example:

>help
>list

/* attach a af_packet vdev */
>attach net_af_packet,iface=eth0

/* detach port 0 */
>detach 0

/* attach a private af_packet vdev (secondary process only)*/
>attachp net_af_packet,iface=eth0

/* detach a private device (secondary process only) */
>detachp 0

/* lock port 0 */
>lock 0

/* unlock port 0 */
>unlock 0
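
The lock/unlock semantics described above boil down to a per-port flag consulted on detach. A minimal, hypothetical sketch of that bookkeeping (plain C, not the DPDK API):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PORTS 8

/* One lock flag per port; a locked port refuses detach until unlocked. */
static bool locked[MAX_PORTS];

static void lock_port(int port)   { locked[port] = true; }
static void unlock_port(int port) { locked[port] = false; }

/* Returns 0 on successful detach, -1 if the port is currently locked. */
static int detach_port(int port)
{
	if (locked[port])
		return -1; /* detach denied while the lock is held */
	return 0;      /* proceed with the real detach here */
}
```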

Signed-off-by: Qi Zhang 
---
 examples/multi_process/Makefile  |   1 +
 examples/multi_process/hotplug_mp/Makefile   |  23 ++
 examples/multi_process/hotplug_mp/commands.c | 356 +++
 examples/multi_process/hotplug_mp/commands.h |  10 +
 examples/multi_process/hotplug_mp/main.c |  41 +++
 5 files changed, 431 insertions(+)
 create mode 100644 examples/multi_process/hotplug_mp/Makefile
 create mode 100644 examples/multi_process/hotplug_mp/commands.c
 create mode 100644 examples/multi_process/hotplug_mp/commands.h
 create mode 100644 examples/multi_process/hotplug_mp/main.c

diff --git a/examples/multi_process/Makefile b/examples/multi_process/Makefile
index a6708b7e4..b76b02fcb 100644
--- a/examples/multi_process/Makefile
+++ b/examples/multi_process/Makefile
@@ -13,5 +13,6 @@ include $(RTE_SDK)/mk/rte.vars.mk
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += client_server_mp
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += simple_mp
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += symmetric_mp
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += hotplug_mp
 
 include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/multi_process/hotplug_mp/Makefile 
b/examples/multi_process/hotplug_mp/Makefile
new file mode 100644
index 0..c09a57bfa
--- /dev/null
+++ b/examples/multi_process/hotplug_mp/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2010-2014 Intel Corporation
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = hotplug_mp
+
+# all source are stored in SRCS-y
+SRCS-y := main.c commands.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/multi_process/hotplug_mp/commands.c 
b/examples/multi_process/hotplug_mp/commands.c
new file mode 100644
index 0..31f9e2e15
--- /dev/null
+++ b/examples/multi_process/hotplug_mp/commands.c
@@ -0,0 +1,356 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/**/
+
+struct cmd_help_result {
+   cmdline_fixed_string_t help;
+};
+
+static void cmd_help_parsed(__attribute__((unused)) void *parsed_result,
+   struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   cmdline_printf(cl,
+  "commands:\n"
+  "- attach \n"
+  "- detach \n"
+  "- attachp \n"
+  "- detachp \n"
+  "- lock \n"
+  "- unlock \n"
+  "- list\n\n");
+}
+
+cmdline_parse_token_string_t cmd_help_help =
+   TOKEN_STRING_INITIALIZER(struct cmd_help_result, help, "help");
+
+cmdline_parse_inst_t cmd_help = {
+   .f = cmd_help_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "show help",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_help_help,
+   NULL,
+   },
+};
+
+/**/
+
+struct cmd_quit_result {
+   cmdline_fixed_string_t quit;
+};
+
+static void cmd_quit_parsed(__attribute__((unused)) void *parsed_result,
+   struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   cmdline_quit(cl);
+}
+
+cmdline_parse_token_string_t cmd_quit_quit =
+   TOKEN_STRING_INITIALIZER(struct cmd_quit_result, quit, "quit");
+
+cmdline_parse_inst_t cmd_quit = {
+   .f = cmd_quit_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "quit",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_quit_quit,
+   NULL,
+   },
+};
+
+/**/
+
+struct cmd_list_result {
+   cmdline_fixed_string_t list;
+};
+
+static void cmd_

[dpdk-dev] [PATCH v8 19/19] doc: update release notes for multi process hotplug

2018-07-01 Thread Qi Zhang
Update release notes for the new multi process hotplug feature.

Signed-off-by: Qi Zhang 
---
 doc/guides/rel_notes/release_18_08.rst | 20 
 1 file changed, 20 insertions(+)

diff --git a/doc/guides/rel_notes/release_18_08.rst 
b/doc/guides/rel_notes/release_18_08.rst
index bc0124295..93a813340 100644
--- a/doc/guides/rel_notes/release_18_08.rst
+++ b/doc/guides/rel_notes/release_18_08.rst
@@ -46,6 +46,21 @@ New Features
   Flow API support has been added to CXGBE Poll Mode Driver to offload
   flows to Chelsio T5/T6 NICs.
 
+* **Support etherdev multi-process hotplug.**
+
+  Hotplug and hot-unplug for ethdev devices will now be supported in
+  multi-process scenarios. Any ethdev devices created in the primary
+  process will be regarded as shared and will be available for all DPDK
+  processes, while secondary processes will have a choice between adding
+  a private (non-shared) or a shared device. Synchronization between
+  processes will be done using DPDK IPC.
+
+* **Support etherdev locking.**
+
+  Application can now lock an ethernet device to prevent unexpected device
+  removal. Devices can either be locked unconditionally, or an application
+  can register a callback to be invoked before unplug, in order to perform
+  cleanup before the device is released (or to get a chance to deny the unplug).
 
 API Changes
 ---
@@ -60,6 +75,11 @@ API Changes
Also, make sure to start the actual text at the margin.
=
 
+* ethdev: scope of rte_eth_dev_attach and rte_eth_dev_detach is extended.
+
+  In primary-secondary process model, ``rte_eth_dev_attach`` will guarantee
+  that device be attached on all processes, while ``rte_eth_dev_detach``
+  will guarantee device be detached on all processes.
 
 ABI Changes
 ---
-- 
2.13.6



[dpdk-dev] [PATCH v5 1/9] vhost: advertise support in-order feature

2018-07-01 Thread Marvin Liu
If devices always use descriptors in the same order in which they have
been made available, they can offer the VIRTIO_F_IN_ORDER feature. If
negotiated, this knowledge allows devices to notify the virtio driver
of the use of a batch of buffers by only writing the used ring index.

The vhost-user device supports this feature by default. If vhost dequeue
zero copy is enabled, VIRTIO_F_IN_ORDER should be disabled, as vhost
can't ensure that descriptors returned from the NIC are in order.
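
The dequeue-zero-copy restriction above amounts to clearing one bit in a 64-bit feature mask. A self-contained sketch of that arithmetic, using the bit number the patch defines (VIRTIO_F_IN_ORDER = 35):

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_F_IN_ORDER 35 /* feature bit number, as defined in the patch */

/* Clears VIRTIO_F_IN_ORDER from a feature mask, mirroring what
 * rte_vhost_driver_register() does when dequeue zero copy is enabled. */
static uint64_t drop_in_order(uint64_t features)
{
	return features & ~(1ULL << VIRTIO_F_IN_ORDER);
}
```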

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 0399c37bc..d63031747 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -853,6 +853,12 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
vsocket->supported_features = VIRTIO_NET_SUPPORTED_FEATURES;
vsocket->features   = VIRTIO_NET_SUPPORTED_FEATURES;
 
+   /* Dequeue zero copy can't assure descriptors returned in order */
+   if (vsocket->dequeue_zero_copy) {
+   vsocket->supported_features &= ~(1ULL << VIRTIO_F_IN_ORDER);
+   vsocket->features &= ~(1ULL << VIRTIO_F_IN_ORDER);
+   }
+
if (!(flags & RTE_VHOST_USER_IOMMU_SUPPORT)) {
vsocket->supported_features &= ~(1ULL << 
VIRTIO_F_IOMMU_PLATFORM);
vsocket->features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 786a74f64..3437b996b 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -191,6 +191,13 @@ struct vhost_msg {
  #define VIRTIO_F_VERSION_1 32
 #endif
 
+/*
+ * Available and used descs are in same order
+ */
+#ifndef VIRTIO_F_IN_ORDER
+#define VIRTIO_F_IN_ORDER  35
+#endif
+
 /* Features supported by this builtin vhost-user net driver. */
 #define VIRTIO_NET_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
(1ULL << VIRTIO_F_ANY_LAYOUT) | \
@@ -214,7 +221,8 @@ struct vhost_msg {
(1ULL << VIRTIO_NET_F_GUEST_ECN) | \
(1ULL << VIRTIO_RING_F_INDIRECT_DESC) | \
(1ULL << VIRTIO_RING_F_EVENT_IDX) | \
-   (1ULL << VIRTIO_NET_F_MTU) | \
+   (1ULL << VIRTIO_NET_F_MTU)  | \
+   (1ULL << VIRTIO_F_IN_ORDER) | \
(1ULL << VIRTIO_F_IOMMU_PLATFORM))
 
 
-- 
2.17.0



[dpdk-dev] [PATCH v5 0/9] support in-order feature

2018-07-01 Thread Marvin Liu
In the latest virtio spec, a new feature bit VIRTIO_F_IN_ORDER was
introduced. When this feature has been negotiated, the virtio driver
will use descriptors in ring order: starting from offset 0 in the
table, and wrapping around at the end of the table. Vhost devices will
always use descriptors in the same order in which they have been made
available. This can reduce virtio accesses to the used ring.

Based on the updated virtio spec, this series realizes an IN_ORDER
prototype in the virtio driver. Since new [RT]x paths are added to the
selection logic, two new parameters, mrg_rx and in_order, are also added
to the virtio-user vdev parameters list. This allows the user to
configure feature bits and thus impact [RT]x path selection.

Performance of virtio user with IN_ORDER feature:

Platform: Purley
CPU: Intel(R) Xeon(R) Platinum 8160 CPU @ 2.10GHz
DPDK baseline: 18.05
Setup: testpmd with vhost vdev + testpmd with virtio vdev

+--------------+---------+----------------+----------------+
|Vhost->Virtio |1 Queue  |2 Queues        |4 Queues        |
+--------------+---------+----------------+----------------+
|Inorder       |12.0Mpps |24.2Mpps        |26.0Mpps        |
|Normal        |12.1Mpps |18.5Mpps        |18.9Mpps        |
+--------------+---------+----------------+----------------+

+--------------+---------+----------------+----------------+
|Virtio->Vhost |1 Queue  |2 Queues        |4 Queues        |
+--------------+---------+----------------+----------------+
|Inorder       |13.8Mpps |10.7 ~ 15.2Mpps |11.5Mpps        |
|Normal        |13.3Mpps |9.8 ~ 14Mpps    |10.5Mpps        |
+--------------+---------+----------------+----------------+

+--------------+---------+----------------+----------------+
|Loopback      |1 Queue  |2 Queues        |4 Queues        |
+--------------+---------+----------------+----------------+
|Inorder       |7.4Mpps  |9.1 ~ 11.6Mpps  |10.5 ~ 11.3Mpps |
|Normal        |7.5Mpps  |7.7 ~ 9.0Mpps   |7.6 ~ 7.8Mpps   |
+--------------+---------+----------------+----------------+

v5:
- disable simple Tx when in-order negotiated
- doc update

v4:
- disable simple [RT]x function for ARM
- squash doc update into relevant patches
- fix git-check-log and checkpatch errors

v3:
- refine [RT]x function selection logic
- fix in-order mergeable packets index error
- combine unsupported features mask patch
- doc virtio in-order update
- fix checkpatch error

v2:
- merge to latest dpdk-net-virtio 
- not use in_direct for normal xmit packets
- update available ring for each descriptor
- clean up IN_ORDER xmit function
- unmask feature bits when in_order or mrg_rxbuf is disabled
- extract common part between IN_ORDER and normal functions
- update performance result

Marvin Liu (9):
  vhost: advertise support in-order feature
  net/virtio: add in-order feature bit definition
  net/virtio-user: add unsupported features mask
  net/virtio-user: add mrg-rxbuf and in-order vdev parameters
  net/virtio: free in-order descriptors before device start
  net/virtio: extract common part for in-order functions
  net/virtio: support in-order Rx and Tx
  net/virtio: add in-order Rx/Tx into selection
  net/virtio: advertise support in-order feature

 doc/guides/nics/virtio.rst|  23 +-
 drivers/net/virtio/virtio_ethdev.c|  32 +-
 drivers/net/virtio/virtio_ethdev.h|   7 +
 drivers/net/virtio/virtio_pci.h   |   8 +
 drivers/net/virtio/virtio_rxtx.c  | 639 --
 .../net/virtio/virtio_user/virtio_user_dev.c  |  30 +-
 .../net/virtio/virtio_user/virtio_user_dev.h  |   4 +-
 drivers/net/virtio/virtio_user_ethdev.c   |  47 +-
 drivers/net/virtio/virtqueue.c|   8 +
 drivers/net/virtio/virtqueue.h|   2 +
 lib/librte_vhost/socket.c |   6 +
 lib/librte_vhost/vhost.h  |  10 +-
 12 files changed, 736 insertions(+), 80 deletions(-)

-- 
2.17.0



[dpdk-dev] [PATCH v5 4/9] net/virtio-user: add mrg-rxbuf and in-order vdev parameters

2018-07-01 Thread Marvin Liu
Add parameters for configuring the VIRTIO_NET_F_MRG_RXBUF and
VIRTIO_F_IN_ORDER feature bits. If a feature is disabled, the
corresponding unsupported feature bit is also updated.

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index a42d1bb30..46e292c4d 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -331,3 +331,13 @@ The user can specify below argument in devargs.
 driver, and works as a HW vhost backend. This argument is used to specify
 a virtio device needs to work in vDPA mode.
 (Default: 0 (disabled))
+
+#. ``mrg_rxbuf``:
+
+It is used to enable virtio device mergeable Rx buffer feature.
+(Default: 1 (enabled))
+
+#. ``in_order``:
+
+It is used to enable virtio device in-order feature.
+(Default: 1 (enabled))
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index e0e956888..953c46055 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -375,7 +375,8 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
 
 int
 virtio_user_dev_init(struct virtio_user_dev *dev, char *path, int queues,
-int cq, int queue_size, const char *mac, char **ifname)
+int cq, int queue_size, const char *mac, char **ifname,
+int mrg_rxbuf, int in_order)
 {
pthread_mutex_init(&dev->mutex, NULL);
snprintf(dev->path, PATH_MAX, "%s", path);
@@ -420,6 +421,16 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
dev->device_features = VIRTIO_USER_SUPPORTED_FEATURES;
}
 
+   if (!mrg_rxbuf) {
+   dev->device_features &= ~(1ull << VIRTIO_NET_F_MRG_RXBUF);
+   dev->unsupported_features |= (1ull << VIRTIO_NET_F_MRG_RXBUF);
+   }
+
+   if (!in_order) {
+   dev->device_features &= ~(1ull << VIRTIO_F_IN_ORDER);
+   dev->unsupported_features |= (1ull << VIRTIO_F_IN_ORDER);
+   }
+
if (dev->mac_specified) {
dev->device_features |= (1ull << VIRTIO_NET_F_MAC);
} else {
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h 
b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index c23ddfcc5..d6e0e137b 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -48,7 +48,8 @@ int is_vhost_user_by_type(const char *path);
 int virtio_user_start_device(struct virtio_user_dev *dev);
 int virtio_user_stop_device(struct virtio_user_dev *dev);
 int virtio_user_dev_init(struct virtio_user_dev *dev, char *path, int queues,
-int cq, int queue_size, const char *mac, char 
**ifname);
+int cq, int queue_size, const char *mac, char **ifname,
+int mrg_rxbuf, int in_order);
 void virtio_user_dev_uninit(struct virtio_user_dev *dev);
 void virtio_user_handle_cq(struct virtio_user_dev *dev, uint16_t queue_idx);
 uint8_t virtio_user_handle_mq(struct virtio_user_dev *dev, uint16_t q_pairs);
diff --git a/drivers/net/virtio/virtio_user_ethdev.c 
b/drivers/net/virtio/virtio_user_ethdev.c
index 08fa4bd47..fcd30251f 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -358,8 +358,12 @@ static const char *valid_args[] = {
VIRTIO_USER_ARG_QUEUE_SIZE,
 #define VIRTIO_USER_ARG_INTERFACE_NAME "iface"
VIRTIO_USER_ARG_INTERFACE_NAME,
-#define VIRTIO_USER_ARG_SERVER_MODE "server"
+#define VIRTIO_USER_ARG_SERVER_MODE"server"
VIRTIO_USER_ARG_SERVER_MODE,
+#define VIRTIO_USER_ARG_MRG_RXBUF  "mrg_rxbuf"
+   VIRTIO_USER_ARG_MRG_RXBUF,
+#define VIRTIO_USER_ARG_IN_ORDER   "in_order"
+   VIRTIO_USER_ARG_IN_ORDER,
NULL
 };
 
@@ -464,6 +468,8 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
uint64_t cq = VIRTIO_USER_DEF_CQ_EN;
uint64_t queue_size = VIRTIO_USER_DEF_Q_SZ;
uint64_t server_mode = VIRTIO_USER_DEF_SERVER_MODE;
+   uint64_t mrg_rxbuf = 1;
+   uint64_t in_order = 1;
char *path = NULL;
char *ifname = NULL;
char *mac_addr = NULL;
@@ -563,6 +569,24 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
goto end;
}
 
+   if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_MRG_RXBUF) == 1) {
+   if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_MRG_RXBUF,
+  &get_integer_arg, &mrg_rxbuf) < 0) {
+   PMD_INIT_LOG(ERR, "error to parse %s",
+VIRTIO_USER_ARG_MRG_RXBUF);
+   goto end;
+   }
+   }
+
+   if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_IN_ORDER) == 1) {
+   if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_IN_ORDER,
+  

[dpdk-dev] [PATCH v5 2/9] net/virtio: add in-order feature bit definition

2018-07-01 Thread Marvin Liu
If VIRTIO_F_IN_ORDER has been negotiated, the driver will use
descriptors in ring order: starting from offset 0 in the table, and
wrapping around at the end of the table. Also introduce the
use_inorder_[rt]x flags for selection of the IN_ORDER [RT]x handlers.

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index a28ba8339..77f805df6 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -121,6 +121,12 @@ struct virtnet_ctl;
 #define VIRTIO_TRANSPORT_F_START 28
 #define VIRTIO_TRANSPORT_F_END   34
 
+/*
+ * Inorder feature indicates that all buffers are used by the device
+ * in the same order in which they have been made available.
+ */
+#define VIRTIO_F_IN_ORDER 35
+
 /* The Guest publishes the used index for which it expects an interrupt
  * at the end of the avail ring. Host should ignore the avail->flags field. */
 /* The Host publishes the avail index for which it expects a kick
@@ -233,6 +239,8 @@ struct virtio_hw {
uint8_t modern;
uint8_t use_simple_rx;
uint8_t use_simple_tx;
+   uint8_t use_inorder_rx;
+   uint8_t use_inorder_tx;
uint16_tport_id;
uint8_t mac_addr[ETHER_ADDR_LEN];
uint32_tnotify_off_multiplier;
diff --git a/drivers/net/virtio/virtio_user_ethdev.c 
b/drivers/net/virtio/virtio_user_ethdev.c
index 1c102ca72..8747cbf94 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -441,6 +441,8 @@ virtio_user_eth_dev_alloc(struct rte_vdev_device *vdev)
hw->modern   = 0;
hw->use_simple_rx = 0;
hw->use_simple_tx = 0;
+   hw->use_inorder_rx = 0;
+   hw->use_inorder_tx = 0;
hw->virtio_user_dev = dev;
return eth_dev;
 }
-- 
2.17.0



[dpdk-dev] [PATCH v5 5/9] net/virtio: free in-order descriptors before device start

2018-07-01 Thread Marvin Liu
Add a new function for freeing IN_ORDER descriptors. Since descriptors
are allocated and freed sequentially when the IN_ORDER feature is
negotiated, there is no need to use a chain to manage freed
descriptors; updating an index is enough.
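
The index-only free can be modeled with a tiny self-contained ring struct (a sketch of `vq_ring_free_inorder()`, with field names simplified):

```c
#include <assert.h>
#include <stdint.h>

struct ring {
	uint16_t nentries;      /* ring size, a power of two */
	uint16_t free_cnt;      /* number of free descriptors */
	uint16_t desc_tail_idx; /* tail of the free region */
};

/* Mirrors vq_ring_free_inorder(): no free-chain walk, just bump the
 * free counter and record the new tail, wrapping with a power-of-two
 * mask since consumption is strictly sequential. */
static void ring_free_inorder(struct ring *r, uint16_t desc_idx, uint16_t num)
{
	r->free_cnt += num;
	r->desc_tail_idx = desc_idx & (r->nentries - 1);
}
```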

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 92fab2174..0bca29855 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -47,6 +47,13 @@ virtio_dev_rx_queue_done(void *rxq, uint16_t offset)
return VIRTQUEUE_NUSED(vq) >= offset;
 }
 
+void
+vq_ring_free_inorder(struct virtqueue *vq, uint16_t desc_idx, uint16_t num)
+{
+   vq->vq_free_cnt += num;
+   vq->vq_desc_tail_idx = desc_idx & (vq->vq_nentries - 1);
+}
+
 void
 vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
 {
diff --git a/drivers/net/virtio/virtqueue.c b/drivers/net/virtio/virtqueue.c
index a7d0a9cbe..56a77cc71 100644
--- a/drivers/net/virtio/virtqueue.c
+++ b/drivers/net/virtio/virtqueue.c
@@ -74,6 +74,14 @@ virtqueue_rxvq_flush(struct virtqueue *vq)
desc_idx = used_idx;
rte_pktmbuf_free(vq->sw_ring[desc_idx]);
vq->vq_free_cnt++;
+   } else if (hw->use_inorder_rx) {
+   desc_idx = (uint16_t)uep->id;
+   dxp = &vq->vq_descx[desc_idx];
+   if (dxp->cookie != NULL) {
+   rte_pktmbuf_free(dxp->cookie);
+   dxp->cookie = NULL;
+   }
+   vq_ring_free_inorder(vq, desc_idx, 1);
} else {
desc_idx = (uint16_t)uep->id;
dxp = &vq->vq_descx[desc_idx];
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 14364f356..26518ed98 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -306,6 +306,8 @@ virtio_get_queue_type(struct virtio_hw *hw, uint16_t 
vtpci_queue_idx)
 #define VIRTQUEUE_NUSED(vq) ((uint16_t)((vq)->vq_ring.used->idx - 
(vq)->vq_used_cons_idx))
 
 void vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx);
+void vq_ring_free_inorder(struct virtqueue *vq, uint16_t desc_idx,
+ uint16_t num);
 
 static inline void
 vq_update_avail_idx(struct virtqueue *vq)
-- 
2.17.0



[dpdk-dev] [PATCH v5 3/9] net/virtio-user: add unsupported features mask

2018-07-01 Thread Marvin Liu
This patch introduces an unsupported features mask for the virtio-user
device. In virtio-user server mode, when reconnecting, virtio-user will
retrieve the vhost device features as a base and then mask out the
unsupported features.
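
The reconnect path then reduces to one bitwise operation: take the backend's advertised features as the base and clear the locally tracked unsupported bits. A minimal sketch of that step (`renegotiate` is an illustrative name, not a DPDK function):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the patch's "device_features &= ~unsupported_features":
 * keep only the backend bits that this device can actually support. */
static uint64_t renegotiate(uint64_t backend_features, uint64_t unsupported)
{
	return backend_features & ~unsupported;
}
```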

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 4322527f2..e0e956888 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -384,6 +384,7 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
dev->queue_pairs = 1; /* mq disabled by default */
dev->queue_size = queue_size;
dev->mac_specified = 0;
+   dev->unsupported_features = 0;
parse_mac(dev, mac);
 
if (*ifname) {
@@ -419,10 +420,12 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
dev->device_features = VIRTIO_USER_SUPPORTED_FEATURES;
}
 
-   if (dev->mac_specified)
+   if (dev->mac_specified) {
dev->device_features |= (1ull << VIRTIO_NET_F_MAC);
-   else
+   } else {
dev->device_features &= ~(1ull << VIRTIO_NET_F_MAC);
+   dev->unsupported_features |= (1ull << VIRTIO_NET_F_MAC);
+   }
 
if (cq) {
/* device does not really need to know anything about CQ,
@@ -437,6 +440,14 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
dev->device_features &= ~(1ull << VIRTIO_NET_F_GUEST_ANNOUNCE);
dev->device_features &= ~(1ull << VIRTIO_NET_F_MQ);
dev->device_features &= ~(1ull << VIRTIO_NET_F_CTRL_MAC_ADDR);
+   dev->unsupported_features |= (1ull << VIRTIO_NET_F_CTRL_VQ);
+   dev->unsupported_features |= (1ull << VIRTIO_NET_F_CTRL_RX);
+   dev->unsupported_features |= (1ull << VIRTIO_NET_F_CTRL_VLAN);
+   dev->unsupported_features |=
+   (1ull << VIRTIO_NET_F_GUEST_ANNOUNCE);
+   dev->unsupported_features |= (1ull << VIRTIO_NET_F_MQ);
+   dev->unsupported_features |=
+   (1ull << VIRTIO_NET_F_CTRL_MAC_ADDR);
}
 
/* The backend will not report this feature, we add it explicitly */
@@ -444,6 +455,7 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
dev->device_features |= (1ull << VIRTIO_NET_F_STATUS);
 
dev->device_features &= VIRTIO_USER_SUPPORTED_FEATURES;
+   dev->unsupported_features |= ~VIRTIO_USER_SUPPORTED_FEATURES;
 
if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
virtio_user_mem_event_cb, dev)) {
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h 
b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index d2d4cb825..c23ddfcc5 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -33,6 +33,7 @@ struct virtio_user_dev {
   * and will be sync with device
   */
uint64_tdevice_features; /* supported features by device */
+   uint64_tunsupported_features; /* unsupported features mask */
uint8_t status;
uint16_tport_id;
uint8_t mac_addr[ETHER_ADDR_LEN];
diff --git a/drivers/net/virtio/virtio_user_ethdev.c 
b/drivers/net/virtio/virtio_user_ethdev.c
index 8747cbf94..08fa4bd47 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -30,7 +30,6 @@ virtio_user_server_reconnect(struct virtio_user_dev *dev)
int ret;
int flag;
int connectfd;
-   uint64_t features = dev->device_features;
struct rte_eth_dev *eth_dev = &rte_eth_devices[dev->port_id];
 
connectfd = accept(dev->listenfd, NULL, NULL);
@@ -45,15 +44,8 @@ virtio_user_server_reconnect(struct virtio_user_dev *dev)
return -1;
}
 
-   features &= ~dev->device_features;
-   /* For following bits, vhost-user doesn't really need to know */
-   features &= ~(1ull << VIRTIO_NET_F_MAC);
-   features &= ~(1ull << VIRTIO_NET_F_CTRL_VLAN);
-   features &= ~(1ull << VIRTIO_NET_F_CTRL_MAC_ADDR);
-   features &= ~(1ull << VIRTIO_NET_F_STATUS);
-   if (features)
-   PMD_INIT_LOG(ERR, "WARNING: Some features 0x%" PRIx64 " are not 
supported by vhost-user!",
-features);
+   /* mask out vhost-user unsupported features */
+   dev->device_features &= ~(dev->unsupported_features);
 
dev->features &= dev->device_features;
 
-- 
2.17.0



[dpdk-dev] [PATCH v5 6/9] net/virtio: extract common part for in-order functions

2018-07-01 Thread Marvin Liu
The IN_ORDER virtio-user Tx function supports Tx checksum offloading
and TSO, which are also supported by the normal Tx function. So extract
the common part into a separate function for reuse.

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 0bca29855..e9b1b496e 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -246,6 +246,55 @@ tx_offload_enabled(struct virtio_hw *hw)
(var) = (val);  \
 } while (0)
 
+static inline void
+virtqueue_xmit_offload(struct virtio_net_hdr *hdr,
+   struct rte_mbuf *cookie,
+   int offload)
+{
+   if (offload) {
+   if (cookie->ol_flags & PKT_TX_TCP_SEG)
+   cookie->ol_flags |= PKT_TX_TCP_CKSUM;
+
+   switch (cookie->ol_flags & PKT_TX_L4_MASK) {
+   case PKT_TX_UDP_CKSUM:
+   hdr->csum_start = cookie->l2_len + cookie->l3_len;
+   hdr->csum_offset = offsetof(struct udp_hdr,
+   dgram_cksum);
+   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+   break;
+
+   case PKT_TX_TCP_CKSUM:
+   hdr->csum_start = cookie->l2_len + cookie->l3_len;
+   hdr->csum_offset = offsetof(struct tcp_hdr, cksum);
+   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
+   break;
+
+   default:
+   ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
+   ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
+   ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
+   break;
+   }
+
+   /* TCP Segmentation Offload */
+   if (cookie->ol_flags & PKT_TX_TCP_SEG) {
+   virtio_tso_fix_cksum(cookie);
+   hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
+   VIRTIO_NET_HDR_GSO_TCPV6 :
+   VIRTIO_NET_HDR_GSO_TCPV4;
+   hdr->gso_size = cookie->tso_segsz;
+   hdr->hdr_len =
+   cookie->l2_len +
+   cookie->l3_len +
+   cookie->l4_len;
+   } else {
+   ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
+   ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
+   ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
+   }
+   }
+}
+
 static inline void
 virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
   uint16_t needed, int use_indirect, int can_push)
@@ -315,49 +364,7 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
idx = start_dp[idx].next;
}
 
-   /* Checksum Offload / TSO */
-   if (offload) {
-   if (cookie->ol_flags & PKT_TX_TCP_SEG)
-   cookie->ol_flags |= PKT_TX_TCP_CKSUM;
-
-   switch (cookie->ol_flags & PKT_TX_L4_MASK) {
-   case PKT_TX_UDP_CKSUM:
-   hdr->csum_start = cookie->l2_len + cookie->l3_len;
-   hdr->csum_offset = offsetof(struct udp_hdr,
-   dgram_cksum);
-   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-   break;
-
-   case PKT_TX_TCP_CKSUM:
-   hdr->csum_start = cookie->l2_len + cookie->l3_len;
-   hdr->csum_offset = offsetof(struct tcp_hdr, cksum);
-   hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
-   break;
-
-   default:
-   ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
-   break;
-   }
-
-   /* TCP Segmentation Offload */
-   if (cookie->ol_flags & PKT_TX_TCP_SEG) {
-   virtio_tso_fix_cksum(cookie);
-   hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ?
-   VIRTIO_NET_HDR_GSO_TCPV6 :
-   VIRTIO_NET_HDR_GSO_TCPV4;
-   hdr->gso_size = cookie->tso_segsz;
-   hdr->hdr_len =
-   cookie->l2_len +
-   cookie->l3_len +
-   cookie->l4_len;
-   } else {
-   ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
-   ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
-   }
-   }
+   virtqueue_xmit_offload(hdr, cookie, offload);
 
do {
st

[dpdk-dev] [PATCH v5 7/9] net/virtio: support in-order Rx and Tx

2018-07-01 Thread Marvin Liu
The IN_ORDER Rx function depends on the mergeable buffer feature.
Descriptor allocation and freeing are done in bulk.

Virtio dequeue logic:
dequeue_burst_rx(burst mbufs)
for (each mbuf b) {
if (b need merge) {
merge remained mbufs
add merged mbuf to return mbufs list
} else {
add mbuf to return mbufs list
}
}
if (last mbuf c need merge) {
dequeue_burst_rx(required mbufs)
merge last mbuf c
}
refill_avail_ring_bulk()
update_avail_ring()
return mbufs list

The IN_ORDER Tx function can support offloading features. Packets that
match the "can_push" option will be handled by the simple xmit
function. Packets that can't match "can_push" will be handled by the
original xmit function with the in-order flag.

Virtio enqueue logic:
xmit_cleanup(used descs)
for (each xmit mbuf b) {
if (b can inorder xmit) {
add mbuf b to inorder burst list
continue
} else {
xmit inorder burst list
xmit mbuf b by original function
}
}
if (inorder burst list not empty) {
xmit inorder burst list
}
update_avail_ring()

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index bb40064ea..cd8070248 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -83,9 +83,15 @@ uint16_t virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
 
+uint16_t virtio_recv_mergeable_pkts_inorder(void *rx_queue,
+   struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
+
 uint16_t virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);
 
+uint16_t virtio_xmit_pkts_inorder(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+
 uint16_t virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
 
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index e9b1b496e..6394071b8 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -122,6 +122,44 @@ virtqueue_dequeue_burst_rx(struct virtqueue *vq, struct rte_mbuf **rx_pkts,
return i;
 }
 
+static uint16_t
+virtqueue_dequeue_rx_inorder(struct virtqueue *vq,
+   struct rte_mbuf **rx_pkts,
+   uint32_t *len,
+   uint16_t num)
+{
+   struct vring_used_elem *uep;
+   struct rte_mbuf *cookie;
+   uint16_t used_idx = 0;
+   uint16_t i;
+
+   if (unlikely(num == 0))
+   return 0;
+
+   for (i = 0; i < num; i++) {
+   used_idx = vq->vq_used_cons_idx & (vq->vq_nentries - 1);
+   /* Desc idx same as used idx */
+   uep = &vq->vq_ring.used->ring[used_idx];
+   len[i] = uep->len;
+   cookie = (struct rte_mbuf *)vq->vq_descx[used_idx].cookie;
+
+   if (unlikely(cookie == NULL)) {
+   PMD_DRV_LOG(ERR, "vring descriptor with no mbuf cookie at %u",
+   vq->vq_used_cons_idx);
+   break;
+   }
+
+   rte_prefetch0(cookie);
+   rte_packet_prefetch(rte_pktmbuf_mtod(cookie, void *));
+   rx_pkts[i]  = cookie;
+   vq->vq_used_cons_idx++;
+   vq->vq_descx[used_idx].cookie = NULL;
+   }
+
+   vq_ring_free_inorder(vq, used_idx, i);
+   return i;
+}
+
 #ifndef DEFAULT_TX_FREE_THRESH
 #define DEFAULT_TX_FREE_THRESH 32
 #endif
@@ -150,6 +188,83 @@ virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
}
 }
 
+/* Cleanup from completed inorder transmits. */
+static void
+virtio_xmit_cleanup_inorder(struct virtqueue *vq, uint16_t num)
+{
+   uint16_t i, used_idx, desc_idx, last_idx;
+   int16_t free_cnt = 0;
+   struct vq_desc_extra *dxp = NULL;
+
+   if (unlikely(num == 0))
+   return;
+
+   for (i = 0; i < num; i++) {
+   struct vring_used_elem *uep;
+
+   used_idx = vq->vq_used_cons_idx & (vq->vq_nentries - 1);
+   uep = &vq->vq_ring.used->ring[used_idx];
+   desc_idx = (uint16_t)uep->id;
+
+   dxp = &vq->vq_descx[desc_idx];
+   vq->vq_used_cons_idx++;
+
+   if (dxp->cookie != NULL) {
+   rte_pktmbuf_free(dxp->cookie);
+   dxp->cookie = NULL;
+   }
+   }
+
+   last_idx = desc_idx + dxp->ndescs - 1;
+   free_cnt = last_idx - vq->vq_desc_tail_idx;
+   if (free_cnt <= 0)
+   free_cnt += vq->vq_nentries;
+
+   vq_ring_free_inorder(vq, 

[dpdk-dev] [PATCH v5 8/9] net/virtio: add in-order Rx/Tx into selection

2018-07-01 Thread Marvin Liu
After the IN_ORDER Rx/Tx paths are added, the Rx/Tx path selection logic
needs to be updated.

Rx path selection logic: if IN_ORDER and mergeable buffers are both enabled,
the IN_ORDER Rx path is selected. If IN_ORDER is enabled while Rx offloads
and mergeable buffers are disabled, the simple Rx path is selected.
Otherwise the normal Rx path is selected.

Tx path selection logic: if IN_ORDER is enabled, the IN_ORDER Tx path is
selected. Otherwise the default Tx path is selected.

Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index 46e292c4d..7c099fb7c 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -201,7 +201,7 @@ The packet transmission flow is:
 Virtio PMD Rx/Tx Callbacks
 --
 
-Virtio driver has 3 Rx callbacks and 2 Tx callbacks.
+Virtio driver has 4 Rx callbacks and 3 Tx callbacks.
 
 Rx callbacks:
 
@@ -215,6 +215,9 @@ Rx callbacks:
Vector version without mergeable Rx buffer support, also fixes the available
ring indexes and uses vector instructions to optimize performance.
 
+#. ``virtio_recv_mergeable_pkts_inorder``:
+   In-order version with mergeable Rx buffer support.
+
 Tx callbacks:
 
 #. ``virtio_xmit_pkts``:
@@ -223,6 +226,8 @@ Tx callbacks:
 #. ``virtio_xmit_pkts_simple``:
Vector version fixes the available ring indexes to optimize performance.
 
+#. ``virtio_xmit_pkts_inorder``:
+   In-order version.
 
 By default, the non-vector callbacks are used:
 
@@ -254,6 +259,12 @@ Example of using the vector version of the virtio poll mode driver in
 
testpmd -l 0-2 -n 4 -- -i --tx-offloads=0x0 --rxq=1 --txq=1 --nb-cores=1
 
+In-order callbacks only work on simulated virtio user vdev.
+
+*   For Rx: If mergeable Rx buffers is enabled and in-order is enabled then
+``virtio_recv_mergeable_pkts_inorder`` is used.
+
+*   For Tx: If in-order is enabled then ``virtio_xmit_pkts_inorder`` is used.
 
 Interrupt mode
 --
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index df50a571a..df7981ddb 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1320,6 +1320,11 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
PMD_INIT_LOG(INFO, "virtio: using simple Rx path on port %u",
eth_dev->data->port_id);
eth_dev->rx_pkt_burst = virtio_recv_pkts_vec;
+   } else if (hw->use_inorder_rx) {
+   PMD_INIT_LOG(INFO,
+   "virtio: using inorder mergeable buffer Rx path on port %u",
+   eth_dev->data->port_id);
+   eth_dev->rx_pkt_burst = &virtio_recv_mergeable_pkts_inorder;
} else if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
PMD_INIT_LOG(INFO,
"virtio: using mergeable buffer Rx path on port %u",
@@ -1335,6 +1340,10 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
PMD_INIT_LOG(INFO, "virtio: using simple Tx path on port %u",
eth_dev->data->port_id);
eth_dev->tx_pkt_burst = virtio_xmit_pkts_simple;
+   } else if (hw->use_inorder_tx) {
+   PMD_INIT_LOG(INFO, "virtio: using inorder Tx path on port %u",
+   eth_dev->data->port_id);
+   eth_dev->tx_pkt_burst = virtio_xmit_pkts_inorder;
} else {
PMD_INIT_LOG(INFO, "virtio: using standard Tx path on port %u",
eth_dev->data->port_id);
@@ -1874,20 +1883,27 @@ virtio_dev_configure(struct rte_eth_dev *dev)
hw->use_simple_rx = 1;
hw->use_simple_tx = 1;
 
+   if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER)) {
+   /* Simple Tx not compatible with in-order ring */
+   hw->use_inorder_tx = 1;
+   hw->use_simple_tx = 0;
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
+   hw->use_inorder_rx = 1;
+   hw->use_simple_rx = 0;
+   } else {
+   hw->use_inorder_rx = 0;
+   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
+  DEV_RX_OFFLOAD_TCP_CKSUM))
+   hw->use_simple_rx = 0;
+   }
+   }
+
 #if defined RTE_ARCH_ARM64 || defined RTE_ARCH_ARM
if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON)) {
hw->use_simple_rx = 0;
hw->use_simple_tx = 0;
}
 #endif
-   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
-   hw->use_simple_rx = 0;
-   hw->use_simple_tx = 0;
-   }
-
-   if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM |
-  DEV_RX_OFFLOAD_TCP_CKSUM))
-   hw->use_simple_rx = 0;
 
return 0;
 }
-- 
2.17.0



[dpdk-dev] [PATCH v5 9/9] net/virtio: advertise support in-order feature

2018-07-01 Thread Marvin Liu
Signed-off-by: Marvin Liu 
Reviewed-by: Maxime Coquelin 

diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index cd8070248..350e9ce73 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -36,6 +36,7 @@
 1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE |  \
 1u << VIRTIO_RING_F_INDIRECT_DESC |\
 1ULL << VIRTIO_F_VERSION_1   | \
+1ULL << VIRTIO_F_IN_ORDER| \
 1ULL << VIRTIO_F_IOMMU_PLATFORM)
 
 #define VIRTIO_PMD_SUPPORTED_GUEST_FEATURES\
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 953c46055..7df600b02 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -371,6 +371,7 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
 1ULL << VIRTIO_NET_F_GUEST_CSUM|   \
 1ULL << VIRTIO_NET_F_GUEST_TSO4|   \
 1ULL << VIRTIO_NET_F_GUEST_TSO6|   \
+1ULL << VIRTIO_F_IN_ORDER  |   \
 1ULL << VIRTIO_F_VERSION_1)
 
 int
-- 
2.17.0



[dpdk-dev] [PATCH v2] mempool/octeontx: fix pool to aura mapping

2018-07-01 Thread Pavan Nikhilesh
The HW needs each pool to be mapped to an aura set of 16 auras.
Previously, the pool to aura mapping was assumed to be 1:1.

Fixes: 02fd6c744350 ("mempool/octeontx: support allocation")
Cc: sta...@dpdk.org

Signed-off-by: Pavan Nikhilesh 
Acked-by: Santosh Shukla 
---
 v2 Changes:
 - use macro to avoid code duplication (Santosh).
 - use uint16_t for gaura id.

 drivers/event/octeontx/timvf_evdev.c  |  2 +-
 drivers/mempool/octeontx/octeontx_fpavf.c | 45 ++-
 drivers/mempool/octeontx/octeontx_fpavf.h |  9 +
 drivers/net/octeontx/octeontx_ethdev.c|  6 +--
 drivers/net/octeontx/octeontx_rxtx.c  |  2 +-
 5 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/drivers/event/octeontx/timvf_evdev.c b/drivers/event/octeontx/timvf_evdev.c
index c4fbd2d86..8a045c250 100644
--- a/drivers/event/octeontx/timvf_evdev.c
+++ b/drivers/event/octeontx/timvf_evdev.c
@@ -174,7 +174,7 @@ timvf_ring_start(const struct rte_event_timer_adapter *adptr)
if (use_fpa) {
pool = (uintptr_t)((struct rte_mempool *)
timr->chunk_pool)->pool_id;
-   ret = octeontx_fpa_bufpool_gpool(pool);
+   ret = octeontx_fpa_bufpool_gaura(pool);
if (ret < 0) {
timvf_log_dbg("Unable to get gaura id");
ret = -ENOMEM;
diff --git a/drivers/mempool/octeontx/octeontx_fpavf.c b/drivers/mempool/octeontx/octeontx_fpavf.c
index 7aecaa85d..e5918c866 100644
--- a/drivers/mempool/octeontx/octeontx_fpavf.c
+++ b/drivers/mempool/octeontx/octeontx_fpavf.c
@@ -243,7 +243,7 @@ octeontx_fpapf_pool_setup(unsigned int gpool, unsigned int buf_size,
POOL_LTYPE(0x2) | POOL_STYPE(0) | POOL_SET_NAT_ALIGN |
POOL_ENA;

-   cfg.aid = 0;
+   cfg.aid = FPA_AURA_IDX(gpool);
cfg.pool_cfg = reg;
cfg.pool_stack_base = phys_addr;
cfg.pool_stack_end = phys_addr + memsz;
@@ -327,7 +327,7 @@ octeontx_fpapf_aura_attach(unsigned int gpool_index)
hdr.vfid = gpool_index;
hdr.res_code = 0;
memset(&cfg, 0x0, sizeof(struct octeontx_mbox_fpa_cfg));
-   cfg.aid = gpool_index; /* gpool is guara */
+   cfg.aid = gpool_index << FPA_GAURA_SHIFT;

ret = octeontx_mbox_send(&hdr, &cfg,
sizeof(struct octeontx_mbox_fpa_cfg),
@@ -335,7 +335,8 @@ octeontx_fpapf_aura_attach(unsigned int gpool_index)
if (ret < 0) {
fpavf_log_err("Could not attach fpa ");
fpavf_log_err("aura %d to pool %d. Err=%d. FuncErr=%d\n",
- gpool_index, gpool_index, ret, hdr.res_code);
+ gpool_index << FPA_GAURA_SHIFT, gpool_index, ret,
+ hdr.res_code);
ret = -EACCES;
goto err;
}
@@ -355,14 +356,15 @@ octeontx_fpapf_aura_detach(unsigned int gpool_index)
goto err;
}

-   cfg.aid = gpool_index; /* gpool is gaura */
+   cfg.aid = gpool_index << FPA_GAURA_SHIFT;
hdr.coproc = FPA_COPROC;
hdr.msg = FPA_DETACHAURA;
hdr.vfid = gpool_index;
ret = octeontx_mbox_send(&hdr, &cfg, sizeof(cfg), NULL, 0);
if (ret < 0) {
fpavf_log_err("Couldn't detach FPA aura %d Err=%d FuncErr=%d\n",
- gpool_index, ret, hdr.res_code);
+ gpool_index << FPA_GAURA_SHIFT, ret,
+ hdr.res_code);
ret = -EINVAL;
}

@@ -469,6 +471,7 @@ octeontx_fpa_bufpool_free_count(uintptr_t handle)
 {
uint64_t cnt, limit, avail;
uint8_t gpool;
+   uint16_t gaura;
uintptr_t pool_bar;

if (unlikely(!octeontx_fpa_handle_valid(handle)))
@@ -476,14 +479,16 @@ octeontx_fpa_bufpool_free_count(uintptr_t handle)

/* get the gpool */
gpool = octeontx_fpa_bufpool_gpool(handle);
+   /* get the aura */
+   gaura = octeontx_fpa_bufpool_gaura(handle);

/* Get pool bar address from handle */
pool_bar = handle & ~(uint64_t)FPA_GPOOL_MASK;

cnt = fpavf_read64((void *)((uintptr_t)pool_bar +
-   FPA_VF_VHAURA_CNT(gpool)));
+   FPA_VF_VHAURA_CNT(gaura)));
limit = fpavf_read64((void *)((uintptr_t)pool_bar +
-   FPA_VF_VHAURA_CNT_LIMIT(gpool)));
+   FPA_VF_VHAURA_CNT_LIMIT(gaura)));

avail = fpavf_read64((void *)((uintptr_t)pool_bar +
FPA_VF_VHPOOL_AVAILABLE(gpool)));
@@ -496,6 +501,7 @@ octeontx_fpa_bufpool_create(unsigned int object_size, unsigned int object_count,
unsigned int buf_offset, int node_id)
 {
unsigned int gpool;
+   unsigned int gaura;
uintptr_t gpool_handle;
uintptr_t pool_bar;
int res;
@@ -545,16 +551,18 @@ octeontx

Re: [dpdk-dev] [PATCH v2 4/4] net/ena: enable WC

2018-07-01 Thread Michał Krawczyk
2018-06-28 15:15 GMT+02:00 Rafal Kozik :
>
> Write combining (WC) increases NIC performance by making better
> utilization of the PCI bus. The ENA PMD can make use of this feature.
>
> To enable it load igb_uio driver with wc_activate set to 1.
>
> Signed-off-by: Rafal Kozik 
> Acked-by: Bruce Richardson 
Acked-by: Michal Krawczyk 
> ---
>  drivers/net/ena/ena_ethdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
> index 9ae73e3..1870edf 100644
> --- a/drivers/net/ena/ena_ethdev.c
> +++ b/drivers/net/ena/ena_ethdev.c
> @@ -2210,7 +2210,8 @@ static int eth_ena_pci_remove(struct rte_pci_device *pci_dev)
>
>  static struct rte_pci_driver rte_ena_pmd = {
> .id_table = pci_id_ena_map,
> -   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
> +   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> +RTE_PCI_DRV_WC_ACTIVATE,
> .probe = eth_ena_pci_probe,
> .remove = eth_ena_pci_remove,
>  };
> --
> 2.7.4
>
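The loading step mentioned in the commit message can be sketched as below. Paths are illustrative and the device address is hypothetical; the flag only takes effect on drivers that set RTE_PCI_DRV_WC_ACTIVATE, as this patch does for ENA:

```shell
# Load igb_uio with write combining enabled, then bind the ENA device.
modprobe uio
insmod ./build/kmod/igb_uio.ko wc_activate=1
# hypothetical BDF; adjust to your NIC:
usertools/dpdk-devbind.py --bind=igb_uio 0000:00:05.0
```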