Re: [dpdk-dev] [dpdk-stable] [PATCH v2 08/13] app/testpmd: fix RSS flow action configuration

2018-04-05 Thread Nélio Laranjeiro
On Wed, Apr 04, 2018 at 04:57:59PM +0200, Adrien Mazarguil wrote:
> Except for a list of queues, RSS configuration (hash key and fields) cannot
> be specified from the flow command line and testpmd does not provide safe
> defaults either.
> 
> In order to validate their implementation with testpmd, PMDs had to
> interpret its NULL RSS configuration parameters somehow, however this has
> never been valid to begin with.
> 
> This patch makes testpmd always provide default values.
> 
> Fixes: 05d34c6e9d2c ("app/testpmd: add queue actions to flow command")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Adrien Mazarguil 
> Cc: Wenzhuo Lu 
> Cc: Jingjing Wu 

Acked-by: Nelio Laranjeiro 

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH] net/i40evf: regression fix - reenable interrupts in handler

2018-04-05 Thread Jankowski, Konrad
Hi Zhang,

If you look at the source of interrupt handlers in both the igb_uio and 
uio_pci_generic drivers (look for irqreturn_t type), you will see they can 
disable interrupts immediately after receipt of one.
It's up to the user to make sure they're re-enabled. Also please compare with 
code in i40e_ethdev.c, which still does this correctly - there's an explicit 
rte_intr_enable(dev->intr_handle) call at the end of 
i40e_dev_interrupt_handler().
Probably a cleaner approach would be to leave them disabled as is, but only 
enable them for a once-off receipt when sending an AdminQ message, maybe that 
was the assumption here. However I've added some tracing to 
igbuio_pci_irqcontrol() and I'm sure this isn't happening on my system. 
(nothing is enabling those interrupts post device init). Looks like code path 
might be different with the newest igb_uio driver and MSI enabled, but the 
current code will still not work for all cases (like with uio_pci_generic).
There can also be cases where you have an older igb_uio driver (for example a
vendor-provided, precompiled driver) that disables interrupts in all cases, and
you run it with a new DPDK. I think for full compatibility we need to keep
re-enabling those interrupts.

Regards,
Konrad

-Original Message-
From: Zhang, Qi Z 
Sent: Wednesday, March 28, 2018 4:37 AM
To: Jankowski, Konrad ; Dai, Wei 
; Xing, Beilei ; Wu, Jingjing 
; dev@dpdk.org
Subject: RE: [PATCH] net/i40evf: regression fix - reenable interrupts in handler

Hi Jankowski:

> -Original Message-
> From: Jankowski, KonradX
> Sent: Thursday, February 15, 2018 2:33 AM
> To: Dai, Wei ; Xing, Beilei 
> ; Zhang, Qi Z ; Wu, 
> Jingjing ; dev@dpdk.org
> Cc: Jankowski, KonradX 
> Subject: [PATCH] net/i40evf: regression fix - reenable interrupts in 
> handler
> 
> Commit 66b8304f removed the rte_intr_enable() call from
> i40evf_dev_interrupt_handler() as a "bonus". On one of my systems this 
> causes the AdminQ messages to stop being delivered to the VF. This 
> results in an inability to initialize and use the port. With this patch it works 
> again.
> 
> System in question:
> Wind River OVP6 running kernel 3.10.58-ovp-rt58-WR6.0.0.13_preempt-rt
> 
> Signed-off-by: Konrad Jankowski 
> ---
>  drivers/net/i40e/i40e_ethdev_vf.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index fd003fe..b927a35 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -1404,6 +1404,7 @@ i40evf_dev_interrupt_handler(void *param)
> 
>  done:
>   i40evf_enable_irq0(hw);
> + rte_intr_enable(dev->intr_handle);

Would you explain in more detail why the patch fixes the issue?
Usually we do not accept a fix just because it works, without understanding the
root cause.

Regards
Qi

>  }
> 
>  static int
> --
> 2.5.5




Re: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring structure

2018-04-05 Thread Jerin Jacob
-Original Message-
> Date: Wed, 4 Apr 2018 23:38:41 +
> From: "Ananyev, Konstantin" 
> To: Jerin Jacob , Olivier Matz
>  
> CC: "dev@dpdk.org" , "Richardson, Bruce"
>  
> Subject: RE: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring
>  structure
> 
> Hi lads,
> 
> > -Original Message-
> > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> > Sent: Tuesday, April 3, 2018 5:43 PM
> > To: Olivier Matz 
> > Cc: dev@dpdk.org; Ananyev, Konstantin ; 
> > Richardson, Bruce 
> > Subject: Re: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring 
> > structure
> > 
> > -Original Message-
> > > Date: Tue, 3 Apr 2018 17:56:01 +0200
> > > From: Olivier Matz 
> > > To: Jerin Jacob 
> > > CC: dev@dpdk.org, konstantin.anan...@intel.com, bruce.richard...@intel.com
> > > Subject: Re: [dpdk-dev] [PATCH] ring: relax alignment constraint on ring
> > >  structure
> > > User-Agent: NeoMutt/20170113 (1.7.2)
> > >
> > > On Tue, Apr 03, 2018 at 09:07:04PM +0530, Jerin Jacob wrote:
> > > > -Original Message-
> > > > > Date: Tue, 3 Apr 2018 17:25:17 +0200
> > > > > From: Olivier Matz 
> > > > > To: Jerin Jacob 
> > > > > CC: dev@dpdk.org, konstantin.anan...@intel.com, 
> > > > > bruce.richard...@intel.com
> > > > > Subject: Re: [dpdk-dev] [PATCH] ring: relax alignment constraint on 
> > > > > ring
> > > > >  structure
> > > > > User-Agent: NeoMutt/20170113 (1.7.2)
> > > > >
> > > > > On Tue, Apr 03, 2018 at 08:37:23PM +0530, Jerin Jacob wrote:
> > > > > > -Original Message-
> > > > > > > Date: Tue, 3 Apr 2018 15:26:44 +0200
> > > > > > > From: Olivier Matz 
> > > > > > > To: dev@dpdk.org
> > > > > > > Subject: [dpdk-dev] [PATCH] ring: relax alignment constraint on 
> > > > > > > ring
> > > > > > >  structure
> > > > > > > X-Mailer: git-send-email 2.11.0
> > > > > > >
> > > > > > > The initial objective of
> > > > > > > commit d9f0d3a1ffd4 ("ring: remove split cacheline build setting")
> > > > > > > > was to add an empty cache line between the producer and consumer
> > > > > > > > data (on platforms with cache line size = 64B), preventing them
> > > > > > > > from being placed on adjacent cache lines.
> > > > > > >
> > > > > > > Following discussion on the mailing list, it appears that this
> > > > > > > also imposes an alignment constraint that is not required.
> > > > > > >
> > > > > > > This patch removes the extra alignment constraint and adds the
> > > > > > > empty cache lines using padding fields in the structure. The
> > > > > > > size of rte_ring structure and the offset of the fields remain
> > > > > > > the same on platforms with cache line size = 64B:
> > > > > > >
> > > > > > >   rte_ring = 384
> > > > > > >   rte_ring.name = 0
> > > > > > >   rte_ring.flags = 32
> > > > > > >   rte_ring.memzone = 40
> > > > > > >   rte_ring.size = 48
> > > > > > >   rte_ring.mask = 52
> > > > > > >   rte_ring.prod = 128
> > > > > > >   rte_ring.cons = 256
> > > > > > >
> > > > > > > But it has an impact on platform where cache line size is 128B:
> > > > > > >
> > > > > > >   rte_ring = 384-> 768
> > > > > > >   rte_ring.name = 0
> > > > > > >   rte_ring.flags = 32
> > > > > > >   rte_ring.memzone = 40
> > > > > > >   rte_ring.size = 48
> > > > > > >   rte_ring.mask = 52
> > > > > > >   rte_ring.prod = 128   -> 256
> > > > > > >   rte_ring.cons = 256   -> 512
> > > > > >
> > > > > > Are we leaving TWO cache lines to make sure the HW prefetcher doesn't
> > > > > > load the adjacent cache line (consumer)?
> > > > > >
> > > > > > If so, it will have an impact on machines with a 128B cache line
> > > > > > where the HW prefetcher does not load the next cache line.
> > > > > > Right?
> > > > >
> > > > > The impact on machines that have a 128B cache line is that an unused
> > > > > cache line will be added between the producer and consumer data. I
> > > > > expect that the impact is positive in case there is a hw prefetcher, 
> > > > > and
> > > > > null in case there is no such prefetcher.
> > > >
> > > > It is not NULL, right? You are losing 256B for each ring.
> > >
> > > Is it really that important?
> > 
> > In pipeline or eventdev SW cases there could be more rings in the system.
> > I don't see any downside to having a config option which is enabled by
> > default.
> > 
> > In my view, such config options are good: in embedded use cases, customers
> > can really fine-tune the target for their needs. In server use cases, let
> > the option be enabled by default, no harm.
> 
> But that would mean we have to maintain two layouts for the rte_ring 
> structure.

Is there any downside to having two configurable layouts? I mean, we are not
transferring the rte_ring structure over the network etc. (i.e. no
interoperability issue). Does it really matter? Maybe I am missing something here.

I was thinking like this:

in config/common_base:
CONFIG_RTE_EAL_HAS_HW_PREFETCH=y

#ifdef RTE_EAL_HAS_HW_PREFETCH
#define EMPTY_CACHE_LINE char pad0 __rte_cache_aligned
#else
#define EMPTY_CACHE_LINE 
#e
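
For reference, the layout numbers quoted in this thread can be reproduced with
a small stand-alone sketch. This is not the rte_ring definition: the struct
names, the SKETCH_RING macro and the contents of sketch_headtail are
hypothetical, only the field order and the "one aligned char as an empty cache
line" idea are taken from the discussion, and the compiler is assumed to
support the GCC/Clang aligned attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the producer/consumer head/tail data (hypothetical). */
struct sketch_headtail {
	volatile uint32_t head;
	volatile uint32_t tail;
	uint32_t single;
};

/*
 * Generate a ring-like struct for a given cache line size. Each padN
 * member is a single char aligned to a cache line, so the member that
 * follows it starts one full line later.
 */
#define SKETCH_RING(tag, line)                                        \
struct tag {                                                          \
	char name[32];                                                \
	int flags;                                                    \
	const void *memzone;                                          \
	uint32_t size;                                                \
	uint32_t mask;                                                \
	char pad0 __attribute__((aligned(line)));                     \
	struct sketch_headtail prod __attribute__((aligned(line)));   \
	char pad1 __attribute__((aligned(line)));                     \
	struct sketch_headtail cons __attribute__((aligned(line)));   \
	char pad2 __attribute__((aligned(line)));                     \
}

SKETCH_RING(sketch_ring_64, 64);
SKETCH_RING(sketch_ring_128, 128);

/* Compile-time checks against the offsets listed in the patch description. */
static_assert(offsetof(struct sketch_ring_64, prod) == 128, "64B: prod");
static_assert(offsetof(struct sketch_ring_128, prod) == 256, "128B: prod");
```

With the three pad members compiled out, the 128B variant would go back to
prod = 128, cons = 256 and a total size of 384, which is the "before" column
in the patch description.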

[dpdk-dev] [PATCH] vhost: fix meson build issues

2018-04-05 Thread Tomasz Duszynski
This patch addresses the following meson build issues:

1) Since rte_vdpa.h includes rte_pci.h, it introduces a pci
   dependency, so the deps array should be updated accordingly.

2) Since vhost.h includes rte_vdpa.h, vdpa.c should be added to
   the sources list. Otherwise we end up with linker errors
   caused by undefined references.

Fixes: 34b30b2e7e42 ("vhost: add apis for datapath configuration")
Cc: zhihong.w...@intel.com

Signed-off-by: Tomasz Duszynski 
---
 lib/librte_vhost/meson.build | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/meson.build b/lib/librte_vhost/meson.build
index 9e8c0e7..8d14877 100644
--- a/lib/librte_vhost/meson.build
+++ b/lib/librte_vhost/meson.build
@@ -10,6 +10,6 @@ endif
 version = 4
 allow_experimental_apis = true
 sources = files('fd_man.c', 'iotlb.c', 'socket.c', 'vhost.c', 'vhost_user.c',
-   'virtio_net.c')
+   'virtio_net.c', 'vdpa.c')
 headers = files('rte_vhost.h')
-deps += ['ethdev']
+deps += ['ethdev', 'pci']
--
2.7.4



Re: [dpdk-dev] [dpdk-stable] [PATCH v2 09/13] app/testpmd: fix missing RSS fields in flow action

2018-04-05 Thread Nélio Laranjeiro
On Wed, Apr 04, 2018 at 04:58:01PM +0200, Adrien Mazarguil wrote:
> Users cannot override the default RSS settings when entering an RSS action,
> only a list of queues can be provided.
> 
> This patch enables them to set an RSS hash key and types for a flow rule.
> 
> Fixes: 05d34c6e9d2c ("app/testpmd: add queue actions to flow command")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Adrien Mazarguil 
> Cc: Wenzhuo Lu 
> Cc: Jingjing Wu 

Acked-by: Nelio Laranjeiro 

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [dpdk-stable] [PATCH v2 11/13] ethdev: fix missing boolean values in flow command

2018-04-05 Thread Nélio Laranjeiro
On Wed, Apr 04, 2018 at 04:58:05PM +0200, Adrien Mazarguil wrote:
> Original implementation lacks the on/off toggle.
> 
> This patch shows up as a fix because it has been a popular request ever
> since the first DPDK release with the original implementation but was never
> addressed.
> 
> Fixes: abc3d81aca1b ("app/testpmd: add item raw to flow command")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Adrien Mazarguil 

Acked-by: Nelio Laranjeiro 

> ---
>  app/test-pmd/cmdline_flow.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index dbf4afebf..30450f1a4 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -2695,6 +2695,7 @@ static const char *const boolean_name[] = {
>   "false", "true",
>   "no", "yes",
>   "N", "Y",
> + "off", "on",
>   NULL,
>  };
>  
> -- 
> 2.11.0

-- 
Nélio Laranjeiro
6WIND
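
As a side note on how such a table is typically consumed, here is a minimal
sketch. It assumes the names are stored pairwise, with even slots meaning
false and odd slots meaning true; that reading is suggested by the layout of
the array in the hunk, not shown in it. The sketch_ names are hypothetical and
only the entries visible in the quoted diff are reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pairs of (false, true) spellings, terminated by NULL. */
static const char *const sketch_boolean_name[] = {
	"false", "true",
	"no", "yes",
	"N", "Y",
	"off", "on",
	NULL,
};

/* Return 0 or 1 for a recognized token, -1 for anything else. */
static int
sketch_parse_boolean(const char *token)
{
	size_t i;

	assert(token != NULL);
	for (i = 0; sketch_boolean_name[i] != NULL; ++i)
		if (strcmp(token, sketch_boolean_name[i]) == 0)
			return (int)(i & 1);  /* odd index means true */
	return -1;
}
```

Under that assumption, adding a new spelling only requires appending one more
pair before the NULL terminator, which is what the one-line patch does.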


Re: [dpdk-dev] [dpdk-stable] [PATCH v2 13/13] ethdev: fix missing include in flow API

2018-04-05 Thread Nélio Laranjeiro
On Wed, Apr 04, 2018 at 04:58:08PM +0200, Adrien Mazarguil wrote:
> Fixes: b1a4b4cbc0a8 ("ethdev: introduce generic flow API")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Adrien Mazarguil 

Acked-by: Nelio Laranjeiro 

> ---
>  lib/librte_ether/rte_flow.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
> index 13e420218..cdaaa3a5b 100644
> --- a/lib/librte_ether/rte_flow.h
> +++ b/lib/librte_ether/rte_flow.h
> @@ -14,6 +14,8 @@
>   * associated actions in hardware through flow rules.
>   */
>  
> +#include 
> +
>  #include 
>  #include 
>  #include 
> -- 
> 2.11.0

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH v6 4/8] lib/librte_vhost: add request handler

2018-04-05 Thread Maxime Coquelin



On 04/04/2018 04:25 PM, Fan Zhang wrote:

This patch adds the implementation that parses virtio crypto requests
into DPDK crypto operations.

Signed-off-by: Fan Zhang 
---
  lib/librte_vhost/rte_vhost_crypto.h |  14 +
  lib/librte_vhost/vhost_crypto.c | 622 
  2 files changed, 636 insertions(+)
  create mode 100644 lib/librte_vhost/rte_vhost_crypto.h


...

diff --git a/lib/librte_vhost/vhost_crypto.c b/lib/librte_vhost/vhost_crypto.c
index 34587c27e..5608e94ad 100644
--- a/lib/librte_vhost/vhost_crypto.c
+++ b/lib/librte_vhost/vhost_crypto.c
@@ -8,9 +8,11 @@

...

+static uint8_t
+prepare_sym_cipher_op(struct vhost_crypto *vcrypto, struct rte_crypto_op *op,
+   struct vhost_crypto_data_req *vc_req,
+   struct virtio_crypto_cipher_data_req *cipher,
+   struct vring_desc *cur_desc)
+{
+   struct vring_desc *head = vc_req->head;
+   struct vring_desc *desc = cur_desc;
+   struct rte_vhost_memory *mem = vc_req->mem;
+   struct rte_mbuf *m_src = op->sym->m_src, *m_dst = op->sym->m_dst;
+   uint8_t *iv_data = rte_crypto_op_ctod_offset(op, uint8_t *, IV_OFFSET);
+   uint8_t ret = 0;
+
+   /* prepare */
+   /* iv */
+   if (unlikely(copy_data(iv_data, head, mem, &desc,
+   cipher->para.iv_len) < 0)) {
+   ret = VIRTIO_CRYPTO_BADMSG;
+   goto error_exit;
+   }
+
+#ifdef RTE_LIBRTE_VHOST_DEBUG
+   rte_hexdump(stdout, "IV:", iv_data, cipher->para.iv_len);
+#endif
+
+   m_src->data_len = cipher->para.src_data_len;
+
+   switch (vcrypto->option) {
+   case RTE_VHOST_CRYPTO_ZERO_COPY_ENABLE:
+   m_src->buf_iova = gpa_to_hpa(vcrypto->dev, desc->addr,
+   cipher->para.src_data_len);
+   m_src->buf_addr = get_data_ptr(head, mem, &desc,
+   cipher->para.src_data_len);
+   if (unlikely(m_src->buf_iova == 0 ||
+   m_src->buf_addr == NULL)) {
+   VC_LOG_ERR("zero_copy may fail due to cross page data");
+   ret = VIRTIO_CRYPTO_ERR;
+   goto error_exit;
+   }
+   break;
+   case RTE_VHOST_CRYPTO_ZERO_COPY_DISABLE:
+   if (unlikely(cipher->para.src_data_len >
+   RTE_MBUF_DEFAULT_BUF_SIZE)) {
+   VC_LOG_ERR("Not enough space to do data copy");
+   ret = VIRTIO_CRYPTO_ERR;
+   goto error_exit;
+   }
+   if (unlikely(copy_data(rte_pktmbuf_mtod(m_src, uint8_t *), head,
+   mem, &desc, cipher->para.src_data_len)
+   < 0)) {
+   ret = VIRTIO_CRYPTO_BADMSG;
+   goto error_exit;
+   }
+   break;
+   default:
+   ret = VIRTIO_CRYPTO_BADMSG;
+   goto error_exit;
+   break;


Note: I removed the break to make checkpatch happy.
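
For illustration, the shape of that switch after the change could look like
the sketch below. It is a simplified stand-in, not the vhost_crypto code: the
sketch_ names, the status values and the plain memcpy are assumptions, and
only the control flow (length check before the copy, goto error_exit with no
trailing break in the default case) mirrors the quoted function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_mode { SKETCH_COPY, SKETCH_ZERO_COPY };
enum { SKETCH_OK = 0, SKETCH_ERR = 1, SKETCH_BADMSG = 2 };

/* Point *out at the source data, copying into dst when not zero-copy. */
static uint8_t
sketch_prepare_src(enum sketch_mode mode, uint8_t *dst, size_t dst_cap,
		   const uint8_t *src, size_t len, const uint8_t **out)
{
	uint8_t ret = SKETCH_OK;

	assert(out != NULL);
	switch (mode) {
	case SKETCH_ZERO_COPY:
		if (src == NULL) {
			ret = SKETCH_ERR;
			goto error_exit;
		}
		*out = src;
		break;
	case SKETCH_COPY:
		/* Reject requests larger than the destination buffer. */
		if (len > dst_cap) {
			ret = SKETCH_ERR;
			goto error_exit;
		}
		memcpy(dst, src, len);
		*out = dst;
		break;
	default:
		ret = SKETCH_BADMSG;
		goto error_exit;  /* a break here would be unreachable */
	}
	return ret;

error_exit:
	*out = NULL;
	return ret;
}
```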


Re: [dpdk-dev] [PATCH v3 09/11] mempool/dpaa: prepare to remove register memory area op

2018-04-05 Thread Hemant Agrawal


On 3/26/2018 9:39 PM, Andrew Rybchenko wrote:

The populate mempool driver callback is executed a bit later than
register memory area, provides the same information, and will
replace the latter since it gives more flexibility: in addition
to notifying about the memory area, it allows customizing how mempool
objects are stored in memory.

Signed-off-by: Andrew Rybchenko 
---
v2 -> v3:
  - fix build error because of prototype mismatch (char * -> void *)

v1 -> v2:
  - fix build error because of prototype mismatch

  drivers/mempool/dpaa/dpaa_mempool.c | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)


Acked-by: Hemant Agrawal 


Re: [dpdk-dev] [PATCH v6 0/8] vhost: introduce vhost crypto backend

2018-04-05 Thread Maxime Coquelin



On 04/04/2018 04:24 PM, Fan Zhang wrote:

This patchset adds crypto backend support to the vhost library,
including a proof-of-concept sample application. The implementation
follows the virtio-crypto specification and has been tested
with qemu 2.11.50 (with several patches applied, detailed later)
with Fedora 24 running in the frontend.

The vhost_crypto library acts as a "bridge" that translates
virtio-crypto requests to DPDK crypto operations, so it
is a purely software implementation. However, it does require the user
to provide the DPDK Cryptodev ID so it knows how to handle the
virtio-crypto session creation and deletion messages.

Currently the implementation supports AES-CBC-128 and HMAC-SHA1
in cipher-only/chaining modes and does not support sessionless mode
yet. The guest can use the standard virtio-crypto driver to set up a
session and send encryption/decryption requests to the backend. The
vhost-crypto sample application provided in this patchset will
do the actual crypto work.

The following steps are involved to enable vhost-crypto support.

In the host:
1. Download the qemu source code.

2. Recompile your qemu with vhost-crypto option enabled.

3. Apply this patchset to latest DPDK code and recompile DPDK.

4. Compile and run vhost-crypto sample application.

./examples/vhost_crypto/build/vhost-crypto -l 11,12 -w :86:01.0 \
  --socket-mem 2048,2048

Where :86:01.0 is the QAT PCI address. You may use AES-NI-MB if it is
not available. The sample application requires 2 lcores: 1 master and 1
worker. The application will create a UNIX socket file
/tmp/vhost_crypto1.socket.

5. Start your qemu application. Here is my command:

qemu/x86_64-softmmu/qemu-system-x86_64 -machine accel=kvm -cpu host \
-smp 2 -m 1G -hda ~/path-to-your/image.qcow \
-object memory-backend-file,id=mem,size=1G,mem-path=/dev/hugepages,share=on \
-mem-prealloc -numa node,memdev=mem -chardev \
socket,id=charcrypto0,path=/tmp/vhost_crypto1.socket \
-object cryptodev-vhost-user,id=cryptodev0,chardev=charcrypto0 \
-device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0

6. Once the guest is booted, the Linux virtio_crypto kernel module is loaded by
default. You should see the following logs in your dmesg:

[   17.611044] virtio_crypto: loading out-of-tree module taints kernel.
[   17.611083] virtio_crypto: module verification failed: signature and/or ...
[   17.611723] virtio_crypto virtio0: max_queues: 1, max_cipher_key_len: ...
[   17.612156] virtio_crypto virtio0: will run requests pump with realtime ...
[   18.376100] virtio_crypto virtio0: Accelerator is ready

The virtio_crypto driver in the guest is now up and running.

7. The remaining steps are the same as the Testing section in
https://wiki.qemu.org/Features/VirtioCrypto

8. It is possible to use DPDK Virtio Crypto PMD
(https://dpdk.org/dev/patchwork/patch/36921/) in the guest to work with
this patchset to achieve optimal performance.

v6:
- Changed commit message
- removed rte prefix in handler prototype

v5:
- removed external ops register API.
- patch cleaned.

v4:
- Changed external vhost backend ops register API.
- Fixed a bug.

v3:
- Changed external vhost backend private data and message handling
- Added experimental tag to rte_vhost_crypto_set_zero_copy()

v2:
- Moved vhost_crypto_data_req data from crypto op to source mbuf.
- Removed ZERO-COPY flag from config option and made it changeable at run time.
- Guest-polling mode possible.
- Simplified vring descriptor access procedure.
- Work with both LKCF and DPDK Virtio-Crypto PMD guest drivers.

Fan Zhang (8):
   lib/librte_vhost: add vhost user message handlers
   lib/librte_vhost: add virtio-crypto user message structure
   lib/librte_vhost: add session message handler
   lib/librte_vhost: add request handler
   lib/librte_vhost: add public function implementation
   lib/librte_vhost: update makefile
   examples/vhost_crypto: add vhost crypto sample application
   doc: update for vhost crypto support

  doc/guides/prog_guide/vhost_lib.rst   |   25 +
  doc/guides/rel_notes/release_18_05.rst|5 +
  doc/guides/sample_app_ug/index.rst|1 +
  doc/guides/sample_app_ug/vhost_crypto.rst |   82 ++
  examples/vhost_crypto/Makefile|   32 +
  examples/vhost_crypto/main.c  |  541 
  examples/vhost_crypto/meson.build |   14 +
  lib/librte_vhost/Makefile |6 +-
  lib/librte_vhost/meson.build  |8 +-
  lib/librte_vhost/rte_vhost_crypto.h   |  109 +++
  lib/librte_vhost/rte_vhost_version.map|   11 +
  lib/librte_vhost/vhost.c  |2 +-
  lib/librte_vhost/vhost.h  |   53 +-
  lib/librte_vhost/vhost_crypto.c   | 1312 +
  lib/librte_vhost/vhost_user.c |   33 +-
  lib/librte_vhost/vhost_user.h |   35 +-
  16 files changed, 2256 insertions(+), 13 deletions(-)
  create mode 100644 doc/guides/sample_app_ug/vhost_crypto.rst
  create mode

[dpdk-dev] [PATCH] test/tun: add new test for tun

2018-04-05 Thread Vipin Varghese
Add TUN PMD validation for create, port setup, tx, rx and stats functions.

Signed-off-by: Vipin Varghese 
---
 test/test/Makefile |   4 +
 test/test/autotest_data.py |  13 ++
 test/test/meson.build  |   4 +
 test/test/test_tun.c   | 333 +
 4 files changed, 354 insertions(+)
 create mode 100644 test/test/test_tun.c

diff --git a/test/test/Makefile b/test/test/Makefile
index a88cc38..e5d8200 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -193,6 +193,10 @@ endif
 
 SRCS-$(CONFIG_RTE_LIBRTE_KVARGS) += test_kvargs.c
 
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_TAP),y)
+SRCS-y += test_tun.c
+endif
+
 CFLAGS += -DALLOW_EXPERIMENTAL_API
 
 CFLAGS += -O3
diff --git a/test/test/autotest_data.py b/test/test/autotest_data.py
index aacfe0a..35f3aab 100644
--- a/test/test/autotest_data.py
+++ b/test/test/autotest_data.py
@@ -357,6 +357,19 @@ def per_sockets(num):
 ]
 },
 {
+"Prefix":"tun",
+"Memory":"512",
+"Tests":
+[
+{
+"Name":"TUN autotest",
+"Command": "tun_autotest",
+"Func":default_autotest,
+"Report":  None,
+},
+]
+},
+{
 "Prefix":"mempool_perf",
 "Memory":per_sockets(256),
 "Tests":
diff --git a/test/test/meson.build b/test/test/meson.build
index eb3d87a..fbb4cf7 100644
--- a/test/test/meson.build
+++ b/test/test/meson.build
@@ -93,6 +93,7 @@ test_sources = files('commands.c',
'test_timer.c',
'test_timer_perf.c',
'test_timer_racecond.c',
+   'test_tun.c',
'test_version.c',
'virtual_pmd.c'
 )
@@ -227,6 +228,9 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_RING_PMD')
test_deps += 'pmd_ring'
 endif
+if dpdk_conf.has('RTE_LIBRTE_TAP_PMD')
+   test_deps += 'pmd_tap'
+endif
 if dpdk_conf.has('RTE_LIBRTE_POWER')
test_deps += 'power'
 endif
diff --git a/test/test/test_tun.c b/test/test/test_tun.c
new file mode 100644
index 000..c165a94
--- /dev/null
+++ b/test/test/test_tun.c
@@ -0,0 +1,333 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2014 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define NB_MBUF  8192
+#define MAX_PACKET_SZ2048
+#define MBUF_DATA_SZ (MAX_PACKET_SZ + RTE_PKTMBUF_HEADROOM)
+#define PKT_BURST_SZ 32
+#define MEMPOOL_CACHE_SZ PKT_BURST_SZ
+#define SOCKET   0
+#define NB_RXD   1024
+#define NB_TXD   1024
+#define MAX_PKT_BURST32
+#define IFCONFIG "/sbin/ifconfig "
+#define PING "/bin/ping -qr -c 10 -i 0.2 15.0.0.1 -I "
+
+static int tun_id;
+static int socket_id;
+static uint16_t nb_ports, port_id;
+
+static struct rte_eth_dev_info info;
+static struct rte_eth_stats stats;
+static struct rte_mempool *mp;
+static struct rte_mbuf *mbuf;
+
+static char tun_drv_name[20] = "net_tun";
+static char tun_intf_name[20] = "atest";
+static char tun_intf_cmd[20] = "\0";
+static char portname[25] = "\0";
+static char cmd_exec[70] = "\0";
+
+static const struct rte_eth_rxconf rx_conf = {
+   .rx_thresh = {
+   .pthresh = 8,
+   .hthresh = 8,
+   .wthresh = 4,
+   },
+   .rx_free_thresh = 0,
+};
+
+static const struct rte_eth_txconf tx_conf = {
+   .tx_thresh = {
+   .pthresh = 36,
+   .hthresh = 0,
+   .wthresh = 0,
+   },
+   .tx_free_thresh = 0,
+   .tx_rs_thresh = 0,
+};
+
+static const struct rte_eth_conf port_conf = {
+   .rxmode = {
+   .header_split = 0,
+   .hw_ip_checksum = 0,
+   .hw_vlan_filter = 0,
+   .jumbo_frame = 0,
+   .hw_strip_crc = 0,
+   },
+   .txmode = {
+   .mq_mode = ETH_MQ_TX_NONE | ETH_DCB_NONE,
+   },
+};
+
+static void
+send_ping(void *param)
+{
+   char *tun_intf_name = (char *) param;
+
+   sprintf(cmd_exec, "%s %s &", PING, tun_intf_name);
+   if (system(cmd_exec) != 0)
+   printf("fail to execute (%s)!\n", cmd_exec);
+
+   fflush(stdout);
+}
+
+static int
+tun_port_setup(uint16_t port_id)
+{
+   int ret = rte_eth_dev_configure(port_id, 1, 1, &port_conf);
+   if (ret < 0) {
+   printf("fail to configure port %d\n", port_id);
+   return -1;
+   }
+
+   ret = rte_eth_rx_queue_setup(port_id, 0, NB_RXD, socket_id,
+   &rx_conf, mp);
+   if (ret < 0) {
+   printf("fail to setup rx queue for port %d\n", port_id);
+   return -1;
+   }
+
+   ret = rte_eth_tx_queue_setup(port_id, 0, NB_TXD, socket_id,
+   &tx_conf);
+   if (ret < 0) {
+   printf("fail to setup tx queue for port %d\n
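
As an aside on send_ping() above: the command is formatted into a fixed-size
buffer with sprintf(). A bounded variant that reports truncation explicitly
could look like the sketch below. The helper name and signature are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "<ping_cmd> <ifname> &" into out.
 * Returns 0 on success, -1 if the result would not fit in out_sz bytes.
 */
static int
sketch_build_ping_cmd(char *out, size_t out_sz,
		      const char *ping_cmd, const char *ifname)
{
	int n;

	assert(out != NULL && ping_cmd != NULL && ifname != NULL);
	n = snprintf(out, out_sz, "%s %s &", ping_cmd, ifname);
	if (n < 0 || (size_t)n >= out_sz)
		return -1;  /* encoding error or truncated output */
	return 0;
}
```

The caller can then skip the system() call when the helper returns -1 instead
of executing a cut-off command line.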

Re: [dpdk-dev] [PATCH v5] net/virtio-user: add support for server mode

2018-04-05 Thread Tiwei Bie
On Thu, Apr 05, 2018 at 01:17:53AM +0800, zhiyong.y...@intel.com wrote:
[...]
> +static int
> +virtio_user_start_server(struct virtio_user_dev *dev, struct sockaddr_un *un)
> +{
> + int ret;
> + int flag;
> + int fd = dev->listenfd;
> +
> + ret = bind(fd, (struct sockaddr *)un, sizeof(*un));
> + if (ret < 0) {
> + PMD_DRV_LOG(ERR, "failed to bind to %s: %s; remove it and try 
> again\n",
> + dev->path, strerror(errno));
> + goto err;
> + }
> + ret = listen(fd, MAX_VIRTIO_USER_BACKLOG);
> + if (ret < 0)
> + goto err;
> +
> + flag = fcntl(fd, F_GETFL);
> + fcntl(fd, F_SETFL, flag | O_NONBLOCK);
> + dev->vhostfd = -1;
> +
> + return 0;
> +err:
> + close(dev->listenfd);

The dev->listenfd isn't created in this function, maybe it's
better to avoid closing this file in this function.
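
A minimal sketch of that ownership split, using plain POSIX sockets rather
than the virtio-user structures: the function names and SKETCH_BACKLOG are
hypothetical, and error logging is left out. The helper only binds, listens
and sets O_NONBLOCK; the function that created the descriptor is the one that
closes it on failure.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define SKETCH_BACKLOG 1

/* Bind and listen on an fd owned by the caller; never closes fd. */
static int
sketch_start_server(int fd, const struct sockaddr_un *un)
{
	int flag;

	assert(un != NULL);
	if (bind(fd, (const struct sockaddr *)un, sizeof(*un)) < 0)
		return -1;
	if (listen(fd, SKETCH_BACKLOG) < 0)
		return -1;
	flag = fcntl(fd, F_GETFL);
	if (flag < 0 || fcntl(fd, F_SETFL, flag | O_NONBLOCK) < 0)
		return -1;
	return 0;
}

/* Create the socket, start the server, and clean up on any failure. */
static int
sketch_setup_server(const char *path)
{
	struct sockaddr_un un;
	int fd = socket(AF_UNIX, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);
	if (sketch_start_server(fd, &un) < 0) {
		close(fd);  /* closed by the function that created it */
		return -1;
	}
	return fd;  /* listening, non-blocking descriptor */
}
```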

> + return -1;
> +}
> +
>  /**
>   * Set up environment to talk with a vhost user backend.
>   *
> @@ -390,6 +418,7 @@ vhost_user_setup(struct virtio_user_dev *dev)
>  {
>   int fd;
>   int flag;
> + int ret = 0;
>   struct sockaddr_un un;
>  
>   fd = socket(AF_UNIX, SOCK_STREAM, 0);
> @@ -405,14 +434,20 @@ vhost_user_setup(struct virtio_user_dev *dev)
>   memset(&un, 0, sizeof(un));
>   un.sun_family = AF_UNIX;
>   snprintf(un.sun_path, sizeof(un.sun_path), "%s", dev->path);
> - if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
> - PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
> - close(fd);
> - return -1;
> +
> + if (dev->is_server) {
> + dev->listenfd = fd;
> + ret = virtio_user_start_server(dev, &un);
> + } else {

Maybe it's better to keep the style consistent. How
about something like this:

if (dev->is_server) {
if (virtio_user_start_server(fd, &un) < 0) {
PMD_DRV_LOG(ERR, some messages...);
close(fd);
return -1;
}
dev->listenfd = fd;
dev->vhostfd = -1;
} else {

> + dev->vhostfd = fd;
> + if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
> + PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
> + close(fd);
> + return -1;
> + }
>   }
>  
> - dev->vhostfd = fd;
> - return 0;
> + return ret;
>  }
>  
>  static int
> diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
> b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> index f90fee9e5..45e324679 100644
> --- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> +++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> @@ -254,7 +254,8 @@ virtio_user_fill_intr_handle(struct virtio_user_dev *dev)
>   eth_dev->intr_handle->fd = -1;
>   if (dev->vhostfd >= 0)
>   eth_dev->intr_handle->fd = dev->vhostfd;
> -

Maybe it's better to keep this empty line (keep it before the return 0).

> + else if (dev->is_server)
> + eth_dev->intr_handle->fd = dev->listenfd;
>   return 0;
>  }
>  
> @@ -267,24 +268,29 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
>   dev->vhostfds = NULL;
>   dev->tapfds = NULL;
>  
> - if (is_vhost_user_by_type(dev->path)) {
> - dev->ops = &ops_user;
> + if (dev->is_server) {
> + dev->ops = &ops_user;/* server mode only supports vhost user */
>   } else {
> - dev->ops = &ops_kernel;
> -
> - dev->vhostfds = malloc(dev->max_queue_pairs * sizeof(int));
> - dev->tapfds = malloc(dev->max_queue_pairs * sizeof(int));
> - if (!dev->vhostfds || !dev->tapfds) {
> - PMD_INIT_LOG(ERR, "Failed to malloc");
> - return -1;
> - }
> -
> - for (q = 0; q < dev->max_queue_pairs; ++q) {
> - dev->vhostfds[q] = -1;
> - dev->tapfds[q] = -1;
> + if (is_vhost_user_by_type(dev->path)) {
> + dev->ops = &ops_user;
> + } else {
> + dev->ops = &ops_kernel;
> +
> + dev->vhostfds = malloc(dev->max_queue_pairs *
> +sizeof(int));
> + dev->tapfds = malloc(dev->max_queue_pairs *
> +  sizeof(int));
> + if (!dev->vhostfds || !dev->tapfds) {
> + PMD_INIT_LOG(ERR, "Failed to malloc");
> + return -1;
> + }
> +
> + for (q = 0; q < dev->max_queue_pairs; ++q) {
> + dev->vhostfds[q] = -1;
> + dev->tapfds[q] = -1;
> + }
>   }
>   }
> -

There is no need to remove thi

[dpdk-dev] [PATCH V19 0/4] add device event monitor framework

2018-04-05 Thread Jeff Guo
Regarding hot plug in DPDK, we already have a proactive way to add/remove devices
through APIs (rte_eal_hotplug_add/remove), and also have the fail-safe driver
to offload the fail-safe work from the app user. But a general mechanism to
monitor hotplug events for all drivers is still lacking; currently the
hotplug interrupt event differs between devices and drivers, such
as mlx4, pci driver and others.

Take the hot removal event as an example: not all pci drivers expose the
remove interrupt, so in order to make the hot plug feature easy to use
for pci drivers, something must be done to detect the remove event
at the kernel level and offer a new line of interrupt to user land.

Based on the kobject uevent mechanism in the kernel, we can monitor
the hot plug status of devices, not only uio/vfio devices on the pci
bus, but also others, such as cpu/usb/pci-express bus devices.

The idea is as follows.

a. The uevent message received from FD monitoring looks like this:
remove@/devices/pci:80/:80:02.2/:82:00.0/:83:03.0/:84:00.2/uio/uio2
ACTION=remove
DEVPATH=/devices/pci:80/:80:02.2/:82:00.0/:83:03.0/:84:00.2/uio/uio2
SUBSYSTEM=uio
MAJOR=243
MINOR=2
DEVNAME=uio2
SEQNUM=11366

b. Add device event monitor framework:
add several general APIs to enable uevent monitoring.

c. Show an example of how to use the uevent monitor:
enable uevent monitoring in testpmd to show device event monitor mechanism
usage.
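
To illustrate item a, here is a minimal parsing sketch. It is not the code
from this patch set: the sketch_ names are hypothetical, and it assumes the
message arrives as one buffer of NUL-separated strings (the "action@devpath"
header followed by KEY=VALUE fields), with only three of the keys shown above
extracted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_uevent {
	const char *action;     /* value of ACTION=, or NULL */
	const char *subsystem;  /* value of SUBSYSTEM=, or NULL */
	const char *devname;    /* value of DEVNAME=, or NULL */
};

/* Return the value part if field starts with key, otherwise NULL. */
static const char *
sketch_match(const char *field, const char *key)
{
	size_t klen = strlen(key);

	return strncmp(field, key, klen) == 0 ? field + klen : NULL;
}

/* Walk the NUL-separated fields in buf[0..len) and pick out known keys. */
static void
sketch_parse_uevent(const char *buf, size_t len, struct sketch_uevent *ev)
{
	size_t off = 0;

	assert(buf != NULL && ev != NULL);
	memset(ev, 0, sizeof(*ev));
	while (off < len) {
		const char *field = buf + off;
		const char *end = memchr(field, '\0', len - off);
		const char *val;

		if (end == NULL)
			break;  /* unterminated tail: ignore it */
		if ((val = sketch_match(field, "ACTION=")) != NULL)
			ev->action = val;
		else if ((val = sketch_match(field, "SUBSYSTEM=")) != NULL)
			ev->subsystem = val;
		else if ((val = sketch_match(field, "DEVNAME=")) != NULL)
			ev->devname = val;
		off += (size_t)(end - field) + 1;
	}
}
```

The returned pointers refer into the caller's buffer, so a callback that wants
to keep them beyond the lifetime of the receive buffer would need to copy them.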

TODO: failure handler mechanism for hot plug and driver auto bind for hot
insertion.
These will be covered by the next hot plug patch set.

patchset history:
v19->v18:
fix some typo and misunderstanding part

v18->v17:
1.add feature announcement in release document, fix bsp compile issue.
2.refine socket configuration.
3.remove hotplug policy and detach/attach process from testpmd, let it
focus on the device event monitoring which the patch set introduced.

v17->v16:
1.add the related part of adding the interrupt handle type.
2.add new API into map, fix typo issue, add (void*)-1 value for unregistering all
callbacks
3.add new file into meson.build, modify coding style and add print info, delete
unused part.
4.unregister all user's callbacks when stopping the event monitor

v16->v15:
1.remove some linux related code out of eal common layer
2.fix some readability issues.

v15->v14:
1.use the existing eal interrupt epoll to replace rte service usage for the
monitor thread,
2.add new device event handle type in eal interrupt.
3.remove the uevent type check and any policy from eal,
let the user's callback do the checking and management.
4.add "--hot-plug" configure parameter in testpmd to switch the hotplug feature.

v14->v13:
1.add __rte_experimental on function definitions and fix bsd build issue

v13->v12:
1.fix some logic issue and null check issue
2.fix monitor stop func issue

v12->v11:
1.identify null param in callback for monitor all devices uevent

v11->v10:
1.modify some typo and add experimental tag in new file.
2.modify callback register calling.

v10->v9:
1.fix prefix issue.
2.use a common callback lists for all device and all type to replace
add callback parameter into device struct.
3.delete some unused part.

v9->v8:
split the patch set into small and explicit patch

v8->v7:
1.use rte_service to replace pthread management.
2.fix definition issue and copyright issue
3.fix some lock issue

v7->v6:
1.modify vdev part according to the vdev rework
2.re-define and split the func into common and bus specific code
3.fix some incorrect issue.
4.fix the system hang after sending packets.

v6->v5:
1.add hot plug policy, in eal, default handle to prepare hot plug work for
all pci device, then let app to manage to decide which device need to
hot plug.
2.modify to manage event callback in each device.
3.fix some system hang issue when igb_uio release, fix some typo error.
4.modify the pci part to the bus-pci base on the bus rework.
5.add hot plug policy in app, show example to use hotplug list to manage
to decide which device need to hot plug.

v5->v4:
1.Move uevent monitor epolling from eal interrupt to eal device layer.
2.Redefine the eal device API for common, and distinguish between linux and bsd
3.Add failure handler helper api in bus layer.Add function of find device by 
name.
4.Replace of individual fd bind with single device, use a common fd to polling 
all device.
5.Add to register hot insertion monitoring and process, add function to auto
bind driver before user add device
6.Refine some coding style and typos issue
7.add new callback to process hot insertion

v4->v3:
1.move uevent monitor api from eal interrupt to eal device layer.
2.create uevent type and struct in eal device.
3.move uevent handler for each driver to eal layer.
4.add uevent failure handler to process signal fault issue.
5.add example for request and use uevent monitoring in testpmd.

v3->v2:
1.refine some return error
2.refine the string searching logic to avoid memory issue

v2->v1:
1.remove global variables of hotplug_fd, add ue

[dpdk-dev] [PATCH V19 1/4] eal: add device event handle in interrupt thread

2018-04-05 Thread Jeff Guo
Add new interrupt handle type of RTE_INTR_HANDLE_DEV_EVENT, for
device event interrupt monitor.

Signed-off-by: Jeff Guo 
---
v19->v18:
fix some typo
---
 lib/librte_eal/common/include/rte_eal_interrupts.h |  1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 11 +-
 test/test/test_interrupts.c| 39 --
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal_interrupts.h 
b/lib/librte_eal/common/include/rte_eal_interrupts.h
index 3f792a9..6eb4932 100644
--- a/lib/librte_eal/common/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/common/include/rte_eal_interrupts.h
@@ -34,6 +34,7 @@ enum rte_intr_handle_type {
RTE_INTR_HANDLE_ALARM,/**< alarm handle */
RTE_INTR_HANDLE_EXT,  /**< external handler */
RTE_INTR_HANDLE_VDEV, /**< virtual device */
+   RTE_INTR_HANDLE_DEV_EVENT,/**< device event handle */
RTE_INTR_HANDLE_MAX   /**< count of elements */
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index f86f22f..58e9328 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -559,6 +559,9 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
return -1;
break;
 #endif
+   /* not used at this moment */
+   case RTE_INTR_HANDLE_DEV_EVENT:
+   return -1;
/* unknown handle type */
default:
RTE_LOG(ERR, EAL,
@@ -606,6 +609,9 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
return -1;
break;
 #endif
+   /* not used at this moment */
+   case RTE_INTR_HANDLE_DEV_EVENT:
+   return -1;
/* unknown handle type */
default:
RTE_LOG(ERR, EAL,
@@ -674,7 +680,10 @@ eal_intr_process_interrupts(struct epoll_event *events, 
int nfds)
bytes_read = 0;
call = true;
break;
-
+   case RTE_INTR_HANDLE_DEV_EVENT:
+   bytes_read = 0;
+   call = true;
+   break;
default:
bytes_read = 1;
break;
diff --git a/test/test/test_interrupts.c b/test/test/test_interrupts.c
index 31a70a0..dc19175 100644
--- a/test/test/test_interrupts.c
+++ b/test/test/test_interrupts.c
@@ -20,6 +20,7 @@ enum test_interrupt_handle_type {
TEST_INTERRUPT_HANDLE_VALID,
TEST_INTERRUPT_HANDLE_VALID_UIO,
TEST_INTERRUPT_HANDLE_VALID_ALARM,
+   TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT,
TEST_INTERRUPT_HANDLE_CASE1,
TEST_INTERRUPT_HANDLE_MAX
 };
@@ -80,6 +81,10 @@ test_interrupt_init(void)
intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM].type =
RTE_INTR_HANDLE_ALARM;
 
+   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].fd = pfds.readfd;
+   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].type =
+   RTE_INTR_HANDLE_DEV_EVENT;
+
intr_handles[TEST_INTERRUPT_HANDLE_CASE1].fd = pfds.writefd;
intr_handles[TEST_INTERRUPT_HANDLE_CASE1].type = RTE_INTR_HANDLE_UIO;
 
@@ -250,6 +255,14 @@ test_interrupt_enable(void)
return -1;
}
 
+   /* check with specific valid intr_handle */
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT];
+   if (rte_intr_enable(&test_intr_handle) == 0) {
+   printf("unexpectedly enable a specific intr_handle "
+   "successfully\n");
+   return -1;
+   }
+
/* check with valid handler and its type */
test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
if (rte_intr_enable(&test_intr_handle) < 0) {
@@ -306,6 +319,14 @@ test_interrupt_disable(void)
return -1;
}
 
+   /* check with specific valid intr_handle */
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT];
+   if (rte_intr_disable(&test_intr_handle) == 0) {
+   printf("unexpectedly disable a specific intr_handle "
+   "successfully\n");
+   return -1;
+   }
+
/* check with valid handler and its type */
test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
if (rte_intr_disable(&test_intr_handle) < 0) {
@@ -393,9 +414,17 @@ test_interrupt(void)
goto out;
}
 
+   printf("Check valid device event interrupt full path\n");
+   if (test_interrupt_full_path_check(
+   TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT) < 0) {
+   printf("failure occurred during checking valid device event "
+   "interrupt full pa

[dpdk-dev] [PATCH V19 2/4] eal: add device event monitor framework

2018-04-05 Thread Jeff Guo
This patch adds a general device event monitor framework at the EAL
device layer, for device hotplug awareness and the actions taken
accordingly. It could also be extended to other types of device event
monitoring, but that is out of scope at this stage.

To get started, users first call the newly added APIs below to
enable/disable the device event monitor mechanism:
  - rte_dev_event_monitor_start
  - rte_dev_event_monitor_stop

Then users shall register or unregister callbacks through the newly added
APIs. Callbacks can be device specific or for all devices.
  - rte_dev_event_callback_register
  - rte_dev_event_callback_unregister

Take the hotplug case as an example: when a device is hot inserted or hot
removed, we get notified by the kernel and then call the user's callbacks
accordingly to handle it, such as detaching or attaching the device from
the bus. This could further benefit fail-safe or live-migration.
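
The register-then-dispatch behaviour described above can be modelled with a
small standalone sketch. This is not the code from eal_common_dev.c. It is a
simplified model that assumes a single callback list in which a NULL device
name means "all devices". All names here (dev_event_cb, cb_register,
cb_dispatch, count_removals) are hypothetical, and unregistration and locking
are left out to keep it short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical event types, standing in for the framework's own enum. */
enum dev_event_type { DEV_EVENT_ADD, DEV_EVENT_REMOVE };

typedef void (*dev_event_cb_fn)(const char *dev_name,
				enum dev_event_type type, void *arg);

struct dev_event_cb {
	TAILQ_ENTRY(dev_event_cb) next; /* list linkage */
	dev_event_cb_fn fn;             /* user callback */
	void *arg;                      /* user parameter */
	char *dev_name;                 /* NULL means "all devices" */
};

TAILQ_HEAD(dev_event_cb_list, dev_event_cb);
static struct dev_event_cb_list cb_list = TAILQ_HEAD_INITIALIZER(cb_list);

/* Append a callback for one device, or for all devices if dev_name is NULL. */
static int
cb_register(const char *dev_name, dev_event_cb_fn fn, void *arg)
{
	struct dev_event_cb *cb;

	assert(fn != NULL);
	cb = calloc(1, sizeof(*cb));
	if (cb == NULL)
		return -1;
	if (dev_name != NULL) {
		size_t n = strlen(dev_name) + 1;

		cb->dev_name = malloc(n);
		if (cb->dev_name == NULL) {
			free(cb);
			return -1;
		}
		memcpy(cb->dev_name, dev_name, n);
	}
	cb->fn = fn;
	cb->arg = arg;
	TAILQ_INSERT_TAIL(&cb_list, cb, next);
	return 0;
}

/* Invoke every matching callback; returns how many were called. */
static int
cb_dispatch(const char *dev_name, enum dev_event_type type)
{
	struct dev_event_cb *cb;
	int called = 0;

	assert(dev_name != NULL);
	TAILQ_FOREACH(cb, &cb_list, next) {
		if (cb->dev_name != NULL && strcmp(cb->dev_name, dev_name) != 0)
			continue;
		cb->fn(dev_name, type, cb->arg);
		called++;
	}
	return called;
}

/* Example callback: counts removal events in the int that arg points to. */
static void
count_removals(const char *dev_name, enum dev_event_type type, void *arg)
{
	(void)dev_name;
	if (type == DEV_EVENT_REMOVE)
		(*(int *)arg)++;
}
```

Keeping one list with an optional device name lets a single dispatch loop
serve both per-device and catch-all subscribers.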

Signed-off-by: Jeff Guo 
---
v19->v18:
clean up the coding style and fix typos
---
 doc/guides/rel_notes/release_18_05.rst  |   9 ++
 lib/librte_eal/bsdapp/eal/Makefile  |   1 +
 lib/librte_eal/bsdapp/eal/eal_dev.c |  21 +
 lib/librte_eal/bsdapp/eal/meson.build   |   1 +
 lib/librte_eal/common/eal_common_dev.c  | 161 
 lib/librte_eal/common/eal_private.h |  15 +++
 lib/librte_eal/common/include/rte_dev.h |  94 +++
 lib/librte_eal/linuxapp/eal/Makefile|   1 +
 lib/librte_eal/linuxapp/eal/eal_dev.c   |  22 +
 lib/librte_eal/linuxapp/eal/meson.build |   1 +
 lib/librte_eal/rte_eal_version.map  |  10 ++
 11 files changed, 336 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c

diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index e5fac1c..d3c86bd 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -58,6 +58,15 @@ New Features
   * Added support for NVGRE, VXLAN and GENEVE filters in flow API.
   * Added support for DROP action in flow API.
 
+* **Added device event monitor framework.**
+
+  Added a general device event monitor framework at EAL, for device dynamic 
management.
+  Such as device hotplug awareness and actions adopted accordingly. The list 
of new APIs:
+
+  * ``rte_dev_event_monitor_start`` and ``rte_dev_event_monitor_stop`` are for
+the event monitor enable and disable.
+  * ``rte_dev_event_callback_register`` and 
``rte_dev_event_callback_unregister``
+are for the user's callbacks register and unregister.
 
 API Changes
 ---
diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..90b88eb 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -33,6 +33,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_alarm.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_dev.c
 
 # from common dir
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_lcore.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_dev.c 
b/lib/librte_eal/bsdapp/eal/eal_dev.c
new file mode 100644
index 000..1c6c51b
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_dev.c
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+int __rte_experimental
+rte_dev_event_monitor_start(void)
+{
+   RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n");
+   return -1;
+}
+
+int __rte_experimental
+rte_dev_event_monitor_stop(void)
+{
+   RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n");
+   return -1;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build 
b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..6dfc533 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -12,4 +12,5 @@ env_sources = files('eal_alarm.c',
'eal_timer.c',
'eal.c',
'eal_memory.c',
+   'eal_dev.c'
 )
diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index cd07144..e202cf2 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -14,9 +14,34 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "eal_private.h"
 
+/**
+ * The device event callback description.
+ *
+ * It contains callback address to be registered by user application,
+ * the pointer to the parameters for callback, and the device name.
+ */
+struct dev_event_callback {
+   TAILQ_ENTRY(dev_event_callback) next; /**< Callbacks list */
+   rte_dev_event_cb_fn cb_fn;/**< Callback address */
+   void *cb_arg;   /**< Callback parameter */
+   char *dev_name;  /**<

[dpdk-dev] [PATCH V19 3/4] eal/linux: uevent parse and process

2018-04-05 Thread Jeff Guo
To handle uevents detected from the kernel side, add uevent parse and
process functions that translate a uevent into a device event, which the
user has subscribed to monitor.

Signed-off-by: Jeff Guo 
---
v19->v18:
fix some misunderstanding part
---
 lib/librte_eal/linuxapp/eal/eal_dev.c | 196 +-
 1 file changed, 194 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c 
b/lib/librte_eal/linuxapp/eal/eal_dev.c
index 9c8d1a0..4686c41 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -2,21 +2,213 @@
  * Copyright(c) 2018 Intel Corporation
  */
 
+#include 
+#include 
+#include 
+#include 
+
 #include 
 #include 
 #include 
+#include 
+#include 
+
+#include "eal_private.h"
+
+static struct rte_intr_handle intr_handle = {.fd = -1 };
+static bool monitor_started;
+
+#define EAL_UEV_MSG_LEN 4096
+#define EAL_UEV_MSG_ELEM_LEN 128
+
+/* identify the system layer which reports this event. */
+enum eal_dev_event_subsystem {
+   EAL_DEV_EVENT_SUBSYSTEM_PCI, /* PCI bus device event */
+   EAL_DEV_EVENT_SUBSYSTEM_UIO, /* UIO driver device event */
+   EAL_DEV_EVENT_SUBSYSTEM_VFIO, /* VFIO driver device event */
+   EAL_DEV_EVENT_SUBSYSTEM_MAX
+};
+
+static int
+dev_uev_socket_fd_create(void)
+{
+   struct sockaddr_nl addr;
+   int ret;
+
+   intr_handle.fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC |
+   SOCK_NONBLOCK,
+   NETLINK_KOBJECT_UEVENT);
+   if (intr_handle.fd < 0) {
+   RTE_LOG(ERR, EAL, "create uevent fd failed.\n");
+   return -1;
+   }
+
+   memset(&addr, 0, sizeof(addr));
+   addr.nl_family = AF_NETLINK;
+   addr.nl_pid = 0;
+   addr.nl_groups = 0xffffffff;
+
+   ret = bind(intr_handle.fd, (struct sockaddr *) &addr, sizeof(addr));
+   if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Failed to bind uevent socket.\n");
+   goto err;
+   }
+
+   return 0;
+err:
+   close(intr_handle.fd);
+   intr_handle.fd = -1;
+   return ret;
+}
+
+static int
+dev_uev_parse(const char *buf, struct rte_dev_event *event, int length)
+{
+   char action[EAL_UEV_MSG_ELEM_LEN];
+   char subsystem[EAL_UEV_MSG_ELEM_LEN];
+   char pci_slot_name[EAL_UEV_MSG_ELEM_LEN];
+   int i = 0, ret = 0;
+
+   memset(action, 0, EAL_UEV_MSG_ELEM_LEN);
+   memset(subsystem, 0, EAL_UEV_MSG_ELEM_LEN);
+   memset(pci_slot_name, 0, EAL_UEV_MSG_ELEM_LEN);
+
+   while (i < length) {
+   for (; i < length; i++) {
+   if (*buf)
+   break;
+   buf++;
+   }
+   /**
+* check device uevent from kernel side, no need to check
+* uevent from udev.
+*/
+   if (!strncmp(buf, "libudev", 7)) {
+   buf += 7;
+   i += 7;
+   return -1;
+   }
+   if (!strncmp(buf, "ACTION=", 7)) {
+   buf += 7;
+   i += 7;
+   snprintf(action, sizeof(action), "%s", buf);
+   } else if (!strncmp(buf, "SUBSYSTEM=", 10)) {
+   buf += 10;
+   i += 10;
+   snprintf(subsystem, sizeof(subsystem), "%s", buf);
+   } else if (!strncmp(buf, "PCI_SLOT_NAME=", 14)) {
+   buf += 14;
+   i += 14;
+   snprintf(pci_slot_name, sizeof(pci_slot_name), "%s", buf);
+   event->devname = strdup(pci_slot_name);
+   }
+   for (; i < length; i++) {
+   if (*buf == '\0')
+   break;
+   buf++;
+   }
+   }
+
+   /* parse the subsystem layer */
+   if (!strncmp(subsystem, "uio", 3))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_UIO;
+   else if (!strncmp(subsystem, "pci", 3))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_PCI;
+   else if (!strncmp(subsystem, "vfio", 4))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_VFIO;
+   else
+   ret = -1;
 
+   /* parse the action type */
+   if (!strncmp(action, "add", 3))
+   event->type = RTE_DEV_EVENT_ADD;
+   else if (!strncmp(action, "remove", 6))
+   event->type = RTE_DEV_EVENT_REMOVE;
+   else
+   ret = -1;
+   return ret;
+}
+
+static void
+dev_uev_handler(__rte_unused void *param)
+{
+   struct rte_dev_event uevent;
+   int ret;
+   char buf[EAL_UEV_MSG_LEN];
+
+   memset(&uevent, 0, sizeof(struct rte_dev_event));
+   memset(buf, 0, EAL_UEV_MSG_LEN);
+
+   ret = recv(intr_handle.fd, buf, EAL_UEV_MSG_LEN, MSG_DONTWAIT);
+   if (ret == 0 ||

[dpdk-dev] [PATCH V19 4/4] app/testpmd: enable device hotplug monitoring

2018-04-05 Thread Jeff Guo
Use testpmd as an example to show how an application uses the device event
APIs to monitor hotplug events, including both hot removal and hot
insertion events.

The process is as follows. First, testpmd enables hotplug with the command
line option below,

E.g. ./build/app/testpmd -c 0x3 --n 4 -- -i --hot-plug

Then testpmd starts the device event monitor by calling the new API
(rte_dev_event_monitor_start) and registers the user's callback by calling
the API (rte_dev_event_callback_register). When a device is hot inserted or
hot removed, the device event monitor detects the event and calls the
user's callbacks, so the user can process the event in the callback
accordingly.

This patch only shows the event monitoring. Device attach/detach is not
involved here and will be added by another hotplug patch set.
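
For readers unfamiliar with how a flag-style long option such as --hot-plug
is picked up, here is a minimal standalone sketch using getopt_long(). It
mirrors the shape of the lgopts table in parameters.c but is not the testpmd
code. The helper name parse_hot_plug() is hypothetical, and only the "-i"
short option and the "hot-plug" long option are modelled.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if "--hot-plug" is present in argv, 0 if it is absent and
 * -1 on an unknown option. parse_hot_plug() is a hypothetical helper.
 */
static int
parse_hot_plug(int argc, char **argv)
{
	static const struct option lgopts[] = {
		{ "hot-plug", no_argument, NULL, 0 },
		{ NULL, 0, NULL, 0 },
	};
	int opt, opt_idx = 0, hot_plug = 0;

	assert(argv != NULL);
	optind = 1; /* restart scanning at the first argument */
	opterr = 0; /* let the caller report errors */

	while ((opt = getopt_long(argc, argv, "i", lgopts, &opt_idx)) != -1) {
		switch (opt) {
		case 0: /* long option without a short equivalent */
			if (strcmp(lgopts[opt_idx].name, "hot-plug") == 0)
				hot_plug = 1;
			break;
		case 'i': /* interactive mode, ignored in this sketch */
			break;
		default:
			return -1;
		}
	}
	return hot_plug;
}
```

An application would start the event monitor and register its callback only
when this returns 1, and would stop the monitor and unregister on exit.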

Signed-off-by: Jeff Guo 
---
v19->v18:
fix some typo
---
 app/test-pmd/parameters.c |   5 +-
 app/test-pmd/testpmd.c| 101 +-
 app/test-pmd/testpmd.h|   2 +
 doc/guides/testpmd_app_ug/run_app.rst |   4 ++
 4 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2192bdc..1a05284 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -186,6 +186,7 @@ usage(char* progname)
printf("  --flow-isolate-all: "
   "requests flow API isolated mode on all ports at initialization 
time.\n");
printf("  --tx-offloads=0x: hexadecimal bitmask of TX queue 
offloads\n");
+   printf("  --hot-plug: enable hot plug for device.\n");
 }
 
 #ifdef RTE_LIBRTE_CMDLINE
@@ -621,6 +622,7 @@ launch_args_parse(int argc, char** argv)
{ "print-event",1, 0, 0 },
{ "mask-event", 1, 0, 0 },
{ "tx-offloads",1, 0, 0 },
+   { "hot-plug",   0, 0, 0 },
{ 0, 0, 0, 0 },
};
 
@@ -1101,7 +1103,8 @@ launch_args_parse(int argc, char** argv)
rte_exit(EXIT_FAILURE,
 "invalid mask-event 
argument\n");
}
-
+   if (!strcmp(lgopts[opt_idx].name, "hot-plug"))
+   hot_plug = 1;
break;
case 'h':
usage(argv[0]);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 4c0e258..d2c122a 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -284,6 +285,8 @@ uint8_t lsc_interrupt = 1; /* enabled by default */
  */
 uint8_t rmv_interrupt = 1; /* enabled by default */
 
+uint8_t hot_plug = 0; /**< hotplug disabled by default. */
+
 /*
  * Display or mask ether events
  * Default to all events except VF_MBOX
@@ -391,6 +394,12 @@ static void check_all_ports_link_status(uint32_t 
port_mask);
 static int eth_event_callback(portid_t port_id,
  enum rte_eth_event_type type,
  void *param, void *ret_param);
+static void eth_dev_event_callback(char *device_name,
+   enum rte_dev_event_type type,
+   void *param);
+static int eth_dev_event_callback_register(void);
+static int eth_dev_event_callback_unregister(void);
+
 
 /*
  * Check if all the ports are started.
@@ -1853,6 +1862,39 @@ reset_port(portid_t pid)
printf("Done\n");
 }
 
+static int
+eth_dev_event_callback_register(void)
+{
+   int ret;
+
+   /* register the device event callback */
+   ret = rte_dev_event_callback_register(NULL,
+   eth_dev_event_callback, NULL);
+   if (ret) {
+   printf("Failed to register device event callback\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+
+static int
+eth_dev_event_callback_unregister(void)
+{
+   int ret;
+
+   /* unregister the device event callback */
+   ret = rte_dev_event_callback_unregister(NULL,
+   eth_dev_event_callback, NULL);
+   if (ret < 0) {
+   printf("Failed to unregister device event callback\n");
+   return -1;
+   }
+
+   return 0;
+}
+
 void
 attach_port(char *identifier)
 {
@@ -1916,6 +1958,7 @@ void
 pmd_test_exit(void)
 {
portid_t pt_id;
+   int ret;
 
if (test_done == 0)
stop_packet_forwarding();
@@ -1929,6 +1972,18 @@ pmd_test_exit(void)
close_port(pt_id);
}
}
+
+   if (hot_plug) {
+   ret = rte_dev_event_monitor_stop();
+   if (ret)
+   RTE_LOG(ERR, EAL,
+   "fail to stop device event monitor.");
+
+   ret = eth_dev_event_callback_unregister();
+   if (ret)
+

[dpdk-dev] [PATCH 1/2] crypto/dpaa_sec: improve the error checking

2018-04-05 Thread Hemant Agrawal
From: Sunil Kumar Kori 

Reported by NXP's internal coverity

Signed-off-by: Sunil Kumar Kori 
---
 drivers/crypto/dpaa_sec/dpaa_sec.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c 
b/drivers/crypto/dpaa_sec/dpaa_sec.c
index c5191ce..79ccb22 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -388,7 +388,7 @@ static int
 dpaa_sec_prep_cdb(dpaa_sec_session *ses)
 {
struct alginfo alginfo_c = {0}, alginfo_a = {0}, alginfo = {0};
-   uint32_t shared_desc_len = 0;
+   int32_t shared_desc_len = 0;
struct sec_cdb *cdb = &ses->cdb;
int err;
 #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
@@ -530,6 +530,12 @@ dpaa_sec_prep_cdb(dpaa_sec_session *ses)
ses->digest_length, ses->dir);
}
}
+
+   if (shared_desc_len < 0) {
+   PMD_TX_LOG(ERR, "error in preparing command block\n");
+   return shared_desc_len;
+   }
+
cdb->sh_hdr.hi.field.idlen = shared_desc_len;
cdb->sh_hdr.hi.word = rte_cpu_to_be_32(cdb->sh_hdr.hi.word);
cdb->sh_hdr.lo.word = rte_cpu_to_be_32(cdb->sh_hdr.lo.word);
-- 
2.7.4



[dpdk-dev] [PATCH 2/2] crypto/dpaa2_sec: improve error handling

2018-04-05 Thread Hemant Agrawal
From: Sunil Kumar Kori 

Fixed as reported by NXP's internal coverity.
Also part of dpdk coverity.

Coverity issue: 268331
Coverity issue: 268333

Signed-off-by: Sunil Kumar Kori 
---
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c 
b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
index 23012e3..d02d821 100644
--- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
+++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
@@ -1627,7 +1627,7 @@ dpaa2_sec_auth_init(struct rte_cryptodev *dev,
 {
struct dpaa2_sec_dev_private *dev_priv = dev->data->dev_private;
struct alginfo authdata;
-   unsigned int bufsize, i;
+   int bufsize, i;
struct ctxt_priv *priv;
struct sec_flow_context *flc;
 
@@ -1723,6 +1723,10 @@ dpaa2_sec_auth_init(struct rte_cryptodev *dev,
bufsize = cnstr_shdsc_hmac(priv->flc_desc[DESC_INITFINAL].desc,
   1, 0, &authdata, !session->dir,
   session->digest_length);
+   if (bufsize < 0) {
+   DPAA2_SEC_ERR("Crypto: Invalid buffer length");
+   goto error_out;
+   }
 
flc->word1_sdl = (uint8_t)bufsize;
flc->word2_rflc_31_0 = lower_32_bits(
@@ -1753,7 +1757,7 @@ dpaa2_sec_aead_init(struct rte_cryptodev *dev,
struct dpaa2_sec_aead_ctxt *ctxt = &session->ext_params.aead_ctxt;
struct dpaa2_sec_dev_private *dev_priv = dev->data->dev_private;
struct alginfo aeaddata;
-   unsigned int bufsize, i;
+   int bufsize, i;
struct ctxt_priv *priv;
struct sec_flow_context *flc;
struct rte_crypto_aead_xform *aead_xform = &xform->aead;
@@ -1844,6 +1848,11 @@ dpaa2_sec_aead_init(struct rte_cryptodev *dev,
priv->flc_desc[0].desc, 1, 0,
&aeaddata, session->iv.length,
session->digest_length);
+   if (bufsize < 0) {
+   DPAA2_SEC_ERR("Crypto: Invalid buffer length");
+   goto error_out;
+   }
+
flc->word1_sdl = (uint8_t)bufsize;
flc->word2_rflc_31_0 = lower_32_bits(
(size_t)&(((struct dpaa2_sec_qp *)
@@ -1873,7 +1882,7 @@ dpaa2_sec_aead_chain_init(struct rte_cryptodev *dev,
struct dpaa2_sec_aead_ctxt *ctxt = &session->ext_params.aead_ctxt;
struct dpaa2_sec_dev_private *dev_priv = dev->data->dev_private;
struct alginfo authdata, cipherdata;
-   unsigned int bufsize, i;
+   int bufsize, i;
struct ctxt_priv *priv;
struct sec_flow_context *flc;
struct rte_crypto_cipher_xform *cipher_xform;
@@ -2065,6 +2074,10 @@ dpaa2_sec_aead_chain_init(struct rte_cryptodev *dev,
  ctxt->auth_only_len,
  session->digest_length,
  session->dir);
+   if (bufsize < 0) {
+   DPAA2_SEC_ERR("Crypto: Invalid buffer length");
+   goto error_out;
+   }
} else {
DPAA2_SEC_ERR("Hash before cipher not supported");
goto error_out;
@@ -2156,7 +2169,7 @@ dpaa2_sec_set_ipsec_session(struct rte_cryptodev *dev,
struct ipsec_encap_pdb encap_pdb;
struct ipsec_decap_pdb decap_pdb;
struct alginfo authdata, cipherdata;
-   unsigned int bufsize;
+   int bufsize;
struct sec_flow_context *flc;
 
PMD_INIT_FUNC_TRACE();
@@ -2346,6 +2359,12 @@ dpaa2_sec_set_ipsec_session(struct rte_cryptodev *dev,
1, 0, &decap_pdb, &cipherdata, &authdata);
} else
goto out;
+
+   if (bufsize < 0) {
+   DPAA2_SEC_ERR("Crypto: Invalid buffer length");
+   goto out;
+   }
+
flc->word1_sdl = (uint8_t)bufsize;
 
/* Enable the stashing control bit */
-- 
2.7.4



Re: [dpdk-dev] [PATCH V18 4/4] app/testpmd: enable device hotplug monitoring

2018-04-05 Thread Guo, Jia



On 4/5/2018 12:31 AM, Matan Azrad wrote:

Hi all

What do you think about adding the "--hotplug" parameter as a new EAL command 
line parameter?
This patch just uses testpmd as an example at this stage. If the overall
solution is accepted by all and there is agreement on that, I think it
could become an EAL command line parameter in a coming patch set.

good suggestion, azrad.

From: Tan, Jianfeng, Wednesday, April 4, 2018 6:23 AM

-Original Message-
From: Guo, Jia
Sent: Tuesday, April 3, 2018 6:34 PM
To: step...@networkplumber.org; Richardson, Bruce; Yigit, Ferruh;
Ananyev, Konstantin; gaetan.ri...@6wind.com; Wu, Jingjing;
tho...@monjalon.net; mo...@mellanox.com; Van Haaren, Harry; Tan,
Jianfeng
Cc: jblu...@infradead.org; shreyansh.j...@nxp.com; dev@dpdk.org; Guo,
Jia; Zhang, Helin
Subject: [PATCH V18 4/4] app/testpmd: enable device hotplug monitoring

Use testpmd for example, to show how an application use device event

s/use/uses


APIs to monitor the hotplug events, including both hot removal event
and hot insertion event.

The process is that, testpmd first enable hotplug by below commands,

E.g. ./build/app/testpmd -c 0x3 --n 4 -- -i --hot-plug

then testpmd start the device event monitor by call the new API

s/start/starts
s/call/calling


(rte_dev_event_monitor_start) and register the user's callback by call
the API (rte_dev_event_callback_register), when device being hotplug
insertion or hotplug removal, the device event monitor detects the
event and call user's callbacks, user could process the event in the
callback accordingly.

This patch only shows the event monitoring, device attach/detach would
not be involved here, will add from other hotplug patch set.

Signed-off-by: Jeff Guo 

Some typos and a trivial suggestion. Feel free to carry my

Reviewed-by: Jianfeng Tan 

in the next version.


---
v18->v17:
remove hotplug policy and detach/attach process from testpmd, let it
focus on the device event monitoring which the patch set introduced.
---
  app/test-pmd/parameters.c |   5 +-
  app/test-pmd/testpmd.c| 112
+-
  app/test-pmd/testpmd.h|   2 +
  doc/guides/testpmd_app_ug/run_app.rst |   4 ++
  4 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 97d22b8..558cd40 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -186,6 +186,7 @@ usage(char* progname)
printf("  --flow-isolate-all: "
   "requests flow API isolated mode on all ports at
initialization time.\n");
printf("  --tx-offloads=0x: hexadecimal bitmask of TX

queue

offloads\n");
+   printf("  --hot-plug: enable hot plug for device.\n");
  }

  #ifdef RTE_LIBRTE_CMDLINE
@@ -621,6 +622,7 @@ launch_args_parse(int argc, char** argv)
{ "print-event",  1, 0, 0 },
{ "mask-event",   1, 0, 0 },
{ "tx-offloads",  1, 0, 0 },
+   { "hot-plug", 0, 0, 0 },
{ 0, 0, 0, 0 },
};

@@ -1102,7 +1104,8 @@ launch_args_parse(int argc, char** argv)
rte_exit(EXIT_FAILURE,
 "invalid mask-event
argument\n");
}
-
+   if (!strcmp(lgopts[opt_idx].name, "hot-plug"))
+   hot_plug = 1;
break;
case 'h':
usage(argv[0]);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
4c0e258..2faeb90 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -12,6 +12,7 @@
  #include 
  #include 
  #include 
+#include 

  #include 
  #include 
@@ -284,6 +285,8 @@ uint8_t lsc_interrupt = 1; /* enabled by default */
   */
  uint8_t rmv_interrupt = 1; /* enabled by default */

+uint8_t hot_plug = 0; /**< hotplug disabled by default. */
+
  /*
   * Display or mask ether events
   * Default to all events except VF_MBOX @@ -391,6 +394,12 @@ static
void check_all_ports_link_status(uint32_t
port_mask);
  static int eth_event_callback(portid_t port_id,
  enum rte_eth_event_type type,
  void *param, void *ret_param);
+static int eth_dev_event_callback(char *device_name,
+   enum rte_dev_event_type type,
+   void *param);
+static int eth_dev_event_callback_register(void);
+static int eth_dev_event_callback_unregister(void);
+

  /*
   * Check if all the ports are started.
@@ -1853,6 +1862,39 @@ reset_port(portid_t pid)
printf("Done\n");
  }

+static int
+eth_dev_event_callback_register(void)
+{
+   int diag;
+
+   /* register the device event callback */
+   diag = rte_dev_event_callback_register(NULL,
+   eth_dev_event_callback, NULL);
+   if (diag) {
+   pri

Re: [dpdk-dev] [PATCH v3] event/sw: code refractor to reduce the fetch stall

2018-04-05 Thread Jerin Jacob
-Original Message-
> Date: Thu, 5 Apr 2018 11:26:30 +0530
> From: Vipin Varghese 
> To: dev@dpdk.org, harry.van.haa...@intel.com
> CC: jerin.ja...@caviumnetworks.com, Vipin Varghese
>  
> Subject: [PATCH v3] event/sw: code refractor to reduce the fetch stall
> X-Mailer: git-send-email 1.9.1
> 
> With rearranging the code to prefetch the contents before
> loop check increases performance from single and multistage
> atomic pipeline.
> 
> Signed-off-by: Vipin Varghese 
> Acked-by: Harry van Haaren 

Applied to dpdk-next-eventdev/master. Thanks.

> 


[dpdk-dev] [PATCH 2/8] bus/dpaa: fix the unchecked return value

2018-04-05 Thread Hemant Agrawal
From: Sunil Kumar Kori 

Fixes: 5d944582d028 ("bus/dpaa: check portal presence in the caller function")
Coverity issue: 268323
Cc: sta...@dpdk.org

Signed-off-by: Sunil Kumar Kori 
Acked-by: Hemant Agrawal 
---
 drivers/bus/dpaa/dpaa_bus.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c
index 3535da5..ffc90a7 100644
--- a/drivers/bus/dpaa/dpaa_bus.c
+++ b/drivers/bus/dpaa/dpaa_bus.c
@@ -308,9 +308,15 @@ rte_dpaa_portal_fq_init(void *arg, struct qman_fq *fq)
/* Affine above created portal with channel*/
u32 sdqcr;
struct qman_portal *qp;
+   int ret;
 
-   if (unlikely(!RTE_PER_LCORE(dpaa_io)))
-   rte_dpaa_portal_init(arg);
+   if (unlikely(!RTE_PER_LCORE(dpaa_io))) {
+   ret = rte_dpaa_portal_init(arg);
+   if (ret < 0) {
+   DPAA_BUS_LOG(ERR, "portal initialization failure");
+   return ret;
+   }
+   }
 
/* Initialise qman specific portals */
qp = fsl_qman_portal_create();
-- 
2.7.4



[dpdk-dev] [PATCH 1/8] bus/dpaa: fix the resource leak issue

2018-04-05 Thread Hemant Agrawal
From: Sunil Kumar Kori 

Fixes: 9d32ef0f5d61 ("bus/dpaa: support creating dynamic HW portal")
Coverity issue: 268332
Cc: sta...@dpdk.org

Signed-off-by: Sunil Kumar Kori 
Acked-by: Hemant Agrawal 
---
 drivers/bus/dpaa/base/qbman/qman_driver.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/bus/dpaa/base/qbman/qman_driver.c 
b/drivers/bus/dpaa/base/qbman/qman_driver.c
index 66838d2..07b29d5 100644
--- a/drivers/bus/dpaa/base/qbman/qman_driver.c
+++ b/drivers/bus/dpaa/base/qbman/qman_driver.c
@@ -160,6 +160,7 @@ struct qman_portal *fsl_qman_portal_create(void)
 &cpuset);
if (ret) {
error(0, ret, "pthread_getaffinity_np()");
+   kfree(q_pcfg);
return NULL;
}
 
@@ -168,12 +169,14 @@ struct qman_portal *fsl_qman_portal_create(void)
if (CPU_ISSET(loop, &cpuset)) {
if (q_pcfg->cpu != -1) {
pr_err("Thread is not affine to 1 cpu\n");
+   kfree(q_pcfg);
return NULL;
}
q_pcfg->cpu = loop;
}
if (q_pcfg->cpu == -1) {
pr_err("Bug in getaffinity handling!\n");
+   kfree(q_pcfg);
return NULL;
}
 
@@ -183,6 +186,7 @@ struct qman_portal *fsl_qman_portal_create(void)
ret = process_portal_map(&q_map);
if (ret) {
error(0, ret, "process_portal_map()");
+   kfree(q_pcfg);
return NULL;
}
q_pcfg->channel = q_map.channel;
@@ -217,6 +221,7 @@ struct qman_portal *fsl_qman_portal_create(void)
close(q_fd);
 err1:
process_portal_unmap(&q_map.addr);
+   kfree(q_pcfg);
return NULL;
 }
 
-- 
2.7.4



[dpdk-dev] [PATCH 4/8] net/dpaa: fix the oob access

2018-04-05 Thread Hemant Agrawal
Fixes: b21ed3e2a16d ("net/dpaa: support extended statistics")
Coverity issue: 268318
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index 0aad111..cbdc4f2 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -339,6 +339,9 @@ dpaa_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
 {
unsigned int i, stat_cnt = RTE_DIM(dpaa_xstats_strings);
 
+   if (limit < stat_cnt)
+   return stat_cnt;
+
if (xstats_names != NULL)
for (i = 0; i < stat_cnt; i++)
snprintf(xstats_names[i].name,
@@ -366,7 +369,7 @@ dpaa_xstats_get_by_id(struct rte_eth_dev *dev, const 
uint64_t *ids,
return 0;
 
fman_if_stats_get_all(dpaa_intf->fif, values_copy,
- sizeof(struct dpaa_if_stats));
+ sizeof(struct dpaa_if_stats) / 8);
 
for (i = 0; i < stat_cnt; i++)
values[i] =
-- 
2.7.4



[dpdk-dev] [PATCH 3/8] net/dpaa: fix the array overrun

2018-04-05 Thread Hemant Agrawal
Fixes: 62f53995caaf ("net/dpaa: add frame count based tail drop with CGR")
Coverity issue: 268342
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa/dpaa_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index db49364..0aad111 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -1105,10 +1105,10 @@ dpaa_dev_init(struct rte_eth_dev *eth_dev)
dpaa_push_mode_max_queue = DPAA_MAX_PUSH_MODE_QUEUE;
}
 
-   /* Each device can not have more than DPAA_PCD_FQID_MULTIPLIER RX
+   /* Each device can not have more than DPAA_MAX_NUM_PCD_QUEUES RX
 * queues.
 */
-   if (num_rx_fqs <= 0 || num_rx_fqs > DPAA_PCD_FQID_MULTIPLIER) {
+   if (num_rx_fqs <= 0 || num_rx_fqs > DPAA_MAX_NUM_PCD_QUEUES) {
DPAA_PMD_ERR("Invalid number of RX queues\n");
return -EINVAL;
}
-- 
2.7.4



[dpdk-dev] [PATCH 5/8] bus/dpaa: fix resource leak

2018-04-05 Thread Hemant Agrawal
Fixes: 1459585888b5 ("bus/dpaa: fix memory allocation during scan")
Coverity issue: 268337
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
---
 drivers/bus/dpaa/base/fman/fman.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/bus/dpaa/base/fman/fman.c 
b/drivers/bus/dpaa/base/fman/fman.c
index e6fd5f3..be91da4 100644
--- a/drivers/bus/dpaa/base/fman/fman.c
+++ b/drivers/bus/dpaa/base/fman/fman.c
@@ -442,6 +442,7 @@ fman_if_init(const struct device_node *dpa_node)
if (!pool_node) {
FMAN_ERR(-ENXIO, "%s: bad fsl,bman-buffer-pools\n",
 dname);
+   free(bpool);
goto err;
}
pname = pool_node->full_name;
@@ -449,6 +450,7 @@ fman_if_init(const struct device_node *dpa_node)
prop = of_get_property(pool_node, "fsl,bpid", &proplen);
if (!prop) {
FMAN_ERR(-EINVAL, "%s: no fsl,bpid\n", pname);
+   free(bpool);
goto err;
}
assert(proplen == sizeof(*prop));
-- 
2.7.4



[dpdk-dev] [PATCH 7/8] net/dpaa2: fix the implementation of xstats

2018-04-05 Thread Hemant Agrawal
Fixes: 1d6329b2fc1f ("net/dpaa2: support extra stats")
Cc: sta...@dpdk.org

Signed-off-by: Hemant Agrawal 
---
 drivers/net/dpaa2/dpaa2_ethdev.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 281483d..eed6dc9 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -1115,12 +1115,12 @@ dpaa2_dev_xstats_get(struct rte_eth_dev *dev, struct 
rte_eth_xstat *xstats,
union dpni_statistics value[3] = {};
unsigned int i = 0, num = RTE_DIM(dpaa2_xstats_strings);
 
-   if (xstats == NULL)
-   return 0;
-
if (n < num)
return num;
 
+   if (xstats == NULL)
+   return 0;
+
/* Get Counters from page_0*/
retcode = dpni_get_statistics(dpni, CMD_PRI_LOW, priv->token,
  0, 0, &value[0]);
@@ -1153,10 +1153,13 @@ dpaa2_dev_xstats_get(struct rte_eth_dev *dev, struct 
rte_eth_xstat *xstats,
 static int
 dpaa2_xstats_get_names(__rte_unused struct rte_eth_dev *dev,
   struct rte_eth_xstat_name *xstats_names,
-  __rte_unused unsigned int limit)
+  unsigned int limit)
 {
unsigned int i, stat_cnt = RTE_DIM(dpaa2_xstats_strings);
 
+   if (limit < stat_cnt)
+   return stat_cnt;
+
if (xstats_names != NULL)
for (i = 0; i < stat_cnt; i++)
snprintf(xstats_names[i].name,
-- 
2.7.4



[dpdk-dev] [PATCH 6/8] net/dpaa: update checksum for external pool obj

2018-04-05 Thread Hemant Agrawal
From: Akhil Goyal 

Signed-off-by: Akhil Goyal 
---
 drivers/net/dpaa/dpaa_rxtx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index bdb7f66..1316d2a 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -825,6 +825,8 @@ tx_on_external_pool(struct qman_fq *txq, struct rte_mbuf 
*mbuf,
}
 
DPAA_MBUF_TO_CONTIG_FD(dmable_mbuf, fd_arr, dpaa_intf->bp_info->bpid);
+   if (mbuf->ol_flags & DPAA_TX_CKSUM_OFFLOAD_MASK)
+   dpaa_unsegmented_checksum(mbuf, fd_arr);
 
return 0;
 }
-- 
2.7.4



[dpdk-dev] [PATCH 8/8] bus/fslmc: configure separate portal for Ethernet Rx

2018-04-05 Thread Hemant Agrawal
From: Nipun Gupta 

In case of Receive from Ethernet we add a new pull request (prefetch)
but do not fetch the results from that pull request until next
dequeue operation. This keeps the portal in busy mode.

This patch updates the portals bifurcation to have separate portals
to receive packets for Ethernet and all other devices to use a
common portal.

Signed-off-by: Nipun Gupta 
---
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c| 27 ++-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.h|  8 
 drivers/bus/fslmc/rte_bus_fslmc_version.map |  8 +++-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 12 ++--
 drivers/net/dpaa2/dpaa2_rxtx.c  | 29 -
 5 files changed, 47 insertions(+), 37 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c 
b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 881dd5f..a741626 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -350,7 +350,7 @@ dpaa2_affine_qbman_swp(void)
 }
 
 int
-dpaa2_affine_qbman_swp_sec(void)
+dpaa2_affine_qbman_ethrx_swp(void)
 {
unsigned int lcore_id = rte_lcore_id();
uint64_t tid = syscall(SYS_gettid);
@@ -361,35 +361,36 @@ dpaa2_affine_qbman_swp_sec(void)
else if (lcore_id >= RTE_MAX_LCORE)
return -1;
 
-   if (dpaa2_io_portal[lcore_id].sec_dpio_dev) {
+   if (dpaa2_io_portal[lcore_id].ethrx_dpio_dev) {
DPAA2_BUS_DP_INFO(
"DPAA Portal=%p (%d) is being shared between thread"
" %" PRIu64 " and current %" PRIu64 "\n",
-   dpaa2_io_portal[lcore_id].sec_dpio_dev,
-   dpaa2_io_portal[lcore_id].sec_dpio_dev->index,
+   dpaa2_io_portal[lcore_id].ethrx_dpio_dev,
+   dpaa2_io_portal[lcore_id].ethrx_dpio_dev->index,
dpaa2_io_portal[lcore_id].sec_tid,
tid);
-   RTE_PER_LCORE(_dpaa2_io).sec_dpio_dev
-   = dpaa2_io_portal[lcore_id].sec_dpio_dev;
+   RTE_PER_LCORE(_dpaa2_io).ethrx_dpio_dev
+   = dpaa2_io_portal[lcore_id].ethrx_dpio_dev;
rte_atomic16_inc(&dpaa2_io_portal
-[lcore_id].sec_dpio_dev->ref_count);
+[lcore_id].ethrx_dpio_dev->ref_count);
dpaa2_io_portal[lcore_id].sec_tid = tid;
 
DPAA2_BUS_DP_DEBUG(
"Old Portal=%p (%d) affined thread"
" - %" PRIu64 "\n",
-   dpaa2_io_portal[lcore_id].sec_dpio_dev,
-   dpaa2_io_portal[lcore_id].sec_dpio_dev->index,
+   dpaa2_io_portal[lcore_id].ethrx_dpio_dev,
+   dpaa2_io_portal[lcore_id].ethrx_dpio_dev->index,
tid);
return 0;
}
 
/* Populate the dpaa2_io_portal structure */
-   dpaa2_io_portal[lcore_id].sec_dpio_dev = dpaa2_get_qbman_swp(lcore_id);
+   dpaa2_io_portal[lcore_id].ethrx_dpio_dev =
+   dpaa2_get_qbman_swp(lcore_id);
 
-   if (dpaa2_io_portal[lcore_id].sec_dpio_dev) {
-   RTE_PER_LCORE(_dpaa2_io).sec_dpio_dev
-   = dpaa2_io_portal[lcore_id].sec_dpio_dev;
+   if (dpaa2_io_portal[lcore_id].ethrx_dpio_dev) {
+   RTE_PER_LCORE(_dpaa2_io).ethrx_dpio_dev
+   = dpaa2_io_portal[lcore_id].ethrx_dpio_dev;
dpaa2_io_portal[lcore_id].sec_tid = tid;
return 0;
} else {
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h 
b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
index c0bd878..d593eea 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.h
@@ -13,7 +13,7 @@
 
 struct dpaa2_io_portal_t {
struct dpaa2_dpio_dev *dpio_dev;
-   struct dpaa2_dpio_dev *sec_dpio_dev;
+   struct dpaa2_dpio_dev *ethrx_dpio_dev;
uint64_t net_tid;
uint64_t sec_tid;
void *eventdev;
@@ -25,8 +25,8 @@ RTE_DECLARE_PER_LCORE(struct dpaa2_io_portal_t, _dpaa2_io);
 #define DPAA2_PER_LCORE_DPIO RTE_PER_LCORE(_dpaa2_io).dpio_dev
 #define DPAA2_PER_LCORE_PORTAL DPAA2_PER_LCORE_DPIO->sw_portal
 
-#define DPAA2_PER_LCORE_SEC_DPIO RTE_PER_LCORE(_dpaa2_io).sec_dpio_dev
-#define DPAA2_PER_LCORE_SEC_PORTAL DPAA2_PER_LCORE_SEC_DPIO->sw_portal
+#define DPAA2_PER_LCORE_ETHRX_DPIO RTE_PER_LCORE(_dpaa2_io).ethrx_dpio_dev
+#define DPAA2_PER_LCORE_ETHRX_PORTAL DPAA2_PER_LCORE_ETHRX_DPIO->sw_portal
 
 /* Variable to store DPAA2 platform type */
 extern uint32_t dpaa2_svr_family;
@@ -39,7 +39,7 @@ struct dpaa2_dpio_dev *dpaa2_get_qbman_swp(int cpu_id);
 int dpaa2_affine_qbman_swp(void);
 
 /* Affine additional DPIO portal to current crypto processing thread */
-int dpaa2_affine_qbman_swp_sec(void);
+int dpaa2_affine_qbman_ethrx_swp(void);
 

[dpdk-dev] [PATCH V19 1/4] eal: add device event handle in interrupt thread

2018-04-05 Thread Jeff Guo
Add new interrupt handle type of RTE_INTR_HANDLE_DEV_EVENT, for
device event interrupt monitor.

Signed-off-by: Jeff Guo 
Reviewed-by: Jianfeng Tan 
---
v19->v18:
fix some typo
---
 lib/librte_eal/common/include/rte_eal_interrupts.h |  1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 11 +-
 test/test/test_interrupts.c| 39 --
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal_interrupts.h 
b/lib/librte_eal/common/include/rte_eal_interrupts.h
index 3f792a9..6eb4932 100644
--- a/lib/librte_eal/common/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/common/include/rte_eal_interrupts.h
@@ -34,6 +34,7 @@ enum rte_intr_handle_type {
RTE_INTR_HANDLE_ALARM,/**< alarm handle */
RTE_INTR_HANDLE_EXT,  /**< external handler */
RTE_INTR_HANDLE_VDEV, /**< virtual device */
+   RTE_INTR_HANDLE_DEV_EVENT,/**< device event handle */
RTE_INTR_HANDLE_MAX   /**< count of elements */
 };
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c 
b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index f86f22f..58e9328 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -559,6 +559,9 @@ rte_intr_enable(const struct rte_intr_handle *intr_handle)
return -1;
break;
 #endif
+   /* not used at this moment */
+   case RTE_INTR_HANDLE_DEV_EVENT:
+   return -1;
/* unknown handle type */
default:
RTE_LOG(ERR, EAL,
@@ -606,6 +609,9 @@ rte_intr_disable(const struct rte_intr_handle *intr_handle)
return -1;
break;
 #endif
+   /* not used at this moment */
+   case RTE_INTR_HANDLE_DEV_EVENT:
+   return -1;
/* unknown handle type */
default:
RTE_LOG(ERR, EAL,
@@ -674,7 +680,10 @@ eal_intr_process_interrupts(struct epoll_event *events, 
int nfds)
bytes_read = 0;
call = true;
break;
-
+   case RTE_INTR_HANDLE_DEV_EVENT:
+   bytes_read = 0;
+   call = true;
+   break;
default:
bytes_read = 1;
break;
diff --git a/test/test/test_interrupts.c b/test/test/test_interrupts.c
index 31a70a0..dc19175 100644
--- a/test/test/test_interrupts.c
+++ b/test/test/test_interrupts.c
@@ -20,6 +20,7 @@ enum test_interrupt_handle_type {
TEST_INTERRUPT_HANDLE_VALID,
TEST_INTERRUPT_HANDLE_VALID_UIO,
TEST_INTERRUPT_HANDLE_VALID_ALARM,
+   TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT,
TEST_INTERRUPT_HANDLE_CASE1,
TEST_INTERRUPT_HANDLE_MAX
 };
@@ -80,6 +81,10 @@ test_interrupt_init(void)
intr_handles[TEST_INTERRUPT_HANDLE_VALID_ALARM].type =
RTE_INTR_HANDLE_ALARM;
 
+   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].fd = pfds.readfd;
+   intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT].type =
+   RTE_INTR_HANDLE_DEV_EVENT;
+
intr_handles[TEST_INTERRUPT_HANDLE_CASE1].fd = pfds.writefd;
intr_handles[TEST_INTERRUPT_HANDLE_CASE1].type = RTE_INTR_HANDLE_UIO;
 
@@ -250,6 +255,14 @@ test_interrupt_enable(void)
return -1;
}
 
+   /* check with specific valid intr_handle */
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT];
+   if (rte_intr_enable(&test_intr_handle) == 0) {
+   printf("unexpectedly enable a specific intr_handle "
+   "successfully\n");
+   return -1;
+   }
+
/* check with valid handler and its type */
test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
if (rte_intr_enable(&test_intr_handle) < 0) {
@@ -306,6 +319,14 @@ test_interrupt_disable(void)
return -1;
}
 
+   /* check with specific valid intr_handle */
+   test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT];
+   if (rte_intr_disable(&test_intr_handle) == 0) {
+   printf("unexpectedly disable a specific intr_handle "
+   "successfully\n");
+   return -1;
+   }
+
/* check with valid handler and its type */
test_intr_handle = intr_handles[TEST_INTERRUPT_HANDLE_CASE1];
if (rte_intr_disable(&test_intr_handle) < 0) {
@@ -393,9 +414,17 @@ test_interrupt(void)
goto out;
}
 
+   printf("Check valid device event interrupt full path\n");
+   if (test_interrupt_full_path_check(
+   TEST_INTERRUPT_HANDLE_VALID_DEV_EVENT) < 0) {
+   printf("failure occurred during checking valid device event "
+  

Re: [dpdk-dev] [PATCH V18 4/4] app/testpmd: enable device hotplug monitoring

2018-04-05 Thread Tan, Jianfeng



On 4/5/2018 12:31 AM, Matan Azrad wrote:

Hi all

What do you think about adding the "--hotplug" parameter as a new EAL command 
line parameter?


+1

Thanks,
Jianfeng



From: Tan, Jianfeng, Wednesday, April 4, 2018 6:23 AM

-Original Message-
From: Guo, Jia
Sent: Tuesday, April 3, 2018 6:34 PM
To: step...@networkplumber.org; Richardson, Bruce; Yigit, Ferruh;
Ananyev, Konstantin; gaetan.ri...@6wind.com; Wu, Jingjing;
tho...@monjalon.net; mo...@mellanox.com; Van Haaren, Harry; Tan,
Jianfeng
Cc: jblu...@infradead.org; shreyansh.j...@nxp.com; dev@dpdk.org; Guo,
Jia; Zhang, Helin
Subject: [PATCH V18 4/4] app/testpmd: enable device hotplug monitoring

Use testpmd for example, to show how an application use device event

s/use/uses


APIs to monitor the hotplug events, including both hot removal event
and hot insertion event.

The process is that, testpmd first enable hotplug by below commands,

E.g. ./build/app/testpmd -c 0x3 --n 4 -- -i --hot-plug

then testpmd start the device event monitor by call the new API

s/start/starts
s/call/calling


(rte_dev_event_monitor_start) and register the user's callback by call
the API (rte_dev_event_callback_register), when device being hotplug
insertion or hotplug removal, the device event monitor detects the
event and call user's callbacks, user could process the event in the
callback accordingly.

This patch only shows the event monitoring, device attach/detach would
not be involved here, will add from other hotplug patch set.

Signed-off-by: Jeff Guo 

Some typos and a trivial suggestion. Feel free to carry my

Reviewed-by: Jianfeng Tan 

in the next version.


---
v18->v17:
remove hotplug policy and detach/attach process from testpmd, let it
focus on the device event monitoring which the patch set introduced.
---
  app/test-pmd/parameters.c |   5 +-
  app/test-pmd/testpmd.c| 112
+-
  app/test-pmd/testpmd.h|   2 +
  doc/guides/testpmd_app_ug/run_app.rst |   4 ++
  4 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 97d22b8..558cd40 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -186,6 +186,7 @@ usage(char* progname)
printf("  --flow-isolate-all: "
   "requests flow API isolated mode on all ports at
initialization time.\n");
printf("  --tx-offloads=0x: hexadecimal bitmask of TX

queue

offloads\n");
+   printf("  --hot-plug: enable hot plug for device.\n");
  }

  #ifdef RTE_LIBRTE_CMDLINE
@@ -621,6 +622,7 @@ launch_args_parse(int argc, char** argv)
{ "print-event",  1, 0, 0 },
{ "mask-event",   1, 0, 0 },
{ "tx-offloads",  1, 0, 0 },
+   { "hot-plug", 0, 0, 0 },
{ 0, 0, 0, 0 },
};

@@ -1102,7 +1104,8 @@ launch_args_parse(int argc, char** argv)
rte_exit(EXIT_FAILURE,
 "invalid mask-event
argument\n");
}
-
+   if (!strcmp(lgopts[opt_idx].name, "hot-plug"))
+   hot_plug = 1;
break;
case 'h':
usage(argv[0]);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
4c0e258..2faeb90 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -12,6 +12,7 @@
  #include 
  #include 
  #include 
+#include 

  #include 
  #include 
@@ -284,6 +285,8 @@ uint8_t lsc_interrupt = 1; /* enabled by default */
   */
  uint8_t rmv_interrupt = 1; /* enabled by default */

+uint8_t hot_plug = 0; /**< hotplug disabled by default. */
+
  /*
   * Display or mask ether events
   * Default to all events except VF_MBOX @@ -391,6 +394,12 @@ static
void check_all_ports_link_status(uint32_t
port_mask);
  static int eth_event_callback(portid_t port_id,
  enum rte_eth_event_type type,
  void *param, void *ret_param);
+static int eth_dev_event_callback(char *device_name,
+   enum rte_dev_event_type type,
+   void *param);
+static int eth_dev_event_callback_register(void);
+static int eth_dev_event_callback_unregister(void);
+

  /*
   * Check if all the ports are started.
@@ -1853,6 +1862,39 @@ reset_port(portid_t pid)
printf("Done\n");
  }

+static int
+eth_dev_event_callback_register(void)
+{
+   int diag;
+
+   /* register the device event callback */
+   diag = rte_dev_event_callback_register(NULL,
+   eth_dev_event_callback, NULL);
+   if (diag) {
+   printf("Failed to setup dev_event callback\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+
+static int
+eth_dev_event_callback_unregister(void)
+{
+   int dia

[dpdk-dev] [PATCH V19 2/4] eal: add device event monitor framework

2018-04-05 Thread Jeff Guo
This patch adds a general device event monitor framework at the EAL
device layer, for device hotplug awareness and the corresponding
actions. It could be extended to other types of device event
monitoring, but that is out of scope at this stage.

To get started, users first call the newly added APIs below to
enable/disable the device event monitor mechanism:
  - rte_dev_event_monitor_start
  - rte_dev_event_monitor_stop

Then users shall register or unregister callbacks through the newly added
APIs. Callbacks can be device specific, or apply to all devices.
  - rte_dev_event_callback_register
  - rte_dev_event_callback_unregister

Taking hotplug as an example: when a device is hot-inserted or
hot-removed, the kernel notifies us, and the user's callbacks are called
accordingly to handle it, for example by attaching or detaching the device
on the bus. Fail-safe and live-migration could benefit from this.

Signed-off-by: Jeff Guo 
---
v19->v18:
clear the coding style and fix typo
---
 doc/guides/rel_notes/release_18_05.rst  |   9 ++
 lib/librte_eal/bsdapp/eal/Makefile  |   1 +
 lib/librte_eal/bsdapp/eal/eal_dev.c |  21 +
 lib/librte_eal/bsdapp/eal/meson.build   |   1 +
 lib/librte_eal/common/eal_common_dev.c  | 161 
 lib/librte_eal/common/eal_private.h |  15 +++
 lib/librte_eal/common/include/rte_dev.h |  94 +++
 lib/librte_eal/linuxapp/eal/Makefile|   1 +
 lib/librte_eal/linuxapp/eal/eal_dev.c   |  22 +
 lib/librte_eal/linuxapp/eal/meson.build |   1 +
 lib/librte_eal/rte_eal_version.map  |  10 ++
 11 files changed, 336 insertions(+)
 create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
 create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c

diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index e5fac1c..d3c86bd 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -58,6 +58,15 @@ New Features
   * Added support for NVGRE, VXLAN and GENEVE filters in flow API.
   * Added support for DROP action in flow API.
 
+* **Added device event monitor framework.**
+
+  Added a general device event monitor framework at EAL, for device dynamic 
management.
+  Such as device hotplug awareness and actions adopted accordingly. The list 
of new APIs:
+
+  * ``rte_dev_event_monitor_start`` and ``rte_dev_event_monitor_stop`` are for
+the event monitor enable and disable.
+  * ``rte_dev_event_callback_register`` and 
``rte_dev_event_callback_unregister``
+are for the user's callbacks register and unregister.
 
 API Changes
 ---
diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..90b88eb 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -33,6 +33,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_alarm.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_dev.c
 
 # from common dir
 SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_lcore.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_dev.c 
b/lib/librte_eal/bsdapp/eal/eal_dev.c
new file mode 100644
index 000..1c6c51b
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_dev.c
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+int __rte_experimental
+rte_dev_event_monitor_start(void)
+{
+   RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n");
+   return -1;
+}
+
+int __rte_experimental
+rte_dev_event_monitor_stop(void)
+{
+   RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n");
+   return -1;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build 
b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..6dfc533 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -12,4 +12,5 @@ env_sources = files('eal_alarm.c',
'eal_timer.c',
'eal.c',
'eal_memory.c',
+   'eal_dev.c'
 )
diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index cd07144..e202cf2 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -14,9 +14,34 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "eal_private.h"
 
+/**
+ * The device event callback description.
+ *
+ * It contains callback address to be registered by user application,
+ * the pointer to the parameters for callback, and the device name.
+ */
+struct dev_event_callback {
+   TAILQ_ENTRY(dev_event_callback) next; /**< Callbacks list */
+   rte_dev_event_cb_fn cb_fn;/**< Callback address */
+   void *cb_arg;   /**< Callback parameter */
+   char *dev_name;  /**<

[dpdk-dev] [PATCH V19 4/4] app/testpmd: enable device hotplug monitoring

2018-04-05 Thread Jeff Guo
Use testpmd as an example to show how an application uses the device
event APIs to monitor hotplug events, including both hot removal and
hot insertion events.

testpmd first enables hotplug with the command below,

E.g. ./build/app/testpmd -c 0x3 --n 4 -- -i --hot-plug

then testpmd starts the device event monitor by calling the new API
(rte_dev_event_monitor_start) and registers the user's callback by calling
the API (rte_dev_event_callback_register). When a device is hot-inserted
or hot-removed, the device event monitor detects the event and calls the
user's callbacks, so the user can process the event in the callback
accordingly.

This patch only shows the event monitoring; device attach/detach is not
involved here and will be added by another hotplug patch set.

Signed-off-by: Jeff Guo 
Reviewed-by: Jianfeng Tan 
---
v19->v18:
fix some typo
---
 app/test-pmd/parameters.c |   5 +-
 app/test-pmd/testpmd.c| 101 +-
 app/test-pmd/testpmd.h|   2 +
 doc/guides/testpmd_app_ug/run_app.rst |   4 ++
 4 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 2192bdc..1a05284 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -186,6 +186,7 @@ usage(char* progname)
printf("  --flow-isolate-all: "
   "requests flow API isolated mode on all ports at initialization 
time.\n");
printf("  --tx-offloads=0x: hexadecimal bitmask of TX queue 
offloads\n");
+   printf("  --hot-plug: enable hot plug for device.\n");
 }
 
 #ifdef RTE_LIBRTE_CMDLINE
@@ -621,6 +622,7 @@ launch_args_parse(int argc, char** argv)
{ "print-event",1, 0, 0 },
{ "mask-event", 1, 0, 0 },
{ "tx-offloads",1, 0, 0 },
+   { "hot-plug",   0, 0, 0 },
{ 0, 0, 0, 0 },
};
 
@@ -1101,7 +1103,8 @@ launch_args_parse(int argc, char** argv)
rte_exit(EXIT_FAILURE,
 "invalid mask-event 
argument\n");
}
-
+   if (!strcmp(lgopts[opt_idx].name, "hot-plug"))
+   hot_plug = 1;
break;
case 'h':
usage(argv[0]);
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 4c0e258..d2c122a 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -284,6 +285,8 @@ uint8_t lsc_interrupt = 1; /* enabled by default */
  */
 uint8_t rmv_interrupt = 1; /* enabled by default */
 
+uint8_t hot_plug = 0; /**< hotplug disabled by default. */
+
 /*
  * Display or mask ether events
  * Default to all events except VF_MBOX
@@ -391,6 +394,12 @@ static void check_all_ports_link_status(uint32_t 
port_mask);
 static int eth_event_callback(portid_t port_id,
  enum rte_eth_event_type type,
  void *param, void *ret_param);
+static void eth_dev_event_callback(char *device_name,
+   enum rte_dev_event_type type,
+   void *param);
+static int eth_dev_event_callback_register(void);
+static int eth_dev_event_callback_unregister(void);
+
 
 /*
  * Check if all the ports are started.
@@ -1853,6 +1862,39 @@ reset_port(portid_t pid)
printf("Done\n");
 }
 
+static int
+eth_dev_event_callback_register(void)
+{
+   int ret;
+
+   /* register the device event callback */
+   ret = rte_dev_event_callback_register(NULL,
+   eth_dev_event_callback, NULL);
+   if (ret) {
+   printf("Failed to register device event callback\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+
+static int
+eth_dev_event_callback_unregister(void)
+{
+   int ret;
+
+   /* unregister the device event callback */
+   ret = rte_dev_event_callback_unregister(NULL,
+   eth_dev_event_callback, NULL);
+   if (ret < 0) {
+   printf("Failed to unregister device event callback\n");
+   return -1;
+   }
+
+   return 0;
+}
+
 void
 attach_port(char *identifier)
 {
@@ -1916,6 +1958,7 @@ void
 pmd_test_exit(void)
 {
portid_t pt_id;
+   int ret;
 
if (test_done == 0)
stop_packet_forwarding();
@@ -1929,6 +1972,18 @@ pmd_test_exit(void)
close_port(pt_id);
}
}
+
+   if (hot_plug) {
+   ret = rte_dev_event_monitor_stop();
+   if (ret)
+   RTE_LOG(ERR, EAL,
+   "fail to stop device event monitor.");
+
+   ret = eth_dev_event_callback_unregister();
+   if 

[dpdk-dev] [PATCH V19 3/4] eal/linux: uevent parse and process

2018-04-05 Thread Jeff Guo
In order to handle the uevents detected from the kernel side, add uevent
parse and process functions to translate each uevent into a device event
that the user has subscribed to monitor.

Signed-off-by: Jeff Guo 
---
v19->18:
fix some misunderstanding part
---
 lib/librte_eal/linuxapp/eal/eal_dev.c | 196 +-
 1 file changed, 194 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c 
b/lib/librte_eal/linuxapp/eal/eal_dev.c
index 9c8d1a0..4686c41 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -2,21 +2,213 @@
  * Copyright(c) 2018 Intel Corporation
  */
 
+#include 
+#include 
+#include 
+#include 
+
 #include 
 #include 
 #include 
+#include 
+#include 
+
+#include "eal_private.h"
+
+static struct rte_intr_handle intr_handle = {.fd = -1 };
+static bool monitor_started;
+
+#define EAL_UEV_MSG_LEN 4096
+#define EAL_UEV_MSG_ELEM_LEN 128
+
+/* identify the system layer which reports this event. */
+enum eal_dev_event_subsystem {
+   EAL_DEV_EVENT_SUBSYSTEM_PCI, /* PCI bus device event */
+   EAL_DEV_EVENT_SUBSYSTEM_UIO, /* UIO driver device event */
+   EAL_DEV_EVENT_SUBSYSTEM_VFIO, /* VFIO driver device event */
+   EAL_DEV_EVENT_SUBSYSTEM_MAX
+};
+
+static int
+dev_uev_socket_fd_create(void)
+{
+   struct sockaddr_nl addr;
+   int ret;
+
+   intr_handle.fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC |
+   SOCK_NONBLOCK,
+   NETLINK_KOBJECT_UEVENT);
+   if (intr_handle.fd < 0) {
+   RTE_LOG(ERR, EAL, "create uevent fd failed.\n");
+   return -1;
+   }
+
+   memset(&addr, 0, sizeof(addr));
+   addr.nl_family = AF_NETLINK;
+   addr.nl_pid = 0;
+   addr.nl_groups = 0x;
+
+   ret = bind(intr_handle.fd, (struct sockaddr *) &addr, sizeof(addr));
+   if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Failed to bind uevent socket.\n");
+   goto err;
+   }
+
+   return 0;
+err:
+   close(intr_handle.fd);
+   intr_handle.fd = -1;
+   return ret;
+}
+
+static int
+dev_uev_parse(const char *buf, struct rte_dev_event *event, int length)
+{
+   char action[EAL_UEV_MSG_ELEM_LEN];
+   char subsystem[EAL_UEV_MSG_ELEM_LEN];
+   char pci_slot_name[EAL_UEV_MSG_ELEM_LEN];
+   int i = 0, ret = 0;
+
+   memset(action, 0, EAL_UEV_MSG_ELEM_LEN);
+   memset(subsystem, 0, EAL_UEV_MSG_ELEM_LEN);
+   memset(pci_slot_name, 0, EAL_UEV_MSG_ELEM_LEN);
+
+   while (i < length) {
+   for (; i < length; i++) {
+   if (*buf)
+   break;
+   buf++;
+   }
+   /**
+* check device uevent from kernel side, no need to check
+* uevent from udev.
+*/
+   if (!strncmp(buf, "libudev", 7)) {
+   buf += 7;
+   i += 7;
+   return -1;
+   }
+   if (!strncmp(buf, "ACTION=", 7)) {
+   buf += 7;
+   i += 7;
+   snprintf(action, sizeof(action), "%s", buf);
+   } else if (!strncmp(buf, "SUBSYSTEM=", 10)) {
+   buf += 10;
+   i += 10;
+   snprintf(subsystem, sizeof(subsystem), "%s", buf);
+   } else if (!strncmp(buf, "PCI_SLOT_NAME=", 14)) {
+   buf += 14;
+   i += 14;
+   snprintf(pci_slot_name, sizeof(pci_slot_name), "%s", buf);
+   event->devname = strdup(pci_slot_name);
+   }
+   for (; i < length; i++) {
+   if (*buf == '\0')
+   break;
+   buf++;
+   }
+   }
+
+   /* parse the subsystem layer */
+   if (!strncmp(subsystem, "uio", 3))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_UIO;
+   else if (!strncmp(subsystem, "pci", 3))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_PCI;
+   else if (!strncmp(subsystem, "vfio", 4))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_VFIO;
+   else
+   ret = -1;
 
+   /* parse the action type */
+   if (!strncmp(action, "add", 3))
+   event->type = RTE_DEV_EVENT_ADD;
+   else if (!strncmp(action, "remove", 6))
+   event->type = RTE_DEV_EVENT_REMOVE;
+   else
+   ret = -1;
+   return ret;
+}
+
+static void
+dev_uev_handler(__rte_unused void *param)
+{
+   struct rte_dev_event uevent;
+   int ret;
+   char buf[EAL_UEV_MSG_LEN];
+
+   memset(&uevent, 0, sizeof(struct rte_dev_event));
+   memset(buf, 0, EAL_UEV_MSG_LEN);
+
+   ret = recv(intr_handle.fd, buf, EAL_UEV_MSG_LEN, MSG_DONTWAIT);
+   if (ret == 0 ||

[dpdk-dev] [PATCH V19 0/4] add device event monitor framework

2018-04-05 Thread Jeff Guo
Regarding hot plug in DPDK, we already have a proactive way to add/remove
devices through APIs (rte_eal_hotplug_add/remove), and also have the
fail-safe driver to offload the fail-safe work from the app user. But a
general mechanism to monitor hotplug events for all drivers is still
lacking; currently the hotplug interrupt event differs between devices
and drivers, such as mlx4, pci driver and others.

Taking the hot removal event as an example, not all pci drivers expose the
remove interrupt, so in order to make the hot plug feature easy to use
for pci drivers, something must be done to detect the remove event
at the kernel level and offer a new line of interrupt to user land.

Based on the uevent of the kobject mechanism in the kernel, we can monitor
the hot plug status of devices, not only uio/vfio of pci bus devices, but
also others, such as cpu/usb/pci-express bus devices.

The idea is as below.

a.The uevent message from FD monitoring looks like below.
remove@/devices/pci:80/:80:02.2/:82:00.0/:83:03.0/:84:00.2/uio/uio2
ACTION=remove
DEVPATH=/devices/pci:80/:80:02.2/:82:00.0/:83:03.0/:84:00.2/uio/uio2
SUBSYSTEM=uio
MAJOR=243
MINOR=2
DEVNAME=uio2
SEQNUM=11366

b.add device event monitor framework:
add several general APIs to enable uevent monitoring.

c.show example how to use uevent monitor
enable uevent monitoring in testpmd to show device event monitor mechanism
usage.

TODO: failure handler mechanism for hot plug and driver auto bind for hot
insertion. These will be covered by the next hot plug patch set.
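
For illustration, here is a minimal sketch of how a message like the one in
(a) could be split into its fields. It assumes the payload arrives as
NUL-separated "KEY=value" strings; the names uev_info, uev_action and
parse_uevent are hypothetical and are not the API added by this patch set.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for this sketch only (not part of the patch set). */
enum uev_action { UEV_UNKNOWN = 0, UEV_ADD, UEV_REMOVE };

struct uev_info {
	enum uev_action action;
	const char *subsystem;	/* points into the caller's buffer */
	const char *devname;	/* points into the caller's buffer */
};

/*
 * Walk a buffer of NUL-separated strings and pick out ACTION, SUBSYSTEM
 * and DEVNAME. Returns 0 when both a known action and a subsystem were
 * found, -1 otherwise.
 */
static int
parse_uevent(const char *buf, size_t len, struct uev_info *info)
{
	size_t i = 0;

	memset(info, 0, sizeof(*info));
	while (i < len) {
		const char *s = buf + i;
		const char *end = memchr(s, '\0', len - i);

		if (end == NULL)
			break;	/* unterminated tail: ignore it */
		if (strncmp(s, "ACTION=", 7) == 0) {
			if (strcmp(s + 7, "add") == 0)
				info->action = UEV_ADD;
			else if (strcmp(s + 7, "remove") == 0)
				info->action = UEV_REMOVE;
		} else if (strncmp(s, "SUBSYSTEM=", 10) == 0) {
			info->subsystem = s + 10;
		} else if (strncmp(s, "DEVNAME=", 8) == 0) {
			info->devname = s + 8;
		}
		i += (size_t)(end - s) + 1;	/* skip the string and its NUL */
	}
	return (info->action != UEV_UNKNOWN && info->subsystem != NULL) ? 0 : -1;
}
```

Unknown keys (MAJOR, MINOR, SEQNUM, ...) are simply skipped, so any policy on
what to do with an event stays in the caller.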

patchset history:
v19->v18:
fix some typos and misunderstood parts

v18->v17:
1.add feature announcement in release document, fix bsp compile issue.
2.refine socket configuration.
3.remove hotplug policy and detach/attach process from testpmd, let it
focus on the device event monitoring which the patch set introduced.

v17->v16:
1.add related part of the interrupt handle type adding.
2.add new API into map, fix typo issue, add (void*)-1 value for unregistering
all callbacks
3.add new file into meson.build, modify coding style and add print info, delete
unused part.
4.unregister all user's callback when stop event monitor

v16->v15:
1.remove some linux related code out of eal common layer
2.fix some readability issues.

v15->v14:
1.use the existing eal interrupt epoll instead of rte service for the monitor
thread.
2.add new device event handle type in eal interrupt.
3.remove the uevent type check and any policy from eal,
let the user's callback do the checking and management.
4.add "--hot-plug" configure parameter in testpmd to switch the hotplug feature.

v14->v13:
1.add __rte_experimental on function definitions and fix bsd build issue

v13->v12:
1.fix some logic issue and null check issue
2.fix monitor stop func issue

v12->v11:
1.identify null param in callback for monitor all devices uevent

v11->v10:
1:modify some typo and add experimental tag in new file.
2:modify callback register calling.

v10->v9:
1.fix prefix issue.
2.use a common callback lists for all device and all type to replace
add callback parameter into device struct.
3.delete some unused parts.

v9->v8:
split the patch set into small and explicit patch

v8->v7:
1.use rte_service to replace pthread management.
2.fix definition issue and copyright issue
3.fix some lock issue

v7->v6:
1.modify vdev part according to the vdev rework
2.re-define and split the func into common and bus specific code
3.fix some incorrect issue.
4.fix the system hang after sending packets.

v6->v5:
1.add hot plug policy, in eal, default handle to prepare hot plug work for
all pci devices, then let the app decide which devices need to
hot plug.
2.modify to manage event callback in each device.
3.fix some system hang issue on igb_uio release.
4.modify the pci part to the bus-pci base on the bus rework.
5.add hot plug policy in app, show example to use hotplug list to
decide which devices need to hot plug.

v5->v4:
1.Move uevent monitor epolling from eal interrupt to eal device layer.
2.Redefine the eal device API for common, and distinguish between linux and bsd
3.Add failure handler helper api in bus layer. Add function to find device by
name.
4.Replace individual fd bound to a single device with a common fd for polling
all devices.
5.Add registration of hot insertion monitoring and processing, add function to
auto bind driver before user adds device
6.Refine some coding style and typos issue
7.add new callback to process hot insertion

v4->v3:
1.move uevent monitor api from eal interrupt to eal device layer.
2.create uevent type and struct in eal device.
3.move uevent handler for each driver to eal layer.
4.add uevent failure handler to process signal fault issue.
5.add example for request and use uevent monitoring in testpmd.

v3->v2:
1.refine some return error
2.refine the string searching logic to avoid memory issue

v2->v1:
1.remove global variables of hotplug_fd, add ue

Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions

2018-04-05 Thread Wang, Xiao W
Hi Hemant,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Hemant Agrawal
> Sent: Wednesday, April 4, 2018 3:49 PM
> To: dev@dpdk.org
> Cc: Burakov, Anatoly ; tho...@monjalon.net
> Subject: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions
> 
> This patch moves some of the internal vfio functions from
> eal_vfio.h to rte_vfio.h for common uses with "rte_" prefix.
> 
> This patch also change the FSLMC bus usages from the internal
> VFIO functions to external ones with "rte_" prefix
> 
> Signed-off-by: Hemant Agrawal 
> Acked-by: Anatoly Burakov 
> ---
> v5: fix the bsd compilation
> 
>  drivers/bus/fslmc/Makefile |  1 -
>  drivers/bus/fslmc/fslmc_vfio.c |  7 +--
>  drivers/bus/fslmc/fslmc_vfio.h |  2 -
>  drivers/bus/fslmc/meson.build  |  1 -
>  lib/librte_eal/bsdapp/eal/eal.c| 24 +
>  lib/librte_eal/common/include/rte_vfio.h   | 75
> +-
>  lib/librte_eal/linuxapp/eal/eal_vfio.c | 39 +++---
>  lib/librte_eal/linuxapp/eal/eal_vfio.h | 21 
>  lib/librte_eal/linuxapp/eal/eal_vfio_mp_sync.c |  4 +-
>  lib/librte_eal/rte_eal_version.map |  3 ++
>  10 files changed, 127 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile
> index 93870ba..3aa34e2 100644
> --- a/drivers/bus/fslmc/Makefile
> +++ b/drivers/bus/fslmc/Makefile
> @@ -16,7 +16,6 @@ CFLAGS += $(WERROR_FLAGS)
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/mc
>  CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include
> -CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal
>  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
>  LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
>  LDLIBS += -lrte_ethdev
> diff --git a/drivers/bus/fslmc/fslmc_vfio.c b/drivers/bus/fslmc/fslmc_vfio.c
> index 62499de..f3b2960 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.c
> +++ b/drivers/bus/fslmc/fslmc_vfio.c
> @@ -91,7 +91,8 @@ fslmc_get_container_group(int *groupid)
>   }
> 
>   /* get group number */
> - ret = vfio_get_group_no(SYSFS_FSL_MC_DEVICES, g_container,
> groupid);
> + ret = rte_vfio_get_group_num(SYSFS_FSL_MC_DEVICES,
> +  g_container, groupid);
>   if (ret <= 0) {
>   DPAA2_BUS_ERR("Unable to find %s IOMMU group",
> g_container);
>   return -1;
> @@ -124,7 +125,7 @@ vfio_connect_container(void)
>   }
> 
>   /* Opens main vfio file descriptor which represents the "container" */
> - fd = vfio_get_container_fd();
> + fd = rte_vfio_get_container_fd();
>   if (fd < 0) {
>   DPAA2_BUS_ERR("Failed to open VFIO container");
>   return -errno;
> @@ -620,7 +621,7 @@ fslmc_vfio_setup_group(void)
>   }
> 
>   /* Get the actual group fd */
> - ret = vfio_get_group_fd(groupid);
> + ret = rte_vfio_get_group_fd(groupid);
>   if (ret < 0)
>   return ret;
>   vfio_group.fd = ret;
> diff --git a/drivers/bus/fslmc/fslmc_vfio.h b/drivers/bus/fslmc/fslmc_vfio.h
> index e8fb344..9e2c4fe 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.h
> +++ b/drivers/bus/fslmc/fslmc_vfio.h
> @@ -10,8 +10,6 @@
> 
>  #include 
> 
> -#include "eal_vfio.h"
> -
>  #define DPAA2_MC_DPNI_DEVID  7
>  #define DPAA2_MC_DPSECI_DEVID3
>  #define DPAA2_MC_DPCON_DEVID 5
> diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build
> index e94340e..78f9d92 100644
> --- a/drivers/bus/fslmc/meson.build
> +++ b/drivers/bus/fslmc/meson.build
> @@ -22,6 +22,5 @@ sources = files('fslmc_bus.c',
> 
>  allow_experimental_apis = true
> 
> -includes += include_directories('../../../lib/librte_eal/linuxapp/eal')
>  includes += include_directories('mc', 'qbman/include', 'portal')
>  cflags += ['-D_GNU_SOURCE']
> diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
> index 4eafcb5..e2f2df1 100644
> --- a/lib/librte_eal/bsdapp/eal/eal.c
> +++ b/lib/librte_eal/bsdapp/eal/eal.c
> @@ -746,6 +746,10 @@ int rte_vfio_enable(const char *modname);
>  int rte_vfio_is_enabled(const char *modname);
>  int rte_vfio_noiommu_is_enabled(void);
>  int rte_vfio_clear_group(int vfio_group_fd);
> +int rte_vfio_get_group_num(const char *sysfs_base, const char *dev_addr,
> +int *iommu_group_num);
> +int rte_vfio_get_container_fd(void);
> +int rte_vfio_get_group_fd(int iommu_group_num);

Considering the "group_no" field defined in eal_vfio.h, will "iommu_group_num"
cause inconsistency in naming?
 
/*
 * we don't need to store device fd's anywhere since they can be obtained from
 * the group fd via an ioctl() call.
 */
struct vfio_group {
int group_no;
int fd;
int devices;
};

BRs,
Xiao


Re: [dpdk-dev] [PATCH v5] net/virtio-user: add support for server mode

2018-04-05 Thread Yang, Zhiyong
Tiwei,

Thanks a lot for your review and comments.

Reply inline.

> -Original Message-
> From: Bie, Tiwei
> Sent: Thursday, April 5, 2018 4:29 PM
> To: Yang, Zhiyong 
> Cc: dev@dpdk.org; maxime.coque...@redhat.com; tho...@monjalon.net;
> Tan, Jianfeng ; Wang, Zhihong
> ; Wang, Dong1 
> Subject: Re: [PATCH v5] net/virtio-user: add support for server mode
> 
> On Thu, Apr 05, 2018 at 01:17:53AM +0800, zhiyong.y...@intel.com wrote:
> [...]
> > +static int
> > +virtio_user_start_server(struct virtio_user_dev *dev, struct
> > +sockaddr_un *un) {
> > +   int ret;
> > +   int flag;
> > +   int fd = dev->listenfd;
> > +
> > +   ret = bind(fd, (struct sockaddr *)un, sizeof(*un));
> > +   if (ret < 0) {
> > +   PMD_DRV_LOG(ERR, "failed to bind to %s: %s; remove it and
> try again\n",
> > +   dev->path, strerror(errno));
> > +   goto err;
> > +   }
> > +   ret = listen(fd, MAX_VIRTIO_USER_BACKLOG);
> > +   if (ret < 0)
> > +   goto err;
> > +
> > +   flag = fcntl(fd, F_GETFL);
> > +   fcntl(fd, F_SETFL, flag | O_NONBLOCK);
> > +   dev->vhostfd = -1;
> > +
> > +   return 0;
> > +err:
> > +   close(dev->listenfd);
> 
> The dev->listenfd isn't created in this function, maybe it's better to avoid
> closing this file in this function.
> 

Ok.

> > +   return -1;
> > +}
> > +
> >  /**
> >   * Set up environment to talk with a vhost user backend.
> >   *
> > @@ -390,6 +418,7 @@ vhost_user_setup(struct virtio_user_dev *dev)  {
> > int fd;
> > int flag;
> > +   int ret = 0;
> > struct sockaddr_un un;
> >
> > fd = socket(AF_UNIX, SOCK_STREAM, 0); @@ -405,14 +434,20 @@
> > vhost_user_setup(struct virtio_user_dev *dev)
> > memset(&un, 0, sizeof(un));
> > un.sun_family = AF_UNIX;
> > snprintf(un.sun_path, sizeof(un.sun_path), "%s", dev->path);
> > -   if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
> > -   PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
> > -   close(fd);
> > -   return -1;
> > +
> > +   if (dev->is_server) {
> > +   dev->listenfd = fd;
> > +   ret = virtio_user_start_server(dev, &un);
> > +   } else {
> 
> Maybe it's better to keep the style consistent. How about something like this:
> 
>   if (dev->is_server) {
>   if (virtio_user_start_server(fd, &un) < 0) {
>   PMD_DRV_LOG(ERR, some messages...);
>   close(fd);
>   return -1;
>   }
>   dev->listenfd = fd;
>   dev->vhostfd = -1;
>   } else {
> 

OK, it looks better.

So the following code changes accordingly.

> > +   if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
> > +   PMD_DRV_LOG(ERR, "connect error, %s",
> strerror(errno));
> > +   close(fd);
> > +   return -1;
> > +   }
> > +   dev->vhostfd = fd;

Keep consistency.
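
To make the two review comments above concrete, here is a self-contained
sketch (plain POSIX, no DPDK) in which the helper only binds and listens,
while the function that created the socket is the one that closes it on
failure. The names start_server, setup_listen_socket and BACKLOG are
hypothetical stand-ins, not the code that will be in v6.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define BACKLOG 1	/* hypothetical stand-in for MAX_VIRTIO_USER_BACKLOG */

/* Bind and listen on a socket owned by the caller; never closes fd. */
static int
start_server(int fd, const struct sockaddr_un *un)
{
	int flag;

	if (bind(fd, (const struct sockaddr *)un, sizeof(*un)) < 0)
		return -1;
	if (listen(fd, BACKLOG) < 0)
		return -1;
	flag = fcntl(fd, F_GETFL);
	if (flag < 0 || fcntl(fd, F_SETFL, flag | O_NONBLOCK) < 0)
		return -1;
	return 0;
}

/* Create the socket here, so any cleanup on error also happens here. */
static int
setup_listen_socket(const char *path)
{
	struct sockaddr_un un;
	int fd;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	snprintf(un.sun_path, sizeof(un.sun_path), "%s", path);

	if (start_server(fd, &un) < 0) {
		fprintf(stderr, "failed to start server on %s: %s\n",
			path, strerror(errno));
		close(fd);
		return -1;
	}
	return fd;	/* listening, non-blocking fd */
}
```

With this split the error path has a single owner for the fd, which is what
the suggested "if (virtio_user_start_server(fd, &un) < 0) { ...; close(fd); }"
form achieves.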

> > }
> >
> > -   dev->vhostfd = fd;
> > -   return 0;
> > +   return ret;
> >  }
> >
> >  static int
> > diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > index f90fee9e5..45e324679 100644
> > --- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > +++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > @@ -254,7 +254,8 @@ virtio_user_fill_intr_handle(struct virtio_user_dev
> *dev)
> > eth_dev->intr_handle->fd = -1;
> > if (dev->vhostfd >= 0)
> > eth_dev->intr_handle->fd = dev->vhostfd;
> > -
> 
> Maybe it's better to keep this empty line (keep it before the return 0).
>

Ok.
 
> > +   else if (dev->is_server)
> > +   eth_dev->intr_handle->fd = dev->listenfd;
> > return 0;
> >  }
> >
> > @@ -267,24 +268,29 @@ virtio_user_dev_setup(struct virtio_user_dev
> *dev)
> > dev->vhostfds = NULL;
> > dev->tapfds = NULL;
> >
> > -   if (is_vhost_user_by_type(dev->path)) {
> > -   dev->ops = &ops_user;
> > +   if (dev->is_server) {
> > +   dev->ops = &ops_user;/* server mode only supports vhost
> user */
> > } else {
> > -   dev->ops = &ops_kernel;
> > -
> > -   dev->vhostfds = malloc(dev->max_queue_pairs *
> sizeof(int));
> > -   dev->tapfds = malloc(dev->max_queue_pairs * sizeof(int));
> > -   if (!dev->vhostfds || !dev->tapfds) {
> > -   PMD_INIT_LOG(ERR, "Failed to malloc");
> > -   return -1;
> > -   }
> > -
> > -   for (q = 0; q < dev->max_queue_pairs; ++q) {
> > -   dev->vhostfds[q] = -1;
> > -   dev->tapfds[q] = -1;
> > +   if (is_vhost_user_by_type(dev->path)) {
> > +   dev->ops = &ops_user;
> > +   } else {
> > +   dev->ops = &ops_kernel;
> > +
> > +   dev->vhostfds = malloc(dev->max_queue_pairs *
> > +  sizeof(int));
> > +   dev->tapfds = malloc(de

Re: [dpdk-dev] [PATCH v4] ethdev: replace bus specific struct with generic dev

2018-04-05 Thread Ferruh Yigit
On 4/4/2018 6:57 PM, De Lara Guarch, Pablo wrote:
> 
> 
>> -Original Message-
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ferruh Yigit
>> Sent: Tuesday, April 3, 2018 10:50 AM
>> To: David Marchand ; santosh
>> 
>> Cc: dev@dpdk.org; Shreyansh Jain ; Legacy, Allain
>> (Wind River) ; Tomasz Duszynski
>> ; Thomas Monjalon 
>> Subject: Re: [dpdk-dev] [PATCH v4] ethdev: replace bus specific struct with
>> generic dev
>>
>> On 4/3/2018 10:06 AM, David Marchand wrote:
>>> On Mon, Apr 2, 2018 at 6:13 PM, santosh
>>>  wrote:
>>>> On Friday 30 March 2018 08:59 PM, David Marchand wrote:
>>>>> I can see we enforce the driver name by putting it after the call to
>>>>> .dev_infos_get.
>>>>> http://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.c#n2399
>>>>>
>>>>> octeontx pmd seems to try to do something about it:
>>>>> http://dpdk.org/browse/dpdk/tree/drivers/net/octeontx/octeontx_ethdev.c#n622
>>>>>
>>>>> Not sure it does something, might be a thing to cleanup.
>>>>>
>>>>>
>>>> In case you're referring to the driver_name update, then indeed it's a
>>>> cleanup [1].
>>>>
>>>> Otherwise, I don't see any issue with v4, or maybe I misunderstood
>>>> your comment.
>>>
>>> I agree there is no fundamental issue.
>>>
>>> dev_info->device = dev->device;
>>>
>>> RTE_FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
>>> (*dev->dev_ops->dev_infos_get)(dev, dev_info);
>>> dev_info->driver_name = dev->device->driver->name;
>>>
>>> If somebody (I mean some pmd out there) has a usecase with
>>> dev_info->device != dev->device, why not.
>>
>> Intentionally let drivers update this variable, although I also don't see
>> any use case for it.
>>
>> This variable was set by PMDs before this patch, so I don't see any reason
>> to be so strict here.
>>
>> If the driver does nothing, ethdev will set dev_info->device for it; if it
>> wants to overwrite it, for any reason, it will have the capability.
> 
> Looks good to me. Will do the same for cryptodev and bbdev.
> The only thing that I am missing here is an update in documentation,
> adding the ABI Change in release notes.

Right, I forgot about it, will send a new version.

Thanks,
ferruh

> 
> Apart from it:
> 
> Acked-by: Pablo de Lara 
> 
>>
>>>
>>> Thomas ?
>>>
>>>
> 



Re: [dpdk-dev] [PATCH v5] net/virtio-user: add support for server mode

2018-04-05 Thread Yang, Zhiyong
Ping Maxime, Jianfeng

Do you have any comments about the patch?

Thanks
Zhiyong

> -Original Message-
> From: Yang, Zhiyong
> Sent: Thursday, April 5, 2018 1:18 AM
> To: dev@dpdk.org
> Cc: maxime.coque...@redhat.com; tho...@monjalon.net; Tan, Jianfeng
> ; Wang, Zhihong ; Bie,
> Tiwei ; Wang, Dong1 ; Yang,
> Zhiyong 
> Subject: [PATCH v5] net/virtio-user: add support for server mode
> 
> In a container environment if the vhost-user backend restarts, there's no
> way for it to reconnect to virtio-user. To address this, support for server
> mode is added. In this mode the socket file is created by virtio-user, which
> the backend then connects to. This means that if the backend restarts, it can
> reconnect to virtio-user and continue communications.
> 
> With the current implementation, LSC is enabled on the virtio-user side so
> that incoming connections can be accepted.
> 
> Release note is updated in this patch.
> 
> Signed-off-by: Zhiyong Yang 
> ---


Re: [dpdk-dev] Question on documentation / Mellanox ConnectX-3

2018-04-05 Thread Adrien Mazarguil
On Tue, Apr 03, 2018 at 02:59:38PM -0300, Marcelo Ricardo Leitner wrote:
> Hi,
> 
> http://docs.openvswitch.org/en/latest/howto/dpdk/ says:
> 
> Some NICs (i.e. Mellanox ConnectX-3) have only one PCI address
> associated with multiple ports. Using a PCI device like above won’t
> work. Instead, below usage is suggested:
> 
> $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
> options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55:01"
> $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
> options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55:02"
> 
> But these MACs are 7 bytes long. Seems the idea was to mention the two
> incremental MAC addresses that the ports have, and thus the ':55'
> should have been removed from there, right?
> 
> Reading the code, it doesn't seem prepared to handle the extra byte in
> any (special) way.

After a quick glance at the original patch [1], I confirm it looks like a
mistake in the OVS documentation. MAC addresses should be 6 bytes long, the
7th byte is not a workaround to identify a physical port like I initially
thought.

As you pointed out, since default MAC addresses on mlx4 ports are normally
incremental, documentation should read something like:

 $ ovs-vsctl add-port br0 dpdk-p0 -- set Interface dpdk-p0 type=dpdk \
 options:dpdk-devargs="class=eth,mac=00:11:22:33:44:55"
 $ ovs-vsctl add-port br0 dpdk-p1 -- set Interface dpdk-p1 type=dpdk \
 options:dpdk-devargs="class=eth,mac=00:11:22:33:44:56"
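
As a side note, a strict check like the following would have rejected the
7-byte strings from the documentation. This is only an illustrative sketch
with hypothetical names (hex_digit, parse_mac); it is not the parser used by
DPDK or OVS.

```c
#include <stdint.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse exactly six colon-separated two-digit hex bytes.
 * Returns 0 on success, -1 on any other input (mac may then be
 * partially written).
 */
static int
parse_mac(const char *s, uint8_t mac[6])
{
	int i;

	for (i = 0; i < 6; i++) {
		int hi = hex_digit(s[0]);
		int lo = hi < 0 ? -1 : hex_digit(s[1]);

		if (lo < 0)
			return -1;
		mac[i] = (uint8_t)((hi << 4) | lo);
		s += 2;
		if (i < 5) {
			if (*s != ':')
				return -1;
			s++;
		}
	}
	return *s == '\0' ? 0 : -1;	/* trailing data, e.g. a 7th byte */
}
```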

[1] https://github.com/openvswitch/ovs/commit/5e7588186839

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH v5 0/4] add ifcvf vdpa driver

2018-04-05 Thread Xiao Wang
This patch set has dependency on http://dpdk.org/dev/patchwork/patch/36772/
(vhost: support selective datapath).

IFCVF driver
============
The IFCVF vDPA (vhost data path acceleration) driver provides support for the
Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
works as a HW vhost backend which can send/receive packets to/from virtio
directly by DMA. Besides, it supports dirty page logging and device state
report/restore. This driver enables its vDPA functionality with live migration
feature.

vDPA mode
=========
IFCVF's vendor ID and device ID are the same as those of the virtio net pci
device, but it has its own subsystem vendor ID and device ID. To let the device
be probed by the IFCVF driver, add the "vdpa=1" parameter to specify that this
device is to be used in vDPA mode rather than polling mode; the virtio pmd
will skip the device when it detects this parameter.

Container per device
====================
vDPA needs to create different containers for different devices, thus this
patch set adds some APIs in eal/vfio to support multiple containers, e.g.
- rte_vfio_create_container
- rte_vfio_destroy_container
- rte_vfio_bind_group
- rte_vfio_unbind_group

By this extension, a device can be put into a new specific container, rather
than the previous default container.
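
A rough usage sketch of the calls listed above follows. The prototypes are
taken from patch 1/4, but the function bodies are placeholder stubs with
made-up return values so that the example builds without DPDK; the helper
names vf_attach/vf_detach and the call order are assumptions of this sketch.

```c
/*
 * Prototypes as declared in patch 1/4 (rte_vfio.h), without the
 * __rte_experimental tag. The definitions below are placeholder stubs
 * for this sketch, NOT the EAL implementation.
 */
int rte_vfio_create_container(void);
int rte_vfio_destroy_container(int container_fd);
int rte_vfio_bind_group(int container_fd, int iommu_group_no);
int rte_vfio_unbind_group(int container_fd, int iommu_group_no);

int rte_vfio_create_container(void) { return 100; }
int rte_vfio_destroy_container(int container_fd) { (void)container_fd; return 0; }
int rte_vfio_bind_group(int container_fd, int iommu_group_no)
{
	(void)container_fd;
	return iommu_group_no < 0 ? -1 : 200 + iommu_group_no;
}
int rte_vfio_unbind_group(int container_fd, int iommu_group_no)
{
	(void)container_fd;
	(void)iommu_group_no;
	return 0;
}

/* Put one IOMMU group into its own new container. */
static int
vf_attach(int iommu_group_no, int *container_fd, int *group_fd)
{
	int cfd, gfd;

	cfd = rte_vfio_create_container();
	if (cfd < 0)
		return -1;

	/* per the v4 change log, bind returns the group fd */
	gfd = rte_vfio_bind_group(cfd, iommu_group_no);
	if (gfd < 0) {
		rte_vfio_destroy_container(cfd);
		return -1;
	}
	*container_fd = cfd;
	*group_fd = gfd;
	return 0;
}

/* Undo vf_attach(). */
static void
vf_detach(int container_fd, int iommu_group_no)
{
	rte_vfio_unbind_group(container_fd, iommu_group_no);
	rte_vfio_destroy_container(container_fd);
}
```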

IFCVF vDPA details
==================
Key vDPA driver ops implemented:
- ifcvf_dev_config:
  Enable VF data path with virtio information provided by vhost lib, including
  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to
  route HW interrupt to virtio driver, create notify relay thread to translate
  virtio driver's kick to a MMIO write onto HW, HW queues configuration.

  This function gets called to set up HW data path backend when virtio driver
  in VM gets ready.

- ifcvf_dev_close:
  Revoke all the setup in ifcvf_dev_config.

  This function gets called when virtio driver stops device in VM.

Change log
==========
v5:
- Fix compilation in BSD, remove the rte_vfio.h including in BSD.

v4:
- Rebase on Zhihong's latest vDPA lib patch, with vDPA ops names change.
- Remove API "rte_vfio_get_group_fd", "rte_vfio_bind_group" will return the fd.
- Align the vfio_cfg search internal APIs naming.

v3:
- Add doc and release note for the new driver.
- Remove the vdev concept, make the driver as a PCI driver, it will get probed
  by PCI bus driver.
- Rebase on the v4 vDPA lib patch, register a vDPA device instead of a engine.
- Remove the PCI API exposure accordingly.
- Move the MAX_VFIO_CONTAINERS definition to config file.
- Let virtio pmd skips when a virtio device needs to work in vDPA mode.

v2:
- Rename function pci_get_kernel_driver_by_path to rte_pci_device_kdriver_name
  to make the API generic cross Linux and BSD, make it as EXPERIMENTAL.
- Rebase on Zhihong's vDPA v3 patch set.
- Minor code cleanup on vfio extension.


Xiao Wang (4):
  eal/vfio: add multiple container support
  net/virtio: skip device probe in vdpa mode
  net/ifcvf: add ifcvf vdpa driver
  doc: add ifcvf driver document and release note

 config/common_base   |   8 +
 config/common_linuxapp   |   1 +
 doc/guides/nics/features/ifcvf.ini   |   8 +
 doc/guides/nics/ifcvf.rst|  85 
 doc/guides/nics/index.rst|   1 +
 doc/guides/rel_notes/release_18_05.rst   |   9 +
 drivers/net/Makefile |   3 +
 drivers/net/ifc/Makefile |  36 ++
 drivers/net/ifc/base/ifcvf.c | 329 
 drivers/net/ifc/base/ifcvf.h | 160 ++
 drivers/net/ifc/base/ifcvf_osdep.h   |  52 ++
 drivers/net/ifc/ifcvf_vdpa.c | 840 +++
 drivers/net/ifc/rte_ifcvf_version.map|   4 +
 drivers/net/virtio/virtio_ethdev.c   |  43 ++
 lib/librte_eal/bsdapp/eal/eal.c  |  50 ++
 lib/librte_eal/common/include/rte_vfio.h | 113 +
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 522 +++
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |   1 +
 lib/librte_eal/rte_eal_version.map   |   6 +
 mk/rte.app.mk|   3 +
 20 files changed, 2182 insertions(+), 92 deletions(-)
 create mode 100644 doc/guides/nics/features/ifcvf.ini
 create mode 100644 doc/guides/nics/ifcvf.rst
 create mode 100644 drivers/net/ifc/Makefile
 create mode 100644 drivers/net/ifc/base/ifcvf.c
 create mode 100644 drivers/net/ifc/base/ifcvf.h
 create mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
 create mode 100644 drivers/net/ifc/ifcvf_vdpa.c
 create mode 100644 drivers/net/ifc/rte_ifcvf_version.map

-- 
2.15.1



[dpdk-dev] [PATCH v5 1/4] eal/vfio: add multiple container support

2018-04-05 Thread Xiao Wang
Currently eal vfio framework binds vfio group fd to the default
container fd during rte_vfio_setup_device, while in some cases,
e.g. vDPA (vhost data path acceleration), we want to put vfio group
to a separate container and program IOMMU via this container.

This patch adds some APIs to support container creating and device
binding with a container.

A driver could use "rte_vfio_create_container" helper to create a
new container from eal, use "rte_vfio_bind_group" to bind a device
to the newly created container.

During rte_vfio_setup_device, the container bound with the device
will be used for IOMMU setup.

Signed-off-by: Junjie Chen 
Signed-off-by: Xiao Wang 
Reviewed-by: Maxime Coquelin 
---
 config/common_base   |   1 +
 lib/librte_eal/bsdapp/eal/eal.c  |  50 +++
 lib/librte_eal/common/include/rte_vfio.h | 113 +++
 lib/librte_eal/linuxapp/eal/eal_vfio.c   | 522 +--
 lib/librte_eal/linuxapp/eal/eal_vfio.h   |   1 +
 lib/librte_eal/rte_eal_version.map   |   6 +
 6 files changed, 601 insertions(+), 92 deletions(-)

diff --git a/config/common_base b/config/common_base
index 7abf7c6fc..2c40b2603 100644
--- a/config/common_base
+++ b/config/common_base
@@ -74,6 +74,7 @@ CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
 CONFIG_RTE_EAL_IGB_UIO=n
 CONFIG_RTE_EAL_VFIO=n
 CONFIG_RTE_MAX_VFIO_GROUPS=64
+CONFIG_RTE_MAX_VFIO_CONTAINERS=64
 CONFIG_RTE_MALLOC_DEBUG=n
 CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n
 
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 4eafcb5ad..0a3d8783d 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -746,6 +746,14 @@ int rte_vfio_enable(const char *modname);
 int rte_vfio_is_enabled(const char *modname);
 int rte_vfio_noiommu_is_enabled(void);
 int rte_vfio_clear_group(int vfio_group_fd);
+int rte_vfio_create_container(void);
+int rte_vfio_destroy_container(int container_fd);
+int rte_vfio_bind_group(int container_fd, int iommu_group_no);
+int rte_vfio_unbind_group(int container_fd, int iommu_group_no);
+int rte_vfio_dma_map(int container_fd, int dma_type,
+   const struct rte_memseg *ms);
+int rte_vfio_dma_unmap(int container_fd, int dma_type,
+   const struct rte_memseg *ms);
 
 int rte_vfio_setup_device(__rte_unused const char *sysfs_base,
  __rte_unused const char *dev_addr,
@@ -781,3 +789,45 @@ int rte_vfio_clear_group(__rte_unused int vfio_group_fd)
 {
return 0;
 }
+
+int __rte_experimental
+rte_vfio_create_container(void)
+{
+   return -1;
+}
+
+int __rte_experimental
+rte_vfio_destroy_container(__rte_unused int container_fd)
+{
+   return -1;
+}
+
+int __rte_experimental
+rte_vfio_bind_group(__rte_unused int container_fd,
+   __rte_unused int iommu_group_no)
+{
+   return -1;
+}
+
+int __rte_experimental
+rte_vfio_unbind_group(__rte_unused int container_fd,
+   __rte_unused int iommu_group_no)
+{
+   return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_map(__rte_unused int container_fd,
+   __rte_unused int dma_type,
+   __rte_unused const struct rte_memseg *ms)
+{
+   return -1;
+}
+
+int __rte_experimental
+rte_vfio_dma_unmap(__rte_unused int container_fd,
+   __rte_unused int dma_type,
+   __rte_unused const struct rte_memseg *ms)
+{
+   return -1;
+}
diff --git a/lib/librte_eal/common/include/rte_vfio.h 
b/lib/librte_eal/common/include/rte_vfio.h
index 249095e46..9bb026703 100644
--- a/lib/librte_eal/common/include/rte_vfio.h
+++ b/lib/librte_eal/common/include/rte_vfio.h
@@ -32,6 +32,8 @@
 extern "C" {
 #endif
 
+struct rte_memseg;
+
 /**
  * Setup vfio_cfg for the device identified by its address.
  * It discovers the configured I/O MMU groups or sets a new one for the device.
@@ -131,6 +133,117 @@ rte_vfio_clear_group(int vfio_group_fd);
 }
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Create a new container for device binding.
+ *
+ * @return
+ *   the container fd if successful
+ *   <0 if failed
+ */
+int __rte_experimental
+rte_vfio_create_container(void);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Destroy the container, unbind all vfio groups within it.
+ *
+ * @param container_fd
+ *   the container fd to destroy
+ *
+ * @return
+ *0 if successful
+ *   <0 if failed
+ */
+int __rte_experimental
+rte_vfio_destroy_container(int container_fd);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Bind a IOMMU group to a container.
+ *
+ * @param container_fd
+ *   the container's fd
+ *
+ * @param iommu_group_no
+ *   the iommu_group_no to bind to container
+ *
+ * @return
+ *   group fd if successful
+ *   <0 if failed
+ */
+int __rte_experimental
+rte_vfio_bind_group(int container_fd, int iommu_group_no);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, 

[dpdk-dev] [PATCH v5 4/4] doc: add ifcvf driver document and release note

2018-04-05 Thread Xiao Wang
Signed-off-by: Xiao Wang 
Reviewed-by: Maxime Coquelin 
---
 doc/guides/nics/features/ifcvf.ini |  8 
 doc/guides/nics/ifcvf.rst  | 85 ++
 doc/guides/nics/index.rst  |  1 +
 doc/guides/rel_notes/release_18_05.rst |  9 
 4 files changed, 103 insertions(+)
 create mode 100644 doc/guides/nics/features/ifcvf.ini
 create mode 100644 doc/guides/nics/ifcvf.rst

diff --git a/doc/guides/nics/features/ifcvf.ini 
b/doc/guides/nics/features/ifcvf.ini
new file mode 100644
index 0..ef1fc4711
--- /dev/null
+++ b/doc/guides/nics/features/ifcvf.ini
@@ -0,0 +1,8 @@
+;
+; Supported features of the 'ifcvf' vDPA driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+x86-32   = Y
+x86-64   = Y
diff --git a/doc/guides/nics/ifcvf.rst b/doc/guides/nics/ifcvf.rst
new file mode 100644
index 0..5d82bd25e
--- /dev/null
+++ b/doc/guides/nics/ifcvf.rst
@@ -0,0 +1,85 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright(c) 2018 Intel Corporation.
+
+IFCVF vDPA driver
+=================
+
+The IFCVF vDPA (vhost data path acceleration) driver provides support for the
+Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
+works as a HW vhost backend which can send/receive packets to/from virtio
+directly by DMA. Besides, it supports dirty page logging and device state
+report/restore. This driver enables its vDPA functionality with live migration
+feature.
+
+
+IFCVF vDPA Implementation
+-------------------------
+
+IFCVF's vendor ID and device ID are the same as those of the virtio net pci
+device, but it has its own subsystem vendor ID and device ID. To let the device
+be probed by the IFCVF driver, add the "vdpa=1" parameter to specify that this
+device is to be used in vDPA mode rather than polling mode; the virtio pmd
+will skip the device when it detects this parameter.
+
+Different VF devices serve different virtio frontends which are in different
+VMs, so each VF needs to have its own DMA address translation service. During
+the driver probe a new container is created for this device, with this
+container vDPA driver can program DMA remapping table with the VM's memory
+region information.
+
+Key IFCVF vDPA driver ops
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- ifcvf_dev_config:
+  Enable VF data path with virtio information provided by vhost lib, including
+  IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to
+  route HW interrupt to virtio driver, create notify relay thread to translate
+  virtio driver's kick to a MMIO write onto HW, HW queues configuration.
+
+  This function gets called to set up HW data path backend when virtio driver
+  in VM gets ready.
+
+- ifcvf_dev_close:
+  Revoke all the setup in ifcvf_dev_config.
+
+  This function gets called when virtio driver stops device in VM.
+
+To create a vhost port with IFC VF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- Create a vhost socket and assign a VF's device ID to this socket via
+  vhost API. When QEMU vhost connection gets ready, the assigned VF will
+  get configured automatically.
+
+
+Features
+--------
+
+Features of the IFCVF driver are:
+
+- Compatibility with virtio 0.95 and 1.0.
+- Live migration.
+
+
+Prerequisites
+-------------
+
+- Platform with IOMMU feature. IFC VF needs address translation service to
+  Rx/Tx directly with virtio driver in VM.
+
+
+Limitations
+-----------
+
+Dependency on vfio-pci
+~~~~~~~~~~~~~~~~~~~~~~
+
+vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt vector
+is mapped to a callfd associated with a virtio ring. Currently only vfio-pci
+allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci.
+
+Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE
+~~~
+
+IFC VF doesn't support RARP packet generation, virtio frontend supporting
+VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that.
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 51c453d9c..a294ab389 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -44,6 +44,7 @@ Network Interface Controller Drivers
 vmxnet3
 pcap_ring
 fail_safe
+ifcvf
 
 **Figures**
 
diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index 9cc77f893..c3d996fdc 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -58,6 +58,15 @@ New Features
   * Added support for NVGRE, VXLAN and GENEVE filters in flow API.
   * Added support for DROP action in flow API.
 
+* **Added IFCVF vDPA driver.**
+
+  Added IFCVF vDPA driver to support Intel FPGA 100G VF device. IFCVF works
+  as a HW vhost data path accelerator, it supports live migration and is
+  compatible with virtio 0.95 and 1.0. This driver registers ifcvf vDPA driver
+  to vhost lib, when virtio connected, with the help of the registered vDPA
+  driver the assigned VF gets 

[dpdk-dev] [PATCH v5 2/4] net/virtio: skip device probe in vdpa mode

2018-04-05 Thread Xiao Wang
If we want a virtio device to work in vDPA (vhost data path acceleration)
mode, we could add a "vdpa=1" devarg for this device to specify the mode.

This patch lets the virtio PMD skip device probe when it detects this parameter.

Signed-off-by: Xiao Wang 
Reviewed-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_ethdev.c | 43 ++
 1 file changed, 43 insertions(+)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 2ef213d1a..afb096804 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "virtio_ethdev.h"
 #include "virtio_pci.h"
@@ -1708,9 +1709,51 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
return 0;
 }
 
+static int vdpa_check_handler(__rte_unused const char *key,
+   const char *value, __rte_unused void *opaque)
+{
+   if (strcmp(value, "1"))
+   return -1;
+
+   return 0;
+}
+
+static int
+vdpa_mode_selected(struct rte_devargs *devargs)
+{
+   struct rte_kvargs *kvlist;
+   const char *key = "vdpa";
+   int ret = 0;
+
+   if (devargs == NULL)
+   return 0;
+
+   kvlist = rte_kvargs_parse(devargs->args, NULL);
+   if (kvlist == NULL)
+   return 0;
+
+   if (!rte_kvargs_count(kvlist, key))
+   goto exit;
+
+   /* vdpa mode selected when there's a key-value pair: vdpa=1 */
+   if (rte_kvargs_process(kvlist, key,
+   vdpa_check_handler, NULL) < 0) {
+   goto exit;
+   }
+   ret = 1;
+
+exit:
+   rte_kvargs_free(kvlist);
+   return ret;
+}
+
 static int eth_virtio_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
struct rte_pci_device *pci_dev)
 {
+   /* virtio pmd skips probe if device needs to work in vdpa mode */
+   if (vdpa_mode_selected(pci_dev->device.devargs))
+   return 1;
+
return rte_eth_dev_pci_generic_probe(pci_dev, sizeof(struct virtio_hw),
eth_virtio_dev_init);
 }
-- 
2.15.1
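
For readers without a DPDK tree at hand, the devarg check in the patch above
can be approximated in plain C. This is only a sketch under assumptions: it
does not use rte_kvargs, the helper name devargs_select_vdpa() is hypothetical,
and it treats the devargs as a simple comma-separated list of key=value pairs.
Like the patch, it accepts vDPA mode only when every "vdpa" key has the exact
value "1".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for vdpa_mode_selected(): scan a comma-separated
 * "key=value" list and return 1 only if a "vdpa" key is present and every
 * "vdpa" key carries the exact value "1". Returns 0 otherwise.
 */
static int
devargs_select_vdpa(const char *args)
{
	static const char key[] = "vdpa=";
	const size_t key_len = sizeof(key) - 1;
	int selected = 0;

	while (args != NULL && *args != '\0') {
		const char *end = strchr(args, ',');
		size_t len = end ? (size_t)(end - args) : strlen(args);

		if (len >= key_len && strncmp(args, key, key_len) == 0) {
			/* Any value other than exactly "1" rejects vDPA mode. */
			if (len != key_len + 1 || args[key_len] != '1')
				return 0;
			selected = 1;
		}
		/* Move to the next pair, or stop after the last one. */
		args = end ? end + 1 : NULL;
	}
	return selected;
}
```

With this helper, "foo=2,vdpa=1" selects vDPA mode, while "vdpa=0", "vdpa=10"
and an empty or NULL string do not.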



[dpdk-dev] [PATCH v5 3/4] net/ifcvf: add ifcvf vdpa driver

2018-04-05 Thread Xiao Wang
The IFCVF vDPA (vhost data path acceleration) driver provides support for
the Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible;
it works as a HW vhost backend which can send/receive packets to/from
virtio directly by DMA.

Different VF devices serve different virtio frontends which are in
different VMs, so each VF needs to have its own DMA address translation
service. During driver probe a new container is created; with this
container the vDPA driver can program the DMA remapping table with the VM's
memory region information.

Key vDPA driver ops implemented:

- ifcvf_dev_config:
  Enable VF data path with virtio information provided by vhost lib. This
  includes IOMMU programming to enable VF DMA to VM's memory, VFIO
  interrupt setup to route HW interrupts to the virtio driver, creation of
  a notify relay thread to translate the virtio driver's kick into an MMIO
  write onto HW, and HW queue configuration.

- ifcvf_dev_close:
  Revoke all the setup in ifcvf_dev_config.

Live migration feature is supported by IFCVF and this driver enables
it. For dirty page logging, the VF logs packet buffer writes and the
driver marks the used ring as dirty when the device stops.
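
As an illustration of what marking a region as dirty involves, here is a
minimal sketch of a dirty-page bitmap. It is not taken from the driver: it
assumes a 4 KiB logging granularity with one bit per page (the layout used by
the vhost-user dirty log), and the names LOG_PAGE_SIZE and log_mark_dirty()
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define LOG_PAGE_SIZE 4096u /* assumed logging granularity */

/*
 * Set the dirty bit of every page touched by [addr, addr + len).
 * Pages that fall outside the log (log_size bytes) are ignored.
 */
static void
log_mark_dirty(uint8_t *log, size_t log_size, uint64_t addr, uint64_t len)
{
	uint64_t page, last;

	if (len == 0)
		return;

	page = addr / LOG_PAGE_SIZE;
	last = (addr + len - 1) / LOG_PAGE_SIZE;
	for (; page <= last; page++) {
		if (page / 8 >= log_size)
			break; /* range runs past the end of the log */
		log[page / 8] |= (uint8_t)(1u << (page % 8));
	}
}
```

When the device stops, a driver following this scheme would call such a helper
once for the guest address range covered by each used ring.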

Because vDPA driver needs to set up MSI-X vector to interrupt the
guest, only vfio-pci is supported currently.

Signed-off-by: Xiao Wang 
Signed-off-by: Rosen Xu 
Reviewed-by: Maxime Coquelin 
---
 config/common_base|   7 +
 config/common_linuxapp|   1 +
 drivers/net/Makefile  |   3 +
 drivers/net/ifc/Makefile  |  36 ++
 drivers/net/ifc/base/ifcvf.c  | 329 +
 drivers/net/ifc/base/ifcvf.h  | 160 +++
 drivers/net/ifc/base/ifcvf_osdep.h|  52 +++
 drivers/net/ifc/ifcvf_vdpa.c  | 840 ++
 drivers/net/ifc/rte_ifcvf_version.map |   4 +
 mk/rte.app.mk |   3 +
 10 files changed, 1435 insertions(+)
 create mode 100644 drivers/net/ifc/Makefile
 create mode 100644 drivers/net/ifc/base/ifcvf.c
 create mode 100644 drivers/net/ifc/base/ifcvf.h
 create mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
 create mode 100644 drivers/net/ifc/ifcvf_vdpa.c
 create mode 100644 drivers/net/ifc/rte_ifcvf_version.map

diff --git a/config/common_base b/config/common_base
index 2c40b2603..5d4f9e75c 100644
--- a/config/common_base
+++ b/config/common_base
@@ -796,6 +796,13 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
 #
 CONFIG_RTE_LIBRTE_PMD_VHOST=n
 
+#
+# Compile IFCVF driver
+# To compile, CONFIG_RTE_LIBRTE_VHOST and CONFIG_RTE_EAL_VFIO
+# should be enabled.
+#
+CONFIG_RTE_LIBRTE_IFCVF_VDPA=n
+
 #
 # Compile the test application
 #
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d0437e5d6..e88e20f02 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -15,6 +15,7 @@ CONFIG_RTE_LIBRTE_PMD_KNI=y
 CONFIG_RTE_LIBRTE_VHOST=y
 CONFIG_RTE_LIBRTE_VHOST_NUMA=y
 CONFIG_RTE_LIBRTE_PMD_VHOST=y
+CONFIG_RTE_LIBRTE_IFCVF_VDPA=y
 CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
 CONFIG_RTE_LIBRTE_PMD_TAP=y
 CONFIG_RTE_LIBRTE_AVP_PMD=y
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 37ca19aa7..3fa51cca3 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -57,6 +57,9 @@ endif # $(CONFIG_RTE_LIBRTE_SCHED)
 
 ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost
+ifeq ($(CONFIG_RTE_EAL_VFIO),y)
+DIRS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA) += ifc
+endif
 endif # $(CONFIG_RTE_LIBRTE_VHOST)
 
 ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y)
diff --git a/drivers/net/ifc/Makefile b/drivers/net/ifc/Makefile
new file mode 100644
index 0..f08fcaad8
--- /dev/null
+++ b/drivers/net/ifc/Makefile
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_ifcvf_vdpa.a
+
+LDLIBS += -lpthread
+LDLIBS += -lrte_eal -lrte_pci -lrte_vhost -lrte_bus_pci
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal
+
+#
+# Add extra flags for base driver source files to disable warnings in them
+#
+BASE_DRIVER_OBJS=$(sort $(patsubst %.c,%.o,$(notdir $(wildcard $(SRCDIR)/base/*.c))))
+
+VPATH += $(SRCDIR)/base
+
+EXPORT_MAP := rte_ifcvf_version.map
+
+LIBABIVER := 1
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA) += ifcvf_vdpa.c
+SRCS-$(CONFIG_RTE_LIBRTE_IFCVF_VDPA) += ifcvf.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifc/base/ifcvf.c
new file mode 100644
index 0..d312ad99f
--- /dev/null
+++ b/drivers/net/ifc/base/ifcvf.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include "ifcvf.h"
+#include "ifcvf_osdep.h"
+
+STATIC void *
+get_cap_addr(struct ifcvf_hw *hw, struct ifcvf_pci_cap *cap)
+{
+   u8 bar = cap->bar;
+   u32 length = cap->length;
+   u32 off

Re: [dpdk-dev] [PATCH v2] pdump: change to use generic multi-process channel

2018-04-05 Thread Pattan, Reshma
Hi

> 
> Signed-off-by: Jianfeng Tan 
> ---
> Note this patch needs this patch set:
> http://dpdk.org/dev/patchwork/patch/36814/
> v2:
>   - Update doc for deprecation of API, rte_pdump_set_socket_dir,
> and API change for rte_pdump_init.
>   - Add notice for known incompatibility issue in doc.
>  app/pdump/main.c   |   6 +-
>  doc/guides/rel_notes/deprecation.rst   |   4 +
>  doc/guides/rel_notes/release_18_05.rst |   7 +
>  lib/librte_pdump/Makefile  |   3 +-
>  lib/librte_pdump/rte_pdump.c   | 423 +
>  lib/librte_pdump/rte_pdump.h   |   1 +
>  6 files changed, 84 insertions(+), 360 deletions(-)

>diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
> +
> 


> +
> + /* recv client requests */
> + if (mp_msg->len_param != sizeof(*cli_req)) {
> + RTE_LOG(ERR, PDUMP, "failed to recv from client\n");
> + resp->err_value = EINVAL;

resp->err_value = -EINVAL

> - /* save the socket in local configuration */
> - pdump_socket_fd = socket_fd;
> + snprintf(mp_resp.name, RTE_MP_MAX_NAME_LEN, PDUMP_MP);
> + mp_resp.len_param = sizeof(*resp);
> + mp_resp.num_fds = 0;
> + if (rte_mp_reply(&mp_resp, peer) < 0)
> + RTE_LOG(ERR, PDUMP, "failed to send to client:%s, %s:%d\n",
> + strerror(rte_errno), __func__, __LINE__);
> 

If sending the reply fails, shouldn't we return -1?

>   return 0;
>  }
> 

>  int
> -rte_pdump_set_socket_dir(const char *path, enum rte_pdump_socktype
> type)
> +rte_pdump_set_socket_dir(const char *path __rte_unused,
> +  enum rte_pdump_socktype type __rte_unused)
>  {

What about enum rte_pdump_socktype in the header file? When should it be deleted?

Do we also need to update the doxygen comments in the header file for
rte_pdump_init() and rte_pdump_uninit()? What do you think?

Thanks,
Reshma


Re: [dpdk-dev] [PATCH v6 0/8] vhost: introduce vhost crypto backend

2018-04-05 Thread Maxime Coquelin



On 04/05/2018 10:26 AM, Maxime Coquelin wrote:



On 04/04/2018 04:24 PM, Fan Zhang wrote:

This patchset adds crypto backend support to the vhost library,
including a proof-of-concept sample application. The implementation
follows the virtio-crypto specification and has been tested
with qemu 2.11.50 (with several patches applied, detailed later)
with Fedora 24 running in the frontend.

The vhost_crypto library acts as a "bridge" that translates
virtio-crypto requests to DPDK crypto operations, so it
is a purely software implementation. However, it does require the user
to provide the DPDK Cryptodev ID so it knows how to handle the
virtio-crypto session creation and deletion messages.

Currently the implementation supports AES-CBC-128 and HMAC-SHA1
cipher only/chaining modes and does not support sessionless mode
yet. The guest can use the standard virtio-crypto driver to set up
a session and send encryption/decryption requests to the backend. The
vhost-crypto sample application provided in this patchset will
do the actual crypto work.

The following steps are involved to enable vhost-crypto support.

In the host:
1. Download the qemu source code.

2. Recompile your qemu with vhost-crypto option enabled.

3. Apply this patchset to latest DPDK code and recompile DPDK.

4. Compile and run vhost-crypto sample application.

./examples/vhost_crypto/build/vhost-crypto -l 11,12 -w :86:01.0 \
  --socket-mem 2048,2048

Where :86:01.0 is the QAT PCI address. You may use AES-NI-MB if it is
not available. The sample application requires 2 lcores: 1 master and 1
worker. The application will create a UNIX socket file
/tmp/vhost_crypto1.socket.

5. Start your qemu application. Here is my command:

qemu/x86_64-softmmu/qemu-system-x86_64 -machine accel=kvm -cpu host \
-smp 2 -m 1G -hda ~/path-to-your/image.qcow \
-object memory-backend-file,id=mem,size=1G,mem-path=/dev/hugepages,share=on \
-mem-prealloc -numa node,memdev=mem -chardev \
socket,id=charcrypto0,path=/tmp/vhost_crypto1.socket \
-object cryptodev-vhost-user,id=cryptodev0,chardev=charcrypto0 \
-device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0

6. Once the guest is booted, the Linux virtio_crypto kernel module is loaded
by default. You should see the following logs in your dmesg:

[   17.611044] virtio_crypto: loading out-of-tree module taints kernel.
[   17.611083] virtio_crypto: module verification failed: signature and/or ...
[   17.611723] virtio_crypto virtio0: max_queues: 1, max_cipher_key_len: ...
[   17.612156] virtio_crypto virtio0: will run requests pump with realtime ...
[   18.376100] virtio_crypto virtio0: Accelerator is ready

The virtio_crypto driver in the guest is now up and running.

7. The remaining steps are the same as in the Testing section of
https://wiki.qemu.org/Features/VirtioCrypto

8. It is possible to use DPDK Virtio Crypto PMD
(https://dpdk.org/dev/patchwork/patch/36921/) in the guest to work with
this patchset to achieve optimal performance.

v6:
- Changed commit message
- removed rte prefix in handler prototype

v5:
- removed external ops register API.
- patch cleaned.

v4:
- Changed external vhost backend ops register API.
- Fixed a bug.

v3:
- Changed external vhost backend private data and message handling
- Added experimental tag to rte_vhost_crypto_set_zero_copy()

v2:
- Moved vhost_crypto_data_req data from crypto op to source mbuf.
- Removed ZERO-COPY flag from config option and made it changeable at run time.
- Guest-polling mode possible.
- Simplified vring descriptor access procedure.
- Work with both LKCF and DPDK Virtio-Crypto PMD guest drivers.

Fan Zhang (8):
   lib/librte_vhost: add vhost user message handlers
   lib/librte_vhost: add virtio-crypto user message structure
   lib/librte_vhost: add session message handler
   lib/librte_vhost: add request handler
   lib/librte_vhost: add public function implementation
   lib/librte_vhost: update makefile
   examples/vhost_crypto: add vhost crypto sample application
   doc: update for vhost crypto support

  doc/guides/prog_guide/vhost_lib.rst   |   25 +
  doc/guides/rel_notes/release_18_05.rst    |    5 +
  doc/guides/sample_app_ug/index.rst    |    1 +
  doc/guides/sample_app_ug/vhost_crypto.rst |   82 ++
  examples/vhost_crypto/Makefile    |   32 +
  examples/vhost_crypto/main.c  |  541 
  examples/vhost_crypto/meson.build |   14 +
  lib/librte_vhost/Makefile |    6 +-
  lib/librte_vhost/meson.build  |    8 +-
  lib/librte_vhost/rte_vhost_crypto.h   |  109 +++
  lib/librte_vhost/rte_vhost_version.map    |   11 +
  lib/librte_vhost/vhost.c  |    2 +-
  lib/librte_vhost/vhost.h  |   53 +-
  lib/librte_vhost/vhost_crypto.c   | 1312 +
  lib/librte_vhost/vhost_user.c |   33 +-
  lib/librte_vhost/vhost_user.h |   35 +-
  16 files changed, 2256 insertions(+), 13 deletions(-)
  create mode 

Re: [dpdk-dev] [PATCH] ethdev: Additions to rte_flows to support vTEP encap/decap offload

2018-04-05 Thread Thomas Monjalon
+Cc Adrien, please review

10/03/2018 01:25, Declan Doherty:
> This V1 patchset contains the revised proposal to manage virtual
> tunnel endpoint (vTEP) hardware acceleration, based on community
> feedback on the RFC
> (http://dpdk.org/ml/archives/dev/2017-December/084676.html). This
> proposal is enabled purely through rte_flow APIs, with the
> addition of some new features which were previously implemented
> by the rte_tep APIs proposed in the original
> RFC. This patchset ultimately aims to enable the configuration
> of inline data path encapsulation and decapsulation of tunnel
> endpoint network overlays on accelerated IO devices.
> 
> The summary of the additions to the rte_flow are as follows:
> 
> - Add new flow actions RTE_FLOW_ACTION_TYPE_VTEP_ENCAP and
> RTE_FLOW_ACTION_TYPE_VTEP_DECAP to rte_flow to support specification
> of encapsulation and decapsulation of virtual Tunnel Endpoint on
> hardware.
> 
> - Updates the matching pattern item definition
> description to specify that all actions which modify a packet
> must be specified in the explicit order they are to be executed.
> 
> - Introduces support for the use of pipeline metadata in
> the flow pattern definition and the population of metadata fields
> from flow actions.
> 
> - Adds group counters to enable statistics to be kept on groups of
> flows such as all ingress/egress flows of a vTEP
> 
> - Adds group_action to allow a flow's termination to be a group/table
> within the device.
> 
> A high level summary of the proposed usage model is as follows:
> 
> 1. Decapsulation
> 
> 1.1. Decapsulation of vTEP outer headers and forwarding of all traffic
>  to the same queue/s or port would have the following flow
>  parameters (pseudocode is used here).
> 
> struct rte_flow_attr attr = { .ingress = 1 };
> 
> struct rte_flow_item pattern[] = {
>   { .type = RTE_FLOW_ITEM_TYPE_ETH,  .spec = &eth_item },
>   { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_item },
>   { .type = RTE_FLOW_ITEM_TYPE_UDP, .spec = &udp_item },
>   { .type = RTE_FLOW_ITEM_TYPE_VxLAN, .spec = &vxlan_item },
>   { .type = RTE_FLOW_ITEM_TYPE_END }
> };
> 
> struct rte_flow_action actions[] = {
>   { .type = RTE_FLOW_ACTION_TYPE_VTEP_DECAP, .conf = VxLAN },
>   { .type = RTE_FLOW_ACTION_TYPE_VF, .conf = &vf_action  },
>   { .type = RTE_FLOW_ACTION_TYPE_END }
> 
> }
> 
> 1.2.
> Decapsulation of vTEP outer headers and matching on inner
> headers, and forwarding to the same queue/s or port.
> 
> 1.2.1.
> The same scenario as above but either the application
> or hardware requires configuration as 2 logically independent
> operations (viewing it as 2 logical tables). The first stage
> is the flow rule that defines the pattern to match the vTEP
> and the action to decapsulate the packet; the second stage
> table matches the inner header and defines the actions,
> forward to port etc.
> 
> flow rule for outer header on table 0
> 
> struct rte_flow_attr attr = { .ingress = 1, .table = 0 };
> 
> struct rte_flow_item pattern[] = {
>   { .type = RTE_FLOW_ITEM_TYPE_ETH,  .spec = &eth_item },
>   { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_item },
>   { .type = RTE_FLOW_ITEM_TYPE_UDP, .spec = &udp_item },
>   { .type = RTE_FLOW_ITEM_TYPE_VxLAN, .spec = &vxlan_item },
>   { .type = RTE_FLOW_ITEM_TYPE_END }
> };
> 
> struct rte_flow_action actions[] = {
>   { .type = RTE_FLOW_ACTION_TYPE_GROUP_COUNT, .conf = &vtep_counter },
>   { .type = RTE_FLOW_ACTION_TYPE_METADATA, .conf = &metadata_action },
>   { .type = RTE_FLOW_ACTION_TYPE_VTEP_DECAP, .conf = VxLAN },
>   { .type = RTE_FLOW_ACTION_TYPE_GROUP, .conf = &group_action = { .id = 1 } },
>   { .type = RTE_FLOW_ACTION_TYPE_END }
> }
> 
> flow rule for inner header on table 1
> 
> struct rte_flow_attr attr = { .ingress = 1, .table = 1 };
> 
> struct rte_flow_item pattern[] = {
>   { .type = RTE_FLOW_ITEM_TYPE_METADATA,  .spec = &metadata_item },
>   { .type = RTE_FLOW_ITEM_TYPE_ETH,  .spec = &eth_item },
>   { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ipv4_item },
>   { .type = RTE_FLOW_ITEM_TYPE_TCP, .spec = &tcp_item },
>   { .type = RTE_FLOW_ITEM_TYPE_END }
> };
> 
> struct rte_flow_action actions[] = {
>   { .type = RTE_FLOW_ACTION_TYPE_PORT, .conf = &port_action = { port_id } },
>   { .type = RTE_FLOW_ACTION_TYPE_END }
> }
> 
> Note that the metadata action in the flow rule in table 0 generates
> the metadata in the pipeline, which is then used as part of the flow
> pattern in table 1 to specify the exact flow to match against. In the
> case where exact match rules are being provided by the application,
> this metadata could be provided by the application in both rules.
> If there was wildcard matching happening at the first table then this
> metadata could be generated by hw, but this would require extension to
> the currently proposed API to allow specification of how the metadata
> should be generated
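
The two-table model quoted above can be pictured with a toy software
classifier. This is a sketch only and uses none of the proposed rte_flow
types: the structures, the classify() helper and the choice of VNI and inner
TCP destination port as match fields are all hypothetical. Stage 0 matches
the outer tunnel and attaches metadata; stage 1 matches that metadata together
with an inner header field.

```c
#include <stdint.h>

#define NO_MATCH (-1)

/* Stage 0 rule: match the outer tunnel ID and attach metadata (decap). */
struct outer_rule {
	uint32_t vni;
	uint32_t metadata;
};

/* Stage 1 rule: match metadata plus an inner field, pick an output port. */
struct inner_rule {
	uint32_t metadata;
	uint16_t inner_tcp_dport;
	int out_port;
};

/* Fields of a received packet that the two stages look at. */
struct packet {
	uint32_t vni;
	uint16_t inner_tcp_dport;
};

/* Run a packet through table 0 and then table 1; return the output port. */
static int
classify(const struct packet *pkt,
	 const struct outer_rule *t0, int n0,
	 const struct inner_rule *t1, int n1)
{
	uint32_t metadata = 0;
	int i, hit = 0;

	for (i = 0; i < n0; i++) {
		if (t0[i].vni == pkt->vni) {
			metadata = t0[i].metadata; /* set by the table 0 action */
			hit = 1;
			break;
		}
	}
	if (!hit)
		return NO_MATCH;

	for (i = 0; i < n1; i++) {
		if (t1[i].metadata == metadata &&
		    t1[i].inner_tcp_dport == pkt->inner_tcp_dport)
			return t1[i].out_port;
	}
	return NO_MATCH;
}
```

The point of the metadata is visible in the second loop: two tunnels carrying
identical inner headers still resolve to different rules in table 1.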

Re: [dpdk-dev] [PATCH v1 11/16] ethdev: refine TPID handling in flow API

2018-04-05 Thread Nélio Laranjeiro
On Wed, Apr 04, 2018 at 05:56:49PM +0200, Adrien Mazarguil wrote:
> TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
> consistent with the normal stacking order of pattern items, which is
> confusing to applications.
> 
> Problem is that when followed by one of these layers, the EtherType field
> of the preceding layer keeps its "inner" definition, and the "outer" TPID
> is provided by the subsequent layer, the reverse of how a packet looks like
> on the wire:
> 
>  Wire: [ ETH TPID = A | VLAN EtherType = B | B DATA ]
>  rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]
> 
> Worse, when QinQ is involved, the stacking order of VLAN layers is
> unspecified. It is unclear whether it should be reversed (innermost to
> outermost) as well given TPID applies to the previous layer:
> 
>  Wire:   [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
>  rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
>  rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]
> 
> While specifying EtherType/TPID is hopefully rarely necessary, the stacking
> order in case of QinQ and the lack of documentation remain an issue.
> 
> This patch replaces TPID in the VLAN pattern item with an inner
> EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
> clarifies documentation and updates all relevant code.
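
To make the wire-order view concrete, here is a small parser sketch. It is an
illustration only, not code from the patch: struct parsed_eth and parse_eth()
are hypothetical names, and it assumes at most two tags with TPID 0x8100 or
0x88a8. Each tag is reported with the type field that follows its TCI, which
is the "inner EtherType/TPID" layout the patch adopts for the VLAN item.

```c
#include <stddef.h>
#include <stdint.h>

#define TPID_8021Q  0x8100
#define TPID_8021AD 0x88a8
#define MAX_TAGS 2

struct parsed_eth {
	uint16_t eth_type;             /* type field of the Ethernet header */
	int n_tags;                    /* number of VLAN tags found */
	uint16_t tci[MAX_TAGS];        /* TCI of each tag, outermost first */
	uint16_t inner_type[MAX_TAGS]; /* type field following each TCI */
};

/* Read a big-endian 16-bit value. */
static uint16_t
rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse the Ethernet header and up to MAX_TAGS VLAN tags; -1 if truncated. */
static int
parse_eth(const uint8_t *buf, size_t len, struct parsed_eth *out)
{
	size_t off = 12; /* skip destination and source MAC */
	uint16_t type;

	if (len < 14)
		return -1;
	type = rd16(buf + off);
	off += 2;
	out->eth_type = type;
	out->n_tags = 0;

	while ((type == TPID_8021Q || type == TPID_8021AD) &&
	       out->n_tags < MAX_TAGS) {
		if (len < off + 4)
			return -1;
		out->tci[out->n_tags] = rd16(buf + off);
		type = rd16(buf + off + 2);
		out->inner_type[out->n_tags] = type;
		out->n_tags++;
		off += 4;
	}
	return 0;
}
```

For a QinQ frame, eth_type holds the outer TPID, inner_type[0] holds the
second TPID and inner_type[1] holds the payload EtherType, in the same order
as the "Wire" rows above.
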
> 
> Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
> items:
> 
> - bnxt: EtherType matching is supported, and vlan->inner_type overrides
>   eth->type if the latter has standard TPID value 0x8100, otherwise an
>   error is triggered.
> 
> - e1000: EtherType matching is only supported with the ETHERTYPE filter,
>   which does not support VLAN matching, therefore no impact.
> 
> - enic: same as bnxt.
> 
> - i40e: same as bnxt with a configurable TPID value for the FDIR filter,
>   with existing limitations on allowed EtherType values. The remaining
>   filter types (VXLAN, NVGRE, QINQ) do not support EtherType matching.
> 
> - ixgbe: same as e1000.
> 
> - mlx4: EtherType/TPID matching is not supported, no impact.
> 
> - mlx5: same as bnxt.
> 
> - mrvl: EtherType matching is supported but eth->type cannot be specified
>   when a VLAN item is present. However vlan->inner_type is used if
>   specified.
> 
> - sfc: same as bnxt with QinQ TPID value 0x88a8 additionally supported.
> 
> - tap: same as bnxt.
> 
> Signed-off-by: Adrien Mazarguil 
> Cc: Ferruh Yigit 
> Cc: Thomas Monjalon 
> Cc: Wenzhuo Lu 
> Cc: Jingjing Wu 
> Cc: Ajit Khaparde 
> Cc: Somnath Kotur 
> Cc: John Daley 
> Cc: Hyong Youb Kim 
> Cc: Beilei Xing 
> Cc: Qi Zhang 
> Cc: Konstantin Ananyev 
> Cc: Nelio Laranjeiro 
> Cc: Yongseok Koh 
> Cc: Jacek Siuda 
> Cc: Tomasz Duszynski 
> Cc: Dmitri Epshtein 
> Cc: Natalie Samsonov 
> Cc: Jianbo Liu 
> Cc: Andrew Rybchenko 
> Cc: Pascal Mazon 
> 
> ---
> 
> Hi PMD maintainers, while I'm pretty confident in these changes, I could
> not validate them with all devices.
> 
> It would be great if you could apply this patch, run testpmd, create VLAN
> flow rules with/without inner EtherType as described and send matching
> traffic while making sure nothing was broken in the process.
> 
> Thanks!
> ---
>  app/test-pmd/cmdline_flow.c | 17 +++---
>  doc/guides/nics/tap.rst |  2 +-
>  doc/guides/prog_guide/rte_flow.rst  | 21 +--
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +-
>  drivers/net/bnxt/bnxt_filter.c  | 38 ++--
>  drivers/net/enic/enic_flow.c| 21 ---
>  drivers/net/i40e/i40e_flow.c| 74 
>  drivers/net/mlx5/mlx5_flow.c| 14 -
>  drivers/net/mvpp2/mrvl_flow.c   | 27 +++--
>  drivers/net/sfc/sfc_flow.c  | 27 +
>  drivers/net/tap/tap_flow.c  | 16 +++--
>  lib/librte_ether/rte_flow.h | 24 +---
>  12 files changed, 227 insertions(+), 58 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 2fbd3d8ef..3a486032d 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -99,11 +99,11 @@ enum index {
>   ITEM_ETH_SRC,
>   ITEM_ETH_TYPE,
>   ITEM_VLAN,
> - ITEM_VLAN_TPID,
>   ITEM_VLAN_TCI,
>   ITEM_VLAN_PCP,
>   ITEM_VLAN_DEI,
>   ITEM_VLAN_VID,
> + ITEM_VLAN_INNER_TYPE,
>   ITEM_IPV4,
>   ITEM_IPV4_TOS,
>   ITEM_IPV4_TTL,
> @@ -505,11 +505,11 @@ static const enum index item_eth[] = {
>  };
>  
>  static const enum index item_vlan[] = {
> - ITEM_VLAN_TPID,
>   ITEM_VLAN_TCI,
>   ITEM_VLAN_PCP,
>   ITEM_VLAN_DEI,
>   ITEM_VLAN_VID,
> + ITEM_VLAN_INNER_TYPE,
>   ITEM_NEXT,
>   ZERO,
>  };
> @@ -1142,12 +1142,6 @@ static const struct token token_list[] = {
>   .next = NEXT(item_vlan),
>   .call = parse_vc,
>   

Re: [dpdk-dev] [dpdk-stable] [PATCH] net/vmxnet3: keep link state consistent

2018-04-05 Thread Thomas Monjalon
20/03/2018 15:12, Ferruh Yigit:
> On 3/18/2018 1:45 AM, Chas Williams wrote:
> > From: Chas Williams 
> > 
> > The vmxnet3 never attempts link speed negotiation.  As a virtual device
> > the link speed is vague at best.  However, it is important for certain
> > applications, like bonding, to see a consistent link_status.  802.3ad
> > requires that only links of the same cost (link speed) be enslaved.
> > Keeping the link status consistent in vmxnet3 avoids races with bonding
> > enslavement.

I don't understand the issue.
Are you sure it is not an issue in bonding?

About the right value to set for virtual PMDs, I don't know, both are fake.
I thought that AUTONEG better conveys the vague link speed you describe.


> > Author: Thomas Monjalon 
> > Date:   Fri Jan 5 18:38:55 2018 +0100
> > 
> > Fixes: 1e3a958f40b3 ("ethdev: fix link autonegotiation value")
> > Cc: sta...@dpdk.org
> 
> There were a few more PMDs [1] that were updated from FIXED to AUTONEG with
> the above commit. Do you think we should update them back to FIXED as well?
> 
> [1]
> pcap
> softnic
> vmxnet3

Yes, they all can be fixed/LINK_FIXED :) I guess





Re: [dpdk-dev] [PATCH v1 01/16] ethdev: update ABI for flow API functions

2018-04-05 Thread Thomas Monjalon
04/04/2018 17:56, Adrien Mazarguil:
> Subsequent patches will modify existing types and slightly alter the
> behavior of the flow API. This warrants a major ABI breakage.
> 
> While it is already taken care of for 18.05 (LIBABIVER was updated to
> version 9 by a prior commit), this patch explicitly adds the affected flow
> API functions as a safety measure.

I don't understand this patch.

If the API is broken, you must move the function from old block to
the new one. And it must be done in the patch modifying the function.


> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> +DPDK_18.05 {
> + global:
> +
> + rte_flow_validate;
> + rte_flow_create;
> + rte_flow_query;
> + rte_flow_copy;
> +
> +} DPDK_18.02;





[dpdk-dev] [PATCH v3 00/21] implement packed virtqueues

2018-04-05 Thread Jens Freimann
This is a basic implementation of packed virtqueues as specified in the
Virtio 1.1 draft. A compiled version of the current draft is available
at https://github.com/oasis-tcs/virtio-docs.git (or as .pdf at
https://github.com/oasis-tcs/virtio-docs/blob/master/virtio-v1.1-packed-wd10.pdf).

It does not yet implement indirect descriptors and checksum offloading.

A packed virtqueue is different from a split virtqueue in that it
consists of only a single descriptor ring that replaces available and
used ring, index and descriptor buffer.

Each descriptor is readable and writable and has a flags field. These flags
mark whether a descriptor is available or used. To detect new available
descriptors even after the ring has wrapped, device and driver each have a
single-bit wrap counter that is flipped from 0 to 1 and vice versa every time
the last descriptor in the ring is used/made available.
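
The flag and wrap counter interplay can be sketched as follows. This is an
illustration based on my reading of the Virtio 1.1 draft rather than code from
this series: the bit positions match the VRING_DESC_F_AVAIL/VRING_DESC_F_USED
defines added in patch 03, while the helper names and struct ring_pos are
hypothetical. A descriptor counts as available when its AVAIL bit equals the
wrap counter and its USED bit is the inverse; it counts as used when both bits
equal the wrap counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define DESC_F_AVAIL (1u << 7)
#define DESC_F_USED  (1u << 15)

/* Position in the descriptor ring plus the single-bit wrap counter. */
struct ring_pos {
	uint16_t idx;
	bool wrap;
};

/* Available: AVAIL bit matches the wrap counter, USED bit is its inverse. */
static bool
desc_is_avail(uint16_t flags, bool wrap)
{
	bool avail = (flags & DESC_F_AVAIL) != 0;
	bool used = (flags & DESC_F_USED) != 0;

	return avail == wrap && used != wrap;
}

/* Used: both the AVAIL and USED bits match the wrap counter. */
static bool
desc_is_used(uint16_t flags, bool wrap)
{
	bool avail = (flags & DESC_F_AVAIL) != 0;
	bool used = (flags & DESC_F_USED) != 0;

	return avail == used && used == wrap;
}

/* Step to the next descriptor, flipping the wrap counter at the ring end. */
static void
ring_pos_advance(struct ring_pos *pos, uint16_t ring_size)
{
	if (++pos->idx >= ring_size) {
		pos->idx = 0;
		pos->wrap = !pos->wrap;
	}
}
```

Because the expected bit pattern inverts on every wrap, a stale descriptor
from the previous pass over the ring never looks newly available.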

The idea behind this is to 1. improve performance by avoiding cache misses
and 2. be easier for devices to implement.

Regarding performance: with these patches I get 21.13 Mpps on my system
as compared to 18.8 Mpps with the virtio 1.0 code. Packet size was 64
bytes, 0.05% acceptable loss. The test setup is as described in
http://dpdk.org/doc/guides/howto/pvp_reference_benchmark.html

Packet generator:
MoonGen
Intel(R) Xeon(R) CPU E5-2665 0 @ 2.40GHz
Intel X710 NIC
RHEL 7.4

Device under test:
Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
Intel X710 NIC
RHEL 7.4

VM on DuT: RHEL7.4

I plan to do more performance tests with bigger frame sizes.

This patch series is based on a prototype implemented by Yuanhan Liu and
Tiwei Bie.

changes from v2->v3:
* implement event suppression
* add code to dump packed virtqueues
* don't use assert in vhost code
* rename virtio-user parameter to packed-vq
* support rxvf flush 

changes from v1->v2:
* don't use VIRTQ_DESC_F_NEXT in used descriptors (Jason)
* no rte_panic() in guest triggerable code (Maxime)
* use unlikely when checking for vq (Maxime)
* rename everything from _1_1 to _packed  (Yuanhan)
* add two more patches to implement mergeable receive buffers 

regards,
Jens 


Jens Freimann (17):
  net/virtio: by default disable packed virtqueues
  net/virtio: vring init for packed queues
  net/virtio: add virtio 1.1 defines
  net/virtio: add packed virtqueue helpers
  net/virtio: dump packed virtqueue data
  net/virtio: implement transmit path for packed queues
  vhost: disable packed virtqueues by default
  vhost: turn off indirect descriptors for packed virtqueues
  vhost: add virtio 1.1 defines
  vhost: add helpers for packed virtqueues
  vhost: dequeue for packed queues
  vhost: packed queue enqueue path
  net/virtio: disable ctrl virtqueue for packed rings
  net/virtio: add support for mergeable buffers with packed virtqueues
  vhost: support mergeable rx buffers with packed queues
  net/virtio: add support for event suppression
  vhost: add event suppression for packed queues

Yuanhan Liu (4):
  net/virtio-user: add option to use packed queues
  net/virtio: implement receive path for packed queues
  vhost: vring address setup for packed queues
  vhost: enable packed virtqueues

 config/common_base   |   2 +
 drivers/net/virtio/virtio_ethdev.c   |  51 ++-
 drivers/net/virtio/virtio_ethdev.h   |   7 +-
 drivers/net/virtio/virtio_pci.h  |   8 +
 drivers/net/virtio/virtio_ring.h |  91 +++-
 drivers/net/virtio/virtio_rxtx.c | 359 +++-
 drivers/net/virtio/virtio_user/virtio_user_dev.c |  12 +-
 drivers/net/virtio/virtio_user/virtio_user_dev.h |   3 +-
 drivers/net/virtio/virtio_user_ethdev.c  |  15 +-
 drivers/net/virtio/virtqueue.c   |  10 +
 drivers/net/virtio/virtqueue.h   |  91 +++-
 lib/librte_vhost/socket.c|   5 +
 lib/librte_vhost/vhost.c |  16 +-
 lib/librte_vhost/vhost.h |  69 ++-
 lib/librte_vhost/vhost_user.c|  40 +-
 lib/librte_vhost/virtio-1.1.h|  62 +++
 lib/librte_vhost/virtio_net.c| 507 +--
 17 files changed, 1271 insertions(+), 77 deletions(-)
 create mode 100644 lib/librte_vhost/virtio-1.1.h

-- 
2.14.3



[dpdk-dev] [PATCH v3 03/21] net/virtio: add virtio 1.1 defines

2018-04-05 Thread Jens Freimann
Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ring.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio/virtio_ring.h b/drivers/net/virtio/virtio_ring.h
index 1bd7ba98e..54a11d2a9 100644
--- a/drivers/net/virtio/virtio_ring.h
+++ b/drivers/net/virtio/virtio_ring.h
@@ -16,7 +16,10 @@
 #define VRING_DESC_F_WRITE  2
 /* This means the buffer contains a list of buffer descriptors. */
 #define VRING_DESC_F_INDIRECT   4
-
+/* This flag means the descriptor was made available by the driver */
+#define VRING_DESC_F_AVAIL (1ULL << 7)
+/* This flag means the descriptor was used by the device */
+#define VRING_DESC_F_USED  (1ULL << 15)
 /* The Host uses this in used->flags to advise the Guest: don't kick me
  * when you add a buffer.  It's unreliable, so it's simply an
  * optimization.  Guest will still kick if it's out of buffers. */
-- 
2.14.3



[dpdk-dev] [PATCH v3 02/21] net/virtio: vring init for packed queues

2018-04-05 Thread Jens Freimann
Add and initialize descriptor data structures.

Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ethdev.c | 22 +---
 drivers/net/virtio/virtio_pci.h|  8 ++
 drivers/net/virtio/virtio_ring.h   | 53 ++
 drivers/net/virtio/virtqueue.h | 10 +++
 4 files changed, 78 insertions(+), 15 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 06fbf7311..cccefafe9 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -298,19 +298,21 @@ virtio_init_vring(struct virtqueue *vq)
 
PMD_INIT_FUNC_TRACE();
 
-   /*
-* Reinitialise since virtio port might have been stopped and restarted
-*/
memset(ring_mem, 0, vq->vq_ring_size);
-   vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
-   vq->vq_used_cons_idx = 0;
-   vq->vq_desc_head_idx = 0;
-   vq->vq_avail_idx = 0;
-   vq->vq_desc_tail_idx = (uint16_t)(vq->vq_nentries - 1);
+   vring_init(vq->hw, vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
+
vq->vq_free_cnt = vq->vq_nentries;
memset(vq->vq_descx, 0, sizeof(struct vq_desc_extra) * vq->vq_nentries);
+   vq->vq_used_cons_idx = 0;
+   vq->vq_avail_idx = 0;
+   if (vtpci_packed_queue(vq->hw)) {
+   vring_desc_init_packed(vr, size);
+   } else {
+   vq->vq_desc_head_idx = 0;
+   vq->vq_desc_tail_idx = (uint16_t)(vq->vq_nentries - 1);
 
-   vring_desc_init(vr->desc, size);
+   vring_desc_init(vr->desc, size);
+   }
 
/*
 * Disable device(host) interrupting guest
@@ -385,7 +387,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
/*
 * Reserve a memzone for vring elements
 */
-   size = vring_size(vq_size, VIRTIO_PCI_VRING_ALIGN);
+   size = vring_size(hw, vq_size, VIRTIO_PCI_VRING_ALIGN);
vq->vq_ring_size = RTE_ALIGN_CEIL(size, VIRTIO_PCI_VRING_ALIGN);
PMD_INIT_LOG(DEBUG, "vring_size: %d, rounded_vring_size: %d",
 size, vq->vq_ring_size);
diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index a28ba8339..528fb46b9 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -112,6 +112,8 @@ struct virtnet_ctl;
 
 #define VIRTIO_F_VERSION_1 32
 #define VIRTIO_F_IOMMU_PLATFORM33
+#define VIRTIO_F_RING_PACKED   34
+#define VIRTIO_F_IN_ORDER  35
 
 /*
  * Some VirtIO feature bits (currently bits 28 through 31) are
@@ -304,6 +306,12 @@ vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
return (hw->guest_features & (1ULL << bit)) != 0;
 }
 
+static inline int
+vtpci_packed_queue(struct virtio_hw *hw)
+{
+   return vtpci_with_feature(hw, VIRTIO_F_RING_PACKED);
+}
+
 /*
  * Function declaration from virtio_pci.c
  */
diff --git a/drivers/net/virtio/virtio_ring.h b/drivers/net/virtio/virtio_ring.h
index 9e3c2a015..1bd7ba98e 100644
--- a/drivers/net/virtio/virtio_ring.h
+++ b/drivers/net/virtio/virtio_ring.h
@@ -9,6 +9,7 @@
 
 #include 
 
+
 /* This marks a buffer as continuing via the next field. */
 #define VRING_DESC_F_NEXT   1
 /* This marks a buffer as write-only (otherwise read-only). */
@@ -54,11 +55,38 @@ struct vring_used {
struct vring_used_elem ring[0];
 };
 
+/* For support of packed virtqueues in Virtio 1.1 the format of descriptors
+ * looks like this.
+ */
+struct vring_desc_packed {
+   uint64_t addr;
+   uint32_t len;
+   uint16_t index;
+   uint16_t flags;
+};
+
+#define RING_EVENT_FLAGS_ENABLE 0x0
+#define RING_EVENT_FLAGS_DISABLE 0x1
+#define RING_EVENT_FLAGS_DESC 0x2
+struct vring_packed_desc_event {
+   uint16_t desc_event_off_wrap;
+   uint16_t desc_event_flags;
+};
+
 struct vring {
unsigned int num;
-   struct vring_desc  *desc;
-   struct vring_avail *avail;
-   struct vring_used  *used;
+   union {
+   struct vring_desc_packed *desc_packed;
+   struct vring_desc *desc;
+   };
+   union {
+   struct vring_avail *avail;
+   struct vring_packed_desc_event *driver_event;
+   };
+   union {
+   struct vring_used  *used;
+   struct vring_packed_desc_event *device_event;
+   };
 };
 
 /* The standard layout for the ring is a continuous chunk of memory which
@@ -95,10 +123,16 @@ struct vring {
 #define vring_avail_event(vr) (*(uint16_t *)&(vr)->used->ring[(vr)->num])
 
 static inline size_t
-vring_size(unsigned int num, unsigned long align)
+vring_size(struct virtio_hw *hw, unsigned int num, unsigned long align)
 {
size_t size;
 
+   if (vtpci_packed_queue(hw)) {
+   size = num * sizeof(struct vring_desc_packed);
+   size += 2 * sizeof(struct vring_packed_desc_event);
+   return size;
+   }
+

[dpdk-dev] [PATCH v3 01/21] net/virtio: by default disable packed virtqueues

2018-04-05 Thread Jens Freimann
Disable packed virtqueues for now and make them dependent on a build-time
config option. This can be reverted once missing features such as
indirect descriptors are implemented.

Signed-off-by: Jens Freimann 
---
 config/common_base | 1 +
 drivers/net/virtio/virtio_ethdev.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/config/common_base b/config/common_base
index c09c7cf88..cd4b419b4 100644
--- a/config/common_base
+++ b/config/common_base
@@ -346,6 +346,7 @@ CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
 CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
 CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
 CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
+CONFIG_RTE_LIBRTE_VIRTIO_PQ=n
 
 #
 # Compile virtio device emulation inside virtio PMD driver
diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 11f758929..06fbf7311 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1149,6 +1149,10 @@ virtio_negotiate_features(struct virtio_hw *hw, uint64_t req_features)
req_features &= ~(1ULL << VIRTIO_NET_F_MTU);
}
 
+#ifndef RTE_LIBRTE_VIRTIO_PQ
+   req_features &= ~(1ull << VIRTIO_F_RING_PACKED);
+#endif
+
/*
 * Negotiate features: Subset of device feature bits are written back
 * guest feature bits.
-- 
2.14.3



[dpdk-dev] [PATCH v3 05/21] net/virtio: dump packed virtqueue data

2018-04-05 Thread Jens Freimann
Add support to dump packed virtqueue data to the
VIRTQUEUE_DUMP() macro.

Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtqueue.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index cc2e7c0f6..7e265bf93 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -360,6 +360,13 @@ virtqueue_notify(struct virtqueue *vq)
 
 #ifdef RTE_LIBRTE_VIRTIO_DEBUG_DUMP
 #define VIRTQUEUE_DUMP(vq) do { \
+   if (vtpci_packed_queue((vq)->hw)) { \
+ PMD_INIT_LOG(DEBUG, \
+ "VQ: - size=%d; free=%d; last_used_idx=%d;" \
+ (vq)->vq_nentries, (vq)->vq_free_cnt, nused); \
+ break; \
+ } \
+   if (vtpci_packed_queue((vq)->hw)) break; \
uint16_t used_idx, nused; \
used_idx = (vq)->vq_ring.used->idx; \
nused = (uint16_t)(used_idx - (vq)->vq_used_cons_idx); \
-- 
2.14.3



[dpdk-dev] [PATCH v3 06/21] net/virtio-user: add option to use packed queues

2018-04-05 Thread Jens Freimann
From: Yuanhan Liu 

Add option to enable packed queue support for virtio-user
devices.

Signed-off-by: Yuanhan Liu 
---
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 12 ++--
 drivers/net/virtio/virtio_user/virtio_user_dev.h |  3 ++-
 drivers/net/virtio/virtio_user_ethdev.c  | 15 ++-
 3 files changed, 26 insertions(+), 4 deletions(-)
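
Note on the new devarg: after the device features have been read,
packed_vq either forces VIRTIO_F_RING_PACKED on or masks it out. A
minimal sketch of that bit handling is below. pq_apply_packed_option()
and PQ_F_RING_PACKED are illustrative names, not part of the patch; the
bit number 34 is the one this series defines in virtio_pci.h.

```c
#include <stdint.h>

/* Same bit number as VIRTIO_F_RING_PACKED in this series. */
#define PQ_F_RING_PACKED 34

/* Hypothetical helper: apply the packed_vq option to a feature word.
 * The shift must be done on a 64-bit constant because bit 34 does not
 * fit in a 32-bit int.
 */
static inline uint64_t
pq_apply_packed_option(uint64_t device_features, int packed_vq)
{
	if (packed_vq)
		device_features |= (1ULL << PQ_F_RING_PACKED);
	else
		device_features &= ~(1ULL << PQ_F_RING_PACKED);
	return device_features;
}
```

Passing packed_vq=1 in the virtio-user devargs corresponds to the first
branch; leaving the option out keeps the bit cleared.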

diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index f90fee9e5..e5e7af185 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -314,11 +314,13 @@ virtio_user_dev_setup(struct virtio_user_dev *dev)
 1ULL << VIRTIO_NET_F_GUEST_CSUM|   \
 1ULL << VIRTIO_NET_F_GUEST_TSO4|   \
 1ULL << VIRTIO_NET_F_GUEST_TSO6|   \
-1ULL << VIRTIO_F_VERSION_1)
+1ULL << VIRTIO_F_VERSION_1 |   \
+1ULL << VIRTIO_F_RING_PACKED)
 
 int
 virtio_user_dev_init(struct virtio_user_dev *dev, char *path, int queues,
-int cq, int queue_size, const char *mac, char **ifname)
+int cq, int queue_size, const char *mac, char **ifname,
+int packed_vq)
 {
snprintf(dev->path, PATH_MAX, "%s", path);
dev->max_queue_pairs = queues;
@@ -347,6 +349,12 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char *path, int queues,
PMD_INIT_LOG(ERR, "get_features failed: %s", strerror(errno));
return -1;
}
+
+   if (packed_vq)
+   dev->device_features |= (1ull << VIRTIO_F_RING_PACKED);
+   else
+   dev->device_features &= ~(1ull << VIRTIO_F_RING_PACKED);
+
if (dev->mac_specified)
dev->device_features |= (1ull << VIRTIO_NET_F_MAC);
 
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index 5f8755771..50c60c8a7 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -42,7 +42,8 @@ int is_vhost_user_by_type(const char *path);
 int virtio_user_start_device(struct virtio_user_dev *dev);
 int virtio_user_stop_device(struct virtio_user_dev *dev);
 int virtio_user_dev_init(struct virtio_user_dev *dev, char *path, int queues,
-int cq, int queue_size, const char *mac, char **ifname);
+int cq, int queue_size, const char *mac, char **ifname,
+int packed_vq);
 void virtio_user_dev_uninit(struct virtio_user_dev *dev);
 void virtio_user_handle_cq(struct virtio_user_dev *dev, uint16_t queue_idx);
 #endif
diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c
index 263649006..1c6086f14 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -278,6 +278,8 @@ static const char *valid_args[] = {
VIRTIO_USER_ARG_QUEUE_SIZE,
 #define VIRTIO_USER_ARG_INTERFACE_NAME "iface"
VIRTIO_USER_ARG_INTERFACE_NAME,
+#define VIRTIO_USER_ARG_PACKED_VQ "packed_vq"
+   VIRTIO_USER_ARG_PACKED_VQ,
NULL
 };
 
@@ -382,6 +384,7 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
char *ifname = NULL;
char *mac_addr = NULL;
int ret = -1;
+   uint64_t packed_vq = 0;
 
kvlist = rte_kvargs_parse(rte_vdev_device_args(dev), valid_args);
if (!kvlist) {
@@ -456,6 +459,15 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
cq = 1;
}
 
+   if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_PACKED_VQ) == 1) {
+   if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_PACKED_VQ,
+  &get_integer_arg, &packed_vq) < 0) {
+   PMD_INIT_LOG(ERR, "error to parse %s",
+VIRTIO_USER_ARG_PACKED_VQ);
+   goto end;
+   }
+   }
+
if (queues > 1 && cq == 0) {
PMD_INIT_LOG(ERR, "multi-q requires ctrl-q");
goto end;
@@ -477,7 +489,8 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev)
 
hw = eth_dev->data->dev_private;
if (virtio_user_dev_init(hw->virtio_user_dev, path, queues, cq,
-queue_size, mac_addr, &ifname) < 0) {
+queue_size, mac_addr, &ifname,
+packed_vq) < 0) {
PMD_INIT_LOG(ERR, "virtio_user_dev_init fails");
virtio_user_eth_dev_free(eth_dev);
goto end;
-- 
2.14.3



[dpdk-dev] [PATCH v3 07/21] net/virtio: implement transmit path for packed queues

2018-04-05 Thread Jens Freimann
This implements the transmit path for devices with
support for Virtio 1.1.

Add the feature bit for Virtio 1.1 and enable code to
add buffers to vring and mark descriptors as available.

This is based on a patch by Yuanhan Liu.

Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ethdev.c |   8 ++-
 drivers/net/virtio/virtio_ethdev.h |   3 ++
 drivers/net/virtio/virtio_rxtx.c   | 102 -
 3 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index cccefafe9..089a161ac 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -383,6 +383,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx)
vq->hw = hw;
vq->vq_queue_index = vtpci_queue_idx;
vq->vq_nentries = vq_size;
+   if (vtpci_packed_queue(hw))
+   vq->vq_ring.avail_wrap_counter = 1;
 
/*
 * Reserve a memzone for vring elements
@@ -1328,7 +1330,11 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
eth_dev->rx_pkt_burst = &virtio_recv_pkts;
}
 
-   if (hw->use_simple_tx) {
+   if (vtpci_packed_queue(hw)) {
+   PMD_INIT_LOG(INFO, "virtio: using virtio 1.1 Tx path on port %u",
+   eth_dev->data->port_id);
+   eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed;
+   } else if (hw->use_simple_tx) {
PMD_INIT_LOG(INFO, "virtio: using simple Tx path on port %u",
eth_dev->data->port_id);
eth_dev->tx_pkt_burst = virtio_xmit_pkts_simple;
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index bb40064ea..d457013cb 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -36,6 +36,7 @@
 1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE |  \
 1u << VIRTIO_RING_F_INDIRECT_DESC |\
 1ULL << VIRTIO_F_VERSION_1   | \
+1ULL << VIRTIO_F_RING_PACKED | \
 1ULL << VIRTIO_F_IOMMU_PLATFORM)
 
 #define VIRTIO_PMD_SUPPORTED_GUEST_FEATURES\
@@ -85,6 +86,8 @@ uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 
 uint16_t virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);
+uint16_t virtio_xmit_pkts_packed(void *tx_queue, struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
 
 uint16_t virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index a8aa87b32..9f9b5a8f8 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -38,6 +38,101 @@
 #define  VIRTIO_DUMP_PACKET(m, len) do { } while (0)
 #endif
 
+#define VIRTIO_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \
+   ETH_TXQ_FLAGS_NOOFFLOADS)
+
+/* Cleanup from completed transmits. */
+static void
+virtio_xmit_cleanup_packed(struct virtqueue *vq)
+{
+   uint16_t idx;
+   uint16_t size = vq->vq_nentries;
+   struct vring_desc_packed *desc = vq->vq_ring.desc_packed;
+
+   idx = vq->vq_used_cons_idx & (size - 1);
+   while (desc_is_used(&desc[idx]) &&
+  vq->vq_free_cnt < size) {
+   vq->vq_free_cnt++;
+   idx = ++vq->vq_used_cons_idx & (size - 1);
+   }
+}
+
+uint16_t
+virtio_xmit_pkts_packed(void *tx_queue, struct rte_mbuf **tx_pkts,
+uint16_t nb_pkts)
+{
+   struct virtnet_tx *txvq = tx_queue;
+   struct virtqueue *vq = txvq->vq;
+   uint16_t i;
+   struct vring_desc_packed *desc = vq->vq_ring.desc_packed;
+   uint16_t idx;
+   struct vq_desc_extra *dxp;
+
+   if (unlikely(nb_pkts < 1))
+   return nb_pkts;
+
+   PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts);
+
+   if (likely(vq->vq_free_cnt < vq->vq_free_thresh))
+   virtio_xmit_cleanup_packed(vq);
+
+   for (i = 0; i < nb_pkts; i++) {
+   struct rte_mbuf *txm = tx_pkts[i];
+   struct virtio_tx_region *txr = txvq->virtio_net_hdr_mz->addr;
+   uint16_t head_idx;
+   int wrap_counter;
+
+   if (unlikely(txm->nb_segs + 1 > vq->vq_free_cnt)) {
+   virtio_xmit_cleanup_packed(vq);
+
+   if (unlikely(txm->nb_segs + 1 > vq->vq_free_cnt)) {
+   PMD_TX_LOG(ERR,
+  "No free tx descriptors to transmit");
+   break;
+   }
+   }
+
+   txvq->stats.bytes += txm->pkt_len;
+
+   vq->vq_free_cnt -= txm->nb_segs + 1;
+
+   idx = (vq->vq_avail_idx++) & (vq->vq_nentries - 1);
+   head_idx = idx;
+   wrap_counter = vq->vq_ring.avail_wrap_counter;
+
+ 

[dpdk-dev] [PATCH v3 04/21] net/virtio: add packed virtqueue helpers

2018-04-05 Thread Jens Freimann
Add helper functions to set/clear and check descriptor flags.

Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ring.h | 33 +
 drivers/net/virtio/virtqueue.c   | 10 ++
 2 files changed, 43 insertions(+)
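
Note on the flag encoding used by these helpers: the driver marks a
descriptor available by writing AVAIL equal to its wrap counter and USED
to the inverse, and desc_is_used() treats a descriptor as used when both
bits are equal. A standalone sketch of that logic, operating on a bare
flags word, is shown here. The pq_* names are illustrative only; the bit
positions are the ones defined in patch 03/21.

```c
#include <stdint.h>

#define PQ_DESC_F_AVAIL ((uint16_t)(1u << 7))
#define PQ_DESC_F_USED  ((uint16_t)(1u << 15))

/* Driver side: return flags marked available for the given wrap counter. */
static inline uint16_t
pq_flags_set_avail(uint16_t flags, int wrap_counter)
{
	if (wrap_counter) {
		flags |= PQ_DESC_F_AVAIL;
		flags &= (uint16_t)~PQ_DESC_F_USED;
	} else {
		flags &= (uint16_t)~PQ_DESC_F_AVAIL;
		flags |= PQ_DESC_F_USED;
	}
	return flags;
}

/* A descriptor counts as used when the AVAIL and USED bits are equal. */
static inline int
pq_flags_is_used(uint16_t flags)
{
	return !(flags & PQ_DESC_F_AVAIL) == !(flags & PQ_DESC_F_USED);
}
```

One consequence worth keeping in mind: a freshly zeroed ring has both
bits clear, so every descriptor reads as used until the driver makes it
available.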

diff --git a/drivers/net/virtio/virtio_ring.h b/drivers/net/virtio/virtio_ring.h
index 54a11d2a9..663b4b01d 100644
--- a/drivers/net/virtio/virtio_ring.h
+++ b/drivers/net/virtio/virtio_ring.h
@@ -78,6 +78,7 @@ struct vring_packed_desc_event {
 
 struct vring {
unsigned int num;
+   unsigned int avail_wrap_counter;
union {
struct vring_desc_packed *desc_packed;
struct vring_desc *desc;
@@ -92,6 +93,38 @@ struct vring {
};
 };
 
+static inline void toggle_wrap_counter(struct vring *vr)
+{
+   vr->avail_wrap_counter ^= 1;
+}
+
+static inline void _set_desc_avail(struct vring_desc_packed *desc,
+  int wrap_counter)
+{
+   uint16_t flags = desc->flags;
+
+   if (wrap_counter) {
+   flags |= VRING_DESC_F_AVAIL;
+   flags &= ~VRING_DESC_F_USED;
+   } else {
+   flags &= ~VRING_DESC_F_AVAIL;
+   flags |= VRING_DESC_F_USED;
+   }
+
+   desc->flags = flags;
+}
+
+static inline void set_desc_avail(struct vring *vr,
+ struct vring_desc_packed *desc)
+{
+   _set_desc_avail(desc, vr->avail_wrap_counter);
+}
+
+static inline int desc_is_used(struct vring_desc_packed *desc)
+{
+   return !(desc->flags & VRING_DESC_F_AVAIL) == !(desc->flags & VRING_DESC_F_USED);
+}
+
 /* The standard layout for the ring is a continuous chunk of memory which
  * looks like this.  We assume num is a power of 2.
  *
diff --git a/drivers/net/virtio/virtqueue.c b/drivers/net/virtio/virtqueue.c
index a7d0a9cbe..4f95ed5c8 100644
--- a/drivers/net/virtio/virtqueue.c
+++ b/drivers/net/virtio/virtqueue.c
@@ -58,6 +58,7 @@ virtqueue_detach_unused(struct virtqueue *vq)
 void
 virtqueue_rxvq_flush(struct virtqueue *vq)
 {
+   struct vring_desc_packed *descs = vq->vq_ring.desc_packed;
struct virtnet_rx *rxq = &vq->rxq;
struct virtio_hw *hw = vq->hw;
struct vring_used_elem *uep;
@@ -65,6 +66,15 @@ virtqueue_rxvq_flush(struct virtqueue *vq)
uint16_t used_idx, desc_idx;
uint16_t nb_used, i;
 
+   if (vtpci_packed_queue(vq->hw)) {
+   i = vq->vq_used_cons_idx & (vq->vq_nentries - 1);
+   while (desc_is_used(&descs[i])) {
+   rte_pktmbuf_free(vq->sw_ring[i]);
+   vq->vq_free_cnt++;
+   }
+   return;
+   }
+
nb_used = VIRTQUEUE_NUSED(vq);
 
for (i = 0; i < nb_used; i++) {
-- 
2.14.3



[dpdk-dev] [PATCH v3 08/21] net/virtio: implement receive path for packed queues

2018-04-05 Thread Jens Freimann
From: Yuanhan Liu 

Implement the receive part here. No support for mergeable buffers yet.

Signed-off-by: Jens Freimann 
Signed-off-by: Yuanhan Liu 
---
 drivers/net/virtio/virtio_ethdev.c |  10 ++-
 drivers/net/virtio/virtio_ethdev.h |   2 +
 drivers/net/virtio/virtio_rxtx.c   | 137 -
 3 files changed, 146 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 089a161ac..dc220c743 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1315,10 +1315,15 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
 {
struct virtio_hw *hw = eth_dev->data->dev_private;
 
-   if (hw->use_simple_rx) {
+   /* workaround for packed vqs which don't support mrg_rxbuf at this point */
+   if (vtpci_packed_queue(hw) && vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
+   eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed;
+   } else if (hw->use_simple_rx) {
PMD_INIT_LOG(INFO, "virtio: using simple Rx path on port %u",
eth_dev->data->port_id);
eth_dev->rx_pkt_burst = virtio_recv_pkts_vec;
+   } else if (vtpci_packed_queue(hw)) {
+   eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed;
} else if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
PMD_INIT_LOG(INFO,
"virtio: using mergeable buffer Rx path on port %u",
@@ -1474,7 +1479,8 @@ virtio_init_device(struct rte_eth_dev *eth_dev, uint64_t req_features)
 
/* Setting up rx_header size for the device */
if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF) ||
-   vtpci_with_feature(hw, VIRTIO_F_VERSION_1))
+   vtpci_with_feature(hw, VIRTIO_F_VERSION_1) ||
+   vtpci_with_feature(hw, VIRTIO_F_RING_PACKED))
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
else
hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index d457013cb..3aeced4bb 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -80,6 +80,8 @@ int virtio_dev_tx_queue_setup_finish(struct rte_eth_dev *dev,
 
 uint16_t virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
+uint16_t virtio_recv_pkts_packed(void *rx_queue, struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts);
 
 uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts);
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 9f9b5a8f8..9220ae661 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -31,6 +31,7 @@
 #include "virtqueue.h"
 #include "virtio_rxtx.h"
 #include "virtio_rxtx_simple.h"
+#include "virtio_ring.h"
 
 #ifdef RTE_LIBRTE_VIRTIO_DEBUG_DUMP
 #define VIRTIO_DUMP_PACKET(m, len) rte_pktmbuf_dump(stdout, m, len)
@@ -521,10 +522,38 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx)
struct virtnet_rx *rxvq = &vq->rxq;
struct rte_mbuf *m;
uint16_t desc_idx;
-   int error, nbufs;
+   int error, nbufs = 0;
 
PMD_INIT_FUNC_TRACE();
 
+   if (vtpci_packed_queue(hw)) {
+   struct vring_desc_packed *desc;
+   struct vq_desc_extra *dxp;
+
+   for (desc_idx = 0; desc_idx < vq->vq_nentries;
+   desc_idx++) {
+   m = rte_mbuf_raw_alloc(rxvq->mpool);
+   if (unlikely(m == NULL))
+   return -ENOMEM;
+
+   dxp = &vq->vq_descx[desc_idx];
+   dxp->cookie = m;
+   dxp->ndescs = 1;
+
+   desc = &vq->vq_ring.desc_packed[desc_idx];
+   desc->addr = VIRTIO_MBUF_ADDR(m, vq) +
+   RTE_PKTMBUF_HEADROOM - hw->vtnet_hdr_size;
+   desc->len = m->buf_len - RTE_PKTMBUF_HEADROOM +
+   hw->vtnet_hdr_size;
+   desc->flags |= VRING_DESC_F_WRITE;
+   rte_smp_wmb();
+   set_desc_avail(&vq->vq_ring, desc);
+   }
+   toggle_wrap_counter(&vq->vq_ring);
+   nbufs = desc_idx;
+   goto out;
+   }
+
/* Allocate blank mbufs for the each rx descriptor */
nbufs = 0;
 
@@ -569,6 +598,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx)
vq_update_avail_idx(vq);
}
 
+out:
PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs);
 
VIRTQUEUE_DUMP(vq);
@@ -799,6 +829,111 @@ rx_offload_enabled(struct virtio_hw *hw)
vtpci_with_feature(hw, VIRTIO_NET_F_GUE

[dpdk-dev] [PATCH v3 11/21] vhost: add virtio 1.1 defines

2018-04-05 Thread Jens Freimann
This should actually be in the kernel header file, but it isn't
yet. For now let's use our own headers.

Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/vhost.h  |  4 
 lib/librte_vhost/virtio-1.1.h | 18 ++
 2 files changed, 22 insertions(+)
 create mode 100644 lib/librte_vhost/virtio-1.1.h

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index c14a90529..3004c26c1 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -177,6 +177,10 @@ struct vhost_msg {
 #ifndef VIRTIO_F_VERSION_1
  #define VIRTIO_F_VERSION_1 32
 #endif
+#ifndef VIRTIO_F_RING_PACKED
+ #define VIRTIO_F_RING_PACKED 34
+#endif
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
 
 /* Features supported by this builtin vhost-user net driver. */
 #define VIRTIO_NET_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
diff --git a/lib/librte_vhost/virtio-1.1.h b/lib/librte_vhost/virtio-1.1.h
new file mode 100644
index 0..7b48caed7
--- /dev/null
+++ b/lib/librte_vhost/virtio-1.1.h
@@ -0,0 +1,18 @@
+#ifndef __VIRTIO_PACKED_H
+#define __VIRTIO_PACKED_H
+
+#define VRING_DESC_F_NEXT   1
+#define VRING_DESC_F_WRITE  2
+#define VRING_DESC_F_INDIRECT   4
+
+#define VRING_DESC_F_AVAIL  (1ULL << 7)
+#define VRING_DESC_F_USED  (1ULL << 15)
+
+struct vring_desc_packed {
+   uint64_t addr;
+   uint32_t len;
+   uint16_t index;
+   uint16_t flags;
+};
+
+#endif /* __VIRTIO_PACKED_H */
-- 
2.14.3



[dpdk-dev] [PATCH v3 09/21] vhost: disable packed virtqueues by default

2018-04-05 Thread Jens Freimann
Signed-off-by: Jens Freimann 
---
 config/common_base| 1 +
 lib/librte_vhost/socket.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/config/common_base b/config/common_base
index cd4b419b4..bf969d82d 100644
--- a/config/common_base
+++ b/config/common_base
@@ -783,6 +783,7 @@ CONFIG_RTE_LIBRTE_PDUMP=y
 CONFIG_RTE_LIBRTE_VHOST=n
 CONFIG_RTE_LIBRTE_VHOST_NUMA=n
 CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
+CONFIG_RTE_LIBRTE_VHOST_PQ=n
 
 #
 # Compile vhost PMD
diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 636fc25c6..72d769e6a 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -850,6 +850,10 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
vsocket->features &= ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
}
 
+#ifndef RTE_LIBRTE_VHOST_PQ
+   vsocket->features &= ~(1ULL << VIRTIO_F_RING_PACKED);
+#endif
+
if ((flags & RTE_VHOST_USER_CLIENT) != 0) {
vsocket->reconnect = !(flags & RTE_VHOST_USER_NO_RECONNECT);
if (vsocket->reconnect && reconn_tid == 0) {
-- 
2.14.3



[dpdk-dev] [PATCH v3 10/21] vhost: turn off indirect descriptors for packed virtqueues

2018-04-05 Thread Jens Freimann
Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/socket.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c
index 72d769e6a..05193e368 100644
--- a/lib/librte_vhost/socket.c
+++ b/lib/librte_vhost/socket.c
@@ -852,6 +852,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
 
 #ifndef RTE_LIBRTE_VHOST_PQ
vsocket->features &= ~(1ULL << VIRTIO_F_RING_PACKED);
+   vsocket->features &= ~(1ULL << VIRTIO_RING_F_INDIRECT_DESC);
 #endif
 
if ((flags & RTE_VHOST_USER_CLIENT) != 0) {
-- 
2.14.3



[dpdk-dev] [PATCH v3 13/21] vhost: add helpers for packed virtqueues

2018-04-05 Thread Jens Freimann
Add some helper functions to set/check descriptor flags
and toggle the used wrap counter.

Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/virtio-1.1.h | 44 +++
 1 file changed, 44 insertions(+)
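
Note on the device-side checks: desc_is_avail() compares the AVAIL/USED
bits against the used wrap counter, and set_desc_used() writes both bits
to the counter's value. Below is a compact standalone sketch of the same
truth table on a bare flags word. The pq_* names are illustrative, and
the sketch assumes the wrap counter is always exactly 0 or 1.

```c
#include <stdint.h>

#define PQ_DESC_F_AVAIL ((uint16_t)(1u << 7))
#define PQ_DESC_F_USED  ((uint16_t)(1u << 15))

/* Device side: is the descriptor available in the current lap?
 * AVAIL must equal the wrap counter and USED must be its inverse.
 * used_wrap_counter is assumed to be 0 or 1.
 */
static inline int
pq_dev_is_avail(uint16_t flags, int used_wrap_counter)
{
	int avail = !!(flags & PQ_DESC_F_AVAIL);
	int used = !!(flags & PQ_DESC_F_USED);

	return avail == used_wrap_counter && used != used_wrap_counter;
}

/* Device side: mark used by setting both bits to the wrap counter. */
static inline uint16_t
pq_dev_set_used(uint16_t flags, int used_wrap_counter)
{
	if (used_wrap_counter)
		flags |= PQ_DESC_F_USED | PQ_DESC_F_AVAIL;
	else
		flags &= (uint16_t)~(PQ_DESC_F_USED | PQ_DESC_F_AVAIL);
	return flags;
}
```

Once a descriptor has been marked used, pq_dev_is_avail() returns 0 for
it under the same wrap counter, so the device does not process it twice
in one lap.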

diff --git a/lib/librte_vhost/virtio-1.1.h b/lib/librte_vhost/virtio-1.1.h
index 7b48caed7..e77d7aa6c 100644
--- a/lib/librte_vhost/virtio-1.1.h
+++ b/lib/librte_vhost/virtio-1.1.h
@@ -15,4 +15,48 @@ struct vring_desc_packed {
uint16_t flags;
 };
 
+static inline void
+toggle_wrap_counter(struct vhost_virtqueue *vq)
+{
+   vq->used_wrap_counter ^= 1;
+}
+
+static inline int
+desc_is_avail(struct vhost_virtqueue *vq, struct vring_desc_packed *desc)
+{
+   if (vq->used_wrap_counter == 1) {
+   if ((desc->flags & VRING_DESC_F_AVAIL) &&
+   !(desc->flags & VRING_DESC_F_USED))
+   return 1;
+   }
+   if (vq->used_wrap_counter == 0) {
+   if (!(desc->flags & VRING_DESC_F_AVAIL) &&
+   (desc->flags & VRING_DESC_F_USED))
+   return 1;
+   }
+   return 0;
+}
+
+static inline void
+_set_desc_used(struct vring_desc_packed *desc, int wrap_counter)
+{
+   uint16_t flags = desc->flags;
+
+   if (wrap_counter == 1) {
+   flags |= VRING_DESC_F_USED;
+   flags |= VRING_DESC_F_AVAIL;
+   } else {
+   flags &= ~VRING_DESC_F_USED;
+   flags &= ~VRING_DESC_F_AVAIL;
+   }
+
+   desc->flags = flags;
+}
+
+static inline void
+set_desc_used(struct vhost_virtqueue *vq, struct vring_desc_packed *desc)
+{
+   _set_desc_used(desc, vq->used_wrap_counter);
+}
+
 #endif /* __VIRTIO_PACKED_H */
-- 
2.14.3



[dpdk-dev] [PATCH v3 15/21] vhost: packed queue enqueue path

2018-04-05 Thread Jens Freimann
Implement enqueue of packets to the receive virtqueue.

Set descriptor flag VIRTQ_DESC_F_USED and toggle used wrap counter if
last descriptor in ring is used. Perform a write memory barrier before
flags are written to descriptor.

Chained descriptors are not supported with this patch.

Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/virtio_net.c | 129 ++
 1 file changed, 129 insertions(+)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 7eea1da04..578e5612e 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -695,6 +695,135 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
return pkt_idx;
 }
 
+static inline uint32_t __attribute__((always_inline))
+vhost_enqueue_burst_packed(struct virtio_net *dev, uint16_t queue_id,
+ struct rte_mbuf **pkts, uint32_t count)
+{
+   struct vhost_virtqueue *vq;
+   struct vring_desc_packed *descs;
+   uint16_t idx;
+   uint16_t mask;
+   uint16_t i;
+
+   vq = dev->virtqueue[queue_id];
+
+   rte_spinlock_lock(&vq->access_lock);
+
+   if (unlikely(vq->enabled == 0)) {
+   i = 0;
+   goto out_access_unlock;
+   }
+
+   if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
+   vhost_user_iotlb_rd_lock(vq);
+
+   descs = vq->desc_packed;
+   mask = vq->size - 1;
+
+   for (i = 0; i < count; i++) {
+   uint32_t desc_avail, desc_offset;
+   uint32_t mbuf_avail, mbuf_offset;
+   uint32_t cpy_len;
+   struct vring_desc_packed *desc;
+   uint64_t desc_addr;
+   struct virtio_net_hdr_mrg_rxbuf *hdr;
+   struct rte_mbuf *m = pkts[i];
+
+   /* XXX: there is an assumption that no desc will be chained */
+   idx = vq->last_used_idx & mask;
+   desc = &descs[idx];
+
+   if (!desc_is_avail(vq, desc))
+   break;
+   rte_smp_rmb();
+
+   desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
+ sizeof(*desc), VHOST_ACCESS_RW);
+   /*
+* Checking of 'desc_addr' placed outside of 'unlikely' macro
+* to avoid performance issue with some versions of gcc (4.8.4
+* and 5.3.0) which otherwise stores offset on the stack instead
+* of in a register.
+*/
+   if (unlikely(desc->len < dev->vhost_hlen) || !desc_addr)
+   break;
+
+   hdr = (struct virtio_net_hdr_mrg_rxbuf *)(uintptr_t)desc_addr;
+   virtio_enqueue_offload(m, &hdr->hdr);
+   vhost_log_write(dev, desc->addr, dev->vhost_hlen);
+   PRINT_PACKET(dev, (uintptr_t)desc_addr, dev->vhost_hlen, 0);
+
+   desc_offset = dev->vhost_hlen;
+   desc_avail  = desc->len - dev->vhost_hlen;
+
+   mbuf_avail  = rte_pktmbuf_data_len(m);
+   mbuf_offset = 0;
+   while (mbuf_avail != 0 || m->next != NULL) {
+   /* done with current mbuf, fetch next */
+   if (mbuf_avail == 0) {
+   m = m->next;
+
+   mbuf_offset = 0;
+   mbuf_avail  = rte_pktmbuf_data_len(m);
+   }
+
+   /* done with current desc buf, fetch next */
+   if (desc_avail == 0) {
+   if ((desc->flags & VRING_DESC_F_NEXT) == 0) {
+   /* Room in vring buffer is not enough */
+   goto out;
+   }
+
+   idx = (idx+1) & (vq->size - 1);
+   desc = &descs[idx];
+   if (unlikely(!desc_is_avail(vq, desc)))
+   goto out ;
+
+   desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
+ sizeof(*desc),
+ VHOST_ACCESS_RW);
+   if (unlikely(!desc_addr))
+   goto out;
+
+   desc_offset = 0;
+   desc_avail  = desc->len;
+   }
+
+   cpy_len = RTE_MIN(desc_avail, mbuf_avail);
+   rte_memcpy((void *)((uintptr_t)(desc_addr + desc_offset)),
+   rte_pktmbuf_mtod_offset(m, void *, mbuf_offset),
+   cpy_len);
+   vhost_log_write(dev, desc->addr + desc_offset, cpy_len);
+   PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offs

[dpdk-dev] [PATCH v3 12/21] vhost: vring address setup for packed queues

2018-04-05 Thread Jens Freimann
From: Yuanhan Liu 

Add code to set up packed queues when enabled.

Signed-off-by: Yuanhan Liu 
Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/vhost.h  |  1 +
 lib/librte_vhost/vhost_user.c | 21 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 3004c26c1..20d78f883 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -74,6 +74,7 @@ struct batch_copy_elem {
  */
 struct vhost_virtqueue {
struct vring_desc   *desc;
+   struct vring_desc_packed   *desc_packed;
struct vring_avail  *avail;
struct vring_used   *used;
uint32_tsize;
diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 157cf2f60..183893e46 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -469,6 +469,23 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
struct vhost_virtqueue *vq = dev->virtqueue[vq_index];
struct vhost_vring_addr *addr = &vq->ring_addrs;
 
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+   vq->desc_packed = (struct vring_desc_packed *) ring_addr_to_vva
+   (dev, vq, addr->desc_user_addr, sizeof(vq->desc_packed));
+   vq->desc = NULL;
+   vq->avail = NULL;
+   vq->used = NULL;
+   vq->log_guest_addr = 0;
+
+   if (vq->last_used_idx != 0) {
+   RTE_LOG(WARNING, VHOST_CONFIG,
+   "last_used_idx (%u) not 0\n",
+   vq->last_used_idx);
+   vq->last_used_idx = 0;
+   }
+   return dev;
+   }
+
/* The addresses are converted from QEMU virtual to Vhost virtual. */
if (vq->desc && vq->avail && vq->used)
return dev;
@@ -481,6 +498,7 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index)
dev->vid);
return dev;
}
+   vq->desc_packed = NULL;
 
dev = numa_realloc(dev, vq_index);
vq = dev->virtqueue[vq_index];
@@ -853,7 +871,8 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 static int
 vq_is_ready(struct vhost_virtqueue *vq)
 {
-   return vq && vq->desc && vq->avail && vq->used &&
+   return vq &&
+  (vq->desc_packed || (vq->desc && vq->avail && vq->used)) &&
   vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
   vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
 }
-- 
2.14.3



[dpdk-dev] [PATCH v3 14/21] vhost: dequeue for packed queues

2018-04-05 Thread Jens Freimann
Implement code to dequeue and process descriptors from
the vring if VIRTIO_F_RING_PACKED is enabled.

Check whether a descriptor was made available by the driver by looking at
the VIRTIO_F_DESC_AVAIL flag in the descriptor. If so, dequeue it and set
the used flag VIRTIO_F_DESC_USED to the current value of the
used wrap counter.

The used ring wrap counter needs to be toggled when the last descriptor
in the ring is written out. This allows the host/guest to detect new
descriptors even after the ring has wrapped.
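
The wrap-counter handling described above can be sketched in isolation as
follows. This is only an illustration, not the code from this patch: every
demo_* / DEMO_* name is hypothetical, the bit positions (7 for AVAIL, 15 for
USED) are assumed from the Virtio 1.1 packed ring layout, and the ring is
reduced to a bare array of flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; bit positions assumed from the Virtio 1.1 draft. */
#define DEMO_DESC_F_AVAIL ((uint16_t)(1u << 7))
#define DEMO_DESC_F_USED  ((uint16_t)(1u << 15))
#define DEMO_RING_SIZE    4u	/* must be a power of two */

struct demo_desc {
	uint16_t flags;
};

struct demo_ring {
	struct demo_desc desc[DEMO_RING_SIZE];
	uint16_t last_used_idx;
	bool used_wrap_counter;
};

static void demo_ring_init(struct demo_ring *r)
{
	memset(r, 0, sizeof(*r));
	r->used_wrap_counter = true;	/* the counter starts at 1 */
}

static void demo_set_flags(struct demo_desc *d, bool avail, bool used)
{
	uint16_t f = d->flags &
		(uint16_t)~(DEMO_DESC_F_AVAIL | DEMO_DESC_F_USED);

	if (avail)
		f |= DEMO_DESC_F_AVAIL;
	if (used)
		f |= DEMO_DESC_F_USED;
	d->flags = f;
}

/* Driver side: AVAIL takes the driver's wrap counter, USED its inverse. */
static void demo_make_avail(struct demo_desc *d, bool driver_wrap)
{
	demo_set_flags(d, driver_wrap, !driver_wrap);
}

/* Device side: available if AVAIL matches our counter and USED does not. */
static bool demo_is_avail(const struct demo_desc *d, bool wrap)
{
	bool avail = (d->flags & DEMO_DESC_F_AVAIL) != 0;
	bool used = (d->flags & DEMO_DESC_F_USED) != 0;

	return avail == wrap && used != wrap;
}

/* Consume one descriptor; returns false if none is available. */
static bool demo_consume(struct demo_ring *r)
{
	struct demo_desc *d =
		&r->desc[r->last_used_idx & (DEMO_RING_SIZE - 1)];

	if (!demo_is_avail(d, r->used_wrap_counter))
		return false;

	/* Mark used: both bits take the value of the used wrap counter. */
	demo_set_flags(d, r->used_wrap_counter, r->used_wrap_counter);

	/* Toggle the counter once the last slot of the ring is written. */
	if ((++r->last_used_idx & (DEMO_RING_SIZE - 1)) == 0)
		r->used_wrap_counter = !r->used_wrap_counter;
	return true;
}
```

Because the meaning of the two bits flips on every pass over the ring, a
descriptor left over from the previous pass is never mistaken for a new one.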

Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/vhost.c  |   1 +
 lib/librte_vhost/vhost.h  |   1 +
 lib/librte_vhost/virtio_net.c | 228 ++
 3 files changed, 230 insertions(+)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 1f17cdd75..eb5a98875 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -185,6 +185,7 @@ init_vring_queue(struct virtio_net *dev, uint32_t vring_idx)
 
vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
+   vq->used_wrap_counter = 1;
 
vhost_user_iotlb_init(dev, vring_idx);
/* Backends are set to -1 indicating an inactive device. */
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index 20d78f883..c8aa946fd 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -112,6 +112,7 @@ struct vhost_virtqueue {
 
struct batch_copy_elem  *batch_copy_elems;
uint16_tbatch_copy_nb_elems;
+   uint16_tused_wrap_counter;
 
rte_rwlock_tiotlb_lock;
rte_rwlock_tiotlb_pending_lock;
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index ed7198dbb..7eea1da04 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -19,6 +19,7 @@
 
 #include "iotlb.h"
 #include "vhost.h"
+#include "virtio-1.1.h"
 
 #define MAX_PKT_BURST 32
 
@@ -1118,6 +1119,233 @@ restore_mbuf(struct rte_mbuf *m)
}
 }
 
+static inline uint16_t
+dequeue_desc_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
+struct rte_mempool *mbuf_pool, struct rte_mbuf *m,
+struct vring_desc_packed *descs)
+{
+   struct vring_desc_packed *desc;
+   uint64_t desc_addr;
+   uint32_t desc_avail, desc_offset;
+   uint32_t mbuf_avail, mbuf_offset;
+   uint32_t cpy_len;
+   struct rte_mbuf *cur = m, *prev = m;
+   struct virtio_net_hdr *hdr = NULL;
+   uint16_t head_idx = vq->last_used_idx & (vq->size - 1);
+   int wrap_counter = vq->used_wrap_counter;
+   int rc = 0;
+
+   rte_spinlock_lock(&vq->access_lock);
+
+   if (unlikely(vq->enabled == 0))
+   goto out;
+
+   if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
+   vhost_user_iotlb_rd_lock(vq);
+
+   desc = &descs[vq->last_used_idx & (vq->size - 1)];
+   if (unlikely((desc->len < dev->vhost_hlen)) ||
+   (desc->flags & VRING_DESC_F_INDIRECT)) {
+   RTE_LOG(ERR, VHOST_DATA,
+   "INDIRECT not supported yet\n");
+   rc = -1;
+   goto out;
+   }
+
+   desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
+ sizeof(*desc), VHOST_ACCESS_RO);
+
+   if (unlikely(!desc_addr)) {
+   rc = -1;
+   goto out;
+   }
+
+   if (virtio_net_with_host_offload(dev)) {
+   hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr);
+   rte_prefetch0(hdr);
+   }
+
+   /*
+* A virtio driver normally uses at least 2 desc buffers
+* for Tx: the first for storing the header, and others
+* for storing the data.
+*/
+   if (likely((desc->len == dev->vhost_hlen) &&
+  (desc->flags & VRING_DESC_F_NEXT) != 0)) {
+   if ((++vq->last_used_idx & (vq->size - 1)) == 0)
+   toggle_wrap_counter(vq);
+
+   desc = &descs[vq->last_used_idx & (vq->size - 1)];
+
+   if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) {
+   RTE_LOG(ERR, VHOST_DATA,
+   "INDIRECT not supported yet\n");
+   rc = -1;
+   goto out;
+   }
+
+   desc_addr = vhost_iova_to_vva(dev, vq, desc->addr,
+ sizeof(*desc), VHOST_ACCESS_RO);
+   if (unlikely(!desc_addr)) {
+   rc = -1;
+   goto out;
+   }
+
+   desc_offset = 0;
+   desc_avail  = desc->len;
+   } else {
+   desc_avail  = desc->len - dev->vhost_hlen;
+   desc_offset = dev->vhost_hlen;
+   }
+
+   rte_prefetch0((void *)(uintptr_t)(desc_addr + desc_offset));
+
+   PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset),

[dpdk-dev] [PATCH v3 19/21] vhost: support mergeable rx buffers with packed queues

2018-04-05 Thread Jens Freimann
This implements support for mergeable receive buffers in vhost when using
packed virtqueues. The difference from split virtqueues is small: it lies
mostly where descriptor flags are touched and virtio features are checked.

Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/vhost.c  |   2 +
 lib/librte_vhost/virtio_net.c | 160 +-
 2 files changed, 127 insertions(+), 35 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index eb5a98875..3c633e71e 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -580,6 +580,8 @@ rte_vhost_enable_guest_notification(int vid, uint16_t 
queue_id, int enable)
 
if (dev == NULL)
return -1;
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+   return -1;
 
if (enable) {
RTE_LOG(ERR, VHOST_CONFIG,
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 18e67fdc1..b82c24081 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -401,17 +401,53 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 }
 
 static __rte_always_inline int
-fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq,
-uint32_t avail_idx, uint32_t *vec_idx,
-struct buf_vector *buf_vec, uint16_t *desc_chain_head,
-uint16_t *desc_chain_len)
+__fill_vec_buf_packed(struct virtio_net *dev, struct vhost_virtqueue *vq,
+struct buf_vector *buf_vec,
+uint32_t *len, uint32_t *vec_id)
+{
+   uint16_t idx = vq->last_avail_idx & (vq->size - 1);
+   struct vring_desc_packed *descs= vq->desc_packed;
+   uint32_t _vec_id = *vec_id;
+
+   if (vq->desc_packed[idx].flags & VRING_DESC_F_INDIRECT) {
+   descs = (struct vring_desc_packed *)(uintptr_t)
+   vhost_iova_to_vva(dev, vq, vq->desc_packed[idx].addr,
+   vq->desc_packed[idx].len,
+   VHOST_ACCESS_RO);
+   if (unlikely(!descs))
+   return -1;
+
+   idx = 0;
+   }
+
+   while (1) {
+   if (unlikely(_vec_id >= BUF_VECTOR_MAX || idx >= vq->size))
+   return -1;
+
+   *len += descs[idx & (vq->size - 1)].len;
+   buf_vec[_vec_id].buf_addr = descs[idx].addr;
+   buf_vec[_vec_id].buf_len  = descs[idx].len;
+   buf_vec[_vec_id].desc_idx = idx;
+   _vec_id++;
+
+   if ((descs[idx & (vq->size - 1)].flags & VRING_DESC_F_NEXT) == 
0)
+   break;
+
+   idx++;
+   }
+   *vec_id = _vec_id;
+
+   return 0;
+}
+
+static __rte_always_inline int
+__fill_vec_buf_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
+struct buf_vector *buf_vec,
+uint32_t *len, uint32_t *vec_id, uint32_t avail_idx)
 {
uint16_t idx = vq->avail->ring[avail_idx & (vq->size - 1)];
-   uint32_t vec_id = *vec_idx;
-   uint32_t len= 0;
struct vring_desc *descs = vq->desc;
-
-   *desc_chain_head = idx;
+   uint32_t _vec_id = *vec_id;
 
if (vq->desc[idx].flags & VRING_DESC_F_INDIRECT) {
descs = (struct vring_desc *)(uintptr_t)
@@ -425,20 +461,53 @@ fill_vec_buf(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
}
 
while (1) {
-   if (unlikely(vec_id >= BUF_VECTOR_MAX || idx >= vq->size))
+   if (unlikely(_vec_id >= BUF_VECTOR_MAX || idx >= vq->size))
return -1;
 
-   len += descs[idx].len;
-   buf_vec[vec_id].buf_addr = descs[idx].addr;
-   buf_vec[vec_id].buf_len  = descs[idx].len;
-   buf_vec[vec_id].desc_idx = idx;
-   vec_id++;
+   *len += descs[idx].len;
+   buf_vec[_vec_id].buf_addr = descs[idx].addr;
+   buf_vec[_vec_id].buf_len  = descs[idx].len;
+   buf_vec[_vec_id].desc_idx = idx;
+   _vec_id++;
 
if ((descs[idx].flags & VRING_DESC_F_NEXT) == 0)
break;
 
idx = descs[idx].next;
}
+   *vec_id = _vec_id;
+
+   return 0;
+}
+
+static __rte_always_inline int
+fill_vec_buf(struct virtio_net *dev, struct vhost_virtqueue *vq,
+uint32_t avail_idx, uint32_t *vec_idx,
+struct buf_vector *buf_vec, uint16_t *desc_chain_head,
+uint16_t *desc_chain_len)
+{
+   uint16_t idx;
+   uint32_t vec_id = *vec_idx;
+   uint32_t len= 0;
+
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+   idx = vq->last_avail_idx & (vq->size -1);
+   } else {
+   idx = vq->avail->ring[av

[dpdk-dev] [PATCH v3 17/21] net/virtio: disable ctrl virtqueue for packed rings

2018-04-05 Thread Jens Freimann
Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ethdev.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index dc220c743..7367d9c5d 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1157,6 +1157,13 @@ virtio_negotiate_features(struct virtio_hw *hw, uint64_t 
req_features)
req_features &= ~(1ull << VIRTIO_F_RING_PACKED);
 #endif
 
+   if (req_features & (1ULL << VIRTIO_F_RING_PACKED)) {
+   req_features &= ~(1ull << VIRTIO_NET_F_CTRL_MAC_ADDR);
+   req_features &= ~(1ull << VIRTIO_NET_F_CTRL_VQ);
+   req_features &= ~(1ull << VIRTIO_NET_F_CTRL_RX);
+   req_features &= ~(1ull << VIRTIO_NET_F_CTRL_VLAN);
+   }
+
/*
 * Negotiate features: Subset of device feature bits are written back
 * guest feature bits.
-- 
2.14.3



[dpdk-dev] [PATCH v3 21/21] vhost: add event suppression for packed queues

2018-04-05 Thread Jens Freimann
Signed-off-by: Jens Freimann 
---
 lib/librte_vhost/vhost.c  | 17 +---
 lib/librte_vhost/vhost.h  | 62 ---
 lib/librte_vhost/vhost_user.c | 19 +
 lib/librte_vhost/virtio_net.c |  1 +
 4 files changed, 86 insertions(+), 13 deletions(-)

diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c
index 3c633e71e..b0fc1c1ac 100644
--- a/lib/librte_vhost/vhost.c
+++ b/lib/librte_vhost/vhost.c
@@ -577,11 +577,21 @@ int
 rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
 {
struct virtio_net *dev = get_device(vid);
+   struct vhost_virtqueue *vq;
 
if (dev == NULL)
return -1;
-   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
-   return -1;
+
+   vq = dev->virtqueue[queue_id];
+   if (!vq->enabled)
+   return 0;
+
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+   if (!enable) {
+   vq->driver_event->desc_event_flags |= 
RING_EVENT_FLAGS_DISABLE;
+   } else
+   vq->driver_event->desc_event_flags |= 
RING_EVENT_FLAGS_ENABLE;
+   }
 
if (enable) {
RTE_LOG(ERR, VHOST_CONFIG,
@@ -589,7 +599,8 @@ rte_vhost_enable_guest_notification(int vid, uint16_t 
queue_id, int enable)
return -1;
}
 
-   dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
+   if (!(dev->features & (1ULL << VIRTIO_F_RING_PACKED)))
+   dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
return 0;
 }
 
diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index a0b61b7b0..77eeed8c9 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -69,14 +69,31 @@ struct batch_copy_elem {
uint64_t log_addr;
 };
 
+#define RING_EVENT_FLAGS_ENABLE 0x0
+#define RING_EVENT_FLAGS_DISABLE 0x1
+#define RING_EVENT_FLAGS_DESC 0x2
+#define RING_EVENT_FLAGS_MASK 0xFFFC
+#define RING_EVENT_WRAP_MASK 0x8000
+#define RING_EVENT_OFF_MASK 0x7FFF
+struct vring_packed_desc_event {
+uint16_t desc_event_off_wrap;
+uint16_t desc_event_flags;
+};
+
 /**
  * Structure contains variables relevant to RX/TX virtqueues.
  */
 struct vhost_virtqueue {
struct vring_desc   *desc;
struct vring_desc_packed   *desc_packed;
-   struct vring_avail  *avail;
-   struct vring_used   *used;
+   union {
+   struct vring_avail  *avail;
+   struct vring_packed_desc_event *driver_event;
+   };
+   union {
+   struct vring_used   *used;
+   struct vring_packed_desc_event *device_event;
+   };
uint32_tsize;
 
uint16_tlast_avail_idx;
@@ -211,7 +228,6 @@ struct vhost_msg {
(1ULL << VIRTIO_NET_F_MTU) | \
(1ULL << VIRTIO_F_IOMMU_PLATFORM))
 
-
 struct guest_page {
uint64_t guest_phys_addr;
uint64_t host_phys_addr;
@@ -427,6 +443,11 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx, 
uint16_t old)
 static __rte_always_inline void
 vhost_vring_call(struct virtio_net *dev, struct vhost_virtqueue *vq)
 {
+   uint16_t off_wrap, wrap = 0;
+   uint16_t event_flags;
+   uint16_t event_idx = 0;
+   int do_kick = 0;
+
/* Flush used->idx update before we read avail->flags. */
rte_mb();
 
@@ -434,22 +455,43 @@ vhost_vring_call(struct virtio_net *dev, struct 
vhost_virtqueue *vq)
if (dev->features & (1ULL << VIRTIO_RING_F_EVENT_IDX)) {
uint16_t old = vq->signalled_used;
uint16_t new = vq->last_used_idx;
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED)) {
+   event_flags = vq->driver_event->desc_event_flags &
+   RING_EVENT_FLAGS_MASK;
+   if (!(event_flags & RING_EVENT_FLAGS_DESC))
+   do_kick = event_flags & RING_EVENT_FLAGS_ENABLE 
? 1 : 0;
+   else {
+   off_wrap = 
vq->driver_event->desc_event_off_wrap;
+   wrap = off_wrap & RING_EVENT_WRAP_MASK;
+   event_idx = off_wrap & RING_EVENT_OFF_MASK;
+   }
+   if (vhost_need_event(event_idx, new, old)
+   && (vq->callfd >= 0) && (wrap == 
vq->used_wrap_counter)) {
+   vq->signalled_used = vq->last_used_idx;
+   do_kick = 1;
+   }
+   } else {
+   event_idx = vhost_used_event(vq);
+   if (vhost_need_event(event_idx, new, old)
+   && (vq->callfd >= 0)) {
+   vq->signalled_used = vq->last_used_idx;

[dpdk-dev] [PATCH v3 16/21] vhost: enable packed virtqueues

2018-04-05 Thread Jens Freimann
From: Yuanhan Liu 

This patch enables the code to enqueue and dequeue packets to/from a
packed virtqueue. Add the feature bit for packed virtqueues as defined in
the Virtio 1.1 draft.
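
As a rough illustration of the feature-bit dispatch this patch introduces, the
sketch below selects an enqueue path from a 64-bit feature word. It is not the
vhost code itself: the demo_* names are hypothetical, and the bit numbers (15
for mergeable Rx buffers, 34 for packed rings) are assumed from the virtio
specification rather than taken from this patch. Bit 34 is the reason the
mask has to be built from a 64-bit constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit numbers assumed from the virtio specification. */
#define DEMO_VIRTIO_NET_F_MRG_RXBUF 15
#define DEMO_VIRTIO_F_RING_PACKED   34

enum demo_rx_path {
	DEMO_RX_SPLIT,
	DEMO_RX_SPLIT_MERGEABLE,
	DEMO_RX_PACKED,
};

/* Test one feature bit; the 64-bit constant keeps bits >= 32 well defined. */
static inline bool demo_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features & (UINT64_C(1) << bit)) != 0;
}

/* Same ordering as the patch: packed first, then mergeable, then plain. */
static enum demo_rx_path demo_select_enqueue_path(uint64_t features)
{
	if (demo_has_feature(features, DEMO_VIRTIO_F_RING_PACKED))
		return DEMO_RX_PACKED;
	if (demo_has_feature(features, DEMO_VIRTIO_NET_F_MRG_RXBUF))
		return DEMO_RX_SPLIT_MERGEABLE;
	return DEMO_RX_SPLIT;
}
```

Checking the packed bit first means a device that negotiated both features
never falls into the split-ring mergeable path.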

Signed-off-by: Jens Freimann 
Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost.h  | 1 +
 lib/librte_vhost/virtio_net.c | 7 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
index c8aa946fd..a0b61b7b0 100644
--- a/lib/librte_vhost/vhost.h
+++ b/lib/librte_vhost/vhost.h
@@ -192,6 +192,7 @@ struct vhost_msg {
(1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) | \
(1ULL << VIRTIO_NET_F_MQ)  | \
(1ULL << VIRTIO_F_VERSION_1)   | \
+   (1ULL << VIRTIO_F_RING_PACKED) | \
(1ULL << VHOST_F_LOG_ALL)  | \
(1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
(1ULL << VIRTIO_NET_F_GSO) | \
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 578e5612e..18e67fdc1 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -840,7 +840,9 @@ rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
return 0;
}
 
-   if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF))
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+   return vhost_enqueue_burst_packed(dev, queue_id, pkts, count);
+   else if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF))
return virtio_dev_merge_rx(dev, queue_id, pkts, count);
else
return virtio_dev_rx(dev, queue_id, pkts, count);
@@ -1513,6 +1515,9 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
if (unlikely(vq->enabled == 0))
goto out_access_unlock;
 
+   if (dev->features & (1ULL << VIRTIO_F_RING_PACKED))
+   return vhost_dequeue_burst_packed(dev, vq, mbuf_pool, pkts, 
count);
+
vq->batch_copy_nb_elems = 0;
 
if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
-- 
2.14.3



[dpdk-dev] [PATCH v3 18/21] net/virtio: add support for mergeable buffers with packed virtqueues

2018-04-05 Thread Jens Freimann
Implement support for receiving merged buffers in virtio when packed virtqueues
are enabled.

Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ethdev.c |  10 ++--
 drivers/net/virtio/virtio_rxtx.c   | 107 +
 drivers/net/virtio/virtqueue.h |   1 +
 3 files changed, 104 insertions(+), 14 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 7367d9c5d..a3c3376d7 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1322,15 +1322,15 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev)
 {
struct virtio_hw *hw = eth_dev->data->dev_private;
 
-   /* workarount for packed vqs which don't support mrg_rxbuf at this 
point */
-   if (vtpci_packed_queue(hw) && vtpci_with_feature(hw, 
VIRTIO_NET_F_MRG_RXBUF)) {
-   eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed;
-   } else if (hw->use_simple_rx) {
+   if (hw->use_simple_rx) {
PMD_INIT_LOG(INFO, "virtio: using simple Rx path on port %u",
eth_dev->data->port_id);
eth_dev->rx_pkt_burst = virtio_recv_pkts_vec;
} else if (vtpci_packed_queue(hw)) {
-   eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed;
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF))
+   eth_dev->rx_pkt_burst = &virtio_recv_mergeable_pkts;
+   else
+   eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed;
} else if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
PMD_INIT_LOG(INFO,
"virtio: using mergeable buffer Rx path on port %u",
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 9220ae661..a48ca6aaa 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -155,8 +155,8 @@ vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
vq->vq_free_cnt = (uint16_t)(vq->vq_free_cnt + dxp->ndescs);
if ((dp->flags & VRING_DESC_F_INDIRECT) == 0) {
while (dp->flags & VRING_DESC_F_NEXT) {
-   desc_idx_last = dp->next;
-   dp = &vq->vq_ring.desc[dp->next];
+   desc_idx_last = desc_idx++;
+   dp = &vq->vq_ring.desc[desc_idx];
}
}
dxp->ndescs = 0;
@@ -177,6 +177,76 @@ vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx)
dp->next = VQ_RING_DESC_CHAIN_END;
 }
 
+static void
+virtio_refill_packed(struct virtqueue *vq, uint16_t used_idx, struct 
virtnet_rx *rxvq)
+{
+   struct vq_desc_extra *dxp;
+   struct vring_desc_packed *descs = vq->vq_ring.desc_packed;
+   struct vring_desc_packed *desc;
+   struct rte_mbuf *nmb;
+
+   nmb = rte_mbuf_raw_alloc(rxvq->mpool);
+   if (unlikely(nmb == NULL)) {
+   struct rte_eth_dev *dev
+   = &rte_eth_devices[rxvq->port_id];
+   dev->data->rx_mbuf_alloc_failed++;
+   return;
+   }
+
+   desc = &descs[used_idx & (vq->vq_nentries - 1)];
+
+   dxp = &vq->vq_descx[used_idx & (vq->vq_nentries - 1)];
+
+   dxp->cookie = nmb;
+   dxp->ndescs = 1;
+
+   desc->addr = VIRTIO_MBUF_ADDR(nmb, vq) +
+   RTE_PKTMBUF_HEADROOM - vq->hw->vtnet_hdr_size;
+   desc->len = nmb->buf_len - RTE_PKTMBUF_HEADROOM +
+   vq->hw->vtnet_hdr_size;
+   desc->flags |= VRING_DESC_F_WRITE;
+
+
+}
+
+static uint16_t
+virtqueue_dequeue_burst_rx_packed(struct virtqueue *vq, struct rte_mbuf 
**rx_pkts,
+  uint32_t *len, uint16_t num, struct virtnet_rx 
*rx_queue)
+{
+   struct rte_mbuf *cookie;
+   uint16_t used_idx;
+   struct vring_desc_packed *desc;
+   uint16_t i;
+
+   for (i = 0; i < num; i++) {
+   used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 
1));
+   desc = &vq->vq_ring.desc_packed[used_idx];
+   if (!desc_is_used(desc))
+   return i;
+   len[i] = desc->len;
+   cookie = (struct rte_mbuf *)vq->vq_descx[used_idx].cookie;
+
+   if (unlikely(cookie == NULL)) {
+   PMD_DRV_LOG(ERR, "vring descriptor with no mbuf cookie 
at %u",
+   vq->vq_used_cons_idx);
+   break;
+   }
+   rte_prefetch0(cookie);
+   rte_packet_prefetch(rte_pktmbuf_mtod(cookie, void *));
+   rx_pkts[i] = cookie;
+
+   virtio_refill_packed(vq, used_idx, rx_queue);
+
+   rte_smp_wmb();
+   if ((vq->vq_used_cons_idx & (vq->vq_nentries - 1)) == 0)
+   toggle_wrap_counter(&vq->vq_ring);
+   set_desc_avail(&vq->vq_ring, desc);
+   vq->vq_used_cons_idx++;
+   }
+
+   return i;
+}
+
 static ui

[dpdk-dev] [PATCH v3 20/21] net/virtio: add support for event suppression

2018-04-05 Thread Jens Freimann
Signed-off-by: Jens Freimann 
---
 drivers/net/virtio/virtio_ethdev.c |  2 +-
 drivers/net/virtio/virtio_ethdev.h |  2 +-
 drivers/net/virtio/virtio_rxtx.c   | 15 +++-
 drivers/net/virtio/virtqueue.h | 73 --
 4 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index a3c3376d7..65a6a9d89 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -727,7 +727,7 @@ virtio_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, 
uint16_t queue_id)
struct virtnet_rx *rxvq = dev->data->rx_queues[queue_id];
struct virtqueue *vq = rxvq->vq;
 
-   virtqueue_enable_intr(vq);
+   virtqueue_enable_intr(vq, 0, 0);
return 0;
 }
 
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index 3aeced4bb..19d3f2617 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -37,7 +37,7 @@
 1u << VIRTIO_RING_F_INDIRECT_DESC |\
 1ULL << VIRTIO_F_VERSION_1   | \
 1ULL << VIRTIO_F_RING_PACKED | \
-1ULL << VIRTIO_F_IOMMU_PLATFORM)
+1ULL << VIRTIO_RING_F_EVENT_IDX)
 
 #define VIRTIO_PMD_SUPPORTED_GUEST_FEATURES\
(VIRTIO_PMD_DEFAULT_GUEST_FEATURES |\
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index a48ca6aaa..ed65434ce 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -127,6 +127,10 @@ virtio_xmit_pkts_packed(void *tx_queue, struct rte_mbuf 
**tx_pkts,
 
rte_smp_wmb();
_set_desc_avail(&desc[head_idx], wrap_counter);
+   if (unlikely(virtqueue_kick_prepare_packed(vq))) {
+   virtqueue_notify(vq);
+   PMD_RX_LOG(DEBUG, "Notified");
+   }
}
 
txvq->stats.packets += i;
@@ -998,6 +1002,10 @@ virtio_recv_pkts_packed(void *rx_queue, struct rte_mbuf 
**rx_pkts,
}
 
rxvq->stats.packets += nb_rx;
+   if (nb_rx > 0 && unlikely(virtqueue_kick_prepare_packed(vq))) {
+   virtqueue_notify(vq);
+   PMD_RX_LOG(DEBUG, "Notified");
+   }
 
vq->vq_used_cons_idx = used_idx;
 
@@ -1276,8 +1284,13 @@ virtio_recv_mergeable_pkts(void *rx_queue,
 
rxvq->stats.packets += nb_rx;
 
-   if (vtpci_packed_queue(vq->hw))
+   if (vtpci_packed_queue(vq->hw)) {
+   if (unlikely(virtqueue_kick_prepare(vq))) {
+   virtqueue_notify(vq);
+   PMD_RX_LOG(DEBUG, "Notified");
+   }
return nb_rx;
+   }
 
/* Allocate new mbuf for the used descriptor */
error = ENOSPC;
diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h
index 7196bd717..6fd3317d2 100644
--- a/drivers/net/virtio/virtqueue.h
+++ b/drivers/net/virtio/virtqueue.h
@@ -176,6 +176,8 @@ struct virtqueue {
uint16_t vq_free_cnt;  /**< num of desc available */
uint16_t vq_avail_idx; /**< sync until needed */
uint16_t vq_free_thresh; /**< free threshold */
+   uint16_t vq_signalled_avail;
+   int vq_signalled_avail_valid;
 
void *vq_ring_virt_mem;  /**< linear address of vring*/
unsigned int vq_ring_size;
@@ -273,16 +275,34 @@ vring_desc_init(struct vring_desc *dp, uint16_t n)
 static inline void
 virtqueue_disable_intr(struct virtqueue *vq)
 {
-   vq->vq_ring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
+   if (vtpci_packed_queue(vq->hw) && vtpci_with_feature(vq->hw,
+   VIRTIO_RING_F_EVENT_IDX))
+   vq->vq_ring.device_event->desc_event_flags = 
RING_EVENT_FLAGS_DISABLE;
+   else
+   vq->vq_ring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
 }
 
 /**
  * Tell the backend to interrupt us.
  */
 static inline void
-virtqueue_enable_intr(struct virtqueue *vq)
+virtqueue_enable_intr(struct virtqueue *vq, uint16_t off, uint16_t 
wrap_counter)
 {
-   vq->vq_ring.avail->flags &= (~VRING_AVAIL_F_NO_INTERRUPT);
+   uint16_t *flags = &vq->vq_ring.device_event->desc_event_flags;
+   uint16_t *event_off_wrap = 
&vq->vq_ring.device_event->desc_event_off_wrap;
+   if (vtpci_packed_queue(vq->hw)) {
+   *flags = 0;
+   *event_off_wrap = 0;
+   if (*event_off_wrap & RING_EVENT_FLAGS_DESC) {
+   *event_off_wrap = off | 0x7FFF;
+   *event_off_wrap |= wrap_counter << 15;
+   *flags |= RING_EVENT_FLAGS_DESC;
+   } else
+   *event_off_wrap = 0;
+   *flags |= RING_EVENT_FLAGS_ENABLE;
+   } else {
+   vq->vq_ring.avail->flags &= (~VRING_AVAIL_F_NO_INTERRUPT);
+   }
 }
 
 /**
@@ -342,12 +362,59 @@ vq_update_avail_ring(struct virtqueue *vq, uint16_t 
desc

Re: [dpdk-dev] [PATCH v4] lib/librte_meter: add meter configuration profile

2018-04-05 Thread Thomas Monjalon
19/02/2018 22:12, Thomas Monjalon:
> 08/01/2018 16:43, Jasvinder Singh:
> > Signed-off-by: Cristian Dumitrescu 
> > Signed-off-by: Jasvinder Singh 
> 
> Applied for 18.05 (was postponed to preserve 18.02 ABI), thanks.

We forgot to update the release notes about the API change.
Please, could you send a patch to add it in the appropriate section?
Thanks




Re: [dpdk-dev] [PATCH V19 2/4] eal: add device event monitor framework

2018-04-05 Thread Tan, Jianfeng



On 4/5/2018 4:32 PM, Jeff Guo wrote:

This patch adds a general device event monitor framework at the
EAL device layer, for device hotplug awareness and the actions taken
in response. It could also be extended to other types of device event
monitoring, but that is out of scope at this stage.

To get started, users first call the newly added APIs below to
enable/disable the device event monitor mechanism:
   - rte_dev_event_monitor_start
   - rte_dev_event_monitor_stop

Users shall then register or unregister callbacks through the newly added
APIs below. Callbacks can be device-specific or apply to all devices.
   - rte_dev_event_callback_register
   - rte_dev_event_callback_unregister

Taking hotplug as an example: when a device is inserted or removed,
we get notified by the kernel and then call the user's callbacks
to handle it, for example by attaching the device to or detaching it from
the bus. This could further benefit fail-safe or live migration.
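
The register/dispatch pattern described above can be sketched without any EAL
dependency. This is a standalone illustration only: all demo_* names are
hypothetical and do not reflect the signatures of the rte_dev_event_* APIs.
It assumes a NULL device name means "all devices" and that the caller keeps
the name string alive while the callback is registered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event types and callback signature, for illustration only. */
enum demo_dev_event { DEMO_DEV_EVENT_ADD, DEMO_DEV_EVENT_REMOVE };

typedef void (*demo_dev_event_cb)(const char *dev_name,
				  enum demo_dev_event event, void *arg);

#define DEMO_MAX_CALLBACKS 8

struct demo_cb_entry {
	const char *dev_name;	/* NULL means "all devices" */
	demo_dev_event_cb fn;	/* NULL marks a free slot */
	void *arg;
};

static struct demo_cb_entry demo_cbs[DEMO_MAX_CALLBACKS];

static int demo_same_dev(const char *a, const char *b)
{
	if (a == NULL || b == NULL)
		return a == b;
	return strcmp(a, b) == 0;
}

int demo_callback_register(const char *dev_name, demo_dev_event_cb fn,
			   void *arg)
{
	size_t i;

	if (fn == NULL)
		return -1;
	for (i = 0; i < DEMO_MAX_CALLBACKS; i++) {
		if (demo_cbs[i].fn == NULL) {
			demo_cbs[i].dev_name = dev_name;
			demo_cbs[i].fn = fn;
			demo_cbs[i].arg = arg;
			return 0;
		}
	}
	return -1;	/* table full */
}

int demo_callback_unregister(const char *dev_name, demo_dev_event_cb fn,
			     void *arg)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_CALLBACKS; i++) {
		struct demo_cb_entry *e = &demo_cbs[i];

		if (e->fn != NULL && e->fn == fn && e->arg == arg &&
		    demo_same_dev(e->dev_name, dev_name)) {
			memset(e, 0, sizeof(*e));
			return 0;
		}
	}
	return -1;	/* no matching entry */
}

/* Called by the monitor when an event arrives; returns callbacks invoked. */
int demo_dispatch(const char *dev_name, enum demo_dev_event event)
{
	int called = 0;
	size_t i;

	for (i = 0; i < DEMO_MAX_CALLBACKS; i++) {
		struct demo_cb_entry *e = &demo_cbs[i];

		if (e->fn == NULL)
			continue;
		if (e->dev_name != NULL && strcmp(e->dev_name, dev_name) != 0)
			continue;
		e->fn(dev_name, event, e->arg);
		called++;
	}
	return called;
}

/* Example callback: counts events through the user-supplied argument. */
void demo_count_cb(const char *dev_name, enum demo_dev_event event, void *arg)
{
	(void)dev_name;
	(void)event;
	++*(int *)arg;
}
```

A real implementation would also need locking around the callback list, since
events arrive from the monitor thread while applications register callbacks.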

Signed-off-by: Jeff Guo 


Except for some trivial things, I'm OK with this patch, so

Reviewed-by: Jianfeng Tan 


---
v19->v18:
clear the coding style and fix typo
---
  doc/guides/rel_notes/release_18_05.rst  |   9 ++
  lib/librte_eal/bsdapp/eal/Makefile  |   1 +
  lib/librte_eal/bsdapp/eal/eal_dev.c |  21 +
  lib/librte_eal/bsdapp/eal/meson.build   |   1 +
  lib/librte_eal/common/eal_common_dev.c  | 161 
  lib/librte_eal/common/eal_private.h |  15 +++
  lib/librte_eal/common/include/rte_dev.h |  94 +++
  lib/librte_eal/linuxapp/eal/Makefile|   1 +
  lib/librte_eal/linuxapp/eal/eal_dev.c   |  22 +
  lib/librte_eal/linuxapp/eal/meson.build |   1 +
  lib/librte_eal/rte_eal_version.map  |  10 ++
  11 files changed, 336 insertions(+)
  create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
  create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c

diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index e5fac1c..d3c86bd 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -58,6 +58,15 @@ New Features
* Added support for NVGRE, VXLAN and GENEVE filters in flow API.
* Added support for DROP action in flow API.
  
+* **Added device event monitor framework.**

+
+  Added a general device event monitor framework at EAL, for device dynamic 
management.
+  Such as device hotplug awareness and actions adopted accordingly. The list 
of new APIs:
+
+  * ``rte_dev_event_monitor_start`` and ``rte_dev_event_monitor_stop`` are for
+the event monitor enable and disable.
+  * ``rte_dev_event_callback_register`` and 
``rte_dev_event_callback_unregister``
+are for the user's callbacks register and unregister.
  
  API Changes

  ---
diff --git a/lib/librte_eal/bsdapp/eal/Makefile 
b/lib/librte_eal/bsdapp/eal/Makefile
index ed1d17b..90b88eb 100644
--- a/lib/librte_eal/bsdapp/eal/Makefile
+++ b/lib/librte_eal/bsdapp/eal/Makefile
@@ -33,6 +33,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_lcore.c
  SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_timer.c
  SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_interrupts.c
  SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_alarm.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_dev.c
  
  # from common dir

  SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_common_lcore.c
diff --git a/lib/librte_eal/bsdapp/eal/eal_dev.c 
b/lib/librte_eal/bsdapp/eal/eal_dev.c
new file mode 100644
index 000..1c6c51b
--- /dev/null
+++ b/lib/librte_eal/bsdapp/eal/eal_dev.c
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+int __rte_experimental
+rte_dev_event_monitor_start(void)
+{
+   RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n");
+   return -1;
+}
+
+int __rte_experimental
+rte_dev_event_monitor_stop(void)
+{
+   RTE_LOG(ERR, EAL, "Device event is not supported for FreeBSD\n");
+   return -1;
+}
diff --git a/lib/librte_eal/bsdapp/eal/meson.build 
b/lib/librte_eal/bsdapp/eal/meson.build
index e83fc91..6dfc533 100644
--- a/lib/librte_eal/bsdapp/eal/meson.build
+++ b/lib/librte_eal/bsdapp/eal/meson.build
@@ -12,4 +12,5 @@ env_sources = files('eal_alarm.c',
'eal_timer.c',
'eal.c',
'eal_memory.c',
+   'eal_dev.c'
  )
diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index cd07144..e202cf2 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -14,9 +14,34 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  
  #include "eal_private.h"
  
+/**

+ * The device event callback description.
+ *
+ * It contains callback address to be registered by user application,
+ * the pointer to the parameters for callback, and the device name.
+ */
+struct dev_event_callback {
+   TAILQ_ENTRY(dev_event_callback) next; /**< Callbacks list */
+   rt

Re: [dpdk-dev] [PATCH v3 1/4] ethdev: add support for PMD-tuned Tx/Rx parameters

2018-04-05 Thread Thomas Monjalon
04/04/2018 20:56, De Lara Guarch, Pablo:
> 
> API and ABI changes should be documented in release notes.

When sending a v4 for the API change, you can add my ack:

Acked-by: Thomas Monjalon 




Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions

2018-04-05 Thread Thomas Monjalon
05/04/2018 11:03, Wang, Xiao W:

> > +int rte_vfio_get_group_num(const char *sysfs_base, const char *dev_addr,
> > +  int *iommu_group_num);
> > +int rte_vfio_get_container_fd(void);
> > +int rte_vfio_get_group_fd(int iommu_group_num);
> 
> Considering the "group_no" field defined in eal_vfio.h, will
> "iommu_group_num" cause an inconsistency in naming?

I asked to change the function name to "num" because it is more meaningful.
"group_no" field is private? Can it be renamed?




Re: [dpdk-dev] [PATCH] vhost: fix meson build issues

2018-04-05 Thread Bruce Richardson
On Thu, Apr 05, 2018 at 10:08:16AM +0200, Tomasz Duszynski wrote:
> This patch addresses following meson build issues:
> 
> 1) Since rte_vdpa.h includes rte_pci.h it introduces pci
>dependency thus deps array should be updated accordingly.
> 
> 2) Since vhost.h includes rte_vdpa.h vdpa.c should be added to
>the sources list. Otherwise we end up with linker errors
>caused by undefined references.
> 
> Fixes: 34b30b2e7e42 ("vhost: add apis for datapath configuration")
> Cc: zhihong.w...@intel.com
> 
> Signed-off-by: Tomasz Duszynski 
> ---

Confirmed that meson build on dpdk-next-virtio tree is broken and that this
fixes it.

Acked-by: Bruce Richardson 


Re: [dpdk-dev] [PATCH v2] pdump: change to use generic multi-process channel

2018-04-05 Thread Pattan, Reshma
Hi,

> -Original Message-
> From: Tan, Jianfeng
> Sent: Wednesday, April 4, 2018 4:08 PM
> To: dev@dpdk.org
> Cc: Tan, Jianfeng ; Pattan, Reshma
> 
> Subject: [PATCH v2] pdump: change to use generic multi-process channel
> 
> The original code relies on a private channel for primary and secondary
> communication. Change it to use the generic multi-process channel.
> 
> Note that with this change, dpdk-pdump will not be compatible with DPDK
> applications built from older versions.
> 
> Cc: reshma.pat...@intel.com
> 
> Signed-off-by: Jianfeng Tan 
> ---
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index 0c696f7..d55fd05 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -153,3 +153,7 @@ Deprecation Notices
>be added between the producer and consumer structures. The size of the
>structure and the offset of the fields will remain the same on
>platforms with 64B cache line, but will change on other platforms.
> +
> +* pdump: As we changed to use generic IPC, ``rte_pdump_set_socket_dir``
> +will be
> +  deprecated and removed in subsequent release; the parameter, path, of
> +  ``rte_pdump_init`` will also be removed.

Do we need to mention the deprecation of the enum below too?
enum rte_pdump_socktype {
RTE_PDUMP_SOCKET_SERVER = 1,
RTE_PDUMP_SOCKET_CLIENT = 2
};

Thanks,
Reshma


Re: [dpdk-dev] [PATCH v2] net/bonding: clear dev_started if start fails

2018-04-05 Thread Radu Nicolau



On 3/23/2018 5:05 PM, Chas Williams wrote:

From: "Charles (Chas) Williams" 

There are several error paths where the bonding device may not start.
Clear dev_started before we return if we take one of these paths.

Fixes: 2efb58cbab ("bond: new link bonding library")
Cc: sta...@dpdk.org

Signed-off-by: Chas Williams 
---

Acked-by: Radu Nicolau  



Re: [dpdk-dev] [pull-request] next-pipeline 18.05 PRE-RC1

2018-04-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, April 5, 2018 1:03 AM
> To: Dumitrescu, Cristian 
> Cc: dev@dpdk.org; Singh, Jasvinder 
> Subject: Re: [dpdk-dev] [pull-request] next-pipeline 18.05 PRE-RC1
> 
> > > Hi,
> > >
> > > 30/03/2018 14:45, Cristian Dumitrescu:
> > > >   http://dpdk.org/git/next/dpdk-next-pipeline
> > >
> > > I saw 2 issues:
> > >   - table_index is wrongly placed in doxygen index
> > >   - compilation error with clang: status_data is used uninitialized
> > >
> >
> > Hi Thomas,
> >
> > All the above issues are fixed now, are you OK to resume the pull request
> now?
> 
> Sorry, one more compilation error:
> 
> error: incompatible pointer types
>   passing 'uint8_t *' (aka 'unsigned char *') to parameter of type 'uint32_t *'
>   (aka 'unsigned int *') [-Werror,-Wincompatible-pointer-types]
> if (parser_read_uint32(&p.tc_ov_weight, tokens[10]) != 0) {
>^~~
> 
> 

Fixed, sorry about this.
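
For readers hitting the same class of error: the thread does not show how the fix was actually made, so the following is only a sketch of one common approach, which is to parse into a 32-bit temporary and narrow to the 8-bit field after a range check. The names read_u32() and read_u8() are hypothetical; read_u32() is a stand-in written for this example and is not the parser_read_uint32() implementation.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 32-bit token parser; not the DPDK code. */
static int
read_u32(uint32_t *value, const char *str)
{
	char *end;
	unsigned long long v;

	/* Require a leading digit so signs and whitespace are rejected. */
	if (value == NULL || str == NULL || !isdigit((unsigned char)*str))
		return -EINVAL;

	errno = 0;
	v = strtoull(str, &end, 0);
	if (errno != 0 || *end != '\0' || v > UINT32_MAX)
		return -EINVAL;

	*value = (uint32_t)v;
	return 0;
}

/*
 * Parse into a correctly typed temporary, then narrow only after
 * checking that the value fits in the 8-bit destination.
 */
static int
read_u8(uint8_t *value, const char *str)
{
	uint32_t tmp;
	int ret;

	ret = read_u32(&tmp, str);
	if (ret != 0)
		return ret;
	if (tmp > UINT8_MAX)
		return -ERANGE;

	*value = (uint8_t)tmp;
	return 0;
}
```

With helpers like these, the failing call would pass the address of a uint32_t temporary (or call the 8-bit wrapper) instead of casting the uint8_t pointer, which would silence the warning but let the parser write four bytes into a one-byte field.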


Re: [dpdk-dev] [PATCH v4] lib/librte_meter: add meter configuration profile

2018-04-05 Thread Dumitrescu, Cristian


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, April 5, 2018 11:12 AM
> To: Singh, Jasvinder ; Dumitrescu, Cristian
> 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4] lib/librte_meter: add meter
> configuration profile
> 
> 19/02/2018 22:12, Thomas Monjalon:
> > 08/01/2018 16:43, Jasvinder Singh:
> > > Signed-off-by: Cristian Dumitrescu 
> > > Signed-off-by: Jasvinder Singh 
> >
> > Applied for 18.05 (was postponed to preserve 18.02 ABI), thanks.
> 
> We forgot to update the release notes about the API change.
> Please, could you send a patch to add it in the appropriate section?
> Thanks
> 

Will send a quick patch later today, thanks!



Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio functions

2018-04-05 Thread Wang, Xiao W
Yes, it's private. We could do that if really needed.

BRs,
Xiao
> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Thursday, April 5, 2018 6:23 PM
> To: Wang, Xiao W 
> Cc: Hemant Agrawal ; dev@dpdk.org; Burakov,
> Anatoly 
> Subject: Re: [dpdk-dev] [PATCH v5 2/2] eal/vfio: export internal vfio 
> functions
> 
> 05/04/2018 11:03, Wang, Xiao W:
> 
> > > +int rte_vfio_get_group_num(const char *sysfs_base, const char
> *dev_addr,
> > > +int *iommu_group_num);
> > > +int rte_vfio_get_container_fd(void);
> > > +int rte_vfio_get_group_fd(int iommu_group_num);
> >
> > Considering the "group_no" field defined in eal_vfio.h, will
> "iommu_group_num" cause inconsistency
> > In naming?
> 
> I asked to change the function name to "num" because it is more meaningful.
> "group_no" field is private? Can it be renamed?
> 



Re: [dpdk-dev] [PATCH V19 3/4] eal/linux: uevent parse and process

2018-04-05 Thread Tan, Jianfeng



On 4/5/2018 5:02 PM, Jeff Guo wrote:

To handle uevents detected from the kernel side, add uevent parse and
process functions that translate a uevent into a device event, which the
user has subscribed to monitor.

Signed-off-by: Jeff Guo 
---
v18->v19:
fix some misunderstood parts
---
  lib/librte_eal/linuxapp/eal/eal_dev.c | 196 +-
  1 file changed, 194 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_dev.c 
b/lib/librte_eal/linuxapp/eal/eal_dev.c
index 9c8d1a0..4686c41 100644
--- a/lib/librte_eal/linuxapp/eal/eal_dev.c
+++ b/lib/librte_eal/linuxapp/eal/eal_dev.c
@@ -2,21 +2,213 @@
   * Copyright(c) 2018 Intel Corporation
   */
  
+#include 

+#include 
+#include 
+#include 
+
  #include 
  #include 
  #include 
+#include 
+#include 
+
+#include "eal_private.h"
+
+static struct rte_intr_handle intr_handle = {.fd = -1 };
+static bool monitor_started;
+
+#define EAL_UEV_MSG_LEN 4096
+#define EAL_UEV_MSG_ELEM_LEN 128
+
+/* identify the system layer which reports this event. */
+enum eal_dev_event_subsystem {
+   EAL_DEV_EVENT_SUBSYSTEM_PCI, /* PCI bus device event */
+   EAL_DEV_EVENT_SUBSYSTEM_UIO, /* UIO driver device event */
+   EAL_DEV_EVENT_SUBSYSTEM_VFIO, /* VFIO driver device event */
+   EAL_DEV_EVENT_SUBSYSTEM_MAX
+};
+
+static int
+dev_uev_socket_fd_create(void)
+{
+   struct sockaddr_nl addr;
+   int ret;
+
+   intr_handle.fd = socket(PF_NETLINK, SOCK_RAW | SOCK_CLOEXEC |
+   SOCK_NONBLOCK,
+   NETLINK_KOBJECT_UEVENT);
+   if (intr_handle.fd < 0) {
+   RTE_LOG(ERR, EAL, "create uevent fd failed.\n");
+   return -1;
+   }
+
+   memset(&addr, 0, sizeof(addr));
+   addr.nl_family = AF_NETLINK;
+   addr.nl_pid = 0;
+   addr.nl_groups = 0x;
+
+   ret = bind(intr_handle.fd, (struct sockaddr *) &addr, sizeof(addr));
+   if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Failed to bind uevent socket.\n");
+   goto err;
+   }
+
+   return 0;
+err:
+   close(intr_handle.fd);
+   intr_handle.fd = -1;
+   return ret;
+}
+
+static int
+dev_uev_parse(const char *buf, struct rte_dev_event *event, int length)
+{
+   char action[EAL_UEV_MSG_ELEM_LEN];
+   char subsystem[EAL_UEV_MSG_ELEM_LEN];
+   char pci_slot_name[EAL_UEV_MSG_ELEM_LEN];
+   int i = 0, ret = 0;
+
+   memset(action, 0, EAL_UEV_MSG_ELEM_LEN);
+   memset(subsystem, 0, EAL_UEV_MSG_ELEM_LEN);
+   memset(pci_slot_name, 0, EAL_UEV_MSG_ELEM_LEN);
+
+   while (i < length) {
+   for (; i < length; i++) {
+   if (*buf)
+   break;
+   buf++;
+   }
+   /**
+* check device uevent from kernel side, no need to check
+* uevent from udev.
+*/
+   if (!strncmp(buf, "libudev", 7)) {
+   buf += 7;
+   i += 7;
+   return -1;
+   }
+   if (!strncmp(buf, "ACTION=", 7)) {
+   buf += 7;
+   i += 7;
+   snprintf(action, sizeof(action), "%s", buf);
+   } else if (!strncmp(buf, "SUBSYSTEM=", 10)) {
+   buf += 10;
+   i += 10;
+   snprintf(subsystem, sizeof(subsystem), "%s", buf);
+   } else if (!strncmp(buf, "PCI_SLOT_NAME=", 14)) {
+   buf += 14;
+   i += 14;
+   snprintf(pci_slot_name, sizeof(subsystem), "%s", buf);
+   event->devname = strdup(pci_slot_name);
+   }
+   for (; i < length; i++) {
+   if (*buf == '\0')
+   break;
+   buf++;
+   }
+   }
+
+   /* parse the subsystem layer */
+   if (!strncmp(subsystem, "uio", 3))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_UIO;
+   else if (!strncmp(subsystem, "pci", 3))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_PCI;
+   else if (!strncmp(subsystem, "vfio", 4))
+   event->subsystem = EAL_DEV_EVENT_SUBSYSTEM_VFIO;
+   else
+   ret = -1;


We can just return -1 here.

  
+	/* parse the action type */

+   if (!strncmp(action, "add", 3))
+   event->type = RTE_DEV_EVENT_ADD;
+   else if (!strncmp(action, "remove", 6))
+   event->type = RTE_DEV_EVENT_REMOVE;
+   else
+   ret = -1;


We can just return -1 here.


+   return ret;


return 0 here.


+}
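
Putting the three comments above together, a rough sketch of the early-return shape could look like the code below. It is self-contained for illustration only: struct uev_event, the UEV_* enums and uev_parse() are hypothetical stand-ins for the EAL types in the patch, and the sketch matches subsystem and action strings exactly with strcmp() rather than by prefix, which is a deliberate difference from the patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define UEV_ELEM_LEN 128

/* Hypothetical stand-ins for the EAL enums and event struct. */
enum uev_subsystem { UEV_SUBSYSTEM_PCI, UEV_SUBSYSTEM_UIO, UEV_SUBSYSTEM_VFIO };
enum uev_type { UEV_ADD, UEV_REMOVE };

struct uev_event {
	enum uev_type type;
	enum uev_subsystem subsystem;
	char devname[UEV_ELEM_LEN];
};

/* Parse a buffer of NUL-separated "KEY=VALUE" fields; 0 on success. */
static int
uev_parse(const char *buf, size_t length, struct uev_event *event)
{
	const char *action = NULL;
	const char *subsystem = NULL;
	const char *slot = "";
	enum uev_subsystem subsys;
	enum uev_type type;
	size_t i = 0;

	while (i < length) {
		const char *field = buf + i;
		const char *end = memchr(field, '\0', length - i);

		/* An unterminated trailing field cannot be compared safely. */
		if (end == NULL)
			return -1;
		/* Only kernel uevents are of interest, not udev's copies. */
		if (strncmp(field, "libudev", 7) == 0)
			return -1;

		if (strncmp(field, "ACTION=", 7) == 0)
			action = field + 7;
		else if (strncmp(field, "SUBSYSTEM=", 10) == 0)
			subsystem = field + 10;
		else if (strncmp(field, "PCI_SLOT_NAME=", 14) == 0)
			slot = field + 14;

		i += (size_t)(end - field) + 1;
	}

	if (action == NULL || subsystem == NULL)
		return -1;

	/* Return as soon as a field is unknown; no 'ret' to carry around. */
	if (strcmp(subsystem, "uio") == 0)
		subsys = UEV_SUBSYSTEM_UIO;
	else if (strcmp(subsystem, "pci") == 0)
		subsys = UEV_SUBSYSTEM_PCI;
	else if (strcmp(subsystem, "vfio") == 0)
		subsys = UEV_SUBSYSTEM_VFIO;
	else
		return -1;

	if (strcmp(action, "add") == 0)
		type = UEV_ADD;
	else if (strcmp(action, "remove") == 0)
		type = UEV_REMOVE;
	else
		return -1;

	/* Write the output only once everything has been validated. */
	event->subsystem = subsys;
	event->type = type;
	snprintf(event->devname, sizeof(event->devname), "%s", slot);
	return 0;
}
```

Copying the slot name into a fixed-size member also avoids the strdup() in the parse loop, so there is nothing to free on the error paths; whether that fits the real rte_dev_event layout is an assumption.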
+
+static void
+dev_uev_handler(__rte_unused void *param)
+{
+   struct rte_dev_event uevent;
+   int ret;
+   char buf[EAL_UEV_MSG_LEN];
+
+   memset(&uevent, 0, sizeof(struct rte_dev_event));
+   me

Re: [dpdk-dev] [PATCH] net/bonding: fix setting VLAN ID on slave ports

2018-04-05 Thread Radu Nicolau


On 4/3/2018 5:01 PM, Chas Williams wrote:

From: Chas Williams

The pos returned is just the offset of the slab.  You need to use this
to offset the bits in the slab.

Fixes: c771e4ef38 ("net/bonding: enable slave VLAN filter")
Cc:sta...@dpdk.org

Signed-off-by: Chas Williams
---
  
Acked-by: Radu Nicolau  



Re: [dpdk-dev] [PATCH v2] pdump: change to use generic multi-process channel

2018-04-05 Thread Tan, Jianfeng



On 4/5/2018 6:37 PM, Pattan, Reshma wrote:

Hi,


-Original Message-
From: Tan, Jianfeng
Sent: Wednesday, April 4, 2018 4:08 PM
To: dev@dpdk.org
Cc: Tan, Jianfeng ; Pattan, Reshma

Subject: [PATCH v2] pdump: change to use generic multi-process channel

The original code relies on a private channel for primary and secondary
communication. Change to use the generic multi-process channel.

Note that with this change, dpdk-pdump will not be compatible with
applications built against older versions of DPDK.

Cc: reshma.pat...@intel.com

Signed-off-by: Jianfeng Tan 
---
diff --git a/doc/guides/rel_notes/deprecation.rst
b/doc/guides/rel_notes/deprecation.rst
index 0c696f7..d55fd05 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -153,3 +153,7 @@ Deprecation Notices
be added between the producer and consumer structures. The size of the
structure and the offset of the fields will remain the same on
platforms with 64B cache line, but will change on other platforms.
+
+* pdump: As we changed to use generic IPC, ``rte_pdump_set_socket_dir`` will be
+  deprecated and removed in subsequent release; the parameter, path, of
+  ``rte_pdump_init`` will also be removed.

Do we need to mention the deprecation of the enum below too?
enum rte_pdump_socktype {
 RTE_PDUMP_SOCKET_SERVER = 1,
 RTE_PDUMP_SOCKET_CLIENT = 2
};


Nice catch, will add in next version.

Thanks,
Jianfeng



Thanks,
Reshma




[dpdk-dev] [PATCH v2 2/6] lib/cryptodev: add asym op support in cryptodev

2018-04-05 Thread Shally Verma
Extend DPDK librte_cryptodev to:
- define asym op type in rte_crypto_op_type and associated
  op pool create/alloc APIs
- define asym session and associated session APIs

If a PMD indicates in its feature flags that it supports both sym and
asym, then it must support both on all its qps.

If a PMD supports both but internally has hw with a dedicated qp for
each service, then it *can* split itself into *symmetric-only and
asymmetric-only PMD instances*.

List of TBDs:
- change PMD ops session_get_size, session_configure, session_clear
  to sym_session_* APIs
- change external get_session_private_size to sym_get_session_*
- per-service stats update

Open for consideration:
- sessionless asymmetric ops.
  The current proposal only defines session-based operations.

Changes from PATCHv1
- resolve new line error in librte_cryptodev/rte_cryptodev_version.map

Signed-off-by: Shally Verma 
Signed-off-by: Sunila Sahu 
Signed-off-by: Ashish Gupta 

---
User must apply patch
"lib/cryptodev: add asymmetric algos in cryptodev" before compilation
---
 lib/librte_cryptodev/rte_crypto.h  |  37 +++-
 lib/librte_cryptodev/rte_cryptodev.c   | 122 -
 lib/librte_cryptodev/rte_cryptodev.h   |  81 +++-
 lib/librte_cryptodev/rte_cryptodev_pmd.h   |  58 +++-
 lib/librte_cryptodev/rte_cryptodev_version.map |  12 +++
 5 files changed, 305 insertions(+), 5 deletions(-)

diff --git a/lib/librte_cryptodev/rte_crypto.h 
b/lib/librte_cryptodev/rte_crypto.h
index 95cf861..0fdec1b 100644
--- a/lib/librte_cryptodev/rte_crypto.h
+++ b/lib/librte_cryptodev/rte_crypto.h
@@ -23,6 +23,7 @@
 #include 
 
 #include "rte_crypto_sym.h"
+#include "rte_crypto_asym.h"
 
 /** Crypto operation types */
 enum rte_crypto_op_type {
@@ -30,6 +31,8 @@ enum rte_crypto_op_type {
/**< Undefined operation type */
RTE_CRYPTO_OP_TYPE_SYMMETRIC,
/**< Symmetric operation */
+   RTE_CRYPTO_OP_TYPE_ASYMMETRIC
+   /**< Asymmetric operation */
 };
 
 /** Status of crypto operation */
@@ -97,6 +100,10 @@ struct rte_crypto_op {
union {
struct rte_crypto_sym_op sym[0];
/**< Symmetric operation parameters */
+
+   struct rte_crypto_asym_op asym[0];
+   /**< Asymmetric operation parameters */
+
}; /**< operation specific parameters */
 };
 
@@ -117,6 +124,9 @@ struct rte_crypto_op {
case RTE_CRYPTO_OP_TYPE_SYMMETRIC:
__rte_crypto_sym_op_reset(op->sym);
break;
+   case RTE_CRYPTO_OP_TYPE_ASYMMETRIC:
+   __rte_crypto_asym_op_reset(op->asym);
+   break;
case RTE_CRYPTO_OP_TYPE_UNDEFINED:
default:
break;
@@ -283,9 +293,14 @@ struct rte_crypto_op_pool_private {
if (likely(op->mempool != NULL)) {
priv_size = __rte_crypto_op_get_priv_data_size(op->mempool);
 
-   if (likely(priv_size >= size))
-   return (void *)((uint8_t *)(op + 1) +
+   if (likely(priv_size >= size)) {
+   if (op->type == RTE_CRYPTO_OP_TYPE_SYMMETRIC)
+   return (void *)((uint8_t *)(op + 1) +
sizeof(struct rte_crypto_sym_op));
+   if (op->type == RTE_CRYPTO_OP_TYPE_ASYMMETRIC)
+   return (void *)((uint8_t *)(op+1) +
+   sizeof(struct rte_crypto_asym_op));
+   }
}
 
return NULL;
@@ -388,6 +403,24 @@ struct rte_crypto_op_pool_private {
return __rte_crypto_sym_op_attach_sym_session(op->sym, sess);
 }
 
+/**
+ * Attach an asymmetric session to a crypto operation
+ *
+ * @param  op    crypto operation, must be of type asymmetric
+ * @param  sess  cryptodev session
+ */
+static inline int
+rte_crypto_op_attach_asym_session(struct rte_crypto_op *op,
+   struct rte_cryptodev_asym_session *sess)
+{
+   if (unlikely(op->type != RTE_CRYPTO_OP_TYPE_ASYMMETRIC))
+   return -1;
+
+   op->sess_type = RTE_CRYPTO_OP_WITH_SESSION;
+
+   return __rte_crypto_op_attach_asym_session(op->asym, sess);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index 8745b6b..cca8d4c 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -1088,13 +1088,62 @@ struct rte_cryptodev *
return 0;
 }
 
+int __rte_experimental
+rte_cryptodev_asym_session_init(uint8_t dev_id,
+   struct rte_cryptodev_asym_session *sess,
+   struct rte_crypto_asym_xform *xforms,
+   struct rte_mempool *mp)
+{
+   struct rte_cryptodev *dev;
+   uint8_t index;
+   int ret;
+
+   dev = rte_cryptodev_pmd_get_dev(dev_id);
+
+   if (sess == NULL || xforms == NULL || dev == NULL)
+   return -EINVAL;
+
+   index 

[dpdk-dev] [PATCH v2 0/6] crypto: add asym crypto support

2018-04-05 Thread Shally Verma
This patch series adds support for asymmetric crypto to the DPDK
librte_cryptodev framework, along with unit test, PMD and
documentation updates, and addresses the patch apply failures
raised on the asym crypto v1 patch series:
https://dpdk.org/dev/patchwork/patch/36575/
https://dpdk.org/dev/patchwork/patch/36576/
https://dpdk.org/dev/patchwork/patch/36577/

And, unit test and PMD patch series:
https://dpdk.org/dev/patchwork/patch/36928/
https://dpdk.org/dev/patchwork/patch/36929/
https://dpdk.org/dev/patchwork/patch/36930/

Key changes from PATCH v1:
- resolve git apply patch error on patch id 36575
- resolve git apply patch error on patch id 36929

Key changes from RFCv1 include:
- removal of dedicated sym and asym qp setup,
- removal of asym qp count and attach/detach_session apis
- re-org of xform params for Diffie-Hellman to allow
  public key and optional private key generation
- move elliptic curve changes into another separate patch/patch series

This patch series is divided into the following categories:
1. library patches with asymmetric API, xform and capability
   definitions
2. Unit test case addition
3. Openssl PMD changes with asymmetric crypto support
4. Programmer Guide updates with asymmetric description

TBD:
- add elliptic curve support
- rename of existing session_configure/clear APIs to
  sym_session_configure/clear/init APIs

It is based on review discussion on RFC v1 asym crypto patch
http://dpdk.org/patch/34308.

RFC v1 patch http://dpdk.org/patch/34308 is further a derivative of
earlier reviewed  RFC v2 patch series:
 http://dpdk.org/dev/patchwork/patch/24245/
 http://dpdk.org/dev/patchwork/patch/24246/
 http://dpdk.org/dev/patchwork/patch/24247/

Shally Verma (6):
  lib/cryptodev: add asymmetric algos in cryptodev
  lib/cryptodev: add asym op support in cryptodev
  lib/cryptodev: add asymmetric crypto capability in cryptodev
  test/crypto: add unit testcase for asym crypto
  crypto/openssl: add asym crypto support
  doc: add asym crypto in cryptodev programmer guide

 doc/guides/cryptodevs/features/openssl.ini   |   11 +
 doc/guides/cryptodevs/openssl.rst|1 +
 doc/guides/prog_guide/cryptodev_lib.rst  |  338 +++-
 drivers/crypto/openssl/rte_openssl_pmd.c |  377 -
 drivers/crypto/openssl/rte_openssl_pmd_ops.c |  400 -
 drivers/crypto/openssl/rte_openssl_pmd_private.h |   29 +
 lib/librte_cryptodev/Makefile|3 +-
 lib/librte_cryptodev/rte_crypto.h|   37 +-
 lib/librte_cryptodev/rte_crypto_asym.h   |  519 +++
 lib/librte_cryptodev/rte_cryptodev.c |  218 ++-
 lib/librte_cryptodev/rte_cryptodev.h |  186 ++-
 lib/librte_cryptodev/rte_cryptodev_pmd.h |   58 +-
 lib/librte_cryptodev/rte_cryptodev_version.map   |   16 +
 test/test/Makefile   |3 +-
 test/test/test_cryptodev_asym.c  | 1785 ++
 15 files changed, 3952 insertions(+), 29 deletions(-)
 create mode 100644 lib/librte_cryptodev/rte_crypto_asym.h
 create mode 100644 test/test/test_cryptodev_asym.c

--
1.9.1



[dpdk-dev] [PATCH v2 3/6] lib/cryptodev: add asymmetric crypto capability in cryptodev

2018-04-05 Thread Shally Verma
Extend cryptodev with asymmetric capability APIs and
definitions.

Signed-off-by: Shally Verma 
Signed-off-by: Sunila Sahu 
Signed-off-by: Ashish Gupta 

---
User must apply patch
"lib/cryptodev: add asymmetric algos in cryptodev" before compilation
---
 lib/librte_cryptodev/rte_cryptodev.c   |  96 ++
 lib/librte_cryptodev/rte_cryptodev.h   | 105 -
 lib/librte_cryptodev/rte_cryptodev_version.map |   4 +
 3 files changed, 204 insertions(+), 1 deletion(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index cca8d4c..f1e9f7d 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -166,6 +166,31 @@ struct rte_cryptodev_callback {
[RTE_CRYPTO_AEAD_OP_DECRYPT]= "decrypt"
 };
 
+/**
+ * Asymmetric crypto transform operation strings identifiers.
+ */
+const char *rte_crypto_asym_xform_strings[] = {
+   [RTE_CRYPTO_ASYM_XFORM_NONE]= "none",
+   [RTE_CRYPTO_ASYM_XFORM_RSA] = "rsa",
+   [RTE_CRYPTO_ASYM_XFORM_MODEX]   = "modexp",
+   [RTE_CRYPTO_ASYM_XFORM_MODINV]  = "modinv",
+   [RTE_CRYPTO_ASYM_XFORM_DH]  = "dh",
+   [RTE_CRYPTO_ASYM_XFORM_DSA] = "dsa",
+};
+
+/**
+ * Asymmetric crypto operation strings identifiers.
+ */
+const char *rte_crypto_asym_op_strings[] = {
+   [RTE_CRYPTO_ASYM_OP_ENCRYPT]= "encrypt",
+   [RTE_CRYPTO_ASYM_OP_DECRYPT]= "decrypt",
+   [RTE_CRYPTO_ASYM_OP_SIGN]   = "sign",
+   [RTE_CRYPTO_ASYM_OP_VERIFY] = "verify",
+   [RTE_CRYPTO_ASYM_OP_PRIVATE_KEY_GENERATE]   = "priv_key_generate",
+   [RTE_CRYPTO_ASYM_OP_PUBLIC_KEY_GENERATE] = "pub_key_generate",
+   [RTE_CRYPTO_ASYM_OP_SHARED_SECRET_COMPUTE] = "sharedsecret_compute",
+};
+
 int
 rte_cryptodev_get_cipher_algo_enum(enum rte_crypto_cipher_algorithm *algo_enum,
const char *algo_string)
@@ -217,6 +242,24 @@ struct rte_cryptodev_callback {
return -1;
 }
 
+int __rte_experimental
+rte_cryptodev_get_asym_xform_enum(enum rte_crypto_asym_xform_type *xform_enum,
+   const char *xform_string)
+{
+   unsigned int i;
+
+   for (i = 1; i < RTE_DIM(rte_crypto_asym_xform_strings); i++) {
+   if (strcmp(xform_string,
+  rte_crypto_asym_xform_strings[i]) == 0) {
+   *xform_enum = (enum rte_crypto_asym_xform_type) i;
+   return 0;
+   }
+   }
+
+   /* Invalid string */
+   return -1;
+}
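
As an illustration of the string-to-enum lookup added in this hunk, here is a minimal self-contained sketch. The enum, table and function names (xform_type, xform_strings, xform_from_string) are hypothetical and reduced to a few entries; they are not the cryptodev definitions. Index 0 is left without a string, as in the patch where the table starts at the "none" entry, and the sketch adds a NULL guard in case a designated-initializer table ever has holes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened version of the xform type enum. */
enum xform_type {
	XFORM_UNSPECIFIED = 0,	/* no string: not selectable by name */
	XFORM_NONE,
	XFORM_RSA,
	XFORM_DH,
	XFORM_COUNT		/* sentinel, also sizes the table */
};

/* Designated initializers keep each string next to its enum value. */
static const char *const xform_strings[XFORM_COUNT] = {
	[XFORM_NONE] = "none",
	[XFORM_RSA]  = "rsa",
	[XFORM_DH]   = "dh",
};

/* Returns 0 and sets *out on a match, -1 if the string is unknown. */
static int
xform_from_string(enum xform_type *out, const char *name)
{
	unsigned int i;

	/* Start at 1: entry 0 is the unspecified value with no name. */
	for (i = 1; i < XFORM_COUNT; i++) {
		/* Skip holes left by the designated initializers. */
		if (xform_strings[i] == NULL)
			continue;
		if (strcmp(name, xform_strings[i]) == 0) {
			*out = (enum xform_type)i;
			return 0;
		}
	}

	/* Invalid string; *out is left untouched. */
	return -1;
}
```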
+
 /**
  * The crypto auth operation strings identifiers.
  * It could be used in application command line.
@@ -262,6 +305,28 @@ struct rte_cryptodev_callback {
 
 }
 
+const struct rte_cryptodev_asymmetric_xfrm_capability * __rte_experimental
+rte_cryptodev_asym_capability_get(uint8_t dev_id,
+   const struct rte_cryptodev_asym_capability_idx *idx)
+{
+   const struct rte_cryptodev_capabilities *capability;
+   struct rte_cryptodev_info dev_info;
+   unsigned int i = 0;
+
+   memset(&dev_info, 0, sizeof(struct rte_cryptodev_info));
+   rte_cryptodev_info_get(dev_id, &dev_info);
+
+   while ((capability = &dev_info.capabilities[i++])->op !=
+   RTE_CRYPTO_OP_TYPE_UNDEFINED) {
+   if (capability->op != RTE_CRYPTO_OP_TYPE_ASYMMETRIC)
+   continue;
+
+   if (capability->asym.xform_type == idx->type)
+   return &capability->asym.xfrm_capa;
+   }
+   return NULL;
+};
+
 #define param_range_check(x, y) \
(((x < y.min) || (x > y.max)) || \
(y.increment != 0 && (x % y.increment) != 0))
@@ -317,6 +382,37 @@ struct rte_cryptodev_callback {
 
return 0;
 }
+int __rte_experimental
+rte_cryptodev_asym_xfrm_capability_check_optype(
+   const struct rte_cryptodev_asymmetric_xfrm_capability *capability,
+   enum rte_crypto_asym_op_type op_type)
+{
+   if (capability->op_types & (1 << op_type))
+   return 1;
+
+   return 0;
+}
+
+int __rte_experimental
+rte_cryptodev_asym_xfrm_capability_check_modlen(
+   const struct rte_cryptodev_asymmetric_xfrm_capability *capability,
+   uint16_t modlen)
+{
+   /* handle special case of 0, which means the PMD defines no limit */
+   if ((capability->modlen.min != 0) &&
+   ((modlen < capability->modlen.min) ||
+   (capability->modlen.increment != 0 &&
+   (modlen % (capability->modlen.increment)))))
+   return -1;
+   if ((capability->modlen.max != 0) &&
+   ((modlen > capability->modlen.max) ||
+   (capability->modlen.increment != 0 &&
+   (modlen % (capability->modlen.increment)))))
+   return -1;
+
+   return 0;
+}
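
The modulus length check above can be read as three independent conditions (minimum, maximum, increment), with 0 meaning "no limit" for each. A flattened sketch under that reading follows; struct len_range and modlen_check() are hypothetical names for this example. Note one behavioural difference from the patch: here the increment is checked even when both min and max are 0, whereas the patch only evaluates it inside the min and max branches.

```c
#include <stdint.h>

/* Hypothetical stand-in for the capability's modlen range. */
struct len_range {
	uint16_t min;		/* 0 means no lower limit */
	uint16_t max;		/* 0 means no upper limit */
	uint16_t increment;	/* 0 means any step is accepted */
};

/* Returns 0 if modlen is acceptable for the range, -1 otherwise. */
static int
modlen_check(const struct len_range *r, uint16_t modlen)
{
	if (r->min != 0 && modlen < r->min)
		return -1;
	if (r->max != 0 && modlen > r->max)
		return -1;
	if (r->increment != 0 && (modlen % r->increment) != 0)
		return -1;

	return 0;
}
```

For a range of {64, 512, 64}, a modlen of 128 passes while 32, 100 and 1024 are rejected; an all-zero range accepts any length.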
+
 
 const char *
 rte_cryptodev_get_feature_name(uint64_t flag)
diff --git a/lib/librte_cryptodev/rte_cryptodev.h 
b/lib/librte_cryptodev/rte_cryptodev.h
index 6

[dpdk-dev] [PATCH v2 4/6] test/crypto: add unit testcase for asym crypto

2018-04-05 Thread Shally Verma
Add unit test cases to test openssl PMD asym crypto
operations. The test cases invoke asymmetric operations on the DPDK
Openssl PMD and cross-verify the results via the Openssl SW library.
Tests have been verified with the openssl 1.0.2m release.

Tested for:

* RSA Encrypt, Decrypt, Sign and Verify using pre-defined
  test vectors
* Modular Inversion and Exponentiation using pre-defined
  test vectors
* Diffie-Hellman public key generation using a pre-defined
  private key and dynamically generated test vectors
* Diffie-Hellman private key generation using dynamically
  generated test vectors
* Diffie-Hellman private and public key pair generation
  using an xform chain and dynamically generated test
  vectors
* Diffie-Hellman shared secret compute using dynamically
  generated test vectors
* DSA Sign and Verification

The Diffie-Hellman test cases use test params generated at run time,
and thus may take some time to execute.

Signed-off-by: Shally Verma 
Signed-off-by: Sunila Sahu 
Signed-off-by: Ashish Gupta 

---
This patch depends on the asym crypto API patches.
Please apply them before compilation
---
 test/test/Makefile  |3 +-
 test/test/test_cryptodev_asym.c | 1785 +++
 2 files changed, 1787 insertions(+), 1 deletion(-)

diff --git a/test/test/Makefile b/test/test/Makefile
index a88cc38..04ab522 100644
--- a/test/test/Makefile
+++ b/test/test/Makefile
@@ -180,6 +180,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_RING) += test_pmd_ring_perf.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_blockcipher.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev.c
+SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_asym.c
 
 ifeq ($(CONFIG_RTE_LIBRTE_EVENTDEV),y)
 SRCS-y += test_eventdev.c
@@ -200,7 +201,7 @@ CFLAGS += $(WERROR_FLAGS)
 
 CFLAGS += -D_GNU_SOURCE
 
-LDLIBS += -lm
+LDLIBS += -lm -lcrypto
 
 # Disable VTA for memcpy test
 ifeq ($(CONFIG_RTE_TOOLCHAIN_GCC),y)
diff --git a/test/test/test_cryptodev_asym.c b/test/test/test_cryptodev_asym.c
new file mode 100644
index 000..5f15d10
--- /dev/null
+++ b/test/test/test_cryptodev_asym.c
@@ -0,0 +1,1785 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Cavium Networks
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+#include "test_cryptodev.h"
+
+#define TEST_DATA_SIZE 4096
+#define TEST_NUM_BUFS 10
+#define TEST_NUM_SESSIONS 4
+#define ASYM_TEST_MSG_LEN  256
+#define TEST_DH_MOD_LEN 1024
+
+static int gbl_driver_id;
+struct crypto_testsuite_params {
+   struct rte_mempool *op_mpool;
+   struct rte_mempool *session_mpool;
+   struct rte_cryptodev_config conf;
+   struct rte_cryptodev_qp_conf qp_conf;
+   uint8_t valid_devs[RTE_CRYPTO_MAX_DEVS];
+   uint8_t valid_dev_count;
+};
+
+struct crypto_unittest_params {
+   struct rte_cryptodev_asym_session *sess;
+   struct rte_crypto_op *op;
+};
+
+static struct crypto_testsuite_params testsuite_params = { NULL };
+
+struct rsa_test_data {
+   enum rte_crypto_asym_op_type op_type;
+
+   struct {
+   uint8_t data[TEST_DATA_SIZE];
+   unsigned int len;
+   } plainText;
+   struct {
+   uint8_t data[TEST_DATA_SIZE];
+   unsigned int len;
+   } encryptedText;
+   struct {
+   uint8_t data[TEST_DATA_SIZE];
+   unsigned int len;
+   } signText;
+};
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wcast-qual"
+
+static unsigned char base[] = {0xF8, 0xBA, 0x1A, 0x55, 0xD0, 0x2F, 0x85,
+   0xAE, 0x96, 0x7B, 0xB6, 0x2F, 0xB6, 0xCD,
+   0xA8, 0xEB, 0x7E, 0x78, 0xA0, 0x50 };
+
+static struct rsa_test_data rsa_test_case = {
+   .op_type = RTE_CRYPTO_ASYM_OP_ENCRYPT,
+   .plainText = {
+   .data = {
+   0xF8, 0xBA, 0x1A, 0x55, 0xD0, 0x2F, 0x85, 0xAE,
+   0x96, 0x7B, 0xB6, 0x2F, 0xB6, 0xCD, 0xA8, 0xEB,
+   0x7E, 0x78, 0xA0, 0x50
+   },
+   .len = 20
+   },
+   .encryptedText = {
+   .data = {
+   0x4B, 0x22, 0x88, 0xF1, 0x91, 0x5A, 0x6A, 0xCC,
+   0x75, 0xD6, 0x40, 0xE3, 0x58, 0xCA, 0xC8, 0x70,
+   0x9B, 0x2B, 0xC7, 0x36, 0x1F, 0xAE, 0x38, 0xF3,
+   0x97, 0xA6, 0xEE, 0xA7, 0xDB, 0xFF, 0x9F, 0x09,
+   0x73, 0x1A, 0x2F, 0x01, 0xFA, 0xAF, 0x77, 0x09,
+   0xE1, 0x8D, 0x3E, 0x2D, 0x1D, 0x45, 0x15, 0x66,
+   0xE1, 0x79, 0xD7, 0xC6, 0x94, 0x1D, 0x54, 0xBF,
+   0xDD, 0xAB, 0x46, 0x34, 0xC7, 0x55, 0x62, 0x5B,
+   0x9D, 0xBD, 0x28, 0xDB, 0x46, 0x0D, 0x2D, 0x3D,
+   0x41, 0x46, 0xDA, 0x45, 0x31, 0x78, 0xD5, 0xE7,
+   0x2C, 0xA4, 0x1F, 0x73,

[dpdk-dev] [PATCH v2 1/6] lib/cryptodev: add asymmetric algos in cryptodev

2018-04-05 Thread Shally Verma
Add rte_crypto_asym.h with supported xfrms
and associated op structures and APIs

API currently supports:
- RSA Encrypt, Decrypt, Sign and Verify
- Modular Exponentiation and Inversion
- DSA Sign and Verify
- Diffie-Hellman private key exchange
- Diffie-Hellman public key exchange
- Diffie-Hellman shared secret compute
- Diffie-Hellman public/private key pair generation
  using xform chain

Signed-off-by: Shally Verma 
Signed-off-by: Sunila Sahu 
Signed-off-by: Ashish Gupta 
---
 lib/librte_cryptodev/Makefile  |   3 +-
 lib/librte_cryptodev/rte_crypto_asym.h | 519 +
 2 files changed, 521 insertions(+), 1 deletion(-)

diff --git a/lib/librte_cryptodev/Makefile b/lib/librte_cryptodev/Makefile
index bba8dee..93f9d2d 100644
--- a/lib/librte_cryptodev/Makefile
+++ b/lib/librte_cryptodev/Makefile
@@ -12,6 +12,7 @@ LIBABIVER := 4
 # build flags
 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 LDLIBS += -lrte_eal -lrte_mempool -lrte_ring -lrte_mbuf
 LDLIBS += -lrte_kvargs
 
@@ -23,7 +24,7 @@ SYMLINK-y-include += rte_crypto.h
 SYMLINK-y-include += rte_crypto_sym.h
 SYMLINK-y-include += rte_cryptodev.h
 SYMLINK-y-include += rte_cryptodev_pmd.h
-
+SYMLINK-y-include += rte_crypto_asym.h
 # versioning export map
 EXPORT_MAP := rte_cryptodev_version.map
 
diff --git a/lib/librte_cryptodev/rte_crypto_asym.h 
b/lib/librte_cryptodev/rte_crypto_asym.h
new file mode 100644
index 000..d0e2f1d
--- /dev/null
+++ b/lib/librte_cryptodev/rte_crypto_asym.h
@@ -0,0 +1,519 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017-2018 Cavium Networks
+ */
+
+#ifndef _RTE_CRYPTO_ASYM_H_
+#define _RTE_CRYPTO_ASYM_H_
+
+/**
+ * @file rte_crypto_asym.h
+ *
+ * RTE Definitions for Asymmetric Cryptography
+ *
+ * Defines asymmetric algorithms and modes, as well as supported
+ * asymmetric crypto operations.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+typedef struct rte_crypto_param_t {
+   uint8_t *data;
+   /**< pointer to buffer holding data */
+   rte_iova_t iova;
+   /**< IO address of data buffer */
+   size_t length;
+   /**< length of data in bytes */
+} rte_crypto_param;
+
+/** asym xform type name strings */
+extern const char *
+rte_crypto_asym_xform_strings[];
+
+/** asym operations type name strings */
+extern const char *
+rte_crypto_asym_op_strings[];
+
+/**
+ * Asymmetric crypto transformation types.
+ * Each xform type maps to one asymmetric algorithm
+ * performing specific operation
+ *
+ */
+enum rte_crypto_asym_xform_type {
+   RTE_CRYPTO_ASYM_XFORM_UNSPECIFIED = 0,
+   /**< Invalid xform. */
+   RTE_CRYPTO_ASYM_XFORM_NONE,
+   /**< Xform type None.
+* May be supported by PMD to support
+* passthrough op for debugging purpose.
+* if xform_type none , op_type is disregarded.
+*/
+   RTE_CRYPTO_ASYM_XFORM_RSA,
+   /**< RSA. Performs Encrypt, Decrypt, Sign and Verify.
+* Refer to rte_crypto_asym_op_type
+*/
+   RTE_CRYPTO_ASYM_XFORM_DH,
+   /**< Diffie-Hellman.
+* Performs Key Generate and Shared Secret Compute.
+* Refer to rte_crypto_asym_op_type
+*/
+   RTE_CRYPTO_ASYM_XFORM_DSA,
+   /**< Digital Signature Algorithm
+* Performs Signature Generation and Verification.
+* Refer to rte_crypto_asym_op_type
+*/
+   RTE_CRYPTO_ASYM_XFORM_MODINV,
+   /**< Modular Inverse
+* Perform Modulus inverse b^(-1) mod n
+*/
+   RTE_CRYPTO_ASYM_XFORM_MODEX,
+   /**< Modular Exponentiation
+* Perform Modular Exponentiation b^e mod n
+*/
+   RTE_CRYPTO_ASYM_XFORM_TYPE_LIST_END
+   /**< End of list */
+};
+
+/**
+ * Asymmetric crypto operation type variants
+ */
+enum rte_crypto_asym_op_type {
+   RTE_CRYPTO_ASYM_OP_ENCRYPT,
+   /**< Asymmetric Encrypt operation */
+   RTE_CRYPTO_ASYM_OP_DECRYPT,
+   /**< Asymmetric Decrypt operation */
+   RTE_CRYPTO_ASYM_OP_SIGN,
+   /**< Signature Generation operation */
+   RTE_CRYPTO_ASYM_OP_VERIFY,
+   /**< Signature Verification operation */
+   RTE_CRYPTO_ASYM_OP_PRIVATE_KEY_GENERATE,
+   /**< DH Private Key generation operation */
+   RTE_CRYPTO_ASYM_OP_PUBLIC_KEY_GENERATE,
+   /**< DH Public Key generation operation */
+   RTE_CRYPTO_ASYM_OP_SHARED_SECRET_COMPUTE,
+   /**< DH Shared Secret compute operation */
+   RTE_CRYPTO_ASYM_OP_LIST_END
+};
+
+/**
+ * Padding types for RSA signature.
+ */
+enum rte_crypto_rsa_padding_type {
+   RTE_CRYPTO_RSA_PADDING_NONE = 0,
+   /**< RSA no padding scheme */
+   RTE_CRYPTO_RSA_PKCS1_V1_5_BT0,
+   /**< RSA PKCS#1 V1.5 Block Type 0 padding scheme
+* as described in rfc2313
+*/
+   RTE_CRYPTO_RSA_PKCS1_V1_5_BT1,
+   /**< RSA PKCS#1 V1.5 Block Type 01 padding scheme
+* as described in rfc2313
+*/

[dpdk-dev] [PATCH v2 5/6] crypto/openssl: add asym crypto support

2018-04-05 Thread Shally Verma
Add asymmetric crypto operation support in openssl PMD.
Current list of supported asym xforms:
* RSA
* DSA
* Diffie-Hellman
* Modular Operations

changes from v1:
- resolve new line error in doc/guides/cryptodevs/openssl.rst

Signed-off-by: Shally Verma 
Signed-off-by: Sunila Sahu 
Signed-off-by: Ashish Gupta 

--
Please apply asymmetric crypto API patches before compilation
---
 doc/guides/cryptodevs/features/openssl.ini   |  11 +
 doc/guides/cryptodevs/openssl.rst|   1 +
 drivers/crypto/openssl/rte_openssl_pmd.c | 377 -
 drivers/crypto/openssl/rte_openssl_pmd_ops.c | 400 ++-
 drivers/crypto/openssl/rte_openssl_pmd_private.h |  29 ++
 5 files changed, 806 insertions(+), 12 deletions(-)

diff --git a/doc/guides/cryptodevs/features/openssl.ini 
b/doc/guides/cryptodevs/features/openssl.ini
index 6915658..bef5c7f 100644
--- a/doc/guides/cryptodevs/features/openssl.ini
+++ b/doc/guides/cryptodevs/features/openssl.ini
@@ -7,6 +7,7 @@
 Symmetric crypto   = Y
 Sym operation chaining = Y
 Mbuf scatter gather= Y
+Asymmetric crypto = Y
 
 ;
 ; Supported crypto algorithms of the 'openssl' crypto driver.
@@ -49,3 +50,13 @@ AES GCM (256) = Y
 AES CCM (128) = Y
 AES CCM (192) = Y
 AES CCM (256) = Y
+
+;
+; Supported Asymmetric algorithms of the 'openssl' crypto driver.
+;
+[Asymmetric]
+RSA = Y
+DSA = Y
+Modular Exponentiation = Y
+Modular Inversion = Y
+Deffie-hellman = Y
diff --git a/doc/guides/cryptodevs/openssl.rst 
b/doc/guides/cryptodevs/openssl.rst
index 427fc80..4f90be8 100644
--- a/doc/guides/cryptodevs/openssl.rst
+++ b/doc/guides/cryptodevs/openssl.rst
@@ -80,6 +80,7 @@ crypto processing.
 
 Test name is cryptodev_openssl_autotest.
 For performance test cryptodev_openssl_perftest can be used.
+For asymmetric crypto operations testing, run cryptodev_openssl_asym_autotest.
 
 To verify real traffic l2fwd-crypto example can be used with this command:
 
diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c 
b/drivers/crypto/openssl/rte_openssl_pmd.c
index f584d0d..527e427 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd.c
@@ -727,19 +727,35 @@ static void HMAC_CTX_free(HMAC_CTX *ctx)
 }
 
 /** Provide session for operation */
-static struct openssl_session *
+static void  *
 get_session(struct openssl_qp *qp, struct rte_crypto_op *op)
 {
struct openssl_session *sess = NULL;
+   struct openssl_asym_session *asym_sess = NULL;
 
if (op->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
-   /* get existing session */
-   if (likely(op->sym->session != NULL))
-   sess = (struct openssl_session *)
-   get_session_private_data(
-   op->sym->session,
-   cryptodev_driver_id);
+   if (op->type == RTE_CRYPTO_OP_TYPE_SYMMETRIC) {
+   /* get existing session */
+   if (likely(op->sym->session != NULL))
+   sess = (struct openssl_session *)
+   get_session_private_data(
+   op->sym->session,
+   cryptodev_driver_id);
+   } else {
+   if (likely(op->asym->session != NULL))
+   asym_sess = (struct openssl_asym_session *)
+   get_asym_session_private_data(
+   op->asym->session,
+   cryptodev_driver_id);
+   if (asym_sess == NULL)
+   op->status =
+   RTE_CRYPTO_OP_STATUS_INVALID_SESSION;
+   return asym_sess;
+   }
} else {
+   if (op->type == RTE_CRYPTO_OP_TYPE_ASYMMETRIC)
+   return NULL; /* sessionless asymmetric not supported */
+
/* provide internal session */
void *_sess = NULL;
void *_sess_private_data = NULL;
@@ -1525,6 +1541,341 @@ static void HMAC_CTX_free(HMAC_CTX *ctx)
op->status = RTE_CRYPTO_OP_STATUS_ERROR;
 }
 
+static int process_openssl_modinv_op(struct rte_crypto_op *cop,
+ struct openssl_asym_session *sess)
+{
+   struct rte_crypto_asym_op *op = cop->asym;
+   BIGNUM *base = BN_CTX_get(sess->u.m.ctx);
+   BIGNUM *res = BN_CTX_get(sess->u.m.ctx);
+
+   if (unlikely(base == NULL || res == NULL)) {
+   if (base)
+   BN_free(base);
+   if (res)
+   BN_free(res);
+   cop->status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
+   return -1;
+   }
+
+   base = BN_bin2bn((cons

[dpdk-dev] [PATCH v2 6/6] doc: add asym crypto in cryptodev programmer guide

2018-04-05 Thread Shally Verma
Update cryptodev programmer guide with description of
asymmetric crypto framework in lib cryptodev.

Signed-off-by: Shally Verma 
Signed-off-by: Sunila Sahu 
Signed-off-by: Ashish Gupta 
---
 doc/guides/prog_guide/cryptodev_lib.rst | 338 +++-
 1 file changed, 329 insertions(+), 9 deletions(-)

diff --git a/doc/guides/prog_guide/cryptodev_lib.rst 
b/doc/guides/prog_guide/cryptodev_lib.rst
index 066fe2d..d429833 100644
--- a/doc/guides/prog_guide/cryptodev_lib.rst
+++ b/doc/guides/prog_guide/cryptodev_lib.rst
@@ -8,7 +8,7 @@ The cryptodev library provides a Crypto device framework for 
management and
 provisioning of hardware and software Crypto poll mode drivers, defining 
generic
 APIs which support a number of different Crypto operations. The framework
 currently only supports cipher, authentication, chained cipher/authentication
-and AEAD symmetric Crypto operations.
+and AEAD symmetric and asymmetric Crypto operations.
 
 
 Design Principles
@@ -159,8 +159,8 @@ Device Features and Capabilities
 Crypto devices define their functionality through two mechanisms, global device
 features and algorithm capabilities. Global devices features identify device
 wide level features which are applicable to the whole device such as
-the device having hardware acceleration or supporting symmetric Crypto
-operations,
+the device having hardware acceleration or supporting symmetric and/or 
asymmetric
+Crypto operations,
 
 The capabilities mechanism defines the individual algorithms/functions which
 the device supports, such as a specific symmetric Crypto cipher,
@@ -199,7 +199,7 @@ scope of the Crypto capability see the definition of the 
structure in the
 Each Crypto poll mode driver defines its own private array of capabilities
 for the operations it supports. Below is an example of the capabilities for a
 PMD which supports the authentication algorithm SHA1_HMAC and the cipher
-algorithm AES_CBC.
+algorithm AES_CBC and RSA operations.
 
 .. code-block:: c
 
@@ -245,7 +245,29 @@ algorithm AES_CBC.
 }
 }
 }
-}
+},
+   {   /* RSA */
+   .op = RTE_CRYPTO_OP_TYPE_ASYMMETRIC,
+   {.asym = {
+   .xform_type = RTE_CRYPTO_ASYM_XFORM_RSA,
+   .xfrm_capa = {
+   .xform_type = RTE_CRYPTO_ASYM_XFORM_RSA,
+   .op_types = ((1 << RTE_CRYPTO_ASYM_OP_SIGN) |
+   (1 << RTE_CRYPTO_ASYM_OP_VERIFY) |
+   (1 << RTE_CRYPTO_ASYM_OP_ENCRYPT) |
+   (1 << RTE_CRYPTO_ASYM_OP_DECRYPT)),
+   {
+   .modlen = {
+   /* min length is based on openssl rsa keygen */
+   .min = 30,
+   /* value 0 symbolizes no limit on max length */
+   .max = 0,
+   .increment = 1
+   }, }
+   }
+   },
+   }
+   }
 }
 
 
@@ -761,14 +783,312 @@ using one of the crypto PMDs available in DPDK.
 num_dequeued_ops);
 } while (total_num_dequeued_ops < num_enqueued_ops);
 
-
 Asymmetric Cryptography
 -----------------------
 
-Asymmetric functionality is currently not supported by the cryptodev API.
+The cryptodev library currently provides support for the following asymmetric
+Crypto operations: RSA, modular exponentiation and inversion, Diffie-Hellman
+public and/or private key generation and shared secret computation, and DSA
+signature generation and verification.
+
+Session and Session Management
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sessions are used in asymmetric cryptographic processing to store the immutable
+data defined in the asymmetric cryptographic transform, which is then used in
+operation processing. Sessions typically store information such as public and
+private keys, domain parameters, or prime modulus data, i.e. data that is
+immutable across data sets. Crypto sessions cache this immutable data in an
+optimal way for the underlying PMD, which allows further acceleration of the
+offload of Crypto workloads.
+
+As with symmetric sessions, the Crypto device framework provides APIs to
+allocate and initialize asymmetric sessions for crypto devices, where sessions
+are mempool objects. It is the application's responsibility to create and
+manage the session mempools. Applications using both symmetric and asymmetric
+sessions should allocate and maintain separate session pools for each type.
+
+An application can use ``rte_cryptodev_get_asym_session_private_size()`` to
+get the private size of asymmetric session on a given crypto device. This
+function would allow an application to calculate the max device asymmetric
+session size

Re: [dpdk-dev] [PATCH v3 46/68] vfio: allow to map other memory regions

2018-04-05 Thread Burakov, Anatoly

On 04-Apr-18 12:21 AM, Anatoly Burakov wrote:

Currently it is not possible to use memory that is not owned by DPDK to
perform DMA. This scenario might be used in vhost applications (like
SPDK) where the guest sends its own memory table. To fill this gap, provide
an API to allow registering arbitrary addresses in a VFIO container.

Signed-off-by: Pawel Wodkowski 
Signed-off-by: Anatoly Burakov 
Signed-off-by: Gowrishankar Muthukrishnan 
---


@Gowrishankar,

We've discussed this privately already, but just to make sure it is 
publicly stated: as it is, parts of this patchset for PPC64 have 
potential issues with them.


Unmapping and remapping the entire segment list on alloc introduces a 
race condition - what if a DMA request comes in while we're in the 
middle of remapping? We cannot realistically stop NICs from doing DMA 
while some other thread is allocating memory.


There is also a larger issue that I've raised in a previous response to
this patch [1], and PPC64 will have this problem worse: not only will the
described issue happen on hot-unplug/hotplug, but it may also happen
during regular allocations, because the PPC64 IOMMU will drop all
mappings on window resize.


[1] http://dpdk.org/ml/archives/dev/2018-April/095182.html

--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH v4 01/17] net/axgbe: add minimal dev init and uninit support

2018-04-05 Thread Ferruh Yigit
On 4/5/2018 7:39 AM, Ravi Kumar wrote:
> add ethernet poll mode driver for AMD 10G devices embedded in
> AMD EPYC™ EMBEDDED 3000 family processors
> 
> Signed-off-by: Ravi Kumar 
<...>

> @@ -410,6 +410,12 @@ CONFIG_RTE_PMD_RING_MAX_TX_RINGS=16
>  CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y
>  
>  #
> +# Compile AMD PMD
> +#
> +CONFIG_RTE_LIBRTE_AXGBE_DEBUG_INIT=n
> +CONFIG_RTE_LIBRTE_AXGBE_PMD=y


Please add alphabetically.

RTE_LIBRTE_AXGBE_DEBUG_INIT is used for data path logs; anything else should
use dynamic logging. So the name is wrong for its purpose: it is no longer an
"init" debug log, since you already have dynamic logging for init.
The documentation describes it as "Toggle display of initialization related
messages", which seems wrong as well.

<...>

> @@ -12,6 +12,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += af_packet
>  DIRS-$(CONFIG_RTE_LIBRTE_ARK_PMD) += ark
>  DIRS-$(CONFIG_RTE_LIBRTE_AVF_PMD) += avf
>  DIRS-$(CONFIG_RTE_LIBRTE_AVP_PMD) += avp
> +DIRS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe

Please consider adding meson support too, perhaps as a separate patch in this
set.

<...>

> +
> +#
> +# all source are stored in SRCS-y
> +#
> +SRCS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe_ethdev.c

The shared build causes a build error; you need to add the dependent
libraries [1]. Please also test shared library builds:

[1] something like:
 +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 +LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
 +LDLIBS += -lrte_bus_pci

<...>

> +RTE_PMD_REGISTER_PCI(net_axgbe, rte_axgbe_pmd);
> +RTE_PMD_REGISTER_PCI_TABLE(net_axgbe, pci_id_axgbe_map);
> +RTE_PMD_REGISTER_KMOD_DEP(net_axgbe, "* igb_uio | uio_pci_generic | 
> vfio-pci");

Is vfio-pci supported?

Documentation says:
"
AXGBE PMD works only with legacy interrupts. Load ``igb_uio`` module in legacy
interrupt mode using module params.

Bind the intended AMD device to igb_uio module
"
<...>


Re: [dpdk-dev] [PATCH v4 17/17] net/axgbe: add workaround for axgbe ethernet training bug

2018-04-05 Thread Ferruh Yigit
On 4/5/2018 7:39 AM, Ravi Kumar wrote:
> Signed-off-by: Ravi Kumar 

Can you please give more information on the bug this solves? What problem is
observed if it is not fixed? This may help people who run into the problem.



Re: [dpdk-dev] [PATCH v4 10/17] net/axgbe: add transmit and receive data path apis

2018-04-05 Thread Ferruh Yigit
On 4/5/2018 7:39 AM, Ravi Kumar wrote:
> Supported scalar implementation for RX data path
> Supported scalar and vector implementation for TX data path
> 
> Signed-off-by: Ravi Kumar 
> ---
>  drivers/net/axgbe/Makefile |   1 +
>  drivers/net/axgbe/axgbe_ethdev.c   |  22 +-
>  drivers/net/axgbe/axgbe_rxtx.c | 429 
> +
>  drivers/net/axgbe/axgbe_rxtx.h |  19 ++
>  drivers/net/axgbe/axgbe_rxtx_vec_sse.c |  93 +++
>  5 files changed, 563 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/axgbe/axgbe_rxtx_vec_sse.c
> 
> diff --git a/drivers/net/axgbe/Makefile b/drivers/net/axgbe/Makefile
> index 9fd7b5e..aff7917 100644
> --- a/drivers/net/axgbe/Makefile
> +++ b/drivers/net/axgbe/Makefile
> @@ -24,5 +24,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe_mdio.c
>  SRCS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe_phy_impl.c
>  SRCS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe_i2c.c
>  SRCS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe_rxtx.c
> +SRCS-$(CONFIG_RTE_LIBRTE_AXGBE_PMD) += axgbe_rxtx_vec_sse.c

This needs to be protected with x86 checks. PMD is enabled by default in config,
which means it will be enabled for other architectures too, like arm and ibm,
and this file will cause build error for them.


[dpdk-dev] [PATCH] doc: add meter API change to release notes

2018-04-05 Thread Jasvinder Singh
Update the release notes with the meter API change to support configuration
profiles.

Signed-off-by: Jasvinder Singh 
---
 doc/guides/rel_notes/release_18_05.rst | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index e5fac1c..34222cd 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -72,6 +72,16 @@ API Changes
Also, make sure to start the actual text at the margin.
=
 
+* **Meter API updated to accommodate configuration profiles.**
+
+  The meter API is changed to support meter configuration profiles. The
+  configuration profile represents the set of configuration parameters
+  for a given meter object, such as the rates and sizes for the token
+  buckets. These configuration parameters were previously part of the meter
+  object's internal data structure. Separating the configuration
+  parameters from the meter object data structure reduces its
+  memory footprint, which improves cache utilization when a large number
+  of meter objects are used.
 
 ABI Changes
 ---
-- 
2.9.3



[dpdk-dev] [PATCH v3] pdump: change to use generic multi-process channel

2018-04-05 Thread Jianfeng Tan
The original code relies on a private channel for primary and
secondary communication. Change to use the generic multi-process
channel.

Note that with this change, dpdk-pdump will not be compatible with
DPDK applications built on older versions.

Cc: reshma.pat...@intel.com

Signed-off-by: Jianfeng Tan 
---
v3:
  - Deprecate the enum as suggested by Reshma.
  - Add __rte_deprecated flag for rte_pdump_set_socket_dir.
  - Delete rte_pdump_set_socket_dir in pdump example.
v2:
  - Update doc for deprecation of API, rte_pdump_set_socket_dir,
and API change for rte_pdump_init.
  - Add notice for known incompatibility issue in doc.

 app/pdump/main.c   |  22 +-
 doc/guides/rel_notes/deprecation.rst   |   7 +
 doc/guides/rel_notes/release_18_05.rst |   7 +
 lib/librte_pdump/Makefile  |   3 +-
 lib/librte_pdump/rte_pdump.c   | 423 +
 lib/librte_pdump/rte_pdump.h   |   3 +-
 6 files changed, 88 insertions(+), 377 deletions(-)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index d29de03..2d0879c 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -156,9 +156,11 @@ pdump_usage(const char *prgname)
"[mbuf-size=default:2176],"
"[total-num-mbufs=default:65535]'\n"
"[--server-socket-path="
-   "default:/var/run/.dpdk/ (or) ~/.dpdk/]\n"
+   " which is deprecated and will be removed soon,"
+   " default:/var/run/.dpdk/ (or) ~/.dpdk/]\n"
"[--client-socket-path="
-   "default:/var/run/.dpdk/ (or) ~/.dpdk/]\n",
+   " which is deprecated and will be removed soon,"
+   " default:/var/run/.dpdk/ (or) ~/.dpdk/]\n",
prgname);
 }
 
@@ -744,22 +746,6 @@ enable_pdump(void)
struct pdump_tuples *pt;
int ret = 0, ret1 = 0;
 
-   if (server_socket_path[0] != 0)
-   ret = rte_pdump_set_socket_dir(server_socket_path,
-   RTE_PDUMP_SOCKET_SERVER);
-   if (ret == 0 && client_socket_path[0] != 0) {
-   ret = rte_pdump_set_socket_dir(client_socket_path,
-   RTE_PDUMP_SOCKET_CLIENT);
-   }
-   if (ret < 0) {
-   cleanup_pdump_resources();
-   rte_exit(EXIT_FAILURE,
-   "failed to set socket paths of server:%s, "
-   "client:%s\n",
-   server_socket_path,
-   client_socket_path);
-   }
-
for (i = 0; i < num_tuples; i++) {
pt = &pdump_t[i];
if (pt->dir == RTE_PDUMP_FLAG_RXTX) {
diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index ec70b5f..857450a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -150,3 +150,10 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* pdump: As pdump now uses generic IPC, some changes to APIs and structures
+  are expected in a subsequent release.
+
+  - ``rte_pdump_set_socket_dir`` will be removed;
+  - The parameter, ``path``, of ``rte_pdump_init`` will be removed;
+  - The enum ``rte_pdump_socktype`` will be removed.
diff --git a/doc/guides/rel_notes/release_18_05.rst 
b/doc/guides/rel_notes/release_18_05.rst
index e5fac1c..966bd36 100644
--- a/doc/guides/rel_notes/release_18_05.rst
+++ b/doc/guides/rel_notes/release_18_05.rst
@@ -114,6 +114,13 @@ Known Issues
Also, make sure to start the actual text at the margin.
=
 
+* **pdump is not compatible with old applications.**
+
+  As pdump negotiation now uses generic multi-process communication instead
+  of the previous dedicated unix socket, pdump applications, including the
+  dpdk-pdump example and any other applications using librte_pdump, cannot
+  work with primary applications built on older DPDK versions.
+
 
 Shared Library Versions
 ---
diff --git a/lib/librte_pdump/Makefile b/lib/librte_pdump/Makefile
index 98fa752..0ee0fa1 100644
--- a/lib/librte_pdump/Makefile
+++ b/lib/librte_pdump/Makefile
@@ -1,11 +1,12 @@
 # SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2016 Intel Corporation
+# Copyright(c) 2016-2018 Intel Corporation
 
 include $(RTE_SDK)/mk/rte.vars.mk
 
 # library name
 LIB = librte_pdump.a
 
+CFLAGS += -DALLOW_EXPERIMENTAL_API
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3
 CFLAGS += -D_GNU_SOURCE
 LDLIBS += -lpthread
diff --git a/lib/librte_pdump/rte_pdump.c b/lib/librte_pdump/rte_pdump.c
index ad6efc6..f7cfaec 100644
--- a/lib/librte_pdump/rte_pd

[dpdk-dev] [PATCH v3 1/4] app/eventdev: add event timer adapter as a producer

2018-04-05 Thread Pavan Nikhilesh
Add event timer adapter as producer option that can be selected by
passing --prod_type_timerdev.

Signed-off-by: Pavan Nikhilesh 
---

 v3 Changes:
 - Add detailed options dump.
 - Fix few typos.

 v2 Changes:
 - set timer to NOT_ARMED before trying to arm it.
 - prevent edge cases for timeout_ticks being set to 0.

 app/test-eventdev/evt_options.c  |  54 +++
 app/test-eventdev/evt_options.h  |  24 +
 app/test-eventdev/test_perf_atq.c|  10 +--
 app/test-eventdev/test_perf_common.c | 170 +--
 app/test-eventdev/test_perf_common.h |   7 ++
 app/test-eventdev/test_perf_queue.c  |   7 +-
 6 files changed, 236 insertions(+), 36 deletions(-)

diff --git a/app/test-eventdev/evt_options.c b/app/test-eventdev/evt_options.c
index 9683b2224..49cd9c419 100644
--- a/app/test-eventdev/evt_options.c
+++ b/app/test-eventdev/evt_options.c
@@ -27,6 +27,11 @@ evt_options_default(struct evt_options *opt)
opt->pool_sz = 16 * 1024;
opt->wkr_deq_dep = 16;
opt->nb_pkts = (1ULL << 26); /* do ~64M packets */
+   opt->nb_timers = 1E8;
+   opt->nb_timer_adptrs = 1;
+   opt->bkt_tck_nsec = 1E3; /* 1000ns ~ 1us */
+   opt->max_tmo_nsec = 1E5; /* 100us */
+   opt->nb_bkt_tcks = 10;   /* 50us */
opt->prod_type = EVT_PROD_TYPE_SYNT;
 }

@@ -86,6 +91,13 @@ evt_parse_eth_prod_type(struct evt_options *opt, const char 
*arg __rte_unused)
return 0;
 }

+static int
+evt_parse_timer_prod_type(struct evt_options *opt, const char *arg 
__rte_unused)
+{
+   opt->prod_type = EVT_PROD_TYPE_EVENT_TIMER_ADPTR;
+   return 0;
+}
+
 static int
 evt_parse_test_name(struct evt_options *opt, const char *arg)
 {
@@ -169,7 +181,10 @@ usage(char *program)
"\t--worker_deq_depth : dequeue depth of the worker\n"
"\t--fwd_latency  : perform fwd_latency measurement\n"
"\t--queue_priority   : enable queue priority\n"
-   "\t--prod_type_ethdev : use ethernet device as producer\n."
+   "\t--prod_type_ethdev : use ethernet device as producer.\n"
+   "\t--prod_type_timerdev : use event timer device as producer.\n"
+   "\t x * bkt_tck_nsec would be the timeout\n"
+   "\t in ns.\n"
);
printf("available tests:\n");
evt_test_dump_names();
@@ -217,22 +232,23 @@ evt_parse_sched_type_list(struct evt_options *opt, const 
char *arg)
 }

 static struct option lgopts[] = {
-   { EVT_NB_FLOWS, 1, 0, 0 },
-   { EVT_DEVICE,   1, 0, 0 },
-   { EVT_VERBOSE,  1, 0, 0 },
-   { EVT_TEST, 1, 0, 0 },
-   { EVT_PROD_LCORES,  1, 0, 0 },
-   { EVT_WORK_LCORES,  1, 0, 0 },
-   { EVT_SOCKET_ID,1, 0, 0 },
-   { EVT_POOL_SZ,  1, 0, 0 },
-   { EVT_NB_PKTS,  1, 0, 0 },
-   { EVT_WKR_DEQ_DEP,  1, 0, 0 },
-   { EVT_SCHED_TYPE_LIST,  1, 0, 0 },
-   { EVT_FWD_LATENCY,  0, 0, 0 },
-   { EVT_QUEUE_PRIORITY,   0, 0, 0 },
-   { EVT_PROD_ETHDEV,  0, 0, 0 },
-   { EVT_HELP, 0, 0, 0 },
-   { NULL, 0, 0, 0 }
+   { EVT_NB_FLOWS,1, 0, 0 },
+   { EVT_DEVICE,  1, 0, 0 },
+   { EVT_VERBOSE, 1, 0, 0 },
+   { EVT_TEST,1, 0, 0 },
+   { EVT_PROD_LCORES, 1, 0, 0 },
+   { EVT_WORK_LCORES, 1, 0, 0 },
+   { EVT_SOCKET_ID,   1, 0, 0 },
+   { EVT_POOL_SZ, 1, 0, 0 },
+   { EVT_NB_PKTS, 1, 0, 0 },
+   { EVT_WKR_DEQ_DEP, 1, 0, 0 },
+   { EVT_SCHED_TYPE_LIST, 1, 0, 0 },
+   { EVT_FWD_LATENCY, 0, 0, 0 },
+   { EVT_QUEUE_PRIORITY,  0, 0, 0 },
+   { EVT_PROD_ETHDEV, 0, 0, 0 },
+   { EVT_PROD_TIMERDEV,   0, 0, 0 },
+   { EVT_HELP,0, 0, 0 },
+   { NULL,0, 0, 0 }
 };

 static int
@@ -255,11 +271,12 @@ evt_opts_parse_long(int opt_idx, struct evt_options *opt)
{ EVT_FWD_LATENCY, evt_parse_fwd_latency},
{ EVT_QUEUE_PRIORITY, evt_parse_queue_priority},
{ EVT_PROD_ETHDEV, evt_parse_eth_prod_type},
+   { EVT_PROD_TIMERDEV, evt_parse_timer_prod_type},
};

for (i = 0; i < RTE_DIM(parsermap); i++) {
if (strncmp(lgopts[opt_idx].name, parsermap[i].lgopt_name,
-   strlen(parsermap[i].lgopt_name)) == 0)
+   strlen(lgopts[opt_idx].name)) == 0)
return parsermap[i].parser_fn(opt, optarg);
}

@@ -305,6 +322,7 @@ evt_options_dump(struct evt_options *opt)
evt_dump("pool_sz", "%d", opt->pool_sz);
evt_dump("master lcore", "%d", rte_get_master_lcore());
evt_dump("nb_pkts", "%"PRIu64, opt->nb_pkts);
+   evt_dump("nb_timers", "%"PRIu64, opt->nb_timers
