Re: [dpdk-dev] [PATCH v5 2/6] eal: abstract away IPC socket path generation

2018-03-11 Thread Tan, Jianfeng



On 3/8/2018 12:56 AM, Anatoly Burakov wrote:

Signed-off-by: Anatoly Burakov 


Acked-by: Jianfeng Tan 

Thanks,
Jianfeng


---

Notes:
 v5: remove lock files, leaving only socket paths code
 
 v4: replace lock files with init files


  lib/librte_eal/common/eal_common_proc.c | 48 -
  1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c 
b/lib/librte_eal/common/eal_common_proc.c
index da7930f..1aab3ac 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -91,6 +91,17 @@ find_sync_request(const char *dst, const char *act_name)
return r;
  }
  
+static void

+create_socket_path(const char *name, char *buf, int len)
+{
+   const char *prefix = eal_mp_socket_path();
+
+   if (strlen(name) > 0)
+   snprintf(buf, len, "%s_%s", prefix, name);
+   else
+   snprintf(buf, len, "%s", prefix);
+}
+
  int
  rte_eal_primary_proc_alive(const char *config_file_path)
  {
@@ -290,8 +301,12 @@ mp_handle(void *arg __rte_unused)
  static int
  open_socket_fd(void)
  {
+   char peer_name[PATH_MAX] = {0};
struct sockaddr_un un;
-   const char *prefix = eal_mp_socket_path();
+
+   if (rte_eal_process_type() == RTE_PROC_SECONDARY)
+   snprintf(peer_name, sizeof(peer_name),
+   "%d_%"PRIx64, getpid(), rte_rdtsc());
  
  	mp_fd = socket(AF_UNIX, SOCK_DGRAM, 0);

if (mp_fd < 0) {
@@ -301,13 +316,11 @@ open_socket_fd(void)
  
  	memset(&un, 0, sizeof(un));

un.sun_family = AF_UNIX;
-   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-   snprintf(un.sun_path, sizeof(un.sun_path), "%s", prefix);
-   else {
-   snprintf(un.sun_path, sizeof(un.sun_path), "%s_%d_%"PRIx64,
-prefix, getpid(), rte_rdtsc());
-   }
+
+   create_socket_path(peer_name, un.sun_path, sizeof(un.sun_path));
+
unlink(un.sun_path); /* May still exist since last run */
+
if (bind(mp_fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
RTE_LOG(ERR, EAL, "failed to bind %s: %s\n",
un.sun_path, strerror(errno));
@@ -342,20 +355,6 @@ unlink_sockets(const char *filter)
return 0;
  }
  
-static void

-unlink_socket_by_path(const char *path)
-{
-   char *filename;
-   char *fullpath = strdup(path);
-
-   if (!fullpath)
-   return;
-   filename = basename(fullpath);
-   unlink_sockets(filename);
-   free(fullpath);
-   RTE_LOG(INFO, EAL, "Remove socket %s\n", path);
-}
-
  int
  rte_mp_channel_init(void)
  {
@@ -444,10 +443,9 @@ send_msg(const char *dst_path, struct rte_mp_msg *msg, int 
type)
if (snd < 0) {
rte_errno = errno;
/* Check if it caused by peer process exits */
-   if (errno == ECONNREFUSED) {
-   /* We don't unlink the primary's socket here */
-   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-   unlink_socket_by_path(dst_path);
+   if (errno == ECONNREFUSED &&
+   rte_eal_process_type() == RTE_PROC_PRIMARY) {
+   unlink(dst_path);
return 0;
}
if (errno == ENOBUFS) {




Re: [dpdk-dev] [PATCH v5 4/6] eal: lock IPC directory on init and send

2018-03-11 Thread Tan, Jianfeng



On 3/8/2018 12:56 AM, Anatoly Burakov wrote:

When sending IPC messages, prevent new sockets from initializing.

Signed-off-by: Anatoly Burakov 


Make sense.

Acked-by: Jianfeng Tan 

Thanks!


Re: [dpdk-dev] [PATCH v5 5/6] eal: simplify IPC sync request timeout code

2018-03-11 Thread Tan, Jianfeng



On 3/8/2018 12:56 AM, Anatoly Burakov wrote:

Signed-off-by: Anatoly Burakov 


Why I did not find this simple way in the first place :-)

Acked-by: Jianfeng Tan 

Thanks!


[dpdk-dev] Question on AESNI PMD

2018-03-11 Thread Luse, Paul E
Hi,

I'm working on an SPDK module that uses the DPDK cryptodev framework, initially 
I'm using the AESNI PMD and have a few questions. in the doc it says that only 
in-place is supported however I see code in set_mb_job_params() just after the 
comment "Mutable crypto operation parameters" it appears to support a separate 
src and dst m_buf so curious about that.

For my use case (storage) I'm using external data buffers so I can't use that 
code anyways but I was able to make some minor changes and am able to pass in 
different src and dst m_bufs that point to my own data buffers (not in the 
packet) and it seems to be working fine.

So my 2 questions are:

(1) is the documented in-place limitation simply not correct?

(2) would there be any upstream interest in supporting a patch that enables 
m_bufs using external data buffers for src and dst?

Thanks!
Paul


Re: [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated

2018-03-11 Thread santosh
Hi Andrew,


On Saturday 10 March 2018 09:09 PM, Andrew Rybchenko wrote:
> Size of memory chunk required to populate mempool objects depends
> on how objects are stored in the memory. Different mempool drivers
> may have different requirements and a new operation allows to
> calculate memory size in accordance with driver requirements and
> advertise requirements on minimum memory chunk size and alignment
> in a generic way.
>
> Bump ABI version since the patch breaks it.
>
> Signed-off-by: Andrew Rybchenko 
> ---
> RFCv2 -> v1:
>  - move default calc_mem_size callback to rte_mempool_ops_default.c
>  - add ABI changes to release notes
>  - name default callback consistently: rte_mempool_op__default()
>  - bump ABI version since it is the first patch which breaks ABI
>  - describe default callback behaviour in details
>  - avoid introduction of internal function to cope with depration
>(keep it to deprecation patch)
>  - move cache-line or page boundary chunk alignment to default callback
>  - highlight that min_chunk_size and align parameters are output only
>
[...]

> diff --git a/lib/librte_mempool/rte_mempool_ops_default.c 
> b/lib/librte_mempool/rte_mempool_ops_default.c
> new file mode 100644
> index 000..57fe79b
> --- /dev/null
> +++ b/lib/librte_mempool/rte_mempool_ops_default.c
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2016 Intel Corporation.
> + * Copyright(c) 2016 6WIND S.A.
> + * Copyright(c) 2018 Solarflare Communications Inc.
> + */
> +
> +#include 
> +
> +ssize_t
> +rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
> +  uint32_t obj_num, uint32_t pg_shift,
> +  size_t *min_chunk_size, size_t *align)
> +{
> + unsigned int mp_flags;
> + int ret;
> + size_t total_elt_sz;
> + size_t mem_size;
> +
> + /* Get mempool capabilities */
> + mp_flags = 0;
> + ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
> + if ((ret < 0) && (ret != -ENOTSUP))
> + return ret;
> +
> + total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
> +
> + mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
> +  mp->flags | mp_flags);
> +

Looks ok to me except a nit:
(mp->flags | mp_flags) style expression is to differentiate that
mp_flags holds driver specific flag like BLK_ALIGN and mp->flags
has appl specific flags.. is it so? If not then why not simply
do like:
mp->flags |= mp_flags.

Thanks.



Re: [dpdk-dev] [dpdk-stable] [PATCH] net/mlx5: fix disabling Tx packet inlining

2018-03-11 Thread Shahaf Shuler
Tuesday, February 27, 2018 9:51 AM, NĂ©lio Laranjeiro
> On Mon, Feb 26, 2018 at 09:50:57AM -0800, Yongseok Koh wrote:
> > Adding 'txq_inline=0' to PMD parameter should disable Tx packet
> > inlining but it doesn't work properly for Enhanced Multi-Packet Send.
> >
> > Fixes: 6ce84bd88919 ("net/mlx5: add enhanced multi-packet send for
> > ConnectX-5")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Yongseok Koh 
> > Acked-by: Shahaf Shuler 
> Acked-by: Nelio Laranjeiro 

Applied to next-net-mlx, thanks. 



[dpdk-dev] Mellanox 18.05 roadmap

2018-03-11 Thread Olga Shern
Hi,

Sorry for sending it a little bit late.
Below is what we are planning to implement in 18.05 release:

Support Rx stateless ethdev offloads (RSS, checksum and packet type 
classification) for various tunnel protocols. Including VXLAN, VXLAN-GPE, 
MPLSoGRE and MPLSoUDP.
This new API will be implemented in mlx5.


Support Tx stateless ethdev offloads (TSO and checksum) for custom tunnel 
protocols.  Custom tunnel protocol is defined by informing the PMD on the 
offsets and header types of the inner and outer headers inside the packet 
payload.
This new API will be implemented in mlx5.



API to support full HW offload including:

Redirecting specific flows from one VM to another VM without SW processing.

Manipulation of packet payload including encap/decap of headers, vlan push/pop, 
headers rewrite, etc



Defining , in collaboration with Intel,  API for VF representors  and  control 
API to configure VF from host.



Participating in new devargs  syntax development, which is bus agnostic  and   
separate generic and driver specific properties.



rte_flow support for bonding PMD.



TSO support for TAP PMD.



mlx5 and mlx4 support  new dynamic memory allocation suggested by the series: 
http://dpdk.org/ml/archives/dev/2018-March/092070.html.



mlx5 and mlx4 support non-contiguous memory pools.



mlx5 performance improvements:

   - stride RQ to reach  100G with 64B message

   - OOB performance




---
Olga Shern
SW Director DPDK
Mellanox Technologies, Raanana, Israel



Re: [dpdk-dev] [PATCH] net/octeontx: use the new offload APIs

2018-03-11 Thread santosh

On Friday 09 March 2018 12:37 AM, Pavan Nikhilesh wrote:
> Use the new Rx/Tx offload APIs and remove the old style offloads.
>
> Signed-off-by: Pavan Nikhilesh 
> ---

Acked-by: Santosh Shukla 



Re: [dpdk-dev] [PATCH v3 18/18] net/axgbe: moved license headers to SPDX format

2018-03-11 Thread Stephen Hemminger
On Fri,  9 Mar 2018 03:42:34 -0500
Ravi Kumar  wrote:

> +..  Copyright (c) 2018 Advanced Micro Devices, Inc. All rights reserved.
> +SPDX-License-Identifier: BSD-3-Clause
>  

By convention SPDX tag is always first line of the file.


Re: [dpdk-dev] [PATCH v5 6/6] eal: ignore messages until init is complete

2018-03-11 Thread Tan, Jianfeng


On 3/8/2018 12:56 AM, Anatoly Burakov wrote:

If we receive messages that don't have a callback registered for
them, and we haven't finished initialization yet, it can be reasonably
inferred that we shouldn't have gotten the message in the first
place. Therefore, send requester a special message telling them to
ignore response to this request, as if this process wasn't there.


Two nits:
- I think we could add a note in commit log that "it only applies to 
primary to secondary request".
- To make the change more simple, maybe we can just put a declaration of 
function mp_send.



Signed-off-by: Anatoly Burakov 


Acked-by: Jianfeng Tan 


---

Notes:
 v5: add this patch
 
 No changes in mp_send and send_msg - just code move.


  lib/librte_eal/common/eal_common_proc.c | 280 +---
  1 file changed, 151 insertions(+), 129 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_proc.c 
b/lib/librte_eal/common/eal_common_proc.c
index fe27d68..1ea6045 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -52,6 +52,7 @@ enum mp_type {
MP_MSG, /* Share message with peers, will not block */
MP_REQ, /* Request for information, Will block for a reply */
MP_REP, /* Response to previously-received request */
+   MP_IGN, /* Response telling requester to ignore this response */
  };
  
  struct mp_msg_internal {

@@ -205,6 +206,130 @@ rte_mp_action_unregister(const char *name)
free(entry);
  }
  
+/**

+ * Return -1, as fail to send message and it's caused by the local side.
+ * Return 0, as fail to send message and it's caused by the remote side.
+ * Return 1, as succeed to send message.
+ *
+ */
+static int
+send_msg(const char *dst_path, struct rte_mp_msg *msg, int type)
+{
+   int snd;
+   struct iovec iov;
+   struct msghdr msgh;
+   struct cmsghdr *cmsg;
+   struct sockaddr_un dst;
+   struct mp_msg_internal m;
+   int fd_size = msg->num_fds * sizeof(int);
+   char control[CMSG_SPACE(fd_size)];
+
+   m.type = type;
+   memcpy(&m.msg, msg, sizeof(*msg));
+
+   memset(&dst, 0, sizeof(dst));
+   dst.sun_family = AF_UNIX;
+   snprintf(dst.sun_path, sizeof(dst.sun_path), "%s", dst_path);
+
+   memset(&msgh, 0, sizeof(msgh));
+   memset(control, 0, sizeof(control));
+
+   iov.iov_base = &m;
+   iov.iov_len = sizeof(m) - sizeof(msg->fds);
+
+   msgh.msg_name = &dst;
+   msgh.msg_namelen = sizeof(dst);
+   msgh.msg_iov = &iov;
+   msgh.msg_iovlen = 1;
+   msgh.msg_control = control;
+   msgh.msg_controllen = sizeof(control);
+
+   cmsg = CMSG_FIRSTHDR(&msgh);
+   cmsg->cmsg_len = CMSG_LEN(fd_size);
+   cmsg->cmsg_level = SOL_SOCKET;
+   cmsg->cmsg_type = SCM_RIGHTS;
+   memcpy(CMSG_DATA(cmsg), msg->fds, fd_size);
+
+   do {
+   snd = sendmsg(mp_fd, &msgh, 0);
+   } while (snd < 0 && errno == EINTR);
+
+   if (snd < 0) {
+   rte_errno = errno;
+   /* Check if it caused by peer process exits */
+   if (errno == ECONNREFUSED &&
+   rte_eal_process_type() == RTE_PROC_PRIMARY) {
+   unlink(dst_path);
+   return 0;
+   }
+   if (errno == ENOBUFS) {
+   RTE_LOG(ERR, EAL, "Peer cannot receive message %s\n",
+   dst_path);
+   return 0;
+   }
+   RTE_LOG(ERR, EAL, "failed to send to (%s) due to %s\n",
+   dst_path, strerror(errno));
+   return -1;
+   }
+
+   return 1;
+}
+
+static int
+mp_send(struct rte_mp_msg *msg, const char *peer, int type)
+{
+   int dir_fd, ret = 0;
+   DIR *mp_dir;
+   struct dirent *ent;
+
+   if (!peer && (rte_eal_process_type() == RTE_PROC_SECONDARY))
+   peer = eal_mp_socket_path();
+
+   if (peer) {
+   if (send_msg(peer, msg, type) < 0)
+   return -1;
+   else
+   return 0;
+   }
+
+   /* broadcast to all secondary processes */
+   mp_dir = opendir(mp_dir_path);
+   if (!mp_dir) {
+   RTE_LOG(ERR, EAL, "Unable to open directory %s\n",
+   mp_dir_path);
+   rte_errno = errno;
+   return -1;
+   }
+
+   dir_fd = dirfd(mp_dir);
+   /* lock the directory to prevent processes spinning up while we send */
+   if (flock(dir_fd, LOCK_EX)) {
+   RTE_LOG(ERR, EAL, "Unable to lock directory %s\n",
+   mp_dir_path);
+   rte_errno = errno;
+   closedir(mp_dir);
+   return -1;
+   }
+
+   while ((ent = readdir(mp_dir))) {
+   char path[PATH_MAX];
+
+   if (fnmatch(mp_filter, ent->d_name, 0) != 0)
+   

Re: [dpdk-dev] [PATCH] vdev: fix name comparison in find_vdev

2018-03-11 Thread Tan, Jianfeng

Hi,


On 3/10/2018 6:28 AM, Nachiketa Prachanda wrote:

Use strcmp to compare device names as the strncmp in original code
causes find_vdev to return -EEXIST  for names that are prefix
of another. The creation of interfaces fails unpredictably based
on the order of their creation. An easy way hit this bug is to create
eth_vhost1 after eth_vhost11.

Signed-off-by: Nachiketa Prachanda 


Except a nit bellow:

Acked-by: Jianfeng Tan 



Fixes: dda987315ca2 ("vdev: make virtual bus use its device struct")
Cc: jblu...@infradead.org
Cc: sta...@dpdk.org


AFAIK, we usually put above line before Signed-off-by.


---
  drivers/bus/vdev/vdev.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index e4bc724..7eae319 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -188,7 +188,7 @@ find_vdev(const char *name)
TAILQ_FOREACH(dev, &vdev_device_list, next) {
const char *devname = rte_vdev_device_name(dev);
  
-		if (!strncmp(devname, name, strlen(name)))

+   if (!strcmp(devname, name))
return dev;
}
  




Re: [dpdk-dev] [PATCH v3] doc: add driver limitation for vhost dequeue zero copy

2018-03-11 Thread Tan, Jianfeng



On 3/9/2018 6:07 PM, Junjie Chen wrote:

In vhost-switch example, when binding nic to vfio-pci with iommu enabled,
dequeue zero copy cannot work in VM2NIC mode due to no iommu dma mapping
is setup for guest memory currently.

Signed-off-by: Junjie Chen 
---
Changes in V3:
- update limitation to iommu
Changes in V2:
- add doc in vhost lib

  doc/guides/prog_guide/vhost_lib.rst | 5 +
  doc/guides/sample_app_ug/vhost.rst  | 5 -
  2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst 
b/doc/guides/prog_guide/vhost_lib.rst
index 18227b6..06d214f 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -83,6 +83,11 @@ The following is an overview of some key Vhost API functions:
of those segments, thus the fewer the segments, the quicker we will get
the mapping. NOTE: we may speed it by using tree searching in future.
  
+* zero copy does not work when using driver with iommu mode currently, this


Considering FreeBSD driver nic_uio does not support iommu, we can make 
"driver" more explicitly. For example,


"driver with iommu mode" -> "vfio-pci with iommu mode"


+  is because we don't setup iommu dma mapping for guest memory.



For example, when you bind device to vfio-pci driver, you need to set driver to 
work in noiommu mode.


Reword the above sentence a little bit:
"If vfio-pci is a must in your case, insert vfio-pci kernel module in 
noiommu mode."



+
- ``RTE_VHOST_USER_IOMMU_SUPPORT``
  
  IOMMU support will be enabled when this flag is set. It is disabled by

diff --git a/doc/guides/sample_app_ug/vhost.rst 
b/doc/guides/sample_app_ug/vhost.rst
index a4bdc6a..f0bb169 100644
--- a/doc/guides/sample_app_ug/vhost.rst
+++ b/doc/guides/sample_app_ug/vhost.rst
@@ -147,7 +147,10 @@ retries on an RX burst, it takes effect only when rx retry 
is enabled. The
  default value is 15.
  
  **--dequeue-zero-copy**

-Dequeue zero copy will be enabled when this option is given.
+Dequeue zero copy will be enabled when this option is given. it is worth to
+note that if NIC is binded to driver with iommu enabled, dequeue zero copy
+cannot work at VM2NIC mode (vm2vm=0) due to currently we don't setup iommu
+dma mapping for guest memory.
  
  **--vlan-strip 0|1**

  VLAN strip option is removed, because different NICs have different behaviors




Re: [dpdk-dev] [PATCH 2/4] bus/vdev: bus scan by multi-process channel

2018-03-11 Thread Tan, Jianfeng



On 3/7/2018 10:00 PM, Burakov, Anatoly wrote:

On 04-Mar-18 3:30 PM, Jianfeng Tan wrote:

To scan the vdevs in primary, we send request to primary process
to obtain the names for vdevs.

Only the name is shared from the primary. In probe(), the device
driver is supposed to locate (or request more) the detail
information from the primary.

Signed-off-by: Jianfeng Tan 
---


General note - you probably want to syncrhonize access to the tailq. 
Multiple secondaries may initialize, a vdev hotplug event may be in 
process, etc.




Make sense, will change it in next version.

Thanks,
Jianfeng


Re: [dpdk-dev] [PATCH 3/4] drivers/net: do not allocate rte_eth_dev_data privately

2018-03-11 Thread Tan, Jianfeng



On 3/7/2018 2:10 PM, Matan Azrad wrote:



Yes, we are removing all private rte_eth_dev_data. If I missed some
device, welcome to point out.

net/ring



What is about next PCI device?

net/cxgbe


Will change these two in next version. Thank you, Matan.




Re: [dpdk-dev] [PATCH] eventdev: add timestamping to received packets

2018-03-11 Thread Jerin Jacob
-Original Message-
> Date: Tue, 20 Feb 2018 06:30:54 -0500
> From: Nikhil Rao 
> To: jerin.ja...@caviumnetworks.com
> CC: dev@dpdk.org, gage.e...@intel.com, narender.vang...@intel.com,
>  abhinandan.guj...@intel.com, nikhil@intel.com
> Subject: [PATCH] eventdev: add timestamping to received packets
> X-Mailer: git-send-email 1.8.3.1
> 
> Add timestamp to received packets before enqueuing to
> event device if the timestamp is not already set. Adding
> timestamp in the Rx adapter avoids additional latency due
> to the event device.
> 
> Signed-off-by: Nikhil Rao 

Acked-by: Jerin Jacob 



Re: [dpdk-dev] [PATCH] net/dpaa: prefer defines for link speed values

2018-03-11 Thread Shreyansh Jain
> -Original Message-
> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> Sent: Friday, March 9, 2018 11:29 PM
> To: Hemant Agrawal ; Shreyansh Jain
> 
> Cc: dev@dpdk.org; Ferruh Yigit 
> Subject: [PATCH] net/dpaa: prefer defines for link speed values
> 
> Use existing defines instead of hardcoded vales.
> 
> Signed-off-by: Ferruh Yigit 
> ---

Acked-by: Shreyansh Jain 


Re: [dpdk-dev] [PATCH v2 1/2] eventdev: add device stop flush callback

2018-03-11 Thread Jerin Jacob
-Original Message-
> When an event device is stopped, it drains all event queues. These events
> may contain pointers, so to prevent memory leaks eventdev now supports a
> user-provided flush callback that is called during the queue drain process.
> This callback is stored in process memory, so the callback must be
> registered by any process that may call rte_event_dev_stop().
> 
> This commit also clarifies the behavior of rte_event_dev_stop().
> 
> This follows this mailing list discussion:
> http://dpdk.org/ml/archives/dev/2018-January/087484.html
> 
> Signed-off-by: Gage Eads 
> ---
> v2: allow a NULL callback pointer to unregister the callback
> 
>  lib/librte_eventdev/rte_eventdev.c   | 17 +
>  lib/librte_eventdev/rte_eventdev.h   | 55 
> +++-
>  lib/librte_eventdev/rte_eventdev_version.map |  6 +++
>  3 files changed, 76 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.c 
> b/lib/librte_eventdev/rte_eventdev.c
> index 851a119..1aacb7b 100644
> --- a/lib/librte_eventdev/rte_eventdev.c
> +++ b/lib/librte_eventdev/rte_eventdev.c
> @@ -1123,6 +1123,23 @@ rte_event_dev_start(uint8_t dev_id)
>   return 0;
>  }
>  
> +typedef void (*eventdev_stop_flush_t)(uint8_t dev_id, struct rte_event event,
> + void *arg);
> +/**< Callback function called during rte_event_dev_stop(), invoked once per
> + * flushed event.
> + */
> +
>  #define RTE_EVENTDEV_NAME_MAX_LEN(64)
>  /**< @internal Max length of name of event PMD */
>  
> @@ -1176,6 +1194,11 @@ struct rte_eventdev {
>   event_dequeue_burst_t dequeue_burst;
>   /**< Pointer to PMD dequeue burst function. */
>  
> + eventdev_stop_flush_t dev_stop_flush;
> + /**< Optional, user-provided event flush function */
> + void *dev_stop_flush_arg;
> + /**< User-provided argument for event flush function */
> +

I think, we can move this additions to the internal rte_eventdev_data 
structure. Reasons are
1) Changes to "struct rte_eventdev" would call for ABI change
2) We can keep "struct rte_eventdev" only for fast path functions,
slow path functions can have additional redirection.

>   struct rte_eventdev_data *data;
>   /**< Pointer to device data */
>   const struct rte_eventdev_ops *dev_ops;
> @@ -1822,6 +1845,34 @@ rte_event_dev_xstats_reset(uint8_t dev_id,
>   */
>  int rte_event_dev_selftest(uint8_t dev_id);
>  
> +/**
> + * Registers a callback function to be invoked during rte_event_dev_stop() 
> for
> + * each flushed event. This function can be used to properly dispose of 
> queued
> + * events, for example events containing memory pointers.
> + *
> + * The callback function is only registered for the calling process. The
> + * callback function must be registered in every process that can call
> + * rte_event_dev_stop().
> + *
> + * To unregister a callback, call this function with a NULL callback pointer.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param callback
> + *   Callback function invoked once per flushed event.
> + * @param userdata
> + *   Argument supplied to callback.
> + *
> + * @return
> + *  - 0 on success.
> + *  - -EINVAL if *dev_id* is invalid
> + *
> + * @see rte_event_dev_stop()
> + */
> +int
> +rte_event_dev_stop_flush_callback_register(uint8_t dev_id,
> + eventdev_stop_flush_t callback, void *userdata);
> +
IMO, It would be better if we place this function near to rte_event_dev_stop().

Other than above minor changes, It looks good to me.


Re: [dpdk-dev] [PATCH v1 1/9] mempool: add op to calculate memory size to be allocated

2018-03-11 Thread Andrew Rybchenko

On 03/11/2018 03:51 PM, santosh wrote:

Hi Andrew,


On Saturday 10 March 2018 09:09 PM, Andrew Rybchenko wrote:

Size of memory chunk required to populate mempool objects depends
on how objects are stored in the memory. Different mempool drivers
may have different requirements and a new operation allows to
calculate memory size in accordance with driver requirements and
advertise requirements on minimum memory chunk size and alignment
in a generic way.

Bump ABI version since the patch breaks it.

Signed-off-by: Andrew Rybchenko 
---
RFCv2 -> v1:
  - move default calc_mem_size callback to rte_mempool_ops_default.c
  - add ABI changes to release notes
  - name default callback consistently: rte_mempool_op__default()
  - bump ABI version since it is the first patch which breaks ABI
  - describe default callback behaviour in details
  - avoid introduction of internal function to cope with depration
(keep it to deprecation patch)
  - move cache-line or page boundary chunk alignment to default callback
  - highlight that min_chunk_size and align parameters are output only


[...]


diff --git a/lib/librte_mempool/rte_mempool_ops_default.c 
b/lib/librte_mempool/rte_mempool_ops_default.c
new file mode 100644
index 000..57fe79b
--- /dev/null
+++ b/lib/librte_mempool/rte_mempool_ops_default.c
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016 6WIND S.A.
+ * Copyright(c) 2018 Solarflare Communications Inc.
+ */
+
+#include 
+
+ssize_t
+rte_mempool_op_calc_mem_size_default(const struct rte_mempool *mp,
+uint32_t obj_num, uint32_t pg_shift,
+size_t *min_chunk_size, size_t *align)
+{
+   unsigned int mp_flags;
+   int ret;
+   size_t total_elt_sz;
+   size_t mem_size;
+
+   /* Get mempool capabilities */
+   mp_flags = 0;
+   ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
+   if ((ret < 0) && (ret != -ENOTSUP))
+   return ret;
+
+   total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
+
+   mem_size = rte_mempool_xmem_size(obj_num, total_elt_sz, pg_shift,
+mp->flags | mp_flags);
+

Looks ok to me except a nit:
(mp->flags | mp_flags) style expression is to differentiate that
mp_flags holds driver specific flag like BLK_ALIGN and mp->flags
has appl specific flags.. is it so? If not then why not simply
do like:
mp->flags |= mp_flags.


In fact it does not mater a lot since the code is removed in the patch 3.
Here it is required just for consistency. Also, mp argument is a const
which will not allow to change its members.