Re: [dpdk-dev] [PATCH 2/2] net/mlx5: add Rx and Tx tuning parameters
On Sun, Apr 29, 2018 at 09:03:08PM +0300, Shahaf Shuler wrote: > A new ethdev API was exposed by > commit 3be82f5cc5e3 ("ethdev: support PMD-tuned Tx/Rx parameters") > > Enabling the PMD to provide default parameters in case no strict request > from application in order to improve the out of the box experience. > > While the current API lacks the means for the PMD to provide the best > possible value, providing the best default the PMD can guess. > The values are based on Mellanox performance report and depends on the > underlying NIC capabilities. > > Cc: ere...@mellanox.com > Cc: am...@mellanox.com > Cc: ol...@mellanox.com > > Signed-off-by: Shahaf Shuler > --- > drivers/net/mlx5/mlx5_ethdev.c | 51 > ++ > 1 file changed, 51 insertions(+) > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c > index 588d4ba627..78354922b0 100644 > --- a/drivers/net/mlx5/mlx5_ethdev.c > +++ b/drivers/net/mlx5/mlx5_ethdev.c > @@ -417,6 +417,56 @@ mlx5_dev_configure(struct rte_eth_dev *dev) > } > > /** > + * Sets default tuning parameters. > + * > + * @param dev > + * Pointer to Ethernet device. > + * @param[out] info > + * Info structure output buffer. > + */ > +static void > +mlx5_set_default_params(struct rte_eth_dev *dev, struct rte_eth_dev_info > *info) > +{ > + struct priv *priv = dev->data->dev_private; > + > + if (priv->link_speed_capa & ETH_LINK_SPEED_100G) { > + if (dev->data->nb_rx_queues <= 2 && > + dev->data->nb_tx_queues <= 2) { > + /* Minimum CPU utilization. */ > + info->default_rxportconf.ring_size = 256; > + info->default_txportconf.ring_size = 256; > + /* Don't care as queue num is set. */ > + info->default_rxportconf.nb_queues = 0; > + info->default_txportconf.nb_queues = 0; > + } else { > + /* Max Throughput. */ > + info->default_rxportconf.ring_size = 2048; > + info->default_txportconf.ring_size = 2048; > + info->default_rxportconf.nb_queues = 16; > + info->default_txportconf.nb_queues = 16; > + } > + } else { > + if (dev->data->nb_rx_queues <= 2 && > + dev->data->nb_tx_queues <= 2) { > + /* Minimum CPU utilization. */ > + info->default_rxportconf.ring_size = 256; > + info->default_txportconf.ring_size = 256; > + /* Don't care as queue num is set. */ > + info->default_rxportconf.nb_queues = 0; > + info->default_txportconf.nb_queues = 0; > + } else { > + /* Max Throughput. */ > + info->default_rxportconf.ring_size = 4096; > + info->default_txportconf.ring_size = 4096; > + info->default_rxportconf.nb_queues = 8; > + info->default_txportconf.nb_queues = 8; > + } > + } > + info->default_rxportconf.burst_size = 64; > + info->default_txportconf.burst_size = 64; This can be fully re-written to simplify and ease the maintenance, default values i.e. "Minimum CPU utilization" are duplicated, this can be used as default values and just tweak in case the amount of queues are different from 2 according to the link speed. > +} > + > +/** > * DPDK callback to get information about the device. > * > * @param dev > @@ -458,6 +508,7 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct > rte_eth_dev_info *info) > info->hash_key_size = rss_hash_default_key_len; > info->speed_capa = priv->link_speed_capa; > info->flow_type_rss_offloads = ~MLX5_RSS_HF_MASK; > + mlx5_set_default_params(dev, info); > } > > /** > -- > 2.12.0 Thanks, -- Nélio Laranjeiro 6WIND
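To make the suggestion above concrete, here is a minimal sketch of the restructured function as it could sit in mlx5_ethdev.c: the low-queue-count ("minimum CPU utilization") values become the unconditional defaults, and only the max-throughput case is tweaked according to the link speed. Struct and field names are taken from the patch itself; this is an illustration of the review comment, not the final code.

static void
mlx5_set_default_params(struct rte_eth_dev *dev, struct rte_eth_dev_info *info)
{
	struct priv *priv = dev->data->dev_private;

	/* Minimum CPU utilization defaults, used when <= 2 queues are set. */
	info->default_rxportconf.ring_size = 256;
	info->default_txportconf.ring_size = 256;
	/* Don't care as queue num is already set. */
	info->default_rxportconf.nb_queues = 0;
	info->default_txportconf.nb_queues = 0;

	if (dev->data->nb_rx_queues > 2 || dev->data->nb_tx_queues > 2) {
		/* Max throughput profile, sized by link speed. */
		if (priv->link_speed_capa & ETH_LINK_SPEED_100G) {
			info->default_rxportconf.ring_size = 2048;
			info->default_txportconf.ring_size = 2048;
			info->default_rxportconf.nb_queues = 16;
			info->default_txportconf.nb_queues = 16;
		} else {
			info->default_rxportconf.ring_size = 4096;
			info->default_txportconf.ring_size = 4096;
			info->default_rxportconf.nb_queues = 8;
			info->default_txportconf.nb_queues = 8;
		}
	}
	info->default_rxportconf.burst_size = 64;
	info->default_txportconf.burst_size = 64;
}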
Re: [dpdk-dev] [PATCH 1/2] net/mlx5: fix ethtool link setting call order
On Sun, Apr 29, 2018 at 09:03:07PM +0300, Shahaf Shuler wrote: > According to ethtool_link_setting API recommendation ETHTOOL_GLINKSETTINGS > should be called before ETHTOOL_GSET as the later one deprecated. > > Fixes: f47ba80080ab ("net/mlx5: remove kernel version check") > Cc: nelio.laranje...@6wind.com > > Signed-off-by: Shahaf Shuler Acked-by: Nelio Laranjeiro > --- > drivers/net/mlx5/mlx5_ethdev.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c > index 746b94f734..588d4ba627 100644 > --- a/drivers/net/mlx5/mlx5_ethdev.c > +++ b/drivers/net/mlx5/mlx5_ethdev.c > @@ -697,9 +697,9 @@ mlx5_link_update(struct rte_eth_dev *dev, int > wait_to_complete) > time_t start_time = time(NULL); > > do { > - ret = mlx5_link_update_unlocked_gset(dev, &dev_link); > + ret = mlx5_link_update_unlocked_gs(dev, &dev_link); > if (ret) > - ret = mlx5_link_update_unlocked_gs(dev, &dev_link); > + ret = mlx5_link_update_unlocked_gset(dev, &dev_link); > if (ret == 0) > break; > /* Handle wait to complete situation. */ > -- > 2.12.0 > -- Nélio Laranjeiro 6WIND
Re: [dpdk-dev] [PATCH 0/4] support for write combining
Hello Bruce, It should work, because the decision about which kind of mapping to use is made in patch 3 based on the PMD's request. ENA uses only one BAR in WC mode and the two others without caching, so not performing the remap in igb_uio should not spoil anything. I added patch 1 because this variable is also provided outside of DPDK to generic Linux functions. I cannot test it with all drivers, therefore I made it configurable. But in general a combination of WC and non-WC PMDs should work, because of patch 3. Also, if WC is enabled by a PMD (as in patch 4), not all BARs will be mapped that way, but only those which have the prefetchable flag enabled. Best regards, Rafal Kozik
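As background for how such a per-BAR choice can be made, below is a generic sketch (not the actual patch code) of selecting between the uncached and write-combined sysfs resource files. On architectures that support WC mappings, the kernel only creates resourceN_wc for prefetchable BARs, which is why only those BARs can be WC-mapped and the rest fall back to the plain resourceN file.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

static void *
map_bar(const char *sysfs_dev_dir, int bar, size_t len, int want_wc,
	int prefetchable)
{
	char path[256];
	void *addr;
	int fd;

	/* resourceN_wc only exists for prefetchable BARs, fall back otherwise */
	if (want_wc && prefetchable)
		snprintf(path, sizeof(path), "%s/resource%d_wc", sysfs_dev_dir, bar);
	else
		snprintf(path, sizeof(path), "%s/resource%d", sysfs_dev_dir, bar);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);
	return addr == MAP_FAILED ? NULL : addr;
}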
Re: [dpdk-dev] [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles
On 28-Apr-18 10:38 AM, Andrew Rybchenko wrote: On 04/25/2018 01:36 PM, Anatoly Burakov wrote: The original implementation used flock() locks, but was later switched to using fcntl() locks for page locking, because fcntl() locks allow locking parts of a file, which is useful for single-file segments mode, where locking the entire file isn't as useful because we still need to grow and shrink it. However, according to fcntl()'s Ubuntu manpage [1], semantics of fcntl() locks have a giant oversight: This interface follows the completely stupid semantics of System V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process. This semantic means that applications must be aware of any files that a subroutine library may access. Basically, closing *any* fd with an fcntl() lock (which we do because we don't want to leak fd's) will drop the lock completely. So, in this commit, we will be reverting back to using flock() locks everywhere. However, that still leaves the problem of locking parts of a memseg list file in single file segments mode, and we will be solving it with creating separate lock files per each page, and tracking those with flock(). We will also be removing all of this tailq business and replacing it with a simple array - saving a few bytes is not worth the extra hassle of dealing with pointers and potential memory allocation failures. Also, remove the tailq lock since it is not needed - these fd lists are per-process, and within a given process, it is always only one thread handling access to hugetlbfs. So, first one to allocate a segment will create a lockfile, and put a shared lock on it. When we're shrinking the page file, we will be trying to take out a write lock on that lockfile, which would fail if any other process is holding onto the lockfile as well. This way, we can know if we can shrink the segment file. Also, if no other locks are found in the lock list for a given memseg list, the memseg list fd is automatically closed. One other thing to note is, according to flock() Ubuntu manpage [2], upgrading the lock from shared to exclusive is implemented by dropping and reacquiring the lock, which is not atomic and thus would have created race conditions. So, on attempting to perform operations in hugetlbfs, we will take out a writelock on hugetlbfs directory, so that only one process could perform hugetlbfs operations concurrently. [1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html [2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists") Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime") Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime") Fixes: 2a04139f66b4 ("eal: add single file segments option") Cc: anatoly.bura...@intel.com Signed-off-by: Anatoly Burakov Acked-by: Bruce Richardson We have a problem with the changeset if EAL option -m or --socket-mem is used. EAL initialization hangs just after EAL: Probing VFIO support... 
strace points to flock(7, LOCK_EX List of file descriptors: # ls /proc/25452/fd -l total 0 lrwx-- 1 root root 64 Apr 28 10:34 0 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 1 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:32 2 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 3 -> /run/.rte_config lrwx-- 1 root root 64 Apr 28 10:34 4 -> socket:[154166] lrwx-- 1 root root 64 Apr 28 10:34 5 -> socket:[154158] lr-x-- 1 root root 64 Apr 28 10:34 6 -> /dev/hugepages lr-x-- 1 root root 64 Apr 28 10:34 7 -> /dev/hugepages I guess the problem is that there are two /dev/hugepages and it hangs on the second. Ideas how to solve it? Andrew. Seeing similar reports from validation. I'm looking into it. -- Thanks, Anatoly
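For readers unfamiliar with the fcntl() semantics quoted above, the following standalone C program (assuming standard POSIX record-lock behaviour; it is unrelated to the DPDK code itself) demonstrates how closing an unrelated descriptor of the same file silently drops the lock:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	struct flock fl;
	int fd1 = open("/tmp/lock_demo", O_RDWR | O_CREAT, 0600);
	int fd2 = open("/tmp/lock_demo", O_RDWR);

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET; /* l_start/l_len == 0: lock the whole file */
	fcntl(fd1, F_SETLK, &fl);

	close(fd2); /* this silently releases the lock taken via fd1 */

	if (fork() == 0) {
		/* the child is a separate process, so a still-held write lock
		 * would make this F_SETLK fail with EAGAIN/EACCES */
		int fd = open("/tmp/lock_demo", O_RDWR);

		fl.l_type = F_WRLCK;
		if (fcntl(fd, F_SETLK, &fl) == 0)
			printf("child got the write lock: parent's lock was dropped\n");
		_exit(0);
	}
	wait(NULL);
	return 0;
}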
Re: [dpdk-dev] [PATCH 0/4] support for write combining
On Mon, Apr 30, 2018 at 10:07:07AM +0200, Rafał Kozik wrote: > Hello Bruce, > > It should work because decision about kind of mapping is made in patch > 3 based on PMD request. > > ENA use only one BAR in wc mode and two other without caching, > therefore not making remap in igb_uio rather not spoil anything. > > I added patch 1 because this variable is provided also outside the > DPDK to Linux generic functions. I cannot test it with all drivers, > therefore I make it configurable. > > But general combination of wc and not wc PMD should work, because of patch 3. > Also if WC is enabled by PMD (like in patch 4) not all BARs will by > mapped, but only those which has prefetchable flag enabled. > Sounds good, so in that case I've no objection to the patch. Acked-by: Bruce Richardson
Re: [dpdk-dev] [PATCH v4 3/9] mem: fix potential double close
On Fri, Apr 27, 2018 at 06:07:04PM +0100, Anatoly Burakov wrote: > We were closing descriptor before checking if mapping has > failed, but if it did, we did a second close afterwards. Fix > it by moving closing descriptor to after we've done all error > checks. > > Coverity issue: 272560 > > Fixes: 2a04139f66b4 ("eal: add single file segments option") > Cc: anatoly.bura...@intel.com > > Signed-off-by: Anatoly Burakov > --- > > Notes: > v4: > - Moved fd close to until after all error checks are done > Acked-by: Bruce Richardson
[dpdk-dev] [PATCH 1/2] examples/vhost: fix header copy to discontiguous desc buffer
In the loop to copy virtio-net header to the descriptor buffer, destination pointer was incremented instead of the source pointer. Coverity issue: 277240 Fixes: 82c93a567d3b ("examples/vhost: move to safe GPA translation API") Signed-off-by: Maxime Coquelin --- examples/vhost/virtio_net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/vhost/virtio_net.c b/examples/vhost/virtio_net.c index 5a965a346..8ea6b36d5 100644 --- a/examples/vhost/virtio_net.c +++ b/examples/vhost/virtio_net.c @@ -103,7 +103,7 @@ enqueue_pkt(struct vhost_dev *dev, struct rte_vhost_vring *vr, remain -= len; guest_addr += len; - dst += len; + src += len; } desc_chunck_len = desc->len - dev->hdr_len; -- 2.14.3
[dpdk-dev] [PATCH 0/2] Fix enqueueing vnet header in discontiguous desc buffer
This series fixes copying the virtio-net header to a discontiguous descriptor buffer in VA space. The issue was spotted by Coverity for examples/vhost, but the same issue was present in the vhost-user library. Maxime Coquelin (2): examples/vhost: fix header copy to discontiguous desc buffer vhost: fix header copy to discontiguous desc buffer examples/vhost/virtio_net.c | 2 +- lib/librte_vhost/virtio_net.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) -- 2.14.3
[dpdk-dev] [PATCH 2/2] vhost: fix header copy to discontiguous desc buffer
In the loop to copy virtio-net header to the descriptor buffer, destination pointer was incremented instead of the source pointer. Fixes: fb3815cc614d ("vhost: handle virtually non-contiguous buffers in Rx-mrg") Fixes: 6727f5a739b6 ("vhost: handle virtually non-contiguous buffers in Rx") Cc: sta...@dpdk.org Signed-off-by: Maxime Coquelin --- lib/librte_vhost/virtio_net.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 5fdd4172b..eed6b0227 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -277,7 +277,7 @@ copy_mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, vhost_log_write(dev, guest_addr, len); remain -= len; guest_addr += len; - dst += len; + src += len; } } @@ -771,7 +771,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq, remain -= len; guest_addr += len; - dst += len; + src += len; } } else { PRINT_PACKET(dev, (uintptr_t)hdr_addr, -- 2.14.3
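The pattern being fixed in both patches is the chunk-by-chunk copy of one source buffer into a destination that is not virtually contiguous. A self-contained sketch of that pattern (with hypothetical names, not the vhost code) shows why the source pointer is the one that has to advance:

#include <stdint.h>
#include <string.h>

struct chunk { void *addr; size_t len; };

/* Copy 'hdr' of 'hdr_len' bytes across an array of host-virtually
 * discontiguous destination chunks. The destination is recomputed for
 * every chunk; if the source pointer is not advanced instead, the same
 * leading bytes of the header are written into every chunk.
 */
static void
copy_hdr_chunked(const void *hdr, size_t hdr_len,
		 const struct chunk *chunks, size_t n)
{
	const uint8_t *src = hdr;
	size_t remain = hdr_len;
	size_t i;

	for (i = 0; i < n && remain > 0; i++) {
		size_t len = remain < chunks[i].len ? remain : chunks[i].len;

		memcpy(chunks[i].addr, src, len);
		remain -= len;
		src += len;	/* this is what the patches fix: advance src, not dst */
	}
}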
[dpdk-dev] [PATCH] eal: check if hugedir write lock is already being held
At hugepage info initialization, EAL takes out a write lock on hugetlbfs directories, and drops it after the memory init is finished. However, in non-legacy mode, if "-m" or "--socket-mem" switches are passed, this leads to a deadlock because EAL tries to allocate pages (and thus take out a write lock on hugedir) while still holding a separate hugedir write lock in EAL. Fix it by checking if write lock in hugepage info is active, and not trying to lock the directory if the hugedir fd is valid. Fixes: 1a7dc2252f28 ("mem: revert to using flock and add per-segment lockfiles") Cc: anatoly.bura...@intel.com Signed-off-by: Anatoly Burakov --- lib/librte_eal/linuxapp/eal/eal_memalloc.c | 71 ++ 1 file changed, 42 insertions(+), 29 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c index 00d7886..360d8f7 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c @@ -666,7 +666,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg) struct alloc_walk_param *wa = arg; struct rte_memseg_list *cur_msl; size_t page_sz; - int cur_idx, start_idx, j, dir_fd; + int cur_idx, start_idx, j, dir_fd = -1; unsigned int msl_idx, need, i; if (msl->page_sz != wa->page_sz) @@ -691,19 +691,24 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg) * because file creation and locking operations are not atomic, * and we might be the first or the last ones to use a particular page, * so we need to ensure atomicity of every operation. +* +* during init, we already hold a write lock, so don't try to take out +* another one. */ - dir_fd = open(wa->hi->hugedir, O_RDONLY); - if (dir_fd < 0) { - RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", __func__, - wa->hi->hugedir, strerror(errno)); - return -1; - } - /* blocking writelock */ - if (flock(dir_fd, LOCK_EX)) { - RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", __func__, - wa->hi->hugedir, strerror(errno)); - close(dir_fd); - return -1; + if (wa->hi->lock_descriptor == -1) { + dir_fd = open(wa->hi->hugedir, O_RDONLY); + if (dir_fd < 0) { + RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", + __func__, wa->hi->hugedir, strerror(errno)); + return -1; + } + /* blocking writelock */ + if (flock(dir_fd, LOCK_EX)) { + RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", + __func__, wa->hi->hugedir, strerror(errno)); + close(dir_fd); + return -1; + } } for (i = 0; i < need; i++, cur_idx++) { @@ -742,7 +747,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg) if (wa->ms) memset(wa->ms, 0, sizeof(*wa->ms) * wa->n_segs); - close(dir_fd); + if (dir_fd >= 0) + close(dir_fd); return -1; } if (wa->ms) @@ -754,7 +760,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg) wa->segs_allocated = i; if (i > 0) cur_msl->version++; - close(dir_fd); + if (dir_fd >= 0) + close(dir_fd); return 1; } @@ -769,7 +776,7 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg) struct rte_memseg_list *found_msl; struct free_walk_param *wa = arg; uintptr_t start_addr, end_addr; - int msl_idx, seg_idx, ret, dir_fd; + int msl_idx, seg_idx, ret, dir_fd = -1; start_addr = (uintptr_t) msl->base_va; end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz; @@ -788,19 +795,24 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg) * because file creation and locking operations are not atomic, * and we might be the first or the last ones to use a particular page, * so we need to ensure atomicity of every operation. 
+* +* during init, we already hold a write lock, so don't try to take out +* another one. */ - dir_fd = open(wa->hi->hugedir, O_RDONLY); - if (dir_fd < 0) { - RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", __func__, - wa->hi->hugedir, strerror(errno)); - return -1; - } - /* blocking writelock */ - if (flock(dir_fd, LOCK_EX)) { - RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", __func__, - wa->hi->hugedir, s
[dpdk-dev] [PATCH] vhost/crypto: fix incorrect bracket location
Coverity issue: 277232 Coverity issue: 277237 Fixes: 3bb595ecd682 ("vhost/crypto: add request handler") Signed-off-by: Fan Zhang --- lib/librte_vhost/vhost_crypto.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/librte_vhost/vhost_crypto.c b/lib/librte_vhost/vhost_crypto.c index c38eb3bb5..3fa50281c 100644 --- a/lib/librte_vhost/vhost_crypto.c +++ b/lib/librte_vhost/vhost_crypto.c @@ -675,8 +675,8 @@ prepare_sym_cipher_op(struct vhost_crypto *vcrypto, struct rte_crypto_op *op, goto error_exit; } if (unlikely(copy_data(rte_pktmbuf_mtod(m_src, uint8_t *), head, - mem, &desc, cipher->para.src_data_len)) - < 0) { + mem, &desc, cipher->para.src_data_len) + < 0)) { ret = VIRTIO_CRYPTO_BADMSG; goto error_exit; } -- 2.13.6
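Why the bracket location matters: with DPDK's usual definition of unlikely(), the macro's result is always 0 or 1, so comparing it against a negative value can never be true and the error path becomes dead code. A small standalone demonstration:

#include <stdio.h>

#define unlikely(x) __builtin_expect(!!(x), 0)	/* DPDK's usual definition */

static int fails(void) { return -1; }	/* stand-in for copy_data() failing */

int main(void)
{
	/* Misplaced bracket: unlikely() yields 0 or 1, so "< 0" is never
	 * true and the error branch is unreachable. */
	if (unlikely(fails()) < 0)
		printf("never printed\n");

	/* Fixed bracket: the comparison happens before unlikely() sees it. */
	if (unlikely(fails() < 0))
		printf("error path taken as intended\n");

	return 0;
}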
[dpdk-dev] [PATCH] vhost/crypto: fix bracket
Coverity issue: 233232 Coverity issue: 233237 Fixes: 3bb595ecd682 ("vhost/crypto: add request handler") Signed-off-by: Fan Zhang --- lib/librte_vhost/vhost_crypto.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/lib/librte_vhost/vhost_crypto.c b/lib/librte_vhost/vhost_crypto.c index c38eb3bb5..a3bce6379 100644 --- a/lib/librte_vhost/vhost_crypto.c +++ b/lib/librte_vhost/vhost_crypto.c @@ -675,8 +675,7 @@ prepare_sym_cipher_op(struct vhost_crypto *vcrypto, struct rte_crypto_op *op, goto error_exit; } if (unlikely(copy_data(rte_pktmbuf_mtod(m_src, uint8_t *), head, - mem, &desc, cipher->para.src_data_len)) - < 0) { + mem, &desc, cipher->para.src_data_len) < 0)) { ret = VIRTIO_CRYPTO_BADMSG; goto error_exit; } -- 2.13.6
Re: [dpdk-dev] [PATCH] vhost/crypto: fix bracket
On 04/30/2018 12:36 PM, Fan Zhang wrote: Coverity issue: 233232 Coverity issue: 233237 Fixes: 3bb595ecd682 ("vhost/crypto: add request handler") Signed-off-by: Fan Zhang --- lib/librte_vhost/vhost_crypto.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) Reviewed-by: Maxime Coquelin Thanks, Maxime
Re: [dpdk-dev] [PATCH] vhost/crypto: fix incorrect bracket location
On 04/30/2018 12:31 PM, Fan Zhang wrote: Coverity issue: 277232 Coverity issue: 277237 Fixes: 3bb595ecd682 ("vhost/crypto: add request handler") Signed-off-by: Fan Zhang --- lib/librte_vhost/vhost_crypto.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Maxime Coquelin Thanks, Maxime
Re: [dpdk-dev] [dpdk-web] [PATCH v2] update stable releases roadmap
25/04/2018 12:03, Luca Boccassi: > On Wed, 2018-04-25 at 09:33 +0100, Ferruh Yigit wrote: > > On 4/20/2018 4:52 PM, Aaron Conole wrote: > > > Kevin Traynor writes: > > > > On 04/18/2018 02:28 PM, Thomas Monjalon wrote: > > > > > 18/04/2018 14:28, Ferruh Yigit: > > > > > > On 4/18/2018 10:14 AM, Thomas Monjalon wrote: > > > > > > > 18/04/2018 11:05, Ferruh Yigit: > > > > > > > > On 4/11/2018 12:28 AM, Thomas Monjalon wrote: > > > > > > > > > - Typically a new stable release version > > > > > > > > > follows a mainline release > > > > > > > > > - by 1-2 weeks, depending on the test results. > > > > > > > > > + The first stable release (.1) of a branch > > > > > > > > > should follow > > > > > > > > > + its mainline release (.0) by at least two > > > > > > > > > months, > > > > > > > > > + after the first release candidate (-rc1) of > > > > > > > > > the next branch. > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > > > > > > > > What this change suggest? To be able to backport patches > > > > > > > > from rc1? > > > > > > > > > > > > > > Yes, it is the proposal we discussed earlier. > > > > > > > We can wait one week after RC1 to get some validation > > > > > > > confirmation. > > > > > > > Do you agree? > > > > > > > > > > > > This has been discussed in tech-board, what I remember the > > > > > > decision was to wait > > > > > > the release to backport patches into stable tree. > > > > > > > > Any minutes? I couldn't find them > > > > > > > > > It was not so clear to me. > > > > > I thought post-rc1 was acceptable. The idea is to speed-up > > > > > stable releases > > > > > pace, especially first release of a series. > > > > > > > > > > > > > > > > > > I think timing of stable releases and bugfix backports to the > > > > stable > > > > branch are two separate items. > > > > > > > > I do think that bugfix backports to stable should happen on a > > > > regular > > > > basis (e.g. every 2 weeks). Otherwise we are back to the > > > > situation where > > > > if there's a bugfix after a DPDK release, a user like (surprise, > > > > surprise) OVS may not be able to use that DPDK version for ~3 > > > > months. > > > > > > > > Someone who wants to get the latest bugfixes can just take the > > > > latest on > > > > the stable branch and importantly, can have confidence that the > > > > community has officially accepted those patches. If someone > > > > requires > > > > stable to be validated, then they have to wait until the release. > > > > > > +1 - this seems to make the most sense to me. Keep the patches > > > flowing, > > > but don't label/tag it until validation. That serves an additional > > > function: developers know their CC's to stable are being processed. > > > > Are stable trees verified? > > Verification is one issue - so far, Intel and ATT have provided time > and resources to do some regression tests, but only at release time > (before tagging). And it has been a manual process. > It would be great if more companies would step up to help - and even > better if regressions could be automated (nightly job?). > > The other issue is deciding when a patch is "good to go" - until now, > the criteria has been "when it's merged into master". 
> So either that criteria needs to change, and another equally > "authoritative" is decided on, or patches should get reviewed and > merged in master more often and more quickly :-P > > We also have not been looking directly at the the various -next trees, > as things are more "in-flux" there and could be reverted, or clash with > changes from other trees - hence why we merge from master. Yes, backporting from master is definitely the right thing to do. Backporting more regularly would also be an improvement. There will always be the question of the bug-free ideal in stable branches. I agree we need more help to validate the stable branches, but realistically it will never be perfect. So the questions are: - How long must we wait before pushing a backport into the stable tree? - How long must we wait before tagging a stable release? I think it is reasonable to push backports one or two weeks after they are in the master branch, assuming master is tested by the community. If a corner case is found later, it will be fixed with another patch. That's why it's important to wait for a validation period (happening after each release candidate) before tagging a stable release. So, if we are aware of a regression in the master branch which has been backported, we can wait a few more days to fix it. The last thing we need to consider before tagging is the validation of the stable release itself. Are we able to run some non-regression tests on the stable branch if it is ready a few days after an RC1?
Re: [dpdk-dev] [v2, 2/6] eventdev: add APIs and PMD callbacks for crypto adapter
> -Original Message- > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > Sent: Sunday, April 29, 2018 9:44 PM > To: Gujjar, Abhinandan S > Cc: hemant.agra...@nxp.com; akhil.go...@nxp.com; dev@dpdk.org; Vangati, > Narender ; Rao, Nikhil ; > Eads, Gage > Subject: Re: [v2,2/6] eventdev: add APIs and PMD callbacks for crypto adapter > > -Original Message- > > Date: Tue, 24 Apr 2018 18:13:23 +0530 > > From: Abhinandan Gujjar > > To: jerin.ja...@caviumnetworks.com, hemant.agra...@nxp.com, > > akhil.go...@nxp.com, dev@dpdk.org > > CC: narender.vang...@intel.com, abhinandan.guj...@intel.com, > > nikhil@intel.com, gage.e...@intel.com > > Subject: [v2,2/6] eventdev: add APIs and PMD callbacks for crypto > > adapter > > X-Mailer: git-send-email 1.9.1 > > > > Signed-off-by: Abhinandan Gujjar > > --- > > drivers/event/sw/sw_evdev.c| 13 +++ > > lib/librte_eventdev/rte_eventdev.c | 25 + > > lib/librte_eventdev/rte_eventdev.h | 52 + > > lib/librte_eventdev/rte_eventdev_pmd.h | 189 > > + > > 4 files changed, 279 insertions(+) > > > > diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c > > index dcb6551..10f0e1a 100644 > > --- a/drivers/event/sw/sw_evdev.c > > +++ b/drivers/event/sw/sw_evdev.c > > @@ -480,6 +480,17 @@ > > return 0; > > } > > > > +static int > > +sw_crypto_adapter_caps_get(const struct rte_eventdev *dev, > > + const struct rte_cryptodev *cdev, > > + uint32_t *caps) > > +{ > > + RTE_SET_USED(dev); > > + RTE_SET_USED(cdev); > > + *caps = RTE_EVENT_CRYPTO_ADAPTER_SW_CAP; > > + return 0; > > +} > > + > > static void > > sw_info_get(struct rte_eventdev *dev, struct rte_event_dev_info > > *info) { @@ -809,6 +820,8 @@ static int32_t > > sw_sched_service_func(void *args) > > > > .timer_adapter_caps_get = > sw_timer_adapter_caps_get, > > > > + .crypto_adapter_caps_get = > sw_crypto_adapter_caps_get, > > + > > .xstats_get = sw_xstats_get, > > .xstats_get_names = sw_xstats_get_names, > > .xstats_get_by_name = sw_xstats_get_by_name, diff -- > git > > a/lib/librte_eventdev/rte_eventdev.c > > b/lib/librte_eventdev/rte_eventdev.c > > index 3f016f4..7ca9fd1 100644 > > --- a/lib/librte_eventdev/rte_eventdev.c > > +++ b/lib/librte_eventdev/rte_eventdev.c > > @@ -29,6 +29,8 @@ > > #include > > #include > > #include > > +#include > > +#include > > > > #include "rte_eventdev.h" > > #include "rte_eventdev_pmd.h" > > @@ -145,6 +147,29 @@ > > : 0; > > } > > > > +int __rte_experimental > > +rte_event_crypto_adapter_caps_get(uint8_t dev_id, uint8_t cdev_id, > > + uint32_t *caps) > > +{ > > + struct rte_eventdev *dev; > > + struct rte_cryptodev *cdev; > > + > > + RTE_EVENTDEV_VALID_DEVID_OR_ERR_RET(dev_id, -EINVAL); > > + if (!rte_cryptodev_pmd_is_valid_dev(cdev_id)) > > + return -EINVAL; > > + > > + dev = &rte_eventdevs[dev_id]; > > + cdev = rte_cryptodev_pmd_get_dev(cdev_id); > > + > > + if (caps == NULL) > > + return -EINVAL; > > + *caps = 0; > > + > > + return dev->dev_ops->crypto_adapter_caps_get ? 
> > + (*dev->dev_ops->crypto_adapter_caps_get) > > + (dev, cdev, caps) : -ENOTSUP; > > +} > > + > > static inline int > > rte_event_dev_queue_config(struct rte_eventdev *dev, uint8_t > > nb_queues) { diff --git a/lib/librte_eventdev/rte_eventdev.h > > b/lib/librte_eventdev/rte_eventdev.h > > index 8297f24..9822747 100644 > > --- a/lib/librte_eventdev/rte_eventdev.h > > +++ b/lib/librte_eventdev/rte_eventdev.h > > @@ -8,6 +8,8 @@ > > #ifndef _RTE_EVENTDEV_H_ > > #define _RTE_EVENTDEV_H_ > > > > +#include > > + > > /** > > * @file > > * > > @@ -1135,6 +1137,56 @@ struct rte_event { int __rte_experimental > > rte_event_timer_adapter_caps_get(uint8_t dev_id, uint32_t *caps); > > > > +/* Crypto adapter capability bitmap flag */ > > +#define RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_NEW > 0x1 > > +/**< Flag indicates HW is capable of generating events. > > events in RTE_EVENT_OP_NEW enqueue operation Ok > > > + * Cryptodev will send packets to the event device as new events > > + * using an internal event port. > > + */ > > + > > +#define RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD > 0x2 > > +/**< Flag indicates HW is capable of generating events. > > events in RTE_EVENT_OP_FWD enqueue operation Ok > > > + * Cryptodev will send packets to the event device as forwarded event > > + * using an internal event port. > > + */ > > + > > +#define RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_QP_EV_BIND > 0x4 > > +/**< Flag indicates HW is capable of mapping crypto queue pair to > > + * event queue. > > + */ > > + > > +#define RTE_EVENT_CRYPTO_ADAPTER_CAP_SESSION_PRIVATE_DATA 0x8 > > +/**< Flag indicates HW/SW suports a mech
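As a usage illustration of the capability flags discussed above (based on the v2 API in this series, so names may still change), an application could probe whether the adapter will need an EAL service core roughly as follows:

#include <rte_eventdev.h>

/* Illustrative sketch only: returns 1 if packet transfer between the
 * cryptodev and the eventdev has to go through the adapter's SW service
 * function, 0 if the device pair has an internal event port, <0 on error.
 */
static int
crypto_adapter_needs_service_core(uint8_t evdev_id, uint8_t cdev_id)
{
	uint32_t caps = 0;

	if (rte_event_crypto_adapter_caps_get(evdev_id, cdev_id, &caps) < 0)
		return -1;

	if (caps & (RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_NEW |
		    RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD))
		return 0;
	return 1;
}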
Re: [dpdk-dev] [v2, 3/6] eventdev: add crypto adapter implementation
> -Original Message- > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > Sent: Sunday, April 29, 2018 9:53 PM > To: Gujjar, Abhinandan S > Cc: hemant.agra...@nxp.com; akhil.go...@nxp.com; dev@dpdk.org; Vangati, > Narender ; Rao, Nikhil ; > Eads, Gage > Subject: Re: [v2,3/6] eventdev: add crypto adapter implementation > > -Original Message- > > Date: Tue, 24 Apr 2018 18:13:24 +0530 > > From: Abhinandan Gujjar > > To: jerin.ja...@caviumnetworks.com, hemant.agra...@nxp.com, > > akhil.go...@nxp.com, dev@dpdk.org > > CC: narender.vang...@intel.com, abhinandan.guj...@intel.com, > > nikhil@intel.com, gage.e...@intel.com > > Subject: [v2,3/6] eventdev: add crypto adapter implementation > > X-Mailer: git-send-email 1.9.1 > > > > Signed-off-by: Abhinandan Gujjar > > Signed-off-by: Nikhil Rao > > Signed-off-by: Gage Eads > > --- > > + > > +/* Per crypto device information */ > > +struct crypto_device_info { > > + /* Pointer to cryptodev */ > > + struct rte_cryptodev *dev; > > + /* Pointer to queue pair info */ > > + struct crypto_queue_pair_info *qpairs; > > + /* Next queue pair to be processed */ > > + uint16_t next_queue_pair_id; > > + /* Set to indicate cryptodev->eventdev packet > > +* transfer uses a hardware mechanism > > +*/ > > + uint8_t internal_event_port; > > + /* Set to indicate processing has been started */ > > + uint8_t dev_started; > > + /* If num_qpairs > 0, the start callback will > > +* be invoked if not already invoked > > +*/ > > + uint16_t num_qpairs; > > +}; > > Looks like it is used in fastpath, if so add the cache alignment. Sure. > > > + > > +/* Per queue pair information */ > > +struct crypto_queue_pair_info { > > + /* Set to indicate queue pair is enabled */ > > + bool qp_enabled; > > + /* Pointer to hold rte_crypto_ops for batching */ > > + struct rte_crypto_op **op_buffer; > > + /* No of crypto ops accumulated */ > > + uint8_t len; > > +}; > > + > > +static struct rte_event_crypto_adapter **event_crypto_adapter; > > + > > +eca_enq_to_cryptodev(struct rte_event_crypto_adapter *adapter, > > +struct rte_event *ev, unsigned int cnt) { > > + struct rte_event_crypto_adapter_stats *stats = &adapter->crypto_stats; > > + union rte_event_crypto_metadata *m_data = NULL; > > + struct crypto_queue_pair_info *qp_info = NULL; > > + struct rte_crypto_op *crypto_op; > > + unsigned int i, n = 0; > > + uint16_t qp_id = 0, len = 0, ret = 0; > > Please review the explicit '0' assignment. I have initialized only those, which are complained by gcc. I will look at it again. If required, I will initialize them separately. Is that ok? 
> > > + uint8_t cdev_id = 0; > > + > > + stats->event_dequeue_count += cnt; > > + > > + for (i = 0; i < cnt; i++) { > > + crypto_op = ev[i].event_ptr; > > + if (crypto_op == NULL) > > + continue; > > + if (crypto_op->sess_type == RTE_CRYPTO_OP_WITH_SESSION) { > > + m_data = > rte_cryptodev_sym_session_get_private_data( > > + crypto_op->sym->session); > > + if (m_data == NULL) { > > + rte_pktmbuf_free(crypto_op->sym->m_src); > > + rte_crypto_op_free(crypto_op); > > + continue; > > + } > > + > > + cdev_id = m_data->request_info.cdev_id; > > + qp_id = m_data->request_info.queue_pair_id; > > + qp_info = &adapter->cdevs[cdev_id].qpairs[qp_id]; > > + if (qp_info == NULL) { > > + rte_pktmbuf_free(crypto_op->sym->m_src); > > + rte_crypto_op_free(crypto_op); > > + continue; > > + } > > + len = qp_info->len; > > + qp_info->op_buffer[len] = crypto_op; > > + len++; > > + > > +int __rte_experimental > > +rte_event_crypto_adapter_queue_pair_add(uint8_t id, > > + uint8_t cdev_id, > > + int32_t queue_pair_id, > > + const struct rte_event_crypto_queue_pair_conf *conf) > { > > + struct rte_event_crypto_adapter *adapter; > > + struct rte_eventdev *dev; > > + struct crypto_device_info *dev_info; > > + uint32_t cap; > > + int ret; > > + > > + RTE_EVENT_CRYPTO_ADAPTER_ID_VALID_OR_ERR_RET(id, -EINVAL); > > + > > + if (!rte_cryptodev_pmd_is_valid_dev(cdev_id)) { > > + RTE_EDEV_LOG_ERR("Invalid dev_id=%" PRIu8, cdev_id); > > + return -EINVAL; > > + } > > + > > + adapter = eca_id_to_adapter(id); > > + if (adapter == NULL) > > + return -EINVAL; > > + > > + dev = &rte_eventdevs[adapter->eventdev_id]; > > + ret = rte_event_crypto_adapter_caps_get(adapter->eventdev_id, > > + cdev_id, > > +
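Regarding the cache-alignment request above, the usual DPDK idiom is to align fast-path structures to a cache line so that adjacent array entries do not share a line (avoiding false sharing when per-device info is indexed from the service function). A hedged sketch of what the agreed change could look like:

#include <stdint.h>
#include <rte_memory.h>

struct crypto_queue_pair_info;	/* as defined in the patch */
struct rte_cryptodev;		/* from rte_cryptodev.h */

/* Per crypto device information, padded to a full cache line. */
struct crypto_device_info {
	struct rte_cryptodev *dev;
	struct crypto_queue_pair_info *qpairs;
	uint16_t next_queue_pair_id;
	uint8_t internal_event_port;
	uint8_t dev_started;
	uint16_t num_qpairs;
} __rte_cache_aligned;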
Re: [dpdk-dev] [v2, 5/6] eventdev: add event crypto adapter to meson build system
> -Original Message- > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > Sent: Sunday, April 29, 2018 9:55 PM > To: Gujjar, Abhinandan S > Cc: hemant.agra...@nxp.com; akhil.go...@nxp.com; dev@dpdk.org; Vangati, > Narender ; Rao, Nikhil ; > Eads, Gage > Subject: Re: [v2,5/6] eventdev: add event crypto adapter to meson build system > > -Original Message- > > Date: Tue, 24 Apr 2018 18:13:26 +0530 > > From: Abhinandan Gujjar > > To: jerin.ja...@caviumnetworks.com, hemant.agra...@nxp.com, > > akhil.go...@nxp.com, dev@dpdk.org > > CC: narender.vang...@intel.com, abhinandan.guj...@intel.com, > > nikhil@intel.com, gage.e...@intel.com > > Subject: [v2,5/6] eventdev: add event crypto adapter to meson build > > system > > X-Mailer: git-send-email 1.9.1 > > > > Signed-off-by: Abhinandan Gujjar > > --- > > lib/librte_eventdev/meson.build | 8 +--- > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > diff --git a/lib/librte_eventdev/meson.build > > b/lib/librte_eventdev/meson.build > > Separate patch is not required for meson build. Have it in the same patch for > make based build and ADD each files as when it added in the patch. Should I add changes related to " lib/librte_eventdev/meson.build" as part of crypto adapter implementation? Or you recommend the changes in "eventdev pmd" patch?
[dpdk-dev] [PATCH v2 1/2] mem: check if allocation size is too big
Mapping size is a 64-bit integer, but mmap() will accept size_t for size mappings. A user could request a mapping with an alignment, which would have overflown size_t, so check if (size + alignment) will overflow size_t. Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/eal_common_memory.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index 4c943b0..0ac7b33 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -75,8 +75,13 @@ eal_get_virtual_area(void *requested_addr, size_t *size, do { map_sz = no_align ? *size : *size + page_sz; + if (map_sz > SIZE_MAX) { + RTE_LOG(ERR, EAL, "Map size too big\n"); + rte_errno = E2BIG; + return NULL; + } - mapped_addr = mmap(requested_addr, map_sz, PROT_READ, + mapped_addr = mmap(requested_addr, (size_t)map_sz, PROT_READ, mmap_flags, -1, 0); if (mapped_addr == MAP_FAILED && allow_shrink) *size -= page_sz; -- 2.7.4
[dpdk-dev] [PATCH v2 2/2] mem: unmap unneeded space
When we ask to reserve virtual areas, we usually include alignment in the mapping size, and that memory ends up being wasted. Wasting a gigabyte of VA space while trying to reserve one gigabyte is pretty expensive on 32-bit, so after we're done mapping, unmap unneeded space. Signed-off-by: Anatoly Burakov --- Notes: v2: - Split fix for size_t overflow into separate patch - Improve readability of unmapping code - Added comment explaining why unmapping is done lib/librte_eal/common/eal_common_memory.c | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c index 0ac7b33..60aed4a 100644 --- a/lib/librte_eal/common/eal_common_memory.c +++ b/lib/librte_eal/common/eal_common_memory.c @@ -121,8 +121,32 @@ eal_get_virtual_area(void *requested_addr, size_t *size, RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n", aligned_addr, *size); - if (unmap) + if (unmap) { munmap(mapped_addr, map_sz); + } else if (!no_align) { + void *map_end, *aligned_end; + size_t before_len, after_len; + + /* when we reserve space with alignment, we add alignment to +* mapping size. On 32-bit, if 1GB alignment was requested, this +* would waste 1GB of address space, which is a luxury we cannot +* afford. so, if alignment was performed, check if any unneeded +* address space can be unmapped back. +*/ + + map_end = RTE_PTR_ADD(mapped_addr, (size_t)map_sz); + aligned_end = RTE_PTR_ADD(aligned_addr, *size); + + /* unmap space before aligned mmap address */ + before_len = RTE_PTR_DIFF(aligned_addr, mapped_addr); + if (before_len > 0) + munmap(mapped_addr, before_len); + + /* unmap space after aligned end mmap address */ + after_len = RTE_PTR_DIFF(map_end, aligned_end); + if (after_len > 0) + munmap(aligned_end, after_len); + } baseaddr_offset += *size; -- 2.7.4
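For reference, the same reserve-then-trim technique in standalone form (a simplified sketch assuming an anonymous mapping and a power-of-two, non-zero alignment; not the EAL code itself): over-reserve by the alignment, compute the aligned start inside the reservation, then give the unused head and tail back with munmap() so that only 'size' bytes of VA stay reserved.

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define ALIGN_UP(p, a) \
	((void *)(((uintptr_t)(p) + (a) - 1) & ~((uintptr_t)(a) - 1)))

static void *
reserve_aligned(size_t size, size_t align)
{
	size_t map_sz = size + align;
	char *base, *aligned, *map_end, *aligned_end;

	base = mmap(NULL, map_sz, PROT_READ,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	aligned = ALIGN_UP(base, align);
	map_end = base + map_sz;
	aligned_end = aligned + size;

	if (aligned != base)
		munmap(base, aligned - base);		/* trim the head */
	if (map_end != aligned_end)
		munmap(aligned_end, map_end - aligned_end); /* trim the tail */
	return aligned;
}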
Re: [dpdk-dev] [v2, 5/6] eventdev: add event crypto adapter to meson build system
-Original Message- > Date: Mon, 30 Apr 2018 11:21:38 + > From: "Gujjar, Abhinandan S" > To: Jerin Jacob > CC: "hemant.agra...@nxp.com" , > "akhil.go...@nxp.com" , "dev@dpdk.org" > , "Vangati, Narender" , "Rao, > Nikhil" , "Eads, Gage" > Subject: RE: [v2,5/6] eventdev: add event crypto adapter to meson build > system > > > > > -Original Message- > > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > > Sent: Sunday, April 29, 2018 9:55 PM > > To: Gujjar, Abhinandan S > > Cc: hemant.agra...@nxp.com; akhil.go...@nxp.com; dev@dpdk.org; Vangati, > > Narender ; Rao, Nikhil ; > > Eads, Gage > > Subject: Re: [v2,5/6] eventdev: add event crypto adapter to meson build > > system > > > > -Original Message- > > > Date: Tue, 24 Apr 2018 18:13:26 +0530 > > > From: Abhinandan Gujjar > > > To: jerin.ja...@caviumnetworks.com, hemant.agra...@nxp.com, > > > akhil.go...@nxp.com, dev@dpdk.org > > > CC: narender.vang...@intel.com, abhinandan.guj...@intel.com, > > > nikhil@intel.com, gage.e...@intel.com > > > Subject: [v2,5/6] eventdev: add event crypto adapter to meson build > > > system > > > X-Mailer: git-send-email 1.9.1 > > > > > > Signed-off-by: Abhinandan Gujjar > > > --- > > > lib/librte_eventdev/meson.build | 8 +--- > > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > > > > diff --git a/lib/librte_eventdev/meson.build > > > b/lib/librte_eventdev/meson.build > > > > Separate patch is not required for meson build. Have it in the same patch > > for > > make based build and ADD each files as when it added in the patch. > Should I add changes related to " lib/librte_eventdev/meson.build" as part of > crypto adapter implementation? > Or you recommend the changes in "eventdev pmd" patch? IMO, You can add in second patch where your implementation gets added. Both make based and meson based build enablement you can add it in that patch. >
Re: [dpdk-dev] [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles
On 28-Apr-18 10:38 AM, Andrew Rybchenko wrote: On 04/25/2018 01:36 PM, Anatoly Burakov wrote: The original implementation used flock() locks, but was later switched to using fcntl() locks for page locking, because fcntl() locks allow locking parts of a file, which is useful for single-file segments mode, where locking the entire file isn't as useful because we still need to grow and shrink it. However, according to fcntl()'s Ubuntu manpage [1], semantics of fcntl() locks have a giant oversight: This interface follows the completely stupid semantics of System V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process. This semantic means that applications must be aware of any files that a subroutine library may access. Basically, closing *any* fd with an fcntl() lock (which we do because we don't want to leak fd's) will drop the lock completely. So, in this commit, we will be reverting back to using flock() locks everywhere. However, that still leaves the problem of locking parts of a memseg list file in single file segments mode, and we will be solving it with creating separate lock files per each page, and tracking those with flock(). We will also be removing all of this tailq business and replacing it with a simple array - saving a few bytes is not worth the extra hassle of dealing with pointers and potential memory allocation failures. Also, remove the tailq lock since it is not needed - these fd lists are per-process, and within a given process, it is always only one thread handling access to hugetlbfs. So, first one to allocate a segment will create a lockfile, and put a shared lock on it. When we're shrinking the page file, we will be trying to take out a write lock on that lockfile, which would fail if any other process is holding onto the lockfile as well. This way, we can know if we can shrink the segment file. Also, if no other locks are found in the lock list for a given memseg list, the memseg list fd is automatically closed. One other thing to note is, according to flock() Ubuntu manpage [2], upgrading the lock from shared to exclusive is implemented by dropping and reacquiring the lock, which is not atomic and thus would have created race conditions. So, on attempting to perform operations in hugetlbfs, we will take out a writelock on hugetlbfs directory, so that only one process could perform hugetlbfs operations concurrently. [1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html [2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists") Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime") Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime") Fixes: 2a04139f66b4 ("eal: add single file segments option") Cc: anatoly.bura...@intel.com Signed-off-by: Anatoly Burakov Acked-by: Bruce Richardson We have a problem with the changeset if EAL option -m or --socket-mem is used. EAL initialization hangs just after EAL: Probing VFIO support... 
strace points to flock(7, LOCK_EX List of file descriptors: # ls /proc/25452/fd -l total 0 lrwx-- 1 root root 64 Apr 28 10:34 0 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 1 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:32 2 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 3 -> /run/.rte_config lrwx-- 1 root root 64 Apr 28 10:34 4 -> socket:[154166] lrwx-- 1 root root 64 Apr 28 10:34 5 -> socket:[154158] lr-x-- 1 root root 64 Apr 28 10:34 6 -> /dev/hugepages lr-x-- 1 root root 64 Apr 28 10:34 7 -> /dev/hugepages I guess the problem is that there are two /dev/hugepages and it hangs on the second. Ideas how to solve it? Andrew. Hi Andrew, Please try the following patch: http://dpdk.org/dev/patchwork/patch/39166/ This should fix the issue. -- Thanks, Anatoly
Re: [dpdk-dev] [v2, 6/6] doc: add event crypto adapter documentation
> -Original Message- > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com] > Sent: Sunday, April 29, 2018 10:01 PM > To: Gujjar, Abhinandan S > Cc: hemant.agra...@nxp.com; akhil.go...@nxp.com; dev@dpdk.org; Vangati, > Narender ; Rao, Nikhil ; > Eads, Gage > Subject: Re: [v2,6/6] doc: add event crypto adapter documentation > > -Original Message- > > Date: Tue, 24 Apr 2018 18:13:27 +0530 > > From: Abhinandan Gujjar > > To: jerin.ja...@caviumnetworks.com, hemant.agra...@nxp.com, > > akhil.go...@nxp.com, dev@dpdk.org > > CC: narender.vang...@intel.com, abhinandan.guj...@intel.com, > > nikhil@intel.com, gage.e...@intel.com > > Subject: [v2,6/6] doc: add event crypto adapter documentation > > X-Mailer: git-send-email 1.9.1 > > > > Add entries in the programmer's guide, API index, maintainer's file > > and release notes for the event crypto adapter. > > > > Signed-off-by: Abhinandan Gujjar > > --- > > + > > +The packet flow from cryptodev to the event device can be > > +accomplished using both SW and HW based transfer mechanisms. > > +The Adapter queries an eventdev PMD to determine which mechanism to be > used. > > +The adapter uses an EAL service core function for SW based packet > > +transfer and uses the eventdev PMD functions to configure HW based > > +packet transfer between the cryptodev and the event device. > > + > > +Crypto adapter uses a new event type called > > +``RTE_EVENT_TYPE_CRYPTODEV`` to indicate the event source. > > + > > I think, we can add diagrams used in rte_event_crypto_adapter.h with sequence > number in SVG format here to make it easier to understand for the end user. Sure > > > > +API Overview > > + > > + > > +This section has a brief introduction to the event crypto adapter APIs. > > +The application is expected to create an adapter which is associated > > +with a single eventdev, then add cryptodev and queue pair to the adapter > instance. > > + > > +Adapter can be started in ``RTE_EVENT_CRYPTO_ADAPTER_DEQ_ONLY`` or > > +``RTE_EVENT_CRYPTO_ADAPTER_ENQ_DEQ`` mode. > > +In first mode, application will submit a crypto operation directly to > > cryptodev. > > +In the second mode, application will send a crypto ops to cryptodev > > +adapter via eventdev. The cryptodev adapter then submits the crypto > > +operation to the crypto device. > > + > > +Create an adapter instance > > +-- > > + > > +An adapter instance is created using > > +``rte_event_crypto_adapter_create()``. This function is called with > > +event device to be associated with the adapter and port configuration > > +for the adapter to setup an event port(if the adapter needs to use a > > service > function). > > + > > +.. code-block:: c > > + > > +int err; > > +uint8_t dev_id, id; > > +struct rte_event_dev_info dev_info; > > +struct rte_event_port_conf conf; > > + enum rte_event_crypto_adapter_mode mode; > > + > > +err = rte_event_dev_info_get(id, &dev_info); > > + > > +conf.new_event_threshold = dev_info.max_num_events; > > +conf.dequeue_depth = dev_info.max_event_port_dequeue_depth; > > +conf.enqueue_depth = dev_info.max_event_port_enqueue_depth; > > + mode = RTE_EVENT_CRYPTO_ADAPTER_ENQ_DEQ; > > +err = rte_event_crypto_adapter_create(id, dev_id, &conf, > > +mode); > > + > > +If the application desires to have finer control of eventdev port > > +allocation and setup, it can use the > ``rte_event_crypto_adapter_create_ext()`` function. > > +The ``rte_event_crypto_adapter_create_ext()`` function is passed as a > > +callback function. 
The callback function is invoked if the adapter > > +needs to use a service function and needs to create an event port for > > +it. The callback is expected to fill the ``struct > > +rte_event_crypto_adapter_conf`` structure passed to it. > > + > > +For ENQ-DEQ mode, the event port created by adapter can be retrived > > +using > > s/retrived/retrieved ? Ok > > > +``rte_event_crypto_adapter_event_port_get()`` API. > > +Application can use this event port to link with event queue on which > > +it enqueue events towards the crypto adapter. > > + > > +.. code-block:: c > > + > > + uint8_t id, evdev, crypto_ev_port_id, app_qid; > > + struct rte_event ev; > > + int ret; > > + > > + ret = rte_event_crypto_adapter_event_port_get(id, > &crypto_ev_port_id); > > + ret = rte_event_queue_setup(evdev, app_qid, NULL); > > + ret = rte_event_port_link(evdev, crypto_ev_port_id, &app_qid, NULL, > > +1); > > + > > + /* Fill in event info and update event_ptr with rte_crypto_op */ > > + memset(&ev, 0, sizeof(ev)); > > + ev.queue_id = app_qid; > > + . > > + . > > + ev.event_ptr = op; > > + ret = rte_event_enqueue_burst(evdev, app_ev_port_id, ev, nb_events); > > + > > + > > +Adding queue pair to the adapter instance > > +- > > + > > +Cryptodev device id and queue pair are created using cryptodev APIs. > >
Re: [dpdk-dev] [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles
On 04/30/2018 01:31 PM, Burakov, Anatoly wrote: On 28-Apr-18 10:38 AM, Andrew Rybchenko wrote: On 04/25/2018 01:36 PM, Anatoly Burakov wrote: The original implementation used flock() locks, but was later switched to using fcntl() locks for page locking, because fcntl() locks allow locking parts of a file, which is useful for single-file segments mode, where locking the entire file isn't as useful because we still need to grow and shrink it. However, according to fcntl()'s Ubuntu manpage [1], semantics of fcntl() locks have a giant oversight: This interface follows the completely stupid semantics of System V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process. This semantic means that applications must be aware of any files that a subroutine library may access. Basically, closing *any* fd with an fcntl() lock (which we do because we don't want to leak fd's) will drop the lock completely. So, in this commit, we will be reverting back to using flock() locks everywhere. However, that still leaves the problem of locking parts of a memseg list file in single file segments mode, and we will be solving it with creating separate lock files per each page, and tracking those with flock(). We will also be removing all of this tailq business and replacing it with a simple array - saving a few bytes is not worth the extra hassle of dealing with pointers and potential memory allocation failures. Also, remove the tailq lock since it is not needed - these fd lists are per-process, and within a given process, it is always only one thread handling access to hugetlbfs. So, first one to allocate a segment will create a lockfile, and put a shared lock on it. When we're shrinking the page file, we will be trying to take out a write lock on that lockfile, which would fail if any other process is holding onto the lockfile as well. This way, we can know if we can shrink the segment file. Also, if no other locks are found in the lock list for a given memseg list, the memseg list fd is automatically closed. One other thing to note is, according to flock() Ubuntu manpage [2], upgrading the lock from shared to exclusive is implemented by dropping and reacquiring the lock, which is not atomic and thus would have created race conditions. So, on attempting to perform operations in hugetlbfs, we will take out a writelock on hugetlbfs directory, so that only one process could perform hugetlbfs operations concurrently. [1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html [2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists") Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime") Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime") Fixes: 2a04139f66b4 ("eal: add single file segments option") Cc: anatoly.bura...@intel.com Signed-off-by: Anatoly Burakov Acked-by: Bruce Richardson We have a problem with the changeset if EAL option -m or --socket-mem is used. EAL initialization hangs just after EAL: Probing VFIO support... 
strace points to flock(7, LOCK_EX List of file descriptors: # ls /proc/25452/fd -l total 0 lrwx-- 1 root root 64 Apr 28 10:34 0 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 1 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:32 2 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 3 -> /run/.rte_config lrwx-- 1 root root 64 Apr 28 10:34 4 -> socket:[154166] lrwx-- 1 root root 64 Apr 28 10:34 5 -> socket:[154158] lr-x-- 1 root root 64 Apr 28 10:34 6 -> /dev/hugepages lr-x-- 1 root root 64 Apr 28 10:34 7 -> /dev/hugepages I guess the problem is that there are two /dev/hugepages and it hangs on the second. Ideas how to solve it? Andrew. Hi Andrew, Please try the following patch: http://dpdk.org/dev/patchwork/patch/39166/ This should fix the issue. I faced the regression in my test bench, your patch fixes the issue in my case: Tested-by: Maxime Coquelin Thanks, Maxime
[dpdk-dev] [PATCH v2 0/4] Clean up EAL runtime data paths
As has been suggested [1], all DPDK runtime paths should be put into a single place. This patchset accomplishes exactly that. If running as root, all files will be put under /var/run/dpdk/, otherwise they will be put under $XDG_RUNTIME_DIR/dpdk/, or, if that environment variable is not defined, all files will go under /tmp/dpdk/. [1] http://dpdk.org/dev/patchwork/patch/38688/ v2: - Rebase on rc1 Anatoly Burakov (4): eal: remove unused define eal: rename function returning hugepage data path eal: add directory for DPDK runtime data eal: move all runtime data into DPDK runtime dir lib/librte_eal/bsdapp/eal/eal.c | 70 +++ lib/librte_eal/common/eal_filesystem.h | 81 ++-- lib/librte_eal/linuxapp/eal/eal.c| 69 +++ lib/librte_eal/linuxapp/eal/eal_memory.c | 10 ++-- 4 files changed, 171 insertions(+), 59 deletions(-) -- 2.7.4
[dpdk-dev] [PATCH v2 3/4] eal: add directory for DPDK runtime data
Currently, during runtime, DPDK will store a bunch of files here and there (in /var/run, /tmp or in $HOME). Fix it by creating a DPDK-specific runtime directory, under which all runtime data will be placed. The template for creating this runtime directory is the following: /dpdk// Where is set to either "/var/run" if run as root, or $XDG_RUNTIME_DIR if run as non-root, with a fallback to /tmp if $XDG_RUNTIME_DIR is not defined. So, for example, if run as root, by default all runtime data will be stored at /var/run/dpdk/rte/. There is no equivalent of "mkdir -p", so we will be creating the path step by step. Nothing uses this new path yet, changes for that will come in next commit. Signed-off-by: Anatoly Burakov --- lib/librte_eal/bsdapp/eal/eal.c| 68 ++ lib/librte_eal/common/eal_filesystem.h | 8 lib/librte_eal/linuxapp/eal/eal.c | 67 + 3 files changed, 143 insertions(+) diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c index a63f11f..256ab2d 100644 --- a/lib/librte_eal/bsdapp/eal/eal.c +++ b/lib/librte_eal/bsdapp/eal/eal.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include @@ -83,6 +84,66 @@ struct internal_config internal_config; /* used by rte_rdtsc() */ int rte_cycles_vmware_tsc_map; +/* platform-specific runtime dir */ +static char runtime_dir[PATH_MAX]; + +int +eal_create_runtime_dir(void) +{ + const char *directory = default_config_dir; + const char *xdg_runtime_dir = getenv("XDG_RUNTIME_DIR"); + const char *fallback = "/tmp"; + char tmp[PATH_MAX]; + int ret; + + if (getuid() != 0) { + /* try XDG path first, fall back to /tmp */ + if (xdg_runtime_dir != NULL) + directory = xdg_runtime_dir; + else + directory = fallback; + } + /* create DPDK subdirectory under runtime dir */ + ret = snprintf(tmp, sizeof(tmp), "%s/dpdk", directory); + if (ret < 0 || ret == sizeof(tmp)) { + RTE_LOG(ERR, EAL, "Error creating DPDK runtime path name\n"); + return -1; + } + + /* create prefix-specific subdirectory under DPDK runtime dir */ + ret = snprintf(runtime_dir, sizeof(runtime_dir), "%s/%s", + tmp, internal_config.hugefile_prefix); + if (ret < 0 || ret == sizeof(runtime_dir)) { + RTE_LOG(ERR, EAL, "Error creating prefix-specific runtime path name\n"); + return -1; + } + + /* create the path if it doesn't exist. no "mkdir -p" here, so do it +* step by step. 
+*/ + ret = mkdir(tmp, 0600); + if (ret < 0 && errno != EEXIST) { + RTE_LOG(ERR, EAL, "Error creating '%s': %s\n", + tmp, strerror(errno)); + return -1; + } + + ret = mkdir(runtime_dir, 0600); + if (ret < 0 && errno != EEXIST) { + RTE_LOG(ERR, EAL, "Error creating '%s': %s\n", + runtime_dir, strerror(errno)); + return -1; + } + + return 0; +} + +const char * +eal_get_runtime_dir(void) +{ + return runtime_dir; +} + /* Return user provided mbuf pool ops name */ const char * __rte_experimental rte_eal_mbuf_user_pool_ops(void) @@ -522,6 +583,13 @@ rte_eal_init(int argc, char **argv) /* set log level as early as possible */ eal_log_level_parse(argc, argv); + /* create runtime data directory */ + if (eal_create_runtime_dir() < 0) { + rte_eal_init_alert("Cannot create runtime directory\n"); + rte_errno = EACCES; + return -1; + } + if (rte_eal_cpu_init() < 0) { rte_eal_init_alert("Cannot detect lcores."); rte_errno = ENOTSUP; diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h index 060ac2b..67f5ca8 100644 --- a/lib/librte_eal/common/eal_filesystem.h +++ b/lib/librte_eal/common/eal_filesystem.h @@ -25,6 +25,14 @@ static const char *default_config_dir = "/var/run"; +/* sets up platform-specific runtime data dir */ +int +eal_create_runtime_dir(void); + +/* returns runtime dir */ +const char * +eal_get_runtime_dir(void); + static inline const char * eal_runtime_config_path(void) { diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c index e2c0bd6..053b7e7 100644 --- a/lib/librte_eal/linuxapp/eal/eal.c +++ b/lib/librte_eal/linuxapp/eal/eal.c @@ -92,6 +92,66 @@ struct internal_config internal_config; /* used by rte_rdtsc() */ int rte_cycles_vmware_tsc_map; +/* platform-specific runtime dir */ +static char runtime_dir[PATH_MAX]; + +int +eal_create_runtime_dir(void) +{ + const char *directory = default_config_dir; + const char *xdg_runtime_dir = getenv("XDG_RUNTIME_DIR"); + const c
[dpdk-dev] [PATCH v2 2/4] eal: rename function returning hugepage data path
The original name for this path was not too descriptive and confusing. Rename it to a more appropriate and descriptive name: it stores data about hugepages, so name it eal_hugepage_data_path(). Signed-off-by: Anatoly Burakov --- lib/librte_eal/common/eal_filesystem.h | 2 +- lib/librte_eal/linuxapp/eal/eal_memory.c | 10 ++ 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h index 078e2eb..060ac2b 100644 --- a/lib/librte_eal/common/eal_filesystem.h +++ b/lib/librte_eal/common/eal_filesystem.h @@ -89,7 +89,7 @@ eal_hugepage_info_path(void) #define HUGEPAGE_FILE_FMT "%s/.%s_hugepage_file" static inline const char * -eal_hugepage_file_path(void) +eal_hugepage_data_path(void) { static char buffer[PATH_MAX]; /* static so auto-zeroed */ const char *directory = default_config_dir; diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index e0baabb..c917de1 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -1499,7 +1499,7 @@ eal_legacy_hugepage_init(void) } /* create shared memory */ - hugepage = create_shared_memory(eal_hugepage_file_path(), + hugepage = create_shared_memory(eal_hugepage_data_path(), nr_hugefiles * sizeof(struct hugepage_file)); if (hugepage == NULL) { @@ -1727,16 +1727,18 @@ eal_legacy_hugepage_attach(void) test_phys_addrs_available(); - fd_hugepage = open(eal_hugepage_file_path(), O_RDONLY); + fd_hugepage = open(eal_hugepage_data_path(), O_RDONLY); if (fd_hugepage < 0) { - RTE_LOG(ERR, EAL, "Could not open %s\n", eal_hugepage_file_path()); + RTE_LOG(ERR, EAL, "Could not open %s\n", + eal_hugepage_data_path()); goto error; } size = getFileSize(fd_hugepage); hp = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd_hugepage, 0); if (hp == MAP_FAILED) { - RTE_LOG(ERR, EAL, "Could not mmap %s\n", eal_hugepage_file_path()); + RTE_LOG(ERR, EAL, "Could not mmap %s\n", + eal_hugepage_data_path()); goto error; } -- 2.7.4
[dpdk-dev] [PATCH v2 4/4] eal: move all runtime data into DPDK runtime dir
Fix all calls to functions in eal_filesystem to produce paths residing inside dedicated DPDK runtime directory. Signed-off-by: Anatoly Burakov --- lib/librte_eal/bsdapp/eal/eal.c| 2 + lib/librte_eal/common/eal_filesystem.h | 71 +- lib/librte_eal/linuxapp/eal/eal.c | 2 + 3 files changed, 22 insertions(+), 53 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c index 256ab2d..ebda2ef 100644 --- a/lib/librte_eal/bsdapp/eal/eal.c +++ b/lib/librte_eal/bsdapp/eal/eal.c @@ -87,6 +87,8 @@ int rte_cycles_vmware_tsc_map; /* platform-specific runtime dir */ static char runtime_dir[PATH_MAX]; +static const char *default_config_dir = "/var/run"; + int eal_create_runtime_dir(void) { diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h index 67f5ca8..c98102f 100644 --- a/lib/librte_eal/common/eal_filesystem.h +++ b/lib/librte_eal/common/eal_filesystem.h @@ -11,10 +11,6 @@ #ifndef EAL_FILESYSTEM_H #define EAL_FILESYSTEM_H -/** Path of rte config file. */ -#define RUNTIME_CONFIG_FMT "%s/.%s_config" -#define FBARRAY_FMT "%s/.%s_%s" - #include #include #include @@ -23,8 +19,6 @@ #include #include "eal_internal_cfg.h" -static const char *default_config_dir = "/var/run"; - /* sets up platform-specific runtime data dir */ int eal_create_runtime_dir(void); @@ -33,80 +27,57 @@ eal_create_runtime_dir(void); const char * eal_get_runtime_dir(void); +#define RUNTIME_CONFIG_FNAME "config" static inline const char * eal_runtime_config_path(void) { static char buffer[PATH_MAX]; /* static so auto-zeroed */ - const char *directory = default_config_dir; - const char *home_dir = getenv("HOME"); - if (getuid() != 0 && home_dir != NULL) - directory = home_dir; - snprintf(buffer, sizeof(buffer) - 1, RUNTIME_CONFIG_FMT, directory, - internal_config.hugefile_prefix); + snprintf(buffer, sizeof(buffer) - 1, "%s/%s", eal_get_runtime_dir(), + RUNTIME_CONFIG_FNAME); return buffer; } /** Path of primary/secondary communication unix socket file. */ -#define MP_SOCKET_PATH_FMT "%s/.%s_unix" +#define MP_SOCKET_FNAME "mp_socket" static inline const char * eal_mp_socket_path(void) { static char buffer[PATH_MAX]; /* static so auto-zeroed */ - const char *directory = default_config_dir; - const char *home_dir = getenv("HOME"); - - if (getuid() != 0 && home_dir != NULL) - directory = home_dir; - snprintf(buffer, sizeof(buffer) - 1, MP_SOCKET_PATH_FMT, -directory, internal_config.hugefile_prefix); + snprintf(buffer, sizeof(buffer) - 1, "%s/%s", eal_get_runtime_dir(), + MP_SOCKET_FNAME); return buffer; } +#define FBARRAY_NAME_FMT "%s/fbarray_%s" static inline const char * eal_get_fbarray_path(char *buffer, size_t buflen, const char *name) { - const char *directory = "/tmp"; - const char *home_dir = getenv("HOME"); - - if (getuid() != 0 && home_dir != NULL) - directory = home_dir; - snprintf(buffer, buflen - 1, FBARRAY_FMT, directory, - internal_config.hugefile_prefix, name); + snprintf(buffer, buflen, FBARRAY_NAME_FMT, eal_get_runtime_dir(), name); return buffer; } /** Path of hugepage info file. 
*/ -#define HUGEPAGE_INFO_FMT "%s/.%s_hugepage_info" - +#define HUGEPAGE_INFO_FNAME "hugepage_info" static inline const char * eal_hugepage_info_path(void) { static char buffer[PATH_MAX]; /* static so auto-zeroed */ - const char *directory = default_config_dir; - const char *home_dir = getenv("HOME"); - if (getuid() != 0 && home_dir != NULL) - directory = home_dir; - snprintf(buffer, sizeof(buffer) - 1, HUGEPAGE_INFO_FMT, directory, - internal_config.hugefile_prefix); + snprintf(buffer, sizeof(buffer) - 1, "%s/%s", eal_get_runtime_dir(), + HUGEPAGE_INFO_FNAME); return buffer; } -/** Path of hugepage info file. */ -#define HUGEPAGE_FILE_FMT "%s/.%s_hugepage_file" - +/** Path of hugepage data file. */ +#define HUGEPAGE_DATA_FNAME "hugepage_data" static inline const char * eal_hugepage_data_path(void) { static char buffer[PATH_MAX]; /* static so auto-zeroed */ - const char *directory = default_config_dir; - const char *home_dir = getenv("HOME"); - if (getuid() != 0 && home_dir != NULL) - directory = home_dir; - snprintf(buffer, sizeof(buffer) - 1, HUGEPAGE_FILE_FMT, directory, - internal_config.hugefile_prefix); + snprintf(buffer, sizeof(buffer) - 1, "%s/%s", eal_get_runtime_dir(), + HUGEPAGE_DATA_FNAME); return buffer; } @@ -122,18 +93,12 @@ eal_get_hugefile_path(char *buffer, size_t buflen, const
[dpdk-dev] [PATCH v2 1/4] eal: remove unused define
The define was a leftover from IVSHMEM library. Fixes: c711ccb30987 ("ivshmem: remove library and its EAL integration") Cc: david.march...@6wind.com Cc: sta...@dpdk.org Signed-off-by: Anatoly Burakov Reviewed-by: David Marchand --- lib/librte_eal/common/eal_filesystem.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h index 4db5c10..078e2eb 100644 --- a/lib/librte_eal/common/eal_filesystem.h +++ b/lib/librte_eal/common/eal_filesystem.h @@ -104,8 +104,6 @@ eal_hugepage_file_path(void) /** String format for hugepage map files. */ #define HUGEFILE_FMT "%s/%smap_%d" -#define TEMP_HUGEFILE_FMT "%s/%smap_temp_%d" - static inline const char * eal_get_hugefile_path(char *buffer, size_t buflen, const char *hugedir, int f_id) { -- 2.7.4
[dpdk-dev] [PATCH v1] net/mlx4: fix CRC stripping capability report
There are two capabilities related to CRC stripping: 1. mlx4 HW capability to perform CRC stripping on a received packet. This capability is built into the mlx4 HW. It should be returned by the API call mlx4_get_rx_queue_offloads(). 2. mlx4 driver capability to enable/disable HW CRC stripping. This capability is dependent on the driver version. Before this commit, the second capability was falsely returned by the mentioned API. This commit fixes it by returning the first capability. Fixes: de1df14e6e6ec ("net/mlx4: support CRC strip toggling") Cc: sta...@dpdk.org Signed-off-by: Ophir Munk --- drivers/net/mlx4/mlx4_rxq.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c index b430678..88e5912 100644 --- a/drivers/net/mlx4/mlx4_rxq.c +++ b/drivers/net/mlx4/mlx4_rxq.c @@ -658,10 +658,9 @@ mlx4_rxq_detach(struct rxq *rxq) uint64_t mlx4_get_rx_queue_offloads(struct priv *priv) { - uint64_t offloads = DEV_RX_OFFLOAD_SCATTER; + uint64_t offloads = DEV_RX_OFFLOAD_SCATTER | + DEV_RX_OFFLOAD_CRC_STRIP; - if (priv->hw_fcs_strip) - offloads |= DEV_RX_OFFLOAD_CRC_STRIP; if (priv->hw_csum) offloads |= DEV_RX_OFFLOAD_CHECKSUM; return offloads; -- 2.7.4
Re: [dpdk-dev] [PATCH 5/8 v4] raw/dpaa2_qdma: introduce the DPAA2 QDMA driver
24/04/2018 13:49, Nipun Gupta: > drivers/raw/dpaa2_qdma/dpaa2_qdma.c| 294 > + > drivers/raw/dpaa2_qdma/dpaa2_qdma.h| 66 + > drivers/raw/dpaa2_qdma/dpaa2_qdma_logs.h | 46 [...] > +install_headers('rte_pmd_dpaa2_qdma.h') I think you need to rename the exported header file with rte_pmd_ prefix.
Re: [dpdk-dev] [PATCH] eal: check if hugedir write lock is already being held
Monday, April 30, 2018 1:38 PM, Anatoly Burakov: > Cc: arybche...@solarflare.com; anatoly.bura...@intel.com > Subject: [dpdk-dev] [PATCH] eal: check if hugedir write lock is already being > held > > At hugepage info initialization, EAL takes out a write lock on hugetlbfs > directories, and drops it after the memory init is finished. However, in non- > legacy mode, if "-m" or "--socket-mem" > switches are passed, this leads to a deadlock because EAL tries to allocate > pages (and thus take out a write lock on hugedir) while still holding a > separate hugedir write lock in EAL. > > Fix it by checking if write lock in hugepage info is active, and not trying > to lock > the directory if the hugedir fd is valid. > > Fixes: 1a7dc2252f28 ("mem: revert to using flock and add per-segment > lockfiles") > Cc: anatoly.bura...@intel.com > > Signed-off-by: Anatoly Burakov > --- > lib/librte_eal/linuxapp/eal/eal_memalloc.c | 71 ++-- > -- > 1 file changed, 42 insertions(+), 29 deletions(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c > b/lib/librte_eal/linuxapp/eal/eal_memalloc.c > index 00d7886..360d8f7 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c > +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c > @@ -666,7 +666,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, > void *arg) > struct alloc_walk_param *wa = arg; > struct rte_memseg_list *cur_msl; > size_t page_sz; > - int cur_idx, start_idx, j, dir_fd; > + int cur_idx, start_idx, j, dir_fd = -1; > unsigned int msl_idx, need, i; > > if (msl->page_sz != wa->page_sz) > @@ -691,19 +691,24 @@ alloc_seg_walk(const struct rte_memseg_list *msl, > void *arg) >* because file creation and locking operations are not atomic, >* and we might be the first or the last ones to use a particular page, >* so we need to ensure atomicity of every operation. > + * > + * during init, we already hold a write lock, so don't try to take out > + * another one. 
>*/ > - dir_fd = open(wa->hi->hugedir, O_RDONLY); > - if (dir_fd < 0) { > - RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", > __func__, > - wa->hi->hugedir, strerror(errno)); > - return -1; > - } > - /* blocking writelock */ > - if (flock(dir_fd, LOCK_EX)) { > - RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", __func__, > - wa->hi->hugedir, strerror(errno)); > - close(dir_fd); > - return -1; > + if (wa->hi->lock_descriptor == -1) { > + dir_fd = open(wa->hi->hugedir, O_RDONLY); > + if (dir_fd < 0) { > + RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", > + __func__, wa->hi->hugedir, strerror(errno)); > + return -1; > + } > + /* blocking writelock */ > + if (flock(dir_fd, LOCK_EX)) { > + RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", > + __func__, wa->hi->hugedir, strerror(errno)); > + close(dir_fd); > + return -1; > + } > } > > for (i = 0; i < need; i++, cur_idx++) { @@ -742,7 +747,8 @@ > alloc_seg_walk(const struct rte_memseg_list *msl, void *arg) > if (wa->ms) > memset(wa->ms, 0, sizeof(*wa->ms) * wa- > >n_segs); > > - close(dir_fd); > + if (dir_fd >= 0) > + close(dir_fd); > return -1; > } > if (wa->ms) > @@ -754,7 +760,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, > void *arg) > wa->segs_allocated = i; > if (i > 0) > cur_msl->version++; > - close(dir_fd); > + if (dir_fd >= 0) > + close(dir_fd); > return 1; > } > > @@ -769,7 +776,7 @@ free_seg_walk(const struct rte_memseg_list *msl, > void *arg) > struct rte_memseg_list *found_msl; > struct free_walk_param *wa = arg; > uintptr_t start_addr, end_addr; > - int msl_idx, seg_idx, ret, dir_fd; > + int msl_idx, seg_idx, ret, dir_fd = -1; > > start_addr = (uintptr_t) msl->base_va; > end_addr = start_addr + msl->memseg_arr.len * (size_t)msl- > >page_sz; @@ -788,19 +795,24 @@ free_seg_walk(const struct > rte_memseg_list *msl, void *arg) >* because file creation and locking operations are not atomic, >* and we might be the first or the last ones to use a particular page, >* so we need to ensure atomicity of every operation. > + * > + * during init, we already hold a write lock, so don't try to take out > + * another one. >*/ > - dir_fd = open(wa->hi->hugedir, O_RDONLY); > - if (dir_fd < 0) { > - RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", > __func__
Re: [dpdk-dev] [PATCH v2 1/2] mem: check if allocation size is too big
On Mon, Apr 30, 2018 at 12:21:42PM +0100, Anatoly Burakov wrote: > Mapping size is a 64-bit integer, but mmap() will accept size_t for > size mappings. A user could request a mapping with an alignment, which > would have overflown size_t, so check if (size + alignment) will > overflow size_t. > > Signed-off-by: Anatoly Burakov > --- Acked-by: Bruce Richardson
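The overflow concern described above amounts to a simple pre-check before the size plus alignment is handed to mmap(). A minimal sketch of that kind of check, illustrative only and not the exact EAL code (the helper name is made up):

#include <stdint.h>
#include <stddef.h>

/* Reject a reservation whose size plus alignment would wrap around size_t. */
static int
reserve_params_ok(size_t size, size_t align)
{
    return align <= SIZE_MAX - size;
}

/* Caller side, before mapping:
 *   if (!reserve_params_ok(size, align))
 *       return NULL;
 */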
Re: [dpdk-dev] [PATCH v2 2/2] mem: unmap unneeded space
On Mon, Apr 30, 2018 at 12:21:43PM +0100, Anatoly Burakov wrote: > When we ask to reserve virtual areas, we usually include > alignment in the mapping size, and that memory ends up > being wasted. Wasting a gigabyte of VA space while trying to > reserve one gigabyte is pretty expensive on 32-bit, so after > we're done mapping, unmap unneeded space. > > Signed-off-by: Anatoly Burakov > --- > > Notes: > v2: > - Split fix for size_t overflow into separate patch > - Improve readability of unmapping code > - Added comment explaining why unmapping is done > Acked-by: Bruce Richardson
Re: [dpdk-dev] [PATCH] vhost/crypto: fix bracket
30/04/2018 12:36, Fan Zhang: > Coverity issue: 233232 > Coverity issue: 233237 > Fixes: 3bb595ecd682 ("vhost/crypto: add request handler") > > Signed-off-by: Fan Zhang 2 comments, Fan: 1/ I think it the v2 of a previous commit. Please update the patchwork status (superseded), use -v option for revision numbering, and add a changelog. 2/ The title must give the scope of the change, or give an idea of the impact of the patch. Example: fix symmetric ciphering The root cause (bracket location) is better in the commit message than the title. Thanks
[dpdk-dev] New RC needed for stability?
Hi Thomas, Ferruh, all, Initial testing on RC1 from our System Test and Validation shows a lot of defects/issues, and from the list it appears others may be seeing issues too. These issues, as well as causing test failures are blocking other tests from being run. We are also seeing some serious performance regressions with testpmd. Based on the number of issues we are seeing, which is probably a result of the huge number of changes in this release even at the RC1 point, I would propose that we look to do a new RC some time this week to get as many critical bugs as possible ironed out, before we add in the remaining 18.05 features. That is: *bug-fix only* RC2, say Wed/Thurs (or sooner if fixes are ready), with remaining features in RC3 as planned on 11th. Thoughts, comments? /Bruce
Re: [dpdk-dev] New RC needed for stability?
30/04/2018 14:57, Bruce Richardson: > Hi Thomas, Ferruh, all, > > Initial testing on RC1 from our System Test and Validation shows a lot of > defects/issues, and from the list it appears others may be seeing issues > too. These issues, as well as causing test failures are blocking other > tests from being run. We are also seeing some serious performance > regressions with testpmd. > > Based on the number of issues we are seeing, which is probably a result of > the huge number of changes in this release even at the RC1 point, I would > propose that we look to do a new RC some time this week to get as many > critical bugs as possible ironed out, before we add in the remaining 18.05 > features. That is: *bug-fix only* RC2, say Wed/Thurs (or sooner if fixes > are ready), with remaining features in RC3 as planned on 11th. > > Thoughts, comments? OK +1 We need to agree on a list of bugs to be fixed. Are there some Bugzilla entries? When they will be fixed, I will tag the RC2.
Re: [dpdk-dev] [PATCH] eal: check if hugedir write lock is already being held
On 04/30/2018 01:38 PM, Anatoly Burakov wrote: At hugepage info initialization, EAL takes out a write lock on hugetlbfs directories, and drops it after the memory init is finished. However, in non-legacy mode, if "-m" or "--socket-mem" switches are passed, this leads to a deadlock because EAL tries to allocate pages (and thus take out a write lock on hugedir) while still holding a separate hugedir write lock in EAL. Fix it by checking if write lock in hugepage info is active, and not trying to lock the directory if the hugedir fd is valid. Fixes: 1a7dc2252f28 ("mem: revert to using flock and add per-segment lockfiles") Cc: anatoly.bura...@intel.com Signed-off-by: Anatoly Burakov Tested-by: Andrew Rybchenko
Re: [dpdk-dev] [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles
On 04/30/2018 02:31 PM, Burakov, Anatoly wrote: On 28-Apr-18 10:38 AM, Andrew Rybchenko wrote: On 04/25/2018 01:36 PM, Anatoly Burakov wrote: The original implementation used flock() locks, but was later switched to using fcntl() locks for page locking, because fcntl() locks allow locking parts of a file, which is useful for single-file segments mode, where locking the entire file isn't as useful because we still need to grow and shrink it. However, according to fcntl()'s Ubuntu manpage [1], semantics of fcntl() locks have a giant oversight: This interface follows the completely stupid semantics of System V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all locks associated with a file for a given process are removed when any file descriptor for that file is closed by that process. This semantic means that applications must be aware of any files that a subroutine library may access. Basically, closing *any* fd with an fcntl() lock (which we do because we don't want to leak fd's) will drop the lock completely. So, in this commit, we will be reverting back to using flock() locks everywhere. However, that still leaves the problem of locking parts of a memseg list file in single file segments mode, and we will be solving it with creating separate lock files per each page, and tracking those with flock(). We will also be removing all of this tailq business and replacing it with a simple array - saving a few bytes is not worth the extra hassle of dealing with pointers and potential memory allocation failures. Also, remove the tailq lock since it is not needed - these fd lists are per-process, and within a given process, it is always only one thread handling access to hugetlbfs. So, first one to allocate a segment will create a lockfile, and put a shared lock on it. When we're shrinking the page file, we will be trying to take out a write lock on that lockfile, which would fail if any other process is holding onto the lockfile as well. This way, we can know if we can shrink the segment file. Also, if no other locks are found in the lock list for a given memseg list, the memseg list fd is automatically closed. One other thing to note is, according to flock() Ubuntu manpage [2], upgrading the lock from shared to exclusive is implemented by dropping and reacquiring the lock, which is not atomic and thus would have created race conditions. So, on attempting to perform operations in hugetlbfs, we will take out a writelock on hugetlbfs directory, so that only one process could perform hugetlbfs operations concurrently. [1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html [2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists") Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime") Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime") Fixes: 2a04139f66b4 ("eal: add single file segments option") Cc: anatoly.bura...@intel.com Signed-off-by: Anatoly Burakov Acked-by: Bruce Richardson We have a problem with the changeset if EAL option -m or --socket-mem is used. EAL initialization hangs just after EAL: Probing VFIO support... 
strace points to flock(7, LOCK_EX List of file descriptors: # ls /proc/25452/fd -l total 0 lrwx-- 1 root root 64 Apr 28 10:34 0 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 1 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:32 2 -> /dev/pts/0 lrwx-- 1 root root 64 Apr 28 10:34 3 -> /run/.rte_config lrwx-- 1 root root 64 Apr 28 10:34 4 -> socket:[154166] lrwx-- 1 root root 64 Apr 28 10:34 5 -> socket:[154158] lr-x-- 1 root root 64 Apr 28 10:34 6 -> /dev/hugepages lr-x-- 1 root root 64 Apr 28 10:34 7 -> /dev/hugepages I guess the problem is that there are two /dev/hugepages and it hangs on the second. Ideas how to solve it? Andrew. Hi Andrew, Please try the following patch: http://dpdk.org/dev/patchwork/patch/39166/ This should fix the issue. Hi Anatoly, yes, it fixes the issue. Thanks, Andrew.
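For readers unfamiliar with the fcntl() behaviour quoted in the commit message above, here is a tiny standalone demonstration of why it is problematic (not DPDK code; the file name is arbitrary): a record lock taken through one descriptor is silently dropped as soon as the same process closes any other descriptor for the same file.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int fd1 = open("/tmp/lockdemo", O_CREAT | O_RDWR, 0600);
    int fd2 = open("/tmp/lockdemo", O_RDWR);

    if (fd1 < 0 || fd2 < 0)
        return 1;
    if (fcntl(fd1, F_SETLK, &fl) < 0) /* lock the whole file via fd1 */
        return 1;

    close(fd2); /* POSIX: this also releases the lock taken via fd1 */

    printf("lock is gone although fd1 is still open\n");
    return 0;
}

flock() locks, by contrast, stay attached to the open file description, which is the property the revert relies on.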
Re: [dpdk-dev] [PATCH 1/2] eal: fix build with glibc < 2.16
Thomas Monjalon writes: > The fake getauxval function does not use its parameter. > So the compiler raised this error: > lib/librte_eal/common/eal_common_cpuflags.c:25:25: error: > unused parameter 'type' > > Fixes: 2ed9bf330709 ("eal: abstract away the auxiliary vector") > Cc: acon...@redhat.com > Cc: tredae...@redhat.com > > Signed-off-by: Thomas Monjalon > --- Oops - sorry about that. Acked-by: Aaron Conole > lib/librte_eal/common/eal_common_cpuflags.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c > b/lib/librte_eal/common/eal_common_cpuflags.c > index a09667563..6a9dbaeb1 100644 > --- a/lib/librte_eal/common/eal_common_cpuflags.c > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > @@ -22,7 +22,7 @@ > > #ifndef HAS_AUXV > static unsigned long > -getauxval(unsigned long type) > +getauxval(unsigned long type __rte_unused) > { > errno = ENOTSUP; > return 0;
Re: [dpdk-dev] [PATCH 2/2] eal: fix build on FreeBSD
Thomas Monjalon writes: > The auxiliary vector read is implemented only for Linux. > It could be done with procstat_getauxv() for FreeBSD. > > Since the commit below, the auxiliary vector functions > are compiled for every architectures, including x86 > which is tested with FreeBSD. > > This patch is only adding a fake/empty implementation > of auxiliary vector read, for compilation on FreeBSD. > > Fixes: 2ed9bf330709 ("eal: abstract away the auxiliary vector") > Cc: acon...@redhat.com > Cc: tredae...@redhat.com > > Signed-off-by: Thomas Monjalon > --- Makes sense to me. Thanks for fixing this up, Thomas. Sorry for turning it sideways. I'll make sure to test on freebsd next time. Acked-by: Aaron Conole > lib/librte_eal/bsdapp/eal/Makefile | 1 + > lib/librte_eal/bsdapp/eal/eal_cpuflags.c | 21 ++ > lib/librte_eal/bsdapp/eal/meson.build | 1 + > lib/librte_eal/common/eal_common_cpuflags.c| 79 > -- > lib/librte_eal/linuxapp/eal/Makefile | 1 + > .../eal/eal_cpuflags.c}| 47 + > lib/librte_eal/linuxapp/eal/meson.build| 1 + > 7 files changed, 26 insertions(+), 125 deletions(-) > create mode 100644 lib/librte_eal/bsdapp/eal/eal_cpuflags.c > copy lib/librte_eal/{common/eal_common_cpuflags.c => > linuxapp/eal/eal_cpuflags.c} (61%) > > diff --git a/lib/librte_eal/bsdapp/eal/Makefile > b/lib/librte_eal/bsdapp/eal/Makefile > index 200285e01..3fd33f1e4 100644 > --- a/lib/librte_eal/bsdapp/eal/Makefile > +++ b/lib/librte_eal/bsdapp/eal/Makefile > @@ -25,6 +25,7 @@ LIBABIVER := 7 > > # specific to bsdapp exec-env > SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) := eal.c > +SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_cpuflags.c > SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_memory.c > SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_hugepage_info.c > SRCS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += eal_thread.c > diff --git a/lib/librte_eal/bsdapp/eal/eal_cpuflags.c > b/lib/librte_eal/bsdapp/eal/eal_cpuflags.c > new file mode 100644 > index 0..69b161ea6 > --- /dev/null > +++ b/lib/librte_eal/bsdapp/eal/eal_cpuflags.c > @@ -0,0 +1,21 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright 2018 Mellanox Technologies, Ltd > + */ > + > +#include > +#include > + > +unsigned long > +rte_cpu_getauxval(unsigned long type __rte_unused) > +{ > + /* not implemented */ > + return 0; > +} > + > +int > +rte_cpu_strcmp_auxval(unsigned long type __rte_unused, > + const char *str __rte_unused) > +{ > + /* not implemented */ > + return -1; > +} > diff --git a/lib/librte_eal/bsdapp/eal/meson.build > b/lib/librte_eal/bsdapp/eal/meson.build > index 4c5611879..47e16a649 100644 > --- a/lib/librte_eal/bsdapp/eal/meson.build > +++ b/lib/librte_eal/bsdapp/eal/meson.build > @@ -4,6 +4,7 @@ > env_objs = [] > env_headers = [] > env_sources = files('eal_alarm.c', > + 'eal_cpuflags.c', > 'eal_debug.c', > 'eal_hugepage_info.c', > 'eal_interrupts.c', > diff --git a/lib/librte_eal/common/eal_common_cpuflags.c > b/lib/librte_eal/common/eal_common_cpuflags.c > index 6a9dbaeb1..3a055f7c7 100644 > --- a/lib/librte_eal/common/eal_common_cpuflags.c > +++ b/lib/librte_eal/common/eal_common_cpuflags.c > @@ -2,90 +2,11 @@ > * Copyright(c) 2010-2014 Intel Corporation > */ > > -#include > -#include > #include > -#include > -#include > -#include > -#include > - > -#if defined(__GLIBC__) && defined(__GLIBC_PREREQ) > -#if __GLIBC_PREREQ(2, 16) > -#include > -#define HAS_AUXV 1 > -#endif > -#endif > > #include > #include > > -#ifndef HAS_AUXV > -static unsigned long > -getauxval(unsigned long type __rte_unused) > -{ > - errno = ENOTSUP; > - return 0; > -} > -#endif > - > -#ifdef 
RTE_ARCH_64 > -typedef Elf64_auxv_t Internal_Elfx_auxv_t; > -#else > -typedef Elf32_auxv_t Internal_Elfx_auxv_t; > -#endif > - > - > -/** > - * Provides a method for retrieving values from the auxiliary vector and > - * possibly running a string comparison. > - * > - * @return Always returns a result. When the result is 0, check errno > - * to see if an error occurred during processing. > - */ > -static unsigned long > -_rte_cpu_getauxval(unsigned long type, const char *str) > -{ > - unsigned long val; > - > - errno = 0; > - val = getauxval(type); > - > - if (!val && (errno == ENOTSUP || errno == ENOENT)) { > - int auxv_fd = open("/proc/self/auxv", O_RDONLY); > - Internal_Elfx_auxv_t auxv; > - > - if (auxv_fd == -1) > - return 0; > - > - errno = ENOENT; > - while (read(auxv_fd, &auxv, sizeof(auxv)) == sizeof(auxv)) { > - if (auxv.a_type == type) { > - errno = 0; > - val = auxv.a_un.a_val; > - if (str) > -
Re: [dpdk-dev] [PATCH] eal: check if hugedir write lock is already being held
30/04/2018 15:07, Andrew Rybchenko: > On 04/30/2018 01:38 PM, Anatoly Burakov wrote: > > At hugepage info initialization, EAL takes out a write lock on > > hugetlbfs directories, and drops it after the memory init is > > finished. However, in non-legacy mode, if "-m" or "--socket-mem" > > switches are passed, this leads to a deadlock because EAL tries > > to allocate pages (and thus take out a write lock on hugedir) > > while still holding a separate hugedir write lock in EAL. > > > > Fix it by checking if write lock in hugepage info is active, and > > not trying to lock the directory if the hugedir fd is valid. > > > > Fixes: 1a7dc2252f28 ("mem: revert to using flock and add per-segment > > lockfiles") > > Cc: anatoly.bura...@intel.com > > > > Signed-off-by: Anatoly Burakov Tested-by: Maxime Coquelin Tested-by: Shahaf Shuler > Tested-by: Andrew Rybchenko Applied, thanks
Re: [dpdk-dev] [PATCH] mem: fix heap size not set on init
25/04/2018 15:42, Anatoly Burakov: > When heap initializes, we need to add already allocated segments > onto the heap. However, in doing that, we never increased total > heap size. Fix it by adding segment length to total heap length > when initializing the heap. > > Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists") > Cc: anatoly.bura...@intel.com > > Signed-off-by: Anatoly Burakov Applied, thanks
[dpdk-dev] [PATCH] examples/flow_classify: fix failure in port_init function
The port_init function calls the rte_eth_dev_is_valid_port function. This function now returns 1 if the port state is attached. A return value of 1 now means a valid port. Fixes: a9dbe1802226 ("fix ethdev port id validation") Signed-off-by: Bernard Iremonger --- examples/flow_classify/flow_classify.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/flow_classify/flow_classify.c b/examples/flow_classify/flow_classify.c index 3b087ce..6412fe4 100644 --- a/examples/flow_classify/flow_classify.c +++ b/examples/flow_classify/flow_classify.c @@ -200,7 +200,7 @@ struct rte_flow_query_count count = { struct rte_eth_dev_info dev_info; struct rte_eth_txconf txconf; - if (rte_eth_dev_is_valid_port(port)) + if (!rte_eth_dev_is_valid_port(port)) return -1; rte_eth_dev_info_get(port, &dev_info); -- 1.9.1
Re: [dpdk-dev] [PATCH] examples/flow_classify: fix failure in port_init function
30/04/2018 15:43, Bernard Iremonger: > The port_init function calls the rte_eth_dev_is_valid_port function. > This function now returns 1 if the port state is attached. > A return value of 1 now means a valid port. > > Fixes: a9dbe1802226 ("fix ethdev port id validation") > Signed-off-by: Bernard Iremonger My mistake. Applied, thanks
Re: [dpdk-dev] [PATCH] eal/service: remove experimental tags
I just wanted to say I'm using the functionality for debugging NFP firmware and getting some useful information from the device. I did not plan to have this upstream, but after this patch for removing the experimental tag, I think it would be a good idea. Thanks! On Wed, Apr 25, 2018 at 1:58 PM, Thomas Monjalon wrote: > > > This commit removes the experimental tags from the > > > service cores functions, they now become part of the > > > main DPDK API/ABI. > > > > > > Signed-off-by: Harry van Haaren > > > > Acked-by: Jerin Jacob > > Acked-by: Thomas Monjalon > > Applied, congratulations! > > >
Re: [dpdk-dev] [PATCH v5 0/4] ethdev additions to support tunnel encap/decap
24/04/2018 18:26, Thomas Monjalon: > Hi, > > > Declan Doherty (4): > > ethdev: Add tunnel encap/decap actions > > ethdev: Add group JUMP action > > ethdev: add mark flow item to rte_flow_item_types > > ethdev: add shared counter support to rte_flow > > No specific comment. > > It is only an API without any PMD implementation. > Which PMDs are planned to be supported? When? > > Next time, we could require to have at least one implementation, > when submitting a new API. One more comment: there is no testpmd usage of this API. Please Declan, could you fix testpmd by adding new commands using this new flow encapsulation feature? We need it in 18.05 in order to avoid having some orphan code. Thanks
Re: [dpdk-dev] [PATCH] net/vhost: Initialise vid to -1
> > On 04/27/2018 04:19 PM, Ciara Loftus wrote: > > rte_eth_vhost_get_vid_from_port_id returns a value of 0 if called before > > the first call to the new_device callback. A vid value >=0 suggests the > > device is active which is not the case in this instance. Initialise vid > > to a negative value to prevent this. > > > > Signed-off-by: Ciara Loftus > > --- > > drivers/net/vhost/rte_eth_vhost.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/net/vhost/rte_eth_vhost.c > b/drivers/net/vhost/rte_eth_vhost.c > > index 99a7727..f47950c 100644 > > --- a/drivers/net/vhost/rte_eth_vhost.c > > +++ b/drivers/net/vhost/rte_eth_vhost.c > > @@ -1051,6 +1051,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, > uint16_t rx_queue_id, > > return -ENOMEM; > > } > > > > + vq->vid = -1; > > vq->mb_pool = mb_pool; > > vq->virtqueue_id = rx_queue_id * VIRTIO_QNUM + VIRTIO_TXQ; > > dev->data->rx_queues[rx_queue_id] = vq; > > > > Reviewed-by: Maxime Coquelin > > Thanks, > Maxime On second thoughts, self-NACK. We need to provision for the case where we want to call eth_rx_queue_setup AFTER new_device. For instance when we want to change the mb_pool. In this case we need to maintain the same vid and not reset it to -1. Without this patch the original problem still exists and need to find an alternative workaround. Thanks, Ciara
[dpdk-dev] nfp doing its own pci_read_config
Why is Netronome driver using its own version of existing rte_pci_read_config? And hard coding magic numbers for offsets. This shows up as Coverity error *** CID 277243: Error handling issues (CHECKED_RETURN) /drivers/net/nfp/nfpcore/nfp_cpp_pcie_ops.c: 684 in nfp6000_set_interface() 678 desc->busdev); 679 680 fp = open(tmp_str, O_RDONLY); 681 if (!fp) 682 return -1; 683 >>> CID 277243: Error handling issues (CHECKED_RETURN) >>> Calling "lseek(fp, 340L, 0)" without checking return value. This >>> library function may fail and return an error code. 684 lseek(fp, 0x154, SEEK_SET); 685 686 if (read(fp, &tmp, sizeof(tmp)) != sizeof(tmp)) { 687 printf("error reading config file for interface\n"); 688 return -1; 689 static int nfp6000_set_model(struct nfp_pcie_user *desc, struct nfp_cpp *cpp) { char tmp_str[80]; uint32_t tmp; int fp; snprintf(tmp_str, sizeof(tmp_str), "%s/%s/config", PCI_DEVICES, desc->busdev); fp = open(tmp_str, O_RDONLY); if (!fp) return -1; lseek(fp, 0x2e, SEEK_SET); if (read(fp, &tmp, sizeof(tmp)) != sizeof(tmp)) { printf("Error reading config file for model\n"); return -1; } tmp = tmp << 16;
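For comparison, a sketch of what the same config-space read could look like if it went through the existing bus/PCI helper instead of opening the sysfs config file directly. This is illustrative only; the function name nfp6000_read_model is made up, and it assumes the caller has the rte_pci_device handle at hand:

#include <stdint.h>
#include <rte_bus_pci.h>

static int
nfp6000_read_model(const struct rte_pci_device *pci_dev, uint32_t *model)
{
    uint32_t tmp;

    /* read 4 bytes at config-space offset 0x2e; errors are reported */
    if (rte_pci_read_config(pci_dev, &tmp, sizeof(tmp), 0x2e) !=
            (int)sizeof(tmp))
        return -1;

    *model = tmp << 16;
    return 0;
}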
[dpdk-dev] [PATCH] net/i40e: revert default PF PMD device name
Changes introduced by e0cb96204b71 modified the default name generated for the i40e PF PMD; this patch reverts the default name to the original PCI DBDF. Fixes: e0cb96204b71 ("net/i40e: add support for representor ports") Signed-off-by: Declan Doherty --- drivers/net/i40e/i40e_ethdev.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index d869add95..284e9cb64 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -630,10 +630,7 @@ eth_i40e_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, return retval; } - /* physical port net_bdf_port */ - snprintf(name, sizeof(name), "net_%s", pci_dev->device.name); - - retval = rte_eth_dev_create(&pci_dev->device, name, + retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, sizeof(struct i40e_adapter), eth_dev_pci_specific_init, pci_dev, eth_i40e_dev_init, NULL); @@ -642,7 +639,8 @@ eth_i40e_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, return retval; /* probe VF representor ports */ - struct rte_eth_dev *pf_ethdev = rte_eth_dev_allocated(name); + struct rte_eth_dev *pf_ethdev = rte_eth_dev_allocated( + pci_dev->device.name); if (pf_ethdev == NULL) return -ENODEV; -- 2.14.3
[dpdk-dev] [PATCH 2/3] net/ixgbe: initialise nb_representor_ports value
Initialise rte_eth_devargs nb_representor_ports to zero to handle the case where no devargs are passed to the IXGBE PF on device probe, so that there are no invalid attempts to create representor ports. Coverity Issue: 277231 Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports") Signed-off-by: Declan Doherty --- drivers/net/ixgbe/ixgbe_ethdev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c index 0ccf55dc8..283dd7e49 100644 --- a/drivers/net/ixgbe/ixgbe_ethdev.c +++ b/drivers/net/ixgbe/ixgbe_ethdev.c @@ -1725,8 +1725,7 @@ eth_ixgbe_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_device *pci_dev) { char name[RTE_ETH_NAME_MAX_LEN]; - - struct rte_eth_devargs eth_da; + struct rte_eth_devargs eth_da = { .nb_representor_ports = 0 }; int i, retval; if (pci_dev->device.devargs) { -- 2.14.3
[dpdk-dev] [PATCH 1/3] net/ixgbe: revert default PF PMD device name
Changes introduced by cf80ba6e2038 modified the default name generated for the IXGBE PF PMD; this patch reverts the default name to the original PCI DBDF. Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports") Signed-off-by: Declan Doherty --- drivers/net/ixgbe/ixgbe_ethdev.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c index 6088c7e48..0ccf55dc8 100644 --- a/drivers/net/ixgbe/ixgbe_ethdev.c +++ b/drivers/net/ixgbe/ixgbe_ethdev.c @@ -1736,10 +1736,7 @@ eth_ixgbe_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, return retval; } - /* physical port net_bdf_port */ - snprintf(name, sizeof(name), "net_%s_%d", pci_dev->device.name, 0); - - retval = rte_eth_dev_create(&pci_dev->device, name, + retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name, sizeof(struct ixgbe_adapter), eth_dev_pci_specific_init, pci_dev, eth_ixgbe_dev_init, NULL); @@ -1748,7 +1745,8 @@ eth_ixgbe_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, return retval; /* probe VF representor ports */ - struct rte_eth_dev *pf_ethdev = rte_eth_dev_allocated(name); + struct rte_eth_dev *pf_ethdev = rte_eth_dev_allocated( + pci_dev->device.name); for (i = 0; i < eth_da.nb_representor_ports; i++) { struct ixgbe_vf_info *vfinfo; struct ixgbe_vf_representor representor; -- 2.14.3
[dpdk-dev] [PATCH 3/3] net/ixgbe: add null pointer check for pf_ethdev
Add NULL parameter check for rte_eth_dev_allocated() API call to eth_ixgbe_pci_probe(). Coverity Issue: 277216 Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports") Signed-off-by: Declan Doherty --- drivers/net/ixgbe/ixgbe_ethdev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c index 283dd7e49..75f927c06 100644 --- a/drivers/net/ixgbe/ixgbe_ethdev.c +++ b/drivers/net/ixgbe/ixgbe_ethdev.c @@ -1747,6 +1747,9 @@ eth_ixgbe_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_eth_dev *pf_ethdev = rte_eth_dev_allocated( pci_dev->device.name); + if (pf_ethdev == NULL) + return -ENODEV; + for (i = 0; i < eth_da.nb_representor_ports; i++) { struct ixgbe_vf_info *vfinfo; struct ixgbe_vf_representor representor; -- 2.14.3
[dpdk-dev] pthread_barrier_deadlock in -rc1 (was: "Re: [PATCH v3 0/5] fix control thread affinities")
Hi Olivier, On 04/24/2018 04:46 PM, Olivier Matz wrote: Some parts of dpdk use their own management threads. Most of the time, the affinity of the thread is not properly set: it should not be scheduled on the dataplane cores, because interrupting them can cause packet losses. This patchset introduces a new wrapper for thread creation that does the job automatically, avoiding code duplication. v3: * new patch: use this API in examples when relevant. * replace pthread_kill by pthread_cancel. Note that pthread_join() is still needed. * rebase: vfio and pdump do not have control pthreads anymore, and eal has 2 new pthreads * remove all calls to snprintf/strlcpy that truncate the thread name: all strings lengths are already < 16. v2: * set affinity to master core if no core is off, as suggested by Anatoly Olivier Matz (5): eal: use sizeof to avoid a double use of a define eal: new function to create control threads eal: set name when creating a control thread eal: set affinity for control threads examples: use new API to create control threads drivers/net/kni/Makefile | 1 + drivers/net/kni/rte_eth_kni.c| 3 +- examples/tep_termination/main.c | 16 +++ examples/vhost/main.c| 19 +++- lib/librte_eal/bsdapp/eal/eal.c | 4 +- lib/librte_eal/bsdapp/eal/eal_thread.c | 2 +- lib/librte_eal/common/eal_common_proc.c | 15 ++ lib/librte_eal/common/eal_common_thread.c| 72 lib/librte_eal/common/include/rte_lcore.h| 26 ++ lib/librte_eal/linuxapp/eal/eal.c| 4 +- lib/librte_eal/linuxapp/eal/eal_interrupts.c | 17 ++- lib/librte_eal/linuxapp/eal/eal_thread.c | 2 +- lib/librte_eal/linuxapp/eal/eal_timer.c | 12 + lib/librte_eal/rte_eal_version.map | 1 + lib/librte_vhost/socket.c| 25 ++ 15 files changed, 135 insertions(+), 84 deletions(-) I face a deadlock issue with your series, that Jianfeng patch does not resolve ("eal: fix threads block on barrier"). Reverting the series and Jianfeng patch makes the issue to disappear. I face the problem in a VM (not seen on the host): # ./install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 --proc-type auto --file-prefix pg -- --portmask=3 --forward-mode=macswap --port-topology=chained --disable-rss -i --rxq=1 --txq=1 --rxd=256 --txd=256 --nb-cores=2 --auto-start EAL: Detected 3 lcore(s) EAL: Detected 1 NUMA nodes EAL: Auto-detected process type: PRIMARY EAL: Multi-process socket /var/run/.pg_unix Then it is stuck. Attaching with GDB, I get below backtrace information: (gdb) info threads Id Target Id Frame 3Thread 0x7f63e1f9f700 (LWP 8808) "rte_mp_handle" 0x7f63e2591bfd in recvmsg () at ../sysdeps/unix/syscall-template.S:81 2Thread 0x7f63e179e700 (LWP 8809) "rte_mp_async" pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 * 1Thread 0x7f63e32cec00 (LWP 8807) "testpmd" pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 (gdb) bt full #0 pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 No locals. #1 0x00520c54 in rte_ctrl_thread_create (thread=thread@entry=0x7ffe5c895020, name=name@entry=0x869d86 "rte_mp_async", attr=attr@entry=0x0, start_routine=start_routine@entry=0x521030 , arg=arg@entry=0x0) at /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:207 params = 0x17b1e40 lcore_id = cpuset = {__bits = {1, 0 }} cpu_found = ret = 0 #2 0x005220b6 in rte_mp_channel_init () at /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:674 path = "/var/run\000.pg_unix_*", '\000' ... 
dir_fd = 4 mp_handle_tid = 140066969745152 async_reply_handle_tid = 140066961352448 #3 0x0050c227 in rte_eal_init (argc=argc@entry=23, argv=argv@entry=0x7ffe5c896378) at /root/src/dpdk/lib/librte_eal/linuxapp/eal/eal.c:775 i = fctret = 11 ret = thread_id = 140066989861888 run_once = {cnt = 1} logid = 0x17b1e00 "testpmd" cpuset = "T}\211\\\376\177", '\000' , "\020", '\000' ... thread_name = "X}\211\\\376\177\000\000\226\301\036\342c\177\000" __func__ = "rte_eal_init" #4 0x00473214 in main (argc=23, argv=0x7ffe5c896378) at /root/src/dpdk/app/test-pmd/testpmd.c:2597 diag = port_id = ret = __func__ = "main" (gdb) thread 2 [Switching to thread 2 (Thread 0x7f63e179e700 (LWP 8809))] #0 pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 71 cmpl%edx, (%rdi) (gdb) bt full #0 pthread_barrier_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 No locals. #1 0x
[dpdk-dev] [PATCH] vhost: improve dirty pages logging performance
This patch caches all dirty pages logging until the used ring index is updated. These dirty pages won't be accessed by the guest as long as the host doesn't give them back to it by updating the index. The goal of this optimization is to fix a performance regression introduced when the vhost library started to use atomic operations to set bits in the shared dirty log map. While the fix was valid as previous implementation wasn't safe against concurent accesses, contention was induced. With this patch, during migration, we have: 1. Less atomic operations as only a single atomic OR operation per 32 pages. 2. Less atomic operations as during a burst, the same page will be marked dirty only once. 3. Less write memory barriers. Fixes: 897f13a1f726 ("vhost: make page logging atomic") Cc: sta...@dpdk.org Suggested-by: Michael S. Tsirkin Signed-off-by: Maxime Coquelin --- Hi, This series was tested with migrating a guest while running PVP benchmark at 1Mpps with both ovs-dpdk and testpmd as vswitch. With this patch we recover the packet drops regressions seen since the use of atomic operations to log dirty pages. Some numbers: A) PVP Live migration using testpmd (Single queue pair) --- Without patch: =Stream Rate: 1Mpps== No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss 0 1Mpps 134 1896616 11628963.0 1 1Mpps 125 18790168436300.0 2 1Mpps 122 1917115 13881342.0 3 1Mpps 132 1891315 12079492.0Max 1Mpps 134 1917116 13881342 Min 1Mpps 122 1879015 8436300 Mean 1Mpps 128 1896015 11506524 Median 1Mpps 128 1893915 11854227 Stdev 0 5.68158.81 0.58 2266371.52 = With patch: =Stream Rate: 1Mpps== No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss 0 1Mpps 119 1352115 478174.0 1 1Mpps 116 1408414 452018.0 2 1Mpps 122 1390814 464486.0 3 1Mpps 129 1378716 478234.0 Max 1Mpps 129 1408416 478234 Min 1Mpps 116 1352114 452018 Mean 1Mpps 121 1382514 468228 Median 1Mpps 120 1384714 471330 Stdev 0 5.57236.52 0.96 12593.78 = B) OVS-DPDK migration with 2 queue pairs Without patch: ===Stream Rate: 1Mpps No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss 0 1Mpps 146 20270 116 15937394.0 1 1Mpps 150 2561716 11120370.0 2 1Mpps 138 40983 115 24779971.0 3 1Mpps 161 2043517 15363519.0 Max 1Mpps 161 40983 116 24779971 Min 1Mpps 138 2027016 11120370 Mean 1Mpps 148 2682666 16800313 Median 1Mpps 148 2302666 15650456 Stdev 0 9.579758.9 57.16 5737179.93 = With patch: ===Stream Rate: 1Mpps No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss 0 1Mpps 155 1891517 481330.0 1 1Mpps 156 2109718 370556.0 2 1Mpps 156 4961015 369291.0 3 1Mpps 144 3112415 361914.0 Max 1Mpps 156 4961018 481330 Min 1Mpps 144 1891515 361914 Mean 1Mpps 152 3018616 395772 Median 1Mpps 155 2611016 369923 Stdev 0 5.85 13997.82 1.5 57165.33 = C) OVS-DPDK migration with single queue pair Without patch: ===Stream Rate: 1Mpps No Stream_Rate Downtime Totaltime Ping_Loss trex_Loss 0 1Mpps 129 1741115 11105414.0 1 1Mpps 130 16544158028438.0 2 1Mpps 132 15202157491584.0 3 1Mpps 133 18100158385047.0 Max 1Mpps 133 1810015 11105414 Min 1Mpps 129 1520215 7491584 Mean 1Mp
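A rough sketch of the caching scheme described above; the names and sizes here are illustrative and not the exact patch. Dirty bits are first accumulated per 32-page block in a small private cache, and the shared log is only touched with a single atomic OR per block when the cache is flushed, just before the used ring index is updated.

#include <stdint.h>
#include <rte_atomic.h>

#define LOG_CACHE_NR 32

struct log_cache_entry {
    uint32_t offset; /* which 32-bit word of the shared dirty log */
    uint32_t val;    /* dirty bits accumulated locally */
};

static void
log_cache_sync(struct log_cache_entry *cache, uint16_t nr_cached,
        uint32_t *shared_log)
{
    uint16_t i;

    if (nr_cached == 0)
        return;

    rte_smp_wmb(); /* order guest page writes before log updates */
    for (i = 0; i < nr_cached; i++)
        __sync_fetch_and_or(&shared_log[cache[i].offset], cache[i].val);
    rte_smp_wmb(); /* order log updates before the used index write */
}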
Re: [dpdk-dev] pthread_barrier_deadlock in -rc1 (was: "Re: [PATCH v3 0/5] fix control thread affinities")
Hi Maxime, Le 30 avril 2018 17:45:52 GMT+02:00, Maxime Coquelin a écrit : >Hi Olivier, > >On 04/24/2018 04:46 PM, Olivier Matz wrote: >> Some parts of dpdk use their own management threads. Most of the >time, >> the affinity of the thread is not properly set: it should not be >scheduled >> on the dataplane cores, because interrupting them can cause packet >losses. >> >> This patchset introduces a new wrapper for thread creation that does >> the job automatically, avoiding code duplication. >> >> v3: >> * new patch: use this API in examples when relevant. >> * replace pthread_kill by pthread_cancel. Note that pthread_join() >>is still needed. >> * rebase: vfio and pdump do not have control pthreads anymore, and >eal >>has 2 new pthreads >> * remove all calls to snprintf/strlcpy that truncate the thread name: >>all strings lengths are already < 16. >> >> v2: >> * set affinity to master core if no core is off, as suggested by >>Anatoly >> >> Olivier Matz (5): >>eal: use sizeof to avoid a double use of a define >>eal: new function to create control threads >>eal: set name when creating a control thread >>eal: set affinity for control threads >>examples: use new API to create control threads >> >> drivers/net/kni/Makefile | 1 + >> drivers/net/kni/rte_eth_kni.c| 3 +- >> examples/tep_termination/main.c | 16 +++ >> examples/vhost/main.c| 19 +++- >> lib/librte_eal/bsdapp/eal/eal.c | 4 +- >> lib/librte_eal/bsdapp/eal/eal_thread.c | 2 +- >> lib/librte_eal/common/eal_common_proc.c | 15 ++ >> lib/librte_eal/common/eal_common_thread.c| 72 > >> lib/librte_eal/common/include/rte_lcore.h| 26 ++ >> lib/librte_eal/linuxapp/eal/eal.c| 4 +- >> lib/librte_eal/linuxapp/eal/eal_interrupts.c | 17 ++- >> lib/librte_eal/linuxapp/eal/eal_thread.c | 2 +- >> lib/librte_eal/linuxapp/eal/eal_timer.c | 12 + >> lib/librte_eal/rte_eal_version.map | 1 + >> lib/librte_vhost/socket.c| 25 ++ >> 15 files changed, 135 insertions(+), 84 deletions(-) >> > >I face a deadlock issue with your series, that Jianfeng patch does not >resolve ("eal: fix threads block on barrier"). Reverting the series and >Jianfeng patch makes the issue to disappear. > >I face the problem in a VM (not seen on the host): ># ./install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 --proc-type >auto >--file-prefix pg -- --portmask=3 --forward-mode=macswap >--port-topology=chained --disable-rss -i --rxq=1 --txq=1 --rxd=256 >--txd=256 --nb-cores=2 --auto-start >EAL: Detected 3 lcore(s) >EAL: Detected 1 NUMA nodes >EAL: Auto-detected process type: PRIMARY >EAL: Multi-process socket /var/run/.pg_unix > > >Then it is stuck. Attaching with GDB, I get below backtrace >information: > >(gdb) info threads > Id Target Id Frame > 3Thread 0x7f63e1f9f700 (LWP 8808) "rte_mp_handle" >0x7f63e2591bfd in recvmsg () at >../sysdeps/unix/syscall-template.S:81 > 2Thread 0x7f63e179e700 (LWP 8809) "rte_mp_async" >pthread_barrier_wait () at >../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >* 1Thread 0x7f63e32cec00 (LWP 8807) "testpmd" pthread_barrier_wait >() at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >(gdb) bt full >#0 pthread_barrier_wait () at >../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >No locals. 
>#1 0x00520c54 in rte_ctrl_thread_create >(thread=thread@entry=0x7ffe5c895020, name=name@entry=0x869d86 >"rte_mp_async", attr=attr@entry=0x0, >start_routine=start_routine@entry=0x521030 , >arg=arg@entry=0x0) > at /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:207 > params = 0x17b1e40 > lcore_id = > cpuset = {__bits = {1, 0 }} > cpu_found = > ret = 0 >#2 0x005220b6 in rte_mp_channel_init () at >/root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:674 >path = "/var/run\000.pg_unix_*", '\000' ... > dir_fd = 4 > mp_handle_tid = 140066969745152 > async_reply_handle_tid = 140066961352448 >#3 0x0050c227 in rte_eal_init (argc=argc@entry=23, >argv=argv@entry=0x7ffe5c896378) at >/root/src/dpdk/lib/librte_eal/linuxapp/eal/eal.c:775 > i = > fctret = 11 > ret = > thread_id = 140066989861888 > run_once = {cnt = 1} > logid = 0x17b1e00 "testpmd" > cpuset = "T}\211\\\376\177", '\000' , >"\020", '\000' ... > thread_name = "X}\211\\\376\177\000\000\226\301\036\342c\177\000" > __func__ = "rte_eal_init" >#4 0x00473214 in main (argc=23, argv=0x7ffe5c896378) at >/root/src/dpdk/app/test-pmd/testpmd.c:2597 > diag = > port_id = > ret = > __func__ = "main" >(gdb) thread 2 >[Switching to thread 2
[dpdk-dev] [PATCH] net/vmxnet3: convert to new rx offload api
Ethdev RX offloads API has changed since: commit ce17eddefc20 ("ethdev: introduce Rx queue offloads API") This patch adopts the new RX Offload API in vmxnet3 driver. Signed-off-by: Louis Luo --- drivers/net/vmxnet3/vmxnet3_ethdev.c | 61 ++-- 1 file changed, 45 insertions(+), 16 deletions(-) diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c b/drivers/net/vmxnet3/vmxnet3_ethdev.c index 4568521..d9d5bda 100644 --- a/drivers/net/vmxnet3/vmxnet3_ethdev.c +++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c @@ -42,6 +42,23 @@ #defineVMXNET3_TX_MAX_SEG UINT8_MAX +#define VMXNET3_TX_OFFLOAD_CAP \ + (DEV_TX_OFFLOAD_VLAN_INSERT | \ +DEV_TX_OFFLOAD_IPV4_CKSUM |\ +DEV_TX_OFFLOAD_TCP_CKSUM | \ +DEV_TX_OFFLOAD_UDP_CKSUM | \ +DEV_TX_OFFLOAD_TCP_TSO | \ +DEV_TX_OFFLOAD_MULTI_SEGS) + +#define VMXNET3_RX_OFFLOAD_CAP \ + (DEV_RX_OFFLOAD_VLAN_STRIP |\ +DEV_RX_OFFLOAD_SCATTER | \ +DEV_RX_OFFLOAD_IPV4_CKSUM |\ +DEV_RX_OFFLOAD_UDP_CKSUM | \ +DEV_RX_OFFLOAD_TCP_CKSUM | \ +DEV_RX_OFFLOAD_TCP_LRO | \ +DEV_RX_OFFLOAD_JUMBO_FRAME) + static int eth_vmxnet3_dev_init(struct rte_eth_dev *eth_dev); static int eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev); static int vmxnet3_dev_configure(struct rte_eth_dev *dev); @@ -376,9 +393,25 @@ vmxnet3_dev_configure(struct rte_eth_dev *dev) const struct rte_memzone *mz; struct vmxnet3_hw *hw = dev->data->dev_private; size_t size; + uint64_t rx_offloads = dev->data->dev_conf.rxmode.offloads; + uint64_t tx_offloads = dev->data->dev_conf.txmode.offloads; PMD_INIT_FUNC_TRACE(); + if ((rx_offloads & VMXNET3_RX_OFFLOAD_CAP) != rx_offloads) { + RTE_LOG(ERR, PMD, "Requested RX offloads 0x%lx" + " do not match supported 0x%lx\n", + rx_offloads, (uint64_t)VMXNET3_RX_OFFLOAD_CAP); + return -ENOTSUP; + } + + if ((tx_offloads & VMXNET3_TX_OFFLOAD_CAP) != tx_offloads) { + RTE_LOG(ERR, PMD, "Requested TX offloads 0x%lx" + " do not match supported 0x%lx\n", + tx_offloads, (uint64_t)VMXNET3_TX_OFFLOAD_CAP); + return -ENOTSUP; + } + if (dev->data->nb_tx_queues > VMXNET3_MAX_TX_QUEUES || dev->data->nb_rx_queues > VMXNET3_MAX_RX_QUEUES) { PMD_INIT_LOG(ERR, "ERROR: Number of queues not supported"); @@ -567,6 +600,7 @@ vmxnet3_setup_driver_shared(struct rte_eth_dev *dev) uint32_t mtu = dev->data->mtu; Vmxnet3_DriverShared *shared = hw->shared; Vmxnet3_DSDevRead *devRead = &shared->devRead; + uint64_t rx_offloads = dev->data->dev_conf.rxmode.offloads; uint32_t i; int ret; @@ -644,10 +678,10 @@ vmxnet3_setup_driver_shared(struct rte_eth_dev *dev) devRead->rxFilterConf.rxMode = 0; /* Setting up feature flags */ - if (dev->data->dev_conf.rxmode.hw_ip_checksum) + if (rx_offloads & DEV_RX_OFFLOAD_CHECKSUM) devRead->misc.uptFeatures |= VMXNET3_F_RXCSUM; - if (dev->data->dev_conf.rxmode.enable_lro) { + if (rx_offloads & DEV_RX_OFFLOAD_TCP_LRO) { devRead->misc.uptFeatures |= VMXNET3_F_LRO; devRead->misc.maxNumRxSG = 0; } @@ -1050,17 +1084,10 @@ vmxnet3_dev_info_get(struct rte_eth_dev *dev __rte_unused, .nb_mtu_seg_max = VMXNET3_MAX_TXD_PER_PKT, }; - dev_info->rx_offload_capa = - DEV_RX_OFFLOAD_VLAN_STRIP | - DEV_RX_OFFLOAD_UDP_CKSUM | - DEV_RX_OFFLOAD_TCP_CKSUM | - DEV_RX_OFFLOAD_TCP_LRO; - - dev_info->tx_offload_capa = - DEV_TX_OFFLOAD_VLAN_INSERT | - DEV_TX_OFFLOAD_TCP_CKSUM | - DEV_TX_OFFLOAD_UDP_CKSUM | - DEV_TX_OFFLOAD_TCP_TSO; + dev_info->rx_offload_capa = VMXNET3_RX_OFFLOAD_CAP; + dev_info->rx_queue_offload_capa = 0; + dev_info->tx_offload_capa = VMXNET3_TX_OFFLOAD_CAP; + dev_info->tx_queue_offload_capa = 0; } static const uint32_t * @@ -1154,8 +1181,9 @@ vmxnet3_dev_promiscuous_disable(struct rte_eth_dev 
*dev) { struct vmxnet3_hw *hw = dev->data->dev_private; uint32_t *vf_table = hw->shared->devRead.rxFilterConf.vfTable; + uint64_t rx_offloads = dev->data->dev_conf.rxmode.offloads; - if (dev->data->dev_conf.rxmode.hw_vlan_filter) + if (rx_offloads & DEV_RX_OFFLOAD_VLAN_FILTER) memcpy(vf_table, hw->shadow_vfta, VMXNET3_VFT_TABLE_SIZE); else memset(vf_table, 0xff, VMXNET3_VFT_TABLE_SIZE); @@ -1217,9 +1245,10 @@ vmxnet3_dev_vlan_offload_set(struct rte_eth_dev *dev, int mask) struct vmxnet3_hw *hw = dev->data->dev_private; Vmxnet3_DSDevRead *devRead
Re: [dpdk-dev] [PATCH] net/vmxnet3: convert to new rx offload api
> -Original Message- > From: Louis Luo [mailto:llo...@vmware.com] > Sent: Monday, April 30, 2018 3:21 PM > To: Yong Wang > Cc: dev@dpdk.org; Louis Luo > Subject: [PATCH] net/vmxnet3: convert to new rx offload api > > Ethdev RX offloads API has changed since: commit ce17eddefc20 > ("ethdev: introduce Rx queue offloads API") > > This patch adopts the new RX Offload API in vmxnet3 driver. > > Signed-off-by: Louis Luo Acked-by: Yong Wang > --- > drivers/net/vmxnet3/vmxnet3_ethdev.c | 61 > ++-- > 1 file changed, 45 insertions(+), 16 deletions(-) > > diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c > b/drivers/net/vmxnet3/vmxnet3_ethdev.c > index 4568521..d9d5bda 100644 > --- a/drivers/net/vmxnet3/vmxnet3_ethdev.c > +++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c > @@ -42,6 +42,23 @@ > > #define VMXNET3_TX_MAX_SEG UINT8_MAX > > +#define VMXNET3_TX_OFFLOAD_CAP \ > + (DEV_TX_OFFLOAD_VLAN_INSERT | \ > + DEV_TX_OFFLOAD_IPV4_CKSUM |\ > + DEV_TX_OFFLOAD_TCP_CKSUM | \ > + DEV_TX_OFFLOAD_UDP_CKSUM | \ > + DEV_TX_OFFLOAD_TCP_TSO | \ > + DEV_TX_OFFLOAD_MULTI_SEGS) > + > +#define VMXNET3_RX_OFFLOAD_CAP \ > + (DEV_RX_OFFLOAD_VLAN_STRIP |\ > + DEV_RX_OFFLOAD_SCATTER | \ > + DEV_RX_OFFLOAD_IPV4_CKSUM |\ > + DEV_RX_OFFLOAD_UDP_CKSUM | \ > + DEV_RX_OFFLOAD_TCP_CKSUM | \ > + DEV_RX_OFFLOAD_TCP_LRO | \ > + DEV_RX_OFFLOAD_JUMBO_FRAME) > + > static int eth_vmxnet3_dev_init(struct rte_eth_dev *eth_dev); > static int eth_vmxnet3_dev_uninit(struct rte_eth_dev *eth_dev); > static int vmxnet3_dev_configure(struct rte_eth_dev *dev); > @@ -376,9 +393,25 @@ vmxnet3_dev_configure(struct rte_eth_dev *dev) > const struct rte_memzone *mz; > struct vmxnet3_hw *hw = dev->data->dev_private; > size_t size; > + uint64_t rx_offloads = dev->data->dev_conf.rxmode.offloads; > + uint64_t tx_offloads = dev->data->dev_conf.txmode.offloads; > > PMD_INIT_FUNC_TRACE(); > > + if ((rx_offloads & VMXNET3_RX_OFFLOAD_CAP) != rx_offloads) { > + RTE_LOG(ERR, PMD, "Requested RX offloads 0x%lx" > + " do not match supported 0x%lx\n", > + rx_offloads, > (uint64_t)VMXNET3_RX_OFFLOAD_CAP); > + return -ENOTSUP; > + } > + > + if ((tx_offloads & VMXNET3_TX_OFFLOAD_CAP) != tx_offloads) { > + RTE_LOG(ERR, PMD, "Requested TX offloads 0x%lx" > + " do not match supported 0x%lx\n", > + tx_offloads, (uint64_t)VMXNET3_TX_OFFLOAD_CAP); > + return -ENOTSUP; > + } > + > if (dev->data->nb_tx_queues > VMXNET3_MAX_TX_QUEUES || > dev->data->nb_rx_queues > VMXNET3_MAX_RX_QUEUES) { > PMD_INIT_LOG(ERR, "ERROR: Number of queues not > supported"); > @@ -567,6 +600,7 @@ vmxnet3_setup_driver_shared(struct rte_eth_dev > *dev) > uint32_t mtu = dev->data->mtu; > Vmxnet3_DriverShared *shared = hw->shared; > Vmxnet3_DSDevRead *devRead = &shared->devRead; > + uint64_t rx_offloads = dev->data->dev_conf.rxmode.offloads; > uint32_t i; > int ret; > > @@ -644,10 +678,10 @@ vmxnet3_setup_driver_shared(struct rte_eth_dev > *dev) > devRead->rxFilterConf.rxMode = 0; > > /* Setting up feature flags */ > - if (dev->data->dev_conf.rxmode.hw_ip_checksum) > + if (rx_offloads & DEV_RX_OFFLOAD_CHECKSUM) > devRead->misc.uptFeatures |= VMXNET3_F_RXCSUM; > > - if (dev->data->dev_conf.rxmode.enable_lro) { > + if (rx_offloads & DEV_RX_OFFLOAD_TCP_LRO) { > devRead->misc.uptFeatures |= VMXNET3_F_LRO; > devRead->misc.maxNumRxSG = 0; > } > @@ -1050,17 +1084,10 @@ vmxnet3_dev_info_get(struct rte_eth_dev *dev > __rte_unused, > .nb_mtu_seg_max = VMXNET3_MAX_TXD_PER_PKT, > }; > > - dev_info->rx_offload_capa = > - DEV_RX_OFFLOAD_VLAN_STRIP | > - DEV_RX_OFFLOAD_UDP_CKSUM | > - DEV_RX_OFFLOAD_TCP_CKSUM | > - 
DEV_RX_OFFLOAD_TCP_LRO; > - > - dev_info->tx_offload_capa = > - DEV_TX_OFFLOAD_VLAN_INSERT | > - DEV_TX_OFFLOAD_TCP_CKSUM | > - DEV_TX_OFFLOAD_UDP_CKSUM | > - DEV_TX_OFFLOAD_TCP_TSO; > + dev_info->rx_offload_capa = VMXNET3_RX_OFFLOAD_CAP; > + dev_info->rx_queue_offload_capa = 0; > + dev_info->tx_offload_capa = VMXNET3_TX_OFFLOAD_CAP; > + dev_info->tx_queue_offload_capa = 0; > } > > static const uint32_t * > @@ -1154,8 +1181,9 @@ vmxnet3_dev_promiscuous_disable(struct > rte_eth_dev *dev) > { > struct vmxnet3_hw *hw = dev->data->dev_private; > uint32_t *vf_table = hw->shared->devRead.rxFilterConf.vfTable; > + uint64_t rx_offloads = dev->data->dev_conf.rxmode.offloads; > > - if (dev->data->dev_conf.rxmode.hw_vlan_filter) > + if (rx_offloads & DEV_RX_OFFLOAD_VLAN_FILTER) [...]
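For context, a minimal application-side sketch of the configuration model this conversion targets (assumed code, not part of the patch; the port id and queue counts are placeholders). With the new API, per-port Rx offloads are requested through rxmode.offloads instead of the old hw_ip_checksum / enable_lro bit-fields, and PMDs such as vmxnet3 validate the requested mask in dev_configure:

#include <rte_ethdev.h>

static int
configure_port_new_offload_api(uint16_t port_id)
{
	struct rte_eth_conf conf = {
		.rxmode = {
			/* On 18.05-era releases this flag selects the new offloads API. */
			.ignore_offload_bitfield = 1,
			.offloads = DEV_RX_OFFLOAD_CHECKSUM |
				    DEV_RX_OFFLOAD_TCP_LRO,
		},
	};

	/* Returns -ENOTSUP if an offload outside the PMD's capability mask
	 * is requested. */
	return rte_eth_dev_configure(port_id, 1, 1, &conf);
}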
[dpdk-dev] [PATCH 03/12] net/bnxt: rename driver version from Cumulus to NetXtreme
From: Scott Branden Rename driver version from "Broadcom Cumulus driver" to "Broadcom NetXtreme driver" to reflect this driver is applicable to NetXtreme family beyond Cumulus. Signed-off-by: Scott Branden Reviewed-by: Ajit Kumar Khaparde --- drivers/net/bnxt/bnxt_ethdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 3e2ccfa90..58241ccac 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -29,7 +29,7 @@ #define DRV_MODULE_NAME"bnxt" static const char bnxt_version[] = - "Broadcom Cumulus driver " DRV_MODULE_NAME "\n"; + "Broadcom NetXtreme driver " DRV_MODULE_NAME "\n"; int bnxt_logtype_driver; #define PCI_VENDOR_ID_BROADCOM 0x14E4 -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 04/12] net/bnxt: return EINVAL instead of ENOSPC on invalid max ring
From: Jay Ding Return EINVAL instead of ENOSPC when an invalid queue_idx is passed to the Rx and Tx queue_setup_op routines. Signed-off-by: Jay Ding Signed-off-by: Scott Branden Reviewed-by: Ray Jui Reviewed-by: Ajit Kumar Khaparde --- drivers/net/bnxt/bnxt_rxq.c | 2 +- drivers/net/bnxt/bnxt_txq.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/bnxt/bnxt_rxq.c b/drivers/net/bnxt/bnxt_rxq.c index e939c9ac0..4e6fa4e30 100644 --- a/drivers/net/bnxt/bnxt_rxq.c +++ b/drivers/net/bnxt/bnxt_rxq.c @@ -290,7 +290,7 @@ int bnxt_rx_queue_setup_op(struct rte_eth_dev *eth_dev, PMD_DRV_LOG(ERR, "Cannot create Rx ring %d. Only %d rings available\n", queue_idx, bp->max_rx_rings); - return -ENOSPC; + return -EINVAL; } if (!nb_desc || nb_desc > MAX_RX_DESC_CNT) { diff --git a/drivers/net/bnxt/bnxt_txq.c b/drivers/net/bnxt/bnxt_txq.c index 07e25d77b..b50f37cf2 100644 --- a/drivers/net/bnxt/bnxt_txq.c +++ b/drivers/net/bnxt/bnxt_txq.c @@ -86,7 +86,7 @@ int bnxt_tx_queue_setup_op(struct rte_eth_dev *eth_dev, PMD_DRV_LOG(ERR, "Cannot create Tx ring %d. Only %d rings available\n", queue_idx, bp->max_tx_rings); - return -ENOSPC; + return -EINVAL; } if (!nb_desc || nb_desc > MAX_TX_DESC_CNT) { -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 01/12] net/bnxt: add support for lsc interrupt event
From: Qingmin Liu Add support to bnxt driver to register RTE_ETH_EVENT_INTR_LSC event and monitor physical link status. Signed-off-by: Qingmin Liu Signed-off-by: Scott Branden Signed-off-by: Ajit Kumar Khaparde Reviewed-by: Randy Schacher --- drivers/net/bnxt/bnxt_ethdev.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 348129dad..229017ace 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -780,6 +780,11 @@ int bnxt_link_update_op(struct rte_eth_dev *eth_dev, int wait_to_complete) new.link_speed != eth_dev->data->dev_link.link_speed) { memcpy(ð_dev->data->dev_link, &new, sizeof(struct rte_eth_link)); + + _rte_eth_dev_callback_process(eth_dev, + RTE_ETH_EVENT_INTR_LSC, + NULL); + bnxt_print_link_info(eth_dev); } -- 2.15.1 (Apple Git-101)
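For reference, a hedged sketch of how an application consumes the link-status-change events the driver now reports (assumed application code, not from the patch; the callback signature follows the ethdev API of this period):

#include <stdio.h>
#include <rte_ethdev.h>

static int
lsc_event_cb(uint16_t port_id, enum rte_eth_event_type type,
	     void *cb_arg, void *ret_param)
{
	(void)cb_arg;
	(void)ret_param;
	if (type == RTE_ETH_EVENT_INTR_LSC)
		printf("port %u: link status changed\n", port_id);
	return 0;
}

/* During setup (port_id is a placeholder):
 *	conf.intr_conf.lsc = 1;		-- enable link status interrupts
 *	rte_eth_dev_configure(port_id, 1, 1, &conf);
 *	rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC,
 *				      lsc_event_cb, NULL);
 */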
[dpdk-dev] [PATCH 02/12] net/bnxt: rename function checking MAC address
From: Scott Branden rename check_zero_bytes to bnxt_check_zero_bytes to match proper prefix. Signed-off-by: Scott Branden Signed-off-by: Ajit Kumar Khaparde --- drivers/net/bnxt/bnxt_ethdev.c | 2 +- drivers/net/bnxt/bnxt_filter.c | 8 +--- drivers/net/bnxt/bnxt_filter.h | 2 +- 3 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 229017ace..3e2ccfa90 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -3286,7 +3286,7 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev) goto error_free; } - if (check_zero_bytes(bp->dflt_mac_addr, ETHER_ADDR_LEN)) { + if (bnxt_check_zero_bytes(bp->dflt_mac_addr, ETHER_ADDR_LEN)) { PMD_DRV_LOG(ERR, "Invalid MAC addr %02X:%02X:%02X:%02X:%02X:%02X\n", bp->dflt_mac_addr[0], bp->dflt_mac_addr[1], diff --git a/drivers/net/bnxt/bnxt_filter.c b/drivers/net/bnxt/bnxt_filter.c index dadd1e32f..e36da9977 100644 --- a/drivers/net/bnxt/bnxt_filter.c +++ b/drivers/net/bnxt/bnxt_filter.c @@ -231,7 +231,7 @@ nxt_non_void_action(const struct rte_flow_action *cur) } } -int check_zero_bytes(const uint8_t *bytes, int len) +int bnxt_check_zero_bytes(const uint8_t *bytes, int len) { int i; for (i = 0; i < len; i++) @@ -512,13 +512,15 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp, ipv6_spec->hdr.src_addr, 16); rte_memcpy(filter->dst_ipaddr, ipv6_spec->hdr.dst_addr, 16); - if (!check_zero_bytes(ipv6_mask->hdr.src_addr, 16)) { + if (!bnxt_check_zero_bytes(ipv6_mask->hdr.src_addr, + 16)) { rte_memcpy(filter->src_ipaddr_mask, ipv6_mask->hdr.src_addr, 16); en |= !use_ntuple ? 0 : NTUPLE_FLTR_ALLOC_INPUT_EN_SRC_IPADDR_MASK; } - if (!check_zero_bytes(ipv6_mask->hdr.dst_addr, 16)) { + if (!bnxt_check_zero_bytes(ipv6_mask->hdr.dst_addr, + 16)) { rte_memcpy(filter->dst_ipaddr_mask, ipv6_mask->hdr.dst_addr, 16); en |= !use_ntuple ? 0 : diff --git a/drivers/net/bnxt/bnxt_filter.h b/drivers/net/bnxt/bnxt_filter.h index c70b127ac..d27be7032 100644 --- a/drivers/net/bnxt/bnxt_filter.h +++ b/drivers/net/bnxt/bnxt_filter.h @@ -69,7 +69,7 @@ struct bnxt_filter_info *bnxt_get_unused_filter(struct bnxt *bp); void bnxt_free_filter(struct bnxt *bp, struct bnxt_filter_info *filter); struct bnxt_filter_info *bnxt_get_l2_filter(struct bnxt *bp, struct bnxt_filter_info *nf, struct bnxt_vnic_info *vnic); -int check_zero_bytes(const uint8_t *bytes, int len); +int bnxt_check_zero_bytes(const uint8_t *bytes, int len); #define NTUPLE_FLTR_ALLOC_INPUT_EN_SRC_MACADDR \ HWRM_CFA_NTUPLE_FILTER_ALLOC_INPUT_ENABLES_SRC_MACADDR -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 06/12] net/bnxt: set MTU in dev config for jumbo packets
From: Qingmin Liu MTU setting does not take effect after rte_eth_dev_configure is called with jumbo frames enabled unless the MTU is also configured through the set_mtu dev_op. Fixes: daef48efe5e5 ("net/bnxt: support set MTU") Cc: sta...@dpdk.org Signed-off-by: Qingmin Liu Signed-off-by: Scott Branden Reviewed-by: Jay Ding Reviewed-by: Randy Schacher Reviewed-by: Ajit Kumar Khaparde Signed-off-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_ethdev.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 58241ccac..e68608f61 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -151,6 +151,7 @@ static const struct rte_pci_id bnxt_pci_id_map[] = { static int bnxt_vlan_offload_set_op(struct rte_eth_dev *dev, int mask); static void bnxt_print_link_info(struct rte_eth_dev *eth_dev); +static int bnxt_mtu_set_op(struct rte_eth_dev *eth_dev, uint16_t new_mtu); /***/ @@ -548,10 +549,12 @@ static int bnxt_dev_configure_op(struct rte_eth_dev *eth_dev) bp->rx_cp_nr_rings = bp->rx_nr_rings; bp->tx_cp_nr_rings = bp->tx_nr_rings; - if (rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) + if (rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) { eth_dev->data->mtu = eth_dev->data->dev_conf.rxmode.max_rx_pkt_len - ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE; + bnxt_mtu_set_op(eth_dev, eth_dev->data->mtu); + } return 0; } -- 2.15.1 (Apple Git-101)
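A short application-side sketch of the path this fix targets (assumed code, not from the patch; the 9018-byte frame length and queue counts are placeholders). With the fix, the MTU derived in dev_configure is also programmed into the firmware via the driver's mtu_set op, so a separate rte_eth_dev_set_mtu() call is no longer required for jumbo frames to take effect:

#include <rte_ethdev.h>

static int
enable_jumbo_frames(uint16_t port_id)
{
	struct rte_eth_conf conf = {
		.rxmode = {
			.ignore_offload_bitfield = 1,
			.offloads = DEV_RX_OFFLOAD_JUMBO_FRAME,
			.max_rx_pkt_len = 9018,
		},
	};

	return rte_eth_dev_configure(port_id, 1, 1, &conf);
}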
[dpdk-dev] [PATCH 05/12] net/bnxt: Validate structs and pointers before use
From: Rahul Gupta Validate that pointers, including txq and rxq, do not point to uninitialized areas before using them, to prevent the bnxt driver from crashing. Signed-off-by: Rahul Gupta Signed-off-by: Jay Ding Signed-off-by: Scott Branden Reviewed-by: Ray Jui Reviewed-by: Ajit Kumar Khaparde Reviewed-by: Randy Schacher Tested-by: Randy Schacher --- drivers/net/bnxt/bnxt_ring.c | 3 +++ drivers/net/bnxt/bnxt_rxq.c | 6 ++ drivers/net/bnxt/bnxt_txq.c | 9 + 3 files changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c index 8e822e11f..aa9f3f4cc 100644 --- a/drivers/net/bnxt/bnxt_ring.c +++ b/drivers/net/bnxt/bnxt_ring.c @@ -24,6 +24,9 @@ void bnxt_free_ring(struct bnxt_ring *ring) { + if (!ring) + return; + if (ring->vmem_size && *ring->vmem) { memset((char *)*ring->vmem, 0, ring->vmem_size); *ring->vmem = NULL; diff --git a/drivers/net/bnxt/bnxt_rxq.c b/drivers/net/bnxt/bnxt_rxq.c index 4e6fa4e30..4b380d4f0 100644 --- a/drivers/net/bnxt/bnxt_rxq.c +++ b/drivers/net/bnxt/bnxt_rxq.c @@ -23,10 +23,8 @@ void bnxt_free_rxq_stats(struct bnxt_rx_queue *rxq) { - struct bnxt_cp_ring_info *cpr = rxq->cp_ring; - - if (cpr->hw_stats) - cpr->hw_stats = NULL; + if (rxq && rxq->cp_ring && rxq->cp_ring->hw_stats) + rxq->cp_ring->hw_stats = NULL; } int bnxt_mq_rx_configure(struct bnxt *bp) diff --git a/drivers/net/bnxt/bnxt_txq.c b/drivers/net/bnxt/bnxt_txq.c index b50f37cf2..b9b975e4c 100644 --- a/drivers/net/bnxt/bnxt_txq.c +++ b/drivers/net/bnxt/bnxt_txq.c @@ -19,10 +19,8 @@ void bnxt_free_txq_stats(struct bnxt_tx_queue *txq) { - struct bnxt_cp_ring_info *cpr = txq->cp_ring; - - if (cpr->hw_stats) - cpr->hw_stats = NULL; + if (txq && txq->cp_ring && txq->cp_ring->hw_stats) + txq->cp_ring->hw_stats = NULL; } static void bnxt_tx_queue_release_mbufs(struct bnxt_tx_queue *txq) @@ -30,6 +28,9 @@ static void bnxt_tx_queue_release_mbufs(struct bnxt_tx_queue *txq) struct bnxt_sw_tx_bd *sw_ring; uint16_t i; + if (!txq) + return; + sw_ring = txq->tx_ring->tx_buf_ring; if (sw_ring) { for (i = 0; i < txq->tx_ring->tx_ring_struct->ring_size; i++) { -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 07/12] net/bnxt: fix MTU calculation
We were not considering the case of nested VLANs while calculating MTU. This patch takes care of the same. Fixes: daef48efe5e5 ("net/bnxt: support set MTU") Cc: sta...@dpdk.org Signed-off-by: Qingmin Liu Signed-off-by: Scott Branden Reviewed-by: Jay Ding Reviewed-by: Ajit Kumar Khaparde Reviewed-by: Randy Schacher Signed-off-by: Ajit Khaparde --- drivers/net/bnxt/bnxt.h| 1 + drivers/net/bnxt/bnxt_ethdev.c | 3 ++- drivers/net/bnxt/bnxt_hwrm.c | 9 ++--- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h index 97b0e0853..110cdb992 100644 --- a/drivers/net/bnxt/bnxt.h +++ b/drivers/net/bnxt/bnxt.h @@ -23,6 +23,7 @@ #define BNXT_MAX_MTU 9500 #define VLAN_TAG_SIZE 4 #define BNXT_MAX_LED 4 +#define BNXT_NUM_VLANS 2 struct bnxt_led_info { uint8_t led_id; diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index e68608f61..20ed0a31f 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -552,7 +552,8 @@ static int bnxt_dev_configure_op(struct rte_eth_dev *eth_dev) if (rx_offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) { eth_dev->data->mtu = eth_dev->data->dev_conf.rxmode.max_rx_pkt_len - - ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE; + ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE * + BNXT_NUM_VLANS; bnxt_mtu_set_op(eth_dev, eth_dev->data->mtu); } return 0; diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index bc8773509..c136edc06 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -2360,7 +2360,8 @@ static int bnxt_hwrm_pf_func_cfg(struct bnxt *bp, int tx_rings) req.flags = rte_cpu_to_le_32(bp->pf.func_cfg_flags); req.mtu = rte_cpu_to_le_16(BNXT_MAX_MTU); req.mru = rte_cpu_to_le_16(bp->eth_dev->data->mtu + ETHER_HDR_LEN + - ETHER_CRC_LEN + VLAN_TAG_SIZE); + ETHER_CRC_LEN + VLAN_TAG_SIZE * + BNXT_NUM_VLANS); req.num_rsscos_ctxs = rte_cpu_to_le_16(bp->max_rsscos_ctx); req.num_stat_ctxs = rte_cpu_to_le_16(bp->max_stat_ctx); req.num_cmpl_rings = rte_cpu_to_le_16(bp->max_cp_rings); @@ -2397,9 +2398,11 @@ static void populate_vf_func_cfg_req(struct bnxt *bp, HWRM_FUNC_CFG_INPUT_ENABLES_NUM_HW_RING_GRPS); req->mtu = rte_cpu_to_le_16(bp->eth_dev->data->mtu + ETHER_HDR_LEN + - ETHER_CRC_LEN + VLAN_TAG_SIZE); + ETHER_CRC_LEN + VLAN_TAG_SIZE * + BNXT_NUM_VLANS); req->mru = rte_cpu_to_le_16(bp->eth_dev->data->mtu + ETHER_HDR_LEN + - ETHER_CRC_LEN + VLAN_TAG_SIZE); + ETHER_CRC_LEN + VLAN_TAG_SIZE * + BNXT_NUM_VLANS); req->num_rsscos_ctxs = rte_cpu_to_le_16(bp->max_rsscos_ctx / (num_vfs + 1)); req->num_stat_ctxs = rte_cpu_to_le_16(bp->max_stat_ctx / (num_vfs + 1)); -- 2.15.1 (Apple Git-101)
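The overhead arithmetic the patch applies, shown as a standalone sketch (the constant values mirror the usual DPDK ones and appear here only for illustration):

#include <stdint.h>

#define ETHER_HDR_LEN	14
#define ETHER_CRC_LEN	4
#define VLAN_TAG_SIZE	4
#define BNXT_NUM_VLANS	2	/* outer + inner tag for QinQ frames */

static inline uint16_t
frame_len_to_mtu(uint16_t max_rx_pkt_len)
{
	/* e.g. 9026 -> 9026 - 14 - 4 - 2 * 4 = 9000 */
	return max_rx_pkt_len - ETHER_HDR_LEN - ETHER_CRC_LEN -
	       VLAN_TAG_SIZE * BNXT_NUM_VLANS;
}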
[dpdk-dev] [PATCH 08/12] net/bnxt: return error if init is not complete before accessing stats
From: Jay Ding return error if init is not complete before accessing stats. Fixes: ed2ced6fe927 ("net/bnxt: check initialization before accessing stats") Cc: sta...@dpdk.org Signed-off-by: Jay Ding Signed-off-by: Scott Branden Reviewed-by: Ajit Kumar Khaparde Reviewed-by: Randy Schacher Signed-off-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_stats.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_stats.c b/drivers/net/bnxt/bnxt_stats.c index 1b586f333..c1a8fad09 100644 --- a/drivers/net/bnxt/bnxt_stats.c +++ b/drivers/net/bnxt/bnxt_stats.c @@ -210,7 +210,7 @@ int bnxt_stats_get_op(struct rte_eth_dev *eth_dev, memset(bnxt_stats, 0, sizeof(*bnxt_stats)); if (!(bp->flags & BNXT_FLAG_INIT_DONE)) { PMD_DRV_LOG(ERR, "Device Initialization not complete!\n"); - return 0; + return -1; } for (i = 0; i < bp->rx_cp_nr_rings; i++) { -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 00/12] bnxt patchset
Patchset against dpdk-next-net. Please apply. Ajit Khaparde (3): net/bnxt: fix MTU calculation net/bnxt: fix to reset status of initialization net/bnxt: fix usage of vnic id Jay Ding (2): net/bnxt: return EINVAL instead of ENOSPC on invalid max ring net/bnxt: return error if init is not complete before accessing stats Qingmin Liu (2): net/bnxt: add support for lsc interrupt event net/bnxt: set MTU in dev config for jumbo packets Rahul Gupta (1): net/bnxt: Validate structs and pointers before use Randy Schacher (1): net/bnxt: clear HWRM sniffer list for PFs Scott Branden (2): net/bnxt: rename function checking MAC address net/bnxt: rename driver version from Cumulus to NetXtreme Xiaoxin Peng (1): net/bnxt: fix rx mbuf and agg ring leak in dev stop drivers/net/bnxt/bnxt.h| 1 + drivers/net/bnxt/bnxt_ethdev.c | 23 --- drivers/net/bnxt/bnxt_filter.c | 8 +--- drivers/net/bnxt/bnxt_filter.h | 2 +- drivers/net/bnxt/bnxt_hwrm.c | 26 +++--- drivers/net/bnxt/bnxt_ring.c | 3 +++ drivers/net/bnxt/bnxt_rxq.c| 14 +++--- drivers/net/bnxt/bnxt_stats.c | 2 +- drivers/net/bnxt/bnxt_txq.c| 11 ++- 9 files changed, 59 insertions(+), 31 deletions(-) -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 10/12] net/bnxt: fix rx mbuf and agg ring leak in dev stop
From: Xiaoxin Peng In the start/stop_op operations, mbufs allocated for the rings were not freed. 1) Add bnxt_free_tx_mbufs/bnxt_free_rx_mbufs in bnxt_dev_stop_op to free the mbufs before freeing the rings. 2) The mbuf allocation and free routines were not in sync: allocation uses ring->ring_size, which includes any rounding-up and multiplying factors, while the free routine used the requested queue size. Fixes: c09f57b49c13 ("net/bnxt: add start/stop/link update operations") Cc: sta...@dpdk.org Signed-off-by: Jay Ding Signed-off-by: Scott Branden Reviewed-by: Ray Jui Reviewed-by: Randy Schacher Signed-off-by: Xiaoxin Peng Signed-off-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_ethdev.c | 4 ++-- drivers/net/bnxt/bnxt_rxq.c| 6 -- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 352fc30b4..dc445f9a5 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -655,6 +655,8 @@ static void bnxt_dev_stop_op(struct rte_eth_dev *eth_dev) } bnxt_set_hwrm_link_config(bp, false); bnxt_hwrm_port_clr_stats(bp); + bnxt_free_tx_mbufs(bp); + bnxt_free_rx_mbufs(bp); bnxt_shutdown_nic(bp); bp->dev_stopped = 1; } @@ -666,8 +668,6 @@ static void bnxt_dev_close_op(struct rte_eth_dev *eth_dev) if (bp->dev_stopped == 0) bnxt_dev_stop_op(eth_dev); - bnxt_free_tx_mbufs(bp); - bnxt_free_rx_mbufs(bp); bnxt_free_mem(bp); if (eth_dev->data->mac_addrs != NULL) { rte_free(eth_dev->data->mac_addrs); diff --git a/drivers/net/bnxt/bnxt_rxq.c b/drivers/net/bnxt/bnxt_rxq.c index 4b380d4f0..866fb56b1 100644 --- a/drivers/net/bnxt/bnxt_rxq.c +++ b/drivers/net/bnxt/bnxt_rxq.c @@ -207,7 +207,8 @@ static void bnxt_rx_queue_release_mbufs(struct bnxt_rx_queue *rxq) if (rxq) { sw_ring = rxq->rx_ring->rx_buf_ring; if (sw_ring) { - for (i = 0; i < rxq->nb_rx_desc; i++) { + for (i = 0; +i < rxq->rx_ring->rx_ring_struct->ring_size; i++) { if (sw_ring[i].mbuf) { rte_pktmbuf_free_seg(sw_ring[i].mbuf); sw_ring[i].mbuf = NULL; @@ -217,7 +218,8 @@ static void bnxt_rx_queue_release_mbufs(struct bnxt_rx_queue *rxq) /* Free up mbufs in Agg ring */ sw_ring = rxq->rx_ring->ag_buf_ring; if (sw_ring) { - for (i = 0; i < rxq->nb_rx_desc; i++) { + for (i = 0; +i < rxq->rx_ring->ag_ring_struct->ring_size; i++) { if (sw_ring[i].mbuf) { rte_pktmbuf_free_seg(sw_ring[i].mbuf); sw_ring[i].mbuf = NULL; -- 2.15.1 (Apple Git-101)
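To illustrate point 2) above, a hedged sketch of why the release loop must be bounded by ring_size rather than by the requested descriptor count (the exact rounding used by the driver is assumed here for illustration):

/*
 * Assumed example: if ring creation rounds the requested count up,
 * e.g. ring_size = rte_align32pow2(nb_desc) turning 1000 into 1024,
 * then mbufs can be posted into any of the ring_size slots.  A free
 * loop bounded by nb_rx_desc would leak the last
 * (ring_size - nb_rx_desc) mbufs on every stop, which is why the fix
 * iterates over rx_ring_struct->ring_size (and ag_ring_struct->ring_size
 * for the aggregation ring) instead.
 */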
[dpdk-dev] [PATCH 09/12] net/bnxt: fix to reset status of initialization
clear flag on stop at proper location to avoid race conditions. Fixes: ed2ced6fe927 ("net/bnxt: check initialization before accessing stats") Cc: sta...@dpdk.org Signed-off-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_ethdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 20ed0a31f..352fc30b4 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -648,13 +648,13 @@ static void bnxt_dev_stop_op(struct rte_eth_dev *eth_dev) { struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private; + bp->flags &= ~BNXT_FLAG_INIT_DONE; if (bp->eth_dev->data->dev_started) { /* TBD: STOP HW queues DMA */ eth_dev->data->dev_link.link_status = 0; } bnxt_set_hwrm_link_config(bp, false); bnxt_hwrm_port_clr_stats(bp); - bp->flags &= ~BNXT_FLAG_INIT_DONE; bnxt_shutdown_nic(bp); bp->dev_stopped = 1; } -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 12/12] net/bnxt: clear HWRM sniffer list for PFs
From: Randy Schacher Clear HWRM sniffer list for DPDK PFs so that VFs on DPDK PFs initialize successfully. DPDK PF driver does not handle HWRM commands from VFs. Signed-off-by: Randy Schacher Signed-off-by: Scott Branden Reviewed-by: Ajit Kumar Khaparde --- drivers/net/bnxt/bnxt_hwrm.c | 9 + 1 file changed, 9 insertions(+) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index d3c50e490..5b9840d4f 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -611,6 +611,15 @@ int bnxt_hwrm_func_driver_register(struct bnxt *bp) memcpy(req.vf_req_fwd, bp->pf.vf_req_fwd, RTE_MIN(sizeof(req.vf_req_fwd), sizeof(bp->pf.vf_req_fwd))); + + /* +* PF can sniff HWRM API issued by VF. This can be set up by +* linux driver and inherited by the DPDK PF driver. Clear +* this HWRM sniffer list in FW because DPDK PF driver does +* not support this. +*/ + req.flags = + rte_cpu_to_le_32(HWRM_FUNC_DRV_RGTR_INPUT_FLAGS_FWD_NONE_MODE); } req.async_event_fwd[0] |= -- 2.15.1 (Apple Git-101)
[dpdk-dev] [PATCH 11/12] net/bnxt: fix usage of vnic id
VNIC ID returned by the FW is a 16-bit field. We are incorrectly using it as a 32-bit value in few places. This patch corrects that. Fixes: daef48efe5e5 ("net/bnxt: support set MTU") Cc: sta...@dpdk.org Signed-off-by: Ajit Khaparde Signed-off-by: Scott Branden Reviewed-by: Michael Wildt Reviewed-by: Randy Schacher Reviewed-by: Ray Jui --- drivers/net/bnxt/bnxt_hwrm.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index c136edc06..d3c50e490 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -1212,7 +1212,7 @@ static int bnxt_hwrm_vnic_plcmodes_qcfg(struct bnxt *bp, HWRM_PREP(req, VNIC_PLCMODES_QCFG); - req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id); + req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id); rc = bnxt_hwrm_send_message(bp, &req, sizeof(req)); @@ -1240,7 +1240,7 @@ static int bnxt_hwrm_vnic_plcmodes_cfg(struct bnxt *bp, HWRM_PREP(req, VNIC_PLCMODES_CFG); - req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id); + req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id); req.flags = rte_cpu_to_le_32(pmode->flags); req.jumbo_thresh = rte_cpu_to_le_16(pmode->jumbo_thresh); req.hds_offset = rte_cpu_to_le_16(pmode->hds_offset); @@ -1484,7 +1484,7 @@ int bnxt_hwrm_vnic_plcmode_cfg(struct bnxt *bp, size -= RTE_PKTMBUF_HEADROOM; req.jumbo_thresh = rte_cpu_to_le_16(size); - req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id); + req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id); rc = bnxt_hwrm_send_message(bp, &req, sizeof(req)); @@ -1520,7 +1520,7 @@ int bnxt_hwrm_vnic_tpa_cfg(struct bnxt *bp, rte_cpu_to_le_16(HWRM_VNIC_TPA_CFG_INPUT_MAX_AGGS_MAX); req.min_agg_len = rte_cpu_to_le_32(512); } - req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id); + req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id); rc = bnxt_hwrm_send_message(bp, &req, sizeof(req)); -- 2.15.1 (Apple Git-101)
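A hedged sketch of why the conversion width matters for this 16-bit field (not from the patch; the values assume the standard rte_byteorder semantics):

#include <assert.h>
#include <stdint.h>
#include <rte_byteorder.h>

static void
vnic_id_conversion_example(void)
{
	uint16_t id = 0x0001;
	uint16_t good = rte_cpu_to_le_16(id);
	uint16_t bad = (uint16_t)rte_cpu_to_le_32(id);

#if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
	/* The 32-bit swap moves the id into the upper bytes, which are
	 * then lost when the result is truncated into the 16-bit field. */
	assert(good == 0x0100 && bad == 0x0000);
#else
	/* On little-endian x86 both forms happen to match, which is why
	 * the bug went unnoticed there. */
	assert(good == id && bad == id);
#endif
}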
Re: [dpdk-dev] [PATCH 5/8 v4] raw/dpaa2_qdma: introduce the DPAA2 QDMA driver
> -Original Message- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Monday, April 30, 2018 6:05 PM > To: Nipun Gupta > Cc: dev@dpdk.org; Shreyansh Jain ; Hemant > Agrawal > Subject: Re: [dpdk-dev] [PATCH 5/8 v4] raw/dpaa2_qdma: introduce the > DPAA2 QDMA driver > > 24/04/2018 13:49, Nipun Gupta: > > drivers/raw/dpaa2_qdma/dpaa2_qdma.c| 294 > + > > drivers/raw/dpaa2_qdma/dpaa2_qdma.h| 66 + > > drivers/raw/dpaa2_qdma/dpaa2_qdma_logs.h | 46 > [...] > > +install_headers('rte_pmd_dpaa2_qdma.h') > > I think you need to rename the exported header file with rte_pmd_ prefix. Sorry, I did not get it. Filename is already with rte_pmd_ prefix. Thanks, Nipun >
Re: [dpdk-dev] [PATCH 5/8 v4] raw/dpaa2_qdma: introduce the DPAA2 QDMA driver
> -Original Message- > From: Nipun Gupta > Sent: Tuesday, May 1, 2018 11:44 AM > To: 'Thomas Monjalon' > Cc: dev@dpdk.org; Shreyansh Jain ; Hemant > Agrawal > Subject: RE: [dpdk-dev] [PATCH 5/8 v4] raw/dpaa2_qdma: introduce the > DPAA2 QDMA driver > > > > > -Original Message- > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > Sent: Monday, April 30, 2018 6:05 PM > > To: Nipun Gupta > > Cc: dev@dpdk.org; Shreyansh Jain ; Hemant > > Agrawal > > Subject: Re: [dpdk-dev] [PATCH 5/8 v4] raw/dpaa2_qdma: introduce the > > DPAA2 QDMA driver > > > > 24/04/2018 13:49, Nipun Gupta: > > > drivers/raw/dpaa2_qdma/dpaa2_qdma.c| 294 > > + > > > drivers/raw/dpaa2_qdma/dpaa2_qdma.h| 66 + > > > drivers/raw/dpaa2_qdma/dpaa2_qdma_logs.h | 46 > > [...] > > > +install_headers('rte_pmd_dpaa2_qdma.h') > > > > I think you need to rename the exported header file with rte_pmd_ prefix. Got it. This should be in the next patch where 'rte_pmd_dpaa2_qdma.h' file has been introduced. I will fix it and respin a version. > > Sorry, I did not get it. Filename is already with rte_pmd_ prefix. > > Thanks, > Nipun > > >