Re: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
Hi Lei, It's on my todo list, I'll check this as soon as possible. Olivier On Thu, Feb 01, 2018 at 03:14:15AM +, Yao, Lei A wrote: > Hi, Olivier > > This is Lei from DPDK validation team in Intel. During our DPDK 18.02-rc1 > test, > I find the following patch will cause one serious issue with virtio vector > path: > the traffic can't resume after stop/start the virtio device. > > The step like following: > 1. Launch vhost-user port using testpmd at Host > 2. Launch VM with virtio device, mergeable is off > 3. Bind the virtio device to pmd driver, launch testpmd, let the tx/rx use > vector path > virtio_xmit_pkts_simple > virtio_recv_pkts_vec > 4. Send traffic to virtio device from vhost side, then stop the virtio device > 5. Start the virtio device again > After step 5, the traffic can't resume. > > Could you help check this and give a fix? This issue will impact the virtio > pmd user > experience heavily. By the way, this patch is already included into V17.11. > Looks like > we need give a patch to this LTS version. Thanks a lot! > > BRs > Lei > > -Original Message- > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz > > Sent: Thursday, September 7, 2017 8:14 PM > > To: dev@dpdk.org; y...@fridaylinux.org; maxime.coque...@redhat.com > > Cc: step...@networkplumber.org; sta...@dpdk.org > > Subject: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency > > > > In rx/tx queue setup functions, some code is executed only if > > use_simple_rxtx == 1. The value of this variable can change depending on > > the offload flags or sse support. If Rx queue setup is called before Tx > > queue setup, it can result in an invalid configuration: > > > > - dev_configure is called: use_simple_rxtx is initialized to 0 > > - rx queue setup is called: queues are initialized without simple path > > support > > - tx queue setup is called: use_simple_rxtx switch to 1, and simple > > Rx/Tx handlers are selected > > > > Fix this by postponing a part of Rx/Tx queue initialization in > > dev_start(), as it was the case in the initial implementation. 
> > > > Fixes: 48cec290a3d2 ("net/virtio: move queue configure code to proper > > place") > > Cc: sta...@dpdk.org > > > > Signed-off-by: Olivier Matz > > --- > > drivers/net/virtio/virtio_ethdev.c | 13 + > > drivers/net/virtio/virtio_ethdev.h | 6 ++ > > drivers/net/virtio/virtio_rxtx.c | 40 ++- > > --- > > 3 files changed, 51 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/net/virtio/virtio_ethdev.c > > b/drivers/net/virtio/virtio_ethdev.c > > index 8eee3ff80..c7888f103 100644 > > --- a/drivers/net/virtio/virtio_ethdev.c > > +++ b/drivers/net/virtio/virtio_ethdev.c > > @@ -1737,6 +1737,19 @@ virtio_dev_start(struct rte_eth_dev *dev) > > struct virtnet_rx *rxvq; > > struct virtnet_tx *txvq __rte_unused; > > struct virtio_hw *hw = dev->data->dev_private; > > + int ret; > > + > > + /* Finish the initialization of the queues */ > > + for (i = 0; i < dev->data->nb_rx_queues; i++) { > > + ret = virtio_dev_rx_queue_setup_finish(dev, i); > > + if (ret < 0) > > + return ret; > > + } > > + for (i = 0; i < dev->data->nb_tx_queues; i++) { > > + ret = virtio_dev_tx_queue_setup_finish(dev, i); > > + if (ret < 0) > > + return ret; > > + } > > > > /* check if lsc interrupt feature is enabled */ > > if (dev->data->dev_conf.intr_conf.lsc) { > > diff --git a/drivers/net/virtio/virtio_ethdev.h > > b/drivers/net/virtio/virtio_ethdev.h > > index c3413c6d9..2039bc547 100644 > > --- a/drivers/net/virtio/virtio_ethdev.h > > +++ b/drivers/net/virtio/virtio_ethdev.h > > @@ -92,10 +92,16 @@ int virtio_dev_rx_queue_setup(struct rte_eth_dev > > *dev, uint16_t rx_queue_id, > > const struct rte_eth_rxconf *rx_conf, > > struct rte_mempool *mb_pool); > > > > +int virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, > > + uint16_t rx_queue_id); > > + > > int virtio_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t > > tx_queue_id, > > uint16_t nb_tx_desc, unsigned int socket_id, > > const struct rte_eth_txconf *tx_conf); > > > > +int virtio_dev_tx_queue_setup_finish(struct rte_eth_dev *dev, > > + uint16_t tx_queue_id); > > + > > uint16_t virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, > > uint16_t nb_pkts); > > > > diff --git a/drivers/net/virtio/virtio_rxtx.c > > b/drivers/net/virtio/virtio_rxtx.c > > index e30377c51..a32e3229f 100644 > > --- a/drivers/net/virtio/virtio_rxtx.c > > +++ b/drivers/net/virtio/virtio_rxtx.c > > @@ -421,9 +421,6 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev, > > struct virtio_hw *hw = dev->data->dev_private; > > struct virtqueue *vq = hw->vqs[vtpci_queue_idx]; > > struct virtnet_rx *
Re: [dpdk-dev] [PATCH v2] net/mlx4: fix drop flow resources not freed
Wednesday, January 31, 2018 5:33 PM, Adrien Mazarguil:
> Resources allocated for drop flow rules are not freed properly. This causes a
> memory leak and triggers an assertion failure on a reference counter when
> compiled in debug mode.
>
> This issue can be reproduced with testpmd by entering the following
> commands:
>
>   flow create 0 ingress pattern eth / end actions drop / end
>   port start all
>   port stop all
>   port start all
>   port stop all
>   quit
>
> The reason is additional references are taken when re-enabling existing flow
> rules, a common occurrence when rehashing configuration.
>
> Fixes: d3a7e09234e4 ("net/mlx4: allocate drop flow resources on demand")
> Cc: sta...@dpdk.org
>
> Reported-by: Moti Haimovsky
> Signed-off-by: Adrien Mazarguil
> ---

Applied to next-net-mlx, thanks.
Re: [dpdk-dev] [RFC v2 04/17] mempool: add op to populate objects using provided memory
On 01/31/2018 07:45 PM, Olivier Matz wrote: On Tue, Jan 23, 2018 at 01:15:59PM +, Andrew Rybchenko wrote: The callback allows to customize how objects are stored in the memory chunk. Default implementation of the callback which simply puts objects one by one is available. Suggested-by: Olivier Matz Signed-off-by: Andrew Rybchenko ... +int +rte_mempool_populate_one_by_one(struct rte_mempool *mp, unsigned int max_objs, + void *vaddr, rte_iova_t iova, size_t len, + rte_mempool_populate_obj_cb_t *obj_cb) We shall find a better name for this function. Unfortunatly rte_mempool_populate_default() already exists... I have no better idea right now, but we'll try in the next version. May be rte_mempool_op_populate_default()? I'm also wondering if having a file rte_mempool_ops_default.c with all the default behaviors would make sense? I think it is a good idea. Will do. ... @@ -466,16 +487,13 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr, else off = RTE_PTR_ALIGN_CEIL(vaddr, RTE_CACHE_LINE_SIZE) - vaddr; - while (off + total_elt_sz <= len && mp->populated_size < mp->size) { - off += mp->header_size; - if (iova == RTE_BAD_IOVA) - mempool_add_elem(mp, (char *)vaddr + off, - RTE_BAD_IOVA); - else - mempool_add_elem(mp, (char *)vaddr + off, iova + off); - off += mp->elt_size + mp->trailer_size; - i++; - } + if (off > len) + return -EINVAL; + + i = rte_mempool_ops_populate(mp, mp->size - mp->populated_size, + (char *)vaddr + off, + (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off), + len - off, mempool_add_elem); My initial idea was to provide populate_iova(), populate_virt(), ... as mempool ops. I don't see any strong requirement for doing it now, but on the other hand it would break the API to do it later. What's your opinion? Suggested solution keeps only generic house-keeping inside rte_mempool_populate_iova() (driver-data alloc/init, generic check if the pool is already populated, maintenance of the memory chunks list and object cache-alignment requirements). I think that only the last item is questionable, but cache-line alignment is hard-wired in object size calculation as well which is not customizable yet. May be we should add callback for object size calculation with default fallback and move object cache-line alignment into populate() callback. As for populate_virt() etc right now all these functions finally come to populate_iova(). I have no customization usecases for these functions in my mind, so it is hard to guess required set of parameters. That's why I kept it as is for now. (In general I prefer to avoid overkill solutions since chances of success (100% guess of the prototype) are small) May be someone else on the list have usecases in mind? Also, I see that mempool_add_elem() is passed as a callback to rte_mempool_ops_populate(). Instead, would it make sense to export mempool_add_elem() and let the implementation of populate() ops to call it? I think callback gives a bit more freedom and allows to pass own function which does some actions (e.g. filtering) per object. In fact I think opaque parameter should be added to the callback prototype to make it really useful for customization (to provide specific context and make it possible to chain callbacks).
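For reference, a minimal sketch of what the discussed default populate op could look like once the per-object callback carries an opaque context argument. The names below (sketch_populate_obj_cb_t, sketch_op_populate_default) are hypothetical and not part of any released API; the layout loop mirrors the code quoted from rte_mempool_populate_iova() above, and the obj_cb_arg parameter is the opaque context suggested at the end of the message.

#include <rte_mempool.h>

/* hypothetical callback type: the 'opaque' argument carries caller context */
typedef void (sketch_populate_obj_cb_t)(struct rte_mempool *mp,
                void *opaque, void *obj, rte_iova_t iova);

static int
sketch_op_populate_default(struct rte_mempool *mp, unsigned int max_objs,
                void *vaddr, rte_iova_t iova, size_t len,
                sketch_populate_obj_cb_t *obj_cb, void *obj_cb_arg)
{
        size_t total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
        size_t off;
        unsigned int i;
        void *obj;

        /* put objects one by one, as the default behavior described above */
        for (off = 0, i = 0; off + total_elt_sz <= len && i < max_objs; i++) {
                off += mp->header_size;
                obj = (char *)vaddr + off;
                obj_cb(mp, obj_cb_arg, obj,
                       (iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : iova + off);
                off += mp->elt_size + mp->trailer_size;
        }

        return (int)i; /* number of objects added */
}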
Re: [dpdk-dev] [RFC v2 11/17] mempool: ensure the mempool is initialized before populating
On 01/31/2018 07:45 PM, Olivier Matz wrote: On Tue, Jan 23, 2018 at 01:16:06PM +, Andrew Rybchenko wrote: From: "Artem V. Andreev" Callback to calculate required memory area size may require mempool driver data to be already allocated and initialized. Signed-off-by: Artem V. Andreev Signed-off-by: Andrew Rybchenko --- lib/librte_mempool/rte_mempool.c | 29 ++--- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index fc9c95a..cbb4dd5 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -370,6 +370,21 @@ rte_mempool_free_memchunks(struct rte_mempool *mp) } } +static int +mempool_maybe_initialize(struct rte_mempool *mp) +{ + int ret; + + /* create the internal ring if not already done */ + if ((mp->flags & MEMPOOL_F_POOL_CREATED) == 0) { + ret = rte_mempool_ops_alloc(mp); + if (ret != 0) + return ret; + mp->flags |= MEMPOOL_F_POOL_CREATED; + } + return 0; +} mempool_ops_alloc_once() ? Yes, I like it. Will fix.
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: > On 02/01/2018 08:05 AM, santosh wrote: >> On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: >>> On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: There is not specified dependency between rte_mempool_populate_default() and rte_mempool_populate_iova(). So, the second should not rely on the fact that the first adds capability flags to the mempool flags. Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") Cc: sta...@dpdk.org Signed-off-by: Andrew Rybchenko >>> Looks good to me. I agree it's strange that the mp->flags are >>> updated with capabilities only in rte_mempool_populate_default(). >>> I see that this behavior is removed later in the patchset since the >>> get_capa() is removed! >>> >>> However maybe this single patch could go in 18.02. >>> +Santosh +Jerin since it's mostly about Octeon. >> rte_mempool_xmem_size should return correct size if >> MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag >> is set in 'mp->flags'. Thats why _ops_get_capabilities() called in >> _populate_default() but not >> at _populate_iova(). >> I think, this 'alone' patch may break octeontx mempool. > > The patch does not touch rte_mempool_populate_default(). > _ops_get_capabilities() is still called there before > rte_mempool_xmem_size(). The theoretical problem which > the patch tries to fix is the case when > rte_mempool_populate_default() is not called at all. I.e. application > calls _ops_get_capabilities() to get flags, then, together with > mp->flags, calls rte_mempool_xmem_size() directly, allocates > calculated amount of memory and calls _populate_iova(). > In that case, Application does like below: /* Get mempool capabilities */ mp_flags = 0; ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); if ((ret < 0) && (ret != -ENOTSUP)) return ret; /* update mempool capabilities */ mp->flags |= mp_flags; /* calc xmem sz */ size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift, mp->flags); /* rsrv memory */ mz = rte_memzone_reserve_aligned(mz_name, size,...); /* now populate iova */ ret = rte_mempool_populate_iova(mp,,..); won't it work? However I understand that clubbing `_get_ops_capa() + flag-updation` into _populate_iova() perhaps better from user PoV. > Since later patches of the series reconsider memory size > calculation etc, it is up to you if it makes sense to apply it > in 18.02 as a fix. >
Re: [dpdk-dev] [PATCH v2] doc: add a user guidance document for igb
> -Original Message- > From: Zhao1, Wei > Sent: Wednesday, January 31, 2018 8:47 AM > To: dev@dpdk.org > Cc: Mcnamara, John ; Lu, Wenzhuo > ; Zhao1, Wei > Subject: [PATCH v2] doc: add a user guidance document for igb > > This patch adds a user guidance document specific for the igb nic. > By now, a doc like ixgbe.rst is also needed by the igb nic. So this patch adds > igb.rst to record important information about igb, like supported features > and known issues. Hi, Thanks for the doc. It is something we should have had a while ago. Some comments below. > +.. BSD LICENSE > +Copyright(c) 2018 Intel Corporation. All rights reserved. > +All rights reserved. > + You should probably use an SPDX header here. > +IGB Poll Mode Driver > + > + > +The IGB PMD (librte_pmd_e1000) provides poll mode driver support. Maybe use something a bit more descriptive here like: The IGB PMD (``librte_pmd_e1000``) provides poll mode driver support for Intel 1GbE nics. > + > +Features > + > + > +Features of the IGB PMD are: Could you fill in some of these as a bullet list like: Features of the IGB PMD are: * VLAN * VxLAN * IEEE 1588 * etc. The rest of the doc looks good. John
Re: [dpdk-dev] [PATCH 10/14] vhost: vring address setup for packed queues
Hi Jens, On 01/29/2018 03:11 PM, Jens Freimann wrote: From: Yuanhan Liu Add code to set up packed queues when enabled. Signed-off-by: Yuanhan Liu Signed-off-by: Jens Freimann --- lib/librte_vhost/vhost.c | 4 lib/librte_vhost/vhost.h | 1 + lib/librte_vhost/vhost_user.c | 17 - 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 1dd9adbc7..78913912c 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -536,6 +536,9 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) { struct virtio_net *dev = get_device(vid); + if (dev->features & (1ULL << VIRTIO_F_PACKED)) + return 0; + This check should be done after dev is checked non-null. if (dev == NULL) return -1; @@ -545,6 +548,7 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) return -1; } + Trailing line. dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY; return 0; }
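As a minimal sketch of the reordering requested above (the vhost-internal helpers get_device() and struct virtio_net are assumed from lib/librte_vhost, and VIRTIO_F_PACKED is the feature bit introduced by this series), the NULL check simply has to come before the first dereference of dev:

int
rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable)
{
        struct virtio_net *dev = get_device(vid);

        /* validate the device before touching dev->features */
        if (dev == NULL)
                return -1;

        /* packed ring: nothing to do here in this RFC */
        if (dev->features & (1ULL << VIRTIO_F_PACKED))
                return 0;

        /* (other validation from the original function elided) */
        dev->virtqueue[queue_id]->used->flags = VRING_USED_F_NO_NOTIFY;
        return 0;
}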
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On 02/01/2018 12:09 PM, santosh wrote: On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: On 02/01/2018 08:05 AM, santosh wrote: On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: There is not specified dependency between rte_mempool_populate_default() and rte_mempool_populate_iova(). So, the second should not rely on the fact that the first adds capability flags to the mempool flags. Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") Cc: sta...@dpdk.org Signed-off-by: Andrew Rybchenko Looks good to me. I agree it's strange that the mp->flags are updated with capabilities only in rte_mempool_populate_default(). I see that this behavior is removed later in the patchset since the get_capa() is removed! However maybe this single patch could go in 18.02. +Santosh +Jerin since it's mostly about Octeon. rte_mempool_xmem_size should return correct size if MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag is set in 'mp->flags'. Thats why _ops_get_capabilities() called in _populate_default() but not at _populate_iova(). I think, this 'alone' patch may break octeontx mempool. The patch does not touch rte_mempool_populate_default(). _ops_get_capabilities() is still called there before rte_mempool_xmem_size(). The theoretical problem which the patch tries to fix is the case when rte_mempool_populate_default() is not called at all. I.e. application calls _ops_get_capabilities() to get flags, then, together with mp->flags, calls rte_mempool_xmem_size() directly, allocates calculated amount of memory and calls _populate_iova(). In that case, Application does like below: /* Get mempool capabilities */ mp_flags = 0; ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); if ((ret < 0) && (ret != -ENOTSUP)) return ret; /* update mempool capabilities */ mp->flags |= mp_flags; Above line is not mandatory. "mp->flags | mp_flags" could be simply passed to rte_mempool_xmem_size() below. /* calc xmem sz */ size = rte_mempool_xmem_size(n, total_elt_sz, pg_shift, mp->flags); /* rsrv memory */ mz = rte_memzone_reserve_aligned(mz_name, size,...); /* now populate iova */ ret = rte_mempool_populate_iova(mp,,..); won't it work? However I understand that clubbing `_get_ops_capa() + flag-updation` into _populate_iova() perhaps better from user PoV. Since later patches of the series reconsider memory size calculation etc, it is up to you if it makes sense to apply it in 18.02 as a fix.
Re: [dpdk-dev] [PATCH 11/14] vhost: add helpers for packed virtqueues
On 01/29/2018 03:11 PM, Jens Freimann wrote: Add some helper functions to set/check descriptor flags and toggle the used wrap counter. Signed-off-by: Jens Freimann --- lib/librte_vhost/virtio-1.1.h | 43 +++ 1 file changed, 43 insertions(+) diff --git a/lib/librte_vhost/virtio-1.1.h b/lib/librte_vhost/virtio-1.1.h index 5ca0bc33f..84039797e 100644 --- a/lib/librte_vhost/virtio-1.1.h +++ b/lib/librte_vhost/virtio-1.1.h @@ -17,4 +17,47 @@ struct vring_desc_1_1 { uint16_t flags; }; +static inline void +toggle_wrap_counter(struct vhost_virtqueue *vq) +{ + vq->used_wrap_counter ^= 1; +} + +static inline int +desc_is_avail(struct vhost_virtqueue *vq, struct vring_desc_1_1 *desc) +{ + if (!vq) + return -1; Maybe use unlikely() here? Maxime
Re: [dpdk-dev] [dpdk-stable] [PATCH] doc: fix documentation for testpmd ddp add del function
> -Original Message- > From: stable [mailto:stable-boun...@dpdk.org] On Behalf Of Kirill > Rybalchenko > Sent: Wednesday, January 31, 2018 11:15 AM > To: dev@dpdk.org > Cc: sta...@dpdk.org; Rybalchenko, Kirill ; > Chilikin, Andrey ; Xing, Beilei > ; Wu, Jingjing > Subject: [dpdk-stable] [PATCH] doc: fix documentation for testpmd ddp add > del function > > The documentation and help string now describe the meaning of the arguments > for the ddp add/del functions more clearly. > > Fixes: 856ceb331b0a ("app/testpmd: enable DDP remove profile feature") Acked-by: John McNamara
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On Thursday 01 February 2018 02:48 PM, Andrew Rybchenko wrote: > On 02/01/2018 12:09 PM, santosh wrote: >> On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: >>> On 02/01/2018 08:05 AM, santosh wrote: On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: > On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: >> There is not specified dependency between rte_mempool_populate_default() >> and rte_mempool_populate_iova(). So, the second should not rely on the >> fact that the first adds capability flags to the mempool flags. >> >> Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") >> Cc: sta...@dpdk.org >> >> Signed-off-by: Andrew Rybchenko > Looks good to me. I agree it's strange that the mp->flags are > updated with capabilities only in rte_mempool_populate_default(). > I see that this behavior is removed later in the patchset since the > get_capa() is removed! > > However maybe this single patch could go in 18.02. > +Santosh +Jerin since it's mostly about Octeon. rte_mempool_xmem_size should return correct size if MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag is set in 'mp->flags'. Thats why _ops_get_capabilities() called in _populate_default() but not at _populate_iova(). I think, this 'alone' patch may break octeontx mempool. >>> The patch does not touch rte_mempool_populate_default(). >>> _ops_get_capabilities() is still called there before >>> rte_mempool_xmem_size(). The theoretical problem which >>> the patch tries to fix is the case when >>> rte_mempool_populate_default() is not called at all. I.e. application >>> calls _ops_get_capabilities() to get flags, then, together with >>> mp->flags, calls rte_mempool_xmem_size() directly, allocates >>> calculated amount of memory and calls _populate_iova(). >>> >> In that case, Application does like below: >> >> /* Get mempool capabilities */ >> mp_flags = 0; >> ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); >> if ((ret < 0) && (ret != -ENOTSUP)) >> return ret; >> >> /* update mempool capabilities */ >> mp->flags |= mp_flags; > > Above line is not mandatory. "mp->flags | mp_flags" could be simply > passed to rte_mempool_xmem_size() below. > That depends and again upto application requirement, if app further down wants to refer mp->flags for _align/_contig then better update to mp->flags. But that wasn't the point of discussion, I'm trying to understand that w/o this patch, whats could be the application level problem?
[dpdk-dev] [RFC v3, 3/3] security: add support to set session private data
The application may want to store private data along with the rte_security that is transparent to the rte_security layer. For e.g., If an eventdev based application is submitting a rte_security_session operation and wants to indicate event information required to construct a new event that will be enqueued to eventdev after completion of the rte_security operation. This patch provides a mechanism for the application to associate this information with the rte_security session. The application can set the private data using rte_security_session_set_private_data() and retrieve it using rte_security_session_get_private_data() Signed-off-by: Abhinandan Gujjar Signed-off-by: Nikhil Rao --- lib/librte_security/rte_security.h | 29 + 1 file changed, 29 insertions(+) diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h index c75c121..baf168c 100644 --- a/lib/librte_security/rte_security.h +++ b/lib/librte_security/rte_security.h @@ -560,6 +560,35 @@ struct rte_security_capability_idx { rte_security_capability_get(struct rte_security_ctx *instance, struct rte_security_capability_idx *idx); +/** + * Set private data for a security session. + * + * @param sesssecurity session + * @param datapointer to the private data. + * @param sizesize of the private data. + * + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +rte_security_session_set_private_data(struct rte_security_session *sess, + void *data, + uint16_t size); + +/** + * Get private data of a security session. + * + * @param sesssecurity session + * + * @return + * - On success return pointer to private data. + * - On failure returns NULL. + */ +void * +rte_security_session_get_private_data( + const struct rte_security_session *session); + #ifdef __cplusplus } #endif -- 1.9.1
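As an illustration of how an eventdev application might use the proposed API (the app_sess_priv structure and helper names below are hypothetical; only the set/get calls come from this patch):

#include <rte_eventdev.h>
#include <rte_security.h>

/* application context stored with the session, e.g. an event template */
struct app_sess_priv {
        struct rte_event ev;
};

static int
app_attach_session_priv(struct rte_security_session *sess,
                        struct app_sess_priv *priv)
{
        /* associate the application context with the security session */
        return rte_security_session_set_private_data(sess, priv, sizeof(*priv));
}

static struct app_sess_priv *
app_session_priv(const struct rte_security_session *sess)
{
        /* retrieve it later, e.g. when building the completion event */
        return rte_security_session_get_private_data(sess);
}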
[dpdk-dev] [RFC v3, 1/3] cryptodev: set private data for session-less mode
The application may want to store private data along with the rte_crypto_op that is transparent to the rte_cryptodev layer. For e.g., If an eventdev based application is submitting a crypto session-less operation and wants to indicate event information required to construct a new event that will be enqueued to eventdev after completion of the crypto operation. This patch provides a mechanism for the application to associate this information with the rte_crypto_op in session-less mode. Signed-off-by: Abhinandan Gujjar Signed-off-by: Nikhil Rao --- Notes: V3: 1. Added separate patch for session-less private data 2. Added more information on offset V2: 1. Removed enum rte_crypto_op_private_data_type 2. Corrected formatting lib/librte_cryptodev/rte_crypto.h | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/lib/librte_cryptodev/rte_crypto.h b/lib/librte_cryptodev/rte_crypto.h index 95cf861..2540426 100644 --- a/lib/librte_cryptodev/rte_crypto.h +++ b/lib/librte_cryptodev/rte_crypto.h @@ -84,8 +84,14 @@ struct rte_crypto_op { */ uint8_t sess_type; /**< operation session type */ + uint16_t private_data_offset; + /**< Offset to indicate start of private data (if any). The offset +* is counted from the start of the rte_crypto_op including IV. +* The private data may be used by the application to store +* information which should remain untouched in the library/driver +*/ - uint8_t reserved[5]; + uint8_t reserved[3]; /**< Reserved bytes to fill 64 bits for future additions */ struct rte_mempool *mempool; /**< crypto operation mempool which operation is allocated from */ -- 1.9.1
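A hypothetical illustration of how an application could use the new field in session-less mode. The app_priv layout, the helper names, and the assumption that the op mempool was created with enough private data room (e.g. via the priv_size argument of rte_crypto_op_pool_create()) are all illustrative; only private_data_offset and its "offset from the start of the rte_crypto_op" semantics come from this patch:

#include <rte_crypto.h>
#include <rte_cryptodev.h>

/* application data kept inside the crypto op mempool element */
struct app_priv {
        uint64_t event_meta; /* e.g. information to build the completion event */
};

static void
app_set_op_priv(struct rte_crypto_op *op, uint16_t iv_len, uint64_t event_meta)
{
        /* lay the private data out right after the op, sym op and IV */
        uint16_t off = sizeof(struct rte_crypto_op) +
                       sizeof(struct rte_crypto_sym_op) + iv_len;
        struct app_priv *priv = (struct app_priv *)((uint8_t *)op + off);

        priv->event_meta = event_meta;
        op->private_data_offset = off;
}

static uint64_t
app_get_op_priv(const struct rte_crypto_op *op)
{
        const struct app_priv *priv = (const struct app_priv *)
                        ((const uint8_t *)op + op->private_data_offset);

        return priv->event_meta;
}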
[dpdk-dev] [RFC v3, 2/3] cryptodev: add support to set session private data
The application may want to store private data along with the rte_cryptodev that is transparent to the rte_cryptodev layer. For e.g., If an eventdev based application is submitting a rte_cryptodev_sym_session operation and wants to indicate event information required to construct a new event that will be enqueued to eventdev after completion of the rte_cryptodev_sym_session operation. This patch provides a mechanism for the application to associate this information with the rte_cryptodev_sym_session session. The application can set the private data using rte_cryptodev_sym_session_set_private_data() and retrieve it using rte_cryptodev_sym_session_get_private_data(). Signed-off-by: Abhinandan Gujjar Signed-off-by: Nikhil Rao --- lib/librte_cryptodev/rte_cryptodev.h | 32 1 file changed, 32 insertions(+) diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h index c8fa689..2f4affe 100644 --- a/lib/librte_cryptodev/rte_cryptodev.h +++ b/lib/librte_cryptodev/rte_cryptodev.h @@ -1037,6 +1037,38 @@ struct rte_cryptodev_sym_session * */ const char *rte_cryptodev_driver_name_get(uint8_t driver_id); +/** + * Set private data for a session. + * + * @param sessSession pointer allocated by + * *rte_cryptodev_sym_session_create*. + * @param dataPointer to the private data. + * @param sizeSize of the private data. + * + * @return + * - On success, zero. + * - On failure, a negative value. + */ +int +rte_cryptodev_sym_session_set_private_data( + struct rte_cryptodev_sym_session *sess, + void *data, + uint16_t size); + +/** + * Get private data of a session. + * + * @param sessSession pointer allocated by + * *rte_cryptodev_sym_session_create*. + * + * @return + * - On success return pointer to private data. + * - On failure returns NULL. + */ +void * +rte_cryptodev_sym_session_get_private_data( + const struct rte_cryptodev_sym_session *sess); + #ifdef __cplusplus } #endif -- 1.9.1
Re: [dpdk-dev] [PATCH 12/14] vhost: dequeue for packed queues
On 01/29/2018 03:11 PM, Jens Freimann wrote: Implement code to dequeue and process descriptors from the vring if VIRTIO_F_PACKED is enabled. Check if descriptor was made available by driver by looking at VIRTIO_F_DESC_AVAIL flag in descriptor. If so dequeue and set the used flag VIRTIO_F_DESC_USED to the current value of the used wrap counter. Used ring wrap counter needs to be toggled when last descriptor is written out. This allows the host/guest to detect new descriptors even after the ring has wrapped. Signed-off-by: Jens Freimann --- lib/librte_vhost/vhost.c | 1 + lib/librte_vhost/vhost.h | 1 + lib/librte_vhost/virtio_net.c | 194 ++ 3 files changed, 196 insertions(+) diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 78913912c..e5f58d9c8 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -191,6 +191,7 @@ init_vring_queue(struct virtio_net *dev, uint32_t vring_idx) vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD; + vq->used_wrap_counter = 1; vhost_user_iotlb_init(dev, vring_idx); /* Backends are set to -1 indicating an inactive device. */ diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 8554d51d8..a3d4214b6 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -106,6 +106,7 @@ struct vhost_virtqueue { struct batch_copy_elem *batch_copy_elems; uint16_tbatch_copy_nb_elems; + uint32_tused_wrap_counter; rte_rwlock_t iotlb_lock; rte_rwlock_tiotlb_pending_lock; diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index edfab3ba6..5d4cfe8cc 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -19,6 +19,7 @@ #include "iotlb.h" #include "vhost.h" +#include "virtio-1.1.h" #define MAX_PKT_BURST 32 @@ -,6 +1112,199 @@ restore_mbuf(struct rte_mbuf *m) } } +static inline uint16_t +dequeue_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, +struct rte_mempool *mbuf_pool, struct rte_mbuf *m, +struct vring_desc_1_1 *descs) +{ + struct vring_desc_1_1 *desc; + uint64_t desc_addr; + uint32_t desc_avail, desc_offset; + uint32_t mbuf_avail, mbuf_offset; + uint32_t cpy_len; + struct rte_mbuf *cur = m, *prev = m; + struct virtio_net_hdr *hdr = NULL; + uint16_t head_idx = vq->last_used_idx & (vq->size - 1); + int wrap_counter = vq->used_wrap_counter; + + desc = &descs[vq->last_used_idx & (vq->size - 1)]; + if (unlikely((desc->len < dev->vhost_hlen)) || + (desc->flags & VRING_DESC_F_INDIRECT)) + rte_panic("INDIRECT not supported yet"); Using rte_panic() may not be a good idea here, because a malicious guest could make the vswitch to crash easily. + + desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr); You should use vhost_iova_to_vva() here and everywhere else, otherwise you break IOMMU support. + if (unlikely(!desc_addr)) + return -1; + + if (virtio_net_with_host_offload(dev)) { + hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr); + rte_prefetch0(hdr); + } + + /* +* A virtio driver normally uses at least 2 desc buffers +* for Tx: the first for storing the header, and others +* for storing the data. +*/ + if (likely((desc->len == dev->vhost_hlen) && + (desc->flags & VRING_DESC_F_NEXT) != 0)) { + if ((++vq->last_used_idx & (vq->size - 1)) == 0) + toggle_wrap_counter(vq); + + desc = &descs[vq->last_used_idx & (vq->size - 1)]; + + if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) + rte_panic("INDIRECT not supported yet"); Ditto. 
+ + desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr); + if (unlikely(!desc_addr)) + return -1; + + desc_offset = 0; + desc_avail = desc->len; + } else { + desc_avail = desc->len - dev->vhost_hlen; + desc_offset = dev->vhost_hlen; + } + + rte_prefetch0((void *)(uintptr_t)(desc_addr + desc_offset)); + + PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset), desc_avail, 0); + + mbuf_offset = 0; + mbuf_avail = m->buf_len - RTE_PKTMBUF_HEADROOM; + while (1) { + uint64_t hpa; + + cpy_len = RTE_MIN(desc_avail, mbuf_avail); + + /* +* A desc buf might across two host physical pages that are +* not continuous. In such case (gpa_to_hpa returns 0), data +* will be copied even though zero copy is enabled. +*/ + i
[dpdk-dev] [PATCH 2/2] app/testpmd: add command to resume a TM node
Traffic manager provides an API for resuming an arbitrary node in a hierarchy. This commit adds support for calling this API from testpmd. Signed-off-by: Tomasz Duszynski --- app/test-pmd/cmdline.c | 4 ++ app/test-pmd/cmdline_tm.c | 70 + app/test-pmd/cmdline_tm.h | 1 + doc/guides/testpmd_app_ug/testpmd_funcs.rst | 5 +++ 4 files changed, 80 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 6bbd606..f9827f6 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -800,6 +800,9 @@ static void cmd_help_long_parsed(void *parsed_result, "suspend port tm node (port_id) (node_id)" " Suspend tm node.\n\n" + "resume port tm node (port_id) (node_id)" + " Resume tm node.\n\n" + "port tm hierarchy commit (port_id) (clean_on_fail)\n" " Commit tm hierarchy.\n\n" @@ -16251,6 +16254,7 @@ cmdline_parse_ctx_t main_ctx[] = { (cmdline_parse_inst_t *)&cmd_del_port_tm_node, (cmdline_parse_inst_t *)&cmd_set_port_tm_node_parent, (cmdline_parse_inst_t *)&cmd_suspend_port_tm_node, + (cmdline_parse_inst_t *)&cmd_resume_port_tm_node, (cmdline_parse_inst_t *)&cmd_port_tm_hierarchy_commit, NULL, }; diff --git a/app/test-pmd/cmdline_tm.c b/app/test-pmd/cmdline_tm.c index c9a18dd..807e724 100644 --- a/app/test-pmd/cmdline_tm.c +++ b/app/test-pmd/cmdline_tm.c @@ -2036,6 +2036,76 @@ cmdline_parse_inst_t cmd_suspend_port_tm_node = { }, }; +/* *** Resume Port TM Node *** */ +struct cmd_resume_port_tm_node_result { + cmdline_fixed_string_t resume; + cmdline_fixed_string_t port; + cmdline_fixed_string_t tm; + cmdline_fixed_string_t node; + uint16_t port_id; + uint32_t node_id; +}; + +cmdline_parse_token_string_t cmd_resume_port_tm_node_resume = + TOKEN_STRING_INITIALIZER( + struct cmd_resume_port_tm_node_result, resume, "resume"); +cmdline_parse_token_string_t cmd_resume_port_tm_node_port = + TOKEN_STRING_INITIALIZER( + struct cmd_resume_port_tm_node_result, port, "port"); +cmdline_parse_token_string_t cmd_resume_port_tm_node_tm = + TOKEN_STRING_INITIALIZER( + struct cmd_resume_port_tm_node_result, tm, "tm"); +cmdline_parse_token_string_t cmd_resume_port_tm_node_node = + TOKEN_STRING_INITIALIZER( + struct cmd_resume_port_tm_node_result, node, "node"); +cmdline_parse_token_num_t cmd_resume_port_tm_node_port_id = + TOKEN_NUM_INITIALIZER( + struct cmd_resume_port_tm_node_result, port_id, UINT16); +cmdline_parse_token_num_t cmd_resume_port_tm_node_node_id = + TOKEN_NUM_INITIALIZER( + struct cmd_resume_port_tm_node_result, node_id, UINT32); + +static void cmd_resume_port_tm_node_parsed(void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + struct cmd_resume_port_tm_node_result *res = parsed_result; + struct rte_tm_error error; + uint32_t node_id = res->node_id; + portid_t port_id = res->port_id; + int ret; + + if (port_id_is_invalid(port_id, ENABLED_WARN)) + return; + + /* Port status */ + if (!port_is_started(port_id)) { + printf(" Port %u not started (error)\n", port_id); + return; + } + + ret = rte_tm_node_resume(port_id, node_id, &error); + if (ret != 0) { + print_err_msg(&error); + return; + } +} + +cmdline_parse_inst_t cmd_resume_port_tm_node = { + .f = cmd_resume_port_tm_node_parsed, + .data = NULL, + .help_str = "Resume port tm node", + .tokens = { + (void *)&cmd_resume_port_tm_node_resume, + (void *)&cmd_resume_port_tm_node_port, + (void *)&cmd_resume_port_tm_node_tm, + (void *)&cmd_resume_port_tm_node_node, + (void *)&cmd_resume_port_tm_node_port_id, + (void *)&cmd_resume_port_tm_node_node_id, + NULL, + }, +}; + /* *** Port TM 
Hierarchy Commit *** */ struct cmd_port_tm_hierarchy_commit_result { cmdline_fixed_string_t port; diff --git a/app/test-pmd/cmdline_tm.h b/app/test-pmd/cmdline_tm.h index c4d5e8c..b3a14ad 100644 --- a/app/test-pmd/cmdline_tm.h +++ b/app/test-pmd/cmdline_tm.h @@ -23,6 +23,7 @@ extern cmdline_parse_inst_t cmd_add_port_tm_leaf_node; extern cmdline_parse_inst_t cmd_del_port_tm_node; extern cmdline_parse_inst_t cmd_set_port_tm_node_parent; extern cmdline_parse_inst_t cmd_suspend_port_tm_node; +extern cmdline_parse_inst_t cmd_resume_port_tm_node; extern cmdline_parse_inst_t cmd_port_tm_hierarchy_commit; #endif /* _CMDLINE_TM_H_ */ diff --git a/doc/guides/testpmd_app_ug/t
[dpdk-dev] [PATCH 0/2] add suspend/resume TM node commands to testpmd
Add new testpmd commands for invoking traffic manager suspend/resume API. Tomasz Duszynski (2): app/testpmd: add command to suspend a TM node app/testpmd: add command to resume a TM node app/test-pmd/cmdline.c | 8 ++ app/test-pmd/cmdline_tm.c | 140 app/test-pmd/cmdline_tm.h | 2 + doc/guides/testpmd_app_ug/testpmd_funcs.rst | 10 ++ 4 files changed, 160 insertions(+) -- 2.7.4
[dpdk-dev] [PATCH 1/2] app/testpmd: add command to suspend a TM node
Traffic manager provides an API for suspending an arbitrary node in a hierarchy. This commit adds support for calling this API from testpmd. Signed-off-by: Tomasz Duszynski --- app/test-pmd/cmdline.c | 4 ++ app/test-pmd/cmdline_tm.c | 70 + app/test-pmd/cmdline_tm.h | 1 + doc/guides/testpmd_app_ug/testpmd_funcs.rst | 5 +++ 4 files changed, 80 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 9f12c0f..6bbd606 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -797,6 +797,9 @@ static void cmd_help_long_parsed(void *parsed_result, " (priority) (weight)\n" " Set port tm node parent.\n\n" + "suspend port tm node (port_id) (node_id)" + " Suspend tm node.\n\n" + "port tm hierarchy commit (port_id) (clean_on_fail)\n" " Commit tm hierarchy.\n\n" @@ -16247,6 +16250,7 @@ cmdline_parse_ctx_t main_ctx[] = { (cmdline_parse_inst_t *)&cmd_add_port_tm_leaf_node, (cmdline_parse_inst_t *)&cmd_del_port_tm_node, (cmdline_parse_inst_t *)&cmd_set_port_tm_node_parent, + (cmdline_parse_inst_t *)&cmd_suspend_port_tm_node, (cmdline_parse_inst_t *)&cmd_port_tm_hierarchy_commit, NULL, }; diff --git a/app/test-pmd/cmdline_tm.c b/app/test-pmd/cmdline_tm.c index 9859c3d..c9a18dd 100644 --- a/app/test-pmd/cmdline_tm.c +++ b/app/test-pmd/cmdline_tm.c @@ -1966,6 +1966,76 @@ cmdline_parse_inst_t cmd_set_port_tm_node_parent = { }, }; +/* *** Suspend Port TM Node *** */ +struct cmd_suspend_port_tm_node_result { + cmdline_fixed_string_t suspend; + cmdline_fixed_string_t port; + cmdline_fixed_string_t tm; + cmdline_fixed_string_t node; + uint16_t port_id; + uint32_t node_id; +}; + +cmdline_parse_token_string_t cmd_suspend_port_tm_node_suspend = + TOKEN_STRING_INITIALIZER( + struct cmd_suspend_port_tm_node_result, suspend, "suspend"); +cmdline_parse_token_string_t cmd_suspend_port_tm_node_port = + TOKEN_STRING_INITIALIZER( + struct cmd_suspend_port_tm_node_result, port, "port"); +cmdline_parse_token_string_t cmd_suspend_port_tm_node_tm = + TOKEN_STRING_INITIALIZER( + struct cmd_suspend_port_tm_node_result, tm, "tm"); +cmdline_parse_token_string_t cmd_suspend_port_tm_node_node = + TOKEN_STRING_INITIALIZER( + struct cmd_suspend_port_tm_node_result, node, "node"); +cmdline_parse_token_num_t cmd_suspend_port_tm_node_port_id = + TOKEN_NUM_INITIALIZER( + struct cmd_suspend_port_tm_node_result, port_id, UINT16); +cmdline_parse_token_num_t cmd_suspend_port_tm_node_node_id = + TOKEN_NUM_INITIALIZER( + struct cmd_suspend_port_tm_node_result, node_id, UINT32); + +static void cmd_suspend_port_tm_node_parsed(void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + struct cmd_suspend_port_tm_node_result *res = parsed_result; + struct rte_tm_error error; + uint32_t node_id = res->node_id; + portid_t port_id = res->port_id; + int ret; + + if (port_id_is_invalid(port_id, ENABLED_WARN)) + return; + + /* Port status */ + if (!port_is_started(port_id)) { + printf(" Port %u not started (error)\n", port_id); + return; + } + + ret = rte_tm_node_suspend(port_id, node_id, &error); + if (ret != 0) { + print_err_msg(&error); + return; + } +} + +cmdline_parse_inst_t cmd_suspend_port_tm_node = { + .f = cmd_suspend_port_tm_node_parsed, + .data = NULL, + .help_str = "Suspend port tm node", + .tokens = { + (void *)&cmd_suspend_port_tm_node_suspend, + (void *)&cmd_suspend_port_tm_node_port, + (void *)&cmd_suspend_port_tm_node_tm, + (void *)&cmd_suspend_port_tm_node_node, + (void *)&cmd_suspend_port_tm_node_port_id, + (void *)&cmd_suspend_port_tm_node_node_id, + 
NULL, + }, +}; + /* *** Port TM Hierarchy Commit *** */ struct cmd_port_tm_hierarchy_commit_result { cmdline_fixed_string_t port; diff --git a/app/test-pmd/cmdline_tm.h b/app/test-pmd/cmdline_tm.h index ba30360..c4d5e8c 100644 --- a/app/test-pmd/cmdline_tm.h +++ b/app/test-pmd/cmdline_tm.h @@ -22,6 +22,7 @@ extern cmdline_parse_inst_t cmd_add_port_tm_nonleaf_node; extern cmdline_parse_inst_t cmd_add_port_tm_leaf_node; extern cmdline_parse_inst_t cmd_del_port_tm_node; extern cmdline_parse_inst_t cmd_set_port_tm_node_parent; +extern cmdline_parse_inst_t cmd_suspend_port_tm_node; extern cmdline_parse_inst_t cmd_port_tm_hierarchy_commit; #endif /* _CMDLINE_TM_H_ */ diff
Re: [dpdk-dev] [PATCH] pci/uio: enable prefetchable resources mapping
On Thu, Feb 01, 2018 at 09:18:22AM +0800, Changpeng Liu wrote: > For PCI prefetchable resources, Linux will create a > write combined file as well, the library will try > to map resourceX_wc file first, if the file does > not exist, then it will map resourceX as usual. > > Signed-off-by: Changpeng Liu > --- > drivers/bus/pci/linux/pci_uio.c | 19 ++- > 1 file changed, 14 insertions(+), 5 deletions(-) > Hi, Given the lack of ordering guarantees with write-combined memory, I would have thought that this is very risky to do without a complete set of changes inside the PMDs to add in the necessary memory barriers to ensure ordering of operations to the BARs. Therefore, instead of mapping one file or another, I think the change should be made to map *both* in DPDK if available. Then each driver can choose whether to write a given device register using uncacheable memory type or write-combining memory type + any appropriate barriers. For example, with many NICs the initialization of the device involves many register writes in a pretty defined order, so wc operations are probably not suitable as performance is not a concern. However, for data path operations, a driver may choose to use wc memory for the occasional device writes there, for performance reasons. Regards, /Bruce
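To make the suggestion concrete, here is a minimal sketch of what a driver could do if both mappings were exposed. The struct and its bar_uc/bar_wc fields are hypothetical (no such rte_pci field exists today), and the exact barrier placement would remain a per-driver decision:

#include <rte_atomic.h>
#include <rte_io.h>

struct hypothetical_hw {
        volatile void *bar_uc; /* uncacheable mapping of resourceX        */
        volatile void *bar_wc; /* write-combining mapping of resourceX_wc */
};

static inline void
hw_write_ordered(struct hypothetical_hw *hw, uint32_t reg, uint32_t val)
{
        /* init-path registers: ordering matters, performance does not */
        rte_write32(val, (volatile uint8_t *)hw->bar_uc + reg);
}

static inline void
hw_ring_doorbell(struct hypothetical_hw *hw, uint32_t reg, uint32_t val)
{
        /* data-path doorbell: use the WC mapping, then an explicit barrier
         * to make sure the combined store is pushed out to the device */
        rte_write32_relaxed(val, (volatile uint8_t *)hw->bar_wc + reg);
        rte_wmb();
}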
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On 02/01/2018 12:30 PM, santosh wrote: On Thursday 01 February 2018 02:48 PM, Andrew Rybchenko wrote: On 02/01/2018 12:09 PM, santosh wrote: On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: On 02/01/2018 08:05 AM, santosh wrote: On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: There is not specified dependency between rte_mempool_populate_default() and rte_mempool_populate_iova(). So, the second should not rely on the fact that the first adds capability flags to the mempool flags. Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") Cc: sta...@dpdk.org Signed-off-by: Andrew Rybchenko Looks good to me. I agree it's strange that the mp->flags are updated with capabilities only in rte_mempool_populate_default(). I see that this behavior is removed later in the patchset since the get_capa() is removed! However maybe this single patch could go in 18.02. +Santosh +Jerin since it's mostly about Octeon. rte_mempool_xmem_size should return correct size if MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag is set in 'mp->flags'. Thats why _ops_get_capabilities() called in _populate_default() but not at _populate_iova(). I think, this 'alone' patch may break octeontx mempool. The patch does not touch rte_mempool_populate_default(). _ops_get_capabilities() is still called there before rte_mempool_xmem_size(). The theoretical problem which the patch tries to fix is the case when rte_mempool_populate_default() is not called at all. I.e. application calls _ops_get_capabilities() to get flags, then, together with mp->flags, calls rte_mempool_xmem_size() directly, allocates calculated amount of memory and calls _populate_iova(). In that case, Application does like below: /* Get mempool capabilities */ mp_flags = 0; ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); if ((ret < 0) && (ret != -ENOTSUP)) return ret; /* update mempool capabilities */ mp->flags |= mp_flags; Above line is not mandatory. "mp->flags | mp_flags" could be simply passed to rte_mempool_xmem_size() below. That depends and again upto application requirement, if app further down wants to refer mp->flags for _align/_contig then better update to mp->flags. But that wasn't the point of discussion, I'm trying to understand that w/o this patch, whats could be the application level problem? The problem that it is fragile. If application does not use rte_mempool_populate_default() it has to care about addition of mempool capability flags into mempool flags. If it is not done, rte_mempool_populate_iova/virt/iova_tab() functions will work incorrectly since F_CAPA_PHYS_CONTIG and F_CAPA_BLK_ALIGNED_OBJECTS are missing. The idea of the patch is to make it a bit more robust. I have no idea how it can break something. If capability flags are already there - no problem. If no, just make sure that we locally have them.
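For clarity, the "manual" path being debated looks roughly like the sketch below (the function and memzone names are made up; the calls themselves are the existing 17.11/18.02 API). Without the patch, step 3 only behaves correctly if the capability flags obtained in step 1 have been merged into mp->flags by the caller; with the patch, rte_mempool_populate_iova() fetches them locally, so the manual path no longer depends on rte_mempool_populate_default() having run first.

#include <errno.h>
#include <rte_mempool.h>
#include <rte_memzone.h>

static int
populate_without_default(struct rte_mempool *mp, uint32_t pg_shift)
{
        size_t total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size;
        const struct rte_memzone *mz;
        unsigned int mp_flags = 0;
        size_t size;
        int ret;

        /* 1. ask the driver for its capability flags */
        ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
        if (ret < 0 && ret != -ENOTSUP)
                return ret;

        /* 2. size the chunk, passing the capability flags explicitly */
        size = rte_mempool_xmem_size(mp->size, total_elt_sz, pg_shift,
                                     mp->flags | mp_flags);

        /* 3. reserve one chunk and populate it (no free callback here) */
        mz = rte_memzone_reserve_aligned("sketch_mz", size, SOCKET_ID_ANY,
                                         0, RTE_CACHE_LINE_SIZE);
        if (mz == NULL)
                return -ENOMEM;

        return rte_mempool_populate_iova(mp, mz->addr, mz->iova, mz->len,
                                         NULL, NULL);
}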
[dpdk-dev] [RFC v2 03/17] mempool/octeontx: add callback to calculate memory size
Hi Andrew, On Thursday 01 February 2018 11:48 AM, Jacob, Jerin wrote: > The driver requires one and only one physically contiguous > memory chunk for all objects. > > Signed-off-by: Andrew Rybchenko > --- > drivers/mempool/octeontx/rte_mempool_octeontx.c | 25 > + > 1 file changed, 25 insertions(+) > > diff --git a/drivers/mempool/octeontx/rte_mempool_octeontx.c > b/drivers/mempool/octeontx/rte_mempool_octeontx.c > index d143d05..4ec5efe 100644 > --- a/drivers/mempool/octeontx/rte_mempool_octeontx.c > +++ b/drivers/mempool/octeontx/rte_mempool_octeontx.c > @@ -136,6 +136,30 @@ octeontx_fpavf_get_capabilities(const struct rte_mempool > *mp, > return 0; > } > > +static ssize_t > +octeontx_fpavf_calc_mem_size(const struct rte_mempool *mp, > +uint32_t obj_num, uint32_t pg_shift, > +size_t *min_chunk_size, size_t *align) > +{ > + ssize_t mem_size; > + > + /* > +* Simply need space for one more object to be able to > +* fullfil alignment requirements. > +*/ > + mem_size = rte_mempool_calc_mem_size_def(mp, obj_num + 1, pg_shift, > + I think, you don't need that (obj_num + 1) as because rte_xmem_calc_int() will be checking flags for _ALIGNED + _CAPA_PHYS_CONFIG i.e.. mask = MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS | MEMPOOL_F_CAPA_PHYS_CONTIG; if ((flags & mask) == mask) /* alignment need one additional object */ elt_num += 1; > min_chunk_size, align); > + if (mem_size >= 0) { > + /* > +* The whole memory area containing the objects must be > +* physically contiguous. > +*/ > + *min_chunk_size = mem_size; > + } > + > + return mem_size; > +} > + > static int > octeontx_fpavf_register_memory_area(const struct rte_mempool *mp, > char *vaddr, rte_iova_t paddr, size_t > len) > @@ -159,6 +183,7 @@ static struct rte_mempool_ops octeontx_fpavf_ops = { > .get_count = octeontx_fpavf_get_count, > .get_capabilities = octeontx_fpavf_get_capabilities, > .register_memory_area = octeontx_fpavf_register_memory_area, > + .calc_mem_size = octeontx_fpavf_calc_mem_size, > }; > > MEMPOOL_REGISTER_OPS(octeontx_fpavf_ops); > -- > 2.7.4 >
[dpdk-dev] [PATCH v3 1/2] test/memzone: add test for memzone count in eal mem config
Ensure that memzone count in eal mem config is incremented and decremented whenever memzones are allocated and freed. Signed-off-by: Anatoly Burakov --- test/test/test_memzone.c | 20 1 file changed, 20 insertions(+) diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c index f6c9b56..00d340f 100644 --- a/test/test/test_memzone.c +++ b/test/test/test_memzone.c @@ -841,6 +841,9 @@ test_memzone_basic(void) const struct rte_memzone *memzone3; const struct rte_memzone *memzone4; const struct rte_memzone *mz; + int memzone_cnt_after, memzone_cnt_expected; + int memzone_cnt_before = + rte_eal_get_configuration()->mem_config->memzone_cnt; memzone1 = rte_memzone_reserve("testzone1", 100, SOCKET_ID_ANY, 0); @@ -858,6 +861,18 @@ test_memzone_basic(void) if (memzone1 == NULL || memzone2 == NULL || memzone4 == NULL) return -1; + /* check how many memzones we are expecting */ + memzone_cnt_expected = memzone_cnt_before + + (memzone1 != NULL) + (memzone2 != NULL) + + (memzone3 != NULL) + (memzone4 != NULL); + + memzone_cnt_after = + rte_eal_get_configuration()->mem_config->memzone_cnt; + + if (memzone_cnt_after != memzone_cnt_expected) + return -1; + + rte_memzone_dump(stdout); /* check cache-line alignments */ @@ -930,6 +945,11 @@ test_memzone_basic(void) return -1; } + memzone_cnt_after = + rte_eal_get_configuration()->mem_config->memzone_cnt; + if (memzone_cnt_after != memzone_cnt_before) + return -1; + return 0; } -- 2.7.4
[dpdk-dev] [PATCH v3 2/2] test/memzone: handle previously allocated memzones
Currently, memzone autotest expects there to be no memzones present by the time the test is run. Some hardware drivers will allocate memzones for internal use during initialization, resulting in tests failing due to unexpected memzones being allocated before the test was run. Fix this by making sure all memzones allocated by this test have a common prefix, and making callback increment a counter on encountering memzones with this prefix. Also, separately increment another counter that will count total number of memzones left after test, and compares it to previously stored number of memzones, to ensure that we didn't accidentally allocated/freed any memzones we weren't supposed to. This also doubles as a test for correct operation of memzone_walk(). Fixes: 71330483a193 ("test/memzone: fix memory leak") Cc: radoslaw.bierna...@linaro.org Cc: sta...@dpdk.org Signed-off-by: Phil Yang Signed-off-by: Anatoly Burakov --- Notes: v3: suggested-by was supposed to be a signoff v2: incorporated Phil Yang's patch to better ensure no memzones were left behind by the test test/test/test_memzone.c | 225 +-- 1 file changed, 140 insertions(+), 85 deletions(-) diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c index 00d340f..8ece1ac 100644 --- a/test/test/test_memzone.c +++ b/test/test/test_memzone.c @@ -4,6 +4,7 @@ #include #include +#include #include #include @@ -47,6 +48,8 @@ * - Check flags for specific huge page size reservation */ +#define TEST_MEMZONE_NAME(suffix) "MZ_TEST_" suffix + /* Test if memory overlaps: return 1 if true, or 0 if false. */ static int is_memory_overlap(rte_iova_t ptr1, size_t len1, rte_iova_t ptr2, size_t len2) @@ -63,14 +66,14 @@ test_memzone_invalid_alignment(void) { const struct rte_memzone * mz; - mz = rte_memzone_lookup("invalid_alignment"); + mz = rte_memzone_lookup(TEST_MEMZONE_NAME("invalid_alignment")); if (mz != NULL) { printf("Zone with invalid alignment has been reserved\n"); return -1; } - mz = rte_memzone_reserve_aligned("invalid_alignment", 100, - SOCKET_ID_ANY, 0, 100); + mz = rte_memzone_reserve_aligned(TEST_MEMZONE_NAME("invalid_alignment"), +100, SOCKET_ID_ANY, 0, 100); if (mz != NULL) { printf("Zone with invalid alignment has been reserved\n"); return -1; @@ -83,14 +86,16 @@ test_memzone_reserving_zone_size_bigger_than_the_maximum(void) { const struct rte_memzone * mz; - mz = rte_memzone_lookup("zone_size_bigger_than_the_maximum"); + mz = rte_memzone_lookup( + TEST_MEMZONE_NAME("zone_size_bigger_than_the_maximum")); if (mz != NULL) { printf("zone_size_bigger_than_the_maximum has been reserved\n"); return -1; } - mz = rte_memzone_reserve("zone_size_bigger_than_the_maximum", (size_t)-1, - SOCKET_ID_ANY, 0); + mz = rte_memzone_reserve( + TEST_MEMZONE_NAME("zone_size_bigger_than_the_maximum"), + (size_t)-1, SOCKET_ID_ANY, 0); if (mz != NULL) { printf("It is impossible to reserve such big a memzone\n"); return -1; @@ -137,8 +142,8 @@ test_memzone_reserve_flags(void) * available page size (i.e 1GB ) when 2MB pages are unavailable. 
*/ if (hugepage_2MB_avail) { - mz = rte_memzone_reserve("flag_zone_2M", size, SOCKET_ID_ANY, - RTE_MEMZONE_2MB); + mz = rte_memzone_reserve(TEST_MEMZONE_NAME("flag_zone_2M"), + size, SOCKET_ID_ANY, RTE_MEMZONE_2MB); if (mz == NULL) { printf("MEMZONE FLAG 2MB\n"); return -1; @@ -152,7 +157,8 @@ test_memzone_reserve_flags(void) return -1; } - mz = rte_memzone_reserve("flag_zone_2M_HINT", size, SOCKET_ID_ANY, + mz = rte_memzone_reserve(TEST_MEMZONE_NAME("flag_zone_2M_HINT"), + size, SOCKET_ID_ANY, RTE_MEMZONE_2MB|RTE_MEMZONE_SIZE_HINT_ONLY); if (mz == NULL) { printf("MEMZONE FLAG 2MB\n"); @@ -171,7 +177,9 @@ test_memzone_reserve_flags(void) * HINT flag is indicated */ if (!hugepage_1GB_avail) { - mz = rte_memzone_reserve("flag_zone_1G_HINT", size, SOCKET_ID_ANY, + mz = rte_memzone_reserve( + TEST_MEMZONE_NAME("flag_zone_1G_HINT"), + size, SOCKET_ID_ANY, RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY);
Re: [dpdk-dev] [PATCH] net/tap: fix ICC compilation fails
01/02/2018 05:43, Zhiyong Yang: > The following error is reported when compiling 18.02-rc2 using ICC, > "transfer of control bypasses initialization of". > The patch fixes the issue. > > Fixes: 1911c5edc6cd ("net/tap: fix eBPF RSS map key handling") > Cc: sta...@dpdk.org stable is not needed here > Cc: pascal.ma...@6wind.com > Cc: ferruh.yi...@intel.com > Cc: tho...@monjalon.net > Cc: ophi...@mellanox.com > > Signed-off-by: Zhiyong Yang Applied, thanks
[dpdk-dev] [PATCH] doc: fix release note for rawdev library
Fixes: a9bb0c44c775 ("doc: add rawdev library guide and doxygen page") Cc: shreyansh.j...@nxp.com '+' sign was missing from librawdev library which is added in this release. Signed-off-by: Shreyansh Jain --- doc/guides/rel_notes/release_18_02.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst index 689080bed..03a82a409 100644 --- a/doc/guides/rel_notes/release_18_02.rst +++ b/doc/guides/rel_notes/release_18_02.rst @@ -320,7 +320,7 @@ The libraries prepended with a plus sign were incremented in this version. librte_pmd_vhost.so.2 librte_port.so.3 librte_power.so.1 - librte_rawdev.so.1 + + librte_rawdev.so.1 librte_reorder.so.1 librte_ring.so.1 librte_sched.so.1 -- 2.14.1
Re: [dpdk-dev] [PATCH v2 1/2] test/memzone: add test for memzone count in eal mem config
On 01-Feb-18 12:12 AM, Thomas Monjalon wrote: 31/01/2018 16:29, Anatoly Burakov: Ensure that memzone count in eal mem config is incremented and decremented whenever memzones are allocated and freed. Signed-off-by: Anatoly Burakov Please report acks from previous version. OK, will submit a v4. -- Thanks, Anatoly
Re: [dpdk-dev] [PATCH] pci/uio: enable prefetchable resources mapping
On 02/01/2018 12:59 PM, Bruce Richardson wrote: On Thu, Feb 01, 2018 at 09:18:22AM +0800, Changpeng Liu wrote: For PCI prefetchable resources, Linux will create a write combined file as well, the library will try to map resourceX_wc file first, if the file does not exist, then it will map resourceX as usual. Signed-off-by: Changpeng Liu --- drivers/bus/pci/linux/pci_uio.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) Hi, Given the lack of ordering guarantees with write-combined memory, I would have thought that this is very risky to do without a complete set of changes inside the PMDs to add in the necessary memory barriers to ensure ordering of operations to the BARs. Therefore, instead of mapping one file or another, I think the change should be made to map *both* in DPDK if available. Then each driver can chose whether to write a given device register using uncacheable memory type or write-combining memory type + any appropriate barriers. For example, with many NICs the initialization of the device involves many register writes in a pretty defined order, so wc operations are probably to suitable as performance is not a concern. However, for data path operations, a driver may chose to use wc memory for the occasional device writes there, for performance reasons. +1 I think so too that it would be useful to have both mappings available and allow driver to choose which one to use.
Re: [dpdk-dev] IXGBE, IOMMU DMAR DRHD handling fault issue
On 31-Jan-18 9:51 PM, Ravi Kerur wrote: Hi Anatoly, Thanks. I am following wiki link below which uses vIOMMU with DPDK as a use-case and instantiate VM as specified with Q35 chipset in Qemu. https://wiki.qemu.org/Features/VT-d Qemu-version is 2.11 Host kernel 4.9 Guest kernel 4.4 I can only guess that guest kernel needs an upgrade in my setup to work correctly, if versions on my setup rings a bell on not having support kindly let me know. When 'modprobe vfio enable_unsafe_noiommu_node=Y' is executed on guest I get following error ... vfio: unknown parameter 'enable_unsafe_noiommu_node' ignored ... in guest. Thanks. AFAIK kernel 4.4 should have noiommu mode - it was introduced in 3.1x days. However, in order for that to work, kernel also has to be built with this mode enabled. My guess is, whoever is the supplier of your kernel, did not do that. You should double-check the kernel configuration of your distribution. However, if you have vIOMMU in QEMU, you shouldn't need noiommu mode - "regular" vfio should work fine. noiommu mode should only be needed if you know you don't have IOMMU enabled in your kernel, and even if you can't enable it, you can still use igb_uio. -- Thanks, Anatoly
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On Thu, Feb 01, 2018 at 01:00:12PM +0300, Andrew Rybchenko wrote: > On 02/01/2018 12:30 PM, santosh wrote: > > On Thursday 01 February 2018 02:48 PM, Andrew Rybchenko wrote: > > > On 02/01/2018 12:09 PM, santosh wrote: > > > > On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: > > > > > On 02/01/2018 08:05 AM, santosh wrote: > > > > > > On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: > > > > > > > On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: > > > > > > > > There is not specified dependency between > > > > > > > > rte_mempool_populate_default() > > > > > > > > and rte_mempool_populate_iova(). So, the second should not rely > > > > > > > > on the > > > > > > > > fact that the first adds capability flags to the mempool flags. > > > > > > > > > > > > > > > > Fixes: 65cf769f5e6a ("mempool: detect physical contiguous > > > > > > > > objects") > > > > > > > > Cc: sta...@dpdk.org > > > > > > > > > > > > > > > > Signed-off-by: Andrew Rybchenko > > > > > > > Looks good to me. I agree it's strange that the mp->flags are > > > > > > > updated with capabilities only in rte_mempool_populate_default(). > > > > > > > I see that this behavior is removed later in the patchset since > > > > > > > the > > > > > > > get_capa() is removed! > > > > > > > > > > > > > > However maybe this single patch could go in 18.02. > > > > > > > +Santosh +Jerin since it's mostly about Octeon. > > > > > > rte_mempool_xmem_size should return correct size if > > > > > > MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag > > > > > > is set in 'mp->flags'. Thats why _ops_get_capabilities() called in > > > > > > _populate_default() but not > > > > > > at _populate_iova(). > > > > > > I think, this 'alone' patch may break octeontx mempool. > > > > > The patch does not touch rte_mempool_populate_default(). > > > > > _ops_get_capabilities() is still called there before > > > > > rte_mempool_xmem_size(). The theoretical problem which > > > > > the patch tries to fix is the case when > > > > > rte_mempool_populate_default() is not called at all. I.e. application > > > > > calls _ops_get_capabilities() to get flags, then, together with > > > > > mp->flags, calls rte_mempool_xmem_size() directly, allocates > > > > > calculated amount of memory and calls _populate_iova(). > > > > > > > > > In that case, Application does like below: > > > > > > > > /* Get mempool capabilities */ > > > > mp_flags = 0; > > > > ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); > > > > if ((ret < 0) && (ret != -ENOTSUP)) > > > > return ret; > > > > > > > > /* update mempool capabilities */ > > > > mp->flags |= mp_flags; > > > Above line is not mandatory. "mp->flags | mp_flags" could be simply > > > passed to rte_mempool_xmem_size() below. > > > > > That depends and again upto application requirement, if app further down > > wants to refer mp->flags for _align/_contig then better update to mp->flags. > > > > But that wasn't the point of discussion, I'm trying to understand that > > w/o this patch, whats could be the application level problem? > > The problem that it is fragile. If application does not use > rte_mempool_populate_default() it has to care about addition > of mempool capability flags into mempool flags. If it is not done, > rte_mempool_populate_iova/virt/iova_tab() functions will work > incorrectly since F_CAPA_PHYS_CONTIG and > F_CAPA_BLK_ALIGNED_OBJECTS are missing. > > The idea of the patch is to make it a bit more robust. I have no > idea how it can break something. 
If capability flags are already > there - no problem. If no, just make sure that we locally have them. The example given by Santosh will work, but it is *not* the role of the application to update the mempool flags. And nothing says that it is mandatory to call rte_mempool_ops_get_capabilities() before the populate functions. For instance, in testpmd it calls rte_mempool_populate_anon() when using anonymous memory. The capabilities will never be updated in mp->flags.
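To make the flow under discussion concrete, here is a minimal sketch of an application that skips rte_mempool_populate_default() and populates the pool itself, assuming the 17.11-era mempool API; populate_without_default(), the zero page shift and the rte_malloc()-based allocation are illustrative choices, not taken from any patch in this thread:

#include <errno.h>
#include <rte_malloc.h>
#include <rte_mempool.h>

static int
populate_without_default(struct rte_mempool *mp, uint32_t elt_num,
                         size_t total_elt_sz)
{
        unsigned int mp_flags = 0;
        size_t len;
        char *vaddr;
        int ret;

        /* Ask the mempool driver for its capability flags. */
        ret = rte_mempool_ops_get_capabilities(mp, &mp_flags);
        if (ret < 0 && ret != -ENOTSUP)
                return ret;

        /* Size must be computed with the capability flags included. */
        len = rte_mempool_xmem_size(elt_num, total_elt_sz, 0,
                                    mp->flags | mp_flags);

        vaddr = rte_malloc(NULL, len, RTE_CACHE_LINE_SIZE);
        if (vaddr == NULL)
                return -ENOMEM;

        /*
         * Without the fix discussed above, _populate_iova() does not see the
         * capability flags unless the application stored them in mp->flags.
         */
        return rte_mempool_populate_iova(mp, vaddr, rte_malloc_virt2iova(vaddr),
                                         len, NULL, NULL);
}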
[dpdk-dev] [PATCH v4 1/2] test/memzone: add test for memzone count in eal mem config
Ensure that memzone count in eal mem config is incremented and decremented whenever memzones are allocated and freed. Reviewed-by: Radoslaw Biernacki Signed-off-by: Anatoly Burakov --- Notes: v4: added missing reviewed-by tag test/test/test_memzone.c | 20 1 file changed, 20 insertions(+) diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c index f6c9b56..00d340f 100644 --- a/test/test/test_memzone.c +++ b/test/test/test_memzone.c @@ -841,6 +841,9 @@ test_memzone_basic(void) const struct rte_memzone *memzone3; const struct rte_memzone *memzone4; const struct rte_memzone *mz; + int memzone_cnt_after, memzone_cnt_expected; + int memzone_cnt_before = + rte_eal_get_configuration()->mem_config->memzone_cnt; memzone1 = rte_memzone_reserve("testzone1", 100, SOCKET_ID_ANY, 0); @@ -858,6 +861,18 @@ test_memzone_basic(void) if (memzone1 == NULL || memzone2 == NULL || memzone4 == NULL) return -1; + /* check how many memzones we are expecting */ + memzone_cnt_expected = memzone_cnt_before + + (memzone1 != NULL) + (memzone2 != NULL) + + (memzone3 != NULL) + (memzone4 != NULL); + + memzone_cnt_after = + rte_eal_get_configuration()->mem_config->memzone_cnt; + + if (memzone_cnt_after != memzone_cnt_expected) + return -1; + + rte_memzone_dump(stdout); /* check cache-line alignments */ @@ -930,6 +945,11 @@ test_memzone_basic(void) return -1; } + memzone_cnt_after = + rte_eal_get_configuration()->mem_config->memzone_cnt; + if (memzone_cnt_after != memzone_cnt_before) + return -1; + return 0; } -- 2.7.4
[dpdk-dev] [PATCH v4 2/2] test/memzone: handle previously allocated memzones
Currently, memzone autotest expects there to be no memzones present by the time the test is run. Some hardware drivers will allocate memzones for internal use during initialization, resulting in tests failing due to unexpected memzones being allocated before the test was run. Fix this by making sure all memzones allocated by this test have a common prefix, and making callback increment a counter on encountering memzones with this prefix. Also, separately increment another counter that will count total number of memzones left after test, and compares it to previously stored number of memzones, to ensure that we didn't accidentally allocated/freed any memzones we weren't supposed to. This also doubles as a test for correct operation of memzone_walk(). Fixes: 71330483a193 ("test/memzone: fix memory leak") Cc: radoslaw.bierna...@linaro.org Cc: sta...@dpdk.org Reviewed-by: Radoslaw Biernacki Signed-off-by: Phil Yang Signed-off-by: Anatoly Burakov --- Notes: v4: added missing reviewed-by tag v3: suggested-by was supposed to be a signoff v2: incorporated Phil Yang's patch to better ensure no memzones were left behind by the test test/test/test_memzone.c | 225 +-- 1 file changed, 140 insertions(+), 85 deletions(-) diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c index 00d340f..8ece1ac 100644 --- a/test/test/test_memzone.c +++ b/test/test/test_memzone.c @@ -4,6 +4,7 @@ #include #include +#include #include #include @@ -47,6 +48,8 @@ * - Check flags for specific huge page size reservation */ +#define TEST_MEMZONE_NAME(suffix) "MZ_TEST_" suffix + /* Test if memory overlaps: return 1 if true, or 0 if false. */ static int is_memory_overlap(rte_iova_t ptr1, size_t len1, rte_iova_t ptr2, size_t len2) @@ -63,14 +66,14 @@ test_memzone_invalid_alignment(void) { const struct rte_memzone * mz; - mz = rte_memzone_lookup("invalid_alignment"); + mz = rte_memzone_lookup(TEST_MEMZONE_NAME("invalid_alignment")); if (mz != NULL) { printf("Zone with invalid alignment has been reserved\n"); return -1; } - mz = rte_memzone_reserve_aligned("invalid_alignment", 100, - SOCKET_ID_ANY, 0, 100); + mz = rte_memzone_reserve_aligned(TEST_MEMZONE_NAME("invalid_alignment"), +100, SOCKET_ID_ANY, 0, 100); if (mz != NULL) { printf("Zone with invalid alignment has been reserved\n"); return -1; @@ -83,14 +86,16 @@ test_memzone_reserving_zone_size_bigger_than_the_maximum(void) { const struct rte_memzone * mz; - mz = rte_memzone_lookup("zone_size_bigger_than_the_maximum"); + mz = rte_memzone_lookup( + TEST_MEMZONE_NAME("zone_size_bigger_than_the_maximum")); if (mz != NULL) { printf("zone_size_bigger_than_the_maximum has been reserved\n"); return -1; } - mz = rte_memzone_reserve("zone_size_bigger_than_the_maximum", (size_t)-1, - SOCKET_ID_ANY, 0); + mz = rte_memzone_reserve( + TEST_MEMZONE_NAME("zone_size_bigger_than_the_maximum"), + (size_t)-1, SOCKET_ID_ANY, 0); if (mz != NULL) { printf("It is impossible to reserve such big a memzone\n"); return -1; @@ -137,8 +142,8 @@ test_memzone_reserve_flags(void) * available page size (i.e 1GB ) when 2MB pages are unavailable. 
*/ if (hugepage_2MB_avail) { - mz = rte_memzone_reserve("flag_zone_2M", size, SOCKET_ID_ANY, - RTE_MEMZONE_2MB); + mz = rte_memzone_reserve(TEST_MEMZONE_NAME("flag_zone_2M"), + size, SOCKET_ID_ANY, RTE_MEMZONE_2MB); if (mz == NULL) { printf("MEMZONE FLAG 2MB\n"); return -1; @@ -152,7 +157,8 @@ test_memzone_reserve_flags(void) return -1; } - mz = rte_memzone_reserve("flag_zone_2M_HINT", size, SOCKET_ID_ANY, + mz = rte_memzone_reserve(TEST_MEMZONE_NAME("flag_zone_2M_HINT"), + size, SOCKET_ID_ANY, RTE_MEMZONE_2MB|RTE_MEMZONE_SIZE_HINT_ONLY); if (mz == NULL) { printf("MEMZONE FLAG 2MB\n"); @@ -171,7 +177,9 @@ test_memzone_reserve_flags(void) * HINT flag is indicated */ if (!hugepage_1GB_avail) { - mz = rte_memzone_reserve("flag_zone_1G_HINT", size, SOCKET_ID_ANY, + mz = rte_memzone_reserve( + TEST_MEMZONE_NAME("flag_zone_1G_HINT"), + size, SOCKET_ID_ANY,
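The counting scheme described in the commit message comes down to a memzone walk callback keyed on the test prefix. A simplified sketch follows; count_memzones_cb(), walk_counts and check_leftover_memzones() are illustrative names, not the ones used in the patch:

#include <string.h>
#include <rte_memzone.h>

#define TEST_MEMZONE_PREFIX "MZ_TEST_"

struct walk_counts {
        int test_zones;   /* memzones created by this test (prefix match) */
        int total_zones;  /* every memzone still registered */
};

static void
count_memzones_cb(const struct rte_memzone *mz, void *arg)
{
        struct walk_counts *c = arg;

        c->total_zones++;
        if (strncmp(mz->name, TEST_MEMZONE_PREFIX,
                    strlen(TEST_MEMZONE_PREFIX)) == 0)
                c->test_zones++;
}

/* After the test: no leftover test zones, and the same total count as before
 * the test (memzones pre-allocated by drivers are tolerated). */
static int
check_leftover_memzones(int total_before)
{
        struct walk_counts counts = { 0, 0 };

        rte_memzone_walk(count_memzones_cb, &counts);
        return (counts.test_zones == 0 &&
                counts.total_zones == total_before) ? 0 : -1;
}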
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On Thursday 01 February 2018 03:30 PM, Andrew Rybchenko wrote: > On 02/01/2018 12:30 PM, santosh wrote: >> On Thursday 01 February 2018 02:48 PM, Andrew Rybchenko wrote: >>> On 02/01/2018 12:09 PM, santosh wrote: On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: > On 02/01/2018 08:05 AM, santosh wrote: >> On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: >>> On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: There is not specified dependency between rte_mempool_populate_default() and rte_mempool_populate_iova(). So, the second should not rely on the fact that the first adds capability flags to the mempool flags. Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") Cc: sta...@dpdk.org Signed-off-by: Andrew Rybchenko >>> Looks good to me. I agree it's strange that the mp->flags are >>> updated with capabilities only in rte_mempool_populate_default(). >>> I see that this behavior is removed later in the patchset since the >>> get_capa() is removed! >>> >>> However maybe this single patch could go in 18.02. >>> +Santosh +Jerin since it's mostly about Octeon. >> rte_mempool_xmem_size should return correct size if >> MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag >> is set in 'mp->flags'. Thats why _ops_get_capabilities() called in >> _populate_default() but not >> at _populate_iova(). >> I think, this 'alone' patch may break octeontx mempool. > The patch does not touch rte_mempool_populate_default(). > _ops_get_capabilities() is still called there before > rte_mempool_xmem_size(). The theoretical problem which > the patch tries to fix is the case when > rte_mempool_populate_default() is not called at all. I.e. application > calls _ops_get_capabilities() to get flags, then, together with > mp->flags, calls rte_mempool_xmem_size() directly, allocates > calculated amount of memory and calls _populate_iova(). > In that case, Application does like below: /* Get mempool capabilities */ mp_flags = 0; ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); if ((ret < 0) && (ret != -ENOTSUP)) return ret; /* update mempool capabilities */ mp->flags |= mp_flags; >>> Above line is not mandatory. "mp->flags | mp_flags" could be simply >>> passed to rte_mempool_xmem_size() below. >>> >> That depends and again upto application requirement, if app further down >> wants to refer mp->flags for _align/_contig then better update to mp->flags. >> >> But that wasn't the point of discussion, I'm trying to understand that >> w/o this patch, whats could be the application level problem? > > The problem that it is fragile. If application does not use > rte_mempool_populate_default() it has to care about addition > of mempool capability flags into mempool flags. If it is not done, Capability flags should get updated to mempool flags. Or else _get_ops_capabilities() to update capa flags to mempool flags internally, I recall that I proposed same in the past. [...] > The idea of the patch is to make it a bit more robust. I have no > idea how it can break something. If capability flags are already > there - no problem. If no, just make sure that we locally have them. > I would prefer _get_ops_capabilities() updates capa flags to mp->flag for once, rather than doing (mp->flags | mp_flags) across mempool func.
[dpdk-dev] [PATCH v1] net/failsafe: fix strerror call in sub-eal
Ownership API returns a negative value, strerror expects a valid errno value, thus positive. CID 260401: Error handling issues (NEGATIVE_RETURNS) "ret" is passed to a parameter that cannot be negative. Fixes: dcd0c9c32b8d ("net/failsafe: use ownership mechanism for slaves") Signed-off-by: Gaetan Rivet --- drivers/net/failsafe/failsafe_eal.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c index 8946bf6fe..c3d673125 100644 --- a/drivers/net/failsafe/failsafe_eal.c +++ b/drivers/net/failsafe/failsafe_eal.c @@ -79,9 +79,9 @@ fs_bus_init(struct rte_eth_dev *dev) " %d named %s", i, da->name); } ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner); - if (ret) { + if (ret < 0) { INFO("sub_device %d owner set failed (%s)," -" will try again later", i, strerror(ret)); +" will try again later", i, strerror(-ret)); continue; } else if (strncmp(rte_eth_devices[pid].device->name, da->name, strlen(da->name)) != 0) { -- 2.11.0
Re: [dpdk-dev] [PATCH 12/14] vhost: dequeue for packed queues
On Thu, Feb 01, 2018 at 10:35:18AM +0100, Maxime Coquelin wrote: On 01/29/2018 03:11 PM, Jens Freimann wrote: Implement code to dequeue and process descriptors from the vring if VIRTIO_F_PACKED is enabled. Check if descriptor was made available by driver by looking at VIRTIO_F_DESC_AVAIL flag in descriptor. If so dequeue and set the used flag VIRTIO_F_DESC_USED to the current value of the used wrap counter. Used ring wrap counter needs to be toggled when last descriptor is written out. This allows the host/guest to detect new descriptors even after the ring has wrapped. Signed-off-by: Jens Freimann --- lib/librte_vhost/vhost.c | 1 + lib/librte_vhost/vhost.h | 1 + lib/librte_vhost/virtio_net.c | 194 ++ 3 files changed, 196 insertions(+) diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 78913912c..e5f58d9c8 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -191,6 +191,7 @@ init_vring_queue(struct virtio_net *dev, uint32_t vring_idx) vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD; + vq->used_wrap_counter = 1; vhost_user_iotlb_init(dev, vring_idx); /* Backends are set to -1 indicating an inactive device. */ diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index 8554d51d8..a3d4214b6 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -106,6 +106,7 @@ struct vhost_virtqueue { struct batch_copy_elem *batch_copy_elems; uint16_tbatch_copy_nb_elems; + uint32_tused_wrap_counter; rte_rwlock_tiotlb_lock; rte_rwlock_tiotlb_pending_lock; diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index edfab3ba6..5d4cfe8cc 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -19,6 +19,7 @@ #include "iotlb.h" #include "vhost.h" +#include "virtio-1.1.h" #define MAX_PKT_BURST 32 @@ -,6 +1112,199 @@ restore_mbuf(struct rte_mbuf *m) } } +static inline uint16_t +dequeue_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, +struct rte_mempool *mbuf_pool, struct rte_mbuf *m, +struct vring_desc_1_1 *descs) +{ + struct vring_desc_1_1 *desc; + uint64_t desc_addr; + uint32_t desc_avail, desc_offset; + uint32_t mbuf_avail, mbuf_offset; + uint32_t cpy_len; + struct rte_mbuf *cur = m, *prev = m; + struct virtio_net_hdr *hdr = NULL; + uint16_t head_idx = vq->last_used_idx & (vq->size - 1); + int wrap_counter = vq->used_wrap_counter; + + desc = &descs[vq->last_used_idx & (vq->size - 1)]; + if (unlikely((desc->len < dev->vhost_hlen)) || + (desc->flags & VRING_DESC_F_INDIRECT)) + rte_panic("INDIRECT not supported yet"); Using rte_panic() may not be a good idea here, because a malicious guest could make the vswitch to crash easily. Good point. It was for debugging only, I will remove it. + + desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr); You should use vhost_iova_to_vva() here and everywhere else, otherwise you break IOMMU support. Yes, I'll change it. + if (unlikely(!desc_addr)) + return -1; + + if (virtio_net_with_host_offload(dev)) { + hdr = (struct virtio_net_hdr *)((uintptr_t)desc_addr); + rte_prefetch0(hdr); + } + + /* +* A virtio driver normally uses at least 2 desc buffers +* for Tx: the first for storing the header, and others +* for storing the data. 
+*/ + if (likely((desc->len == dev->vhost_hlen) && + (desc->flags & VRING_DESC_F_NEXT) != 0)) { + if ((++vq->last_used_idx & (vq->size - 1)) == 0) + toggle_wrap_counter(vq); + + desc = &descs[vq->last_used_idx & (vq->size - 1)]; + + if (unlikely(desc->flags & VRING_DESC_F_INDIRECT)) + rte_panic("INDIRECT not supported yet"); Ditto. + + desc_addr = rte_vhost_gpa_to_vva(dev->mem, desc->addr); + if (unlikely(!desc_addr)) + return -1; + + desc_offset = 0; + desc_avail = desc->len; + } else { + desc_avail = desc->len - dev->vhost_hlen; + desc_offset = dev->vhost_hlen; + } + + rte_prefetch0((void *)(uintptr_t)(desc_addr + desc_offset)); + + PRINT_PACKET(dev, (uintptr_t)(desc_addr + desc_offset), desc_avail, 0); + + mbuf_offset = 0; + mbuf_avail = m->buf_len - RTE_PKTMBUF_HEADROOM; + while (1) { + uint64_t hpa; + + cpy_len = RTE_MIN(desc_avail, mbuf_avail); + + /* +* A desc buf might across two host physical pages that are +* not continuous. In such case
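Conceptually, the wrap-counter handling described in the commit message reduces to two small helpers. The sketch below is only an illustration of the rule; the flag bit positions follow the packed-ring specification draft and the structure is a stand-in for struct vhost_virtqueue, so it does not reproduce the patch's actual code:

#include <stdint.h>

/* Flag bit positions as in the packed-ring specification draft. */
#define DESC_F_AVAIL    (1 << 7)
#define DESC_F_USED     (1 << 15)

struct vq_state {                 /* stand-in for struct vhost_virtqueue */
        uint32_t used_wrap_counter;
};

/* A descriptor has been made available by the driver when its AVAIL bit
 * matches the wrap counter the device expects and its USED bit does not. */
static inline int
desc_is_avail(uint16_t flags, uint32_t wrap_counter)
{
        uint32_t avail = !!(flags & DESC_F_AVAIL);
        uint32_t used = !!(flags & DESC_F_USED);

        return avail == wrap_counter && used != wrap_counter;
}

/* Toggled each time the ring index wraps past the end, so descriptors
 * written after the wrap remain distinguishable from stale ones. */
static inline void
toggle_wrap_counter(struct vq_state *vq)
{
        vq->used_wrap_counter ^= 1;
}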
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On Thursday 01 February 2018 03:44 PM, Olivier Matz wrote: > On Thu, Feb 01, 2018 at 01:00:12PM +0300, Andrew Rybchenko wrote: >> On 02/01/2018 12:30 PM, santosh wrote: >>> On Thursday 01 February 2018 02:48 PM, Andrew Rybchenko wrote: On 02/01/2018 12:09 PM, santosh wrote: > On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: >> On 02/01/2018 08:05 AM, santosh wrote: >>> On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: > There is not specified dependency between > rte_mempool_populate_default() > and rte_mempool_populate_iova(). So, the second should not rely on the > fact that the first adds capability flags to the mempool flags. > > Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") > Cc: sta...@dpdk.org > > Signed-off-by: Andrew Rybchenko Looks good to me. I agree it's strange that the mp->flags are updated with capabilities only in rte_mempool_populate_default(). I see that this behavior is removed later in the patchset since the get_capa() is removed! However maybe this single patch could go in 18.02. +Santosh +Jerin since it's mostly about Octeon. >>> rte_mempool_xmem_size should return correct size if >>> MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag >>> is set in 'mp->flags'. Thats why _ops_get_capabilities() called in >>> _populate_default() but not >>> at _populate_iova(). >>> I think, this 'alone' patch may break octeontx mempool. >> The patch does not touch rte_mempool_populate_default(). >> _ops_get_capabilities() is still called there before >> rte_mempool_xmem_size(). The theoretical problem which >> the patch tries to fix is the case when >> rte_mempool_populate_default() is not called at all. I.e. application >> calls _ops_get_capabilities() to get flags, then, together with >> mp->flags, calls rte_mempool_xmem_size() directly, allocates >> calculated amount of memory and calls _populate_iova(). >> > In that case, Application does like below: > > /* Get mempool capabilities */ > mp_flags = 0; > ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); > if ((ret < 0) && (ret != -ENOTSUP)) > return ret; > > /* update mempool capabilities */ > mp->flags |= mp_flags; Above line is not mandatory. "mp->flags | mp_flags" could be simply passed to rte_mempool_xmem_size() below. >>> That depends and again upto application requirement, if app further down >>> wants to refer mp->flags for _align/_contig then better update to mp->flags. >>> >>> But that wasn't the point of discussion, I'm trying to understand that >>> w/o this patch, whats could be the application level problem? >> The problem that it is fragile. If application does not use >> rte_mempool_populate_default() it has to care about addition >> of mempool capability flags into mempool flags. If it is not done, >> rte_mempool_populate_iova/virt/iova_tab() functions will work >> incorrectly since F_CAPA_PHYS_CONTIG and >> F_CAPA_BLK_ALIGNED_OBJECTS are missing. >> >> The idea of the patch is to make it a bit more robust. I have no >> idea how it can break something. If capability flags are already >> there - no problem. If no, just make sure that we locally have them. > The example given by Santosh will work, but it is *not* the role of the > application to update the mempool flags. And nothing says that it is mandatory > to call rte_mempool_ops_get_capabilities() before the populate functions. > > For instance, in testpmd it calls rte_mempool_populate_anon() when using > anonymous memory. 
The capabilities will never be updated in mp->flags. Valid case, and I agree with your example and explanation. With the nit change: mp->flags |= mp_capa_flags; Acked-by: Santosh Shukla
[dpdk-dev] [PATCH 1/3] net/null: add set MAC address dev op
Needed if used with net/bonding Signed-off-by: Radu Nicolau --- v2: remove redundant memcpy drivers/net/null/rte_eth_null.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c index 9385ffd..d003b28 100644 --- a/drivers/net/null/rte_eth_null.c +++ b/drivers/net/null/rte_eth_null.c @@ -461,6 +461,12 @@ eth_rss_hash_conf_get(struct rte_eth_dev *dev, return 0; } +static void +eth_mac_address_set(__rte_unused struct rte_eth_dev *dev, + __rte_unused struct ether_addr *addr) +{ +} + static const struct eth_dev_ops ops = { .dev_start = eth_dev_start, .dev_stop = eth_dev_stop, @@ -472,6 +478,7 @@ static const struct eth_dev_ops ops = { .tx_queue_release = eth_queue_release, .mtu_set = eth_mtu_set, .link_update = eth_link_update, + .mac_addr_set = eth_mac_address_set, .stats_get = eth_stats_get, .stats_reset = eth_stats_reset, .reta_update = eth_rss_reta_update, -- 2.7.5
[dpdk-dev] [PATCH 1/3] net/null: add set MAC address dev op
Needed if used with net/bonding Signed-off-by: Radu Nicolau --- v2: remove redundant memcpy drivers/net/null/rte_eth_null.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c index 9385ffd..d003b28 100644 --- a/drivers/net/null/rte_eth_null.c +++ b/drivers/net/null/rte_eth_null.c @@ -461,6 +461,12 @@ eth_rss_hash_conf_get(struct rte_eth_dev *dev, return 0; } +static void +eth_mac_address_set(__rte_unused struct rte_eth_dev *dev, + __rte_unused struct ether_addr *addr) +{ +} + static const struct eth_dev_ops ops = { .dev_start = eth_dev_start, .dev_stop = eth_dev_stop, @@ -472,6 +478,7 @@ static const struct eth_dev_ops ops = { .tx_queue_release = eth_queue_release, .mtu_set = eth_mtu_set, .link_update = eth_link_update, + .mac_addr_set = eth_mac_address_set, .stats_get = eth_stats_get, .stats_reset = eth_stats_reset, .reta_update = eth_rss_reta_update, -- 2.7.5
[dpdk-dev] [PATCH v2] net/null: add set MAC address dev op
Needed if used with net/bonding Signed-off-by: Radu Nicolau --- v2: remove redundant memcpy drivers/net/null/rte_eth_null.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/null/rte_eth_null.c b/drivers/net/null/rte_eth_null.c index 9385ffd..d003b28 100644 --- a/drivers/net/null/rte_eth_null.c +++ b/drivers/net/null/rte_eth_null.c @@ -461,6 +461,12 @@ eth_rss_hash_conf_get(struct rte_eth_dev *dev, return 0; } +static void +eth_mac_address_set(__rte_unused struct rte_eth_dev *dev, + __rte_unused struct ether_addr *addr) +{ +} + static const struct eth_dev_ops ops = { .dev_start = eth_dev_start, .dev_stop = eth_dev_stop, @@ -472,6 +478,7 @@ static const struct eth_dev_ops ops = { .tx_queue_release = eth_queue_release, .mtu_set = eth_mtu_set, .link_update = eth_link_update, + .mac_addr_set = eth_mac_address_set, .stats_get = eth_stats_get, .stats_reset = eth_stats_reset, .reta_update = eth_rss_reta_update, -- 2.7.5
[dpdk-dev] [PATCH v2] test/virtual_pmd: add set MAC address dev op
Needed if used with net/bonding Signed-off-by: Radu Nicolau --- v2: remove redundant memcpy test/test/virtual_pmd.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/test/test/virtual_pmd.c b/test/test/virtual_pmd.c index 7a7adbb..2f5b31d 100644 --- a/test/test/virtual_pmd.c +++ b/test/test/virtual_pmd.c @@ -216,6 +216,11 @@ static void virtual_ethdev_promiscuous_mode_disable(struct rte_eth_dev *dev __rte_unused) {} +static void +virtual_ethdev_mac_address_set(__rte_unused struct rte_eth_dev *dev, + __rte_unused struct ether_addr *addr) +{ +} static const struct eth_dev_ops virtual_ethdev_default_dev_ops = { .dev_configure = virtual_ethdev_configure_success, @@ -228,13 +233,13 @@ static const struct eth_dev_ops virtual_ethdev_default_dev_ops = { .rx_queue_release = virtual_ethdev_rx_queue_release, .tx_queue_release = virtual_ethdev_tx_queue_release, .link_update = virtual_ethdev_link_update_success, + .mac_addr_set = virtual_ethdev_mac_address_set, .stats_get = virtual_ethdev_stats_get, .stats_reset = virtual_ethdev_stats_reset, .promiscuous_enable = virtual_ethdev_promiscuous_mode_enable, .promiscuous_disable = virtual_ethdev_promiscuous_mode_disable }; - void virtual_ethdev_start_fn_set_success(uint16_t port_id, uint8_t success) { -- 2.7.5
[dpdk-dev] [PATCH v2] test/bonding: assign non-zero MAC to null devices
Prevent failure in rte_eth_dev_default_mac_addr_set() that results in bonding add slave failure. Signed-off-by: Radu Nicolau --- v2: update commit message test/test/test_link_bonding_rssconf.c | 5 + 1 file changed, 5 insertions(+) diff --git a/test/test/test_link_bonding_rssconf.c b/test/test/test_link_bonding_rssconf.c index cf9c4b0..518c4c1 100644 --- a/test/test/test_link_bonding_rssconf.c +++ b/test/test/test_link_bonding_rssconf.c @@ -505,6 +505,7 @@ test_setup(void) int port_id; char name[256]; struct slave_conf *port; + struct ether_addr mac_addr = {0}; if (test_params.mbuf_pool == NULL) { @@ -536,6 +537,10 @@ test_setup(void) TEST_ASSERT_SUCCESS(retval, "Failed to configure virtual ethdev %s\n", name); + /* assign a non-zero MAC */ + mac_addr.addr_bytes[5] = 0x10 + port->port_id; + rte_eth_dev_default_mac_addr_set(port->port_id, &mac_addr); + rte_eth_dev_info_get(port->port_id, &port->dev_info); } -- 2.7.5
[dpdk-dev] [PATCH v2] net/failsafe: fix strerror call in sub-eal
Ownership API returns a negative value, strerror expects a valid errno value, thus positive. Coverity issue: 260401 Fixes: dcd0c9c32b8d ("net/failsafe: use ownership mechanism for slaves") Signed-off-by: Gaetan Rivet --- v2: Fix coverity reference syntax in commit log. drivers/net/failsafe/failsafe_eal.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c index 8946bf6fe..c3d673125 100644 --- a/drivers/net/failsafe/failsafe_eal.c +++ b/drivers/net/failsafe/failsafe_eal.c @@ -79,9 +79,9 @@ fs_bus_init(struct rte_eth_dev *dev) " %d named %s", i, da->name); } ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner); - if (ret) { + if (ret < 0) { INFO("sub_device %d owner set failed (%s)," -" will try again later", i, strerror(ret)); +" will try again later", i, strerror(-ret)); continue; } else if (strncmp(rte_eth_devices[pid].device->name, da->name, strlen(da->name)) != 0) { -- 2.11.0
[dpdk-dev] [RFC v4 1/1] lib/compressdev: Adding hash support
Added hash support in lib compressdev. It's an incremental patch to compression lib RFC v3 https://dpdk.org/dev/patchwork/patch/32331/ Changes from RFC v3: - Added hash algo enumeration and associated capability stucture and params in xform and rte_comp_op - Rearranged rte_compresdev_capability structure to have separate rte_comp_algo_capability and moved algo specific capabilities: window_size, dictionary support, and hash as part of it - Added RTE_COMP_UNSPECIFIED=0 in enum rte_comp_algorithm - Redefined RTE_COMP_END_OF_CAPABILITIES_LIST to use RTE_COMP_UNSPECIFIED to resolve missing-field-initializer compiler warning - Updated compress/decompress xform to input hash algorithm during session init - Updated struct rte_comp_op to input hash buffer - Fixed checkpatch reported errors on RFCv3 Every compression algorithm can indicate its capability to perform alongside hash in its associate rte_comp_algo_capa structure. If none is supported, can terminate array with hash_algo = RTE_COMP_HASH_ALGO_UNSPECIFIED. if supported, application can initialize session with desired algorithm enumeration in xform structure and pass valid hash buffer pointer during enqueue_burst(). Signed-off-by: Shally Verma --- lib/librte_compressdev/rte_comp.h | 83 +- lib/librte_compressdev/rte_compressdev.c | 19 + lib/librte_compressdev/rte_compressdev.h | 59 --- lib/librte_compressdev/rte_compressdev_version.map | 1 + 4 files changed, 137 insertions(+), 25 deletions(-) diff --git a/lib/librte_compressdev/rte_comp.h b/lib/librte_compressdev/rte_comp.h index ca8cbb4..341f59f 100644 --- a/lib/librte_compressdev/rte_comp.h +++ b/lib/librte_compressdev/rte_comp.h @@ -75,10 +75,11 @@ enum rte_comp_op_status { */ }; - /** Compression Algorithms */ enum rte_comp_algorithm { - RTE_COMP_NULL = 0, + RTE_COMP_UNSPECIFIED = 0, + /** No Compression algo */ + RTE_COMP_NULL, /**< No compression. * Pass-through, data is copied unchanged from source buffer to * destination buffer. @@ -94,6 +95,18 @@ enum rte_comp_algorithm { RTE_COMP_ALGO_LIST_END }; + +/** Compression hash algorithms */ +enum rte_comp_hash_algorithm { + RTE_COMP_HASH_ALGO_UNSPECIFIED = 0, + /**< No hash */ + RTE_COMP_HASH_ALGO_SHA1, + /**< SHA1 hash algorithm */ + RTE_COMP_HASH_ALGO_SHA256, + /**< SHA256 hash algorithm */ + RTE_COMP_HASH_ALGO_LIST_END, +}; + /**< Compression Level. * The number is interpreted by each PMD differently. However, lower numbers * give fastest compression, at the expense of compression ratio while @@ -154,21 +167,24 @@ enum rte_comp_flush_flag { RTE_COMP_FLUSH_SYNC, /**< All data should be flushed to output buffer. Output data can be * decompressed. However state and history is not cleared, so future -* ops may use history from this op */ +* ops may use history from this op +*/ RTE_COMP_FLUSH_FULL, /**< All data should be flushed to output buffer. Output data can be * decompressed. State and history data is cleared, so future * ops will be independent of ops processed before this. */ RTE_COMP_FLUSH_FINAL - /**< Same as RTE_COMP_FLUSH_FULL but also bfinal bit is set in last block + /**< Same as RTE_COMP_FLUSH_FULL but also bfinal bit is set in +* last block */ /* TODO: * describe flag meanings for decompression. * describe behavous in OUT_OF_SPACE case. * At least the last flag is specific to deflate algo. Should this be * called rte_comp_deflate_flush_flag? And should there be - * comp_op_deflate_params in the op? */ + * comp_op_deflate_params in the op? 
+ */ }; /** Compression transform types */ @@ -180,17 +196,17 @@ enum rte_comp_xform_type { }; enum rte_comp_op_type { -RTE_COMP_OP_STATELESS, -/**< All data to be processed is submitted in the op, no state or history - * from previous ops is used and none will be stored for future ops. - * flush must be set to either FLUSH_FULL or FLUSH_FINAL - */ -RTE_COMP_OP_STATEFUL -/**< There may be more data to be processed after this op, it's part of a - * stream of data. State and history from previous ops can be used - * and resulting state and history can be stored for future ops, - * depending on flush_flag. - */ + RTE_COMP_OP_STATELESS, + /**< All data to be processed is submitted in the op, no state or +* history from previous ops is used and none will be stored for +* future ops.flush must be set to either FLUSH_FULL or FLUSH_FINAL +*/ + RTE_COMP_OP_STATEFUL + /**< There may be more data to be processed after this op, it's +* part of a stream of data. State and history from previous ops +
[dpdk-dev] [PATCH v3] test/bonding: assign non-zero MAC to null devices
Prevent failure in rte_eth_dev_default_mac_addr_set() that results in bonding add slave failure. Fixes: aa7791ba8de0 ("net/bonding: fix setting slave MAC addresses") Signed-off-by: Radu Nicolau --- v3: update commit message test/test/test_link_bonding_rssconf.c | 5 + 1 file changed, 5 insertions(+) diff --git a/test/test/test_link_bonding_rssconf.c b/test/test/test_link_bonding_rssconf.c index cf9c4b0..518c4c1 100644 --- a/test/test/test_link_bonding_rssconf.c +++ b/test/test/test_link_bonding_rssconf.c @@ -505,6 +505,7 @@ test_setup(void) int port_id; char name[256]; struct slave_conf *port; + struct ether_addr mac_addr = {0}; if (test_params.mbuf_pool == NULL) { @@ -536,6 +537,10 @@ test_setup(void) TEST_ASSERT_SUCCESS(retval, "Failed to configure virtual ethdev %s\n", name); + /* assign a non-zero MAC */ + mac_addr.addr_bytes[5] = 0x10 + port->port_id; + rte_eth_dev_default_mac_addr_set(port->port_id, &mac_addr); + rte_eth_dev_info_get(port->port_id, &port->dev_info); } -- 2.7.5
[dpdk-dev] FW: [RFC v1 1/1] lib/cryptodev: add support of asymmetric crypto
Hi Pablo/Fiona Could you please provide your input on this RFC. Your feedback is awaited. Thanks Shally -Original Message- From: Verma, Shally Sent: 23 January 2018 15:24 To: declan.dohe...@intel.com Cc: dev@dpdk.org; Athreya, Narayana Prasad ; Murthy, Nidadavolu ; Sahu, Sunila ; Gupta, Ashish ; Verma, Shally Subject: [RFC v1 1/1] lib/cryptodev: add support of asymmetric crypto From: Shally Verma Add support for asymmetric crypto operations in DPDK lib cryptodev Key feature include: - Only session based asymmetric crypto operations - new get and set APIs for symmetric and asymmetric session private data and other informations - APIs to create, configure and attch queue pair to asymmetric sessions - new capabilities in struct device_info to indicate -- number of dedicated queue pairs available for symmetric and asymmetric operations, if any -- number of asymmetric sessions possible per qp Proposed asymmetric cryptographic operations are: - rsa - dsa - deffie-hellman key pair generation and shared key computation - ecdeffie-hellman - fundamental elliptic curve operations - elliptic curve DSA - modular exponentiation and inversion This patch primarily defines PMD operations and device capabilities to perform asymmetric crypto ops on queue pairs and intend to invite feedbacks on current proposal so as to ensure it encompass all kind of crypto devices with different capabilities and queue pair management. List of TBDs: - Currently, patch only updated for RSA xform and associated params. Other algoritms to be added in subsequent versions. - per-service stats update Signed-off-by: Shally Verma --- It is derivative of RFC v2 asymmetric crypto patch series initiated by Umesh Kartha(mailto:umesh.kar...@caviumnetworks.com): http://dpdk.org/dev/patchwork/patch/24245/ http://dpdk.org/dev/patchwork/patch/24246/ http://dpdk.org/dev/patchwork/patch/24247/ And inclusive of all review comments given on RFC v2. ( See complete discussion thread here: http://dev.dpdk.narkive.com/yqTFFLHw/dpdk-dev-rfc-specifications-for-asymmetric-crypto-algorithms#post12) Some of the RFCv2 Review comments pending for closure: > " [Fiona] The count fn isn't used at all for sym - probably no need to add > for asym better instead to remove the sym fn." It is still present in dpdk-next-crypto for sym, so what has been decision on it? >"[Fiona] if each qp can handle only a specific service, i.e. a subset off the >capabilities Indicated by the device capability list, there's a need for a new API to query the capability of a qp." Current proposal doesn’t distinguish between device capability and qp capability. It rather leave such differences handling internal to PMDs. Thus no capability or API added for qp in current version. It is subject to revisit based on review feedback on current proposal. - Sessionless Support. Current proposal only support Session-based because: 1. All one-time setup i.e. algos and associated params, such as, public-private keys or modulus length can be done in control path using session-init API 2. it’s an easier way to dedicate qp to do specific service (using queue_pair_attach()) which cannot be case in sessionless 3. Couldn’t find any significant advantage going sessionless way. Also existing most of PMDs are session-based. It could be added in subsequent versions, if requirement is identified, based on review comment on this RFC. Summary --- This section provides an overview of key feature enabled in current specification. 
It comprise of key design challenges as have been identified on RFCv2 and summary description of new interfaces and definitions added to handle same. Description --- This API set assumes that the max_nb_queue_pairs on a device can be allocated to any mix of sym or asym. Some devices may have a fixed max per service. Thus, rte_cryptodev_info is updated with max_sym_nb_queues and max_asym_nb_queues with rule: max_nb_queue_pair = max_nb_sym_qp + max_nb_asym_qp. If device has no restrictions on qp to be used per service, such PMDs can leave max_nb_sym_qp = max_nb_asym_qp = 0. In such case, application can setup any of the service upto limit defined by max_nb_queue_pair. Here, max_nb_sym_qp and max_nb_asym_qp, if non-zero, just define limit on qp which are available for each service and *are not* ids to be used during qp setup and enqueue i.e. if device supports both symmetric and asymmetric with n qp, then any of them can be configured for symmetric or asymmetric subject to limit defined by max_nb_sym_qp and max_nb_asym_qp. For example, if device has 6 queues and 5 for symmetric and 1 for asymmetric that imply application can setup only 1 out of any 6 qp for asymmetric and rest for symmetric. Additionally, application can dedicate qp to perform specific service via optional queue_pair_attach_sym/asym_session() API. Except the one
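Reading the queue-pair split rule above as code may help review. The sketch below uses a stand-in structure whose max_nb_sym_qp/max_nb_asym_qp fields come from this RFC proposal and are not part of the released cryptodev API:

#include <stdint.h>

/* Minimal stand-in for the additions this RFC proposes to
 * struct rte_cryptodev_info; none of these fields exist upstream yet. */
struct proposed_cryptodev_info {
        uint16_t max_nb_queue_pairs;  /* existing field */
        uint16_t max_nb_sym_qp;       /* proposed; 0 means no fixed split */
        uint16_t max_nb_asym_qp;      /* proposed; 0 means no fixed split */
};

/* How many queue pairs may be configured for asymmetric operations? */
static uint16_t
usable_asym_qps(const struct proposed_cryptodev_info *info)
{
        if (info->max_nb_sym_qp == 0 && info->max_nb_asym_qp == 0)
                return info->max_nb_queue_pairs; /* any qp can do asym */
        /* Fixed split: sym + asym must add up to max_nb_queue_pairs. */
        return info->max_nb_asym_qp;
}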
[dpdk-dev] [PATCH] ethdev: fix comments for offload capabilities
Indeed, rx_offload_capa or tx_offload_capa in struct rte_eth_dev_info includes not only per port offloading features but also per queue ones. This patch makes its meaning much clearer. Fixes: ce17eddefc20 ("ethdev: introduce Rx queue offloads API") Fixes: cba7f53b717d ("ethdev: introduce Tx queue offloads API") Cc: sta...@dpdk.org Signed-off-by: Wei Dai --- lib/librte_ether/rte_ethdev.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 0361533..6ab6552 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -1006,9 +1006,11 @@ struct rte_eth_dev_info { uint16_t max_vfs; /**< Maximum number of VFs. */ uint16_t max_vmdq_pools; /**< Maximum number of VMDq pools. */ uint64_t rx_offload_capa; - /**< Device per port RX offload capabilities. */ + /**< Rx offload capabilities including all per port ones + and all per queue ones. */ uint64_t tx_offload_capa; - /**< Device per port TX offload capabilities. */ + /**< Tx offload capabilities including all per port ones + and all per queue ones. */ uint64_t rx_queue_offload_capa; /**< Device per queue RX offload capabilities. */ uint64_t tx_queue_offload_capa; -- 2.7.5
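A practical consequence of the clarified comment is that an application wanting the set of purely per-port Rx offloads can derive it by masking out the per-queue capabilities. A minimal sketch against the 18.02 ethdev API; the helper name is illustrative, not part of the patch:

#include <rte_ethdev.h>

/* Offloads that can only be enabled per port, derived from the clarified
 * semantics: rx_offload_capa = all per-port offloads + all per-queue ones. */
static uint64_t
rx_port_only_offloads(uint16_t port_id)
{
        struct rte_eth_dev_info dev_info;

        rte_eth_dev_info_get(port_id, &dev_info);
        return dev_info.rx_offload_capa & ~dev_info.rx_queue_offload_capa;
}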
[dpdk-dev] [PATCH 1/2] doc: update mlx PMD release notes
Signed-off-by: Shahaf Shuler --- doc/guides/rel_notes/release_18_02.rst | 27 +++ 1 file changed, 27 insertions(+) diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst index 689080bed..714a24388 100644 --- a/doc/guides/rel_notes/release_18_02.rst +++ b/doc/guides/rel_notes/release_18_02.rst @@ -50,6 +50,33 @@ New Features exiting. Not calling this function could result in leaking hugepages, leading to failure during initialization of secondary processes. +* **Updated mlx5 driver.** + + Updated the mlx5 driver including the following changes: + + * Enabled compilation as a plugin, thus removed the mandatory dependency with rdma-core. +With the special compilation, the rdma-core libraries will be loaded only in case +Mellanox device is being used. For binaries creation the PMD can be enabled, still not +requiring from every end user to install rdma-core. + * Improved multi-segment packet performance. + * Changed driver name to use the PCI address to be compatible with OVS-DPDK APIs. + * Extended statistics for physical port packet/byte counters. + * Supported IPv4 time-to-live filter. + * Converted to the new offloads API. + * Supported device removal check operation. + +* **Updated mlx4 driver.** + + Updated the mlx4 driver including the following changes: + + * Enabled compilation as a plugin, thus removed the mandatory dependency with rdma-core. +With the special compilation, the rdma-core libraries will be loaded only in case +Mellanox device is being used. For binaries creation the PMD can be enabled, still not +requiring from every end user to install rdma-core. + * Improved data path performance. + * Converted to the new offloads API. + * Supported device removal check operation. + * **Added the ixgbe ethernet driver to support RSS with flow API.** Rte_flow actually defined to include RSS, but till now, RSS is out of -- 2.12.0
[dpdk-dev] [PATCH 2/2] doc: update mlx5 required OFED version
Signed-off-by: Shahaf Shuler --- doc/guides/nics/mlx5.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index a9e4bf51a..b2376363b 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -131,7 +131,7 @@ Limitations - A multi segment packet must have less than 6 segments in case the Tx burst function is set to multi-packet send or Enhanced multi-packet send. Otherwise it must have less than 50 segments. -- Count action for RTE flow is only supported in Mellanox OFED 4.2. +- Count action for RTE flow is only supported in Mellanox OFED 4.2 and above. - Flows with a VXLAN Network Identifier equal (or ends to be equal) to 0 are not supported. - VXLAN TSO and checksum offloads are not supported on VM. @@ -392,7 +392,7 @@ RMDA Core with Linux Kernel Mellanox OFED ^ -- Mellanox OFED version: **4.2**. +- Mellanox OFED version: **4.2, 4.3**. - firmware version: - ConnectX-4: **12.21.1000** and above. -- 2.12.0
[dpdk-dev] [PATCH v3 0/3] net/i40e: fix multiple driver support issue
DPDK i40e PMD will modify some global registers during initialization and post initialization; this impacts using a 700 series Ethernet Adapter with both the Linux kernel driver and the DPDK PMD at the same time. This patchset adds logs for global configuration and adds device args to disable global configuration. v3 changes: - Reword commit log. v2 changes: - Add debug log when writing global registers - Add option to disable writing global registers Beilei Xing (3): net/i40e: add warnings when writing global registers net/i40e: add debug logs when writing global registers net/i40e: fix multiple driver support issue doc/guides/nics/i40e.rst | 12 ++ drivers/net/i40e/i40e_ethdev.c | 384 + drivers/net/i40e/i40e_ethdev.h | 55 ++ drivers/net/i40e/i40e_fdir.c | 40 +++-- drivers/net/i40e/i40e_flow.c | 9 + 5 files changed, 413 insertions(+), 87 deletions(-) -- 2.5.5
[dpdk-dev] [PATCH v3 1/3] net/i40e: add warnings when writing global registers
Add warnings when writing global registers. Signed-off-by: Beilei Xing --- doc/guides/nics/i40e.rst | 12 drivers/net/i40e/i40e_ethdev.c | 25 drivers/net/i40e/i40e_ethdev.h | 43 ++ drivers/net/i40e/i40e_fdir.c | 1 + drivers/net/i40e/i40e_flow.c | 1 + 5 files changed, 82 insertions(+) diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index 29601f1..166f447 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -566,6 +566,18 @@ DCB function DCB works only when RSS is enabled. +Global configuration warning + + +I40E PMD will set some global registers to enable some function or set some +configure. Then when using different ports of the same NIC with Linux kernel +and DPDK, the port with Linux kernel will be impacted by the port with DPDK. +For example, register I40E_GL_SWT_L2TAGCTRL is used to control L2 tag, i40e +PMD uses I40E_GL_SWT_L2TAGCTRL to set vlan TPID. If setting TPID in port A +with DPDK, then the configuration will also impact port B in the NIC with +kernel driver, which don't want to use the TPID. +So PMD reports warning to clarify what is changed by writing global register. + High Performance of Small Packets on 40G NIC diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 277c1a8..b4a2857 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -680,6 +680,7 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw) */ I40E_WRITE_REG(hw, I40E_GLQF_ORT(40), 0x0029); I40E_WRITE_REG(hw, I40E_GLQF_PIT(9), 0x9420); + i40e_global_cfg_warning(I40E_WARNING_QINQ_PARSER); } #define I40E_FLOW_CONTROL_ETHERTYPE 0x8808 @@ -1133,6 +1134,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) 0x0028, NULL); if (ret) PMD_INIT_LOG(ERR, "Failed to write L3 MAP register %d", ret); + i40e_global_cfg_warning(I40E_WARNING_QINQ_CLOUD_FILTER); /* Need the special FW version to support floating VEB */ config_floating_veb(dev); @@ -1413,6 +1415,7 @@ void i40e_flex_payload_reg_set_default(struct i40e_hw *hw) I40E_WRITE_REG(hw, I40E_GLQF_ORT(33), 0x); I40E_WRITE_REG(hw, I40E_GLQF_ORT(34), 0x); I40E_WRITE_REG(hw, I40E_GLQF_ORT(35), 0x); + i40e_global_cfg_warning(I40E_WARNING_DIS_FLX_PLD); } static int @@ -3260,6 +3263,7 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev, /* If NVM API < 1.7, keep the register setting */ ret = i40e_vlan_tpid_set_by_registers(dev, vlan_type, tpid, qinq); + i40e_global_cfg_warning(I40E_WARNING_TPID); return ret; } @@ -3502,6 +3506,7 @@ i40e_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf) I40E_WRITE_REG(hw, I40E_GLRPB_GLW, pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT); + i40e_global_cfg_warning(I40E_WARNING_FLOW_CTL); I40E_WRITE_FLUSH(hw); @@ -7284,6 +7289,8 @@ i40e_status_code i40e_replace_mpls_l1_filter(struct i40e_pf *pf) status = i40e_aq_replace_cloud_filters(hw, &filter_replace, &filter_replace_buf); + if (!status) + i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER); return status; } @@ -7338,6 +7345,8 @@ i40e_status_code i40e_replace_mpls_cloud_filter(struct i40e_pf *pf) status = i40e_aq_replace_cloud_filters(hw, &filter_replace, &filter_replace_buf); + if (!status) + i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER); return status; } @@ -7405,6 +7414,8 @@ i40e_replace_gtp_l1_filter(struct i40e_pf *pf) status = i40e_aq_replace_cloud_filters(hw, &filter_replace, &filter_replace_buf); + if (!status) + i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER); return status; } @@ -7457,6 +7468,8 @@ i40e_status_code i40e_replace_gtp_cloud_filter(struct 
i40e_pf *pf) status = i40e_aq_replace_cloud_filters(hw, &filter_replace, &filter_replace_buf); + if (!status) + i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER); return status; } @@ -8006,6 +8019,7 @@ i40e_dev_set_gre_key_len(struct i40e_hw *hw, uint8_t len) reg, NULL); if (ret != 0) return ret; + i40e_global_cfg_warning(I40E_WARNING_GRE_KEY_LEN); } else { ret = 0; } @@ -8265,6 +8279,7 @@ i40e_set_has
[dpdk-dev] [PATCH v3 3/3] net/i40e: fix multiple driver support issue
This patch provides the option to disable writing some global registers in PMD, in order to avoid affecting other drivers, when multiple drivers run on the same NIC and control different physical ports. Because there are few global resources shared among different physical ports. Fixes: ec246eeb5da1 ("i40e: use default filter input set on init") Fixes: 98f055707685 ("i40e: configure input fields for RSS or flow director") Fixes: f05ec7d77e41 ("i40e: initialize flow director flexible payload setting") Fixes: e536c2e32883 ("net/i40e: fix parsing QinQ packets type") Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type") Signed-off-by: Beilei Xing --- drivers/net/i40e/i40e_ethdev.c | 260 - drivers/net/i40e/i40e_ethdev.h | 1 + drivers/net/i40e/i40e_fdir.c | 39 --- drivers/net/i40e/i40e_flow.c | 8 ++ 4 files changed, 235 insertions(+), 73 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index aad00aa..b73b742 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -1039,6 +1039,64 @@ i40e_init_queue_region_conf(struct rte_eth_dev *dev) memset(info, 0, sizeof(struct i40e_queue_regions)); } +#define ETH_I40E_DISABLE_GLOBAL_CFG"disable-global-cfg" + +static int +i40e_parse_global_cfg_handler(__rte_unused const char *key, + const char *value, + void *opaque) +{ + struct i40e_pf *pf; + unsigned long dis_global_cfg; + char *end; + + pf = (struct i40e_pf *)opaque; + + errno = 0; + dis_global_cfg = strtoul(value, &end, 10); + if (errno != 0 || end == value || *end != 0) { + PMD_DRV_LOG(WARNING, "Wrong global configuration"); + return -(EINVAL); + } + + if (dis_global_cfg == 1 || dis_global_cfg == 0) + pf->dis_global_cfg = (bool)dis_global_cfg; + else + PMD_DRV_LOG(WARNING, "%s must be 1 or 0,", + "enable global configuration by default." + ETH_I40E_DISABLE_GLOBAL_CFG); + return 0; +} + +static int +i40e_disable_global_cfg(struct rte_eth_dev *dev) +{ + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); + static const char *const valid_keys[] = { + ETH_I40E_DISABLE_GLOBAL_CFG, NULL}; + struct rte_kvargs *kvlist; + + /* Enable global configuration by default */ + pf->dis_global_cfg = false; + + if (!dev->device->devargs) + return 0; + + kvlist = rte_kvargs_parse(dev->device->devargs->args, valid_keys); + if (!kvlist) + return -EINVAL; + + if (rte_kvargs_count(kvlist, ETH_I40E_DISABLE_GLOBAL_CFG) > 1) + PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only " + "the first invalid or last valid one is used !", + ETH_I40E_DISABLE_GLOBAL_CFG); + + rte_kvargs_process(kvlist, ETH_I40E_DISABLE_GLOBAL_CFG, + i40e_parse_global_cfg_handler, pf); + rte_kvargs_free(kvlist); + return 0; +} + static int eth_i40e_dev_init(struct rte_eth_dev *dev) { @@ -1092,6 +1150,9 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) hw->bus.func = pci_dev->addr.function; hw->adapter_stopped = 0; + /* Check if need to disable global registers configuration */ + i40e_disable_global_cfg(dev); + /* Make sure all is clean before doing PF reset */ i40e_clear_hw(hw); @@ -1119,7 +1180,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) * for packet type of QinQ by software. * It should be removed once issues are fixed in NVM. 
*/ - i40e_GLQF_reg_init(hw); + if (!pf->dis_global_cfg) + i40e_GLQF_reg_init(hw); /* Initialize the input set for filters (hash and fd) to default value */ i40e_filter_input_set_init(pf); @@ -1139,13 +1201,17 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) (hw->nvm.version & 0xf), hw->nvm.eetrack); /* initialise the L3_MAP register */ - ret = i40e_aq_debug_write_register(hw, I40E_GLQF_L3_MAP(40), - 0x0028, NULL); - if (ret) - PMD_INIT_LOG(ERR, "Failed to write L3 MAP register %d", ret); - PMD_INIT_LOG(DEBUG, "Global register 0x%08x is changed with value 0x28", -I40E_GLQF_L3_MAP(40)); - i40e_global_cfg_warning(I40E_WARNING_QINQ_CLOUD_FILTER); + if (!pf->dis_global_cfg) { + ret = i40e_aq_debug_write_register(hw, I40E_GLQF_L3_MAP(40), + 0x0028, NULL); + if (ret) + PMD_INIT_LOG(ERR, "Failed to write L3 MAP register %d", +ret); + PMD_INIT_LOG(
[dpdk-dev] [PATCH v3 2/3] net/i40e: add debug logs when writing global registers
Add debug logs when writing global registers. Signed-off-by: Beilei Xing --- drivers/net/i40e/i40e_ethdev.c | 153 ++--- drivers/net/i40e/i40e_ethdev.h | 11 +++ 2 files changed, 123 insertions(+), 41 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index b4a2857..aad00aa 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -656,6 +656,15 @@ rte_i40e_dev_atomic_write_link_status(struct rte_eth_dev *dev, return 0; } +static inline void +i40e_write_global_rx_ctl(struct i40e_hw *hw, u32 reg_addr, u32 reg_val) +{ + i40e_write_rx_ctl(hw, reg_addr, reg_val); + PMD_DRV_LOG(DEBUG, "Global register 0x%08x is modified " + "with value 0x%08x", + reg_addr, reg_val); +} + RTE_PMD_REGISTER_PCI(net_i40e, rte_i40e_pmd); RTE_PMD_REGISTER_PCI_TABLE(net_i40e, pci_id_i40e_map); RTE_PMD_REGISTER_KMOD_DEP(net_i40e, "* igb_uio | uio_pci_generic | vfio-pci"); @@ -678,8 +687,8 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw) * configuration API is added to avoid configuration conflicts * between ports of the same device. */ - I40E_WRITE_REG(hw, I40E_GLQF_ORT(40), 0x0029); - I40E_WRITE_REG(hw, I40E_GLQF_PIT(9), 0x9420); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(40), 0x0029); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_PIT(9), 0x9420); i40e_global_cfg_warning(I40E_WARNING_QINQ_PARSER); } @@ -1134,6 +1143,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) 0x0028, NULL); if (ret) PMD_INIT_LOG(ERR, "Failed to write L3 MAP register %d", ret); + PMD_INIT_LOG(DEBUG, "Global register 0x%08x is changed with value 0x28", +I40E_GLQF_L3_MAP(40)); i40e_global_cfg_warning(I40E_WARNING_QINQ_CLOUD_FILTER); /* Need the special FW version to support floating VEB */ @@ -1412,9 +1423,9 @@ void i40e_flex_payload_reg_set_default(struct i40e_hw *hw) * Disable by default flexible payload * for corresponding L2/L3/L4 layers. 
*/ - I40E_WRITE_REG(hw, I40E_GLQF_ORT(33), 0x); - I40E_WRITE_REG(hw, I40E_GLQF_ORT(34), 0x); - I40E_WRITE_REG(hw, I40E_GLQF_ORT(35), 0x); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(33), 0x); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(34), 0x); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(35), 0x); i40e_global_cfg_warning(I40E_WARNING_DIS_FLX_PLD); } @@ -3219,8 +3230,8 @@ i40e_vlan_tpid_set_by_registers(struct rte_eth_dev *dev, return -EIO; } PMD_DRV_LOG(DEBUG, - "Debug write 0x%08"PRIx64" to I40E_GL_SWT_L2TAGCTRL[%d]", - reg_w, reg_id); + "Global register 0x%08x is changed with value 0x%08x", + I40E_GL_SWT_L2TAGCTRL(reg_id), (uint32_t)reg_w); return 0; } @@ -3494,16 +3505,16 @@ i40e_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf) } /* config the water marker both based on the packets and bytes */ - I40E_WRITE_REG(hw, I40E_GLRPB_PHW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_PHW, (pf->fc_conf.high_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT) / I40E_PACKET_AVERAGE_SIZE); - I40E_WRITE_REG(hw, I40E_GLRPB_PLW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_PLW, (pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT) / I40E_PACKET_AVERAGE_SIZE); - I40E_WRITE_REG(hw, I40E_GLRPB_GHW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_GHW, pf->fc_conf.high_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT); - I40E_WRITE_REG(hw, I40E_GLRPB_GLW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_GLW, pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT); i40e_global_cfg_warning(I40E_WARNING_FLOW_CTL); @@ -7289,8 +7300,13 @@ i40e_status_code i40e_replace_mpls_l1_filter(struct i40e_pf *pf) status = i40e_aq_replace_cloud_filters(hw, &filter_replace, &filter_replace_buf); - if (!status) + if (!status) { i40e_global_cfg_warning(I40E_WARNING_RPL_CLD_FILTER); + PMD_DRV_LOG(DEBUG, "Global configuration modification: " + "cloud l1 type is changed from 0x%x to 0x%x", + filter_replace.old_filter_type, + filter_replace.new_filter_type); + } return status; } @@ -7323,6 +7339,10 @@ i40e_status_code i40e_replace_mpls_cloud_filter(struct i40e_pf *pf) &filter_replace_buf); if (status < 0)
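For reference, the I40E_WRITE_GLB_REG() wrapper used throughout this hunk is defined in the i40e_ethdev.h part of the patch, which is not shown above. A plausible sketch of what such a wrapper does, mirroring the i40e_write_global_rx_ctl() helper added earlier in the diff, could be:

/* Hypothetical reconstruction of the wrapper used above; the real definition
 * is in the i40e_ethdev.h hunk of this patch, which is not shown here. */
#define I40E_WRITE_GLB_REG(hw, reg, value)                               \
        do {                                                             \
                I40E_WRITE_REG((hw), (reg), (value));                    \
                PMD_DRV_LOG(DEBUG,                                       \
                            "Global register 0x%08x is modified "        \
                            "with value 0x%08x", (reg), (value));        \
        } while (0)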
Re: [dpdk-dev] [PATCH] net/i40e: fix missing deps for avx2 code in meson
On Wed, Jan 31, 2018 at 05:09:05PM +, Bruce Richardson wrote: > The AVX2 code path includes files from the ethdev, hash and kvargs libs. > These are not listed as dependencies in the case where AVX2 is not in > the default instruction set for the build e.g. machine=nehalem. This > leads to compiler errors as the header files needed cannot be found. > > Fixes: e940646b20fa ("drivers/net: build Intel NIC PMDs with meson") > > Signed-off-by: Bruce Richardson > --- Bug fix applied to dpdk-next-build /Bruce
Re: [dpdk-dev] [RFC v2] doc compression API for DPDK
> >[Fiona] I propose if BFINAL bit is detected before end of input > >the decompression should stop. In this case consumed will be < src.length. > >produced will be < dst buffer size. Do we need an extra STATUS response? > >STATUS_BFINAL_DETECTED ? > [Shally] @fiona, I assume you mean here decompressor stop after processing > Final block right? [Fiona] Yes. And if yes, > and if it can process that final block successfully/unsuccessfully, then > status could simply be > SUCCESS/FAILED. > I don't see need of specific return code for this use case. Just to share, in > past, we have practically run into > such cases with boost lib, and decompressor has simply worked this way. [Fiona] I'm ok with this. > >Only thing I don't like this is it can impact on performance, as normally > >we can just look for STATUS == SUCCESS. Anything else should be an exception. > >Now the application would have to check for SUCCESS || BFINAL_DETECTED every > >time. > >Do you have a suggestion on how we should handle this? > >
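In application terms, the behaviour being discussed would look roughly like the sketch below; the field and status names (status, consumed, src.length, RTE_COMP_OP_STATUS_SUCCESS) follow the compressdev RFC and are not a released API, and handle_decomp_op() is an illustrative name:

#include <rte_comp.h>   /* RFC header; not part of a released DPDK yet */

/* Sketch: detect "stopped at BFINAL" on the application side without a
 * dedicated status code, using only the op fields from the RFC. */
static void
handle_decomp_op(struct rte_comp_op *op)
{
        if (op->status != RTE_COMP_OP_STATUS_SUCCESS) {
                /* genuine error path */
                return;
        }

        if (op->consumed < op->src.length) {
                /* SUCCESS, but the stream ended early: a BFINAL block was
                 * reached before all input was consumed. The remaining
                 * bytes belong to a new stream or are trailing data. */
        }
}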
[dpdk-dev] [PATCH v2 2/3] net/i40e: add debug logs when writing global registers
Add debug logs when writing global registers. Signed-off-by: Beilei Xing Cc: sta...@dpdk.org --- drivers/net/i40e/i40e_ethdev.c | 131 ++--- drivers/net/i40e/i40e_ethdev.h | 9 +++ 2 files changed, 92 insertions(+), 48 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 44821f2..6d6d6d2 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -716,6 +716,15 @@ rte_i40e_dev_atomic_write_link_status(struct rte_eth_dev *dev, return 0; } +static inline void +i40e_write_global_rx_ctl(struct i40e_hw *hw, u32 reg_addr, u32 reg_val) +{ + i40e_write_rx_ctl(hw, reg_addr, reg_val); + PMD_DRV_LOG(DEBUG, "Global register 0x%08x is modified " + "with value 0x%08x", + reg_addr, reg_val); +} + RTE_PMD_REGISTER_PCI(net_i40e, rte_i40e_pmd.pci_drv); RTE_PMD_REGISTER_PCI_TABLE(net_i40e, pci_id_i40e_map); @@ -735,9 +744,9 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw) * configuration API is added to avoid configuration conflicts * between ports of the same device. */ - I40E_WRITE_REG(hw, I40E_GLQF_ORT(33), 0x00E0); - I40E_WRITE_REG(hw, I40E_GLQF_ORT(34), 0x00E3); - I40E_WRITE_REG(hw, I40E_GLQF_ORT(35), 0x00E6); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(33), 0x00E0); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(34), 0x00E3); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(35), 0x00E6); i40e_global_cfg_warning(I40E_WARNING_ENA_FLX_PLD); /* @@ -746,8 +755,8 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw) * configuration API is added to avoid configuration conflicts * between ports of the same device. */ - I40E_WRITE_REG(hw, I40E_GLQF_ORT(40), 0x0029); - I40E_WRITE_REG(hw, I40E_GLQF_PIT(9), 0x9420); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_ORT(40), 0x0029); + I40E_WRITE_GLB_REG(hw, I40E_GLQF_PIT(9), 0x9420); i40e_global_cfg_warning(I40E_WARNING_QINQ_PARSER); } @@ -2799,8 +2808,9 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev, "I40E_GL_SWT_L2TAGCTRL[%d]", reg_id); return ret; } - PMD_DRV_LOG(DEBUG, "Debug write 0x%08"PRIx64" to " - "I40E_GL_SWT_L2TAGCTRL[%d]", reg_w, reg_id); + PMD_DRV_LOG(DEBUG, + "Global register 0x%08x is changed with value 0x%08x" + I40E_GL_SWT_L2TAGCTRL(reg_id), (uint32_t)reg_w); i40e_global_cfg_warning(I40E_WARNING_TPID); @@ -3030,16 +3040,16 @@ i40e_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf) } /* config the water marker both based on the packets and bytes */ - I40E_WRITE_REG(hw, I40E_GLRPB_PHW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_PHW, (pf->fc_conf.high_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT) / I40E_PACKET_AVERAGE_SIZE); - I40E_WRITE_REG(hw, I40E_GLRPB_PLW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_PLW, (pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT) / I40E_PACKET_AVERAGE_SIZE); - I40E_WRITE_REG(hw, I40E_GLRPB_GHW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_GHW, pf->fc_conf.high_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT); - I40E_WRITE_REG(hw, I40E_GLRPB_GLW, + I40E_WRITE_GLB_REG(hw, I40E_GLRPB_GLW, pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT); i40e_global_cfg_warning(I40E_WARNING_FLOW_CTL); @@ -6880,6 +6890,9 @@ i40e_dev_set_gre_key_len(struct i40e_hw *hw, uint8_t len) reg, NULL); if (ret != 0) return ret; + PMD_DRV_LOG(DEBUG, "Global register 0x%08x is changed " + "with value 0x%08x", + I40E_GL_PRS_FVBM(2), reg); i40e_global_cfg_warning(I40E_WARNING_GRE_KEY_LEN); } else { ret = 0; @@ -7124,41 +7137,43 @@ i40e_set_hash_filter_global_config(struct i40e_hw *hw, I40E_GLQF_HSYM_SYMH_ENA_MASK : 0; if (hw->mac.type == I40E_MAC_X722) { if (pctype == 
I40E_FILTER_PCTYPE_NONF_IPV4_UDP) { - i40e_write_rx_ctl(hw, I40E_GLQF_HSYM( + i40e_write_global_rx_ctl(hw, I40E_GLQF_HSYM( I40E_FILTER_PCTYPE_NONF_IPV4_UDP), reg); - i40e_write_rx_ctl(hw, I40E_GLQF_HSYM( + i40e_write_global_rx_ctl(hw, I40E_GLQF_HSYM( I40E_FILTER_PCTYPE_NONF_UNICAST_IPV4_UDP),
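The I40E_WRITE_GLB_REG() wrapper used in the hunks above is added to i40e_ethdev.h by the same patch, but that hunk is not quoted here. Going by the i40e_write_global_rx_ctl() helper that is shown, a plausible sketch of such a wrapper (an illustration, not the exact definition from the patch) is:

#define I40E_WRITE_GLB_REG(hw, reg, value)                                \
        do {                                                              \
                I40E_WRITE_REG((hw), (reg), (value));                     \
                PMD_DRV_LOG(DEBUG, "Global register 0x%08x is modified "  \
                            "with value 0x%08x", (reg), (value));         \
        } while (0)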
[dpdk-dev] [PATCH v2 0/3] net/i40e: fix multiple driver support issue
The DPDK i40e PMD modifies some global registers during and after initialization, which has an impact when a 700 series Ethernet Adapter is used with both the Linux kernel driver and the DPDK PMD. This patchset adds logs for global configuration and adds device args to disable global configuration. This patchset is based on 16.11.4 LTS. Commit id: 516447a5056c093e4d020011a69216b453576782 v2 changes: - Add warning logs and debug logs. Beilei Xing (3): net/i40e: add warnings when writing global registers net/i40e: add debug logs when writing global registers net/i40e: fix multiple driver support issue doc/guides/nics/i40e.rst | 12 ++ drivers/net/i40e/i40e_ethdev.c | 319 +++-- drivers/net/i40e/i40e_ethdev.h | 54 +++ 3 files changed, 309 insertions(+), 76 deletions(-) -- 2.5.5
[dpdk-dev] [PATCH v2 1/3] net/i40e: add warnings when writing global registers
Add warnings when writing global registers. Signed-off-by: Beilei Xing Cc: sta...@dpdk.org --- doc/guides/nics/i40e.rst | 12 drivers/net/i40e/i40e_ethdev.c | 15 +++ drivers/net/i40e/i40e_ethdev.h | 43 ++ 3 files changed, 70 insertions(+) diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index 5780268..68a546b 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -459,3 +459,15 @@ Receive packets with Ethertype 0x88A8 Due to the FW limitation, PF can receive packets with Ethertype 0x88A8 only when floating VEB is disabled. + +Global configuration warning + + +I40E PMD will set some global registers to enable some function or set some +configure. Then when using different ports of the same NIC with Linux kernel +and DPDK, the port with Linux kernel will be impacted by the port with DPDK. +For example, register I40E_GL_SWT_L2TAGCTRL is used to control L2 tag, i40e +PMD uses I40E_GL_SWT_L2TAGCTRL to set vlan TPID. If setting TPID in port A +with DPDK, then the configuration will also impact port B in the NIC with +kernel driver, which don't want to use the TPID. +So PMD reports warning to clarify what is changed by writing global register. diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 0835c2d..44821f2 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -738,6 +738,7 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw) I40E_WRITE_REG(hw, I40E_GLQF_ORT(33), 0x00E0); I40E_WRITE_REG(hw, I40E_GLQF_ORT(34), 0x00E3); I40E_WRITE_REG(hw, I40E_GLQF_ORT(35), 0x00E6); + i40e_global_cfg_warning(I40E_WARNING_ENA_FLX_PLD); /* * Initialize registers for parsing packet type of QinQ @@ -747,6 +748,7 @@ static inline void i40e_GLQF_reg_init(struct i40e_hw *hw) */ I40E_WRITE_REG(hw, I40E_GLQF_ORT(40), 0x0029); I40E_WRITE_REG(hw, I40E_GLQF_PIT(9), 0x9420); + i40e_global_cfg_warning(I40E_WARNING_QINQ_PARSER); } #define I40E_FLOW_CONTROL_ETHERTYPE 0x8808 @@ -2800,6 +2802,8 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev, PMD_DRV_LOG(DEBUG, "Debug write 0x%08"PRIx64" to " "I40E_GL_SWT_L2TAGCTRL[%d]", reg_w, reg_id); + i40e_global_cfg_warning(I40E_WARNING_TPID); + return ret; } @@ -3038,6 +3042,7 @@ i40e_flow_ctrl_set(struct rte_eth_dev *dev, struct rte_eth_fc_conf *fc_conf) I40E_WRITE_REG(hw, I40E_GLRPB_GLW, pf->fc_conf.low_water[I40E_MAX_TRAFFIC_CLASS] << I40E_KILOSHIFT); + i40e_global_cfg_warning(I40E_WARNING_FLOW_CTL); I40E_WRITE_FLUSH(hw); @@ -6875,6 +6880,7 @@ i40e_dev_set_gre_key_len(struct i40e_hw *hw, uint8_t len) reg, NULL); if (ret != 0) return ret; + i40e_global_cfg_warning(I40E_WARNING_GRE_KEY_LEN); } else { ret = 0; } @@ -7154,6 +7160,7 @@ i40e_set_hash_filter_global_config(struct i40e_hw *hw, } else { i40e_write_rx_ctl(hw, I40E_GLQF_HSYM(pctype), reg); } + i40e_global_cfg_warning(I40E_WARNING_HSYM); } reg = i40e_read_rx_ctl(hw, I40E_GLQF_CTL); @@ -7178,6 +7185,7 @@ i40e_set_hash_filter_global_config(struct i40e_hw *hw, goto out; i40e_write_rx_ctl(hw, I40E_GLQF_CTL, reg); + i40e_global_cfg_warning(I40E_WARNING_QF_CTL); out: I40E_WRITE_FLUSH(hw); @@ -7848,6 +7856,10 @@ i40e_filter_input_set_init(struct i40e_pf *pf) pf->hash_input_set[pctype] = input_set; pf->fdir.input_set[pctype] = input_set; } + + i40e_global_cfg_warning(I40E_WARNING_HASH_INSET); + i40e_global_cfg_warning(I40E_WARNING_FD_MSK); + i40e_global_cfg_warning(I40E_WARNING_HASH_MSK); } int @@ -7913,6 +7925,7 @@ i40e_hash_filter_inset_select(struct i40e_hw *hw, i40e_check_write_reg(hw, I40E_GLQF_HASH_INSET(1, pctype), (uint32_t)((inset_reg 
>> I40E_32_BIT_WIDTH) & UINT32_MAX)); + i40e_global_cfg_warning(I40E_WARNING_HASH_INSET); for (i = 0; i < num; i++) i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype), @@ -7921,6 +7934,7 @@ i40e_hash_filter_inset_select(struct i40e_hw *hw, for (i = num; i < I40E_INSET_MASK_NUM_REG; i++) i40e_check_write_reg(hw, I40E_GLQF_HASH_MSK(i, pctype), 0); + i40e_global_cfg_warning(I40E_WARNING_HASH_MSK); I40E_WRITE_FLUSH(hw); pf->hash_input_set[pctype] = input_set; @@ -7999,6 +8013,7 @@ i40e_fdir_filter_inset_select(struct i40e_pf *pf, for (i = num; i < I40E_INSET_MASK_NUM_REG; i++)
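The i40e_global_cfg_warning() helper and the I40E_WARNING_* identifiers called in the hunks above come from the i40e_ethdev.h part of this patch, which is not quoted. A minimal sketch of such a helper follows; only the call-site identifiers are taken from the patch, while the enum type name, the description strings and the log wording are illustrative:

enum i40e_warning_idx {         /* illustrative type name */
        I40E_WARNING_ENA_FLX_PLD,
        I40E_WARNING_QINQ_PARSER,
        I40E_WARNING_TPID,
        I40E_WARNING_FLOW_CTL,
        /* ... one entry per global configuration the driver touches ... */
};

static inline void
i40e_global_cfg_warning(enum i40e_warning_idx idx)
{
        static const char * const what[] = {
                [I40E_WARNING_ENA_FLX_PLD] = "enable flexible payload",
                [I40E_WARNING_QINQ_PARSER] = "QinQ parser",
                [I40E_WARNING_TPID] = "TPID configuration",
                [I40E_WARNING_FLOW_CTL] = "flow control water marks",
        };

        PMD_DRV_LOG(WARNING, "Global configuration is modified: %s",
                    what[idx]);
}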
[dpdk-dev] [PATCH v2 3/3] net/i40e: fix multiple driver support issue
This patch provides the option to disable writing some global registers in PMD, in order to avoid affecting other drivers, when multiple drivers run on the same NIC and control different physical ports. Because there are few global resources shared among different physical ports. Fixes: ec246eeb5da1 ("i40e: use default filter input set on init") Fixes: 98f055707685 ("i40e: configure input fields for RSS or flow director") Fixes: f05ec7d77e41 ("i40e: initialize flow director flexible payload setting") Fixes: e536c2e32883 ("net/i40e: fix parsing QinQ packets type") Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type") Signed-off-by: Beilei Xing Cc: sta...@dpdk.org --- drivers/net/i40e/i40e_ethdev.c | 213 +++-- drivers/net/i40e/i40e_ethdev.h | 2 + 2 files changed, 167 insertions(+), 48 deletions(-) diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c index 6d6d6d2..63d26de 100644 --- a/drivers/net/i40e/i40e_ethdev.c +++ b/drivers/net/i40e/i40e_ethdev.c @@ -944,6 +944,67 @@ config_floating_veb(struct rte_eth_dev *dev) #define I40E_L2_TAGS_S_TAG_SHIFT 1 #define I40E_L2_TAGS_S_TAG_MASK I40E_MASK(0x1, I40E_L2_TAGS_S_TAG_SHIFT) +#define ETH_I40E_DISABLE_GLOBAL_CFG"disable-global-cfg" +RTE_PMD_REGISTER_PARAM_STRING(net_i40e, + ETH_I40E_DISABLE_GLOBAL_CFG "=0|1"); + +static int +i40e_parse_global_cfg_handler(__rte_unused const char *key, + const char *value, + void *opaque) +{ + struct i40e_pf *pf; + unsigned long dis_global_cfg; + char *end; + + pf = (struct i40e_pf *)opaque; + + errno = 0; + dis_global_cfg = strtoul(value, &end, 10); + if (errno != 0 || end == value || *end != 0) { + PMD_DRV_LOG(WARNING, "Wrong global configuration"); + return -(EINVAL); + } + + if (dis_global_cfg == 1 || dis_global_cfg == 0) + pf->dis_global_cfg = (bool)dis_global_cfg; + else + PMD_DRV_LOG(WARNING, "%s must be 1 or 0,", + "enable global configuration by default." + ETH_I40E_DISABLE_GLOBAL_CFG); + return 0; +} + +static int +i40e_disable_global_cfg(struct rte_eth_dev *dev) +{ + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); + struct rte_pci_device *pci_dev = dev->pci_dev; + static const char *valid_keys[] = { + ETH_I40E_DISABLE_GLOBAL_CFG, NULL}; + struct rte_kvargs *kvlist; + + /* Enable global configuration by default */ + pf->dis_global_cfg = false; + + if (!pci_dev->device.devargs) + return 0; + + kvlist = rte_kvargs_parse(pci_dev->device.devargs->args, valid_keys); + if (!kvlist) + return -EINVAL; + + if (rte_kvargs_count(kvlist, ETH_I40E_DISABLE_GLOBAL_CFG) > 1) + PMD_DRV_LOG(WARNING, "More than one argument \"%s\" and only " + "the first invalid or last valid one is used !", + ETH_I40E_DISABLE_GLOBAL_CFG); + + rte_kvargs_process(kvlist, ETH_I40E_DISABLE_GLOBAL_CFG, + i40e_parse_global_cfg_handler, pf); + rte_kvargs_free(kvlist); + return 0; +} + static int eth_i40e_dev_init(struct rte_eth_dev *dev) { @@ -993,6 +1054,9 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) hw->bus.func = pci_dev->addr.function; hw->adapter_stopped = 0; + /* Check if need to disable global registers configuration */ + i40e_disable_global_cfg(dev); + /* Make sure all is clean before doing PF reset */ i40e_clear_hw(hw); @@ -1019,7 +1083,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) * software. It should be removed once issues are fixed * in NVM. 
*/ - i40e_GLQF_reg_init(hw); + if (!pf->dis_global_cfg) + i40e_GLQF_reg_init(hw); /* Initialize the input set for filters (hash and fd) to default value */ i40e_filter_input_set_init(pf); @@ -1115,11 +1180,14 @@ eth_i40e_dev_init(struct rte_eth_dev *dev) i40e_set_fc(hw, &aq_fail, TRUE); /* Set the global registers with default ether type value */ - ret = i40e_vlan_tpid_set(dev, ETH_VLAN_TYPE_OUTER, ETHER_TYPE_VLAN); - if (ret != I40E_SUCCESS) { - PMD_INIT_LOG(ERR, "Failed to set the default outer " -"VLAN ether type"); - goto err_setup_pf_switch; + if (!pf->dis_global_cfg) { + ret = i40e_vlan_tpid_set(dev, ETH_VLAN_TYPE_OUTER, +ETHER_TYPE_VLAN); + if (ret != I40E_SUCCESS) { + PMD_INIT_LOG(ERR, "Failed to set the default outer " +"VLAN ether type"); + goto err_setup_pf_switch; +
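The new option is read with the standard rte_kvargs API, as the i40e_disable_global_cfg() hunk above shows. Below is a self-contained sketch of that pattern for a 0|1 device argument; only the "disable-global-cfg" key comes from the patch, the helper names are made up. On the command line the option would presumably be attached to the PCI device string, e.g. testpmd -w 0000:06:00.0,disable-global-cfg=1 (the address is a placeholder).

#include <stdbool.h>
#include <string.h>
#include <rte_common.h>
#include <rte_kvargs.h>

/* Generic handler for a boolean (0|1) device argument. */
static int
bool_arg_handler(const char *key __rte_unused, const char *value, void *opaque)
{
        bool *flag = opaque;

        *flag = (strcmp(value, "1") == 0);
        return 0;
}

/* Parse "disable-global-cfg" out of a devargs string such as
 * "disable-global-cfg=1".  Returns 0 on success, negative on parse error. */
static int
parse_disable_global_cfg(const char *devargs_str, bool *disabled)
{
        static const char *keys[] = { "disable-global-cfg", NULL };
        struct rte_kvargs *kvlist;

        *disabled = false;
        kvlist = rte_kvargs_parse(devargs_str, keys);
        if (kvlist == NULL)
                return -1;
        rte_kvargs_process(kvlist, "disable-global-cfg",
                           bool_arg_handler, disabled);
        rte_kvargs_free(kvlist);
        return 0;
}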
[dpdk-dev] [PATCH v2] relicense various bits of the dpdk
Received a note the other day from the Linux Foundation governance board for DPDK indicating that several files I have copyright on need to be relicensed to be compliant with the DPDK licensing guidelines. I have some concerns with some parts of the request, but am not opposed to other parts. So, for those pieces that we are in consensus on, I'm proposing that we change their license from BSD 2 clause to 3 clause. I'm also updating the files to use the SPDX licensing scheme Signed-off-by: Neil Horman CC: Hemant Agrawal CC: Thomas Monjalon --- Change notes V2) Cleaned up formatting (tmonjalon) --- devtools/validate-abi.sh | 32 lib/librte_compat/rte_compat.h | 31 +++ 2 files changed, 7 insertions(+), 56 deletions(-) diff --git a/devtools/validate-abi.sh b/devtools/validate-abi.sh index 8caf43e83..ee64b08fa 100755 --- a/devtools/validate-abi.sh +++ b/devtools/validate-abi.sh @@ -1,32 +1,8 @@ #!/usr/bin/env bash -# BSD LICENSE -# -# Copyright(c) 2015 Neil Horman. All rights reserved. -# Copyright(c) 2017 6WIND S.A. -# All rights reserved. -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted provided that the following conditions -# are met: -# -# * Redistributions of source code must retain the above copyright -# notice, this list of conditions and the following disclaimer. -# * Redistributions in binary form must reproduce the above copyright -# notice, this list of conditions and the following disclaimer in -# the documentation and/or other materials provided with the -# distribution. -# -# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2015 Neil Horman. All rights reserved. +# Copyright(c) 2017 6WIND S.A. +# All rights reserved set -e diff --git a/lib/librte_compat/rte_compat.h b/lib/librte_compat/rte_compat.h index d6e79f3fc..2cdc37214 100644 --- a/lib/librte_compat/rte_compat.h +++ b/lib/librte_compat/rte_compat.h @@ -1,31 +1,6 @@ -/*- - * BSD LICENSE - * - * Copyright(c) 2015 Neil Horman . - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. 
- * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2015 Neil Horman . + * All rights reserved. */ #ifndef _RTE_COMPAT_H_ -- 2.14.3
[dpdk-dev] [PATCH] doc: add preferred burst size support
rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value is smaller than requested, application can consider it end of packet stream. Some hardware can only support smaller burst sizes which need to be advertised. Similar is the case for Tx burst. This patch adds deprecation notice for rte_eth_dev_info structure as two new members, for preferred Rx and Tx burst size would be added - impacting the size of the structure. Signed-off-by: Shreyansh Jain --- * Refer: http://dpdk.org/dev/patchwork/patch/32112 for context doc/guides/rel_notes/deprecation.rst | 8 1 file changed, 8 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d59ad5988..575c5e770 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -59,3 +59,11 @@ Deprecation Notices be added between the producer and consumer structures. The size of the structure and the offset of the fields will remain the same on platforms with 64B cache line, but will change on other platforms. + +* ethdev: Currently, if the rte_eth_rx_burst() function returns a value less + than *nb_pkts*, the application will assume that no more packets are present. + Some of the hw queue based hardware can only support smaller burst for RX + and TX and thus break the expectation of the rx_burst API. Similar is the + case for TX burst. ``rte_eth_dev_info`` will be added with two new + paramaters, ``uint16_t pref_rx_burst`` and ``uint16_t pref_tx_burst``, + for preferred RX and TX burst sizes, respectively. -- 2.14.1
[dpdk-dev] [PATCH] doc/ip_pipeline.rst: update f_post_init and correct f_track
Update f_post_init for pipeline frontend. Move f_track from pipeline backend to pipeline frontend. Signed-off-by: longtb5 --- doc/guides/sample_app_ug/ip_pipeline.rst | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/doc/guides/sample_app_ug/ip_pipeline.rst b/doc/guides/sample_app_ug/ip_pipeline.rst index e0aa148..dd0a23f 100644 --- a/doc/guides/sample_app_ug/ip_pipeline.rst +++ b/doc/guides/sample_app_ug/ip_pipeline.rst @@ -876,8 +876,6 @@ The front-end communicates with the back-end through message queues. || | of some requests which are mandatory for all pipelines (e.g. | || | ping, statistics). | ++--++ - | f_track| Function pointer | See section Tracking pipeline output port to physical link | - ++--++ .. _table_ip_pipelines_front_end: @@ -892,6 +890,10 @@ The front-end communicates with the back-end through message queues. | f_init | Function pointer | Function to initialize the front-end of the current pipeline | || | instance. | ++---+---+ + | f_post_init| Function pointer | Function to run once after f_init. | + ++---+---+ + | f_track| Function pointer | See section Tracking pipeline output port to physical link. | + ++---+---+ | f_free | Function pointer | Function to free the resources allocated by the front-end of | || | the current pipeline instance. | ++---+---+ -- 2.7.4
[dpdk-dev] [PATCH v2] doc: add preferred burst size support
rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value is smaller than requested, application can consider it end of packet stream. Some hardware can only support smaller burst sizes which need to be advertised. Similar is the case for Tx burst. This patch adds deprecation notice for rte_eth_dev_info structure as two new members, for preferred Rx and Tx burst size would be added - impacting the size of the structure. Signed-off-by: Shreyansh Jain --- * Refer: http://dpdk.org/dev/patchwork/patch/32112 for context v2: - fix spelling error in deprecation notice doc/guides/rel_notes/deprecation.rst | 8 1 file changed, 8 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d59ad5988..fdc7656fa 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -59,3 +59,11 @@ Deprecation Notices be added between the producer and consumer structures. The size of the structure and the offset of the fields will remain the same on platforms with 64B cache line, but will change on other platforms. + +* ethdev: Currently, if the rte_eth_rx_burst() function returns a value less + than *nb_pkts*, the application will assume that no more packets are present. + Some of the hw queue based hardware can only support smaller burst for RX + and TX and thus break the expectation of the rx_burst API. Similar is the + case for TX burst. ``rte_eth_dev_info`` will be added with two new + parameters, ``uint16_t pref_rx_burst`` and ``uint16_t pref_tx_burst``, + for preferred RX and TX burst sizes, respectively. -- 2.14.1
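Once such fields exist, the intended application-side usage is to cap the burst request at the advertised value so that a short return from rte_eth_rx_burst() again reliably means the queue is drained. A sketch, assuming the field lands with the name proposed above (it does not exist in rte_eth_dev_info today, so this will not compile against current headers):

#include <rte_ethdev.h>

#define APP_MAX_BURST 64

/* Pick a burst size no larger than what the PMD prefers; pref_rx_burst is
 * the proposed field, and 0 is assumed here to mean "no preference". */
static uint16_t
app_rx_burst_size(uint16_t port_id)
{
        struct rte_eth_dev_info info;
        uint16_t burst = APP_MAX_BURST;

        rte_eth_dev_info_get(port_id, &info);
        if (info.pref_rx_burst != 0 && info.pref_rx_burst < burst)
                burst = info.pref_rx_burst;
        return burst;
}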
[dpdk-dev] [PATCH v3] net/i40e: fix multiple DDP packages should not be allowed
Should be not possible to load conflicting DDP profiles. Only DDP profiles of the same group (not 0) can be loaded together; If DDP profile group is 0, it is exclusive, i.e. it cannot be loaded with any other DDP profile; If DDP profile groups are different - these profiles cannot be loaded together; Fixes: b319712f53c8 ("net/i40e: extended list of operations for DDP processing") v3: prevent registration of read-only profiles with profile list Signed-off-by: Kirill Rybalchenko --- drivers/net/i40e/rte_pmd_i40e.c | 40 1 file changed, 36 insertions(+), 4 deletions(-) diff --git a/drivers/net/i40e/rte_pmd_i40e.c b/drivers/net/i40e/rte_pmd_i40e.c index 5436db4..dae59e6 100644 --- a/drivers/net/i40e/rte_pmd_i40e.c +++ b/drivers/net/i40e/rte_pmd_i40e.c @@ -1496,7 +1496,14 @@ i40e_check_profile_info(uint16_t port, uint8_t *profile_info_sec) struct rte_pmd_i40e_profile_info *pinfo, *p; uint32_t i; int ret; + static const uint32_t group_mask = 0x00ff; + pinfo = (struct rte_pmd_i40e_profile_info *)(profile_info_sec + +sizeof(struct i40e_profile_section_header)); + if (pinfo->track_id == 0) { + PMD_DRV_LOG(INFO, "Read-only profile."); + return 0; + } buff = rte_zmalloc("pinfo_list", (I40E_PROFILE_INFO_SIZE * I40E_MAX_PROFILE_NUM + 4), 0); @@ -1515,8 +1522,6 @@ i40e_check_profile_info(uint16_t port, uint8_t *profile_info_sec) return -1; } p_list = (struct rte_pmd_i40e_profile_list *)buff; - pinfo = (struct rte_pmd_i40e_profile_info *)(profile_info_sec + -sizeof(struct i40e_profile_section_header)); for (i = 0; i < p_list->p_count; i++) { p = &p_list->p_info[i]; if (pinfo->track_id == p->track_id) { @@ -1525,6 +1530,23 @@ i40e_check_profile_info(uint16_t port, uint8_t *profile_info_sec) return 1; } } + for (i = 0; i < p_list->p_count; i++) { + p = &p_list->p_info[i]; + if ((p->track_id & group_mask) == 0) { + PMD_DRV_LOG(INFO, "Profile of the group 0 exists."); + rte_free(buff); + return 2; + } + } + for (i = 0; i < p_list->p_count; i++) { + p = &p_list->p_info[i]; + if ((pinfo->track_id & group_mask) != + (p->track_id & group_mask)) { + PMD_DRV_LOG(INFO, "Profile of different group exists."); + rte_free(buff); + return 3; + } + } rte_free(buff); return 0; @@ -1544,6 +1566,7 @@ rte_pmd_i40e_process_ddp_package(uint16_t port, uint8_t *buff, uint8_t *profile_info_sec; int is_exist; enum i40e_status_code status = I40E_SUCCESS; + static const uint32_t type_mask = 0xff00; if (op != RTE_PMD_I40E_PKG_OP_WR_ADD && op != RTE_PMD_I40E_PKG_OP_WR_ONLY && @@ -1595,6 +1618,10 @@ rte_pmd_i40e_process_ddp_package(uint16_t port, uint8_t *buff, return -EINVAL; } + /* force read-only track_id for type 0 */ + if ((track_id & type_mask) == 0) + track_id = 0; + /* Find profile segment */ profile_seg_hdr = i40e_find_segment_in_package(SEGMENT_TYPE_I40E, pkg_hdr); @@ -1628,12 +1655,17 @@ rte_pmd_i40e_process_ddp_package(uint16_t port, uint8_t *buff, if (op == RTE_PMD_I40E_PKG_OP_WR_ADD) { if (is_exist) { - PMD_DRV_LOG(ERR, "Profile already exists."); + if (is_exist == 1) + PMD_DRV_LOG(ERR, "Profile already exists."); + else if (is_exist == 2) + PMD_DRV_LOG(ERR, "Profile of group 0 already exists."); + else if (is_exist == 3) + PMD_DRV_LOG(ERR, "Profile of different group already exists"); rte_free(profile_info_sec); return -EEXIST; } } else if (op == RTE_PMD_I40E_PKG_OP_WR_DEL) { - if (!is_exist) { + if (is_exist != 1) { PMD_DRV_LOG(ERR, "Profile does not exist."); rte_free(profile_info_sec); return -EACCES; -- 2.5.5
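The three loops added to i40e_check_profile_info() implement a pairwise rule between the new profile and each already-loaded one. Stated as a standalone predicate (a sketch only, with both track IDs already reduced to their group field via the patch's group_mask):

#include <stdint.h>

/* Can a profile from group 'new_group' be loaded while a profile from
 * group 'loaded_group' is already present? */
static int
ddp_groups_can_coexist(uint32_t new_group, uint32_t loaded_group)
{
        if (new_group == 0 || loaded_group == 0)
                return 0;       /* group-0 profiles are exclusive */
        return new_group == loaded_group; /* only same-group profiles mix */
}

Profiles whose whole track_id is 0 are treated as read-only and never enter this comparison, which is what the early return added at the top of i40e_check_profile_info() handles.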
Re: [dpdk-dev] [PATCH v2] relicense various bits of the dpdk
On 2/1/2018 5:49 PM, Neil Horman wrote: Received a note the other day from the Linux Foundation governance board for DPDK indicating that several files I have copyright on need to be relicensed to be compliant with the DPDK licensing guidelines. I have some concerns with some parts of the request, but am not opposed to other parts. So, for those pieces that we are in consensus on, I'm proposing that we change their license from BSD 2 clause to 3 clause. I'm also updating the files to use the SPDX licensing scheme Signed-off-by: Neil Horman CC: Hemant Agrawal CC: Thomas Monjalon --- Change notes V2) Cleaned up formatting (tmonjalon) --- devtools/validate-abi.sh | 32 lib/librte_compat/rte_compat.h | 31 +++ 2 files changed, 7 insertions(+), 56 deletions(-) Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH 2/2] vhost: only drop vqs with built-in virtio_net.c driver
On 02/01/2018 11:24 AM, Stefan Hajnoczi wrote: On Wed, Jan 31, 2018 at 07:07:50PM +0100, Maxime Coquelin wrote: Hi Stefan, On 01/31/2018 06:46 PM, Stefan Hajnoczi wrote: Commit e29109323595beb3884da58126ebb3b878cb66f5 ("vhost: destroy unused virtqueues when multiqueue not negotiated") broke vhost-scsi by removing virtqueues when the virtio-net-specific VIRTIO_NET_F_MQ feature bit is missing. The vhost_user.c code shouldn't assume all devices are vhost net device backends. Use the new VIRTIO_DEV_BUILTIN_VIRTIO_NET flag to check whether virtio_net.c is being used. This fixes examples/vhost_scsi. Cc: Maxime Coquelin Signed-off-by: Stefan Hajnoczi --- lib/librte_vhost/vhost_user.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 1dd1a61b6..65ee33919 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -187,7 +187,8 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off", (dev->features & (1ULL << VIRTIO_F_VERSION_1)) ? "on" : "off"); - if (!(dev->features & (1ULL << VIRTIO_NET_F_MQ))) { + if ((dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) && + !(dev->features & (1ULL << VIRTIO_NET_F_MQ))) { If we had an external net backend using the library, we would also need to remove extra queues if it advertised VIRTIO_NET_F_MQ, but the feature isn't negotiated. A net-specific fix inside librte_vhost is not enough since non-net drivers may only initialize a subset of the virtqueues. You are right, and even with net backend, the driver may decide to only initialize a subset of the virtqueues even if VIRTIO_NET_F_MQ is negotiated. This is not the case today with virtio-net Linux kernel driver and DPDK's Virtio PMD, but Windows virtio-net driver only initialize as much queue pairs as online vCPUs. I suggest starting over. Get rid of the net-specific fix and instead look at when new_device() needs to be called. I agree, and already started to work on it. But my understanding is that we will need a vhost-user protocol update. So I implemented this workaround to support the VIRTIO_NET_F_MQ not negotiated case, which happens with iPXE. Virtqueues will not be changed after the DRIVER_OK status bit has been set. The VIRTIO 1.0 specification says, "The device MUST NOT consume buffers before DRIVER_OK, and the driver MUST NOT notify the device before it sets DRIVER_OK" (3.1 Device Initialization). http://docs.oasis-open.org/virtio/virtio/v1.0/csprd01/virtio-v1.0-csprd01.html#x1-230001 However, it also says "legacy device implementations often used the device before setting the DRIVER_OK bit" (3.1.1 Legacy Interface: Device Initialization). VIRTIO 1.0 can be supported fine by the current librte_vhost API. Legacy cannot be supported without API changes - there is no magic way to detect when new_device() can be invoked if the driver might require some virtqueue processing before the device is fully initialized. I think this is the main discussion that needs to happen. This patch series and the original VIRTIO_NET_F_MQ fix are just workarounds for the real problem. Yes. In this case, the fix I suggested yesterday would work: if ((vhost_features & (1ULL << VIRTIO_NET_F_MQ)) && !(dev->features & (1ULL << VIRTIO_NET_F_MQ)) { ... } For any backend that does not advertise the feature, no queues will be destroyed. The feature bit space is shared by all device types. Another device can use bit 22 (VIRTIO_NET_F_MQ) for another purpose. 
This code would incorrectly assume it's a net device. Thanks for pointing this out, I missed that. No other device type in VIRTIO 1.0 uses bit 22 yet, but this solution is not future-proof. If you decide to use your fix, please include a comment in the code so whoever needs to debug this again in the future can spot the problem more quickly. No, I agree this is not future proof. I now think your patch is better. Thanks for the insights! Maxime Stefan
Re: [dpdk-dev] [PATCH] net/mlx5: fix port stop by verify flows are still present
Tuesday, January 30, 2018 3:37 PM, Nelio Laranjeiro: > priv_flow_stop() may be called several times, in such situation flows are > already removed from the NIC and thus all associated objects are no present > in the flow object (ibv_flow, indirections tables, ). s/indirections/ indirection > > Fixes: 71ee11c83bc4 ("net/mlx5: fix flow stop when flows are already > stopped") Such commit don't exists. Removed it. > Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow") > > Signed-off-by: Nelio Laranjeiro > --- With above fixes, applied to next-net-mlx. Let me know if you disagree. Thanks.
Re: [dpdk-dev] [PATCH v2] doc: add preferred burst size support
On 2/1/2018 6:18 PM, Shreyansh Jain wrote: rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value is smaller than requested, application can consider it end of packet stream. Some hardware can only support smaller burst sizes which need to be advertised. Similar is the case for Tx burst. This patch adds deprecation notice for rte_eth_dev_info structure as two new members, for preferred Rx and Tx burst size would be added - impacting the size of the structure. Signed-off-by: Shreyansh Jain Acked-by: Hemant Agrawal ...
Re: [dpdk-dev] [PATCH] net/mlx5: fix flow priority on queue action
Thursday, February 1, 2018 3:42 AM, Yongseok Koh: > > On Jan 31, 2018, at 8:13 AM, Nelio Laranjeiro > wrote: > > > > A single queue should have the same verbs priority as an RSS one. > > > > Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow") > > Cc: sta...@dpdk.org > > > > Signed-off-by: Nelio Laranjeiro > > --- > Acked-by: Yongseok Koh Applied to next-net-mlx. Thanks. > > Thanks
Re: [dpdk-dev] [PATCH v2] doc: add preferred burst size support
On 02/01/2018 03:48 PM, Shreyansh Jain wrote: rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value is smaller than requested, application can consider it end of packet stream. Some hardware can only support smaller burst sizes which need to be advertised. Similar is the case for Tx burst. This patch adds deprecation notice for rte_eth_dev_info structure as two new members, for preferred Rx and Tx burst size would be added - impacting the size of the structure. Signed-off-by: Shreyansh Jain --- * Refer: http://dpdk.org/dev/patchwork/patch/32112 for context v2: - fix spelling error in deprecation notice doc/guides/rel_notes/deprecation.rst | 8 1 file changed, 8 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d59ad5988..fdc7656fa 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -59,3 +59,11 @@ Deprecation Notices be added between the producer and consumer structures. The size of the structure and the offset of the fields will remain the same on platforms with 64B cache line, but will change on other platforms. + +* ethdev: Currently, if the rte_eth_rx_burst() function returns a value less + than *nb_pkts*, the application will assume that no more packets are present. + Some of the hw queue based hardware can only support smaller burst for RX + and TX and thus break the expectation of the rx_burst API. Similar is the + case for TX burst. ``rte_eth_dev_info`` will be added with two new + parameters, ``uint16_t pref_rx_burst`` and ``uint16_t pref_tx_burst``, + for preferred RX and TX burst sizes, respectively. Acked-by: Andrew Rybchenko
Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool
On 2/1/2018 12:10 PM, Jerin Jacob wrote: -Original Message- Date: Wed, 31 Jan 2018 17:46:51 +0100 From: Olivier Matz To: Andrew Rybchenko CC: dev@dpdk.org Subject: Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool User-Agent: NeoMutt/20170113 (1.7.2) On Tue, Jan 23, 2018 at 01:23:04PM +, Andrew Rybchenko wrote: An API/ABI changes are planned for 18.05 [1]: * Allow to customize how mempool objects are stored in memory. * Deprecate mempool XMEM API. * Add mempool driver ops to get information from mempool driver and dequeue contiguous blocks of objects if driver supports it. [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html Signed-off-by: Andrew Rybchenko Acked-by: Olivier Matz Acked-by: Jerin Jacob Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH 0/2] vhost: fix VIRTIO_NET_F_MQ vhost_scsi breakage
On 01/31/2018 06:46 PM, Stefan Hajnoczi wrote: These patches fix a recent regression in librte_vhost that breaks the vhost_scsi example application. vhost_user.c assumes all devices are vhost net backends when handling the VIRTIO_NET_F_MQ feature bit. The code is triggered by vhost scsi devices and causes virtqueues to be removed. See Patch 2 for details. Patch 1 puts the infrastructure in place to distinguish between the built-in virtio_net.c driver and generic vhost device backend usage. Patch 2 fixes the regression by handling VIRTIO_NET_F_MQ only when the built-in virtio_net.c driver is in use. Stefan Hajnoczi (2): vhost: add flag for built-in virtio_net.c driver vhost: only drop vqs with built-in virtio_net.c driver lib/librte_vhost/vhost.h | 3 +++ lib/librte_vhost/socket.c | 15 +++ lib/librte_vhost/vhost.c | 17 - lib/librte_vhost/vhost_user.c | 3 ++- lib/librte_vhost/virtio_net.c | 14 ++ 5 files changed, 50 insertions(+), 2 deletions(-) For the series: Reviewed-by: Maxime Coquelin Thanks, Maxime
Re: [dpdk-dev] [PATCH] test/test: clean up memory for func reentrancy test
On Wed, Jan 31, 2018 at 02:17:32PM +, Anatoly Burakov wrote: > Function reentrancy test limits maximum number of iterations based > on the number of memzones and cores, however it doesn't free the > memzones after the fact, so on a machine with big amount of cores > the tests will fail due to running out of memzones. > > Fix this by introducing cleanup functions for ring and mempool > reentrancy tests. > > Signed-off-by: Anatoly Burakov Acked-by: Olivier Matz Not specifically related to this patch, but it seems that the func_reent test cannot be launched twice, because the objects "fr_test_once" are not freed. I'll see if I can submit a patch in the coming days.
[dpdk-dev] [PATCH] net/ena: fix jumbo support in Rx offloads flags
ENA device supports Rx jumbo frames and such information needs to be provided in the offloads flags. Fixes: 7369f88f88c0 ("net/ena: convert to new Rx offloads API") Signed-off-by: Rafal Kozik --- drivers/net/ena/ena_ethdev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c index 83e0ae2..3588384 100644 --- a/drivers/net/ena/ena_ethdev.c +++ b/drivers/net/ena/ena_ethdev.c @@ -1561,6 +1561,8 @@ static void ena_infos_get(struct rte_eth_dev *dev, DEV_RX_OFFLOAD_UDP_CKSUM | DEV_RX_OFFLOAD_TCP_CKSUM; + rx_feat |= DEV_RX_OFFLOAD_JUMBO_FRAME; + /* Inform framework about available features */ dev_info->rx_offload_capa = rx_feat; dev_info->rx_queue_offload_capa = rx_feat; -- 2.7.4
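For reference, an application sees this change through the usual capability query; a minimal sketch (nothing below is taken from the ena PMD or its tests):

#include <rte_ethdev.h>

/* Returns nonzero when the port advertises Rx jumbo frame support. */
static int
port_supports_rx_jumbo(uint16_t port_id)
{
        struct rte_eth_dev_info info;

        rte_eth_dev_info_get(port_id, &info);
        return (info.rx_offload_capa & DEV_RX_OFFLOAD_JUMBO_FRAME) != 0;
}

With the flag reported, the application can request DEV_RX_OFFLOAD_JUMBO_FRAME in rxmode.offloads and size rxmode.max_rx_pkt_len accordingly before calling rte_eth_dev_configure().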
Re: [dpdk-dev] [PATCH v3] net/i40e: fix multiple DDP packages should not be allowed
> -Original Message- > From: Rybalchenko, Kirill > Sent: Thursday, February 1, 2018 12:43 PM > To: dev@dpdk.org > Cc: sta...@dpdk.org; Rybalchenko, Kirill ; > Chilikin, Andrey ; Xing, Beilei > ; Wu, Jingjing > Subject: [PATCH v3] net/i40e: fix multiple DDP packages should not be > allowed > > Should be not possible to load conflicting DDP profiles. > Only DDP profiles of the same group (not 0) can be loaded > together; > If DDP profile group is 0, it is exclusive, i.e. it cannot > be loaded with any other DDP profile; > If DDP profile groups are different - these profiles cannot > be loaded together; > > Fixes: b319712f53c8 ("net/i40e: extended list of operations for DDP > processing") > > v3: prevent registration of read-only profiles with profile list > > Signed-off-by: Kirill Rybalchenko Acked-by: Andrey Chilikin
Re: [dpdk-dev] [PATCH v2] doc: add preferred burst size support
On Thu, Feb 01, 2018 at 06:18:23PM +0530, Shreyansh Jain wrote: > rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value > is smaller than requested, application can consider it end of packet > stream. Some hardware can only support smaller burst sizes which need > to be advertised. Similar is the case for Tx burst. > > This patch adds deprecation notice for rte_eth_dev_info structure as > two new members, for preferred Rx and Tx burst size would be added - > impacting the size of the structure. > > Signed-off-by: Shreyansh Jain > --- > * Refer: http://dpdk.org/dev/patchwork/patch/32112 for context > > v2: > - fix spelling error in deprecation notice > > doc/guides/rel_notes/deprecation.rst | 8 > 1 file changed, 8 insertions(+) > > diff --git a/doc/guides/rel_notes/deprecation.rst > b/doc/guides/rel_notes/deprecation.rst > index d59ad5988..fdc7656fa 100644 > --- a/doc/guides/rel_notes/deprecation.rst > +++ b/doc/guides/rel_notes/deprecation.rst > @@ -59,3 +59,11 @@ Deprecation Notices >be added between the producer and consumer structures. The size of the >structure and the offset of the fields will remain the same on >platforms with 64B cache line, but will change on other platforms. > + > +* ethdev: Currently, if the rte_eth_rx_burst() function returns a value > less > + than *nb_pkts*, the application will assume that no more packets are > present. > + Some of the hw queue based hardware can only support smaller burst for RX > + and TX and thus break the expectation of the rx_burst API. Similar is the > + case for TX burst. ``rte_eth_dev_info`` will be added with two new > + parameters, ``uint16_t pref_rx_burst`` and ``uint16_t pref_tx_burst``, > + for preferred RX and TX burst sizes, respectively. > -- > 2.14.1 > LTGM as far as it goes, but following discussion on this patch, http://dpdk.org/ml/archives/dev/2018-January/089585.html I think we might also want to add in parameters for "pref_tx_ring_sz" and "pref_rx_ring_sz" too. While it is the case that, once the structure is changed, we can make multiple additional changes, I think it might be worth mentioning as many as we can for completeness. Another point to consider, is whether we might want to add in a sub-structure for "preferred_settings" to hold all these, rather than just adding them as new fields. It might help with making names more readable (though also longer). struct { uint16_t rx_burst; uint16_t tx_burst; uint16_t rx_ring_sz; uint16_t tx_ring_sz; } preferred_settings; In any case, for this or subsequent versions: Acked-by: Bruce Richardson /Bruce
[dpdk-dev] Link discovery using DPDK-based NICs
Hi all, I wonder if there is any means in DPDK to perform link discovery. I know that DPDK is all about packet I/O, but maybe there is some library with useful functions that an application could use for link discovery (e.g., using LLDP). Any suggestions are much appreciated. Thanks, -- Georgios Katsikas Industrial Ph.D. Student Network Intelligence Group Decision, Networks, and Analytics (DNA) Lab RISE SICS E-Mail: georgios.katsi...@ri.se
Re: [dpdk-dev] [RFC v2 03/17] mempool/octeontx: add callback to calculate memory size
On Thursday 01 February 2018 03:31 PM, santosh wrote: > Hi Andrew, > > > On Thursday 01 February 2018 11:48 AM, Jacob, Jerin wrote: >> The driver requires one and only one physically contiguous >> memory chunk for all objects. >> >> Signed-off-by: Andrew Rybchenko >> --- >> drivers/mempool/octeontx/rte_mempool_octeontx.c | 25 >> + >> 1 file changed, 25 insertions(+) >> >> diff --git a/drivers/mempool/octeontx/rte_mempool_octeontx.c >> b/drivers/mempool/octeontx/rte_mempool_octeontx.c >> index d143d05..4ec5efe 100644 >> --- a/drivers/mempool/octeontx/rte_mempool_octeontx.c >> +++ b/drivers/mempool/octeontx/rte_mempool_octeontx.c >> @@ -136,6 +136,30 @@ octeontx_fpavf_get_capabilities(const struct >> rte_mempool *mp, >> return 0; >> } >> >> +static ssize_t >> +octeontx_fpavf_calc_mem_size(const struct rte_mempool *mp, >> +uint32_t obj_num, uint32_t pg_shift, >> +size_t *min_chunk_size, size_t *align) >> +{ >> + ssize_t mem_size; >> + >> + /* >> +* Simply need space for one more object to be able to >> +* fullfil alignment requirements. >> +*/ >> + mem_size = rte_mempool_calc_mem_size_def(mp, obj_num + 1, pg_shift, >> + > I think, you don't need that (obj_num + 1) as because > rte_xmem_calc_int() will be checking flags for > _ALIGNED + _CAPA_PHYS_CONFIG i.e.. > > mask = MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS | MEMPOOL_F_CAPA_PHYS_CONTIG; > if ((flags & mask) == mask) > /* alignment need one additional object */ > elt_num += 1; ok, You are removing above check in v2- 06/17, so ignore above comment. I suggest to move this patch and keep it after 06/17. Or perhaps keep common mempool changes first then followed by driver specifics changes in your v3 series. Thanks.
[dpdk-dev] [PATCH] buildtools: output build failure reason to stderr
If build fails because of failed experimental check and stdout is redirected to /dev/null, it is absolutely unclear why build fails. Fixes: a4bcd61de82d ("buildtools: add script to check experimental API exports") Signed-off-by: Andrew Rybchenko --- buildtools/check-experimental-syms.sh | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/buildtools/check-experimental-syms.sh b/buildtools/check-experimental-syms.sh index 7d21de3..7f5aa61 100755 --- a/buildtools/check-experimental-syms.sh +++ b/buildtools/check-experimental-syms.sh @@ -22,9 +22,11 @@ do IN_EXP=$? if [ $IN_TEXT -eq 0 -a $IN_EXP -ne 0 ] then - echo "$SYM is not flagged as experimental" - echo "but is listed in version map" - echo "Please add __rte_experimental to the definition of $SYM" + cat >&2 <
[dpdk-dev] [PATCH] mempool: fix phys contig check if populate default skipped
There is not specified dependency between rte_mempool_populate_default() and rte_mempool_populate_iova(). So, the second should not rely on the fact that the first adds capability flags to the mempool flags. Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") Cc: sta...@dpdk.org Signed-off-by: Andrew Rybchenko Acked-by: Santosh Shukla --- lib/librte_mempool/rte_mempool.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 6fdb723..54f7f4b 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -333,6 +333,7 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr, void *opaque) { unsigned total_elt_sz; + unsigned int mp_capa_flags; unsigned i = 0; size_t off; struct rte_mempool_memhdr *memhdr; @@ -357,8 +358,17 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr, total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; + /* Get mempool capabilities */ + mp_capa_flags = 0; + ret = rte_mempool_ops_get_capabilities(mp, &mp_capa_flags); + if ((ret < 0) && (ret != -ENOTSUP)) + return ret; + + /* update mempool capabilities */ + mp->flags |= mp_capa_flags; + /* Detect pool area has sufficient space for elements */ - if (mp->flags & MEMPOOL_F_CAPA_PHYS_CONTIG) { + if (mp_capa_flags & MEMPOOL_F_CAPA_PHYS_CONTIG) { if (len < total_elt_sz * mp->size) { RTE_LOG(ERR, MEMPOOL, "pool area %" PRIx64 " not enough\n", @@ -378,7 +388,7 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr, memhdr->free_cb = free_cb; memhdr->opaque = opaque; - if (mp->flags & MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS) + if (mp_capa_flags & MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS) /* align object start address to a multiple of total_elt_sz */ off = total_elt_sz - ((uintptr_t)vaddr % total_elt_sz); else if (mp->flags & MEMPOOL_F_NO_CACHE_ALIGN) -- 2.7.4
Re: [dpdk-dev] [RFC v2 01/17] mempool: fix phys contig check if populate default skipped
On 02/01/2018 01:33 PM, santosh wrote: On Thursday 01 February 2018 03:44 PM, Olivier Matz wrote: On Thu, Feb 01, 2018 at 01:00:12PM +0300, Andrew Rybchenko wrote: On 02/01/2018 12:30 PM, santosh wrote: On Thursday 01 February 2018 02:48 PM, Andrew Rybchenko wrote: On 02/01/2018 12:09 PM, santosh wrote: On Thursday 01 February 2018 12:24 PM, Andrew Rybchenko wrote: On 02/01/2018 08:05 AM, santosh wrote: On Wednesday 31 January 2018 10:15 PM, Olivier Matz wrote: On Tue, Jan 23, 2018 at 01:15:56PM +, Andrew Rybchenko wrote: There is not specified dependency between rte_mempool_populate_default() and rte_mempool_populate_iova(). So, the second should not rely on the fact that the first adds capability flags to the mempool flags. Fixes: 65cf769f5e6a ("mempool: detect physical contiguous objects") Cc: sta...@dpdk.org Signed-off-by: Andrew Rybchenko Looks good to me. I agree it's strange that the mp->flags are updated with capabilities only in rte_mempool_populate_default(). I see that this behavior is removed later in the patchset since the get_capa() is removed! However maybe this single patch could go in 18.02. +Santosh +Jerin since it's mostly about Octeon. rte_mempool_xmem_size should return correct size if MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag is set in 'mp->flags'. Thats why _ops_get_capabilities() called in _populate_default() but not at _populate_iova(). I think, this 'alone' patch may break octeontx mempool. The patch does not touch rte_mempool_populate_default(). _ops_get_capabilities() is still called there before rte_mempool_xmem_size(). The theoretical problem which the patch tries to fix is the case when rte_mempool_populate_default() is not called at all. I.e. application calls _ops_get_capabilities() to get flags, then, together with mp->flags, calls rte_mempool_xmem_size() directly, allocates calculated amount of memory and calls _populate_iova(). In that case, Application does like below: /* Get mempool capabilities */ mp_flags = 0; ret = rte_mempool_ops_get_capabilities(mp, &mp_flags); if ((ret < 0) && (ret != -ENOTSUP)) return ret; /* update mempool capabilities */ mp->flags |= mp_flags; Above line is not mandatory. "mp->flags | mp_flags" could be simply passed to rte_mempool_xmem_size() below. That depends and again upto application requirement, if app further down wants to refer mp->flags for _align/_contig then better update to mp->flags. But that wasn't the point of discussion, I'm trying to understand that w/o this patch, whats could be the application level problem? The problem that it is fragile. If application does not use rte_mempool_populate_default() it has to care about addition of mempool capability flags into mempool flags. If it is not done, rte_mempool_populate_iova/virt/iova_tab() functions will work incorrectly since F_CAPA_PHYS_CONTIG and F_CAPA_BLK_ALIGNED_OBJECTS are missing. The idea of the patch is to make it a bit more robust. I have no idea how it can break something. If capability flags are already there - no problem. If no, just make sure that we locally have them. The example given by Santosh will work, but it is *not* the role of the application to update the mempool flags. And nothing says that it is mandatory to call rte_mempool_ops_get_capabilities() before the populate functions. For instance, in testpmd it calls rte_mempool_populate_anon() when using anonymous memory. The capabilities will never be updated in mp->flags. Valid case and I agree with your example and explanation. 
With nits change: mp->flags |= mp_capa_flags; Acked-by: Santosh Shukla I'll submit the patch separately with this minor change. Thanks.
[dpdk-dev] [PATCH] ethdev: check consistency of per port offloads
A per port offloading feature should be enabled or disabled at same time in both rte_eth_dev_configure( ) and rte_eth_rx_queue_setup( )/rte_eth_tx_queue_setup( ). This patch check if a per port offloading flag has same configuration in rte_eth_dev_configure( ) and rte_eth_rx_queue_setup( )/rte_eth_tx_queue_setup( ). This patch can make such checking in a common way in rte_ethdev layer to avoid same checking in underlying PMD. Signed-off-by: Wei Dai --- lib/librte_ether/rte_ethdev.c | 70 +++ 1 file changed, 70 insertions(+) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index 78bed1a..7945890 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -1404,6 +1404,44 @@ rte_eth_dev_is_removed(uint16_t port_id) return ret; } +/** +* Check if the Rx/Tx queue offloading settings is valid +* @param queue_offloads +* offloads input to rte_eth_rx_queue_setup( ) or rte_eth_tx_queue_setup( ) +* @param port_offloads +* Rx or Tx offloads input to rte_eth_dev_configure( ) +* @param queue_offload_capa +* rx_queue_offload_capa or tx_queue_offload_capa in struct rte_eth_dev_ifnfo +* got from rte_eth_dev_info_get( ) +* @param all_offload_capa +* rx_offload_capa or tx_offload_capa in struct rte_eth_dev_info +* got from rte_eth_dev_info_get( ) +* +* @return +* Nonzero when per-queue offloading setting is valid +*/ +static int +rte_eth_check_queue_offloads(uint64_t queue_offloads, +uint64_t port_offloads, +uint64_t queue_offload_capa, +uint64_t all_offload_capa) +{ + uint64_t pure_port_capa = all_offload_capa ^ queue_offload_capa; + + return !((port_offloads ^ queue_offloads) & pure_port_capa); +} + +static int +rte_eth_check_rx_queue_offloads(uint64_t rx_queue_offloads, + const struct rte_eth_rxmode *rxmode, + const struct rte_eth_dev_info *dev_info) +{ + return rte_eth_check_queue_offloads(rx_queue_offloads, + rxmode->offloads, + dev_info->rx_queue_offload_capa, + dev_info->rx_offload_capa); +} + int rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id, uint16_t nb_rx_desc, unsigned int socket_id, @@ -1446,6 +1484,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id, (int) sizeof(struct rte_pktmbuf_pool_private)); return -ENOSPC; } + mbp_buf_size = rte_pktmbuf_data_room_size(mp); if ((mbp_buf_size - RTE_PKTMBUF_HEADROOM) < dev_info.min_rx_bufsize) { @@ -1495,6 +1534,16 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id, &local_conf.offloads); } + if (!rte_eth_check_rx_queue_offloads(local_conf.offloads, + &dev->data->dev_conf.rxmode, &dev_info)) { + RTE_PMD_DEBUG_TRACE("%p : Rx queue offloads ox%" PRIx64 + " don't match port offloads 0x%" PRIx64 + " or supported offloads 0x%" PRIx64, + (void *)dev, local_conf.offloads, + dev_info.rx_offload_capa); + return -ENOTSUP; + } + ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc, socket_id, &local_conf, mp); if (!ret) { @@ -1555,6 +1604,17 @@ rte_eth_convert_txq_offloads(const uint64_t tx_offloads, uint32_t *txq_flags) *txq_flags = flags; } +static int +rte_eth_check_tx_queue_offloads(uint64_t tx_queue_offloads, + const struct rte_eth_txmode *txmode, + const struct rte_eth_dev_info *dev_info) +{ + return rte_eth_check_queue_offloads(tx_queue_offloads, + txmode->offloads, + dev_info->tx_queue_offload_capa, + dev_info->tx_offload_capa); +} + int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id, uint16_t nb_tx_desc, unsigned int socket_id, @@ -1622,6 +1682,16 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id, &local_conf.offloads); } + 
if (!rte_eth_check_tx_queue_offloads(local_conf.offloads, + &dev->data->dev_conf.txmode, &dev_info)) { + RTE_PMD_DEBUG_TRACE("%p : Tx queue offloads ox%" PRIx64 + " don't match port offloads 0x%" PRIx64 + " or supported offloads 0x%" PRIx64, + (void *)dev, local_conf.offloads, + dev_info.tx_offload_capa); +
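The XOR arithmetic in rte_eth_check_queue_offloads() is compact, so a worked example may help. The capability bits below are made up (A = 0x1 is a queue-level offload, B = 0x2 and C = 0x4 are port-wide only); they are not real DEV_RX_OFFLOAD_* values:

#include <assert.h>
#include <stdint.h>

int
main(void)
{
        uint64_t all_capa = 0x7, queue_capa = 0x1;
        uint64_t pure_port_capa = all_capa ^ queue_capa;   /* 0x6: B and C */
        uint64_t port_offloads = 0x2;  /* B enabled in rte_eth_dev_configure() */
        uint64_t queue_offloads = 0x1; /* only A requested at queue setup */

        /* B is port-wide but missing from the queue config -> rejected. */
        assert(((port_offloads ^ queue_offloads) & pure_port_capa) != 0);

        /* Repeating the port-wide offload at queue setup passes the check. */
        queue_offloads = 0x3;
        assert(((port_offloads ^ queue_offloads) & pure_port_capa) == 0);
        return 0;
}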
Re: [dpdk-dev] [PATCH v2] doc: add preferred burst size support
On Thursday 01 February 2018 06:57 PM, Bruce Richardson wrote: On Thu, Feb 01, 2018 at 06:18:23PM +0530, Shreyansh Jain wrote: rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value is smaller than requested, application can consider it end of packet stream. Some hardware can only support smaller burst sizes which need to be advertised. Similar is the case for Tx burst. This patch adds deprecation notice for rte_eth_dev_info structure as two new members, for preferred Rx and Tx burst size would be added - impacting the size of the structure. Signed-off-by: Shreyansh Jain --- * Refer: http://dpdk.org/dev/patchwork/patch/32112 for context v2: - fix spelling error in deprecation notice doc/guides/rel_notes/deprecation.rst | 8 1 file changed, 8 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d59ad5988..fdc7656fa 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -59,3 +59,11 @@ Deprecation Notices be added between the producer and consumer structures. The size of the structure and the offset of the fields will remain the same on platforms with 64B cache line, but will change on other platforms. + +* ethdev: Currently, if the rte_eth_rx_burst() function returns a value less + than *nb_pkts*, the application will assume that no more packets are present. + Some of the hw queue based hardware can only support smaller burst for RX + and TX and thus break the expectation of the rx_burst API. Similar is the + case for TX burst. ``rte_eth_dev_info`` will be added with two new + parameters, ``uint16_t pref_rx_burst`` and ``uint16_t pref_tx_burst``, + for preferred RX and TX burst sizes, respectively. -- 2.14.1 LTGM as far as it goes, but following discussion on this patch, http://dpdk.org/ml/archives/dev/2018-January/089585.html I think we might also want to add in parameters for "pref_tx_ring_sz" and "pref_rx_ring_sz" too. While it is the case that, once the structure is changed, we can make multiple additional changes, I think it might be worth mentioning as many as we can for completeness. Another point to consider, is whether we might want to add in a sub-structure for "preferred_settings" to hold all these, rather than just adding them as new fields. It might help with making names more readable (though also longer). struct { uint16_t rx_burst; uint16_t tx_burst; uint16_t rx_ring_sz; uint16_t tx_ring_sz; } preferred_settings; This, and the point above that we can make multiple additional changes, is definitely a good idea. Though, 'preferred_setting' is long and has chances of spell mistakes in first go - what about just 'pref' or, 'pref_size' if only 4 mentioned above are part of this. For now I saw need for burst size because I hit that case. Ring size looks logical to me. We can have a look if more such toggles are required. In any case, for this or subsequent versions: Acked-by: Bruce Richardson /Bruce Thanks.
Re: [dpdk-dev] [PATCH v2] doc: add preferred burst size support
On Thu, Feb 01, 2018 at 07:58:32PM +0530, Shreyansh Jain wrote: > On Thursday 01 February 2018 06:57 PM, Bruce Richardson wrote: > > On Thu, Feb 01, 2018 at 06:18:23PM +0530, Shreyansh Jain wrote: > > > rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value > > > is smaller than requested, application can consider it end of packet > > > stream. Some hardware can only support smaller burst sizes which need > > > to be advertised. Similar is the case for Tx burst. > > > > > > This patch adds deprecation notice for rte_eth_dev_info structure as > > > two new members, for preferred Rx and Tx burst size would be added - > > > impacting the size of the structure. > > > > > > Signed-off-by: Shreyansh Jain > > > --- > > > * Refer: http://dpdk.org/dev/patchwork/patch/32112 for context > > > > > > v2: > > > - fix spelling error in deprecation notice > > > > > > doc/guides/rel_notes/deprecation.rst | 8 > > > 1 file changed, 8 insertions(+) > > > > > > diff --git a/doc/guides/rel_notes/deprecation.rst > > > b/doc/guides/rel_notes/deprecation.rst > > > index d59ad5988..fdc7656fa 100644 > > > --- a/doc/guides/rel_notes/deprecation.rst > > > +++ b/doc/guides/rel_notes/deprecation.rst > > > @@ -59,3 +59,11 @@ Deprecation Notices > > > be added between the producer and consumer structures. The size of the > > > structure and the offset of the fields will remain the same on > > > platforms with 64B cache line, but will change on other platforms. > > > + > > > +* ethdev: Currently, if the rte_eth_rx_burst() function returns a > > > value less > > > + than *nb_pkts*, the application will assume that no more packets are > > > present. > > > + Some of the hw queue based hardware can only support smaller burst for > > > RX > > > + and TX and thus break the expectation of the rx_burst API. Similar is > > > the > > > + case for TX burst. ``rte_eth_dev_info`` will be added with two new > > > + parameters, ``uint16_t pref_rx_burst`` and ``uint16_t pref_tx_burst``, > > > + for preferred RX and TX burst sizes, respectively. > > > -- > > > 2.14.1 > > > > > > > LTGM as far as it goes, but following discussion on this patch, > > http://dpdk.org/ml/archives/dev/2018-January/089585.html > > I think we might also want to add in parameters for "pref_tx_ring_sz" > > and "pref_rx_ring_sz" too. While it is the case that, once the structure > > is changed, we can make multiple additional changes, I think it might be > > worth mentioning as many as we can for completeness. > > > > Another point to consider, is whether we might want to add in a > > sub-structure for "preferred_settings" to hold all these, rather than > > just adding them as new fields. It might help with making names more > > readable (though also longer). > > > > struct { > > uint16_t rx_burst; > > uint16_t tx_burst; > > uint16_t rx_ring_sz; > > uint16_t tx_ring_sz; > > } preferred_settings; > > This, and the point above that we can make multiple additional changes, is > definitely a good idea. Though, 'preferred_setting' is long and has chances > of spell mistakes in first go - what about just 'pref' or, 'pref_size' if > only 4 mentioned above are part of this. > > For now I saw need for burst size because I hit that case. Ring size looks > logical to me. We can have a look if more such toggles are required. > I actually don't like the abbreviation "pref", as it looks too much like "perf" short for performance. As this is an initialization setting, I also don't think having a longer name is that big of deal. 
How about calling them "suggested" or "recommended" settings - both of which have less fiddly spellings. /Bruce
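For concreteness, here is a standalone C sketch of the clamping behaviour the proposed fields would enable. Note that pref_rx_burst/pref_tx_burst are only the names floated in this thread and are not existing rte_eth_dev_info members, so a local stand-in struct is used instead of the real ethdev structure.

#include <stdint.h>

/* Stand-in for the dev_info fields proposed in this thread; not a real
 * DPDK structure. A value of 0 means "no preference advertised". */
struct pmd_pref_info {
	uint16_t pref_rx_burst;
	uint16_t pref_tx_burst;
};

/* Clamp the application's Rx burst size to the PMD's preferred value,
 * falling back to the application default when none is advertised. */
static uint16_t
clamp_rx_burst(const struct pmd_pref_info *pref, uint16_t app_burst)
{
	if (pref->pref_rx_burst != 0 && pref->pref_rx_burst < app_burst)
		return pref->pref_rx_burst;
	return app_burst;
}

The same clamping would apply symmetrically to the Tx burst size, and to ring sizes if those are added to the structure as well.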
[dpdk-dev] [PATCH v2 0/5] Fix meson build on FreeBSD
There are a few issues with building DPDK for FreeBSD using the meson build system, specifically: * the kernel modules aren't compiling due to an incorrect VPATH * a number of unit tests depend on libraries not supported on BSD * applications and examples need to be linked with execinfo library. V2: merged patch 6 in with patch 2, since it's the same fix for main apps and for the examples. Bruce Richardson (5): eal/bsdapp: fix building kernel modules build: fix dependency on execinfo for BSD meson builds test/test: mark tests as skipped when required lib not available test/test: fix dependency on power lib for BSD meson build test/test: fix dependency on KNI lib for BSD meson build app/test-eventdev/meson.build | 1 + app/test-pmd/meson.build| 1 + examples/meson.build| 3 ++- lib/librte_eal/bsdapp/BSDmakefile.meson | 1 + lib/librte_eal/meson.build | 1 - test/test/meson.build | 8 +++- test/test/test_kni.c| 13 + test/test/test_power.c | 12 test/test/test_power_acpi_cpufreq.c | 11 +++ test/test/test_power_kvm_vm.c | 11 +++ 10 files changed, 59 insertions(+), 3 deletions(-) -- 2.14.1
[dpdk-dev] [PATCH v2 1/5] eal/bsdapp: fix building kernel modules
The kernel module source file directory passed via VPATH was wrong, which caused the source files to be not found via make. Rather than explicitly passing VPATH, make use of the fact that the full path to the source files is passed by meson, so split that into directory part - to be used as VPATH - and file part - to be used as the source filename. Fixes: 610beca42ea4 ("build: remove library special cases") Signed-off-by: Bruce Richardson --- lib/librte_eal/bsdapp/BSDmakefile.meson | 1 + lib/librte_eal/meson.build | 1 - 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_eal/bsdapp/BSDmakefile.meson b/lib/librte_eal/bsdapp/BSDmakefile.meson index 2f16ac05b..42f5b2b9d 100644 --- a/lib/librte_eal/bsdapp/BSDmakefile.meson +++ b/lib/librte_eal/bsdapp/BSDmakefile.meson @@ -36,6 +36,7 @@ # source file is passed via KMOD_SRC as full path, we only use final # component of it, as VPATH is used to find actual file, so as to # have the .o files placed in the build, not source directory +VPATH = ${KMOD_SRC:H} SRCS = ${KMOD_SRC:T} device_if.h bus_if.h pci_if.h CFLAGS += $(KMOD_CFLAGS) diff --git a/lib/librte_eal/meson.build b/lib/librte_eal/meson.build index 6fb2ef17f..d9ba38533 100644 --- a/lib/librte_eal/meson.build +++ b/lib/librte_eal/meson.build @@ -36,7 +36,6 @@ elif host_machine.system() == 'freebsd' command: ['make', '-f', '@INPUT0@', 'KMOD_SRC=@INPUT1@', 'KMOD=' + k, - 'VPATH=' + join_paths(meson.current_source_dir(), k), 'KMOD_CFLAGS=' + ' '.join(kmod_cflags)], build_by_default: get_option('enable_kmods')) endforeach -- 2.14.1
[dpdk-dev] [PATCH v2 2/5] build: fix dependency on execinfo for BSD meson builds
The binaries and apps in DPDK all need to be linked against the execinfo library on FreeBSD so add this as a dependency in cases where it is found. It's available by default on BSD, but not at all on Linux Fixes: 16ade738fd0d ("app/testpmd: build with meson") Fixes: 89f0711f9ddf ("examples: build some samples with meson") Fixes: b5dc795a8a55 ("test: build app with meson as dpdk-test") Fixes: 2ff67267b049 ("app/eventdev: build with meson") Signed-off-by: Bruce Richardson --- app/test-eventdev/meson.build | 1 + app/test-pmd/meson.build | 1 + examples/meson.build | 3 ++- test/test/meson.build | 1 + 4 files changed, 5 insertions(+), 1 deletion(-) diff --git a/app/test-eventdev/meson.build b/app/test-eventdev/meson.build index 7fb3a280a..7c373c87b 100644 --- a/app/test-eventdev/meson.build +++ b/app/test-eventdev/meson.build @@ -13,6 +13,7 @@ sources = files('evt_main.c', 'test_perf_queue.c') dep_objs = [get_variable(get_option('default_library') + '_rte_eventdev')] +dep_objs += cc.find_library('execinfo', required: false) # BSD only link_libs = [] if get_option('default_library') == 'static' diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build index 9964dae75..7ed74db2b 100644 --- a/app/test-pmd/meson.build +++ b/app/test-pmd/meson.build @@ -37,6 +37,7 @@ dep_objs = [] foreach d:deps dep_objs += get_variable(get_option('default_library') + '_rte_' + d) endforeach +dep_objs += cc.find_library('execinfo', required: false) # for BSD only link_libs = [] if get_option('default_library') == 'static' diff --git a/examples/meson.build b/examples/meson.build index 5658fbe04..2c6b3f889 100644 --- a/examples/meson.build +++ b/examples/meson.build @@ -6,12 +6,13 @@ if get_option('default_library') == 'static' driver_libs = dpdk_drivers endif +execinfo = cc.find_library('execinfo', required: false) foreach example: get_option('examples').split(',') name = example sources = [] allow_experimental_apis = false cflags = machine_args - ext_deps = [] + ext_deps = [execinfo] includes = [include_directories(example)] deps = ['eal', 'mempool', 'net', 'mbuf', 'ethdev', 'cmdline'] subdir(example) diff --git a/test/test/meson.build b/test/test/meson.build index d5b768b9d..2457a2adb 100644 --- a/test/test/meson.build +++ b/test/test/meson.build @@ -234,6 +234,7 @@ foreach d:test_deps def_lib = get_option('default_library') test_dep_objs += get_variable(def_lib + '_rte_' + d) endforeach +test_dep_objs += cc.find_library('execinfo', required: false) link_libs = [] if get_option('default_library') == 'static' -- 2.14.1
[dpdk-dev] [PATCH v2 3/5] test/test: mark tests as skipped when required lib not available
The power management and KNI libraries are not compiled on a FreeBSD platform, which means that the tests can't run. Add in stub code for these cases, allowing the tests to still be compiled, but to report as skipped in those cases. Signed-off-by: Bruce Richardson CC: Ferruh Yigit CC: David Hunt --- test/test/test_kni.c| 13 + test/test/test_power.c | 12 test/test/test_power_acpi_cpufreq.c | 11 +++ test/test/test_power_kvm_vm.c | 11 +++ 4 files changed, 47 insertions(+) diff --git a/test/test/test_kni.c b/test/test/test_kni.c index c6867f256..e4839cdb7 100644 --- a/test/test/test_kni.c +++ b/test/test/test_kni.c @@ -10,6 +10,17 @@ #include "test.h" +#ifndef RTE_LIBRTE_KNI + +static int +test_kni(void) +{ + printf("KNI not supported, skipping test\n"); + return TEST_SKIPPED; +} + +#else + #include #include #include @@ -609,4 +620,6 @@ test_kni(void) return ret; } +#endif + REGISTER_TEST_COMMAND(kni_autotest, test_kni); diff --git a/test/test/test_power.c b/test/test/test_power.c index d601a2730..a0ee21983 100644 --- a/test/test/test_power.c +++ b/test/test/test_power.c @@ -10,6 +10,17 @@ #include "test.h" +#ifndef RTE_LIBRTE_POWER + +static int +test_power(void) +{ + printf("Power management library not supported, skipping test\n"); + return TEST_SKIPPED; +} + +#else + #include static int @@ -74,5 +85,6 @@ test_power(void) rte_power_unset_env(); return -1; } +#endif REGISTER_TEST_COMMAND(power_autotest, test_power); diff --git a/test/test/test_power_acpi_cpufreq.c b/test/test/test_power_acpi_cpufreq.c index ad948fbe1..3bfd03351 100644 --- a/test/test/test_power_acpi_cpufreq.c +++ b/test/test/test_power_acpi_cpufreq.c @@ -10,6 +10,16 @@ #include "test.h" +#ifndef RTE_LIBRTE_POWER + +static int +test_power_acpi_cpufreq(void) +{ + printf("Power management library not supported, skipping test\n"); + return TEST_SKIPPED; +} + +#else #include #define TEST_POWER_LCORE_ID 2U @@ -507,5 +517,6 @@ test_power_acpi_cpufreq(void) rte_power_unset_env(); return -1; } +#endif REGISTER_TEST_COMMAND(power_acpi_cpufreq_autotest, test_power_acpi_cpufreq); diff --git a/test/test/test_power_kvm_vm.c b/test/test/test_power_kvm_vm.c index 97b8af9b5..91b31c442 100644 --- a/test/test/test_power_kvm_vm.c +++ b/test/test/test_power_kvm_vm.c @@ -10,6 +10,16 @@ #include "test.h" +#ifndef RTE_LIBRTE_POWER + +static int +test_power_kvm_vm(void) +{ + printf("Power management library not supported, skipping test\n"); + return TEST_SKIPPED; +} + +#else #include #define TEST_POWER_VM_LCORE_ID0U @@ -270,5 +280,6 @@ test_power_kvm_vm(void) rte_power_unset_env(); return -1; } +#endif REGISTER_TEST_COMMAND(power_kvm_vm_autotest, test_power_kvm_vm); -- 2.14.1
[dpdk-dev] [PATCH v2 4/5] test/test: fix dependency on power lib for BSD meson build
The power library is not built on FreeBSD, so it needs to be an optional rather than a mandatory dependency for building the autotest binary. Fixes: b5dc795a8a55 ("test: build app with meson as dpdk-test") Signed-off-by: Bruce Richardson --- test/test/meson.build | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/test/test/meson.build b/test/test/meson.build index 2457a2adb..e8ddb76e3 100644 --- a/test/test/meson.build +++ b/test/test/meson.build @@ -111,7 +111,6 @@ test_deps = ['acl', 'member', 'pipeline', 'port', - 'power', 'reorder', 'ring', 'timer' @@ -228,6 +227,9 @@ endif if dpdk_conf.has('RTE_LIBRTE_RING_PMD') test_deps += 'pmd_ring' endif +if dpdk_conf.has('RTE_LIBRTE_POWER') + test_deps += 'power' +endif test_dep_objs = [] foreach d:test_deps -- 2.14.1
[dpdk-dev] [PATCH v2 5/5] test/test: fix dependency on KNI lib for BSD meson build
The KNI library is not built on FreeBSD, so it needs to be an optional rather than a mandatory dependency for building the autotest binary. Fixes: b5dc795a8a55 ("test: build app with meson as dpdk-test") Signed-off-by: Bruce Richardson --- test/test/meson.build | 3 +++ 1 file changed, 3 insertions(+) diff --git a/test/test/meson.build b/test/test/meson.build index e8ddb76e3..eb3d87a4d 100644 --- a/test/test/meson.build +++ b/test/test/meson.build @@ -230,6 +230,9 @@ endif if dpdk_conf.has('RTE_LIBRTE_POWER') test_deps += 'power' endif +if dpdk_conf.has('RTE_LIBRTE_KNI') + test_deps += 'kni' +endif test_dep_objs = [] foreach d:test_deps -- 2.14.1
Re: [dpdk-dev] [PATCH v1] build: add more implementers' IDs and PNs for Arm platforms
On Wed, Jan 31, 2018 at 03:39:19PM +0800, Herbert Guan wrote: > 1) Add native PN option '-march=native' to allow automatic detection. >Set 'arm_force_native_march' to 'true' in config/arm/meson.build >to use native PN option. > 2) Add implementer_pn option for part num selection in cross compile > 3) Add known Arm cortex PN support > 4) Add known implementers' IDs (use generic flags/archs by default) > 5) Sync build options with config/common_armv8a_linuxapp > > Signed-off-by: Herbert Guan > --- Is it intended to get this into 18.02, or can it be delayed till 18.05? Pavan, can you please review, as author of the existing ARM-specific meson code? Thanks, /Bruce
Re: [dpdk-dev] [PATCH 1/2] vhost: add flag for built-in virtio_net.c driver
Hi Stefan, On Wed, Jan 31, 2018 at 05:46:50PM +, Stefan Hajnoczi wrote: > The librte_vhost API is used in two ways: > 1. As a vhost net device backend via rte_vhost_enqueue/dequeue_burst(). This is how DPDK vhost-user was first implemented. > 2. As a library for implementing vhost device backends. This is how DPDK vhost-user was extended later, and vhost-user scsi is the first one being added. > There is no distinction between the two at the API level or in the > librte_vhost implementation. For example, device state is kept in > "struct virtio_net" regardless of whether this is actually a net device > backend or whether the built-in virtio_net.c driver is in use. Indeed. virtio_net should be renamed to "vhost_dev" or something like this. It's part of something unfinished in the last vhost-user extension refactoring. > > The virtio_net.c driver should be a librte_vhost API client just like > the vhost-scsi code and have no special access to vhost.h internals. > Unfortunately, fixing this requires significant librte_vhost API > changes. The way I thought was to move virtio_net.c completely to the vhost PMD (drivers/net/vhost), and let vhost-user just be a generic lib without any device-specific stuff. Unfortunately, it cannot be done right now, as there are still a lot of applications using rte_vhost_enqueue/dequeue_burst directly, for example, OVS. > This patch takes a different approach: keep the librte_vhost API > unchanged but track whether the built-in virtio_net.c driver is in use. > See the next patch for a bug fix that requires knowledge of whether > virtio_net.c is in use. LGTM. Thanks. --yliu
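To make the two usage modes above concrete, here is a minimal C sketch using the public librte_vhost API of this period (rte_vhost_driver_register(), vhost_device_ops callbacks, rte_vhost_driver_start()); error handling and the actual data path are omitted.

#include <rte_vhost.h>

/* Mode 2: librte_vhost as a library for a custom device backend
 * (e.g. vhost-scsi), driven through per-device callbacks. */
static int
new_device(int vid)
{
	/* device-specific setup for this vhost connection */
	(void)vid;
	return 0;
}

static void
destroy_device(int vid)
{
	(void)vid;
}

static const struct vhost_device_ops ops = {
	.new_device = new_device,
	.destroy_device = destroy_device,
};

static int
start_backend(const char *path)
{
	if (rte_vhost_driver_register(path, 0) != 0)
		return -1;
	if (rte_vhost_driver_callback_register(path, &ops) != 0)
		return -1;
	return rte_vhost_driver_start(path);
}

/* Mode 1: the built-in virtio_net.c driver, where the application moves
 * packets with rte_vhost_enqueue_burst()/rte_vhost_dequeue_burst() and
 * librte_vhost itself handles the virtio-net virtqueue semantics. */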
Re: [dpdk-dev] [PATCH 0/2] vhost: fix VIRTIO_NET_F_MQ vhost_scsi breakage
On Wed, Jan 31, 2018 at 05:46:49PM +, Stefan Hajnoczi wrote: > These patches fix a recent regression in librte_vhost that breaks the > vhost_scsi example application. vhost_user.c assumes all devices are vhost > net > backends when handling the VIRTIO_NET_F_MQ feature bit. The code is triggered > by vhost scsi devices and causes virtqueues to be removed. See Patch 2 for > details. > > Patch 1 puts the infrastructure in place to distinguish between the built-in > virtio_net.c driver and generic vhost device backend usage. > > Patch 2 fixes the regression by handling VIRTIO_NET_F_MQ only when the > built-in > virtio_net.c driver is in use. > > Stefan Hajnoczi (2): > vhost: add flag for built-in virtio_net.c driver > vhost: only drop vqs with built-in virtio_net.c driver Series Acked-by: Yuanhan Liu Thanks. --yliu > > lib/librte_vhost/vhost.h | 3 +++ > lib/librte_vhost/socket.c | 15 +++ > lib/librte_vhost/vhost.c | 17 - > lib/librte_vhost/vhost_user.c | 3 ++- > lib/librte_vhost/virtio_net.c | 14 ++ > 5 files changed, 50 insertions(+), 2 deletions(-) > > -- > 2.14.3
Re: [dpdk-dev] [PATCH] examples/vhost_scsi: drop unimplemented EVENT_IDX feature bit
On Wed, Jan 31, 2018 at 05:48:28PM +, Stefan Hajnoczi wrote: > The vhost_scsi example application negotiates the > VIRTIO_RING_F_EVENT_IDX feature bit but does not honor it when accessing > vrings. > > In particular, commit e37ff954405addb8ea422426a2d162d00dcad196 ("vhost: > support virtqueue interrupt/notification suppression") broke vring call > because vq->last_used_idx is never updated by vhost_scsi. The > vq->last_used_idx field is not even available via the librte_vhost > public API, so VIRTIO_RING_F_EVENT_IDX is currently only usable by the > built-in virtio_net.c driver in librte_vhost. > > This patch drops VIRTIO_RING_F_EVENT_IDX from vhost_scsi so that vring > call works again. > > Cc: Changpeng Liu > Cc: Junjie Chen > Signed-off-by: Stefan Hajnoczi Acked-by: Yuanhan Liu Thanks. --yliu > --- > examples/vhost_scsi/vhost_scsi.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/examples/vhost_scsi/vhost_scsi.c > b/examples/vhost_scsi/vhost_scsi.c > index da01ad378..3cb4383e9 100644 > --- a/examples/vhost_scsi/vhost_scsi.c > +++ b/examples/vhost_scsi/vhost_scsi.c > @@ -21,7 +21,6 @@ > #include "scsi_spec.h" > > #define VIRTIO_SCSI_FEATURES ((1 << VIRTIO_F_NOTIFY_ON_EMPTY) |\ > - (1 << VIRTIO_RING_F_EVENT_IDX) |\ > (1 << VIRTIO_SCSI_F_INOUT) |\ > (1 << VIRTIO_SCSI_F_CHANGE)) > > -- > 2.14.3
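For illustration, a hedged sketch of another way a backend could keep EVENT_IDX out of the negotiated feature set, using the existing rte_vhost_driver_disable_features() call; it assumes the Linux virtio_ring header for the bit definition and is not the fix applied here (the patch simply drops the bit from VIRTIO_SCSI_FEATURES).

#include <linux/virtio_ring.h>   /* VIRTIO_RING_F_EVENT_IDX */
#include <rte_vhost.h>

/* Clear EVENT_IDX from whatever feature set was registered for this
 * vhost-user socket, so the guest can never negotiate it. */
static int
drop_event_idx(const char *socket_path)
{
	return rte_vhost_driver_disable_features(socket_path,
			1ULL << VIRTIO_RING_F_EVENT_IDX);
}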
Re: [dpdk-dev] [PATCH v1] doc: update definition of lcore id and lcore index
On Wed, Jan 31, 2018 at 04:46:46PM +, Marko Kovacevic wrote: > Added examples in lcore index for better > explanation on various examples, > Sited examples for lcore id. > > Signed-off-by: Marko Kovacevic > --- > lib/librte_eal/common/include/rte_lcore.h | 17 +++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_lcore.h > b/lib/librte_eal/common/include/rte_lcore.h > index d84bcff..349ac36 100644 > --- a/lib/librte_eal/common/include/rte_lcore.h > +++ b/lib/librte_eal/common/include/rte_lcore.h > @@ -57,7 +57,9 @@ RTE_DECLARE_PER_LCORE(unsigned, _lcore_id); /**< Per > thread "lcore id". */ > RTE_DECLARE_PER_LCORE(rte_cpuset_t, _cpuset); /**< Per thread "cpuset". */ > > /** > - * Return the ID of the execution unit we are running on. > + * Return the Application thread ID of the execution unit. > + * If option '-l' or '-c' is provided the lcore ID is the actual > + * CPU ID. Good idea to clarify this! I'd suggest the second sentence might do with being reworked a little though - the lcore ID will also be the processor id even if no args i.e. no -c or -l arguments are passed. How about: * Note: in most cases the lcore id returned here will also correspond * to the processor id of the CPU on which the thread is pinned, this * will not be the case if the user has explicitly changed the thread to * core affinities using --lcores EAL argument e.g. --lcores '(0-3)@10' * to run threads with lcore IDs 0, 1, 2 and 3 on physical core 10. It's longer, I know, but hopefully a bit clearer for the user. /Bruce
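A small C sketch of the existing lcore helpers this documentation covers; the --lcores mapping mentioned in the comment is an assumed example invocation, matching the one discussed above.

#include <stdio.h>
#include <rte_lcore.h>
#include <rte_launch.h>

/* With "--lcores '(0-3)@10'" each worker still reports lcore IDs 0-3
 * from rte_lcore_id(), although all of them are pinned to physical
 * core 10 -- the point of the clarification above. */
static int
show_lcore(void *arg)
{
	unsigned int lcore = rte_lcore_id();

	(void)arg;
	printf("lcore id %u, lcore index %d, socket %u\n",
	       lcore, rte_lcore_index(lcore), rte_lcore_to_socket_id(lcore));
	return 0;
}

/* Typically launched on every core after rte_eal_init():
 *     rte_eal_mp_remote_launch(show_lcore, NULL, CALL_MASTER);
 */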
Re: [dpdk-dev] [PATCH] bnxt: Fix to set timestamp flag as well in the offload flags for the received pkt in case of PTP offload
On 2/1/2018 5:09 AM, Somnath Kotur wrote: Hi Somnath, Can you please keep patch title brief (around 50 characters) and put more content in commit log? > Signed-off-by: Somnath Kotur > --- > drivers/net/bnxt/bnxt_rxr.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c > index 82c93d6..430990d 100644 > --- a/drivers/net/bnxt/bnxt_rxr.c > +++ b/drivers/net/bnxt/bnxt_rxr.c > @@ -459,7 +459,7 @@ static int bnxt_rx_pkt(struct rte_mbuf **rx_pkt, > > if ((rxcmp->flags_type & rte_cpu_to_le_16(RX_PKT_CMPL_FLAGS_MASK)) == >RX_PKT_CMPL_FLAGS_ITYPE_PTP_W_TIMESTAMP) > - mbuf->ol_flags |= PKT_RX_IEEE1588_PTP; > + mbuf->ol_flags |= PKT_RX_IEEE1588_PTP | PKT_RX_IEEE1588_TMST; > > if (agg_buf) > bnxt_rx_pages(rxq, mbuf, &tmp_raw_cons, agg_buf); >
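For context, a hedged sketch of how a receive path could consume the two flags this fix sets; the mbuf flags and the timesync call are existing DPDK APIs, while the surrounding logic is purely illustrative.

#include <time.h>
#include <rte_mbuf.h>
#include <rte_ethdev.h>

static void
handle_ptp_rx(uint16_t port_id, struct rte_mbuf *m)
{
	struct timespec ts;

	if ((m->ol_flags & PKT_RX_IEEE1588_PTP) == 0)
		return; /* not a PTP packet */

	/* PKT_RX_IEEE1588_TMST indicates the PMD latched an Rx timestamp
	 * that can be read back through the timesync API. */
	if ((m->ol_flags & PKT_RX_IEEE1588_TMST) &&
	    rte_eth_timesync_read_rx_timestamp(port_id, &ts, 0) == 0) {
		/* process ts.tv_sec / ts.tv_nsec */
	}
}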
Re: [dpdk-dev] [PATCH v3 0/2] vhost: IOTLB fixes
On Mon, Jan 29, 2018 at 05:30:38PM +0100, Maxime Coquelin wrote: > First patch of the series fixes OOM handling from the IOTLB > mempool, the second one removes the pending IOTLB entry when > sending the IOTLB miss request fails. > > Changes since v2: > - > - patch 2: Fix error message with correct IOVA > > Changes since v1: > - > - Make log levels consistent (Tiwei) > - Remove pending IOTLB entry if IOTLB miss request sending failed (Tiwei) > > Maxime Coquelin (2): > vhost: fix iotlb pool out-of-memory handling > vhost: remove pending IOTLB entry if IOTLB MISS request sending failed Series Acked-by: Yuanhan Liu Thanks. --yliu > > lib/librte_vhost/iotlb.c | 20 ++-- > lib/librte_vhost/iotlb.h | 3 +++ > lib/librte_vhost/vhost.c | 13 ++--- > 3 files changed, 27 insertions(+), 9 deletions(-) > > -- > 2.14.3
[dpdk-dev] [PATCH v1] eal: add error check for core options
Error information on the current core usage list,mask and map were incomplete. Added states to differentiate core usage and to inform user. Signed-off-by: Marko Kovacevic --- doc/guides/testpmd_app_ug/run_app.rst | 4 lib/librte_eal/common/eal_common_options.c | 33 +++--- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst index 46da1df..26500bf 100644 --- a/doc/guides/testpmd_app_ug/run_app.rst +++ b/doc/guides/testpmd_app_ug/run_app.rst @@ -62,6 +62,10 @@ See the DPDK Getting Started Guides for more information on these options. The grouping ``()`` can be omitted for single element group. The ``@`` can be omitted if cpus and lcores have the same value. +.. Note:: + When ``--lcores`` is in use, the options ``-l`` and ``-c`` cannot be used. + + * ``--master-lcore ID`` Core ID that is used as master. diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index b6d2762..6604c64 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -57,6 +57,9 @@ #include "eal_filesystem.h" #define BITS_PER_HEX 4 +#define LCORE_OPT_LST 1 +#define LCORE_OPT_MSK 2 +#define LCORE_OPT_MAP 3 const char eal_short_options[] = @@ -1028,7 +1031,15 @@ eal_parse_common_option(int opt, const char *optarg, RTE_LOG(ERR, EAL, "invalid coremask\n"); return -1; } - core_parsed = 1; + + if (core_parsed) { + RTE_LOG(ERR, EAL, "Core Mask Option is ignored, because core (%s) is set!\n", + (core_parsed == LCORE_OPT_LST)?"LIST" : + (core_parsed == LCORE_OPT_MAP)?"MAP" : "Unknown"); + return -1; + } + + core_parsed = LCORE_OPT_MSK; break; /* corelist */ case 'l': @@ -1036,7 +1047,15 @@ eal_parse_common_option(int opt, const char *optarg, RTE_LOG(ERR, EAL, "invalid core list\n"); return -1; } - core_parsed = 1; + + if (core_parsed) { + RTE_LOG(ERR, EAL, "Core List Option is ignored, because core (%s) is set!\n", + (core_parsed == LCORE_OPT_MSK)?"LIST" : + (core_parsed == LCORE_OPT_MAP)?"MAP" : "Unknown"); + return -1; + } + + core_parsed = LCORE_OPT_LST; break; /* service coremask */ case 's': @@ -1156,7 +1175,15 @@ eal_parse_common_option(int opt, const char *optarg, OPT_LCORES "\n"); return -1; } - core_parsed = 1; + + if (core_parsed) { + RTE_LOG(ERR, EAL, "Core Map Option is ignored, because core (%s) is set!\n", + (core_parsed == LCORE_OPT_LST)?"LIST" : + (core_parsed == LCORE_OPT_MSK)?"MASK" : "Unknown"); + return -1; + } + + core_parsed = LCORE_OPT_MAP; break; /* don't know what to do, leave this to caller */ -- 2.9.5
Re: [dpdk-dev] [PATCH] bnxt: Fix to set timestamp flag as well in the offload flags for the received pkt in case of PTP offload
Sure , will do a response Thanks Som On Feb 1, 2018 8:54 PM, "Ferruh Yigit" wrote: > On 2/1/2018 5:09 AM, Somnath Kotur wrote: > > Hi Somnath, > > Can you please keep patch title brief (around 50 characters) and put more > content in commit log? > > > Signed-off-by: Somnath Kotur > > --- > > drivers/net/bnxt/bnxt_rxr.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c > > index 82c93d6..430990d 100644 > > --- a/drivers/net/bnxt/bnxt_rxr.c > > +++ b/drivers/net/bnxt/bnxt_rxr.c > > @@ -459,7 +459,7 @@ static int bnxt_rx_pkt(struct rte_mbuf **rx_pkt, > > > > if ((rxcmp->flags_type & rte_cpu_to_le_16(RX_PKT_CMPL_FLAGS_MASK)) > == > >RX_PKT_CMPL_FLAGS_ITYPE_PTP_W_TIMESTAMP) > > - mbuf->ol_flags |= PKT_RX_IEEE1588_PTP; > > + mbuf->ol_flags |= PKT_RX_IEEE1588_PTP | > PKT_RX_IEEE1588_TMST; > > > > if (agg_buf) > > bnxt_rx_pages(rxq, mbuf, &tmp_raw_cons, agg_buf); > > > >
Re: [dpdk-dev] [PATCH 3/6] test/test: mark tests as skipped when required lib not available
Hi Bruce, On 31/1/2018 5:42 PM, Bruce Richardson wrote: The power management and KNI libraries are not compiled on a FreeBSD platform, which means that the tests can't run. Add in stub code for these cases, allowing the tests to still be compiled, but to report as skipped in those cases. Signed-off-by: Bruce Richardson CC: Ferruh Yigit CC: David Hunt --- test/test/test_kni.c| 13 + test/test/test_power.c | 12 test/test/test_power_acpi_cpufreq.c | 11 +++ test/test/test_power_kvm_vm.c | 11 +++ 4 files changed, 47 insertions(+) --snip-- Acked-by David Hunt
Re: [dpdk-dev] [PATCH] net/ena: fix jumbo support in Rx offloads flags
2018-02-01 14:06 GMT+01:00 Rafal Kozik : > > ENA device supports Rx jumbo frames and such information needs to > be provided in the offloads flags. > > Fixes: 7369f88f88c0 ("net/ena: convert to new Rx offloads API") > > Signed-off-by: Rafal Kozik Signed-off-by: Michal Krawczyk > > --- > drivers/net/ena/ena_ethdev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c > index 83e0ae2..3588384 100644 > --- a/drivers/net/ena/ena_ethdev.c > +++ b/drivers/net/ena/ena_ethdev.c > @@ -1561,6 +1561,8 @@ static void ena_infos_get(struct rte_eth_dev *dev, > DEV_RX_OFFLOAD_UDP_CKSUM | > DEV_RX_OFFLOAD_TCP_CKSUM; > > + rx_feat |= DEV_RX_OFFLOAD_JUMBO_FRAME; > + > /* Inform framework about available features */ > dev_info->rx_offload_capa = rx_feat; > dev_info->rx_queue_offload_capa = rx_feat; > -- > 2.7.4 >
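To show how an application might act on the newly advertised capability, a rough sketch using the new offloads API; the field names come from rte_ethdev.h, but the configuration flow is simplified (the 17.11-era transitional details such as rxmode.ignore_offload_bitfield are left out), so treat it as an outline only.

#include <errno.h>
#include <string.h>
#include <rte_ethdev.h>

/* Request jumbo frames only if the PMD advertises the capability. */
static int
enable_jumbo(uint16_t port_id, uint32_t max_pkt_len)
{
	struct rte_eth_dev_info dev_info;
	struct rte_eth_conf conf;

	rte_eth_dev_info_get(port_id, &dev_info);
	if ((dev_info.rx_offload_capa & DEV_RX_OFFLOAD_JUMBO_FRAME) == 0)
		return -ENOTSUP;

	memset(&conf, 0, sizeof(conf));
	conf.rxmode.offloads = DEV_RX_OFFLOAD_JUMBO_FRAME;
	conf.rxmode.max_rx_pkt_len = max_pkt_len;
	return rte_eth_dev_configure(port_id, 1, 1, &conf);
}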
Re: [dpdk-dev] [PATCH v1] eal: add error check for core options
On 01-Feb-18 3:39 PM, Marko Kovacevic wrote: Error information on the current core usage list,mask and map were incomplete. Added states to differentiate core usage and to inform user. Nitpicking, but line width on commit message is a little on the short side... Signed-off-by: Marko Kovacevic --- doc/guides/testpmd_app_ug/run_app.rst | 4 lib/librte_eal/common/eal_common_options.c | 33 +++--- 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst index 46da1df..26500bf 100644 --- a/doc/guides/testpmd_app_ug/run_app.rst +++ b/doc/guides/testpmd_app_ug/run_app.rst @@ -62,6 +62,10 @@ See the DPDK Getting Started Guides for more information on these options. The grouping ``()`` can be omitted for single element group. The ``@`` can be omitted if cpus and lcores have the same value. +.. Note:: + When ``--lcores`` is in use, the options ``-l`` and ``-c`` cannot be used. + + * ``--master-lcore ID`` Core ID that is used as master. diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index b6d2762..6604c64 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -57,6 +57,9 @@ #include "eal_filesystem.h" #define BITS_PER_HEX 4 +#define LCORE_OPT_LST 1 +#define LCORE_OPT_MSK 2 +#define LCORE_OPT_MAP 3 const char eal_short_options[] = @@ -1028,7 +1031,15 @@ eal_parse_common_option(int opt, const char *optarg, RTE_LOG(ERR, EAL, "invalid coremask\n"); return -1; } - core_parsed = 1; + + if (core_parsed) { + RTE_LOG(ERR, EAL, "Core Mask Option is ignored, because core (%s) is set!\n", + (core_parsed == LCORE_OPT_LST)?"LIST" : + (core_parsed == LCORE_OPT_MAP)?"MAP" : "Unknown"); i think "LIST" and "MAP" are terribly undescriptive. It would be better to put the respective cmdline arguments ("-l" or "-c") there instead. Same applies to other cases. Otherwise, Reviewed-by: Anatoly Burakov + return -1; + } + + core_parsed = LCORE_OPT_MSK; break; /* corelist */ case 'l': @@ -1036,7 +1047,15 @@ eal_parse_common_option(int opt, const char *optarg, RTE_LOG(ERR, EAL, "invalid core list\n"); return -1; } - core_parsed = 1; + + if (core_parsed) { + RTE_LOG(ERR, EAL, "Core List Option is ignored, because core (%s) is set!\n", + (core_parsed == LCORE_OPT_MSK)?"LIST" : + (core_parsed == LCORE_OPT_MAP)?"MAP" : "Unknown"); + return -1; + } + + core_parsed = LCORE_OPT_LST; break; /* service coremask */ case 's': @@ -1156,7 +1175,15 @@ eal_parse_common_option(int opt, const char *optarg, OPT_LCORES "\n"); return -1; } - core_parsed = 1; + + if (core_parsed) { + RTE_LOG(ERR, EAL, "Core Map Option is ignored, because core (%s) is set!\n", + (core_parsed == LCORE_OPT_LST)?"LIST" : + (core_parsed == LCORE_OPT_MSK)?"MASK" : "Unknown"); + return -1; + } + + core_parsed = LCORE_OPT_MAP; break; /* don't know what to do, leave this to caller */ -- Thanks, Anatoly
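A hypothetical, self-contained sketch of the wording change suggested above, reporting the conflicting option itself ("-c", "-l" or "--lcores") rather than the internal LIST/MASK/MAP names; it is illustration only, not the merged code, and uses fprintf in place of RTE_LOG to stay standalone.

#include <stdio.h>

#define LCORE_OPT_LST 1
#define LCORE_OPT_MSK 2
#define LCORE_OPT_MAP 3

static const char *
core_opt_name(int core_parsed)
{
	switch (core_parsed) {
	case LCORE_OPT_LST: return "-l";
	case LCORE_OPT_MSK: return "-c";
	case LCORE_OPT_MAP: return "--lcores";
	default: return "unknown";
	}
}

/* Called when a second core-selection option is seen, e.g. "-c" after "-l". */
static int
reject_conflicting_core_opt(const char *new_opt, int core_parsed)
{
	fprintf(stderr, "Option %s is ignored because %s was already given\n",
		new_opt, core_opt_name(core_parsed));
	return -1;
}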