[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
The rte_pktmbuf_detach() function should decrease refcnt on a direct buffer. Signed-off-by: Hiroyuki Mikita --- lib/librte_mbuf/rte_mbuf.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index 529debb..3b117ca 100644 --- a/lib/librte_mbuf/rte_mbuf.h +++ b/lib/librte_mbuf/rte_mbuf.h @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m) */ static inline void rte_pktmbuf_detach(struct rte_mbuf *m) { + struct rte_mbuf *md = rte_mbuf_from_indirect(m); struct rte_mempool *mp = m->pool; uint32_t mbuf_size, buf_len, priv_size; + rte_mbuf_refcnt_update(md, -1); priv_size = rte_pktmbuf_priv_size(mp); mbuf_size = sizeof(struct rte_mbuf) + priv_size; buf_len = rte_pktmbuf_data_room_size(mp); @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m) if (RTE_MBUF_INDIRECT(m)) { struct rte_mbuf *md = rte_mbuf_from_indirect(m); rte_pktmbuf_detach(m); - if (rte_mbuf_refcnt_update(md, -1) == 0) + if (rte_mbuf_refcnt_read(md) == 0) __rte_mbuf_raw_free(md); } return m; -- 1.9.1
[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroyuki Mikita > Sent: Sunday, May 15, 2016 4:51 PM > To: olivier.matz at 6wind.com > Cc: dev at dpdk.org > Subject: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching > > The rte_pktmbuf_detach() function should decrease refcnt on a direct > buffer. Hmm, why is that? What's wrong with the current approach? Konstantin > > Signed-off-by: Hiroyuki Mikita > --- > lib/librte_mbuf/rte_mbuf.h | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > index 529debb..3b117ca 100644 > --- a/lib/librte_mbuf/rte_mbuf.h > +++ b/lib/librte_mbuf/rte_mbuf.h > @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf > *mi, struct rte_mbuf *m) > */ > static inline void rte_pktmbuf_detach(struct rte_mbuf *m) > { > + struct rte_mbuf *md = rte_mbuf_from_indirect(m); > struct rte_mempool *mp = m->pool; > uint32_t mbuf_size, buf_len, priv_size; > > + rte_mbuf_refcnt_update(md, -1); > priv_size = rte_pktmbuf_priv_size(mp); > mbuf_size = sizeof(struct rte_mbuf) + priv_size; > buf_len = rte_pktmbuf_data_room_size(mp); > @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > if (RTE_MBUF_INDIRECT(m)) { > struct rte_mbuf *md = rte_mbuf_from_indirect(m); > rte_pktmbuf_detach(m); > - if (rte_mbuf_refcnt_update(md, -1) == 0) > + if (rte_mbuf_refcnt_read(md) == 0) > __rte_mbuf_raw_free(md); > } > return m; > -- > 1.9.1
[dpdk-dev] [PATCH v4 6/8] virtio-user: add new virtual pci driver for virtio
Hi Yuanhan, > -Original Message- > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com] > Sent: Friday, May 13, 2016 12:45 PM > To: Tan, Jianfeng > Cc: dev at dpdk.org; Xie, Huawei; rich.lane at bigswitch.com; mst at > redhat.com; > nakajima.yoshihiro at lab.ntt.co.jp; p.fedin at samsung.com; > ann.zhuangyanying at huawei.com; mukawa at igel.co.jp; > nhorman at tuxdriver.com > Subject: Re: [PATCH v4 6/8] virtio-user: add new virtual pci driver for virtio > > On Fri, May 13, 2016 at 09:54:33AM +0800, Tan, Jianfeng wrote: > > > > So, I'd suggest something like following: > > > > if (is_vdev(..)) { > > > > > > The blocker issue of your suggestion is that we have no such condition. > > > > Previously, I use dev_type, but as David's comment said: > > That's not the only option. There should be others, for example, > checking the existence of virtio_user_device. Or even, you could > add a new flag inside virtio hw while initiating your vdev. > > > May I ask how many more such handling are needed, excluding the tx > queue > > header desc setup? And as stated, in generic, yes, we should try that. > > > > > > Those which need special handling: > > (1) vq->vq_ring_mem: it is set but never used, so it's out of question. > > (2) vq->virtio_net_hdr_mem and vring_hdr_desc_init > > vring_hdr_desc_init is common. > > > (3) vq->offset > > > > Just (2) and (3) so far. And the question is quite clear: where to put these > > two special handling. > > Apparently, you can't put it into the queue_setup(). And I still think > my proposal works great here. OK, since it's indeed inappropriate to put these special handlings inside queue_setup() from semantic perspective, I'll add them according to if hw->vdev_private == NULL in the driver. Thanks for suggestion. Thanks, Jianfeng > > --yliu
[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
Hi Konstantin, Now, the attach operation increases refcnt, but the detach does not decrease it. I think both operations should affect refcnt or not. Which design is intended? In "6.7. Direct and Indirect Buffers" of Programmer's Guide, it is mentioned that "...whenever an indirect buffer is attached to the direct buffer, the reference counter on the direct buffer is incremented. Similarly, whenever the indirect buffer is detached, the reference counter on the direct buffer is decremented." Regards, Hiroyuki 2016-05-16 9:05 GMT+09:00 Ananyev, Konstantin : > > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroyuki Mikita >> Sent: Sunday, May 15, 2016 4:51 PM >> To: olivier.matz at 6wind.com >> Cc: dev at dpdk.org >> Subject: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching >> >> The rte_pktmbuf_detach() function should decrease refcnt on a direct >> buffer. > > Hmm, why is that? > What's wrong with the current approach? > Konstantin > >> >> Signed-off-by: Hiroyuki Mikita >> --- >> lib/librte_mbuf/rte_mbuf.h | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h >> index 529debb..3b117ca 100644 >> --- a/lib/librte_mbuf/rte_mbuf.h >> +++ b/lib/librte_mbuf/rte_mbuf.h >> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf >> *mi, struct rte_mbuf *m) >> */ >> static inline void rte_pktmbuf_detach(struct rte_mbuf *m) >> { >> + struct rte_mbuf *md = rte_mbuf_from_indirect(m); >> struct rte_mempool *mp = m->pool; >> uint32_t mbuf_size, buf_len, priv_size; >> >> + rte_mbuf_refcnt_update(md, -1); >> priv_size = rte_pktmbuf_priv_size(mp); >> mbuf_size = sizeof(struct rte_mbuf) + priv_size; >> buf_len = rte_pktmbuf_data_room_size(mp); >> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m) >> if (RTE_MBUF_INDIRECT(m)) { >> struct rte_mbuf *md = rte_mbuf_from_indirect(m); >> rte_pktmbuf_detach(m); >> - if (rte_mbuf_refcnt_update(md, -1) == 0) >> + if (rte_mbuf_refcnt_read(md) == 0) >> __rte_mbuf_raw_free(md); >> } >> return m; >> -- >> 1.9.1 >
[dpdk-dev] [PATCH v4 6/8] virtio-user: add new virtual pci driver for virtio
On Mon, May 16, 2016 at 01:48:01AM +, Tan, Jianfeng wrote: > > On Fri, May 13, 2016 at 09:54:33AM +0800, Tan, Jianfeng wrote: > > > > > > So, I'd suggest something like following: > > > > > > if (is_vdev(..)) { > > > > > > > > > The blocker issue of your suggestion is that we have no such condition. > > > > > > Previously, I use dev_type, but as David's comment said: > > > > That's not the only option. There should be others, for example, > > checking the existence of virtio_user_device. Or even, you could > > add a new flag inside virtio hw while initiating your vdev. > > > > > May I ask how many more such handling are needed, excluding the tx > > queue > > > header desc setup? And as stated, in generic, yes, we should try that. > > > > > > > > > Those which need special handling: > > > (1) vq->vq_ring_mem: it is set but never used, so it's out of question. > > > (2) vq->virtio_net_hdr_mem and vring_hdr_desc_init > > > > vring_hdr_desc_init is common. > > > > > (3) vq->offset > > > > > > Just (2) and (3) so far. And the question is quite clear: where to put > > > these > > > two special handling. > > > > Apparently, you can't put it into the queue_setup(). And I still think > > my proposal works great here. > > OK, since it's indeed inappropriate to put these special handlings inside > queue_setup() from semantic perspective, I'll add them according to if > hw->vdev_private == NULL in the driver. I'm thinking maybe we could rename "vdev_private" to "virtio_user_dev" (or something like that), to make it explicit that it's for virtio user device. I mean, there should be no other vdevs after all. OTOH, using "hw->vdev_private != NULL" to say it's a virtio-user device is a bit weird; it doesn't even make too much sense. --yliu
[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
Hi Hiroyuki, > > Hi Konstantin, > > Now, the attach operation increases refcnt, but the detach does not decrease > it. > I think both operations should affect refcnt or not. > Which design is intended? > > In "6.7. Direct and Indirect Buffers" of Programmer's Guide, > it is mentioned that "...whenever an indirect buffer is attached to > the direct buffer, > the reference counter on the direct buffer is incremented. > Similarly, whenever the indirect buffer is detached, > the reference counter on the direct buffer is decremented." Well, right now the rte_pktmbuf_detach(struct rte_mbuf *m) just restores the fields of indirect mbufs to the default values, nothing more. Actual freeing of the mbuf it was attached to is done in the __rte_pktmbuf_prefree_seg(). I suppose the name rte_pktmbuf_detach() is a bit confusing here, might be, when created, it should be named rte_pktmbuf_restore() or so. About proposed changes - it would introduce an extra unnecessary read of refcnt (as it is a volatile field). If we'll decide to go that way, then I think rte_pktmbuf_detach() have to deal with freeing md. Something like that: static inline void rte_pktmbuf_detach(struct rte_mbuf *m) { struct rte_mbuf *md = rte_mbuf_from_indirect(m); /* former rte_pktmbuf_detach */ rte_pktmbuf_restore(m); if (rte_mbuf_refcnt_update(md, -1) == 0) __rte_mbuf_raw_free(md); } That might be a good thing in terms of API usability and clearness, but would cause a change in public API behaviour, so I am not sure it is worth it. Konstantin > > Regards, > Hiroyuki > > 2016-05-16 9:05 GMT+09:00 Ananyev, Konstantin intel.com>: > > > > > >> -Original Message- > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Hiroyuki Mikita > >> Sent: Sunday, May 15, 2016 4:51 PM > >> To: olivier.matz at 6wind.com > >> Cc: dev at dpdk.org > >> Subject: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching > >> > >> The rte_pktmbuf_detach() function should decrease refcnt on a direct > >> buffer. > > > > Hmm, why is that? > > What's wrong with the current approach? > > Konstantin > > > >> > >> Signed-off-by: Hiroyuki Mikita > >> --- > >> lib/librte_mbuf/rte_mbuf.h | 4 +++- > >> 1 file changed, 3 insertions(+), 1 deletion(-) > >> > >> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > >> index 529debb..3b117ca 100644 > >> --- a/lib/librte_mbuf/rte_mbuf.h > >> +++ b/lib/librte_mbuf/rte_mbuf.h > >> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct > >> rte_mbuf *mi, struct rte_mbuf *m) > >> */ > >> static inline void rte_pktmbuf_detach(struct rte_mbuf *m) > >> { > >> + struct rte_mbuf *md = rte_mbuf_from_indirect(m); > >> struct rte_mempool *mp = m->pool; > >> uint32_t mbuf_size, buf_len, priv_size; > >> > >> + rte_mbuf_refcnt_update(md, -1); > >> priv_size = rte_pktmbuf_priv_size(mp); > >> mbuf_size = sizeof(struct rte_mbuf) + priv_size; > >> buf_len = rte_pktmbuf_data_room_size(mp); > >> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > >> if (RTE_MBUF_INDIRECT(m)) { > >> struct rte_mbuf *md = rte_mbuf_from_indirect(m); > >> rte_pktmbuf_detach(m); > >> - if (rte_mbuf_refcnt_update(md, -1) == 0) > >> + if (rte_mbuf_refcnt_read(md) == 0) > >> __rte_mbuf_raw_free(md); > >> } > >> return m; > >> -- > >> 1.9.1 > >
[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
Hi Hiroyuki, On 05/15/2016 05:50 PM, Hiroyuki Mikita wrote: > The rte_pktmbuf_detach() function should decrease refcnt on a direct > buffer. > > Signed-off-by: Hiroyuki Mikita > --- > lib/librte_mbuf/rte_mbuf.h | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > index 529debb..3b117ca 100644 > --- a/lib/librte_mbuf/rte_mbuf.h > +++ b/lib/librte_mbuf/rte_mbuf.h > @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf > *mi, struct rte_mbuf *m) > */ > static inline void rte_pktmbuf_detach(struct rte_mbuf *m) > { > + struct rte_mbuf *md = rte_mbuf_from_indirect(m); > struct rte_mempool *mp = m->pool; > uint32_t mbuf_size, buf_len, priv_size; > > + rte_mbuf_refcnt_update(md, -1); > priv_size = rte_pktmbuf_priv_size(mp); > mbuf_size = sizeof(struct rte_mbuf) + priv_size; > buf_len = rte_pktmbuf_data_room_size(mp); > @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > if (RTE_MBUF_INDIRECT(m)) { > struct rte_mbuf *md = rte_mbuf_from_indirect(m); > rte_pktmbuf_detach(m); > - if (rte_mbuf_refcnt_update(md, -1) == 0) > + if (rte_mbuf_refcnt_read(md) == 0) > __rte_mbuf_raw_free(md); > } > return m; > Thanks for submitting this patch. I agree that rte_pktmbuf_attach() and rte_pktmbuf_detach() are not symmetrical, but I think your patch could trigger some race conditions. Example: - init: m, c1 and c2 are direct mbuf - rte_pktmbuf_attach(c1, m); # c1 becomes a clone of m - rte_pktmbuf_attach(c2, m); # c2 becomes another clone of m - rte_pktmbuf_free(m); - after that, we have: - m is a direct mbuf with refcnt = 2 - c1 is an indirect mbuf pointing to data of m - c2 is an indirect mbuf pointing to data of m - if we call rte_pktmbuf_free(c1) and rte_pktmbuf_free(c2) on 2 different cores at the same time, m can be freed twice because (rte_mbuf_refcnt_read(md) == 0) can be true on both cores. I think the proper way of doing would be to have rte_pktmbuf_detach() returning the value of rte_mbuf_refcnt_update(md, -1), ensuring that only one core will call _rte_mbuf_raw_free(). In the unit tests, in test_attach_from_different_pool(), the mbuf m is never freed due to this behavior. That shows the current api is a bit misleading. I think it should also be fixed in the patch. Another issue is that it will break the API. To avoid issues in applications relying on the current behavior of rte_pktmbuf_detach(), I'd say we should keep the function as-is and mark it as deprecated. Then, introduce a new function rte_pktmbuf_detach2() (or any better name :) ) that changes the behavior to what you suggest. An entry in the release note would also be helpful. The other option is to let things as-is and just document the behavior of rte_pktmbuf_detach(), explicitly saying that it is not symmetrical with the attach. But I'd prefer the first option. Thoughts ? Regards, Olivier
[dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
2016-05-16 11:46, Hiroyuki MIKITA: > Now, the attach operation increases refcnt, but the detach does not decrease > it. > I think both operations should affect refcnt or not. > Which design is intended? > > In "6.7. Direct and Indirect Buffers" of Programmer's Guide, > it is mentioned that "...whenever an indirect buffer is attached to > the direct buffer, > the reference counter on the direct buffer is incremented. > Similarly, whenever the indirect buffer is detached, > the reference counter on the direct buffer is decremented." The doc is the reference. The doxygen comment should explicit every details of the behaviour. And the unit tests must validate every parts of the behaviour. Probably there is a bug which is not (yet) tested. Please see the function testclone_testupdate_testdetach. Thanks
[dpdk-dev] [PATCH] rte mempool: division or modulo by zero
Hi Slawomir, On 05/12/2016 02:46 PM, Slawomir Mrozowicz wrote: > Fix issue reported by Coverity. > > Coverity ID 13243: Division or modulo by zero > In function call rte_mempool_xmem_size, division by expression total_size > which may be zero has undefined behavior. > > Fixes: 148f963fb532 ("xen: core library changes") > > Signed-off-by: Slawomir Mrozowicz > --- > lib/librte_mempool/rte_mempool.c | 18 +++--- > 1 file changed, 11 insertions(+), 7 deletions(-) > > diff --git a/lib/librte_mempool/rte_mempool.c > b/lib/librte_mempool/rte_mempool.c > index f8781e1..01668c1 100644 > --- a/lib/librte_mempool/rte_mempool.c > +++ b/lib/librte_mempool/rte_mempool.c > @@ -327,15 +327,19 @@ rte_mempool_calc_obj_size(uint32_t elt_size, uint32_t > flags, > size_t > rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift) > { > - size_t n, pg_num, pg_sz, sz; > + size_t n, pg_num, pg_sz; > + size_t sz = 0; > > - pg_sz = (size_t)1 << pg_shift; > + if (elt_sz > 0) { > + pg_sz = (size_t)1 << pg_shift; > + n = pg_sz / elt_sz; > > - if ((n = pg_sz / elt_sz) > 0) { > - pg_num = (elt_num + n - 1) / n; > - sz = pg_num << pg_shift; > - } else { > - sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num; > + if (n > 0) { > + pg_num = (elt_num + n - 1) / n; > + sz = pg_num << pg_shift; > + } else { > + sz = RTE_ALIGN_CEIL(elt_sz, pg_sz) * elt_num; > + } > } > > return sz; > I think it would be clearer (either for the patch and the code) to avoid an additional indent, and do something like that: size_t rte_mempool_xmem_size(uint32_t elt_num, size_t elt_sz, uint32_t pg_shift) { if (elt_sz == 0) return 0; /* same code as before */ It will also facilitate the merge with http://patchwork.dpdk.org/dev/patchwork/patch/12057/ Could you please submit a v2 with this logic? Thanks, Olivier
[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages
2016-05-13 15:48, Wiles, Keith: > I create this new tool to combine some information and use /sys/devices > instead. What I was hoping was some of you could try this script and see if > it works correctly. Also I was hope to find out if this script is useful and > what other features you would like in this type of tool. What is the intent of this script? Is it to be used for bug report? There already have some tools to display system informations. Why adding one more? Examples of useful tools: hwloc/lstopo, lspci, hugeadm.
[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues
This allows releasing RX/TX queue memory. --- We're using DPDK 16.04 and have a test suite which performs a sequence of separate tests of the type allocate mempool rte_eth_dev_configure(port, n_rxq, n_txq, ...) setup rx/tx queues rte_eth_dev_start(port) stop rx/tx queues rte_eth_dev_stop(port) -> rte_eth_dev_configure(port, 0, 0, ...) check that there are no leaks from the mempool The crucial point is the marked line above. This is done so that the rx_queue_release/tx_queue_release callbacks in the PMD is called, so that mbufs allocated by the driver is released. Without this patch, this explicitly isn't allowed. Is there a particular reason why it shouldn't? It was introduced in d505ba80a165a9735f3d9d3c6ab68a7bd85f271b "ethdev: support unidirectional configuration" lib/librte_ether/rte_ethdev.c | 5 - 1 file changed, 5 deletions(-) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index a31018e..5481d45 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q, */ (*dev->dev_ops->dev_infos_get)(dev, &dev_info); - if (nb_rx_q == 0 && nb_tx_q == 0) { - RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx queue cannot be 0\n", port_id); - return -EINVAL; - } - if (nb_rx_q > dev_info.max_rx_queues) { RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_rx_queues=%d > %d\n", port_id, nb_rx_q, dev_info.max_rx_queues); -- 1.9.1
[dpdk-dev] [PATCH] cfgfile: fix integer overflow
2016-04-28 11:09, Dumitrescu, Cristian: > From: Kobylinski, MichalX > > Fix issue reported by Coverity. > > > > Coverity ID 13289: Integer overflowed argument: The argument will be too > > small or even negative, likely resulting in unexpected behavior (for > > example, under-allocation in a memory allocation function). > > In rte_cfgfile_load: An integer overflow occurs, with the overflowed > > value used as an argument to a function > > > > Fixes: eaafbad419bf ("cfgfile: library to interpret config files") > > > > Signed-off-by: Michal Kobylinski > > I don't understand the root issue here, can you please explain? > > It looks to me that "end" is always going to point to a location bigger or > equal to &buffer[1]. So the second parameter of _strip function is always > going to be a positive number (0 included). Michal, any answer please?
[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues
> -Original Message- > From: Simon Kagstrom [mailto:simon.kagstrom at netinsight.net] > Sent: Monday, May 16, 2016 10:34 AM > To: dev at dpdk.org; thomas.monjalon at 6wind.com; Pattan, Reshma > > Subject: [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX > queues > > This allows releasing RX/TX queue memory. > --- > > Without this patch, this explicitly isn't allowed. Is there a particular > reason why it > shouldn't? It was introduced in > > d505ba80a165a9735f3d9d3c6ab68a7bd85f271b > > "ethdev: support unidirectional configuration" > > lib/librte_ether/rte_ethdev.c | 5 - > 1 file changed, 5 deletions(-) > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index > a31018e..5481d45 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t > nb_rx_q, uint16_t nb_tx_q, >*/ > (*dev->dev_ops->dev_infos_get)(dev, &dev_info); > > - if (nb_rx_q == 0 && nb_tx_q == 0) { > - RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx > queue cannot be 0\n", port_id); > - return -EINVAL; > - } > - This was added to allow devices, at least with one direction (RX/TX) supported. As, devices with both directions disabled doesn't make sense right? Thanks, Reshma
[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues
On 2016-05-16 12:24, Pattan, Reshma wrote: >> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c >> index >> a31018e..5481d45 100644 >> --- a/lib/librte_ether/rte_ethdev.c >> +++ b/lib/librte_ether/rte_ethdev.c >> @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t >> nb_rx_q, uint16_t nb_tx_q, >> */ >> (*dev->dev_ops->dev_infos_get)(dev, &dev_info); >> >> -if (nb_rx_q == 0 && nb_tx_q == 0) { >> -RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx >> queue cannot be 0\n", port_id); >> -return -EINVAL; >> -} > > This was added to allow devices, at least with one direction (RX/TX) > supported. As, devices with both directions disabled doesn't make sense > right? Well, not for running them, no. But this is a part of the shutdown procedure between tests (I should have been more clear I guess). As far as I can see in the code, rte_eth_dev_configure() is the only point which actually calls {rx,tx}_queue_release(), so without this call, we can't get the memory pool full again. So basically, our test suite looks like rte_eth_dev_configure(port, 32, 32); // For example rte_eth_dev_configure(port, 0, 0); Check that the mempool is full again rte_eth_dev_configure(port, 32, 32); rte_eth_dev_configure(port, 0, 0); Check that the mempool is full again ... And without this fix, the mempool check fails since a few of the buffers are tied up in the RX descriptor ring of the PMD. // Simon
[dpdk-dev] [RFC PATCH v2 0/3] Remove string operations from xstats
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton > Sent: Friday, May 6, 2016 12:11 PM > To: dev at dpdk.org > Subject: [dpdk-dev] [RFC PATCH v2 0/3] Remove string operations from > xstats > > The current extended ethernet statistics fetching involve doing several > string operations, which causes performance issues if there are lots of > statistics and/or network interfaces. This RFC patchset changes the API > for xstats to use integer identifiers instead of strings and implements this > new API for the ixgbe driver. Others drivers to follow. > > -- > > v2 changes: > * Fetching xstats count now seperate API function > * Added #define constants for some magic numbers > * Fixed bug with virtual function count fetching > * For non-xstats-supporting drivers, queue stats returned > * Some refactoring/cleanups > * Removed index assumption from example > > > Remy Horton (3): > rte: change xstats to use integer keys > drivers/net/ixgbe: change xstats to use integer keys > examples/ethtool: add xstats display command > > drivers/net/ixgbe/ixgbe_ethdev.c | 98 > - > examples/ethtool/ethtool-app/ethapp.c | 57 +++ > lib/librte_ether/rte_ethdev.c | 100 > +- > lib/librte_ether/rte_ethdev.h | 55 +++ > 4 files changed, 284 insertions(+), 26 deletions(-) > > -- > 2.5.5 Looks Great overall. Is there a need to update prog_guide/poll_mode_drv.rst with the new mods? BR Maryam
[dpdk-dev] [RFC PATCH v2 1/3] rte: change xstats to use integer keys
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton > Sent: Friday, May 6, 2016 12:11 PM > To: dev at dpdk.org > Subject: [dpdk-dev] [RFC PATCH v2 1/3] rte: change xstats to use integer > keys > > Signed-off-by: Remy Horton > --- Acked-by: Maryam Tahhan
[dpdk-dev] [PATCH] doc: move rel_notes instructions as comments
> -Original Message- > From: Olivier Matz [mailto:olivier.matz at 6wind.com] > Sent: Friday, May 13, 2016 2:28 PM > To: dev at dpdk.org > Cc: Mcnamara, John > Subject: [PATCH] doc: move rel_notes instructions as comments > > We don't want to have this instructions in the generated docs, so use > comments. It's also less confusing for people adding entries in the > documentation. > > Signed-off-by: Olivier Matz Good idea. Reviewed-by: John McNamara Acked-by: John McNamara
[dpdk-dev] possible kni bug and proposed fix
On 5/15/2016 5:48 AM, ALeX Wang wrote: > Hi, > > When using the kni module to test my application inside > debian (virtualbox) VM (kernel version 4.4), I get the > > "KNI: Out of memory" > > from syslog every time I `tcpreply` packets through > the kni interface. > > After checking source code, I saw that when I call > 'rte_kni_rx_burst()', no matter how many packets > are actually retrieved, we always call 'kni_allocate_mbufs()' > and try allocate 'MAX_MBUF_BURST_NUM' more > mbufs... I fix the issue via using this patch below, > > Could you confirm if this is an actual bug? > Hi Alex, I don't think this is a bug. kni_allocate_mbufs() will allocate MAX_MBUF_BURST_NUM mbufs as you mentioned. And will fill alloc_q with these mubfs _until it gets full_. And if the alloc_q is full and there are remaining mbufs, they will be freed. So for some cases this isn't the most optimized way, but there is no defect. Since you are getting "KNI: Out of memory", somewhere else can be missing freeing mbufs. mbufs freeing done in rte_kni_tx_burst(), I can guess two cases that can cause problem: a) not calling rte_kni_tx_burst() frequent, so that all free mbufs consumed. b) calling rte_kni_tx_burst() with number of mbufs bigger than MAX_MBUF_BURST_NUM, because this function frees at most MAX_MBUF_BURST_NUM of mbufs, but if you are calling calling rte_kni_tx_burst() with bigger numbers, this will cause mbufs to stuck in free_q Regards, ferruh
[dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF
Hi Wenzhuo, On 05/04/2016 11:10 PM, Wenzhuo Lu wrote: > When the physical link is down and recover later, > the VF link cannot recover until the user stop and > start it manually. > This patch implements the automatic recovery of VF > port. > The automatic recovery bases on the link up/down > message received from PF. When VF receives the link > up/down message, it will replace the RX/TX and > operation functions with fake ones to stop RX/TX > and any future operation. Then reset the VF port. > After successfully resetting the port, recover the > RX/TX and operation functions. > > Signed-off-by: Wenzhuo Lu > > [...] > > +void > +ixgbevf_dev_link_up_down_handler(struct rte_eth_dev *dev) > +{ > + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > + struct ixgbe_adapter *adapter = > + (struct ixgbe_adapter *)dev->data->dev_private; > + int diag; > + uint32_t vteiam; > + > + /* Only one working core need to performance VF reset */ > + if (rte_spinlock_trylock(&adapter->vf_reset_lock)) { > + /** > + * When fake rec/xmit is replaced, working thread may is running > + * into real RX/TX func, so wait long enough to assume all > + * working thread exit. The assumption is it will spend less > + * than 100us for each execution of RX and TX func. > + */ > + rte_delay_us(100); > + > + do { > + dev->data->dev_started = 0; > + ixgbevf_dev_stop(dev); > + rte_delay_us(100); If I understand well, ixgbevf_dev_link_up_down_handler() is called by ixgbevf_recv_pkts_fake() on a dataplane core. It means that the core that acquired the lock will loop during 100us + 1sec at least. If this core was also in charge of polling other queues of other ports, or timers, many packets will be dropped (even with a 100us loop). I don't think it is acceptable to actively wait inside a rx function. I think it would avoid many issues to delegate this work to the application, maybe by notifying it that the port is in a bad state and must be restarted. The application could then properly stop polling the queues, and stop and restart the port in a separate thread, without bothering the dataplane cores. Regards, Olivier
[dpdk-dev] [PATCH] power: fix argument cannot be negative
The title do not convey the real issue. We should be more concerned by an issue of "wrong error message" rather than an "argument" which "cannot be negative". 2016-04-20 16:39, Daniel Mrzyglod: > Fix issue reported by Coverity. > Coverity ID 13269 & 13266: It is better to put these references below and start with the explanation of the issue. > Function strerror(errno) has built strings only for non-negative errno values. > for negative values of errno it describe error as "Unknown error -errno" > to be more descriptive i put string "channel not found" taken from header. > > The negative argument will be interpreted as a very large unsigned value. OK. The next statement is probably a useless copy paste of the coverity report. > In send_msg: Negative value used as argument to a function expecting > a positive value (for example, size of buffer or allocation) Coverity issue: 13266 Coverity issue: 13269 > Fixes: 445c6528b55f ("power: common interface for guest and host") > > Signed-off-by: Daniel Mrzyglod [...] > RTE_LOG(ERR, GUEST_CHANNEL, "Error on channel '%s' > communications " > - "test: %s\n", fd_path, strerror(ret)); > + "test: %s\n", fd_path, ret > 0 ? strerror(ret) : > + "channel not connected"); The indent is messy. I sugest this: + RTE_LOG(ERR, GUEST_CHANNEL, + "Error on channel '%s' communications test: %s\n", + fd_path, ret > 0 ? strerror(ret) : + "channel not connected"); > - RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", strerror(ret)); > + RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", ret > 0 ? > strerror(ret) > + : "channel not connected"); + RTE_LOG(DEBUG, POWER, "Error sending message: %s\n", + ret > 0 ? strerror(ret) : "channel not connected"); Applied with above changes, thanks.
[dpdk-dev] [PATCH v3] i40e: configure MTU
Hi Beilei, On 05/13/2016 10:15 AM, Beilei Xing wrote: > This patch enables configuring MTU for i40e. > Since changing MTU needs to reconfigure queue, stop port first > before configuring MTU. > > Signed-off-by: Beilei Xing > --- > v3 changes: > Add frame size with extra I40E_VLAN_TAG_SIZE. > Delete i40e_dev_rx_init(pf) cause it will be called when port starts. > > v2 changes: > If mtu is not within the allowed range, return -EINVAL instead of -EBUSY. > Delete rxq reconfigure cause rxq reconfigure will be finished in > i40e_dev_rx_init. > > drivers/net/i40e/i40e_ethdev.c | 34 ++ > 1 file changed, 34 insertions(+) > > [...] > +static int > +i40e_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) > +{ > + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); > + struct rte_eth_dev_data *dev_data = pf->dev_data; > + uint32_t frame_size = mtu + ETHER_HDR_LEN > + + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE; > + int ret = 0; > + > + /* check if mtu is within the allowed range */ > + if ((mtu < ETHER_MIN_MTU) || (frame_size > I40E_FRAME_SIZE_MAX)) > + return -EINVAL; > + > + /* mtu setting is forbidden if port is start */ > + if (dev_data->dev_started) { > + PMD_DRV_LOG(ERR, > + "port %d must be stopped before configuration\n", > + dev_data->port_id); > + return -ENOTSUP; > + } I'm not convinced that ENOTSUP is the proper return value here. It is usually returned when a function is not implemented, which is not the case here: the function is implemented but is forbidden because the port is running. I saw that Julien commented on your v1 that the return value should be one of: - (0) if successful. - (-ENOTSUP) if operation is not supported. - (-ENODEV) if *port_id* invalid. - (-EINVAL) if *mtu* invalid. But I think your initial value (-EBUSY) was fine. Maybe it should be added in the API instead, with the following description: (-EBUSY) if the operation is not allowed when the port is running This would allow the application to take its dispositions to stop the port and restart it with the proper jumbo_frame argument. +CC Thomas which maintains ethdev API. Regards, Olivier
[dpdk-dev] [PATCH] examples/kni: unchecked return value
2016-05-09 11:38, Daniel Mrzyglod: > Fix issue reported by Coverity. > Coverity ID 30692 Better to put reference on top of Fixes: line. > If the function returns an error value, the error value may be mistaken for > a normal value. > > In kni_free_kni: Value returned from a function is not checked for errors > before being used One of the 2 sentences is enough. > Fixes: b475eb0bc400 ("examples/kni: new parameters") > > Signed-off-by: Daniel Mrzyglod > --- > examples/kni/main.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/examples/kni/main.c b/examples/kni/main.c > index a5297f2..dcecd09 100644 > --- a/examples/kni/main.c > +++ b/examples/kni/main.c > @@ -831,7 +831,8 @@ kni_free_kni(uint8_t port_id) > return -1; > > for (i = 0; i < p[port_id]->nb_kni; i++) { > - rte_kni_release(p[port_id]->kni[i]); > + if (rte_kni_release(p[port_id]->kni[i])) > + printf("fail to release kni\n"); Other error messages of this file start with an uppercase. Applied with above changes, thanks
[dpdk-dev] [PATCH] power: fix argument cannot be negative
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Monday, May 16, 2016 1:17 PM > To: Mrzyglod, DanielX T > Cc: dev at dpdk.org; Carew, Alan > Subject: Re: [dpdk-dev] [PATCH] power: fix argument cannot be negative > > The next statement is probably a useless copy paste of the coverity > report. > > > In send_msg: Negative value used as argument to a function expecting a > > positive value (for example, size of buffer or allocation) A question on this point. Is it just that the Coverity message is useless in this case or in general? For other error/warning fixes we include the message in the commit. John
[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues
> -Original Message- > From: Simon K?gstr?m [mailto:simon.kagstrom at netinsight.net] > Sent: Monday, May 16, 2016 11:33 AM > To: Pattan, Reshma ; dev at dpdk.org; > thomas.monjalon at 6wind.com > Subject: Re: [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero > RX/TX queues > > On 2016-05-16 12:24, Pattan, Reshma wrote: > >> diff --git a/lib/librte_ether/rte_ethdev.c > >> b/lib/librte_ether/rte_ethdev.c index > >> a31018e..5481d45 100644 > >> --- a/lib/librte_ether/rte_ethdev.c > >> +++ b/lib/librte_ether/rte_ethdev.c > >> @@ -944,11 +944,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t > >> nb_rx_q, uint16_t nb_tx_q, > >> */ > >>(*dev->dev_ops->dev_infos_get)(dev, &dev_info); > >> > >> - if (nb_rx_q == 0 && nb_tx_q == 0) { > >> - RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx > >> queue cannot be 0\n", port_id); > >> - return -EINVAL; > >> - } > > > > This was added to allow devices, at least with one direction (RX/TX) > supported. As, devices with both directions disabled doesn't make sense > right? > > Well, not for running them, no. But this is a part of the shutdown procedure > between tests (I should have been more clear I guess). > Yes I understood this. But I am not sure if you can use rte_eth_dev_configure(port, 0, 0) to free the resources. Can you check if you can use rte_eth_dev_rx_queue_stop/ rte_eth_dev_tx_queue_stop to achieve the same, because they do take care of releasing mbufs, but doesn't free the queue's sw-ring and queue. Thanks, Reshma
[dpdk-dev] [PATCH v2] examples/vm_power_manager: buffer not null terminated
2016-05-10 17:49, Daniel Mrzyglod: > CID30691: > If the buffer is treated as a null terminated string in later operations, > a buffer overflow or over-read may occur. > > In add_vm: The string buffer may not have a null terminator if the source > string's length is equal to the buffer size > > Fixes: e8ae9b662506 ("examples/vm_power: channel manager and monitor in host") > > Signed-off-by: Daniel Mrzyglod Applied, thanks
[dpdk-dev] [PATCH] cfgfile: fix integer overflow
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Monday, May 16, 2016 12:06 PM > To: Kobylinski, MichalX > Cc: dev at dpdk.org; Dumitrescu, Cristian > Subject: Re: [dpdk-dev] [PATCH] cfgfile: fix integer overflow > Importance: High > > 2016-04-28 11:09, Dumitrescu, Cristian: > > From: Kobylinski, MichalX > > > Fix issue reported by Coverity. > > > > > > Coverity ID 13289: Integer overflowed argument: The argument will be > > > too small or even negative, likely resulting in unexpected behavior > > > (for example, under-allocation in a memory allocation function). > > > In rte_cfgfile_load: An integer overflow occurs, with the overflowed > > > value used as an argument to a function > > > > > > Fixes: eaafbad419bf ("cfgfile: library to interpret config files") > > > > > > Signed-off-by: Michal Kobylinski > > > > I don't understand the root issue here, can you please explain? > > > > It looks to me that "end" is always going to point to a location bigger or > equal to &buffer[1]. So the second parameter of _strip function is always > going to be a positive number (0 included). > > Michal, any answer please? Hi Thomas, Cristian Coverity show that there is overflowed value. But the second parameter will never be greater than 254 (its range is 0 - 254). I used cast this parameter to unsigned in order that resolved bug reported by static analysis.
[dpdk-dev] [PATCH] cfgfile: fix integer overflow
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Kobylinski, MichalX > Sent: Monday, May 16, 2016 1:51 PM > To: Thomas Monjalon > Cc: dev at dpdk.org; Dumitrescu, Cristian > Subject: Re: [dpdk-dev] [PATCH] cfgfile: fix integer overflow > ... > Coverity show that there is overflowed value. > But the second parameter will never be greater than 254 (its range is 0 - > 254). > I used cast this parameter to unsigned in order that resolved bug reported > by static analysis. Hi, If the error cannot happen in a real application then you can mark the defect as a false positive in Coverity. Add some of the information provided above as to why it is a false positive to the comments section of the defect or add a link to the patchwork discussion. John
[dpdk-dev] [PATCH] power: fix argument cannot be negative
2016-05-16 14:39 GMT+02:00 Mcnamara, John : > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon >> The next statement is probably a useless copy paste of the coverity >> report. >> >> > In send_msg: Negative value used as argument to a function expecting a >> > positive value (for example, size of buffer or allocation) > > A question on this point. Is it just that the Coverity message is useless > in this case or in general? For other error/warning fixes we include the > message in the commit. Sometimes, the coverity message is accurate, sometimes it s better to reword it. Anyway, having 2 sentences saying the same thing is disturbing.
[dpdk-dev] [PATCH] doc: known issue on EAL argv
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jingjing Wu > Sent: Friday, May 13, 2016 6:15 AM > To: david.marchand at 6wind.com > Cc: dev at dpdk.org; Wu, Jingjing ; Yang, Ziye > ; Richardson, Bruce > Subject: [dpdk-dev] [PATCH] doc: known issue on EAL argv > > This patch docs the issue on EAL argument that the last EAL argument is > replaced by program name in argv[]. > > Reported-by: Ziye Yang > Signed-off-by: Jingjing Wu Hi Ziye, The title would be better as "doc: add known issue with EAL argv" Also, below are some suggested changes to the docs in the patch: The last EAL argument is replaced by the program name in argv[] --- **Description**: The last EAL argument is replaced by the program name in ``argv[]`` after ``eal_parse_args()`` is called. This is the intended behavior but it causes the pointer to the last EAL argument to be lost. **Implication**: If the last EAL argument in ``argv[]`` is generated by a malloc function, changing it will cause memory issues when freeing the argument. **Resolution/Workaround**: An application should not consider the value in ``argv[]`` as unchanged. **Affected Environment/Platform**: All. **Driver/Module**: Environment Abstraction Layer (EAL).
[dpdk-dev] Ring PMD: why are stats counters atomic?
Hello Bruce, Although having this support does not harm anyone, I am not convinced that it is useful, mainly because there exists the single-thread limitation in other PMDs. Then, if an application has to use different kind of NICs (i.e, different PMDs) it has to implement the locking strategies. On the other hand, if an application only uses rte_rings, it could just use the rte_ring library. Thanks, Mauricio V On Tue, May 10, 2016 at 11:36 AM, Bruce Richardson < bruce.richardson at intel.com> wrote: > On Tue, May 10, 2016 at 11:13:08AM +0200, Mauricio V?squez wrote: > > Hello, > > > > Per-queue stats counters are defined as rte_atomic64_t, in the tx/rx > > functions, they are atomically increased if the rings have the multiple > > consumers/producer flag enabled. > > > > According to the design principles, the application should not invoke > those > > functions on the same queue on different cores, then I think that atomic > > increasing is not necessary. > > > > Is there something wrong with my reasoning?, If not, I am willing to > send a > > patch. > > > > Thank you very much, > > > Since the rte_rings, on which the ring pmd is obviously based, have > multi-producer > and multi-consumer support built-in, I thought it might be useful in the > ring > PMD itself to allow multiple threads to access the ring queues at the same > time, > if the underlying rings are marked as MP/MC safe. When doing enqueues and > dequeue > from the ring, the stats are either incremented atomically, or > non-atomically, > depending on the underlying queue type. > > const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng, > ptrs, nb_bufs); > if (r->rng->flags & RING_F_SC_DEQ) > r->rx_pkts.cnt += nb_rx; > else > rte_atomic64_add(&(r->rx_pkts), nb_rx); > > If people don't think this behaviour is worthwhile keeping, I'm ok with > removing > it, since all other PMDs have the restriction that the queues are > single-thread > only. > > Regards, > /Bruce >
[dpdk-dev] Ring PMD: why are stats counters atomic?
On Mon, May 16, 2016 at 03:12:10PM +0200, Mauricio V?squez wrote: > Hello Bruce, > > Although having this support does not harm anyone, I am not convinced that > it is useful, mainly because there exists the single-thread limitation in > other PMDs. Then, if an application has to use different kind of NICs (i.e, > different PMDs) it has to implement the locking strategies. On the other > hand, if an application only uses rte_rings, it could just use the > rte_ring library. > > Thanks, Mauricio V > I agree. If you want, please submit a patch to remove this behaviour and see if anyone objects to it. If there are no objections, I have no problem accepting the patch. However, since this is a behaviour change to existing functionality, we may need to implement function versionning for this for ABI compatibility. Please take that into account when drafting any patch. Regards, /Bruce > On Tue, May 10, 2016 at 11:36 AM, Bruce Richardson < > bruce.richardson at intel.com> wrote: > > > On Tue, May 10, 2016 at 11:13:08AM +0200, Mauricio V?squez wrote: > > > Hello, > > > > > > Per-queue stats counters are defined as rte_atomic64_t, in the tx/rx > > > functions, they are atomically increased if the rings have the multiple > > > consumers/producer flag enabled. > > > > > > According to the design principles, the application should not invoke > > those > > > functions on the same queue on different cores, then I think that atomic > > > increasing is not necessary. > > > > > > Is there something wrong with my reasoning?, If not, I am willing to > > send a > > > patch. > > > > > > Thank you very much, > > > > > Since the rte_rings, on which the ring pmd is obviously based, have > > multi-producer > > and multi-consumer support built-in, I thought it might be useful in the > > ring > > PMD itself to allow multiple threads to access the ring queues at the same > > time, > > if the underlying rings are marked as MP/MC safe. When doing enqueues and > > dequeue > > from the ring, the stats are either incremented atomically, or > > non-atomically, > > depending on the underlying queue type. > > > > const uint16_t nb_rx = (uint16_t)rte_ring_dequeue_burst(r->rng, > > ptrs, nb_bufs); > > if (r->rng->flags & RING_F_SC_DEQ) > > r->rx_pkts.cnt += nb_rx; > > else > > rte_atomic64_add(&(r->rx_pkts), nb_rx); > > > > If people don't think this behaviour is worthwhile keeping, I'm ok with > > removing > > it, since all other PMDs have the restriction that the queues are > > single-thread > > only. > > > > Regards, > > /Bruce > >
[dpdk-dev] [PATCH / RFC ] ethdev: Allow rte_eth_dev_configure with zero RX/TX queues
On 2016-05-16 14:43, Pattan, Reshma wrote: >>> This was added to allow devices, at least with one direction (RX/TX) >> supported. As, devices with both directions disabled doesn't make sense >> right? >> >> Well, not for running them, no. But this is a part of the shutdown procedure >> between tests (I should have been more clear I guess). > > Yes I understood this. But I am not sure if you can use > rte_eth_dev_configure(port, 0, 0) to free the resources. > Can you check if you can use rte_eth_dev_rx_queue_stop/ > rte_eth_dev_tx_queue_stop to achieve the same, because they do take care of > releasing mbufs, but doesn't free the queue's sw-ring and queue. But isn't that very strange behavior. Aren't the descriptor rings allocated in rx_queue_setup()? If so, the sequence rx_queue_stop(); // Release buffers rx_queue_start(); would leave the descriptor ring empty after start, i.e., not able to receive data. // Simon
[dpdk-dev] [PATCH 0/2] doc: announce ABI change of struct rte_port_source_params
The ABI changes are planned for rte_port_source_params and rte_port_sink_params, which will be supported from release 16.11. Here announces that ABI changes in detail. Fan Zhang (2): doc: announce ABI change of struct rte_port_source_params doc: announce ABI change of struct rte_port_sink_params doc/guides/rel_notes/deprecation.rst | 8 1 file changed, 8 insertions(+) -- 2.5.5
[dpdk-dev] [PATCH 1/2] doc: announce ABI change of struct rte_port_source_params
The ABI changes are planned for rte_port_source_params, which will be supported from release 16.11. Here announces that ABI changes in detail. Signed-off-by: Fan Zhang Acked-by: Cristian Dumitrescu --- doc/guides/rel_notes/deprecation.rst | 4 1 file changed, 4 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index fffe9c7..d228bae 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -74,3 +74,7 @@ Deprecation Notices a handle, like the way kernel exposes an fd to user for locating a specific file, and to keep all major structures internally, so that we are likely to be free from ABI violations in future. + +* ABI will change for rte_port_source_params struct. The member file_name + data type will be changed from char * to const char *. This change targets + release 16.11. -- 2.5.5
[dpdk-dev] [PATCH 2/2] doc: announce ABI change of struct rte_port_sink_params
The ABI changes are planned for rte_port_sink_params, which will be supported from release 16.11. Here announces that ABI changes in detail. Signed-off-by: Fan Zhang Acked-by: Cristian Dumitrescu --- doc/guides/rel_notes/deprecation.rst | 4 1 file changed, 4 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d228bae..d2f7306 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -78,3 +78,7 @@ Deprecation Notices * ABI will change for rte_port_source_params struct. The member file_name data type will be changed from char * to const char *. This change targets release 16.11. + +* ABI will change for rte_port_sink_params struct. The member file_name + data type will be changed from char * to const char *. This change targets + release 16.11. -- 2.5.5
[dpdk-dev] [PATCH] power: fix argument cannot be negative
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Monday, May 16, 2016 2:00 PM > ... > > A question on this point. Is it just that the Coverity message is > > useless in this case or in general? For other error/warning fixes we > > include the message in the commit. > > Sometimes, the coverity message is accurate, sometimes it s better to > reword it. > Anyway, having 2 sentences saying the same thing is disturbing. Ok. Noted.
[dpdk-dev] [PATCH 2/2] doc: announce ABI change of struct rte_port_sink_params
On 05/16/2016 04:18 PM, Fan Zhang wrote: > The ABI changes are planned for rte_port_sink_params, which will be > supported from release 16.11. Here announces that ABI changes in detail. > > Signed-off-by: Fan Zhang > Acked-by: Cristian Dumitrescu > --- > doc/guides/rel_notes/deprecation.rst | 4 > 1 file changed, 4 insertions(+) > > diff --git a/doc/guides/rel_notes/deprecation.rst > b/doc/guides/rel_notes/deprecation.rst > index d228bae..d2f7306 100644 > --- a/doc/guides/rel_notes/deprecation.rst > +++ b/doc/guides/rel_notes/deprecation.rst > @@ -78,3 +78,7 @@ Deprecation Notices > * ABI will change for rte_port_source_params struct. The member file_name >data type will be changed from char * to const char *. This change targets >release 16.11. > + > +* ABI will change for rte_port_sink_params struct. The member file_name > + data type will be changed from char * to const char *. This change targets > + release 16.11. > Surely such a minor change doesn't require two separate announcements or patches for that matter. In fact I doubt it's an ABI change at all (although it is an API change certainly). However to me the bigger issue is that a change like this alone doesn't seem like worth breaking the ABI. The ABI was just broken in 16.04 to introduce these struct members (among other things) and to break the ABI again just to fixup a const-correctness issue seems a bit much. Could it maybe wait until there's some actually compelling reason to break the ABI? Note that I'm not against the change as such, const-correctness is a good thing. - Panu -
[dpdk-dev] [PATCH 1/1] lib/librte_cmdline: fix added checking return value
Unchecked return value: value returned from a function rdline_init is not checked, fix added checking return value and in the case of failure frees memory and return null pointer. Fixes: af75078fece3 ("first public release") Coverity ID 13204 Signed-off-by: Marcin Kerlin --- lib/librte_cmdline/cmdline.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/lib/librte_cmdline/cmdline.c b/lib/librte_cmdline/cmdline.c index c405878..a9c47be 100644 --- a/lib/librte_cmdline/cmdline.c +++ b/lib/librte_cmdline/cmdline.c @@ -130,6 +130,7 @@ struct cmdline * cmdline_new(cmdline_parse_ctx_t *ctx, const char *prompt, int s_in, int s_out) { struct cmdline *cl; + int ret; if (!ctx || !prompt) return NULL; @@ -142,8 +143,13 @@ cmdline_new(cmdline_parse_ctx_t *ctx, const char *prompt, int s_in, int s_out) cl->s_out = s_out; cl->ctx = ctx; - rdline_init(&cl->rdl, cmdline_write_char, - cmdline_valid_buffer, cmdline_complete_buffer); + ret = rdline_init(&cl->rdl, cmdline_write_char, cmdline_valid_buffer, + cmdline_complete_buffer); + if (ret != 0) { + free(cl); + return NULL; + } + cl->rdl.opaque = cl; cmdline_set_prompt(cl, prompt); rdline_newline(&cl->rdl, cl->prompt); -- 1.9.1
[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages
>2016-05-13 15:48, Wiles, Keith: >> I create this new tool to combine some information and use /sys/devices >> instead. What I was hoping was some of you could try this script and see if >> it works correctly. Also I was hope to find out if this script is useful and >> what other features you would like in this type of tool. > >What is the intent of this script? Is it to be used for bug report? >There already have some tools to display system informations. Why adding >one more? >Examples of useful tools: hwloc/lstopo, lspci, hugeadm. I was looking to replace the cpu_layout.py tool which uses the /procfs instead of /sysfs, just figured we could then add some extra information into this script as well. Yes, we have other tools, but some people do not know or use or install these tools. I was hoping this one would be able to display a number of things to help the developer and us in helping them debug problems. Stephen Hemminger sent an email about the use of sysfs instead of procfs. http://dpdk.org/ml/archives/dev/2016-May/038560.html > Regards, Keith
[dpdk-dev] [PATCH] doc: move rel_notes instructions as comments
Hi John, > Reviewed-by: John McNamara > Acked-by: John McNamara The meaning of Reviewed-by and Acked-by can be slightly different but I don't think we need to put both. In this case, as the doc maintainer, your ack should be fine. By the way, I'm totally OK to discuss a better description of these tags in the guidelines.
[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages
2016-05-16 14:11, Wiles, Keith: > >2016-05-13 15:48, Wiles, Keith: > >> I create this new tool to combine some information and use /sys/devices > >> instead. What I was hoping was some of you could try this script and see > >> if it works correctly. Also I was hope to find out if this script is > >> useful and what other features you would like in this type of tool. > > > >What is the intent of this script? Is it to be used for bug report? > >There already have some tools to display system informations. Why adding > >one more? > >Examples of useful tools: hwloc/lstopo, lspci, hugeadm. > > I was looking to replace the cpu_layout.py tool which uses the /procfs > instead of /sysfs, just figured we could then add some extra information into > this script as well. Yes, we have other tools, but some people do not know or > use or install these tools. I was hoping this one would be able to display a > number of things to help the developer and us in helping them debug problems. > > Stephen Hemminger sent an email about the use of sysfs instead of procfs. > http://dpdk.org/ml/archives/dev/2016-May/038560.html I agree that cpu_layout.py should be removed. Should we implement something else? Or just point to existing tools? Or call existing tools from a small script? Is it the DPDK focus to develop and maintain such tool?
[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages
>2016-05-16 14:11, Wiles, Keith: >> >2016-05-13 15:48, Wiles, Keith: >> >> I create this new tool to combine some information and use /sys/devices >> >> instead. What I was hoping was some of you could try this script and see >> >> if it works correctly. Also I was hope to find out if this script is >> >> useful and what other features you would like in this type of tool. >> > >> >What is the intent of this script? Is it to be used for bug report? >> >There already have some tools to display system informations. Why adding >> >one more? >> >Examples of useful tools: hwloc/lstopo, lspci, hugeadm. >> >> I was looking to replace the cpu_layout.py tool which uses the /procfs >> instead of /sysfs, just figured we could then add some extra information >> into this script as well. Yes, we have other tools, but some people do not >> know or use or install these tools. I was hoping this one would be able to >> display a number of things to help the developer and us in helping them >> debug problems. >> >> Stephen Hemminger sent an email about the use of sysfs instead of procfs. >> http://dpdk.org/ml/archives/dev/2016-May/038560.html > >I agree that cpu_layout.py should be removed. >Should we implement something else? Or just point to existing tools? Well I did create something else :-) As for pointing to existing tools, we should do that anyway in the documentation. >Or call existing tools from a small script? Calling the existing tools could be a problem as they may not be installed, but they could install the tools too after the script points it out. >Is it the DPDK focus to develop and maintain such tool? I can not answer that question per say. I can at least be the maintainer for this script or someone else. I think we need something that pulls all of the information from the system for the developer to see at a quick glance to help debug his system. Also we can then say run this script and post to the list, which is nice and simple as a debug aid. > Regards, Keith
[dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, memory and huge pages
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Monday, May 16, 2016 3:32 PM > To: Wiles, Keith > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [RFC PATCH] tools:new tool for system info CPU, > memory and huge pages > > 2016-05-16 14:11, Wiles, Keith: > > >2016-05-13 15:48, Wiles, Keith: > > >> I create this new tool to combine some information and use > /sys/devices instead. What I was hoping was some of you could try this > script and see if it works correctly. Also I was hope to find out if this > script is useful and what other features you would like in this type of > tool. > > > > > >What is the intent of this script? Is it to be used for bug report? > > >There already have some tools to display system informations. Why > > >adding one more? > > >Examples of useful tools: hwloc/lstopo, lspci, hugeadm. > > > > I was looking to replace the cpu_layout.py tool which uses the /procfs > instead of /sysfs, just figured we could then add some extra information > into this script as well. Yes, we have other tools, but some people do not > know or use or install these tools. I was hoping this one would be able to > display a number of things to help the developer and us in helping them > debug problems. > > > > Stephen Hemminger sent an email about the use of sysfs instead of > procfs. > > http://dpdk.org/ml/archives/dev/2016-May/038560.html > > I agree that cpu_layout.py should be removed. > Should we implement something else? Or just point to existing tools? > Or call existing tools from a small script? > Is it the DPDK focus to develop and maintain such tool? +1 for pointing to existing tools. /Bruce
[dpdk-dev] possible kni bug and proposed fix
Hi Ferruh, Thx for pointing out the 'fill alloc_q with these mubfs _until it gets full_.', I saw the size of all queues are 'KNI_FIFO_COUNT_MAX (1024)'... The corresponding memory required is more than what I specify as 'socket_mem' (since i'm using VM)... Also, in my use case, I only `tcpreplay` through the kni interface, and my application only do rx and then free the 'mbufs'. So there is no tx at all. So, in my case, I still think this is a bug/defect, or somewhere i still misunderstand, P.S. The description here seems to be inverted, http://dpdk.org/doc/api/rte__kni_8h.html#a0cdd727cdc227d005fef22c0189f3dfe 'rte_kni_rx_burst' does the 'It handles allocating the mbufs for KNI interface alloc queue.' Thanks, Alex Wang, On 16 May 2016 at 04:20, Ferruh Yigit wrote: > On 5/15/2016 5:48 AM, ALeX Wang wrote: > > Hi, > > > > When using the kni module to test my application inside > > debian (virtualbox) VM (kernel version 4.4), I get the > > > > "KNI: Out of memory" > > > > from syslog every time I `tcpreply` packets through > > the kni interface. > > > > After checking source code, I saw that when I call > > 'rte_kni_rx_burst()', no matter how many packets > > are actually retrieved, we always call 'kni_allocate_mbufs()' > > and try allocate 'MAX_MBUF_BURST_NUM' more > > mbufs... I fix the issue via using this patch below, > > > > Could you confirm if this is an actual bug? > > > > Hi Alex, > > I don't think this is a bug. > > kni_allocate_mbufs() will allocate MAX_MBUF_BURST_NUM mbufs as you > mentioned. And will fill alloc_q with these mubfs _until it gets full_. > And if the alloc_q is full and there are remaining mbufs, they will be > freed. So for some cases this isn't the most optimized way, but there is > no defect. > > Since you are getting "KNI: Out of memory", somewhere else can be > missing freeing mbufs. > > mbufs freeing done in rte_kni_tx_burst(), I can guess two cases that can > cause problem: > a) not calling rte_kni_tx_burst() frequent, so that all free mbufs > consumed. > b) calling rte_kni_tx_burst() with number of mbufs bigger than > MAX_MBUF_BURST_NUM, because this function frees at most > MAX_MBUF_BURST_NUM of mbufs, but if you are calling calling > rte_kni_tx_burst() with bigger numbers, this will cause mbufs to stuck > in free_q > > > Regards, > ferruh > > > -- Alex Wang, Open vSwitch developer
[dpdk-dev] [PATCH v2] examples/qos_sched: fix out-of-bounds read
> > Fix issue reported by Coverity. > > > > Coverity ID 30708: Out-of-bounds read > > overrun-local: Overrunning array tokens of 8 8-byte elements > > at element index 4294967294 (byte offset 34359738352) > > using index i (which evaluates to 4294967294). > > > > Fixes: de3cfa2c9823 ("sched: initial import") > > > > Signed-off-by: Slawomir Mrozowicz > > Acked-by: Cristian Dumitrescu Applied, thanks
[dpdk-dev] [PATCH v3] examples/qos_sched: fix negative loop bound
> > Fix issue reported by Coverity. > > > > Coverity ID 30704: Negative loop bound > > negative_returns: Using unsigned variable n_tokens in a loop exit condition. > > > > Fixes: de3cfa2c9823 ("sched: initial import") > > > > Signed-off-by: Slawomir Mrozowicz > > Acked-by: Cristian Dumitrescu There is no explanation but as it is acked and not so important, Applied
[dpdk-dev] [PATCH v3] examples/qos_sched: fix copy-paste error
copy-paste is the cause of this error. The headline should show which error is fixed (or briefly its impact area). Here: "fix error message" > > Fix issue reported by Coverity. > > > > Coverity ID 30699: Copy-paste error; > > rx_port in pconf->rx_port looks like a copy-paste error. > > > > Fixes: de3cfa2c9823 ("sched: initial import") > > > > Signed-off-by: Slawomir Mrozowicz > > Acked-by: Cristian Dumitrescu Applied, thanks
[dpdk-dev] [PATCH v4] examples/qos_sched: fix bad bit shift operation
> > Subject: [PATCH v4] examples/qos_sched: fix bad bit shift operation Slawomir, please use --in-reply-to when sending a new revision, to let us see the full history in our mailer and in the archives. [...] > > diff --git a/examples/qos_sched/args.c b/examples/qos_sched/args.c > > index 3e7fd08..354372d 100644 > > --- a/examples/qos_sched/args.c > > +++ b/examples/qos_sched/args.c > > @@ -123,7 +123,7 @@ app_eal_core_mask(void) > > uint64_t cm = 0; > > struct rte_config *cfg = rte_eal_get_configuration(); > > > > - for (i = 0; i < RTE_MAX_LCORE; i++) { > > + for (i = 0; i < APP_MAX_LCORE; i++) { > > if (cfg->lcore_role[i] == ROLE_RTE) > > cm |= (1ULL << i); > > } > > @@ -142,7 +142,7 @@ app_cpu_core_count(void) > > char path[PATH_MAX]; > > uint32_t ncores = 0; > > > > - for(i = 0; i < RTE_MAX_LCORE; i++) { > > + for (i = 0; i < APP_MAX_LCORE; i++) { > > len = snprintf(path, sizeof(path), SYS_CPU_DIR, i); > > if (len <= 0 || (unsigned)len >= sizeof(path)) > > continue; > > diff --git a/examples/qos_sched/main.h b/examples/qos_sched/main.h > > index 82aa0fa..c7490c6 100644 > > --- a/examples/qos_sched/main.h > > +++ b/examples/qos_sched/main.h > > @@ -68,7 +68,10 @@ extern "C" { > > > > #define BURST_TX_DRAIN_US 100 > > > > -#define MAX_DATA_STREAMS (RTE_MAX_LCORE/2) > > +#ifndef APP_MAX_LCORE > > +#define APP_MAX_LCORE 64 > > +#endif > > +#define MAX_DATA_STREAMS (APP_MAX_LCORE/2) > > #define MAX_SCHED_SUBPORTS 8 > > #define MAX_SCHED_PIPES4096 > > > > -- > > 1.9.1 > > Acked-by: Cristian Dumitrescu Cristian, please remove patch content when acking. My hand is tired of scrolling ;)
[dpdk-dev] [PATCH v4] examples/qos_sched: fix bad bit shift operation
> > Fix issue reported by Coverity. > > > > Coverity ID 30690: Bad bit shift operation > > large_shift: In expression 1ULL << i, left shifting by more than 63 bits > > has undefined behavior. The shift amount, i, is as much as 127. > > > > Fixes: de3cfa2c9823 ("sched: initial import") > > > > Signed-off-by: Slawomir Mrozowicz > > Acked-by: Cristian Dumitrescu Applied, thanks
[dpdk-dev] [PATCH] Add rte_mempool_free
There is no inverse of rte_mempool_create, so this patch adds one. The typical usage of rte_mempool_create is to create a pool at initialization time and only to free it upon program exit, so an rte_mempool_free function at first seems to be of little value. However, it is very useful as a sanity check for a clean shutdown when used in conjunction with tools like AddressSanitizer. Further, the call itself verifies that all elements have been returned to the pool or it fails. Signed-off-by: Ben Walker --- lib/librte_mempool/rte_dom0_mempool.c | 22 +++ lib/librte_mempool/rte_mempool.c | 70 +++ lib/librte_mempool/rte_mempool.h | 41 3 files changed, 133 insertions(+) diff --git a/lib/librte_mempool/rte_dom0_mempool.c b/lib/librte_mempool/rte_dom0_mempool.c index 0d6d750..edf2d58 100644 --- a/lib/librte_mempool/rte_dom0_mempool.c +++ b/lib/librte_mempool/rte_dom0_mempool.c @@ -131,3 +131,25 @@ rte_dom0_mempool_create(const char *name, unsigned elt_num, unsigned elt_size, return mp; } + +/* free the mempool supporting Dom0 */ +int +rte_dom0_mempool_free(struct rte_mempool *mp) +{ + const struct rte_memzone *mz; + char mz_name[RTE_MEMZONE_NAMESIZE]; + int rc; + + rc = rte_mempool_xmem_free(mp); + if (rc) { + return rc; + } + + snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, mp->name); + mz = rte_memzone_lookup(mz_name); + if (mz) { + rte_memzone_free(mz); + } + + return 0; +} diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index 70812d9..82f645e 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -638,6 +638,76 @@ exit_unlock: return NULL; } +#ifndef RTE_LIBRTE_XEN_DOM0 +/* stub if DOM0 support not configured */ +int +rte_dom0_mempool_free(struct rte_mempool *mp __rte_unused) +{ + rte_errno = EINVAL; + return -1; +} +#endif + +int +rte_mempool_free(struct rte_mempool *mp) +{ + if (rte_xen_dom0_supported()) + return rte_dom0_mempool_free(mp); + else + return rte_mempool_xmem_free(mp); +} + + +/* Free the memory pool */ +int +rte_mempool_xmem_free(struct rte_mempool *mp) +{ + char mz_name[RTE_MEMZONE_NAMESIZE]; + struct rte_mempool_list *mempool_list; + struct rte_tailq_entry *te = NULL; + const struct rte_memzone *mz; + unsigned count; + + if (!mp) { + return 0; + } + + count = rte_mempool_free_count(mp); + if (count != 0) { + /* All elements must be returned to the pool before free */ + return count; + } + + rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK); + + /* Free the ring associated with this mempool */ + if (mp->ring) { + rte_ring_free(mp->ring); + } + + /* Remove the entry from the mempool list and free it. */ + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); + mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list); + TAILQ_FOREACH(te, mempool_list, next) { + if ((struct rte_mempool *)te->data == mp) { + TAILQ_REMOVE(mempool_list, te, next); + rte_free(te); + break; + } + } + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + + snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, mp->name); + mz = rte_memzone_lookup(mz_name); + if (mz) { + rte_memzone_free(mz); + } + + rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK); + + return 0; +} + /* Return the number of entries in the mempool */ unsigned rte_mempool_count(const struct rte_mempool *mp) diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h index 9745bf0..26949c7 100644 --- a/lib/librte_mempool/rte_mempool.h +++ b/lib/librte_mempool/rte_mempool.h @@ -728,6 +728,47 @@ rte_dom0_mempool_create(const char *name, unsigned n, unsigned elt_size, rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg, int socket_id, unsigned flags); +/** + * Free the memory pool created by rte_mempool_create + * + * All elements must be placed back in the pool prior to calling this function. + * + * @param mp + * A pointer to the mempool structure. + * @return + * 0 on success. -1 on error with rte_errno set appropriately. + * Possible rte_errno values include: + *- EINVAL - Invalid input value. + */ +int rte_mempool_free(struct rte_mempool *mp); + +/** + * Free the memory pool created by rte_mempool_xmem_create. + * + * All elements must be placed back in the pool prior to calling this function. + * + * @param mp + * A pointer to the mempool structure. + * @return + * 0 on success. -1 on error with rte_errno set appropriately. + * Possible rte_errn
[dpdk-dev] [PATCH v4] examples/qos_meter: fix unchecked return value
"check flow configuration error" > > Fix issue reported by Coverity. > > > > Coverity ID 30693: Unchecked return value > > check_return: Calling rte_meter_srtcm_config without checking return > > value. > > > > Fixes: e6541fdec8b2 ("meter: initial import") > > > > Signed-off-by: Slawomir Mrozowicz > > Acked-by: Cristian Dumitrescu Applied, thanks
[dpdk-dev] [PATCH v2] examples/netmap_compat: fix infinite loop
2016-04-27 15:22, Michal Kobylinski: > Fix issue reported by Coverity. > > Coverity ID 30701: Infinite loop: The loop does not have a normal > termination condition, so will continue until an abnormal condition > arises. In rte_netmap_poll: Infinite loop with unsatisfiable exit > condition. > > Fixes: 06371afe394d ("examples/netmap_compat: import netmap > compatibility example") > > Signed-off-by: Michal Kobylinski Applied, thanks
[dpdk-dev] [PATCH] Add rte_mempool_free
>There is no inverse of rte_mempool_create, so this patch adds one. >The typical usage of rte_mempool_create is to create a pool at >initialization time and only to free it upon program exit, so an >rte_mempool_free function at first seems to be of little value. >However, it is very useful as a sanity check for a clean shutdown >when used in conjunction with tools like AddressSanitizer. Further, >the call itself verifies that all elements have been returned to >the pool or it fails. > >Signed-off-by: Ben Walker >--- > lib/librte_mempool/rte_dom0_mempool.c | 22 +++ > lib/librte_mempool/rte_mempool.c | 70 +++ > lib/librte_mempool/rte_mempool.h | 41 > 3 files changed, 133 insertions(+) > >diff --git a/lib/librte_mempool/rte_dom0_mempool.c >b/lib/librte_mempool/rte_dom0_mempool.c >index 0d6d750..edf2d58 100644 >--- a/lib/librte_mempool/rte_dom0_mempool.c >+++ b/lib/librte_mempool/rte_dom0_mempool.c >@@ -131,3 +131,25 @@ rte_dom0_mempool_create(const char *name, unsigned >elt_num, unsigned elt_size, > > return mp; > } >+ >+/* free the mempool supporting Dom0 */ >+int >+rte_dom0_mempool_free(struct rte_mempool *mp) >+{ >+ const struct rte_memzone *mz; >+ char mz_name[RTE_MEMZONE_NAMESIZE]; >+ int rc; >+ >+ rc = rte_mempool_xmem_free(mp); >+ if (rc) { >+ return rc; >+ } Remove {} on single line statements. >+ >+ snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, mp->name); >+ mz = rte_memzone_lookup(mz_name); >+ if (mz) { >+ rte_memzone_free(mz); >+ } Here too. >+ >+ return 0; >+} >diff --git a/lib/librte_mempool/rte_mempool.c >b/lib/librte_mempool/rte_mempool.c >index 70812d9..82f645e 100644 >--- a/lib/librte_mempool/rte_mempool.c >+++ b/lib/librte_mempool/rte_mempool.c >@@ -638,6 +638,76 @@ exit_unlock: > return NULL; > } > >+#ifndef RTE_LIBRTE_XEN_DOM0 >+/* stub if DOM0 support not configured */ >+int >+rte_dom0_mempool_free(struct rte_mempool *mp __rte_unused) >+{ >+ rte_errno = EINVAL; >+ return -1; I was thinking this should just return OK or zero. The chances of being called is very low and maybe will not be called, right? If so then do we need the function? >+} >+#endif >+ >+int >+rte_mempool_free(struct rte_mempool *mp) >+{ >+ if (rte_xen_dom0_supported()) >+ return rte_dom0_mempool_free(mp); >+ else >+ return rte_mempool_xmem_free(mp); >+} >+ >+ >+/* Free the memory pool */ >+int >+rte_mempool_xmem_free(struct rte_mempool *mp) >+{ >+ char mz_name[RTE_MEMZONE_NAMESIZE]; >+ struct rte_mempool_list *mempool_list; >+ struct rte_tailq_entry *te = NULL; >+ const struct rte_memzone *mz; >+ unsigned count; >+ >+ if (!mp) { >+ return 0; >+ } Remove the extra {} >+ >+ count = rte_mempool_free_count(mp); >+ if (count != 0) { >+ /* All elements must be returned to the pool before free */ >+ return count; >+ } This one also does not really need the {} >+ >+ rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK); >+ >+ /* Free the ring associated with this mempool */ >+ if (mp->ring) { >+ rte_ring_free(mp->ring); >+ } This one too. >+ >+ /* Remove the entry from the mempool list and free it. */ >+ rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); >+ mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list); >+ TAILQ_FOREACH(te, mempool_list, next) { >+ if ((struct rte_mempool *)te->data == mp) { >+ TAILQ_REMOVE(mempool_list, te, next); >+ rte_free(te); >+ break; >+ } >+ } >+ rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); >+ >+ snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, mp->name); >+ mz = rte_memzone_lookup(mz_name); >+ if (mz) { >+ rte_memzone_free(mz); >+ } This one too. >+ >+ rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK); >+ >+ return 0; >+} The big question is how do you know the mempool is not being used someplace? >+ > /* Return the number of entries in the mempool */ > unsigned > rte_mempool_count(const struct rte_mempool *mp) >diff --git a/lib/librte_mempool/rte_mempool.h >b/lib/librte_mempool/rte_mempool.h >index 9745bf0..26949c7 100644 >--- a/lib/librte_mempool/rte_mempool.h >+++ b/lib/librte_mempool/rte_mempool.h >@@ -728,6 +728,47 @@ rte_dom0_mempool_create(const char *name, unsigned n, >unsigned elt_size, > rte_mempool_obj_ctor_t *obj_init, void *obj_init_arg, > int socket_id, unsigned flags); > >+/** >+ * Free the memory pool created by rte_mempool_create >+ * >+ * All elements must be placed back in the pool prior to calling this >function. >+ * >+ * @param mp >+ * A pointer to the mempool structure. >+ * @return >+ * 0 on success. -1 on error
[dpdk-dev] [PATCH 1/1] examples/distributor: fix unchecked return value
> > Fix issue reported by Coverity. > > > > Coverity ID 13207: > > Value returned from a function is not checked for errors before being used. > > > > Fixes: 07db4a975094 ("examples/distributor: new sample app") > > > > Signed-off-by: Marcin Kerlin > > Acked-by: Reshma Pattan Applied, thanks
[dpdk-dev] [PATCH] examples/ip_pipeline: fix out-of-bounds write
> > CID 124567: > > In the function app_init_eal(struct app params * app) number of > > entries into array exceeds the size of the array if the conditions > > are fulfilled. > > > > Fixes: 7f64b9c004aa ("examples/ip_pipeline: rework config file syntax") > > > > Signed-off-by: Marcin Kerlin > > Acked-by: Cristian Dumitrescu Applied, thanks
[dpdk-dev] [PATCH] examples/ip_pipline: fix memory initialization in firewall bulk functions
> > bulk functions expect that all memory is set with zeros > > > > Fixes: 67ebdbef0c31 ("examples/ip_pipeline: add bulk update of firewall > > rules") > > > > Signed-off-by: Daniel Mrzyglod > > Acked-by: Cristian Dumitrescu Applied, thanks
[dpdk-dev] [PATCH v2] examples: remove useless checking
> > The rte_eth_dev_count() function will never return a value greater > > than RTE_MAX_ETHPORTS, so that checking is useless. > > > > Signed-off-by: Mauricio Vasquez B > studenti.polito.it> > > Acked-by: Ferruh Yigit Applied, thanks
[dpdk-dev] [PATCH] Add rte_mempool_free
On Mon, 2016-05-16 at 16:57 +, Wiles, Keith wrote: > > > > There is no inverse of rte_mempool_create, so this patch adds one. > > The typical usage of rte_mempool_create is to create a pool at > > initialization time and only to free it upon program exit, so an > > rte_mempool_free function at first seems to be of little value. > > However, it is very useful as a sanity check for a clean shutdown > > when used in conjunction with tools like AddressSanitizer. Further, > > the call itself verifies that all elements have been returned to > > the pool or it fails. > > > > Signed-off-by: Ben Walker > > --- > > lib/librte_mempool/rte_dom0_mempool.c | 22 +++ > > lib/librte_mempool/rte_mempool.c??| 70 > > +++ > > lib/librte_mempool/rte_mempool.h??| 41 > > 3 files changed, 133 insertions(+) > > > > diff --git a/lib/librte_mempool/rte_dom0_mempool.c > > b/lib/librte_mempool/rte_dom0_mempool.c > > index 0d6d750..edf2d58 100644 > > --- a/lib/librte_mempool/rte_dom0_mempool.c > > +++ b/lib/librte_mempool/rte_dom0_mempool.c > > @@ -131,3 +131,25 @@ rte_dom0_mempool_create(const char *name, unsigned > > elt_num, unsigned > > elt_size, > > > > return mp; > > } > > + > > +/* free the mempool supporting Dom0 */ > > +int > > +rte_dom0_mempool_free(struct rte_mempool *mp) > > +{ > > + const struct rte_memzone *mz; > > + char mz_name[RTE_MEMZONE_NAMESIZE]; > > + int rc; > > + > > + rc = rte_mempool_xmem_free(mp); > > + if (rc) { > > + return rc; > > + } > Remove {} on single line statements. I'll remove braces from all of the single line conditionals and resend. > > > > + > > + snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_OBJ_NAME, mp->name); > > + mz = rte_memzone_lookup(mz_name); > > + if (mz) { > > + rte_memzone_free(mz); > > + } > Here too. > > > > + > > + return 0; > > +} > > diff --git a/lib/librte_mempool/rte_mempool.c > > b/lib/librte_mempool/rte_mempool.c > > index 70812d9..82f645e 100644 > > --- a/lib/librte_mempool/rte_mempool.c > > +++ b/lib/librte_mempool/rte_mempool.c > > @@ -638,6 +638,76 @@ exit_unlock: > > return NULL; > > } > > > > +#ifndef RTE_LIBRTE_XEN_DOM0 > > +/* stub if DOM0 support not configured */ > > +int > > +rte_dom0_mempool_free(struct rte_mempool *mp __rte_unused) > > +{ > > + rte_errno = EINVAL; > > + return -1; > I was thinking this should just return OK or zero. The chances of being > called is very low and > maybe will not be called, right? If so then do we need the function? The user does know about this function because it is declared in the header, so they could mistakenly call it. In that case, they'd have to pass in a mempool pointer, so I think the correct thing to do is fail the call with EINVAL like I have it because if XEN_DOM0 is not supported then the mempool they pass to this function is certainly not created with rte_dom0_mempool_create and is therefore an invalid parameter. > > > > +} > > +#endif > > + > > +int > > +rte_mempool_free(struct rte_mempool *mp) > > +{ > > + if (rte_xen_dom0_supported()) > > + return rte_dom0_mempool_free(mp); > > + else > > + return rte_mempool_xmem_free(mp); > > +} > > + > > + > > +/* Free the memory pool */ > > +int > > +rte_mempool_xmem_free(struct rte_mempool *mp) > > +{ > > + char mz_name[RTE_MEMZONE_NAMESIZE]; > > + struct rte_mempool_list *mempool_list; > > + struct rte_tailq_entry *te = NULL; > > + const struct rte_memzone *mz; > > + unsigned count; > > + > > + if (!mp) { > > + return 0; > > + } > Remove the extra {} > > > > + > > + count = rte_mempool_free_count(mp); > > + if (count != 0) { > > + /* All elements must be returned to the pool before free */ > > + return count; > > + } > This one also does not really need the {} > > > > + > > + rte_rwlock_write_lock(RTE_EAL_MEMPOOL_RWLOCK); > > + > > + /* Free the ring associated with this mempool */ > > + if (mp->ring) { > > + rte_ring_free(mp->ring); > > + } > This one too. > > > > + > > + /* Remove the entry from the mempool list and free it. */ > > + rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK); > > + mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list); > > + TAILQ_FOREACH(te, mempool_list, next) { > > + if ((struct rte_mempool *)te->data == mp) { > > + TAILQ_REMOVE(mempool_list, te, next); > > + rte_free(te); > > + break; > > + } > > + } > > + rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); > > + > > + snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, mp->name); > > + mz = rte_memzone_lookup(mz_name); > > + if (mz) { > > + rte_memzone_free(mz); > > + } > This one too. > > > > + > > + rte_rwlock_write_unlock(RTE_EAL_MEMPOOL_RWLOCK); > > + > > + return 0; > > +} > The big question is how do you know the mempool is no
[dpdk-dev] [PATCH 0/4] Implement pmd hardware support exports
Hey all- So heres attempt number 2 at a method for exporting PMD hardware support information. As we discussed previously, the consensus seems to be that pmd information should be: 1) Able to be interrogated on any ELF binary (application binary or individual DSO) 2) Equally functional on statically linked applications or on DSO's 3) Resilient to symbol stripping 4) Script friendly 5) Show kernel dependencies 6) List driver options 7) Show driver name 8) Offer human readable output 9) Show DPDK version 10) Show driver version 11) Allow for expansion 12) Not place additional build environment dependencies on an application I added item 12 myself, because I don't think its reasonable to require applications to use a specific linker script to get hardware support information (which precludes the use of a special .modinfo section like the kernel uses) However, we still can use some tricks from the kernel to make this work. In this approach, what I've done is the following: A) Modified the driver registration macro to also define two variables: this_pmd_name this_pmd_table The second is optional, and omitted for virtual devices. These symbols are static, and marked as used, so they will always be emitted by the compiler, and with their well known names, easy to search for in an object (.o) file B) Added a utility called pmdinfo. This utility is not meant for general use, but rather is used by the dpdk build environment itself when compiling pmds. for each object that uses the PMD_REGISTER_DRIVER macro, this utiity is run. It searches for the symbols in (A), and using those, extracts the hardware support info, and module name from the object, using that to produce a new c file containing a single variable in the following format: static const char [] __attribute__((used)) = "PMD_DRIVER_INFO="; The is arbitrary, as its static and not referenced. The relevant bit is the string value assigned to it. The is a json encoded string of the extracted hardware support information pmdinfo found for the corresponding object. This C file is suitable for compilation and relocatable linking back into the parent object file. The result of this operation is that the object string table now contains a string that will not e removed by stripping, whos leading text (PMD_DRIVER_INFO) can be easily searched for at any time weather the symbol referring to it is stripped or not. C) Added a utilty called pmd_hw_info.py. This python script, searches the string table of the .rodata section of any provided ELF file looking for the PMD_DRIVER_INFO prefix on a string. When found, it will interpret the remainder of the string as json, and output the hardware support for that ELF file (if any). This approach ticks most of the above major boxes: 1) Impervious to stripping 2) Works on static and dynamic binaries 3) Human and script friendly 4) Allows for expansion 5) Doesn't require additional build environment needs for applications Because of (4) the other items should be pretty easy to implement, as its just a matter of modifying the macros to export the info, pmdinfo to encode it to json, and pmd_hw_support.py to read it. I'm not adding them now, as theres alot of rote work to do to get some of them in place (e.g. DPDK has no current global VERSION macro, drivers don't have a consistent version scheme, command line strings have to be built for each driver, etc). But once this is accepted, those items can be done as time allows and should be fairly easy to implement. Signed-off-by: Neil Horman CC: Bruce Richardson CC: Thomas Monjalon CC: Stephen Hemminger CC: Panu Matilainen
[dpdk-dev] [PATCH 1/4] pmdinfo: Add buildtools and pmdinfo utility
pmdinfo is a tool used to parse object files and build json strings for use in later determining hardware support in a dso or application binary. pmdinfo looks for the non-exported symbol names this_pmd_name and this_pmd_tbl (where n is a integer counter). It records the name of each of these tuples, using the later to find the symbolic name of the pci_table for physical devices that the object supports. With this information, it outputs a C file with a single line of the form: static char *_driver_info[] __attribute__((used)) = " \ PMD_DRIVER_INFO="; Where is the arbitrary name of the pmd, and is the json encoded string that hold relevant pmd information, including the pmd name, type and optional array of pci device/vendor ids that the driver supports. This c file is suitable for compiling to object code, then relocatably linking into the parent file from which the C was generated. This creates an entry in the string table of the object that can inform a later tool about hardware support. Signed-off-by: Neil Horman CC: Bruce Richardson CC: Thomas Monjalon CC: Stephen Hemminger CC: Panu Matilainen --- GNUmakefile | 2 +- buildtools/Makefile | 36 buildtools/pmdinfo/Makefile | 48 + buildtools/pmdinfo/pmdinfo.c | 435 +++ buildtools/pmdinfo/pmdinfo.h | 210 + mk/rte.buildtools.mk | 148 +++ mk/rte.sdkbuild.mk | 3 +- 7 files changed, 880 insertions(+), 2 deletions(-) create mode 100644 buildtools/Makefile create mode 100644 buildtools/pmdinfo/Makefile create mode 100644 buildtools/pmdinfo/pmdinfo.c create mode 100644 buildtools/pmdinfo/pmdinfo.h create mode 100644 mk/rte.buildtools.mk diff --git a/GNUmakefile b/GNUmakefile index b59e4b6..00fe0db 100644 --- a/GNUmakefile +++ b/GNUmakefile @@ -40,6 +40,6 @@ export RTE_SDK # directory list # -ROOTDIRS-y := lib drivers app +ROOTDIRS-y := buildtools lib drivers app include $(RTE_SDK)/mk/rte.sdkroot.mk diff --git a/buildtools/Makefile b/buildtools/Makefile new file mode 100644 index 000..0f15d58 --- /dev/null +++ b/buildtools/Makefile @@ -0,0 +1,36 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +include $(RTE_SDK)/mk/rte.vars.mk + +DIRS-y += pmdinfo + +include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/buildtools/pmdinfo/Makefile b/buildtools/pmdinfo/Makefile new file mode 100644 index 000..3dea68b --- /dev/null +++ b/buildtools/pmdinfo/Makefile @@ -0,0 +1,48 @@ +# BSD LICENSE +# +# Copyright(c) 2010-2015 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software withou
[dpdk-dev] [PATCH 2/4] drivers: Update driver registration macro usage
Modify the PMD_REGISTER_DRIVER macro, bifurcating it into two (PMD_REGISTER_DRIVER_PDEV and PMD_REGISTER_DRIVER_VDEV. Both of these do the same thing the origional macro did, but both add the definition of a string variable that informs interested parties of the name of the pmd, and the former also defines an second string that holds the symbol name of the pci table that is registered by this pmd. pmdinfo uses this information to extract hardware support from an object file and create a json string to make hardware support info discoverable later. Signed-off-by: Neil Horman CC: Bruce Richardson CC: Thomas Monjalon CC: Stephen Hemminger CC: Panu Matilainen --- drivers/Makefile | 2 ++ drivers/crypto/aesni_gcm/aesni_gcm_pmd.c | 2 +- drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 2 +- drivers/crypto/null/null_crypto_pmd.c | 2 +- drivers/crypto/qat/rte_qat_cryptodev.c | 2 +- drivers/crypto/snow3g/rte_snow3g_pmd.c | 2 +- drivers/net/af_packet/rte_eth_af_packet.c | 2 +- drivers/net/bnx2x/bnx2x_ethdev.c | 4 ++-- drivers/net/bonding/rte_eth_bond_pmd.c | 2 +- drivers/net/cxgbe/cxgbe_ethdev.c | 2 +- drivers/net/e1000/em_ethdev.c | 2 +- drivers/net/e1000/igb_ethdev.c | 4 ++-- drivers/net/ena/ena_ethdev.c | 2 +- drivers/net/enic/enic_ethdev.c | 2 +- drivers/net/fm10k/fm10k_ethdev.c | 2 +- drivers/net/i40e/i40e_ethdev.c | 2 +- drivers/net/i40e/i40e_ethdev_vf.c | 2 +- drivers/net/ixgbe/ixgbe_ethdev.c | 4 ++-- drivers/net/mlx4/mlx4.c| 2 +- drivers/net/mlx5/mlx5.c| 2 +- drivers/net/mpipe/mpipe_tilegx.c | 4 ++-- drivers/net/nfp/nfp_net.c | 2 +- drivers/net/null/rte_eth_null.c| 2 +- drivers/net/pcap/rte_eth_pcap.c| 2 +- drivers/net/ring/rte_eth_ring.c| 2 +- drivers/net/szedata2/rte_eth_szedata2.c| 2 +- drivers/net/vhost/rte_eth_vhost.c | 2 +- drivers/net/virtio/virtio_ethdev.c | 2 +- drivers/net/vmxnet3/vmxnet3_ethdev.c | 2 +- drivers/net/xenvirt/rte_eth_xenvirt.c | 2 +- lib/librte_eal/common/include/rte_dev.h| 21 ++--- 31 files changed, 53 insertions(+), 36 deletions(-) diff --git a/drivers/Makefile b/drivers/Makefile index 81c03a8..75a3168 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -34,4 +34,6 @@ include $(RTE_SDK)/mk/rte.vars.mk DIRS-y += net DIRS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += crypto +DEPDIRS-y += buildtools/pmdinfo + include $(RTE_SDK)/mk/rte.subdir.mk diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c index 2987ef6..1c1cbf7 100644 --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c @@ -521,4 +521,4 @@ static struct rte_driver aesni_gcm_pmd_drv = { .uninit = aesni_gcm_uninit }; -PMD_REGISTER_DRIVER(aesni_gcm_pmd_drv); +PMD_REGISTER_DRIVER_VDEV(aesni_gcm_pmd_drv, aesni_gcm); diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c index 3415ac1..bbd44ff 100644 --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c @@ -716,4 +716,4 @@ static struct rte_driver cryptodev_aesni_mb_pmd_drv = { .uninit = cryptodev_aesni_mb_uninit }; -PMD_REGISTER_DRIVER(cryptodev_aesni_mb_pmd_drv); +PMD_REGISTER_DRIVER_VDEV(cryptodev_aesni_mb_pmd_drvi, aesni_mb); diff --git a/drivers/crypto/null/null_crypto_pmd.c b/drivers/crypto/null/null_crypto_pmd.c index bdaf13c..03b79d9 100644 --- a/drivers/crypto/null/null_crypto_pmd.c +++ b/drivers/crypto/null/null_crypto_pmd.c @@ -275,4 +275,4 @@ static struct rte_driver cryptodev_null_pmd_drv = { .uninit = cryptodev_null_uninit }; -PMD_REGISTER_DRIVER(cryptodev_null_pmd_drv); +PMD_REGISTER_DRIVER_VDEV(cryptodev_null_pmd_drv, cryptodev_null_pmd); diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c b/drivers/crypto/qat/rte_qat_cryptodev.c index a7912f5..14794c2 100644 --- a/drivers/crypto/qat/rte_qat_cryptodev.c +++ b/drivers/crypto/qat/rte_qat_cryptodev.c @@ -137,4 +137,4 @@ static struct rte_driver pmd_qat_drv = { .init = rte_qat_pmd_init, }; -PMD_REGISTER_DRIVER(pmd_qat_drv); +PMD_REGISTER_DRIVER_PDEV(pmd_qat_drv, pci_id_qat_map, qat); diff --git a/drivers/crypto/snow3g/rte_snow3g_pmd.c b/drivers/crypto/snow3g/rte_snow3g_pmd.c index f3e0e66..0203670 100644 --- a/drivers/crypto/snow3g/rte_snow3g_pmd.c +++ b/drivers/crypto/snow3g/rte_snow3g_pmd.c @@ -548,4 +548,4 @@ static struct rte_driver cryptodev_snow3g_pmd_drv = { .uninit = cryptodev_snow3g_uninit }; -PMD_REGISTER_DRIVER(cryptodev_snow3g_pmd_drv); +PMD_REGISTER_DRIVER_VDEV(cryptodev_snow3g_pmd_drv, snow3g); diff --git a/drivers/net/af_packet/rte_eth_af_packet.c b/drivers/net/af_packet/rte_eth_af_packet.c index f17bd7e..1f4a87c
[dpdk-dev] [PATCH 3/4] Makefile: Do post processing on objects that register a driver
Modify the compilation makefile to identify C files that export PMD information, and use that to trigger execution of the pmdinfo binary. If the execution of pmdinfo is successful, compile the output C file to an object, and use the linker to do relocatable linking on the resultant object file into the parent object that it came from. This effectively just adds the json string into the string table of the object that defines the PMD to the outside world. Signed-off-by: Neil Horman CC: Bruce Richardson CC: Thomas Monjalon CC: Stephen Hemminger CC: Panu Matilainen --- mk/internal/rte.compile-pre.mk | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/mk/internal/rte.compile-pre.mk b/mk/internal/rte.compile-pre.mk index b9bff4a..2771887 100644 --- a/mk/internal/rte.compile-pre.mk +++ b/mk/internal/rte.compile-pre.mk @@ -80,7 +80,8 @@ C_TO_O_STR = $(subst ','\'',$(C_TO_O)) #'# fix syntax highlight C_TO_O_DISP = $(if $(V),"$(C_TO_O_STR)"," HOSTCC $(@)") else C_TO_O = $(CC) -Wp,-MD,$(call obj2dep,$(@)).tmp $(CFLAGS) \ - $(CFLAGS_$(@)) $(EXTRA_CFLAGS) -o $@ -c $< +$(CFLAGS_$(@)) $(EXTRA_CFLAGS) -o $@ -c $< + C_TO_O_STR = $(subst ','\'',$(C_TO_O)) #'# fix syntax highlight C_TO_O_DISP = $(if $(V),"$(C_TO_O_STR)"," CC $(@)") endif @@ -88,10 +89,26 @@ C_TO_O_CMD = 'cmd_$@ = $(C_TO_O_STR)' C_TO_O_DO = @set -e; \ echo $(C_TO_O_DISP); \ $(C_TO_O) && \ + sh -c "grep -q \"PMD_REGISTER_DRIVER_[PV]DEV(.*)\" $<; \ + if [ \$$? -eq 0 ]; \ + then \ + echo MODGEN $@; \ + OBJF=`readlink -f $@`; \ + ${RTE_OUTPUT}/buildtools/pmdinfo \$$OBJF \$$OBJF.mod.c; \ + if [ \$$? -eq 0 ]; \ + then \ + echo MODBUILD $@; \ + $(CC) -c -o \$$OBJF.mod.o \$$OBJF.mod.c; \ + $(CROSS)ld -r -o \$$OBJF.o \$$OBJF.mod.o \$$OBJF; \ + mv -f \$$OBJF.o \$$OBJF; \ + fi; \ + fi; \ + true" && \ echo $(C_TO_O_CMD) > $(call obj2cmd,$(@)) && \ sed 's,'$@':,dep_'$@' =,' $(call obj2dep,$(@)).tmp > $(call obj2dep,$(@)) && \ rm -f $(call obj2dep,$(@)).tmp + # return an empty string if string are equal compare = $(strip $(subst $(1),,$(2)) $(subst $(2),,$(1))) -- 2.5.5
[dpdk-dev] [PATCH 4/4] pmd_hw_support.py: Add tool to query binaries for hw support information
This tool searches for the primer sting PMD_DRIVER_INFO= in any ELF binary, and, if found parses the remainder of the string as a json encoded string, outputting the results in either a human readable or raw, script parseable format Signed-off-by: Neil Horman CC: Bruce Richardson CC: Thomas Monjalon CC: Stephen Hemminger CC: Panu Matilainen --- tools/pmd_hw_support.py | 174 1 file changed, 174 insertions(+) create mode 100755 tools/pmd_hw_support.py diff --git a/tools/pmd_hw_support.py b/tools/pmd_hw_support.py new file mode 100755 index 000..0669aca --- /dev/null +++ b/tools/pmd_hw_support.py @@ -0,0 +1,174 @@ +#!/usr/bin/python3 +#--- +# scripts/pmd_hw_support.py +# +# Utility to dump PMD_INFO_STRING support from an object file +# +#--- +import os, sys +from optparse import OptionParser +import string +import json + +# For running from development directory. It should take precedence over the +# installed pyelftools. +sys.path.insert(0, '.') + + +from elftools import __version__ +from elftools.common.exceptions import ELFError +from elftools.common.py3compat import ( +ifilter, byte2int, bytes2str, itervalues, str2bytes) +from elftools.elf.elffile import ELFFile +from elftools.elf.dynamic import DynamicSection, DynamicSegment +from elftools.elf.enums import ENUM_D_TAG +from elftools.elf.segments import InterpSegment +from elftools.elf.sections import SymbolTableSection +from elftools.elf.gnuversions import ( +GNUVerSymSection, GNUVerDefSection, +GNUVerNeedSection, +) +from elftools.elf.relocation import RelocationSection +from elftools.elf.descriptions import ( +describe_ei_class, describe_ei_data, describe_ei_version, +describe_ei_osabi, describe_e_type, describe_e_machine, +describe_e_version_numeric, describe_p_type, describe_p_flags, +describe_sh_type, describe_sh_flags, +describe_symbol_type, describe_symbol_bind, describe_symbol_visibility, +describe_symbol_shndx, describe_reloc_type, describe_dyn_tag, +describe_ver_flags, +) +from elftools.elf.constants import E_FLAGS +from elftools.dwarf.dwarfinfo import DWARFInfo +from elftools.dwarf.descriptions import ( +describe_reg_name, describe_attr_value, set_global_machine_arch, +describe_CFI_instructions, describe_CFI_register_rule, +describe_CFI_CFA_rule, +) +from elftools.dwarf.constants import ( +DW_LNS_copy, DW_LNS_set_file, DW_LNE_define_file) +from elftools.dwarf.callframe import CIE, FDE + +raw_output = False; + +class ReadElf(object): +""" display_* methods are used to emit output into the output stream +""" +def __init__(self, file, output): +""" file: +stream object with the ELF file to read + +output: +output stream to write to +""" +self.elffile = ELFFile(file) +self.output = output + +# Lazily initialized if a debug dump is requested +self._dwarfinfo = None + +self._versioninfo = None + +def _section_from_spec(self, spec): +""" Retrieve a section given a "spec" (either number or name). +Return None if no such section exists in the file. +""" +try: +num = int(spec) +if num < self.elffile.num_sections(): +return self.elffile.get_section(num) +else: +return None +except ValueError: +# Not a number. Must be a name then +return self.elffile.get_section_by_name(str2bytes(spec)) + +def parse_pmd_info_string(self, mystring): +global raw_output +i = mystring.index("="); +mystring = mystring[i+2:] +pmdinfo = json.loads(mystring) + +if raw_output: +print(pmdinfo) +return + +print("PMD NAME: " + pmdinfo["name"]) +print("PMD TYPE: " + pmdinfo["type"]) +if (pmdinfo["type"] == "PMD_PDEV"): +print("PMD HW SUPPORT:") +print("VENDOR\t DEVICE\t SUBVENDOR\t SUBDEVICE") +for i in pmdinfo["pci_ids"]: +print("0x%04x\t 0x%04x\t 0x%04x\t\t 0x%04x" % (i[0], i[1], i[2], i[3])) + +print("") + + +def display_pmd_info_strings(self, section_spec): +""" Display a strings dump of a section. section_spec is either a +section number or a name. +""" +section = self._section_from_spec(section_spec) +if section is None: +return + + +found = False +data = section.data() +dataptr = 0 + +while dataptr < len(data): +while ( dataptr < len(data) and +not (32 <= byte2int(data[dataptr]) <= 127)): +dataptr += 1 + +if dataptr >= len(data): +brea