[dpdk-dev] [PATCH] l3fwd: Fix compilation with HASH_MULTI_LOOKUP
--- examples/l3fwd/l3fwd_em_hlm_sse.h | 38 +++--- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/examples/l3fwd/l3fwd_em_hlm_sse.h b/examples/l3fwd/l3fwd_em_hlm_sse.h index d3388da..891ae2e 100644 --- a/examples/l3fwd/l3fwd_em_hlm_sse.h +++ b/examples/l3fwd/l3fwd_em_hlm_sse.h @@ -46,7 +46,7 @@ static inline void em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8], - uint8_t portid, uint16_t dst_port[8]) + uint8_t portid, uint32_t dst_port[8]) { int32_t ret[8]; union ipv4_5tuple_host key[8]; @@ -77,14 +77,14 @@ em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8], sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live))); - key[0].xmm = _mm_and_si128(data[0], mask0); - key[1].xmm = _mm_and_si128(data[1], mask0); - key[2].xmm = _mm_and_si128(data[2], mask0); - key[3].xmm = _mm_and_si128(data[3], mask0); - key[4].xmm = _mm_and_si128(data[4], mask0); - key[5].xmm = _mm_and_si128(data[5], mask0); - key[6].xmm = _mm_and_si128(data[6], mask0); - key[7].xmm = _mm_and_si128(data[7], mask0); + key[0].xmm = _mm_and_si128(data[0], mask0.x); + key[1].xmm = _mm_and_si128(data[1], mask0.x); + key[2].xmm = _mm_and_si128(data[2], mask0.x); + key[3].xmm = _mm_and_si128(data[3], mask0.x); + key[4].xmm = _mm_and_si128(data[4], mask0.x); + key[5].xmm = _mm_and_si128(data[5], mask0.x); + key[6].xmm = _mm_and_si128(data[6], mask0.x); + key[7].xmm = _mm_and_si128(data[7], mask0.x); const void *key_array[8] = {&key[0], &key[1], &key[2], &key[3], &key[4], &key[5], &key[6], &key[7]}; @@ -170,19 +170,19 @@ get_ipv6_5tuple(struct rte_mbuf *m0, __m128i mask0, static inline void em_get_dst_port_ipv6x8(struct lcore_conf *qconf, struct rte_mbuf *m[8], - uint8_t portid, uint16_t dst_port[8]) + uint8_t portid, uint32_t dst_port[8]) { int32_t ret[8]; union ipv6_5tuple_host key[8]; - get_ipv6_5tuple(m[0], mask1, mask2, &key[0]); - get_ipv6_5tuple(m[1], mask1, mask2, &key[1]); - get_ipv6_5tuple(m[2], mask1, mask2, &key[2]); - 
get_ipv6_5tuple(m[3], mask1, mask2, &key[3]); - get_ipv6_5tuple(m[4], mask1, mask2, &key[4]); - get_ipv6_5tuple(m[5], mask1, mask2, &key[5]); - get_ipv6_5tuple(m[6], mask1, mask2, &key[6]); - get_ipv6_5tuple(m[7], mask1, mask2, &key[7]); + get_ipv6_5tuple(m[0], mask1.x, mask2.x, &key[0]); + get_ipv6_5tuple(m[1], mask1.x, mask2.x, &key[1]); + get_ipv6_5tuple(m[2], mask1.x, mask2.x, &key[2]); + get_ipv6_5tuple(m[3], mask1.x, mask2.x, &key[3]); + get_ipv6_5tuple(m[4], mask1.x, mask2.x, &key[4]); + get_ipv6_5tuple(m[5], mask1.x, mask2.x, &key[5]); + get_ipv6_5tuple(m[6], mask1.x, mask2.x, &key[6]); + get_ipv6_5tuple(m[7], mask1.x, mask2.x, &key[7]); const void *key_array[8] = {&key[0], &key[1], &key[2], &key[3], &key[4], &key[5], &key[6], &key[7]}; @@ -292,7 +292,7 @@ l3fwd_em_send_packets(int nb_rx, struct rte_mbuf **pkts_burst, uint8_t portid, struct lcore_conf *qconf) { int32_t j; - uint16_t dst_port[MAX_PKT_BURST]; + uint32_t dst_port[MAX_PKT_BURST]; /* * Send nb_rx - nb_rx%8 packets -- 1.9.1
[dpdk-dev] [PATCH] l3fwd: Fix compilation with HASH_MULTI_LOOKUP
Hi, 2016-03-16 00:23, Maciej Czekaj: > --- > examples/l3fwd/l3fwd_em_hlm_sse.h | 38 +++--- > 1 file changed, 19 insertions(+), 19 deletions(-) You've forgotten the explanation and the Signed-off-by line. Thanks
[dpdk-dev] Reg: promiscuous mode on VF
Hi Bharath, > 2) Is the above supported for 82599 controller? If it is supported in the > NIC, > please provide the steps to enable. Talking about 82599, VF unicast promiscuous mode is not supported. Only broadcast and multicast can be supported. > > Thanks, > Bharath Paulraj
[dpdk-dev] [PATCH v4 0/3] Snow3G support for Intel Quick Assist Devices
Tested-by: Min Cao - Tested Commit: 1b9cb73ecef109593081ab9efbd9d1429607bb99 - OS: Fedora20 3.11.10-301.fc20.x86_64 - GCC: gcc (GCC) 4.8.3 - CPU: Intel(R) Xeon(R) CPU E5-2658 v3 @ 2.20GHz - NIC: Niantic - Default x86_64-native-linuxapp-gcc configuration - Prerequisites: - Total 42 cases, 42 passed, 0 failed - test case 1: QAT Unit test Total 31 cases, 31 passed, 0 failed - test case 2: AES_NI Unit test Total 10 cases, 10 passed, 0 failed - test case 3: l2fwd-crypto Total 1 cases, 1 passed, 0 failed -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Deepak Kumar JAIN Sent: Friday, March 11, 2016 1:13 AM To: dev at dpdk.org Subject: [dpdk-dev] [PATCH v4 0/3] Snow3G support for Intel Quick Assist Devices This patchset contains fixes and refactoring for Snow3G(UEA2 and UIA2) wireless algorithm for Intel Quick Assist devices. QAT PMD previously supported only cipher/hash alg-chaining for AES/SHA. The code has been refactored to also support cipher-only and hash only (for Snow3G only) functionality along with alg-chaining. Changes from V3: 1) Rebase based on below mentioned patchset. 2) Fixes test failure which happens only after applying patch 1 only. Changes from v2: 1) Rebasing based on below mentioned patchset. 
This patchset depends on cryptodev API changes http://dpdk.org/ml/archives/dev/2016-March/035451.html Deepak Kumar JAIN (3): crypto: add cipher/auth only support qat: add support for Snow3G app/test: add Snow3G tests app/test/test_cryptodev.c | 1037 +++- app/test/test_cryptodev.h |3 +- app/test/test_cryptodev_snow3g_hash_test_vectors.h | 415 app/test/test_cryptodev_snow3g_test_vectors.h | 379 +++ doc/guides/cryptodevs/qat.rst |8 +- doc/guides/rel_notes/release_16_04.rst |6 + drivers/crypto/qat/qat_adf/qat_algs.h | 19 +- drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 284 +- drivers/crypto/qat/qat_crypto.c| 149 ++- drivers/crypto/qat/qat_crypto.h| 10 + 10 files changed, 2236 insertions(+), 74 deletions(-) create mode 100644 app/test/test_cryptodev_snow3g_hash_test_vectors.h create mode 100644 app/test/test_cryptodev_snow3g_test_vectors.h -- 2.1.0
[dpdk-dev] [PATCH v3 0/3] Snow3G support for Intel Quick Assist Devices
Tested-by: Min Cao - Tested Commit: 1b9cb73ecef109593081ab9efbd9d1429607bb99 - OS: Fedora20 3.11.10-301.fc20.x86_64 - GCC: gcc (GCC) 4.8.3 - CPU: Intel(R) Xeon(R) CPU E5-2658 v3 @ 2.20GHz - NIC: Niantic - Default x86_64-native-linuxapp-gcc configuration - Prerequisites: - Total 42 cases, 42 passed, 0 failed - test case 1: QAT Unit test Total 31 cases, 31 passed, 0 failed - test case 2: AES_NI Unit test Total 10 cases, 10 passed, 0 failed - test case 3: l2fwd-crypto Total 1 cases, 1 passed, 0 failed -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Deepak Kumar JAIN Sent: Thursday, March 03, 2016 9:01 PM To: dev at dpdk.org Subject: [dpdk-dev] [PATCH v3 0/3] Snow3G support for Intel Quick Assist Devices This patchset contains fixes and refactoring for Snow3G(UEA2 and UIA2) wireless algorithm for Intel Quick Assist devices. QAT PMD previously supported only cipher/hash alg-chaining for AES/SHA. The code has been refactored to also support cipher-only and hash only (for Snow3G only) functionality along with alg-chaining. Changes from v2: 1) Rebasing based on below mentioned patchset. This patchset depends on cryptodev API changes http://dpdk.org/ml/archives/dev/2016-February/034212.html Deepak Kumar JAIN (3): crypto: add cipher/auth only support qat: add support for Snow3G app/test: add Snow3G tests app/test/test_cryptodev.c | 1037 +++- app/test/test_cryptodev.h |3 +- app/test/test_cryptodev_snow3g_hash_test_vectors.h | 415 app/test/test_cryptodev_snow3g_test_vectors.h | 379 +++ doc/guides/cryptodevs/qat.rst |8 +- doc/guides/rel_notes/release_16_04.rst |6 + drivers/crypto/qat/qat_adf/qat_algs.h | 19 +- drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 280 +- drivers/crypto/qat/qat_crypto.c| 149 ++- drivers/crypto/qat/qat_crypto.h| 10 + 10 files changed, 2231 insertions(+), 75 deletions(-) create mode 100644 app/test/test_cryptodev_snow3g_hash_test_vectors.h create mode 100644 app/test/test_cryptodev_snow3g_test_vectors.h -- 2.1.0
[dpdk-dev] [PATCH v2 0/3] Snow3G support for Intel Quick Assist Devices
Tested-by: Min Cao - Tested Commit: 1b9cb73ecef109593081ab9efbd9d1429607bb99 - OS: Fedora20 3.11.10-301.fc20.x86_64 - GCC: gcc (GCC) 4.8.3 - CPU: Intel(R) Xeon(R) CPU E5-2658 v3 @ 2.20GHz - NIC: Niantic - Default x86_64-native-linuxapp-gcc configuration - Prerequisites: - Total 42 cases, 42 passed, 0 failed - test case 1: QAT Unit test Total 31 cases, 31 passed, 0 failed - test case 2: AES_NI Unit test Total 10 cases, 10 passed, 0 failed - test case 3: l2fwd-crypto Total 1 cases, 1 passed, 0 failed -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Deepak Kumar JAIN Sent: Tuesday, February 23, 2016 10:03 PM To: dev at dpdk.org Subject: [dpdk-dev] [PATCH v2 0/3] Snow3G support for Intel Quick Assist Devices This patchset contains fixes and refactoring for Snow3G(UEA2 and UIA2) wireless algorithm for Intel Quick Assist devices. QAT PMD previously supported only cipher/hash alg-chaining for AES/SHA. The code has been refactored to also support cipher-only and hash only (for Snow3G only) functionality along with alg-chaining. Changes from v1: 1) Hash-only fix and alg-chaining fix 2) Added hash test vectors for Snow3G UIA2 functionality.
This patchset depends on Cryptodev API changes http://dpdk.org/ml/archives/dev/2016-February/033551.html Deepak Kumar JAIN (3): crypto: add cipher/auth only support qat: add support for Snow3G app/test: add Snow3G tests app/test/test_cryptodev.c | 1037 +++- app/test/test_cryptodev.h |3 +- app/test/test_cryptodev_snow3g_hash_test_vectors.h | 415 app/test/test_cryptodev_snow3g_test_vectors.h | 379 +++ doc/guides/cryptodevs/qat.rst |8 +- doc/guides/rel_notes/release_16_04.rst |4 + drivers/crypto/qat/qat_adf/qat_algs.h | 19 +- drivers/crypto/qat/qat_adf/qat_algs_build_desc.c | 280 +- drivers/crypto/qat/qat_crypto.c| 147 ++- drivers/crypto/qat/qat_crypto.h| 10 + 10 files changed, 2228 insertions(+), 74 deletions(-) create mode 100644 app/test/test_cryptodev_snow3g_hash_test_vectors.h create mode 100644 app/test/test_cryptodev_snow3g_test_vectors.h -- 2.1.0
[dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the tail of rx hwring
Hi Jianbo, > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jianbo Liu > Sent: Monday, March 14, 2016 10:26 PM > To: Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org > Cc: Jianbo Liu > Subject: [dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the > tail of rx hwring > > When checking rx ring queue, it's possible that loop will break at the tail > while > there are packets still in the queue header. Could you give more details about the scenario in which this issue is hit? Thanks.
[dpdk-dev] [PATCH v2] l3fwd: Fix compilation with HASH_MULTI_LOOKUP
l3fwd does not compile with HASH_MULTI_LOOKUP. There are 2 issues: * in 64d395, mask0 changed type from xmm_t to rte_xmm_t -> use the x field of rte_xmm_t * in dc81eb, the dst_port parameter changed to uint32_t -> change uint16_t dst_port to uint32_t dst_port Signed-off-by: Maciej Czekaj --- examples/l3fwd/l3fwd_em_hlm_sse.h | 38 +++--- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/examples/l3fwd/l3fwd_em_hlm_sse.h b/examples/l3fwd/l3fwd_em_hlm_sse.h index d3388da..891ae2e 100644 --- a/examples/l3fwd/l3fwd_em_hlm_sse.h +++ b/examples/l3fwd/l3fwd_em_hlm_sse.h @@ -46,7 +46,7 @@ static inline void em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8], - uint8_t portid, uint16_t dst_port[8]) + uint8_t portid, uint32_t dst_port[8]) { int32_t ret[8]; union ipv4_5tuple_host key[8]; @@ -77,14 +77,14 @@ em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8], sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live))); - key[0].xmm = _mm_and_si128(data[0], mask0); - key[1].xmm = _mm_and_si128(data[1], mask0); - key[2].xmm = _mm_and_si128(data[2], mask0); - key[3].xmm = _mm_and_si128(data[3], mask0); - key[4].xmm = _mm_and_si128(data[4], mask0); - key[5].xmm = _mm_and_si128(data[5], mask0); - key[6].xmm = _mm_and_si128(data[6], mask0); - key[7].xmm = _mm_and_si128(data[7], mask0); + key[0].xmm = _mm_and_si128(data[0], mask0.x); + key[1].xmm = _mm_and_si128(data[1], mask0.x); + key[2].xmm = _mm_and_si128(data[2], mask0.x); + key[3].xmm = _mm_and_si128(data[3], mask0.x); + key[4].xmm = _mm_and_si128(data[4], mask0.x); + key[5].xmm = _mm_and_si128(data[5], mask0.x); + key[6].xmm = _mm_and_si128(data[6], mask0.x); + key[7].xmm = _mm_and_si128(data[7], mask0.x); const void *key_array[8] = {&key[0], &key[1], &key[2], &key[3], &key[4], &key[5], &key[6], &key[7]}; @@ -170,19 +170,19 @@ get_ipv6_5tuple(struct rte_mbuf *m0, __m128i mask0, static inline void em_get_dst_port_ipv6x8(struct lcore_conf *qconf, struct rte_mbuf *m[8], - uint8_t portid,
uint16_t dst_port[8]) + uint8_t portid, uint32_t dst_port[8]) { int32_t ret[8]; union ipv6_5tuple_host key[8]; - get_ipv6_5tuple(m[0], mask1, mask2, &key[0]); - get_ipv6_5tuple(m[1], mask1, mask2, &key[1]); - get_ipv6_5tuple(m[2], mask1, mask2, &key[2]); - get_ipv6_5tuple(m[3], mask1, mask2, &key[3]); - get_ipv6_5tuple(m[4], mask1, mask2, &key[4]); - get_ipv6_5tuple(m[5], mask1, mask2, &key[5]); - get_ipv6_5tuple(m[6], mask1, mask2, &key[6]); - get_ipv6_5tuple(m[7], mask1, mask2, &key[7]); + get_ipv6_5tuple(m[0], mask1.x, mask2.x, &key[0]); + get_ipv6_5tuple(m[1], mask1.x, mask2.x, &key[1]); + get_ipv6_5tuple(m[2], mask1.x, mask2.x, &key[2]); + get_ipv6_5tuple(m[3], mask1.x, mask2.x, &key[3]); + get_ipv6_5tuple(m[4], mask1.x, mask2.x, &key[4]); + get_ipv6_5tuple(m[5], mask1.x, mask2.x, &key[5]); + get_ipv6_5tuple(m[6], mask1.x, mask2.x, &key[6]); + get_ipv6_5tuple(m[7], mask1.x, mask2.x, &key[7]); const void *key_array[8] = {&key[0], &key[1], &key[2], &key[3], &key[4], &key[5], &key[6], &key[7]}; @@ -292,7 +292,7 @@ l3fwd_em_send_packets(int nb_rx, struct rte_mbuf **pkts_burst, uint8_t portid, struct lcore_conf *qconf) { int32_t j; - uint16_t dst_port[MAX_PKT_BURST]; + uint32_t dst_port[MAX_PKT_BURST]; /* * Send nb_rx - nb_rx%8 packets -- 1.9.1
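The first issue in the commit message above can be illustrated with a minimal sketch. It assumes an x86 build with SSE2 and uses a hypothetical union, demo_xmm_t, as a stand-in for rte_xmm_t; the helper demo_mask_lane() is likewise made up for illustration and is not part of l3fwd.

```c
#include <emmintrin.h>	/* SSE2: __m128i, _mm_and_si128 */
#include <stdint.h>

/* Hypothetical stand-in for rte_xmm_t: one 128-bit value viewed two ways. */
typedef union {
	__m128i x;		/* the raw vector, which the intrinsics expect */
	uint32_t u32[4];	/* the same 16 bytes as four 32-bit lanes */
} demo_xmm_t;

/*
 * AND four 32-bit lanes of data with a mask and return one lane.
 * Passing "m" itself to _mm_and_si128() would not compile, because the
 * intrinsic takes __m128i, not a union; "m.x" selects the vector member.
 */
static uint32_t
demo_mask_lane(const uint32_t data[4], const uint32_t mask[4], int lane)
{
	demo_xmm_t d, m, out;
	int i;

	for (i = 0; i < 4; i++) {
		d.u32[i] = data[i];
		m.u32[i] = mask[i];
	}
	out.x = _mm_and_si128(d.x, m.x);
	return out.u32[lane];
}
```

Once the mask is wrapped in a union, every call site that hands it to an intrinsic needs the member access, which is why the patch touches each of the eight key[] assignments.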
[dpdk-dev] [PATCH] vhost: remove unnecessary memset for virtio net hdr
Previously, we had to reset the virtio net hdr in virtio_enqueue_offload(), because all mbufs shared a single virtio_hdr structure: struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, }, 0}; foreach (mbuf) { virtio_enqueue_offload(mbuf, &virtio_hdr.hdr); copy net hdr and mbuf to desc buf } However, after the vhost rxtx refactor, the code looks like: copy_mbuf_to_desc(mbuf) { struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, }, 0}; virtio_enqueue_offload(mbuf, &virtio_hdr.hdr); copy net hdr and mbuf to desc buf } foreach (mbuf) { copy_mbuf_to_desc(mbuf); } Each call now starts from a freshly zero-initialized header, so the memset in virtio_enqueue_offload() is no longer necessary; remove it. Signed-off-by: Yuanhan Liu --- lib/librte_vhost/vhost_rxtx.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c index a6330f8..b4da665 100644 --- a/lib/librte_vhost/vhost_rxtx.c +++ b/lib/librte_vhost/vhost_rxtx.c @@ -94,8 +94,6 @@ is_valid_virt_queue_idx(uint32_t idx, int is_tx, uint32_t qp_nb) static void virtio_enqueue_offload(struct rte_mbuf *m_buf, struct virtio_net_hdr *net_hdr) { - memset(net_hdr, 0, sizeof(struct virtio_net_hdr)); - if (m_buf->ol_flags & PKT_TX_L4_MASK) { net_hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; net_hdr->csum_start = m_buf->l2_len + m_buf->l3_len; -- 1.9.0
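The reasoning in this commit message can be checked with a small standalone sketch. The types demo_net_hdr and demo_pkt and both functions are hypothetical simplifications, not the vhost structures; the only point is that a header zero-initialized per packet cannot carry state over from the previous packet, even when the offload helper writes fields conditionally.

```c
#include <stdint.h>

/* Hypothetical, much-reduced stand-ins for the virtio header and the mbuf. */
struct demo_net_hdr {
	uint8_t flags;
	uint16_t csum_start;
};

struct demo_pkt {
	int needs_csum;
	uint16_t l2_len;
	uint16_t l3_len;
};

/* Writes header fields only when offload is requested; no memset here. */
static void
demo_enqueue_offload(const struct demo_pkt *pkt, struct demo_net_hdr *hdr)
{
	if (pkt->needs_csum) {
		hdr->flags = 1;
		hdr->csum_start = (uint16_t)(pkt->l2_len + pkt->l3_len);
	}
}

/* Per-packet path: the header is a fresh local, zeroed on every call. */
static struct demo_net_hdr
demo_build_hdr(const struct demo_pkt *pkt)
{
	struct demo_net_hdr hdr = {0};

	demo_enqueue_offload(pkt, &hdr);
	return hdr;
}
```

Building a header for a checksum packet and then for a plain packet yields an all-zero header for the second one. With a single header shared across the loop, as in the old code, the second header would have kept the first packet's flags unless it was cleared explicitly.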
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 03/14/2016 05:32 PM, Ferruh Yigit wrote: > On 3/9/2016 11:17 AM, Ferruh Yigit wrote: >> This patch sent to keep record of latest status of the work. >> >> >> This is slow data path communication implementation based on existing KNI. >> >> Difference is: librte_kni converted into a PMD, kdp kernel module is almost >> same except all control path functionality removed and some simplification >> done. >> >> Motivation is to simplify slow path data communication. >> Now any application can use this new PMD to send/get data to Linux kernel. >> >> PMD supports two communication methods: >> >> 1) KDP kernel module >> PMD initialization functions handles creating virtual interfaces (with help >> of >> kdp kernel module) and created FIFO. FIFO is used to share data between >> userspace and kernelspace. This is default method. >> >> 2) tun/tap module >> When KDP module is not inserted, PMD creates tap interface and transfers >> packets using tap interface. >> >> In long term this patch intends to replace the KNI and KNI will be >> depreciated. >> > > Self-NACK: Will work on another option that does not introduce new > kernel module. > Hmm, care to elaborate a bit? The second mode of this PMD already was free of external kernel modules. Do you mean you'll be just removing mode 1) from the PMD or looking at something completely different? Just thinking that tun/tap PMD sounds like a useful thing to have, I hope you're not abandoning that. - Panu -
[dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the tail of rx hwring
Hi Wenzhuo, On 16 March 2016 at 14:06, Lu, Wenzhuo wrote: > HI Jianbo, > > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jianbo Liu >> Sent: Monday, March 14, 2016 10:26 PM >> To: Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org >> Cc: Jianbo Liu >> Subject: [dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the >> tail of rx hwring >> >> When checking rx ring queue, it's possible that loop will break at the tail >> while >> there are packets still in the queue header. > Would you like to give more details about in what scenario this issue will be > hit? Thanks. > vPMD places an extra RTE_IXGBE_DESCS_PER_LOOP - 1 empty descriptors at the end of the hwring to avoid overflow when checking on the rx side. The loop in _recv_raw_pkts_vec() checks 4 descriptors at a time. If all 4 DD bits are set, all 4 packets are received; that is fine in the middle of the ring. But at the end of the hwring, when fewer than 4 descriptors are left, we still check 4 descriptors at once, so the extra empty descriptors are checked along with them. This time the number of received packets is necessarily less than 4, and we break out of the loop because of the condition "var != RTE_IXGBE_DESCS_PER_LOOP". That is where the problem arises: there may be more packets at the beginning of the hwring still waiting to be received. I think this fix avoids that situation, or at least reduces the latency for the packets at the head of the ring. Thanks! Jianbo
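The scenario described above can be modelled in a few lines. This is a simplified sketch, not the ixgbe code: the ring size, the dd[] array of "done" flags and demo_scan() are assumptions made for illustration, and the real driver works on descriptor status words with SSE instructions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE      8	/* real descriptors in the ring */
#define DEMO_DESCS_PER_LOOP 4	/* descriptors examined per iteration */

/*
 * dd[] has DEMO_RING_SIZE + DEMO_DESCS_PER_LOOP - 1 entries; the last
 * DEMO_DESCS_PER_LOOP - 1 are padding and are never marked done.
 * Returns how many packets one burst picks up starting at "tail".
 */
static uint16_t
demo_scan(const uint8_t *dd, uint16_t tail, uint16_t nb_pkts)
{
	uint16_t nb_recd = 0;
	uint16_t pos;

	assert(tail < DEMO_RING_SIZE);
	for (pos = 0; pos + DEMO_DESCS_PER_LOOP <= nb_pkts &&
	     tail + pos < DEMO_RING_SIZE; pos += DEMO_DESCS_PER_LOOP) {
		uint16_t var = 0;
		int i;

		/* count the done descriptors in this group of four */
		for (i = 0; i < DEMO_DESCS_PER_LOOP; i++)
			var += dd[tail + pos + i] != 0;

		nb_recd += var;
		if (var != DEMO_DESCS_PER_LOOP)
			break;	/* the early exit discussed above */
	}
	return nb_recd;
}
```

With all 8 real descriptors done and the tail at index 6, the group covers indices 6 to 9, of which 8 and 9 are padding, so the burst stops after 2 packets even though descriptors 0 to 5 are ready. In this model those packets are only picked up by the next call, which is the extra latency the patch is aimed at.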
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 3/16/2016 7:26 AM, Panu Matilainen wrote: > On 03/14/2016 05:32 PM, Ferruh Yigit wrote: >> On 3/9/2016 11:17 AM, Ferruh Yigit wrote: >>> This patch sent to keep record of latest status of the work. >>> >>> >>> This is slow data path communication implementation based on existing KNI. >>> >>> Difference is: librte_kni converted into a PMD, kdp kernel module is almost >>> same except all control path functionality removed and some simplification >>> done. >>> >>> Motivation is to simplify slow path data communication. >>> Now any application can use this new PMD to send/get data to Linux kernel. >>> >>> PMD supports two communication methods: >>> >>> 1) KDP kernel module >>> PMD initialization functions handles creating virtual interfaces (with help >>> of >>> kdp kernel module) and created FIFO. FIFO is used to share data between >>> userspace and kernelspace. This is default method. >>> >>> 2) tun/tap module >>> When KDP module is not inserted, PMD creates tap interface and transfers >>> packets using tap interface. >>> >>> In long term this patch intends to replace the KNI and KNI will be >>> depreciated. >>> >> >> Self-NACK: Will work on another option that does not introduce new >> kernel module. >> > > Hmm, care to elaborate a bit? The second mode of this PMD already was > free of external kernel modules. Do you mean you'll be just removing > mode 1) from the PMD or looking at something completely different? > > Just thinking that tun/tap PMD sounds like a useful thing to have, I > hope you're not abandoning that. > It will be KNI PMD. Plan is to have something like KDP, but with existing KNI kernel module. There will be tun/tap support as fallback. Regards, ferruh
[dpdk-dev] [PATCH] vhost: remove lockless enqueue to the virtio ring
On 3/15/2016 7:14 AM, Thomas Monjalon wrote: > 2016-01-05 07:16, Xie, Huawei: >> On 1/5/2016 2:42 PM, Xie, Huawei wrote: >>> This patch removes the internal lockless enqueue implementation. >>> DPDK doesn't support receiving/transmitting packets from/to the same >>> queue. Vhost PMD wraps vhost device as normal DPDK port. DPDK >>> applications normally have their own lock implementation when enqueue >>> packets to the same queue of a port. >>> >>> The atomic cmpset is a costly operation. This patch should help >>> performance a bit. >>> >>> Signed-off-by: Huawei Xie >> This patch modifies the API's behavior, which is also a trivial ABI >> change. In my opinion, application shouldn't rely on previous behavior. >> Anyway, i am checking how to declare the ABI change. > I guess this patch is now obsolete? How about we delay this to the next release, after more consideration of whether we should keep this behavior and what the best approach to concurrency in vhost is?
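For readers unfamiliar with the kind of lockless enqueue under discussion, the following is a rough sketch of reserving ring entries with a compare-and-swap loop. It uses C11 atomics rather than DPDK's own atomic helpers, and the structure demo_ring and function demo_reserve() are hypothetical; the actual vhost implementation may differ in its details.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring state: entries in [res_idx, avail_idx) are free. */
struct demo_ring {
	_Atomic uint16_t res_idx;	/* next entry not yet reserved */
	uint16_t avail_idx;		/* end of the available entries */
};

/*
 * Reserve up to "count" entries for the calling thread.
 * Returns the number reserved and stores the first index in *start.
 * Several threads may call this concurrently; the compare-and-swap
 * makes sure that no two of them get overlapping ranges.
 */
static uint16_t
demo_reserve(struct demo_ring *r, uint16_t count, uint16_t *start)
{
	uint16_t base, n;

	do {
		uint16_t free_entries;

		base = atomic_load(&r->res_idx);
		free_entries = (uint16_t)(r->avail_idx - base);
		n = count < free_entries ? count : free_entries;
		if (n == 0)
			return 0;
		/* retry if another thread moved res_idx in the meantime */
	} while (!atomic_compare_exchange_weak(&r->res_idx, &base,
					       (uint16_t)(base + n)));

	*start = base;
	return n;
}
```

The cost mentioned in the quoted patch comes from this atomic read-modify-write on every burst. An application that already serializes access to a queue pays for it without getting anything in return.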
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 03/16/2016 10:19 AM, Ferruh Yigit wrote: > On 3/16/2016 7:26 AM, Panu Matilainen wrote: >> On 03/14/2016 05:32 PM, Ferruh Yigit wrote: >>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote: This patch sent to keep record of latest status of the work. This is slow data path communication implementation based on existing KNI. Difference is: librte_kni converted into a PMD, kdp kernel module is almost same except all control path functionality removed and some simplification done. Motivation is to simplify slow path data communication. Now any application can use this new PMD to send/get data to Linux kernel. PMD supports two communication methods: 1) KDP kernel module PMD initialization functions handles creating virtual interfaces (with help of kdp kernel module) and created FIFO. FIFO is used to share data between userspace and kernelspace. This is default method. 2) tun/tap module When KDP module is not inserted, PMD creates tap interface and transfers packets using tap interface. In long term this patch intends to replace the KNI and KNI will be depreciated. >>> >>> Self-NACK: Will work on another option that does not introduce new >>> kernel module. >>> >> >> Hmm, care to elaborate a bit? The second mode of this PMD already was >> free of external kernel modules. Do you mean you'll be just removing >> mode 1) from the PMD or looking at something completely different? >> >> Just thinking that tun/tap PMD sounds like a useful thing to have, I >> hope you're not abandoning that. >> > > It will be KNI PMD. > Plan is to have something like KDP, but with existing KNI kernel module. > There will be tun/tap support as fallback. Hum, now I'm confused. I was under the impression everybody hated KNI and wanted to get rid of it, and certainly not build future solutions on top of it? - Panu - > > Regards, > ferruh >
[dpdk-dev] [PATCH 0/3 v3] virtio: Tx performance improvements
On 3/14/2016 6:56 PM, Richardson, Bruce wrote: > On Fri, Mar 04, 2016 at 10:19:18AM -0800, Stephen Hemminger wrote: >> This patch series uses virtio negotiated features to allow for >> more packets to be queued to host even though the default QEMU/KVM >> virtio queue is very small 256 elements. >> >> Stephen Hemminger (3): >> virtio: use indirect ring elements >> virtio: use any layout on transmit >> virtio: optimize transmit enqueue >> > These patches require an ack to merge. Virtio maintainers, can you please > review and ack if ok. > > /Bruce > Acked in the previous release window. Acked-by: Huawei Xie
[dpdk-dev] [PATCH] vhost: remove lockless enqueue to the virtio ring
On Wed, Mar 16, 2016 at 08:20:37AM +, Xie, Huawei wrote: > On 3/15/2016 7:14 AM, Thomas Monjalon wrote: > > 2016-01-05 07:16, Xie, Huawei: > >> On 1/5/2016 2:42 PM, Xie, Huawei wrote: > >>> This patch removes the internal lockless enqueue implementation. > >>> DPDK doesn't support receiving/transmitting packets from/to the same > >>> queue. Vhost PMD wraps vhost device as normal DPDK port. DPDK > >>> applications normally have their own lock implementation when enqueue > >>> packets to the same queue of a port. > >>> > >>> The atomic cmpset is a costly operation. This patch should help > >>> performance a bit. > >>> > >>> Signed-off-by: Huawei Xie > >> This patch modifies the API's behavior, which is also a trivial ABI > >> change. In my opinion, application shouldn't rely on previous behavior. > >> Anyway, i am checking how to declare the ABI change. > > I guess this patch is now obsolete? > > How about we delay this to next release after more considerations, I'd suggest so. > whether we should keep this behavior, and what is the best way for > concurrency in vhost. I'm wondering whether we should make an announcement first, to notify users of the behaviour change? --yliu
[dpdk-dev] [PATCH] i40e: fix build issue for RX set function
Issue: When CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC is set to n in the config file, the build fails with: 'i40e_recv_pkts_bulk_alloc' undeclared. The i40e PMD uses this macro to decide whether the bulk receive functions are defined, but the selection of the RX function depends only on a C variable. This inconsistency leads to the build error, because the bulk receive function is referenced but not defined. Fixes: 8e109464 (i40e: allow vector Rx and Tx usage) Signed-off-by: Zhe Tao --- drivers/net/i40e/i40e_rxtx.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index 8931b8e..1488f2f 100644 --- a/drivers/net/i40e/i40e_rxtx.c +++ b/drivers/net/i40e/i40e_rxtx.c @@ -1175,6 +1175,14 @@ i40e_recv_pkts_bulk_alloc(void *rx_queue, return nb_rx; } +#else +static uint16_t +i40e_recv_pkts_bulk_alloc(void __rte_unused *rx_queue, + struct rte_mbuf __rte_unused **rx_pkts, + uint16_t __rte_unused nb_pkts) +{ + return 0; +} #endif /* RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC */ uint16_t -- 2.1.4
[dpdk-dev] [PATCH v2] i40e: fix build issue for RX set function
Issue: When CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC is set to n in the config file, the build fails with: 'i40e_recv_pkts_bulk_alloc' undeclared. The i40e PMD uses this macro to decide whether the bulk receive functions are defined, but the selection of the RX function depends only on a C variable. This inconsistency leads to the build error, because the bulk receive function is referenced but not defined. Fixes: 8e109464 (i40e: allow vector Rx and Tx usage) Signed-off-by: Zhe Tao --- V2: fix some characters issues in commit log drivers/net/i40e/i40e_rxtx.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c index 8931b8e..1488f2f 100644 --- a/drivers/net/i40e/i40e_rxtx.c +++ b/drivers/net/i40e/i40e_rxtx.c @@ -1175,6 +1175,14 @@ i40e_recv_pkts_bulk_alloc(void *rx_queue, return nb_rx; } +#else +static uint16_t +i40e_recv_pkts_bulk_alloc(void __rte_unused *rx_queue, + struct rte_mbuf __rte_unused **rx_pkts, + uint16_t __rte_unused nb_pkts) +{ + return 0; +} #endif /* RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC */ uint16_t -- 2.1.4
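The pattern behind this fix can be shown in isolation. The sketch below is not the i40e code; the macro DEMO_RX_ALLOW_BULK_ALLOC and all function names are hypothetical. It shows a function that exists in full only when a build option is on, a stub in the #else branch, and a runtime selector that refers to the function in either configuration.

```c
#include <stdint.h>

/* Hypothetical build switch; leave it undefined to model "=n". */
/* #define DEMO_RX_ALLOW_BULK_ALLOC */

typedef uint16_t (*demo_rx_fn)(void *queue, void **pkts, uint16_t nb_pkts);

/* Always-available receive path (placeholder body). */
static uint16_t
demo_recv_pkts(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts ? 1 : 0;
}

#ifdef DEMO_RX_ALLOW_BULK_ALLOC
/* Full bulk path, compiled only when the option is enabled. */
static uint16_t
demo_recv_pkts_bulk_alloc(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	return nb_pkts;
}
#else
/* Stub so that the selector below still compiles and links. */
static uint16_t
demo_recv_pkts_bulk_alloc(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}
#endif /* DEMO_RX_ALLOW_BULK_ALLOC */

/* Selection depends on a runtime value, not on the macro. */
static demo_rx_fn
demo_select_rx(int bulk_alloc_allowed)
{
	return bulk_alloc_allowed ? demo_recv_pkts_bulk_alloc : demo_recv_pkts;
}
```

Without the #else branch, demo_select_rx() would refer to an undeclared symbol whenever the option is off, which is the build error reported above. An alternative design would be to guard the selector with the same macro; the stub keeps the selection logic free of preprocessor conditionals.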
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 3/16/2016 8:22 AM, Panu Matilainen wrote: > On 03/16/2016 10:19 AM, Ferruh Yigit wrote: >> On 3/16/2016 7:26 AM, Panu Matilainen wrote: >>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote: On 3/9/2016 11:17 AM, Ferruh Yigit wrote: > This patch sent to keep record of latest status of the work. > > > This is slow data path communication implementation based on existing KNI. > > Difference is: librte_kni converted into a PMD, kdp kernel module is > almost > same except all control path functionality removed and some > simplification done. > > Motivation is to simplify slow path data communication. > Now any application can use this new PMD to send/get data to Linux kernel. > > PMD supports two communication methods: > > 1) KDP kernel module > PMD initialization functions handles creating virtual interfaces (with > help of > kdp kernel module) and created FIFO. FIFO is used to share data between > userspace and kernelspace. This is default method. > > 2) tun/tap module > When KDP module is not inserted, PMD creates tap interface and transfers > packets using tap interface. > > In long term this patch intends to replace the KNI and KNI will be > depreciated. > Self-NACK: Will work on another option that does not introduce new kernel module. >>> >>> Hmm, care to elaborate a bit? The second mode of this PMD already was >>> free of external kernel modules. Do you mean you'll be just removing >>> mode 1) from the PMD or looking at something completely different? >>> >>> Just thinking that tun/tap PMD sounds like a useful thing to have, I >>> hope you're not abandoning that. >>> >> >> It will be KNI PMD. >> Plan is to have something like KDP, but with existing KNI kernel module. >> There will be tun/tap support as fallback. > > Hum, now I'm confused. I was under the impression everybody hated KNI > and wanted to get rid of it, and certainly not build future solutions on > top of it? > We can't remove it. 
We can't replace/improve it - you were one of the main opponents of this. That leaves no option other than using it. There won't be any update to the KNI kernel module; the library + sample app will be converted into a PMD. Regards, ferruh
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
2016-03-16 10:26, Ferruh Yigit: > On 3/16/2016 8:22 AM, Panu Matilainen wrote: > > On 03/16/2016 10:19 AM, Ferruh Yigit wrote: > >> On 3/16/2016 7:26 AM, Panu Matilainen wrote: > >>> On 03/14/2016 05:32 PM, Ferruh Yigit wrote: > On 3/9/2016 11:17 AM, Ferruh Yigit wrote: > > This patch sent to keep record of latest status of the work. > > > > > > This is slow data path communication implementation based on existing > > KNI. > > > > Difference is: librte_kni converted into a PMD, kdp kernel module is > > almost > > same except all control path functionality removed and some > > simplification done. > > > > Motivation is to simplify slow path data communication. > > Now any application can use this new PMD to send/get data to Linux > > kernel. > > > > PMD supports two communication methods: > > > > 1) KDP kernel module > > PMD initialization functions handles creating virtual interfaces (with > > help of > > kdp kernel module) and created FIFO. FIFO is used to share data between > > userspace and kernelspace. This is default method. > > > > 2) tun/tap module > > When KDP module is not inserted, PMD creates tap interface and transfers > > packets using tap interface. > > > > In long term this patch intends to replace the KNI and KNI will be > > depreciated. > > > > Self-NACK: Will work on another option that does not introduce new > kernel module. > > >>> > >>> Hmm, care to elaborate a bit? The second mode of this PMD already was > >>> free of external kernel modules. Do you mean you'll be just removing > >>> mode 1) from the PMD or looking at something completely different? > >>> > >>> Just thinking that tun/tap PMD sounds like a useful thing to have, I > >>> hope you're not abandoning that. > >>> > >> > >> It will be KNI PMD. > >> Plan is to have something like KDP, but with existing KNI kernel module. > >> There will be tun/tap support as fallback. > > > > Hum, now I'm confused. 
I was under the impression everybody hated KNI > > and wanted to get rid of it, and certainly not build future solutions on > > top of it? > > We can't remove it. Why? > We can't replace/improve it -you were one of the major opposition to this. > This doesn't leave more option other than using it. Why cannot we replace it by something upstream? > There won't be any update in KNI kernel module, library + sample app > will be converted into PMD.
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On Wed, Mar 16, 2016 at 10:22:05AM +0200, Panu Matilainen wrote: > On 03/16/2016 10:19 AM, Ferruh Yigit wrote: > >On 3/16/2016 7:26 AM, Panu Matilainen wrote: > >>On 03/14/2016 05:32 PM, Ferruh Yigit wrote: > >>>On 3/9/2016 11:17 AM, Ferruh Yigit wrote: > This patch sent to keep record of latest status of the work. > > > This is slow data path communication implementation based on existing KNI. > > Difference is: librte_kni converted into a PMD, kdp kernel module is > almost > same except all control path functionality removed and some > simplification done. > > Motivation is to simplify slow path data communication. > Now any application can use this new PMD to send/get data to Linux kernel. > > PMD supports two communication methods: > > 1) KDP kernel module > PMD initialization functions handles creating virtual interfaces (with > help of > kdp kernel module) and created FIFO. FIFO is used to share data between > userspace and kernelspace. This is default method. > > 2) tun/tap module > When KDP module is not inserted, PMD creates tap interface and transfers > packets using tap interface. > > In long term this patch intends to replace the KNI and KNI will be > depreciated. > > >>> > >>>Self-NACK: Will work on another option that does not introduce new > >>>kernel module. > >>> > >> > >>Hmm, care to elaborate a bit? The second mode of this PMD already was > >>free of external kernel modules. Do you mean you'll be just removing > >>mode 1) from the PMD or looking at something completely different? > >> > >>Just thinking that tun/tap PMD sounds like a useful thing to have, I > >>hope you're not abandoning that. > >> > > > >It will be KNI PMD. > >Plan is to have something like KDP, but with existing KNI kernel module. > >There will be tun/tap support as fallback. > > Hum, now I'm confused. I was under the impression everybody hated KNI and > wanted to get rid of it, and certainly not build future solutions on top of > it? 
> KNI has its issues - mainly: a) not being upstream and b) having large amounts of code to do port management that is best handled by other means - but the code for transferring packets between kernel space and userspace is more performant and scalable than TUN/TAP, so we need to keep that around unless/until we can get TUN/TAP to reach the same performance levels. Now, we are thinking of some ways in which that can be achieved, but any such solution is going to be a bit out, so making any driver for transferring packets from user->kernel and vice versa might as well take advantage of KNI as well as TUN/TAP so as to allow those who want the extra performance to have it. Regards, /Bruce
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Wednesday, March 16, 2016 10:46 AM > To: Yigit, Ferruh > Cc: dev at dpdk.org; Panu Matilainen ; David > Marchand ; Zhang, Helin > > Subject: Re: [dpdk-dev] [PATCH v3 0/2] slow data path communication > between DPDK port and Linux > > > > We can't remove it. > > Why? There are a lot of people using KNI. > > We can't replace/improve it -you were one of the major opposition to this. > > This doesn't leave more option other than using it. > > Why cannot we replace it by something upstream? In theory it could be upstreamed. Let's see how we get on with upstreaming the KCP component first. John
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 3/16/2016 10:45 AM, Thomas Monjalon wrote: > 2016-03-16 10:26, Ferruh Yigit: >> On 3/16/2016 8:22 AM, Panu Matilainen wrote: >>> On 03/16/2016 10:19 AM, Ferruh Yigit wrote: On 3/16/2016 7:26 AM, Panu Matilainen wrote: > On 03/14/2016 05:32 PM, Ferruh Yigit wrote: >> On 3/9/2016 11:17 AM, Ferruh Yigit wrote: >>> This patch sent to keep record of latest status of the work. >>> >>> >>> This is slow data path communication implementation based on existing >>> KNI. >>> >>> Difference is: librte_kni converted into a PMD, kdp kernel module is >>> almost >>> same except all control path functionality removed and some >>> simplification done. >>> >>> Motivation is to simplify slow path data communication. >>> Now any application can use this new PMD to send/get data to Linux >>> kernel. >>> >>> PMD supports two communication methods: >>> >>> 1) KDP kernel module >>> PMD initialization functions handles creating virtual interfaces (with >>> help of >>> kdp kernel module) and created FIFO. FIFO is used to share data between >>> userspace and kernelspace. This is default method. >>> >>> 2) tun/tap module >>> When KDP module is not inserted, PMD creates tap interface and transfers >>> packets using tap interface. >>> >>> In long term this patch intends to replace the KNI and KNI will be >>> depreciated. >>> >> >> Self-NACK: Will work on another option that does not introduce new >> kernel module. >> > > Hmm, care to elaborate a bit? The second mode of this PMD already was > free of external kernel modules. Do you mean you'll be just removing > mode 1) from the PMD or looking at something completely different? > > Just thinking that tun/tap PMD sounds like a useful thing to have, I > hope you're not abandoning that. > It will be KNI PMD. Plan is to have something like KDP, but with existing KNI kernel module. There will be tun/tap support as fallback. >>> >>> Hum, now I'm confused. 
I was under the impression everybody hated KNI >>> and wanted to get rid of it, and certainly not build future solutions on >>> top of it? >> >> We can't remove it. > > Why? > >> We can't replace/improve it -you were one of the major opposition to this. >> This doesn't leave more option other than using it. > > Why cannot we replace it by something upstream? > I doubt KDP is upstream-able to Linux community. If somebody can, that is great. Even for KCP, upstreaming task is still under discussion, and as a heads up, it is likely to be dropped. Regards, ferruh >> There won't be any update in KNI kernel module, library + sample app >> will be converted into PMD. > >
[dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the tail of rx hwring
On Wed, Mar 16, 2016 at 03:51:53PM +0800, Jianbo Liu wrote: > Hi Wenzhuo, > > On 16 March 2016 at 14:06, Lu, Wenzhuo wrote: > > HI Jianbo, > > > > > >> -Original Message- > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jianbo Liu > >> Sent: Monday, March 14, 2016 10:26 PM > >> To: Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org > >> Cc: Jianbo Liu > >> Subject: [dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at > >> the > >> tail of rx hwring > >> > >> When checking rx ring queue, it's possible that loop will break at the > >> tail while > >> there are packets still in the queue header. > > Would you like to give more details about in what scenario this issue will > > be hit? Thanks. > > > > vPMD will place extra RTE_IXGBE_DESCS_PER_LOOP - 1 number of empty > descriptiors at the end of hwring to avoid overflow when do checking > on rx side. > > For the loop in _recv_raw_pkts_vec(), we check 4 descriptors each > time. If all 4 DD are set, and all 4 packets are received.That's OK in > the middle. > But if come to the end of hwring, and less than 4 descriptors left, we > still need to check 4 descriptors at the same time, so the extra empty > descriptors are checked with them. > This time, the number of received packets is apparently less than 4, > and we break out of the loop because of the condition "var != > RTE_IXGBE_DESCS_PER_LOOP". > So the problem arises. It is possible that there could be more packets > at the hwring beginning that still waiting for being received. > I think this fix can avoid this situation, and at least reduce the > latency for the packets in the header. > Packets are always received in order from the NIC, so no packets ever get left behind or skipped on an RX burst call. /Bruce
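The tail behaviour discussed above can be modelled with a small sketch. This is not the ixgbe vector receive code: toy_recv_burst, the dd[] flag array and TOY_DESCS_PER_LOOP are hypothetical stand-ins, and the model assumes completions arrive strictly in order, as Bruce notes. It only shows that a burst starting near the end of the ring stops early on the empty padding descriptors, while nothing is skipped; the remaining packets would be picked up by the next call.

```c
#include <stdint.h>

#define TOY_DESCS_PER_LOOP 4 /* hypothetical stand-in for RTE_IXGBE_DESCS_PER_LOOP */

/*
 * dd[] holds one "descriptor done" flag per slot: ring_size real slots
 * followed by TOY_DESCS_PER_LOOP - 1 padding slots that are always 0.
 * Returns how many packets a single burst call starting at `tail` picks up.
 */
static unsigned toy_recv_burst(const uint8_t *dd, unsigned ring_size,
                               unsigned tail, unsigned nb_pkts)
{
    unsigned pos, i, nb_recd = 0;

    for (pos = 0;
         pos + TOY_DESCS_PER_LOOP <= nb_pkts && tail + pos < ring_size;
         pos += TOY_DESCS_PER_LOOP) {
        unsigned var = 0;

        /* Count the completed descriptors at the front of this group of 4. */
        for (i = 0; i < TOY_DESCS_PER_LOOP; i++) {
            if (!dd[tail + pos + i])
                break;
            var++;
        }
        nb_recd += var;

        /* A partial group ends the burst, including at the ring tail. */
        if (var != TOY_DESCS_PER_LOOP)
            break;
    }
    return nb_recd;
}
```

With an 8-entry ring whose descriptors are all done, a burst starting at slot 6 returns 2 packets in this model rather than continuing at slot 0, which is the extra latency Jianbo describes; no packet is lost.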
[dpdk-dev] [PATCH] ring: assert on zero objects dequeue/enqueue
On Tue, Mar 15, 2016 at 06:58:45PM +0200, Lazaros Koromilas wrote: > Issuing a zero objects dequeue with a single consumer has no effect. > Doing so with multiple consumers, can get more than one thread to succeed > the compare-and-set operation and observe starvation or even deadlock in > the while loop that checks for preceding dequeues. The problematic piece > of code when n = 0: > > cons_next = cons_head + n; > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next); > > The same is possible on the enqueue path. > > Signed-off-by: Lazaros Koromilas I'm not sure how serious a problem this really is, and I really suspect that just calling rte_panic is rather an overreaction here. At worst, this should be a check only when RTE_RING_DEBUG is on. However, probably my preferred solution to this issue would be to just add if (n == 0) return 0 to the mp and mc enqueue/dequeue functions. That way there is no performance penalty for the higher-performing sp/sc paths, and you avoid any unnecessary cmpset operations for the mp/mc cases. /Bruce > --- > lib/librte_ring/rte_ring.h | 26 ++ > 1 file changed, 26 insertions(+) > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h > index 943c97c..2bf9ce3 100644 > --- a/lib/librte_ring/rte_ring.h > +++ b/lib/librte_ring/rte_ring.h > @@ -100,6 +100,7 @@ extern "C" { > #include > #include > #include > +#include > > #define RTE_TAILQ_RING_NAME "RTE_RING" > > @@ -211,6 +212,19 @@ struct rte_ring { > #endif > > /** > + * @internal Assert macro. > + * @param exp > + * The expression to evaluate. 
> + */ > +#define RTE_RING_ASSERT(exp) do { \ > + if (!(exp)) { \ > + rte_panic("line%d\t" \ > + "assert \"" #exp "\" failed\n", \ > + __LINE__); \ > + } \ > + } while (0) > + > +/** > * Calculate the memory size needed for a ring > * > * This function returns the number of bytes needed for a ring, given > @@ -406,6 +420,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r); > * A pointer to a table of void * pointers (objects). > * @param n > * The number of objects to add in the ring from the obj_table. > + * Must be greater than zero. > * @param behavior > * RTE_RING_QUEUE_FIXED:Enqueue a fixed number of items from a ring > * RTE_RING_QUEUE_VARIABLE: Enqueue as many items a possible from ring > @@ -431,6 +446,8 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const > *obj_table, > uint32_t mask = r->prod.mask; > int ret; > > + RTE_RING_ASSERT(n > 0); > + > /* move prod.head atomically */ > do { > /* Reset n to the initial burst count */ > @@ -510,6 +527,7 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void * const > *obj_table, > * A pointer to a table of void * pointers (objects). > * @param n > * The number of objects to add in the ring from the obj_table. > + * Must be greater than zero. > * @param behavior > * RTE_RING_QUEUE_FIXED:Enqueue a fixed number of items from a ring > * RTE_RING_QUEUE_VARIABLE: Enqueue as many items a possible from ring > @@ -533,6 +551,8 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const > *obj_table, > uint32_t mask = r->prod.mask; > int ret; > > + RTE_RING_ASSERT(n > 0); > + > prod_head = r->prod.head; > cons_tail = r->cons.tail; > /* The subtraction is done between two unsigned 32bits value > @@ -594,6 +614,7 @@ __rte_ring_sp_do_enqueue(struct rte_ring *r, void * const > *obj_table, > * A pointer to a table of void * pointers (objects) that will be filled. > * @param n > * The number of objects to dequeue from the ring to the obj_table. > + * Must be greater than zero. 
> * @param behavior > * RTE_RING_QUEUE_FIXED:Dequeue a fixed number of items from a ring > * RTE_RING_QUEUE_VARIABLE: Dequeue as many items a possible from ring > @@ -618,6 +639,8 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void > **obj_table, > unsigned i, rep = 0; > uint32_t mask = r->prod.mask; > > + RTE_RING_ASSERT(n > 0); > + > /* move cons.head atomically */ > do { > /* Restore n as it may change every loop */ > @@ -689,6 +712,7 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void > **obj_table, > * A pointer to a table of void * pointers (objects) that will be filled. > * @param n > * The number of objects to dequeue from the ring to the obj_table. > + * Must be greater than zero. > * @param behavior > * RTE_RING_QUEUE_FIXED:Dequeue a fixed number of items from a ring > * RTE_RING_QUEUE_VARIABLE: Dequeue as many items a possible from ring > @@ -710,6 +7
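The early-return alternative suggested in the reply above could look roughly like the following. This is a minimal sketch, not the rte_ring code: struct toy_ring and toy_mc_reserve are hypothetical names, C11 atomics stand in for rte_atomic32_cmpset, and only the "move cons.head" step of a multi-consumer dequeue is modelled. The point is that a zero-count request returns before any compare-and-set is attempted.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, heavily reduced ring: only the two indices used here. */
struct toy_ring {
    _Atomic uint32_t cons_head;
    _Atomic uint32_t prod_tail;
};

/*
 * Reserve up to n entries for one consumer and return how many were
 * reserved; *old_head is written only when the result is non-zero.
 */
static unsigned toy_mc_reserve(struct toy_ring *r, unsigned n,
                               uint32_t *old_head)
{
    uint32_t head, entries;
    unsigned take;

    /* Nothing requested: return before touching cons_head at all. */
    if (n == 0)
        return 0;

    head = atomic_load(&r->cons_head);
    do {
        /* Unsigned subtraction stays correct across index wrap-around. */
        entries = atomic_load(&r->prod_tail) - head;
        take = n < entries ? n : entries;
        if (take == 0)
            return 0;
        /* On failure, head is reloaded with the current cons_head. */
    } while (!atomic_compare_exchange_weak(&r->cons_head, &head,
                                           head + take));

    *old_head = head;
    return take;
}
```

Because the check sits only in the multi-consumer helper, a single-consumer path written the same way would carry no extra branch, which matches the reasoning given in the reply.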
[dpdk-dev] [PATCH 0/3] lpm allocation fixes - v2
Poking a bit on autotest revealed a few shortcomings in the lpm allocation path. Thanks to the feedback on the first revision of the patches, here is v2:

*updates in v2*
- lpm/lpm6 patches split
- following dpdk coding guidelines regarding single line if's
- adding signed-off and acked-bys gathered so far
- combine all three related patches in one series

[PATCH 1/3] lpm6: fix use after free of lpm in rte_lpm6_create
[PATCH 2/3] lpm6: fix missing free of rules_tbl and lpm
[PATCH 3/3] lpm: fix missing free of lpm

rte_lpm.c |8 ++--
rte_lpm6.c | 11 +--
2 files changed, 7 insertions(+), 12 deletions(-)
[dpdk-dev] [PATCH 1/3] lpm6: fix use after free of lpm in rte_lpm6_create
In certain autotests lpm->max_rules turned out to be uninitialized. That was caused by a failing allocation for lpm->rules_tbl in rte_lpm6_create. It then left the function via goto exit with lpm freed, but with the pointer value still set. In case of an allocation failure it now resets lpm to NULL, so that the upper layers do not operate on that already freed memory. Along with that, it also makes the RTE_LOG message of the failed allocation unique. Acked-by: Stephen Hemminger Signed-off-by: Christian Ehrhardt --- lib/librte_lpm/rte_lpm6.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c index 6c2b293..48931cc 100644 --- a/lib/librte_lpm/rte_lpm6.c +++ b/lib/librte_lpm/rte_lpm6.c @@ -206,8 +206,9 @@ rte_lpm6_create(const char *name, int socket_id, (size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id); if (lpm->rules_tbl == NULL) { - RTE_LOG(ERR, LPM, "LPM memory allocation failed\n"); + RTE_LOG(ERR, LPM, "LPM rules_tbl allocation failed\n"); rte_free(lpm); + lpm = NULL; rte_free(te); goto exit; } -- 2.7.0
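The pattern this patch fixes can be reduced to a few lines. The sketch below is an illustration only: struct toy_lpm, toy_lpm_create and the injectable alloc_fn are hypothetical, and plain calloc/free replace the rte_* allocators. It shows why the pointer has to be reset before taking the shared exit path, since otherwise the caller receives a non-NULL pointer to freed memory.

```c
#include <stdint.h>
#include <stdlib.h>

struct toy_lpm {
    uint32_t max_rules;
    void *rules_tbl;
};

/* Injectable allocator so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t);

static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

static struct toy_lpm *toy_lpm_create(uint32_t max_rules, alloc_fn rules_alloc)
{
    struct toy_lpm *lpm = calloc(1, sizeof(*lpm));

    if (lpm == NULL)
        goto exit;

    lpm->rules_tbl = rules_alloc(max_rules * sizeof(uint64_t));
    if (lpm->rules_tbl == NULL) {
        free(lpm);
        lpm = NULL; /* without this, the exit path returns a dangling pointer */
        goto exit;
    }
    lpm->max_rules = max_rules;

exit:
    return lpm;
}
```

Calling toy_lpm_create(16, failing_alloc) returns NULL, so a caller that checks the result never reads max_rules from freed memory.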
[dpdk-dev] [PATCH 2/3] lpm6: fix missing free of rules_tbl and lpm
lpm6 autotests failed with the default alloc of 512M Memory. While >=2500M was a workaround it became clear while debugging that it had a leak. One could see a lot of output like: LPM Test tests6[i]: FAIL LPM: LPM memory allocation failed It turned out that in rte_lpm6_free - lpm might not be freed if it didn't find a te (early return) - lpm->rules_tbl was not freed ever Acked-by: Bruce Richardson Signed-off-by: Christian Ehrhardt --- lib/librte_lpm/rte_lpm6.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c index 48931cc..4c44cd7 100644 --- a/lib/librte_lpm/rte_lpm6.c +++ b/lib/librte_lpm/rte_lpm6.c @@ -278,15 +278,13 @@ rte_lpm6_free(struct rte_lpm6 *lpm) if (te->data == (void *) lpm) break; } - if (te == NULL) { - rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); - return; - } - TAILQ_REMOVE(lpm_list, te, next); + if (te != NULL) + TAILQ_REMOVE(lpm_list, te, next); rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); + rte_free(lpm->rules_tbl); rte_free(lpm); rte_free(te); } -- 2.7.0
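The corrected free path can be sketched in isolation as follows. This is an assumption-laden model rather than the library code: toy_entry, toy_list and toy_lpm6_free are hypothetical, the tailq lock is omitted, the leading NULL check is an assumption, and the int return value exists only so the behaviour can be observed. The structure mirrors the patch: the list entry is unlinked only if found, but the table and its rules are released in every case.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical list node that points at a table, like a tailq entry. */
struct toy_entry {
    TAILQ_ENTRY(toy_entry) next;
    void *data;
};
TAILQ_HEAD(toy_list, toy_entry);

struct toy_lpm6 {
    void *rules_tbl;
};

/* Returns 1 if the table was registered in the list, 0 otherwise. */
static int toy_lpm6_free(struct toy_list *list, struct toy_lpm6 *lpm)
{
    struct toy_entry *te;
    int found;

    if (lpm == NULL)
        return 0;

    /* te is NULL after the loop when no entry matches. */
    TAILQ_FOREACH(te, list, next) {
        if (te->data == (void *)lpm)
            break;
    }

    found = (te != NULL);
    if (te != NULL)
        TAILQ_REMOVE(list, te, next);

    /* Release everything even when no list entry was found. */
    free(lpm->rules_tbl);
    free(lpm);
    free(te); /* free(NULL) is a no-op */
    return found;
}
```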
[dpdk-dev] [PATCH 3/3] lpm: fix missing free of lpm
Fixing lpm6 regarding a similar issue showed that in rte_lpm_free lpm might not be freed if it didn't find a te (early return) Acked-by: Bruce Richardson Signed-off-by: Christian Ehrhardt --- lib/librte_lpm/rte_lpm.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index ccaaa2a..d5fa1f8 100644 --- a/lib/librte_lpm/rte_lpm.c +++ b/lib/librte_lpm/rte_lpm.c @@ -360,12 +360,8 @@ rte_lpm_free_v20(struct rte_lpm_v20 *lpm) if (te->data == (void *) lpm) break; } - if (te == NULL) { - rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); - return; - } - - TAILQ_REMOVE(lpm_list, te, next); + if (te != NULL) + TAILQ_REMOVE(lpm_list, te, next); rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); -- 2.7.0
[dpdk-dev] vhost: no protection against malformed queue descriptors in rte_vhost_dequeue_burst()
Hello,

When taking a snapshot of a running VM instance, using OpenStack "nova image-create", I noticed that one OVS pmd-thread eventually failed in DPDK rte_vhost_dequeue_burst() with repeating log entries:

compute-0-6 ovs-vswitchd[38172]: VHOST_DATA: Failed to allocate memory for mbuf.

Debugging this issue (data included further down) led to the observation that there is no protection against malformed vhost queue descriptors, thus tenant separation might be violated as a single faulty VM might bring down the connectivity of all VMs connected to the same virtual switch.

To avoid this, validation would be needed at some points in the rte_vhost_dequeue_burst() code:

1) when the queue descriptor is picked up for processing, desc->flags and desc->len might both be 0

    ...
    desc = &vq->desc[head[entry_success]];
    ...
    /* Discard first buffer as it is the virtio header */
    if (desc->flags & VRING_DESC_F_NEXT) {
        desc = &vq->desc[desc->next];
        vb_offset = 0;
        vb_avail = desc->len;
    } else {
        vb_offset = vq->vhost_hlen;
        vb_avail = desc->len - vb_offset;
    }

2) at buffer address translation gpa_to_vva(), which might fail, returning NULL as indication

    vb_addr = gpa_to_vva(dev, desc->addr);
    ...
    while (cpy_len != 0) {
        rte_memcpy(rte_pktmbuf_mtod_offset(cur, void *, seg_offset),
            (void *)((uintptr_t)(vb_addr + vb_offset)), cpy_len);
        ...
    }
    ...

Wondering if there are any plans of adding any kind of validation in DPDK, or if it would be useful to suggest specific implementation of such validations in the DPDK code? Or is there some mechanism that gives us the confidence to trust the vhost queue content absolutely?

Debugging data:

For my scenario the problem occurs in DPDK rte_vhost_dequeue_burst() due to use of a vhost queue descriptor that has all fields 0:

(gdb) print *desc
{addr = 0, len = 0, flags = 0, next = 0}

Subsequent use of desc->len to compute vb_avail = desc->len - vb_offset leads to the problem observed.
What happens is that the packet needs to be segmented -- on my system it fails roughly at segment 122000, when the memory available for mbufs runs out.

The relevant local variables for rte_vhost_dequeue_burst() when breaking on the condition desc->len == 0:

vb_avail = 4294967284 (0xfffffff4)
seg_avail = 2608
vb_offset = 12
cpy_len = 2608
seg_num = 1
desc = 0x2aadb6e5c000
vb_addr = 46928960159744
entry_success = 0

Note also that there is no crash despite desc->addr being zero; it is a valid address in the regions mapped to the device. However, the 3 regions mapped do not seem to be correct either at this stage.

The versions that I'm running are OVS 2.4.0, with corrections from the 2.4 branch, and DPDK 2.1.0. QEMU emulator version 2.2.0 and libvirt version 1.2.12.

Regards, Patrik
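The vb_avail value reported above follows directly from unsigned arithmetic: with desc->len = 0 and vb_offset = 12, the 32-bit subtraction wraps to 2^32 - 12 = 4294967284. The sketch below reproduces that and shows one possible guard. It is an illustration, not a proposed DPDK patch: struct toy_desc only mirrors the four fields printed by gdb, and toy_unchecked_avail and toy_checked_avail are hypothetical helpers.

```c
#include <stdint.h>

/* Hypothetical mirror of the descriptor fields shown in the gdb output. */
struct toy_desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/* What the report describes: the subtraction wraps when len < hdr_len. */
static uint32_t toy_unchecked_avail(const struct toy_desc *d, uint32_t hdr_len)
{
    return d->len - hdr_len;
}

/* One possible validation: reject descriptors shorter than the header. */
static int toy_checked_avail(const struct toy_desc *d, uint32_t hdr_len,
                             uint32_t *avail)
{
    if (d->len < hdr_len)
        return -1; /* malformed descriptor, caller should skip it */
    *avail = d->len - hdr_len;
    return 0;
}
```

A similar check on the result of the address translation before the copy loop would cover the second point raised in the report; how a rejected descriptor should be handled (dropped, or returned to the guest) is a design decision this sketch leaves open.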
[dpdk-dev] [PATCH] lpm: fix memory leak
On 03/15/2016 01:25 PM, Olivier Matz wrote: > Internal lpm structures are not properly freed. Seen with the > lpm6 autotest. > > Signed-off-by: Olivier Matz > --- > lib/librte_lpm/rte_lpm.c | 3 +++ > lib/librte_lpm/rte_lpm6.c | 1 + > 2 files changed, 4 insertions(+) > Self-nack, Christian already submitted a series about it: http://dpdk.org/dev/patchwork/patch/11543/ http://dpdk.org/dev/patchwork/patch/11544/ http://dpdk.org/dev/patchwork/patch/11545/
[dpdk-dev] [PATCH 3/3] lpm: fix missing free of lpm
Hi Christian, On 03/16/2016 01:33 PM, Christian Ehrhardt wrote: > Fixing lpm6 regarding a similar issue showed that that in rte_lpm_free lpm > might not be freed if it didn't find a te (early return) > > Acked-by: Bruce Richardson > Signed-off-by: Christian Ehrhardt > --- > lib/librte_lpm/rte_lpm.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c > index ccaaa2a..d5fa1f8 100644 > --- a/lib/librte_lpm/rte_lpm.c > +++ b/lib/librte_lpm/rte_lpm.c > @@ -360,12 +360,8 @@ rte_lpm_free_v20(struct rte_lpm_v20 *lpm) > if (te->data == (void *) lpm) > break; > } > - if (te == NULL) { > - rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); > - return; > - } > - > - TAILQ_REMOVE(lpm_list, te, next); > + if (te != NULL) > + TAILQ_REMOVE(lpm_list, te, next); > > rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); > > I've just seen you had already posted a series on this topic. It looks that some free() are missing in lpm.c: Could you please check my version of the patch (which was not as complete as your series)? http://dpdk.org/dev/patchwork/patch/11526/ Regards, Olivier
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 03/16/2016 12:26 PM, Ferruh Yigit wrote: > On 3/16/2016 8:22 AM, Panu Matilainen wrote: >> On 03/16/2016 10:19 AM, Ferruh Yigit wrote: >>> On 3/16/2016 7:26 AM, Panu Matilainen wrote: On 03/14/2016 05:32 PM, Ferruh Yigit wrote: > On 3/9/2016 11:17 AM, Ferruh Yigit wrote: >> This patch sent to keep record of latest status of the work. >> >> >> This is slow data path communication implementation based on existing >> KNI. >> >> Difference is: librte_kni converted into a PMD, kdp kernel module is >> almost >> same except all control path functionality removed and some >> simplification done. >> >> Motivation is to simplify slow path data communication. >> Now any application can use this new PMD to send/get data to Linux >> kernel. >> >> PMD supports two communication methods: >> >> 1) KDP kernel module >> PMD initialization functions handles creating virtual interfaces (with >> help of >> kdp kernel module) and created FIFO. FIFO is used to share data between >> userspace and kernelspace. This is default method. >> >> 2) tun/tap module >> When KDP module is not inserted, PMD creates tap interface and transfers >> packets using tap interface. >> >> In long term this patch intends to replace the KNI and KNI will be >> depreciated. >> > > Self-NACK: Will work on another option that does not introduce new > kernel module. > Hmm, care to elaborate a bit? The second mode of this PMD already was free of external kernel modules. Do you mean you'll be just removing mode 1) from the PMD or looking at something completely different? Just thinking that tun/tap PMD sounds like a useful thing to have, I hope you're not abandoning that. >>> >>> It will be KNI PMD. >>> Plan is to have something like KDP, but with existing KNI kernel module. >>> There will be tun/tap support as fallback. >> >> Hum, now I'm confused. I was under the impression everybody hated KNI >> and wanted to get rid of it, and certainly not build future solutions on >> top of it? >> > > We can't remove it. 
> We can't replace/improve it -you were one of the major opposition to this. No no no. There's a misunderstanding somewhere in there. I understand the functionality provided by KNI is important. I'd LOVE to see it replaced. With something that does not require out-of-tree kernel modules. As long as out-of-tree kernel modules are in the picture, the feature might as well not exist at all for the audience I'm dealing with. To that audience, replacing KNI with out-of-tree KCP/KDP or whatever is just irrelevant, there's no progress being made. I also understand there are a lot of users to whom out-of-tree kernel modules are not a problem at all, and I'm in no position to tell them that's somehow wrong. If KCP/KDP is better than KNI for that audience then more power to them. But I don't see why such modules would *have* to be within the DPDK source - as suggested several times around this thread/topic such work could live in a separate repository or such. What I really would like to see is a clear policy regarding kernel modules in DPDK. I certainly am in no position to dictate one, and that's why I've been asking questions and throwing around crazy (or not) ideas around the topic. - Panu -
[dpdk-dev] [PATCH] lpm: fix memory leak
Thanks Olivier, the problem was that I forgot to CC dpdk-dev last Friday. I have just resubmitted to correct that mistake. I think it should now just be down to Bruce re-reviewing and applying it. Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Wed, Mar 16, 2016 at 2:11 PM, Olivier MATZ wrote: > > > On 03/15/2016 01:25 PM, Olivier Matz wrote: > >> Internal lpm structures are not properly freed. Seen with the >> lpm6 autotest. >> >> Signed-off-by: Olivier Matz >> --- >> lib/librte_lpm/rte_lpm.c | 3 +++ >> lib/librte_lpm/rte_lpm6.c | 1 + >> 2 files changed, 4 insertions(+) >> >> > Self-nack, Christian already submitted a series about it: > > http://dpdk.org/dev/patchwork/patch/11543/ > http://dpdk.org/dev/patchwork/patch/11544/ > http://dpdk.org/dev/patchwork/patch/11545/ > >
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 03/16/2016 01:13 PM, Ferruh Yigit wrote: > On 3/16/2016 10:45 AM, Thomas Monjalon wrote: >> 2016-03-16 10:26, Ferruh Yigit: >>> On 3/16/2016 8:22 AM, Panu Matilainen wrote: On 03/16/2016 10:19 AM, Ferruh Yigit wrote: > On 3/16/2016 7:26 AM, Panu Matilainen wrote: >> On 03/14/2016 05:32 PM, Ferruh Yigit wrote: >>> On 3/9/2016 11:17 AM, Ferruh Yigit wrote: This patch sent to keep record of latest status of the work. This is slow data path communication implementation based on existing KNI. Difference is: librte_kni converted into a PMD, kdp kernel module is almost same except all control path functionality removed and some simplification done. Motivation is to simplify slow path data communication. Now any application can use this new PMD to send/get data to Linux kernel. PMD supports two communication methods: 1) KDP kernel module PMD initialization functions handles creating virtual interfaces (with help of kdp kernel module) and created FIFO. FIFO is used to share data between userspace and kernelspace. This is default method. 2) tun/tap module When KDP module is not inserted, PMD creates tap interface and transfers packets using tap interface. In long term this patch intends to replace the KNI and KNI will be depreciated. >>> >>> Self-NACK: Will work on another option that does not introduce new >>> kernel module. >>> >> >> Hmm, care to elaborate a bit? The second mode of this PMD already was >> free of external kernel modules. Do you mean you'll be just removing >> mode 1) from the PMD or looking at something completely different? >> >> Just thinking that tun/tap PMD sounds like a useful thing to have, I >> hope you're not abandoning that. >> > > It will be KNI PMD. > Plan is to have something like KDP, but with existing KNI kernel module. > There will be tun/tap support as fallback. Hum, now I'm confused. I was under the impression everybody hated KNI and wanted to get rid of it, and certainly not build future solutions on top of it? 
>>> >>> We can't remove it. >> >> Why? >> >>> We can't replace/improve it -you were one of the major opposition to this. >>> This doesn't leave more option other than using it. >> >> Why cannot we replace it by something upstream? >> > I doubt KDP is upstream-able to Linux community. If somebody can, that > is great. > > Even for KCP, upstreaming task is still under discussion, and as a heads > up, it is likely to be dropped. If KCP/KDP are not upstreamable then the solution is to find another way that is. Easier said than done, no doubt. - Panu -
[dpdk-dev] [PATCH 3/3] lpm: fix missing free of lpm
Hi, looking at it I think our patches overlap, but yours also covers parts that I missed. Beyond that, while applying your changes I found other potential use-after-free cases. I'll wrap that all up together in a v3 of my series. Christian Ehrhardt Software Engineer, Ubuntu Server Canonical Ltd On Wed, Mar 16, 2016 at 2:14 PM, Olivier MATZ wrote: > Hi Christian, > > On 03/16/2016 01:33 PM, Christian Ehrhardt wrote: > >> Fixing lpm6 regarding a similar issue showed that that in rte_lpm_free lpm >> might not be freed if it didn't find a te (early return) >> >> Acked-by: Bruce Richardson >> Signed-off-by: Christian Ehrhardt >> --- >> lib/librte_lpm/rte_lpm.c | 8 ++-- >> 1 file changed, 2 insertions(+), 6 deletions(-) >> >> diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c >> index ccaaa2a..d5fa1f8 100644 >> --- a/lib/librte_lpm/rte_lpm.c >> +++ b/lib/librte_lpm/rte_lpm.c >> @@ -360,12 +360,8 @@ rte_lpm_free_v20(struct rte_lpm_v20 *lpm) >> if (te->data == (void *) lpm) >> break; >> } >> - if (te == NULL) { >> - rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); >> - return; >> - } >> - >> - TAILQ_REMOVE(lpm_list, te, next); >> + if (te != NULL) >> + TAILQ_REMOVE(lpm_list, te, next); >> >> rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK); >> >> >> > I've just seen you had already posted a series on this topic. > It looks that some free() are missing in lpm.c: > > Could you please check my version of the patch (which was not as > complete as your series)? > http://dpdk.org/dev/patchwork/patch/11526/ > > Regards, > Olivier >
[dpdk-dev] [PATCH v2 5/5] mlx5: add VLAN insertion offload
On Fri, Mar 11, 2016 at 03:24:43PM +, Bruce Richardson wrote: > On Thu, Mar 03, 2016 at 03:27:59PM +0100, Adrien Mazarguil wrote: > > From: Yaacov Hazan > > > > VLAN insertion is done in software by the PMD by default unless > > CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION is enabled and Verbs provides > > support for hardware insertion. > > > > When enabled, this option improves performance when VLAN insertion is > > requested, however ConnectX-4 Lx boards cannot take advantage of > > multi-packet send optimizations anymore. > > > > Signed-off-by: Yaacov Hazan > > Signed-off-by: Adrien Mazarguil > > --- > > config/common_linuxapp | 1 + > > doc/guides/nics/mlx5.rst | 9 +++ > > doc/guides/rel_notes/release_16_04.rst | 6 ++ > > drivers/net/mlx5/Makefile | 9 +++ > > drivers/net/mlx5/mlx5_defs.h | 9 +++ > > drivers/net/mlx5/mlx5_ethdev.c | 12 ++-- > > drivers/net/mlx5/mlx5_rxtx.c | 109 > > +++-- > > drivers/net/mlx5/mlx5_rxtx.h | 13 > > drivers/net/mlx5/mlx5_txq.c| 15 - > > 9 files changed, 158 insertions(+), 25 deletions(-) > > > > diff --git a/config/common_linuxapp b/config/common_linuxapp > > index 7b5e49f..793d262 100644 > > --- a/config/common_linuxapp > > +++ b/config/common_linuxapp > > @@ -220,6 +220,7 @@ CONFIG_RTE_LIBRTE_MLX5_DEBUG=n > > CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N=4 > > CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE=0 > > CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8 > > +CONFIG_RTE_LIBRTE_MLX5_VERBS_VLAN_INSERTION=n > > > New build time configuration options are no longer allowed in DPDK, as they > can't > be used in binary distributions and make testing harder. This should be made > a run-time option instead. OK, it was done as a performance improvement for a specific case, I will submit an updated patchset without this option. -- Adrien Mazarguil 6WIND
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
2016-03-16 15:15, Panu Matilainen: > What I really would like to see is a clear policy regarding kernel > modules in DPDK. I certainly am in no position to dictate one, and > that's why I've been asking questions and throwing around crazy (or not) > ideas around the topic. I think the consensus is to avoid new kernel modules, but to allow them in a staging directory while they are being discussed upstream. As for the existing out-of-tree kernel modules, we must continue trying to obsolete them with upstream work. If you feel the consensus must be clearly stated and acked, please send a patch for doc/guides/contributing/design.rst.
[dpdk-dev] [PATCH 1/5] lpm6: fix use after free of lpm in rte_lpm6_create
In certain autotests lpm->max_rules turned out to be uninitialized. That was caused by a failing allocation for lpm->rules_tbl in rte_lpm6_create. The function then left via goto exit with lpm freed, but with the pointer value still set.

In case of an allocation failure, lpm is now reset to NULL so that the upper layers do not operate on already freed memory. Along with that, the RTE_LOG message of the failed allocation is made unique.

Acked-by: Stephen Hemminger
Signed-off-by: Christian Ehrhardt
---
 lib/librte_lpm/rte_lpm6.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c
index 6c2b293..48931cc 100644
--- a/lib/librte_lpm/rte_lpm6.c
+++ b/lib/librte_lpm/rte_lpm6.c
@@ -206,8 +206,9 @@ rte_lpm6_create(const char *name, int socket_id,
         (size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
     if (lpm->rules_tbl == NULL) {
-        RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
+        RTE_LOG(ERR, LPM, "LPM rules_tbl allocation failed\n");
         rte_free(lpm);
+        lpm = NULL;
         rte_free(te);
         goto exit;
     }
--
2.7.0
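The pattern this patch fixes can be sketched outside of DPDK as follows. This is a minimal illustration using plain calloc/free instead of the rte_* allocators; `struct table`, `table_create` and the `fail_rules` switch (which simulates the inner allocation failing) are names invented for the sketch, not part of librte_lpm.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the lpm object: an outer struct owning an inner array. */
struct table {
    size_t max_rules;
    int *rules;
};

/*
 * Create a table. fail_rules != 0 simulates a failed inner allocation
 * (hypothetical test hook, only there to exercise the error path).
 */
struct table *table_create(size_t max_rules, int fail_rules)
{
    struct table *t;

    assert(max_rules > 0);

    t = calloc(1, sizeof(*t));
    if (t == NULL)
        goto exit;

    t->rules = fail_rules ? NULL : calloc(max_rules, sizeof(*t->rules));
    if (t->rules == NULL) {
        free(t);
        t = NULL; /* without this, the caller gets a dangling pointer */
        goto exit;
    }
    t->max_rules = max_rules;

exit:
    return t;
}
```

Because every error path funnels through the same `exit` label, the returned pointer has to be reset on each path that frees it; that reset is what the one-line `lpm = NULL;` addition provides.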
[dpdk-dev] [PATCH 0/5] lpm allocation fixes - v3
Poking a bit at the autotests revealed a few shortcomings in the lpm allocation path. Thanks to the feedback on the first revision of the patches, here is v2. Olivier Matz also spotted similar issues and made me aware of them - thanks! Integrating them revealed even more use-after-free / leak issues.

*updates in v3*
- lpm create/free path for v20 and v1604 got the same fixes that were already identified for lpm6 before

*updates in v2*
- lpm/lpm6 patches split
- following dpdk coding guidelines regarding single line if's
- adding signed-off and acked-bys gathered so far
- combine all three related patches in one series

[PATCH 1/5] lpm6: fix use after free of lpm in rte_lpm6_create
[PATCH 2/5] lpm6: fix missing free of rules_tbl and lpm
[PATCH 3/5] lpm: fix missing free of lpm
[PATCH 4/5] lpm: fix use after free of lpm in rte_lpm_create*
[PATCH 5/5] lpm: fix missing free of rules_tbl and lpm in rte_lpm_free*

diffstat:
 rte_lpm.c  | 23 ++-
 rte_lpm6.c | 12 ++--
 2 files changed, 16 insertions(+), 19 deletions(-)
[dpdk-dev] [PATCH 2/5] lpm6: fix missing free of rules_tbl and lpm
lpm6 autotests failed with the default allocation of 512M of memory. While using >=2500M was a workaround, it became clear while debugging that there was a leak. One could see a lot of output like:
 LPM Test tests6[i]: FAIL
 LPM: LPM memory allocation failed

It turned out that in rte_lpm6_free
- lpm might not be freed if no te was found (early return)
- lpm->rules_tbl was never freed

Acked-by: Bruce Richardson
Signed-off-by: Christian Ehrhardt
---
 lib/librte_lpm/rte_lpm6.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/librte_lpm/rte_lpm6.c b/lib/librte_lpm/rte_lpm6.c
index 48931cc..5abfc78 100644
--- a/lib/librte_lpm/rte_lpm6.c
+++ b/lib/librte_lpm/rte_lpm6.c
@@ -278,15 +278,14 @@ rte_lpm6_free(struct rte_lpm6 *lpm)
         if (te->data == (void *) lpm)
             break;
     }
-    if (te == NULL) {
-        rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-        return;
-    }
-    TAILQ_REMOVE(lpm_list, te, next);
+    if (te != NULL) {
+        TAILQ_REMOVE(lpm_list, te, next);
+    }
     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+    rte_free(lpm->rules_tbl);
     rte_free(lpm);
     rte_free(te);
 }
--
2.7.0
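The shape of the fixed free path can be sketched with the standard <sys/queue.h> TAILQ macros. This is an illustration under assumptions, not the librte_lpm code: it uses plain free() instead of rte_free(), omits the tailq read/write lock, and `struct entry`, `struct table` and `table_free` are names invented here.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* List node pointing at a registered object (stand-in for rte_tailq_entry). */
struct entry {
    TAILQ_ENTRY(entry) next;
    void *data;
};
TAILQ_HEAD(entry_list, entry);

/* Stand-in for the lpm object with its separately allocated rules table. */
struct table {
    int *rules;
};

/*
 * Free a table whether or not it is registered in the list.
 * Locking is left out of this sketch.
 */
void table_free(struct entry_list *list, struct table *t)
{
    struct entry *te;

    assert(list != NULL);
    if (t == NULL)
        return;

    /* te is NULL after the loop if no matching entry was found */
    TAILQ_FOREACH(te, list, next) {
        if (te->data == (void *)t)
            break;
    }
    if (te != NULL)
        TAILQ_REMOVE(list, te, next);

    /* no early return: everything owned by t is released on every path */
    free(t->rules);
    free(t);
    free(te); /* free(NULL) is a no-op */
}
```

The point of the restructuring is that the lookup result only decides whether the list entry is unlinked; it no longer decides whether the object's memory is released.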
[dpdk-dev] [PATCH 3/5] lpm: fix missing free of lpm
Fixing a similar issue in lpm6 showed that in rte_lpm_free lpm might not be freed if no te was found (early return).

Acked-by: Bruce Richardson
Signed-off-by: Christian Ehrhardt
---
 lib/librte_lpm/rte_lpm.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index ccaaa2a..2cc87b6 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -360,13 +360,10 @@ rte_lpm_free_v20(struct rte_lpm_v20 *lpm)
         if (te->data == (void *) lpm)
             break;
     }
-    if (te == NULL) {
-        rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-        return;
+    if (te != NULL) {
+        TAILQ_REMOVE(lpm_list, te, next);
     }
-    TAILQ_REMOVE(lpm_list, te, next);
-
     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
     rte_free(lpm);
--
2.7.0
[dpdk-dev] [PATCH 4/5] lpm: fix use after free of lpm in rte_lpm_create*
There were further chances for a use after free by returning an already freed pointer in rte_lpm_create for v20 and v1604. Along with that, the RTE_LOG messages of the failed allocations are made unique.

Signed-off-by: Christian Ehrhardt
---
 lib/librte_lpm/rte_lpm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index 2cc87b6..d21c783 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -303,8 +303,9 @@ rte_lpm_create_v1604(const char *name, int socket_id,
         (size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
     if (lpm->rules_tbl == NULL) {
-        RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
+        RTE_LOG(ERR, LPM, "LPM rules_tbl memory allocation failed\n");
         rte_free(lpm);
+        lpm = NULL;
         rte_free(te);
         goto exit;
     }
@@ -313,8 +314,9 @@ rte_lpm_create_v1604(const char *name, int socket_id,
         (size_t)tbl8s_size, RTE_CACHE_LINE_SIZE, socket_id);
     if (lpm->tbl8 == NULL) {
-        RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
+        RTE_LOG(ERR, LPM, "LPM tbl8 memory allocation failed\n");
         rte_free(lpm);
+        lpm = NULL;
         rte_free(te);
         goto exit;
     }
--
2.7.0
[dpdk-dev] [PATCH 5/5] lpm: fix missing free of rules_tbl and lpm in rte_lpm_free*
As found in rte_lpm6_free, the two lpm interfaces rte_lpm_free_v20 and rte_lpm_free_v1604 had leaks:
- rte_lpm_free_v20 did not free rules_tbl
- rte_lpm_free_v1604, due to an early exit, could fail to free rules_tbl and lpm itself

Signed-off-by: Christian Ehrhardt
---
 lib/librte_lpm/rte_lpm.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index d21c783..e8fe33e 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -368,6 +368,7 @@ rte_lpm_free_v20(struct rte_lpm_v20 *lpm)
     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+    rte_free(lpm->rules_tbl);
     rte_free(lpm);
     rte_free(te);
 }
@@ -392,15 +393,12 @@ rte_lpm_free_v1604(struct rte_lpm *lpm)
         if (te->data == (void *) lpm)
             break;
     }
-    if (te == NULL) {
-        rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
-        return;
-    }
-
-    TAILQ_REMOVE(lpm_list, te, next);
+    if (te != NULL)
+        TAILQ_REMOVE(lpm_list, te, next);
     rte_rwlock_write_unlock(RTE_EAL_TAILQ_RWLOCK);
+    rte_free(lpm->rules_tbl);
     rte_free(lpm);
     rte_free(te);
 }
--
2.7.0
[dpdk-dev] [PATCH] lpm: fix memory leak
Hi,
I'm done comparing our two patches and just submitted a v3 of my series based on that. I found even more use-after-free issues and leaks than we had before. The patch series has grown to 5 patches now.
At least my gmail groups subsequent git send-email posts weirdly; let me know if you have any trouble reviewing or applying them.

Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Wed, Mar 16, 2016 at 2:21 PM, Christian Ehrhardt <christian.ehrhardt at canonical.com> wrote:

> Thanks Olivier, the bad thing was that I forgot to CC dpdk-dev last Friday.
> I just resubmitted correcting that mistake.
>
> I think it should now just be down to the re-review and apply of Bruce.
>
> Christian Ehrhardt
> Software Engineer, Ubuntu Server
> Canonical Ltd
>
> On Wed, Mar 16, 2016 at 2:11 PM, Olivier MATZ wrote:
>
>> On 03/15/2016 01:25 PM, Olivier Matz wrote:
>>
>>> Internal lpm structures are not properly freed. Seen with the
>>> lpm6 autotest.
>>>
>>> Signed-off-by: Olivier Matz
>>> ---
>>> lib/librte_lpm/rte_lpm.c | 3 +++
>>> lib/librte_lpm/rte_lpm6.c | 1 +
>>> 2 files changed, 4 insertions(+)
>>>
>> Self-nack, Christian already submitted a series about it:
>>
>> http://dpdk.org/dev/patchwork/patch/11543/
>> http://dpdk.org/dev/patchwork/patch/11544/
>> http://dpdk.org/dev/patchwork/patch/11545/
[dpdk-dev] [PATCH v7 4/4] ena: DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)
On Tue, Mar 15, 2016 at 03:40:10PM +0100, Jan Medala wrote:
> This is a PMD for the Amazon ethernet ENA family.
> The driver operates variety of ENA adapters through feature negotiation
> with the adapter and upgradable commands set.
> ENA driver handles PCI Physical and Virtual ENA functions.
>
> Signed-off-by: Evgeny Schemeilin
> Signed-off-by: Jan Medala
> Signed-off-by: Jakub Palider
> ---
> config/common_base | 11 +
> drivers/net/Makefile | 1 +
> drivers/net/ena/Makefile | 61 ++
> drivers/net/ena/ena_ethdev.c | 1445 +++
> drivers/net/ena/ena_ethdev.h | 160
> drivers/net/ena/ena_logs.h | 74 ++
> drivers/net/ena/ena_platform.h | 59 ++
> drivers/net/ena/rte_pmd_ena_version.map | 4 +
> mk/rte.app.mk | 1 +
> 9 files changed, 1816 insertions(+)
> create mode 100644 drivers/net/ena/Makefile
> create mode 100644 drivers/net/ena/ena_ethdev.c
> create mode 100644 drivers/net/ena/ena_ethdev.h
> create mode 100644 drivers/net/ena/ena_logs.h
> create mode 100644 drivers/net/ena/ena_platform.h
> create mode 100644 drivers/net/ena/rte_pmd_ena_version.map
>
> diff --git a/config/common_base b/config/common_base
> index 52bd34f..472a9e9 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -135,6 +135,17 @@ CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y
> CONFIG_RTE_NIC_BYPASS=n
>
> #
> +# Compile burst-oriented Amazon ENA PMD driver
> +#
> +CONFIG_RTE_LIBRTE_ENA_PMD=y
> +CONFIG_RTE_LIBRTE_ENA_DEBUG_INIT=y

Do you really want initialization debugging to be on by default? Normally, we keep all debug options disabled.

/Bruce
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
On 03/16/2016 03:58 PM, Thomas Monjalon wrote: > 2016-03-16 15:15, Panu Matilainen: >> What I really would like to see is a clear policy regarding kernel >> modules in DPDK. I certainly am in no position to dictate one, and >> that's why I've been asking questions and throwing around crazy (or not) >> ideas around the topic. > > I think the consensus is to avoid new kernel module, > but allow them in a staging directory while being discussed upstream. To me the more interesting question is: what happens after that? As in, if upstream says no, does it mean axe from dpdk, no ifs and buts? If accepted upstream, does a version of the module still live within dpdk codebase (for example to provide the version for older kernel versions, I dont see that as unreasonable at all)? > About the existing out-of-tree kernel modules, we must continue trying > to obsolete them with upstream work. Agreed. > > If you feel the consensus must be clearly stated and acked, > please send a patch for doc/guides/contributing/design.rst. I'll be happy to, once we have a clear consensus on what the policy actually is. - Panu -
[dpdk-dev] [PATCH v3 0/2] slow data path communication between DPDK port and Linux
2016-03-16 17:03, Panu Matilainen: > On 03/16/2016 03:58 PM, Thomas Monjalon wrote: > > 2016-03-16 15:15, Panu Matilainen: > >> What I really would like to see is a clear policy regarding kernel > >> modules in DPDK. I certainly am in no position to dictate one, and > >> that's why I've been asking questions and throwing around crazy (or not) > >> ideas around the topic. > > > > I think the consensus is to avoid new kernel module, > > but allow them in a staging directory while being discussed upstream. > > To me the more interesting question is: what happens after that? > As in, if upstream says no, does it mean axe from dpdk, no ifs and buts? > If accepted upstream, does a version of the module still live within > dpdk codebase (for example to provide the version for older kernel > versions, I dont see that as unreasonable at all)? > > > > About the existing out-of-tree kernel modules, we must continue trying > > to obsolete them with upstream work. > > Agreed. > > > > > If you feel the consensus must be clearly stated and acked, > > please send a patch for doc/guides/contributing/design.rst. > > I'll be happy to, once we have a clear consensus on what the policy > actually is. Sending a patch is the most efficient way of having the discussion happens with more contributors. We, as a technical community, take some patch-based decisions ;)
[dpdk-dev] Reg: promiscuous mode on VF
Hi Lu,

Many thanks for your response. I have a few more queries.
If VF unicast promiscuous mode is not supported, can't we implement Layer 2 bridging functionality using Intel virtualization technologies? Or is there any other way, say tweaking some hardware registers or drivers, which may help us in implementing Layer 2 bridging?
I would also like to know why Intel does not support unicast promiscuous mode. It could have been an optional register setting, with the user having the privilege to set or unset it. Besides security reasons, is there any other big reason why Intel does not support this?

Thanks,
Bharath Paulraj

On Wed, Mar 16, 2016 at 6:15 AM, Lu, Wenzhuo wrote:
> Hi Bharath,
>
> > 2) Is the above supported for 82599 controller? If it is supported in the NIC,
> > please provide the steps to enable.
> Talking about 82599, VF unicast promiscuous mode is not supported. Only
> broadcast and multicast can be supported.
>
> > Thanks,
> > Bharath Paulraj
>

--
Regards,
Bharath
[dpdk-dev] Reg: promiscuous mode on VF
Hi Bharath, I believe security is the only reason. But I think there's another way to implement an L2 bridge. I'm including Michael, who can share some experience. Thanks.
[dpdk-dev] Reg: promiscuous mode on VF
Hi Bharath For your question of "why intel does not support unicast promiscuos mode?", I'd ask Aaron or Greg to give answers. Thank you very much! Regards, Helin > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of bharath paulraj > Sent: Wednesday, March 16, 2016 11:29 PM > To: Lu, Wenzhuo > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] Reg: promiscuous mode on VF > > Hi Lu, > > Many thanks for your response. Again I have few more queries. > If VF unicast promiscuous mode is not supported then can't we implement a > Layer 2 bridging functionality using intel virtualization technologies? Or Is > there > any other way, say tweeking some hardware registers or drivers, which may > help us in implementing Layer 2 bridging. > Also I would like to know, why intel does not support unicast promiscuos mode? > It could have been optional register settings and user should have had a > previleage to set or unset it. Besides, security reasons, is there any other > big > reason why Intel does not support this? > > Thanks, > Bharath Paulraj > > On Wed, Mar 16, 2016 at 6:15 AM, Lu, Wenzhuo > wrote: > > > Hi Bharath, > > > > > 2) Is the above supported for 82599 controller? If it is > > > supported > > in the NIC, > > > please provide the steps to enable. > > Talking about 82599, VF unicast promiscuous mode is not supported. > > Only broadcast and multicast can be supported. > > > > > > > > Thanks, > > > Bharath Paulraj > > > > > > -- > Regards, > Bharath
[dpdk-dev] [PATCH v2 0/9] pci cleanup and blacklist rework
On Fri, 29 Jan 2016 15:49:04 +0100 David Marchand wrote: > Before 2.2.0 release, while preparing for more changes in eal (and fixing > a problem reported by Roger M. [1]), I came up with this (part of) patchset > that tries to make the pci code more compact and easier to read. Hello David, what is the state of this series at the moment? Do you expect some more reviews? I remember that I've sent some reviews but I don't think they are integrated in the v2. By the way, there are two patch series joined into one or something like that, am I right? Regards Jan > > I ended up introducing some hooks in the pci layer to customize pci > blacklist / whitelist handling and make it possible to automatically > bind / unbind pci devices to igb_uio (or equivalent) when attaching > a device. > > I am still not really happy: > - the pci blacklist / whitelist makes me think we should let the > application tell eal which resources to use and get rid of the > unconditional pci scan code, which means removing rte_eal_pci_probe() > from rte_eal_init(), and remove rte_eal_dev_init() for vdevs, > - the more I look at this, the more I think automatic bind / unbind for > pci devices should be called from the pmd context. The drivers know best > what they require and what they want to do with the resources passed by > the eal (see the drv_flags / RTE_KDRV_NONE / rte_eal_pci_map_device stuff > for virtio pmd). > This behaviour would still be optional, on a per-device basis. > > So, I think that these hooks are not that good of an idea and I kept > them private for now, but anyway, sending this for comments. > > > Changes since v1: > - split the initial patchset.
This current patchset now depends on > [2] sent separately which should be applied first, > - introduced hooks in pci common code, > - implemented automatic bind / unbind for "uio" pci devices > > > [1] http://dpdk.org/ml/archives/dev/2015-November/028140.html > [2] http://dpdk.org/ml/archives/dev/2016-January/032387.html > -- Jan Viktorin E-mail: Viktorin at RehiveTech.com System Architect Web:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] [PATCH 0/3 v3] virtio: Tx performance improvements
On Wed, Mar 16, 2016 at 08:25:08AM +, Xie, Huawei wrote: > On 3/14/2016 6:56 PM, Richardson, Bruce wrote: > > On Fri, Mar 04, 2016 at 10:19:18AM -0800, Stephen Hemminger wrote: > >> This patch series uses virtio negotiated features to allow for > >> more packets to be queued to host even though the default QEMU/KVM > >> virtio queue is very small 256 elements. > >> > >> Stephen Hemminger (3): > >> virtio: use indirect ring elements > >> virtio: use any layout on transmit > >> virtio: optimize transmit enqueue > >> > > These patches require an ack to merge. Virtio maintainers, can you please > > review and ack if ok. > > > > /Bruce > > > Acked in the previous release window. > > Acked-by: Huawei Xie > Pushed to dpdk-next-net/rel_16_04 /Bruce
[dpdk-dev] drivers for 16.04-rc1
The patches from dpdk-next-net/rel_16_04 are now in dpdk/master for RC1. More driver changes may be applied in dpdk-next-net for RC2, especially for new drivers.
[dpdk-dev] [PATCH 0/4] x86 ioport fixes
2016-03-15 07:29, David Marchand: > Here is a patchset for little cleanups and a fix on newly introduced pci > ioport api. > The last patch fixes a regression reported by Mauricio V. [1]. > > [1]: http://dpdk.org/ml/archives/dev/2016-February/033922.html Applied, thanks
[dpdk-dev] virtio PMD is not working with master version
2016-02-25 12:30, Mauricio Vásquez: > Hello, > > I am trying to connect two virtual machines through Open vSwitch using > vhost-user ports, on the host side everything looks fine. > When using the standard virtio drivers both virtual machines are able to > exchange traffic, but when I load the virtio PMD and run a DPDK application > it shows the following error message: > > ... > EAL: PCI device :00:04.0 on NUMA socket -1 > EAL: probe driver: 1af4:1000 rte_virtio_pmd > EAL: PCI memory mapped at 0x7f892dc0 > PMD: virtio_read_caps(): [40] skipping non VNDR cap id: 11 > PMD: virtio_read_caps(): no modern virtio pci device found. > PMD: vtpci_init(): trying with legacy virtio pci. > EAL: eal_parse_sysfs_value(): cannot open sysfs value > /sys/bus/pci/devices/:00:04.0/uio/uio0/portio/port0/start > EAL: pci_uio_ioport_map(): cannot parse portio start > EAL: Error - exiting with code: 1 > Cause: Requested device :00:04.0 cannot be used > ... > > I tried it using the master version of DPDK, when I use the 2.2 version it > works without problems: > > ...
> PMD: parse_sysfs_value(): parse_sysfs_value(): cannot open sysfs value > /sys/bus/pci/devices/:00:04.0/uio/uio0/portio/port0/size > PMD: virtio_resource_init_by_uio(): virtio_resource_init_by_uio(): cannot > parse size > PMD: virtio_resource_init_by_ioports(): PCI Port IO found start=0xc100 with > size=0x20 > PMD: virtio_negotiate_features(): guest_features before negotiate = cf8020 > PMD: virtio_negotiate_features(): host_features before negotiate = 40268020 > PMD: virtio_negotiate_features(): features after negotiate = 68020 > PMD: eth_virtio_dev_init(): PORT MAC: 00:00:00:00:00:11 > PMD: eth_virtio_dev_init(): VIRTIO_NET_F_STATUS is not supported > PMD: eth_virtio_dev_init(): VIRTIO_NET_F_MQ is not supported > PMD: virtio_dev_cq_queue_setup(): >> > PMD: virtio_dev_queue_setup(): selecting queue: 2 > PMD: virtio_dev_queue_setup(): vq_size: 64 nb_desc:0 > PMD: virtio_dev_queue_setup(): vring_size: 4612, rounded_vring_size: 8192 > PMD: virtio_dev_queue_setup(): vq->vq_ring_mem: 0x76d43000 > PMD: virtio_dev_queue_setup(): vq->vq_ring_virt_mem: 0x7fa669743000 > PMD: eth_virtio_dev_init(): config->max_virtqueue_pairs=1 > PMD: eth_virtio_dev_init(): config->status=0 > PMD: eth_virtio_dev_init(): PORT MAC: 00:00:00:00:00:11 > PMD: eth_virtio_dev_init(): hw->max_rx_queues=1 hw->max_tx_queues=1 > PMD: eth_virtio_dev_init(): port 0 vendorID=0x1af4 deviceID=0x1000 > PMD: virtio_dev_vring_start(): >> > ... > > According to git bisect it appears to be that it does not work anymore > after the b8f04520ad71 ("virtio: use PCI ioport API") commit. It is now fixed: http://dpdk.org/browse/dpdk/commit/?id=2b29a7a4c1a Thanks for reporting.
[dpdk-dev] [PATCH] app/test/test_table_acl: fix incorrect IP header
2016-03-14 12:22, Fan Zhang: > This patch fixes the incorrect IP header in ACL table test. It is not really a header but a 5-tuple. Please could you elaborate on the issue? A "Fixes:" reference is missing. Thanks
[dpdk-dev] [PATCH] doc: update release notes for ip_pipeline app
2016-03-14 13:44, Jasvinder Singh: > This patch updates the release notes with the features that > have been added to ip_pipeline application. > > Signed-off-by: Jasvinder Singh > Acked-by: Cristian Dumitrescu Applied, thanks Please try to integrate the release notes updates when doing the changes themselves.
[dpdk-dev] Reg: promiscuous mode on VF
Intel has not supported promiscuous mode for virtual functions due to the security concerns mentioned below. There will be upstream support in an upcoming Linux kernel for setting virtual functions as "trusted" and when that is available then Intel will allow virtual functions to enter unicast promiscuous mode on those Ethernet controllers that support promiscuous mode for virtual functions in the HW/FW. Be aware that not all Intel Ethernet controllers have support for unicast promiscuous mode for virtual functions. The only currently released product that does is the X710/XL710. The key take away is that unicast promiscuous mode for X710/XL710 virtual functions requires Linux kernel support, iproute2 package support and driver support. Only when all three of these are in place will the feature work. Thanks, - Greg -Original Message- From: Zhang, Helin Sent: Wednesday, March 16, 2016 9:04 AM To: bharath paulraj ; Lu, Wenzhuo ; Rowden, Aaron F ; Rose, Gregory V Cc: dev at dpdk.org; Qiu, Michael ; Jayakumar, Muthurajan Subject: RE: [dpdk-dev] Reg: promiscuous mode on VF Hi Bharath For your question of "why intel does not support unicast promiscuos mode?", I'd ask Aaron or Greg to give answers. Thank you very much! Regards, Helin > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of bharath paulraj > Sent: Wednesday, March 16, 2016 11:29 PM > To: Lu, Wenzhuo > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] Reg: promiscuous mode on VF > > Hi Lu, > > Many thanks for your response. Again I have few more queries. > If VF unicast promiscuous mode is not supported then can't we > implement a Layer 2 bridging functionality using intel virtualization > technologies? Or Is there any other way, say tweeking some hardware > registers or drivers, which may help us in implementing Layer 2 bridging. > Also I would like to know, why intel does not support unicast promiscuos mode? 
> It could have been optional register settings and user should have had > a previleage to set or unset it. Besides, security reasons, is there > any other big reason why Intel does not support this? > > Thanks, > Bharath Paulraj > > On Wed, Mar 16, 2016 at 6:15 AM, Lu, Wenzhuo > wrote: > > > Hi Bharath, > > > > > 2) Is the above supported for 82599 controller? If it is > > > supported > > in the NIC, > > > please provide the steps to enable. > > Talking about 82599, VF unicast promiscuous mode is not supported. > > Only broadcast and multicast can be supported. > > > > > > > > Thanks, > > > Bharath Paulraj > > > > > > -- > Regards, > Bharath
[dpdk-dev] [dpdk-announce] release candidate 16.04-rc1
A new DPDK release candidate is ready for testing: http://dpdk.org/browse/dpdk/tag/?id=v16.04-rc1 This is the first release candidate for DPDK 16.04. As the new versioning scheme suggests, this version must be released during April of year 2016. We have 3 weeks to make the validation and fixes. The current release notes show most of the main changes: http://dpdk.org/browse/dpdk/tree/doc/guides/rel_notes/release_16_04.rst The last big changes which could be accepted in the RC2 are: - link speed API rework (waiting for feedbacks) - update of recent drivers - new drivers Please start now discussing the changes you plan to do for the next release cycle (16.07). Thank you everyone
[dpdk-dev] [PATCH] mk: fix linker script when re-building
The linker script is generated by simply finding all libraries in RTE_OUTPUT/lib. The issue shows up when re-building DPDK, when a linker script already exists in that directory: the generated linker script then includes itself, which does not play well with the linker.

Simply filtering the linker script out of the found libraries solves the problem.

Fixes: 948fd64befc3 ("mk: replace the combined library with a linker script")

Signed-off-by: Sergio Gonzalez Monroy
---
 mk/rte.combinedlib.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.combinedlib.mk b/mk/rte.combinedlib.mk
index fe4817b..449358b 100644
--- a/mk/rte.combinedlib.mk
+++ b/mk/rte.combinedlib.mk
@@ -42,7 +42,7 @@ endif
 RTE_LIBNAME := dpdk
 COMBINEDLIB := lib$(RTE_LIBNAME)$(EXT)
-LIBS := $(notdir $(wildcard $(RTE_OUTPUT)/lib/*$(EXT)))
+LIBS := $(filter-out $(COMBINEDLIB), $(notdir $(wildcard $(RTE_OUTPUT)/lib/*$(EXT))))
 all: FORCE
     $(Q)echo "GROUP ( $(LIBS) )" > $(RTE_OUTPUT)/lib/$(COMBINEDLIB)
--
2.4.3
[dpdk-dev] Performance issue with uio_pci_generic driver
Hi everyone,

First off I would like to thank tmonjalo, Harry Van Harren and Bruce Richardson for the input they gave while I was trying to figure out the issue, and for pushing me to report the problem here.

Okay, so I was trying out some basic sanity benchmarks with DPDK before doing anything more complicated, and surprisingly I was getting lower than gigabit speed for minimum packet size running l2fwd (or l3fwd for that matter). The setup is very simple: I've got two machines with Intel X710 quad port NICs; one is running DPDK l2fwd and the other is running MoonGen for the performance benchmark.

After much debugging, modifying parameters one by one, giving up when nothing worked and setting up ovs-dpdk instead, I noticed from the ovs documentation that the kernel modules to load were uio and igb_uio, while I was previously loading uio_pci_generic as mentioned in the DPDK getting started guide. I simply changed the kernel module and l2fwd went from 700Mbps to 10G line-rate.

Bruce said that shouldn't be the case and the performance should be similar regardless of the driver loaded ...

Here is the full log of the experiment, if you're interested: https://gist.github.com/simon-jouet/178e1d302afef5c6a642

Best regards,
Simon
[dpdk-dev] Detecting breakout cable usage in xl710 NIC at run time through dpdk driver / i40e interface
Hello dev at dpdk.org

In section "3.3.5.6 Update link mode" of the XL710 datasheet Rev2.4 there is a description of how to change the link mode configuration of the NIC to enable support for a breakout cable (4x10). This can also be accomplished via the qcu utility provided by Intel.

What I am looking for is whether there is a way to reliably detect if a breakout cable is plugged into the XL710 at run time, keeping in mind that the cable could be either a direct attach 4x10 cable or a standard qsfp + breakout fiber cable.

The end goal here is to automatically change the link mode of the NIC to the appropriate value without requiring any manual step from the user. For example, if a direct attach 4x10 cable is plugged into the XL710 port, detect this and switch the link mode configuration to 4x10.

After looking into the i40e_* API and the datasheet, I can't find a way to reliably perform this kind of check. I'm now wondering if anyone knows whether this is possible at all and, if so, how I could go about performing it.

Thank you,
Efstratios Karatzas