[dpdk-dev] A (possible) problem with `--no-huge` option

2017-05-14 Thread Ilya Matveychikov
Hi guys,

I have a problem while running DPDK with the `--no-huge` option. It seems that the 
problem has occurred since commit cdc242f260e766bd95a658b5e0686a62ec04f5b0, and this 
is the change that affects me:

+   if ((page & 0x7fffffffffffffULL) == 0)
+       return RTE_BAD_PHYS_ADDR;
+

What I did was try to create a memory pool using rte_pktmbuf_pool_create(). I 
dug into the issue and found that in my case the “page” value is 
0x0080000000000000, which means that the page is not present and “soft-dirty” 
(according to the kernel’s documentation):

   * Bits 0-54  page frame number (PFN) if present
   * Bits 0-4   swap type if swapped
   * Bits 5-54  swap offset if swapped
   * Bit  55    pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
   * Bit  56    page exclusively mapped (since 4.2)
   * Bits 57-60 zero
   * Bit  61    page is file-page or shared-anon (since 3.5)
   * Bit  62    page swapped
   * Bit  63    page present

So, before the change mentioned above, everything “worked” fine and such pages 
were not handled. But now the check causes rte_mempool_populate_default() to fail 
with -EINVAL.
Can anyone familiar with memory pool allocation help with the issue?
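
To make the bit layout above concrete, here is a minimal sketch of decoding a
single pagemap entry. It is not DPDK code: the PAGEMAP_* macros and pagemap_*
helpers are hypothetical names chosen for illustration, and the only inputs
assumed are the bit positions from the kernel documentation quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of one /proc/<pid>/pagemap entry, as quoted above.
 * All names below are hypothetical and exist only for this sketch. */
#define PAGEMAP_PFN_MASK   ((UINT64_C(1) << 55) - 1)   /* bits 0-54 */
#define PAGEMAP_SOFT_DIRTY (UINT64_C(1) << 55)         /* bit 55 */
#define PAGEMAP_SWAPPED    (UINT64_C(1) << 62)         /* bit 62 */
#define PAGEMAP_PRESENT    (UINT64_C(1) << 63)         /* bit 63 */

static_assert(PAGEMAP_PFN_MASK == UINT64_C(0x007fffffffffffff),
	      "the PFN occupies bits 0-54");

static inline bool pagemap_present(uint64_t entry)
{
	return (entry & PAGEMAP_PRESENT) != 0;
}

static inline bool pagemap_soft_dirty(uint64_t entry)
{
	return (entry & PAGEMAP_SOFT_DIRTY) != 0;
}

static inline bool pagemap_swapped(uint64_t entry)
{
	return (entry & PAGEMAP_SWAPPED) != 0;
}

/* The PFN bits are only meaningful when the page is present;
 * this sketch returns 0 otherwise. */
static inline uint64_t pagemap_pfn(uint64_t entry)
{
	return pagemap_present(entry) ? (entry & PAGEMAP_PFN_MASK) : 0;
}
```

With these helpers, an entry that has only bit 55 set reports "not present",
"soft-dirty" and a PFN of 0, which appears to be the case described above.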

Thanks in advance,
Ilya Matveychikov.



Re: [dpdk-dev] [PATCH] driver/net: remove unnecessary macro for unused variables

2017-05-14 Thread Lu, Wenzhuo
Hi,

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, May 12, 2017 6:33 PM
> To: John W. Linville; Legacy, Allain (Wind River); Peters, Matt (Wind River);
> Harish Patil; Rasesh Mody; Stephen Hurd; Ajit Khaparde; Doherty, Declan; Lu,
> Wenzhuo; Marcin Wojtas; Michal Krawczyk; Guy Tzalik; Evgeny Schemeilin;
> John Daley; Nelson Escobar; Chen, Jing D; Zhang, Helin; Wu, Jingjing; Ananyev,
> Konstantin; Andrew Rybchenko; Pascal Mazon; Yuanhan Liu; Maxime
> Coquelin; Shrikrishna Khare
> Cc: dev@dpdk.org; Yigit, Ferruh
> Subject: [PATCH] driver/net: remove unnecessary macro for unused variables
> 
> remove __rte_unused instances that are not required.
> 
> Signed-off-by: Ferruh Yigit 
Acked-by: Wenzhuo Lu 


Re: [dpdk-dev] [PATCH v5 2/4] eal: move gcc version definition to common header

2017-05-14 Thread Jianbo Liu
On 12 May 2017 at 18:15, Ashwin Sekhar T K
 wrote:
> Moved the definition of GCC_VERSION from lib/librte_table/rte_lru.h
> to lib/librte_eal/common/include/rte_common.h.
>
> Tested compilation on:
>  * arm64 with gcc
>  * x86 with gcc and clang
>
> Signed-off-by: Ashwin Sekhar T K 
> Reviewed-by: Jan Viktorin 
> ---
>  lib/librte_eal/common/include/rte_common.h |  6 ++
>  lib/librte_table/rte_lru.h | 10 ++
>  2 files changed, 8 insertions(+), 8 deletions(-)
>

Acked-by: Jianbo Liu 


Re: [dpdk-dev] [PATCH v5 3/4] net: add arm64 neon version of CRC compute APIs

2017-05-14 Thread Jianbo Liu
On 12 May 2017 at 18:15, Ashwin Sekhar T K
 wrote:
> Added CRC compute APIs for arm64 utilizing the pmull
> capability.
>
> Added new file net_crc_neon.h to hold the arm64 pmull
> CRC implementation.
>
> Added wrappers in rte_vect.h for those neon intrinsics
> which are not supported in GCC version < 7.
>
> Verified the changes with crc_autotest unit test case
>
> Signed-off-by: Ashwin Sekhar T K 
> ---
>  MAINTAINERS   |   1 +
>  lib/librte_eal/common/include/arch/arm/rte_vect.h |  88 +++
>  lib/librte_net/net_crc_neon.h | 297 ++
>  lib/librte_net/rte_net_crc.c  |  34 ++-
>  lib/librte_net/rte_net_crc.h  |   2 +
>  5 files changed, 416 insertions(+), 6 deletions(-)
>  create mode 100644 lib/librte_net/net_crc_neon.h
>

Acked-by: Jianbo Liu 


Re: [dpdk-dev] [PATCH] driver/net: remove unnecessary macro for unused variables

2017-05-14 Thread Yuanhan Liu
On Fri, May 12, 2017 at 11:33:03AM +0100, Ferruh Yigit wrote:
> remove __rte_unused instances that are not required.

I'm wondering whether this was done by a script?

--yliu


[dpdk-dev] [RFC 2/2] doc/guides/prog_guide: add new flow attribute

2017-05-14 Thread Qi Zhang
Update the programming guide for the new attribute of rte_flow

Signed-off-by: Qi Zhang 
---
 doc/guides/prog_guide/rte_flow.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b587ba9..5207eec 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -181,6 +181,18 @@ directions. At least one direction must be specified.
 Specifying both directions at once for a given rule is not recommended but
 may be valid in a few cases (e.g. shared counters).
 
+Attribute: Match hint
+^^^^^^^^^^^^^^^^^^^^^
+
+This is a attribute to hint different pattern match accuracy.
+
+Perfect match:
+- Actions will be taken if input packet's pattern matches flow's pattern.
+
+Signature match:
+- Actions will be taken if the signature of input packet's pattern matches
+  the signature of flow's pattern.
+
 Pattern item
 ~~~~~~~~~~~~
 
-- 
2.7.4



[dpdk-dev] [RFC 1/2] rte_flow: add attribute for signature match

2017-05-14 Thread Qi Zhang
Add a new attribute "sig_match" to rte_flow_attr.
This attribute indicates whether the current flow uses "perfect match"
or "signature match".
With perfect match (the default), if a packet does not match the pattern,
the actions will not be taken (this is identical to the current behavior).
With signature match, a packet that does not match the pattern can still
trigger the actions; this happens when the device considers the signature
of the pattern to be matched.
Signature match is expected to perform better than perfect match, at
the cost of accuracy.
When a flow rule has this attribute set, identical behavior can ONLY
be guaranteed if the packet matches the pattern, since different devices
may implement different signature calculation algorithms.
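
As a rough illustration of how an application might request signature match
if this RFC were applied, here is a minimal sketch. It deliberately uses a
local mirror of the attribute structure (flow_attr_sketch, a hypothetical
name) rather than the real header, so the layout below is an assumption
based on the diff in this patch, not the merged API.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct rte_flow_attr as it would look with this RFC
 * applied. Hypothetical: real code would include rte_flow.h instead. */
struct flow_attr_sketch {
	uint32_t group;        /* Priority group. */
	uint32_t priority;     /* Priority level within group. */
	uint32_t ingress:1;    /* Rule applies to ingress traffic. */
	uint32_t egress:1;     /* Rule applies to egress traffic. */
	uint32_t sig_match:1;  /* Proposed: match on the hash signature only. */
	uint32_t reserved:29;  /* Reserved, must be zero. */
};

/* The new bit is carved out of the reserved field, so the structure size
 * should not change (this holds on common ABIs such as GCC and Clang). */
static_assert(sizeof(struct flow_attr_sketch) == 3 * sizeof(uint32_t),
	      "sig_match must not grow the attribute structure");

/* Build an ingress attribute, optionally asking for signature match.
 * Members that are not named are zero-initialized, including reserved. */
static inline struct flow_attr_sketch
make_flow_attr_sketch(int want_sig_match)
{
	struct flow_attr_sketch attr = {
		.ingress = 1,
		.sig_match = want_sig_match ? 1 : 0,
	};

	return attr;
}
```

Leaving sig_match at 0 keeps today's perfect-match behavior, which is why
existing callers that zero the structure would be unaffected.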

Signed-off-by: Qi Zhang 
---
 app/test-pmd/cmdline_flow.c | 11 +++
 lib/librte_ether/rte_flow.h |  3 ++-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 0fd69f9..512f817 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -95,6 +95,7 @@ enum index {
PRIORITY,
INGRESS,
EGRESS,
+   SIG_MATCH,
 
/* Validate/create pattern. */
PATTERN,
@@ -397,6 +398,7 @@ static const enum index next_vc_attr[] = {
PRIORITY,
INGRESS,
EGRESS,
+   SIG_MATCH,
PATTERN,
ZERO,
 };
@@ -896,6 +898,12 @@ static const struct token token_list[] = {
.next = NEXT(next_vc_attr),
.call = parse_vc,
},
+   [SIG_MATCH] = {
+   .name = "sig_match",
+   .help = "affect rule to match",
+   .next = NEXT(next_vc_attr),
+   .call = parse_vc,
+   },
/* Validate/create pattern. */
[PATTERN] = {
.name = "pattern",
@@ -1728,6 +1736,9 @@ parse_vc(struct context *ctx, const struct token *token,
case EGRESS:
out->args.vc.attr.egress = 1;
return len;
+   case SIG_MATCH:
+   out->args.vc.attr.sig_match = 1;
+   return len;
case PATTERN:
out->args.vc.pattern =
(void *)RTE_ALIGN_CEIL((uintptr_t)(out + 1),
diff --git a/lib/librte_ether/rte_flow.h b/lib/librte_ether/rte_flow.h
index c47edbc..8ba3c36 100644
--- a/lib/librte_ether/rte_flow.h
+++ b/lib/librte_ether/rte_flow.h
@@ -95,7 +95,8 @@ struct rte_flow_attr {
uint32_t priority; /**< Priority level within group. */
uint32_t ingress:1; /**< Rule applies to ingress traffic. */
uint32_t egress:1; /**< Rule applies to egress traffic. */
-   uint32_t reserved:30; /**< Reserved, must be zero. */
+   uint32_t sig_match:1; /**< only use hash signature to match. */
+   uint32_t reserved:29; /**< Reserved, must be zero. */
 };
 
 /**
-- 
2.7.4



[dpdk-dev] [RFC 0/2] ethdev: add new attribute for signature match

2017-05-14 Thread Qi Zhang
We tried to enable ixgbe's signature match with rte_flow but didn't
find a way to do so with the current APIs, so this RFC proposes adding a
new flow attribute "sig_match" to indicate whether the current flow is
"perfect match" or "signature match".
With perfect match (the default), if a packet does not match the pattern,
the actions will not be taken (this is identical to the current behavior).
With signature match, a packet that does not match the pattern can still
trigger the actions; this happens when the device considers the signature
of the pattern to be matched.
Signature match is expected to perform better than perfect match, at
the cost of accuracy.
When a flow rule has this attribute set, identical behavior can ONLY
be guaranteed if the packet matches the pattern, since different devices
may implement different signature calculation algorithms.
The driver of a device that does not support signature match is not
required to return an error; it can simply ignore this attribute, because
the default "perfect match" can still be regarded as a special case of
"signature match".

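The difference between the two modes can be shown with a toy model. This is
only a sketch: toy_signature() is a made-up 16-bit fold, not ixgbe's or any
other device's hash, and the 32-bit keys stand in for a full flow pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy 16-bit signature: XOR-fold of a 32-bit key. Real devices use their
 * own, device-specific hash; this one is purely illustrative. */
static inline uint16_t toy_signature(uint32_t key)
{
	return (uint16_t)(key ^ (key >> 16));
}

/* Perfect match: the full key must be equal. */
static inline bool perfect_match(uint32_t rule_key, uint32_t pkt_key)
{
	return rule_key == pkt_key;
}

/* Signature match: only the signatures are compared, so distinct keys
 * that collide on the signature also match (a false positive). */
static inline bool signature_match(uint32_t rule_key, uint32_t pkt_key)
{
	return toy_signature(rule_key) == toy_signature(pkt_key);
}

/* Perfect match implies signature match, never the other way round.
 * This is why a driver may fall back to perfect match. */
static inline void check_perfect_implies_signature(uint32_t rule_key,
						    uint32_t pkt_key)
{
	assert(!perfect_match(rule_key, pkt_key) ||
	       signature_match(rule_key, pkt_key));
}
```

In this model the keys 0x0A000001 and 0x0A010000 differ but fold to the same
signature, so a rule for the first key would also fire for the second under
signature match, while perfect match would reject it.
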
Qi Zhang (2):
  rte_flow: add attribute for signature match
  doc/guides/prog_guide: add new rte_flow attribute

 app/test-pmd/cmdline_flow.c| 11 +++
 doc/guides/prog_guide/rte_flow.rst | 12 
 lib/librte_ether/rte_flow.h|  3 ++-
 3 files changed, 25 insertions(+), 1 deletion(-)

-- 
2.7.4



[dpdk-dev] [PATCH v4 0/8] accelerate examples/l3fwd with NEON on ARM64 platform

2017-05-14 Thread Jianbo Liu
v4:
  - add vcopyq_laneq_u32 for older version of gcc

v3:
  - remove unnecessary prefetch for rte_mbuf
  - fix typo in git log
  - Ashwin's suggestions for performance on ThunderX

v2:
  - change name of l3fwd_em_sse.h to l3fwd_em_sequential.h
  - add the times of hash multi-lookup for different Archs
  - performance tuning on ThunderX: prefetching, set NO_HASH_MULTI_LOOKUP ...

Jianbo Liu (8):
  examples/l3fwd: extract arch independent code from multi hash lookup
  examples/l3fwd: rename l3fwd_em_sse.h to l3fwd_em_sequential.h
  examples/l3fwd: extract common code from multi packet send
  examples/l3fwd: rearrange the code for lpm_l3fwd
  arch/arm: add vcopyq_laneq_u32 for old version of gcc
  examples/l3fwd: add neon support for l3fwd
  examples/l3fwd: add the times of hash multi-lookup for different Archs
  examples/l3fwd: change the guard macro name for header file

 examples/l3fwd/l3fwd_common.h  | 293 +
 examples/l3fwd/l3fwd_em.c  |   8 +-
 examples/l3fwd/l3fwd_em_hlm.h  | 218 +++
 examples/l3fwd/l3fwd_em_hlm_neon.h |  74 ++
 examples/l3fwd/l3fwd_em_hlm_sse.h  | 280 +---
 .../{l3fwd_em_sse.h => l3fwd_em_sequential.h}  |  24 +-
 examples/l3fwd/l3fwd_lpm.c |  87 +-
 examples/l3fwd/l3fwd_lpm.h |  26 +-
 examples/l3fwd/l3fwd_lpm_neon.h| 193 ++
 examples/l3fwd/l3fwd_lpm_sse.h |  66 -
 examples/l3fwd/l3fwd_neon.h| 259 ++
 examples/l3fwd/l3fwd_sse.h | 255 +-
 lib/librte_eal/common/include/arch/arm/rte_vect.h  |   9 +
 13 files changed, 1166 insertions(+), 626 deletions(-)
 create mode 100644 examples/l3fwd/l3fwd_common.h
 create mode 100644 examples/l3fwd/l3fwd_em_hlm.h
 create mode 100644 examples/l3fwd/l3fwd_em_hlm_neon.h
 rename examples/l3fwd/{l3fwd_em_sse.h => l3fwd_em_sequential.h} (88%)
 create mode 100644 examples/l3fwd/l3fwd_lpm_neon.h
 create mode 100644 examples/l3fwd/l3fwd_neon.h

-- 
1.8.3.1



[dpdk-dev] [PATCH v4 1/8] examples/l3fwd: extract arch independent code from multi hash lookup

2017-05-14 Thread Jianbo Liu
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.

Signed-off-by: Jianbo Liu 
---
 examples/l3fwd/l3fwd_em.c |   2 +-
 examples/l3fwd/l3fwd_em_hlm.h | 302 ++
 examples/l3fwd/l3fwd_em_hlm_sse.h | 280 +--
 3 files changed, 309 insertions(+), 275 deletions(-)
 create mode 100644 examples/l3fwd/l3fwd_em_hlm.h

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 9cc4460..939a16d 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -332,7 +332,7 @@ struct ipv6_l3fwd_em_route {
 #if defined(NO_HASH_MULTI_LOOKUP)
 #include "l3fwd_em_sse.h"
 #else
-#include "l3fwd_em_hlm_sse.h"
+#include "l3fwd_em_hlm.h"
 #endif
 #else
 #include "l3fwd_em.h"
diff --git a/examples/l3fwd/l3fwd_em_hlm.h b/examples/l3fwd/l3fwd_em_hlm.h
new file mode 100644
index 0000000..636dea4
--- /dev/null
+++ b/examples/l3fwd/l3fwd_em_hlm.h
@@ -0,0 +1,302 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2017, Linaro Limited
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef __L3FWD_EM_HLM_H__
+#define __L3FWD_EM_HLM_H__
+
+#include "l3fwd_sse.h"
+#include "l3fwd_em_hlm_sse.h"
+
+static inline __attribute__((always_inline)) void
+em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
+   uint8_t portid, uint16_t dst_port[8])
+{
+   int32_t ret[8];
+   union ipv4_5tuple_host key[8];
+
+   get_ipv4_5tuple(m[0], mask0.x, &key[0]);
+   get_ipv4_5tuple(m[1], mask0.x, &key[1]);
+   get_ipv4_5tuple(m[2], mask0.x, &key[2]);
+   get_ipv4_5tuple(m[3], mask0.x, &key[3]);
+   get_ipv4_5tuple(m[4], mask0.x, &key[4]);
+   get_ipv4_5tuple(m[5], mask0.x, &key[5]);
+   get_ipv4_5tuple(m[6], mask0.x, &key[6]);
+   get_ipv4_5tuple(m[7], mask0.x, &key[7]);
+
+   const void *key_array[8] = {&key[0], &key[1], &key[2], &key[3],
+   &key[4], &key[5], &key[6], &key[7]};
+
+   rte_hash_lookup_bulk(qconf->ipv4_lookup_struct, &key_array[0], 8, ret);
+
+   dst_port[0] = (uint8_t) ((ret[0] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[0]]);
+   dst_port[1] = (uint8_t) ((ret[1] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[1]]);
+   dst_port[2] = (uint8_t) ((ret[2] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[2]]);
+   dst_port[3] = (uint8_t) ((ret[3] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[3]]);
+   dst_port[4] = (uint8_t) ((ret[4] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[4]]);
+   dst_port[5] = (uint8_t) ((ret[5] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[5]]);
+   dst_port[6] = (uint8_t) ((ret[6] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[6]]);
+   dst_port[7] = (uint8_t) ((ret[7] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[7]]);
+
+   if (dst_port[0] >= RTE_MAX_ETHPORTS ||
+   (enabled_port_mask & 1 << dst_port[0]) == 0)
+   dst_port[0] = portid;
+
+   if (dst_port[1] >= RTE_MAX_ETHPORTS ||
+   (enabled_port_mask & 1 << dst_port[1]) == 0)
+   dst_port[1] = portid;
+
+   if (dst_port[2] >= RTE_MAX_ETHPORTS ||
+   (enabled_port_m

[dpdk-dev] [PATCH v4 2/8] examples/l3fwd: rename l3fwd_em_sse.h to l3fwd_em_sequential.h

2017-05-14 Thread Jianbo Liu
l3fwd_em_sse.h is enabled by NO_HASH_MULTI_LOOKUP.
Rename it because it is only used for sequential hash lookup
and doesn't include any x86 SSE instructions.

Signed-off-by: Jianbo Liu 
---
 examples/l3fwd/l3fwd_em.c| 2 +-
 examples/l3fwd/{l3fwd_em_sse.h => l3fwd_em_sequential.h} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename examples/l3fwd/{l3fwd_em_sse.h => l3fwd_em_sequential.h} (100%)

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 939a16d..ba844b2 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -330,7 +330,7 @@ struct ipv6_l3fwd_em_route {
 
 #if defined(__SSE4_1__)
 #if defined(NO_HASH_MULTI_LOOKUP)
-#include "l3fwd_em_sse.h"
+#include "l3fwd_em_sequential.h"
 #else
 #include "l3fwd_em_hlm.h"
 #endif
diff --git a/examples/l3fwd/l3fwd_em_sse.h b/examples/l3fwd/l3fwd_em_sequential.h
similarity index 100%
rename from examples/l3fwd/l3fwd_em_sse.h
rename to examples/l3fwd/l3fwd_em_sequential.h
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 3/8] examples/l3fwd: extract common code from multi packet send

2017-05-14 Thread Jianbo Liu
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.

Signed-off-by: Jianbo Liu 
---
 examples/l3fwd/l3fwd_common.h | 293 ++
 examples/l3fwd/l3fwd_sse.h| 255 +---
 2 files changed, 297 insertions(+), 251 deletions(-)
 create mode 100644 examples/l3fwd/l3fwd_common.h

diff --git a/examples/l3fwd/l3fwd_common.h b/examples/l3fwd/l3fwd_common.h
new file mode 100644
index 0000000..d7a1fdf
--- /dev/null
+++ b/examples/l3fwd/l3fwd_common.h
@@ -0,0 +1,293 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2017, Linaro Limited
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+
+#ifndef _L3FWD_COMMON_H_
+#define _L3FWD_COMMON_H_
+
+#ifdef DO_RFC_1812_CHECKS
+
+#define IPV4_MIN_VER_IHL        0x45
+#define IPV4_MAX_VER_IHL        0x4f
+#define IPV4_MAX_VER_IHL_DIFF   (IPV4_MAX_VER_IHL - IPV4_MIN_VER_IHL)
+
+/* Minimum value of IPV4 total length (20B) in network byte order. */
+#define IPV4_MIN_LEN_BE (sizeof(struct ipv4_hdr) << 8)
+
+/*
+ * From http://www.rfc-editor.org/rfc/rfc1812.txt section 5.2.2:
+ * - The IP version number must be 4.
+ * - The IP header length field must be large enough to hold the
+ *   minimum length legal IP datagram (20 bytes = 5 words).
+ * - The IP total length field must be large enough to hold the IP
+ *   datagram header, whose length is specified in the IP header length
+ *   field.
+ * If we encounter invalid IPV4 packet, then set destination port for it
+ * to BAD_PORT value.
+ */
+static inline __attribute__((always_inline)) void
+rfc1812_process(struct ipv4_hdr *ipv4_hdr, uint16_t *dp, uint32_t ptype)
+{
+   uint8_t ihl;
+
+   if (RTE_ETH_IS_IPV4_HDR(ptype)) {
+   ihl = ipv4_hdr->version_ihl - IPV4_MIN_VER_IHL;
+
+   ipv4_hdr->time_to_live--;
+   ipv4_hdr->hdr_checksum++;
+
+   if (ihl > IPV4_MAX_VER_IHL_DIFF ||
+   ((uint8_t)ipv4_hdr->total_length == 0 &&
+   ipv4_hdr->total_length < IPV4_MIN_LEN_BE))
+   dp[0] = BAD_PORT;
+
+   }
+}
+
+#else
+#define rfc1812_process(mb, dp, ptype)  do { } while (0)
+#endif /* DO_RFC_1812_CHECKS */
+
+/*
+ * We group consecutive packets with the same destination port into one burst.
+ * To avoid extra latency this is done together with some other packet
+ * processing, but after we made a final decision about packet's destination.
+ * To do this we maintain:
+ * pnum - array of number of consecutive packets with the same dest port for
+ * each packet in the input burst.
+ * lp - pointer to the last updated element in the pnum.
+ * dlp - dest port value lp corresponds to.
+ */
+
+#define GRPSZ   (1 << FWDSTEP)
+#define GRPMSK  (GRPSZ - 1)
+
+#define GROUP_PORT_STEP(dlp, dcp, lp, pn, idx) do { \
+   if (likely((dlp) == (dcp)[(idx)])) { \
+   (lp)[0]++;   \
+   } else { \
+   (dlp) = (dcp)[idx];  \
+   (lp) = (pn) + (idx); \
+   (lp)[0] = 1; \
+   }\
+} 

[dpdk-dev] [PATCH v4 5/8] arch/arm: add vcopyq_laneq_u32 for old version of gcc

2017-05-14 Thread Jianbo Liu
Implement vcopyq_laneq_u32 if gcc version is lower than 7.
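
For readers without an ARM toolchain at hand, the effect of the fallback can
be modelled in portable C. This is a scalar sketch only: u32x4_model and
copy_lane_model() are hypothetical stand-ins for uint32x4_t and
vcopyq_laneq_u32, and no NEON intrinsics are used.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar stand-in for NEON's uint32x4_t (four 32-bit lanes).
 * Hypothetical type, for illustration only. */
typedef struct {
	uint32_t lane[4];
} u32x4_model;

/* Model of vcopyq_laneq_u32(a, lane_a, b, lane_b): return a copy of `a`
 * in which lane `lane_a` is replaced by lane `lane_b` of `b`. This is the
 * same effect as reading one lane of `b` and setting it into `a`. */
static inline u32x4_model
copy_lane_model(u32x4_model a, int lane_a, u32x4_model b, int lane_b)
{
	assert(lane_a >= 0 && lane_a < 4);
	assert(lane_b >= 0 && lane_b < 4);

	a.lane[lane_a] = b.lane[lane_b];
	return a;
}
```

Because `a` is passed by value, the caller's vector is left untouched and
only the returned copy carries the new lane, mirroring how the intrinsic
returns a new vector.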

Signed-off-by: Jianbo Liu 
---
 lib/librte_eal/common/include/arch/arm/rte_vect.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
index 4107c99..d9fb4d0 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -78,6 +78,15 @@
 }
 #endif
 
+#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 7)
+static inline uint32x4_t
+vcopyq_laneq_u32(uint32x4_t a, const int lane_a,
+uint32x4_t b, const int lane_b)
+{
+   return vsetq_lane_u32(vgetq_lane_u32(b, lane_b), a, lane_a);
+}
+#endif
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.8.3.1



[dpdk-dev] [PATCH v4 6/8] examples/l3fwd: add neon support for l3fwd

2017-05-14 Thread Jianbo Liu
Use ARM NEON intrinsics to accelerate L3 forwarding.

Signed-off-by: Jianbo Liu 
---
 examples/l3fwd/l3fwd_em.c|   4 +-
 examples/l3fwd/l3fwd_em_hlm.h|  17 ++-
 examples/l3fwd/l3fwd_em_hlm_neon.h   |  74 ++
 examples/l3fwd/l3fwd_em_sequential.h |  18 ++-
 examples/l3fwd/l3fwd_lpm.c   |   4 +-
 examples/l3fwd/l3fwd_lpm_neon.h  | 193 ++
 examples/l3fwd/l3fwd_neon.h  | 259 +++
 7 files changed, 563 insertions(+), 6 deletions(-)
 create mode 100644 examples/l3fwd/l3fwd_em_hlm_neon.h
 create mode 100644 examples/l3fwd/l3fwd_lpm_neon.h
 create mode 100644 examples/l3fwd/l3fwd_neon.h

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index ba844b2..da96cfd 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -328,7 +328,7 @@ struct ipv6_l3fwd_em_route {
return (uint8_t)((ret < 0) ? portid : ipv6_l3fwd_out_if[ret]);
 }
 
-#if defined(__SSE4_1__)
+#if defined(__SSE4_1__) || defined(RTE_MACHINE_CPUFLAG_NEON)
 #if defined(NO_HASH_MULTI_LOOKUP)
 #include "l3fwd_em_sequential.h"
 #else
@@ -709,7 +709,7 @@ struct ipv6_l3fwd_em_route {
if (nb_rx == 0)
continue;
 
-#if defined(__SSE4_1__)
+#if defined(__SSE4_1__) || defined(RTE_MACHINE_CPUFLAG_NEON)
l3fwd_em_send_packets(nb_rx, pkts_burst,
portid, qconf);
 #else
diff --git a/examples/l3fwd/l3fwd_em_hlm.h b/examples/l3fwd/l3fwd_em_hlm.h
index 636dea4..b9163e3 100644
--- a/examples/l3fwd/l3fwd_em_hlm.h
+++ b/examples/l3fwd/l3fwd_em_hlm.h
@@ -35,8 +35,13 @@
 #ifndef __L3FWD_EM_HLM_H__
 #define __L3FWD_EM_HLM_H__
 
+#if defined(__SSE4_1__)
 #include "l3fwd_sse.h"
 #include "l3fwd_em_hlm_sse.h"
+#elif defined(RTE_MACHINE_CPUFLAG_NEON)
+#include "l3fwd_neon.h"
+#include "l3fwd_em_hlm_neon.h"
+#endif
 
 static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
@@ -238,7 +243,7 @@ static inline __attribute__((always_inline)) uint16_t
 l3fwd_em_send_packets(int nb_rx, struct rte_mbuf **pkts_burst,
uint8_t portid, struct lcore_conf *qconf)
 {
-   int32_t j;
+   int32_t i, j, pos;
uint16_t dst_port[MAX_PKT_BURST];
 
/*
@@ -247,6 +252,11 @@ static inline __attribute__((always_inline)) uint16_t
 */
int32_t n = RTE_ALIGN_FLOOR(nb_rx, 8);
 
+   for (j = 0; j < 8 && j < nb_rx; j++) {
+   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
+  struct ether_hdr *) + 1);
+   }
+
for (j = 0; j < n; j += 8) {
 
uint32_t pkt_type =
@@ -263,6 +273,11 @@ static inline __attribute__((always_inline)) uint16_t
uint32_t tcp_or_udp = pkt_type &
(RTE_PTYPE_L4_TCP | RTE_PTYPE_L4_UDP);
 
+   for (i = 0, pos = j + 8; i < 8 && pos < nb_rx; i++, pos++) {
+   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[pos],
+  struct ether_hdr *) + 1);
+   }
+
if (tcp_or_udp && (l3_type == RTE_PTYPE_L3_IPV4)) {
 
em_get_dst_port_ipv4x8(qconf, &pkts_burst[j], portid,
diff --git a/examples/l3fwd/l3fwd_em_hlm_neon.h b/examples/l3fwd/l3fwd_em_hlm_neon.h
new file mode 100644
index 0000000..dae1acf
--- /dev/null
+++ b/examples/l3fwd/l3fwd_em_hlm_neon.h
@@ -0,0 +1,74 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 Intel Corporation. All rights reserved.
+ *   Copyright(c) 2017, Linaro Limited
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICE

[dpdk-dev] [PATCH v4 4/8] examples/l3fwd: rearrange the code for lpm_l3fwd

2017-05-14 Thread Jianbo Liu
Signed-off-by: Jianbo Liu 

Some common code can be used by other architectures, so move it to l3fwd_lpm.c.
---
 examples/l3fwd/l3fwd_lpm.c | 83 ++
 examples/l3fwd/l3fwd_lpm.h | 26 +
 examples/l3fwd/l3fwd_lpm_sse.h | 66 -
 3 files changed, 84 insertions(+), 91 deletions(-)

diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index f621269..fc554fc 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -104,6 +104,89 @@ struct ipv6_l3fwd_lpm_route {
 struct rte_lpm *ipv4_l3fwd_lpm_lookup_struct[NB_SOCKETS];
 struct rte_lpm6 *ipv6_l3fwd_lpm_lookup_struct[NB_SOCKETS];
 
+static inline uint16_t
+lpm_get_ipv4_dst_port(void *ipv4_hdr,  uint8_t portid, void *lookup_struct)
+{
+   uint32_t next_hop;
+   struct rte_lpm *ipv4_l3fwd_lookup_struct =
+   (struct rte_lpm *)lookup_struct;
+
+   return (uint16_t) ((rte_lpm_lookup(ipv4_l3fwd_lookup_struct,
+   rte_be_to_cpu_32(((struct ipv4_hdr *)ipv4_hdr)->dst_addr),
+   &next_hop) == 0) ? next_hop : portid);
+}
+
+static inline uint16_t
+lpm_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, void *lookup_struct)
+{
+   uint32_t next_hop;
+   struct rte_lpm6 *ipv6_l3fwd_lookup_struct =
+   (struct rte_lpm6 *)lookup_struct;
+
+   return (uint16_t) ((rte_lpm6_lookup(ipv6_l3fwd_lookup_struct,
+   ((struct ipv6_hdr *)ipv6_hdr)->dst_addr,
+   &next_hop) == 0) ?  next_hop : portid);
+}
+
+static inline __attribute__((always_inline)) uint16_t
+lpm_get_dst_port(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
+   uint8_t portid)
+{
+   struct ipv6_hdr *ipv6_hdr;
+   struct ipv4_hdr *ipv4_hdr;
+   struct ether_hdr *eth_hdr;
+
+   if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+
+   eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
+   ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
+
+   return lpm_get_ipv4_dst_port(ipv4_hdr, portid,
+qconf->ipv4_lookup_struct);
+   } else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+
+   eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
+   ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
+
+   return lpm_get_ipv6_dst_port(ipv6_hdr, portid,
+qconf->ipv6_lookup_struct);
+   }
+
+   return portid;
+}
+
+/*
+ * lpm_get_dst_port optimized routine for packets where dst_ipv4 is already
+ * precalculated. If packet is ipv6 dst_addr is taken directly from packet
+ * header and dst_ipv4 value is not used.
+ */
+static inline __attribute__((always_inline)) uint16_t
+lpm_get_dst_port_with_ipv4(const struct lcore_conf *qconf, struct rte_mbuf *pkt,
+   uint32_t dst_ipv4, uint8_t portid)
+{
+   uint32_t next_hop;
+   struct ipv6_hdr *ipv6_hdr;
+   struct ether_hdr *eth_hdr;
+
+   if (RTE_ETH_IS_IPV4_HDR(pkt->packet_type)) {
+   return (uint16_t) ((rte_lpm_lookup(qconf->ipv4_lookup_struct,
+  dst_ipv4, &next_hop) == 0)
+  ? next_hop : portid);
+
+   } else if (RTE_ETH_IS_IPV6_HDR(pkt->packet_type)) {
+
+   eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
+   ipv6_hdr = (struct ipv6_hdr *)(eth_hdr + 1);
+
+   return (uint16_t) ((rte_lpm6_lookup(qconf->ipv6_lookup_struct,
+   ipv6_hdr->dst_addr, &next_hop) == 0)
+   ? next_hop : portid);
+
+   }
+
+   return portid;
+}
+
 #if defined(__SSE4_1__)
 #include "l3fwd_lpm_sse.h"
 #else
diff --git a/examples/l3fwd/l3fwd_lpm.h b/examples/l3fwd/l3fwd_lpm.h
index 258a82f..4865d90 100644
--- a/examples/l3fwd/l3fwd_lpm.h
+++ b/examples/l3fwd/l3fwd_lpm.h
@@ -34,37 +34,13 @@
 #ifndef __L3FWD_LPM_H__
 #define __L3FWD_LPM_H__
 
-static inline uint8_t
-lpm_get_ipv4_dst_port(void *ipv4_hdr,  uint8_t portid, void *lookup_struct)
-{
-   uint32_t next_hop;
-   struct rte_lpm *ipv4_l3fwd_lookup_struct =
-   (struct rte_lpm *)lookup_struct;
-
-   return (uint8_t) ((rte_lpm_lookup(ipv4_l3fwd_lookup_struct,
-   rte_be_to_cpu_32(((struct ipv4_hdr *)ipv4_hdr)->dst_addr),
-   &next_hop) == 0) ? next_hop : portid);
-}
-
-static inline uint8_t
-lpm_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, void *lookup_struct)
-{
-   uint32_t next_hop;
-   struct rte_lpm6 *ipv6_l3fwd_lookup_struct =
-   (struct rte_lpm6 *)lookup_struct;
-
-   return (uint8_t) ((rte_lpm6_lookup(ipv6_l3fwd_lookup_struct,
-   ((struct ipv6_hdr *)ipv6_hdr)->dst_addr,
-   &next_hop) == 0) ?  next_hop : portid);
-}
-
 static inline __attribute__((always_inline)) void
 l3fwd_lpm_simple_forward(struct rte_mbuf *m, uint8_t port

[dpdk-dev] [PATCH v4 7/8] examples/l3fwd: add the times of hash multi-lookup for different Archs

2017-05-14 Thread Jianbo Liu
Add a new macro that defines how many hash lookups are performed at one
time; this also makes the code more concise.
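
The idea can be sketched as a loop over a compile-time batch size in place
of eight hand-unrolled statements. This is an illustration under assumptions,
not the patch itself: LOOKUP_COUNT_SKETCH and MAX_PORTS_SKETCH stand in for
EM_HASH_LOOKUP_COUNT and RTE_MAX_ETHPORTS, and resolve_dst_ports() is a
hypothetical helper that starts from already completed lookup results.

```c
#include <assert.h>
#include <stdint.h>

#define LOOKUP_COUNT_SKETCH 8   /* stands in for EM_HASH_LOOKUP_COUNT */
#define MAX_PORTS_SKETCH    32  /* stands in for RTE_MAX_ETHPORTS */

static_assert(MAX_PORTS_SKETCH <= 32, "the port mask below is 32 bits wide");

/* Resolve the destination port for each packet of one batch.
 * ret[i] < 0 means the lookup missed; otherwise it indexes out_if[]. */
static inline void
resolve_dst_ports(const int32_t ret[LOOKUP_COUNT_SKETCH],
		  const uint8_t *out_if, uint32_t enabled_port_mask,
		  uint8_t portid, uint16_t dst_port[LOOKUP_COUNT_SKETCH])
{
	int i;

	for (i = 0; i < LOOKUP_COUNT_SKETCH; i++) {
		dst_port[i] = (ret[i] < 0) ? portid : out_if[ret[i]];

		/* Fall back to the input port if the result is out of
		 * range or refers to a port that is not enabled. */
		if (dst_port[i] >= MAX_PORTS_SKETCH ||
		    (enabled_port_mask & (UINT32_C(1) << dst_port[i])) == 0)
			dst_port[i] = portid;
	}
}
```

Changing the batch size then only means changing the macro, which is what
allows a different value per architecture.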

Signed-off-by: Jianbo Liu 
---
 examples/l3fwd/l3fwd_em_hlm.h | 241 +-
 1 file changed, 71 insertions(+), 170 deletions(-)

diff --git a/examples/l3fwd/l3fwd_em_hlm.h b/examples/l3fwd/l3fwd_em_hlm.h
index b9163e3..098b396 100644
--- a/examples/l3fwd/l3fwd_em_hlm.h
+++ b/examples/l3fwd/l3fwd_em_hlm.h
@@ -43,148 +43,65 @@
 #include "l3fwd_em_hlm_neon.h"
 #endif
 
+#ifdef RTE_ARCH_ARM64
+#define EM_HASH_LOOKUP_COUNT 16
+#else
+#define EM_HASH_LOOKUP_COUNT 8
+#endif
+
+
 static inline __attribute__((always_inline)) void
-em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
-   uint8_t portid, uint16_t dst_port[8])
+em_get_dst_port_ipv4xN(struct lcore_conf *qconf, struct rte_mbuf *m[],
+   uint8_t portid, uint16_t dst_port[])
 {
-   int32_t ret[8];
-   union ipv4_5tuple_host key[8];
-
-   get_ipv4_5tuple(m[0], mask0.x, &key[0]);
-   get_ipv4_5tuple(m[1], mask0.x, &key[1]);
-   get_ipv4_5tuple(m[2], mask0.x, &key[2]);
-   get_ipv4_5tuple(m[3], mask0.x, &key[3]);
-   get_ipv4_5tuple(m[4], mask0.x, &key[4]);
-   get_ipv4_5tuple(m[5], mask0.x, &key[5]);
-   get_ipv4_5tuple(m[6], mask0.x, &key[6]);
-   get_ipv4_5tuple(m[7], mask0.x, &key[7]);
-
-   const void *key_array[8] = {&key[0], &key[1], &key[2], &key[3],
-   &key[4], &key[5], &key[6], &key[7]};
-
-   rte_hash_lookup_bulk(qconf->ipv4_lookup_struct, &key_array[0], 8, ret);
-
-   dst_port[0] = (uint8_t) ((ret[0] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[0]]);
-   dst_port[1] = (uint8_t) ((ret[1] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[1]]);
-   dst_port[2] = (uint8_t) ((ret[2] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[2]]);
-   dst_port[3] = (uint8_t) ((ret[3] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[3]]);
-   dst_port[4] = (uint8_t) ((ret[4] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[4]]);
-   dst_port[5] = (uint8_t) ((ret[5] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[5]]);
-   dst_port[6] = (uint8_t) ((ret[6] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[6]]);
-   dst_port[7] = (uint8_t) ((ret[7] < 0) ?
-   portid : ipv4_l3fwd_out_if[ret[7]]);
-
-   if (dst_port[0] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[0]) == 0)
-   dst_port[0] = portid;
-
-   if (dst_port[1] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[1]) == 0)
-   dst_port[1] = portid;
-
-   if (dst_port[2] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[2]) == 0)
-   dst_port[2] = portid;
-
-   if (dst_port[3] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[3]) == 0)
-   dst_port[3] = portid;
-
-   if (dst_port[4] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[4]) == 0)
-   dst_port[4] = portid;
-
-   if (dst_port[5] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[5]) == 0)
-   dst_port[5] = portid;
-
-   if (dst_port[6] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[6]) == 0)
-   dst_port[6] = portid;
-
-   if (dst_port[7] >= RTE_MAX_ETHPORTS ||
-   (enabled_port_mask & 1 << dst_port[7]) == 0)
-   dst_port[7] = portid;
+   int i;
+   int32_t ret[EM_HASH_LOOKUP_COUNT];
+   union ipv4_5tuple_host key[EM_HASH_LOOKUP_COUNT];
+   const void *key_array[EM_HASH_LOOKUP_COUNT];
+
+   for (i = 0; i < EM_HASH_LOOKUP_COUNT; i++) {
+   get_ipv4_5tuple(m[i], mask0.x, &key[i]);
+   key_array[i] = &key[i];
+   }
+
+   rte_hash_lookup_bulk(qconf->ipv4_lookup_struct, &key_array[0],
+EM_HASH_LOOKUP_COUNT, ret);
+
+   for (i = 0; i < EM_HASH_LOOKUP_COUNT; i++) {
+   dst_port[i] = (uint8_t) ((ret[i] < 0) ?
+   portid : ipv4_l3fwd_out_if[ret[i]]);
 
+   if (dst_port[i] >= RTE_MAX_ETHPORTS ||
+   (enabled_port_mask & 1 << dst_port[i]) == 0)
+   dst_port[i] = portid;
+   }
 }
 
 static inline __attribute__((always_inline)) void
-em_get_dst_port_ipv6x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
-   uint8_t portid, uint16_t dst_port[8])
+em_get_dst_port_ipv6xN(struct lcore_conf *qconf, struct rte_mbuf *m[],
+   uint8_t portid, uint16_t dst_port[])
 {
-   int32_t ret[8];
-   union ipv6_5tuple_host key[8];
-
-   get_ipv6_5tuple(m[0], mask1.x, mask2.x, &key[0]);
-   get

[dpdk-dev] [PATCH v4 8/8] examples/l3fwd: change the guard macro name for header file

2017-05-14 Thread Jianbo Liu
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.

Signed-off-by: Jianbo Liu 
---
 examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/examples/l3fwd/l3fwd_em_sequential.h b/examples/l3fwd/l3fwd_em_sequential.h
index 2b3ec16..c7d477d 100644
--- a/examples/l3fwd/l3fwd_em_sequential.h
+++ b/examples/l3fwd/l3fwd_em_sequential.h
@@ -31,8 +31,8 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
-#ifndef __L3FWD_EM_SSE_H__
-#define __L3FWD_EM_SSE_H__
+#ifndef __L3FWD_EM_SEQUENTIAL_H__
+#define __L3FWD_EM_SEQUENTIAL_H__
 
 /**
  * @file
@@ -123,4 +123,4 @@ static inline __attribute__((always_inline)) uint16_t
 
send_packets_multi(qconf, pkts_burst, dst_port, nb_rx);
 }
-#endif /* __L3FWD_EM_SSE_H__ */
+#endif /* __L3FWD_EM_SEQUENTIAL_H__ */
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] event/sw: add queue-to-port stats

2017-05-14 Thread Jerin Jacob
-Original Message-
> Date: Thu, 11 May 2017 10:56:26 +0100
> From: Harry van Haaren 
> To: dev@dpdk.org
> CC: jerin.ja...@caviumnetworks.com, Harry van Haaren
>  
> Subject: [PATCH] event/sw: add queue-to-port stats
> X-Mailer: git-send-email 2.7.4
> 
> This patch targets the next-eventdev tree.
> 
> This commit adds a new statistic to the SW eventdev PMD.
> The statistic shows how many packets were sent from a
> queue to a port. This provides information on how traffic
> from a specific queue is being load-balanced to worker cores.
> 
> Note that these numbers should be compared across all queue
> stages - the load-balancing does not try to perfectly share
> each queue's traffic, rather it balances the overall traffic
> from all queues to the ports.
> 
> The statistic is printed from the rte_eventdev_dump() function,
> as well as being made available via the xstats API.
> 
> Unit tests have been updated to expect more per-queue statistics,
> and the correctness of counts and counts after reset is verified.
> 
> Signed-off-by: Harry van Haaren 

Applied to dpdk-next-eventdev/master after removing "This patch targets
the next-eventdev tree." in the git commit log.

Thanks.
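
A minimal model of the statistic described in the commit message, for anyone unfamiliar with it. All names here (q2p_stats, q2p_record, q2p_port_total, q2p_reset, N_QUEUES, N_PORTS) are hypothetical and do not reflect the SW PMD's internal structures; the sketch only shows a per-queue, per-port counter, a total across queues (the comparison the commit message recommends), and a reset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_QUEUES 3	/* hypothetical number of queue stages */
#define N_PORTS  4	/* hypothetical number of worker ports */

/* Events scheduled from each queue to each port. */
struct q2p_stats {
	uint64_t to_port[N_QUEUES][N_PORTS];
};

/* Count one event scheduled from queue q to port p. */
static void
q2p_record(struct q2p_stats *s, unsigned int q, unsigned int p)
{
	assert(q < N_QUEUES && p < N_PORTS);
	s->to_port[q][p]++;
}

/* Traffic received by port p summed over all queues. */
static uint64_t
q2p_port_total(const struct q2p_stats *s, unsigned int p)
{
	uint64_t total = 0;

	assert(p < N_PORTS);
	for (unsigned int q = 0; q < N_QUEUES; q++)
		total += s->to_port[q][p];
	return total;
}

/* Clear all counters, as a statistics reset would. */
static void
q2p_reset(struct q2p_stats *s)
{
	memset(s, 0, sizeof(*s));
}
```

A single queue may look unevenly spread across ports while the per-port totals are still balanced, which is why the totals are the figure to compare.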



Re: [dpdk-dev] [PATCH] eventdev: clarify atomic and ordered queue config

2017-05-14 Thread Jerin Jacob
-Original Message-
> Date: Fri, 12 May 2017 14:25:37 -0500
> From: Gage Eads 
> To: dev@dpdk.org
> CC: jerin.ja...@caviumnetworks.org
> Subject: [dpdk-dev] [PATCH] eventdev: clarify atomic and ordered queue
>  config
> X-Mailer: git-send-email 2.7.4
> 
> The nb_atomic_flows and nb_atomic_order_sequences fields are only inspected
> if the queue is configured for atomic or ordered scheduling, respectively.
> This commit updates the documentation to reflect that.
> 
> Signed-off-by: Gage Eads 
> ---
>  lib/librte_eventdev/rte_eventdev.h | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eventdev/rte_eventdev.h 
> b/lib/librte_eventdev/rte_eventdev.h
> index 20e7293..32ffcd1 100644
> --- a/lib/librte_eventdev/rte_eventdev.h
> +++ b/lib/librte_eventdev/rte_eventdev.h
> @@ -521,9 +521,11 @@ rte_event_dev_configure(uint8_t dev_id,
>  struct rte_event_queue_conf {
>   uint32_t nb_atomic_flows;
>   /**< The maximum number of active flows this queue can track at any
> -  * given time. The value must be in the range of
> -  * [1 - nb_event_queue_flows)] which previously provided in
> -  * rte_event_dev_info_get().
> +  * given time. If the queue is configured for atomic scheduling (by
> +  * applying the RTE_EVENT_QUEUE_CFG_ALL_TYPES or
> +  * RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY flags to event_queue_cfg), then the
> +  * value must be in the range of [1 - nb_event_queue_flows)], which was
> +  * previously provided in rte_event_dev_configure().
>*/
>   uint32_t nb_atomic_order_sequences;
>   /**< The maximum number of outstanding events waiting to be
> @@ -533,8 +535,11 @@ struct rte_event_queue_conf {
>* scheduler cannot schedule the events from this queue and invalid
>* event will be returned from dequeue until one or more entries are
>* freed up/released.
> -  * The value must be in the range of [1 - nb_event_queue_flows)]
> -  * which previously supplied to rte_event_dev_configure().
> +  * If the queue is configured for ordered scheduling (by applying the
> +  * RTE_EVENT_QUEUE_CFG_ALL_TYPES or RTE_EVENT_QUEUE_CFG_ORDERED_ONLY
> +  * flags to event_queue_cfg), then the value must be in the range of [1
> +  * - nb_event_queue_flows)], which was previously supplied to

At this line, HTML document rendering is not showing up correctly.
Please check the generated HTML output with "make doc-api-html"

Other than that, content looks OK.

> +  * rte_event_dev_configure().
>*/
>   uint32_t event_queue_cfg; /**< Queue cfg flags(EVENT_QUEUE_CFG_) */
>   uint8_t priority;
> -- 
> 2.7.4
> 
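
The rule the patch documents can be sketched as a small validation routine. This is an assumption-laden illustration, not the eventdev implementation: queue_sched, queue_conf_model and queue_conf_valid() are hypothetical names, and the enum values are not the real RTE_EVENT_QUEUE_CFG_* flags. It assumes, as the doc text above states, that both fields share the range [1, nb_event_queue_flows].

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scheduling modes; not the real RTE_EVENT_QUEUE_CFG_* values. */
enum queue_sched {
	Q_ALL_TYPES,
	Q_ATOMIC_ONLY,
	Q_ORDERED_ONLY,
	Q_PARALLEL_ONLY
};

struct queue_conf_model {
	uint32_t nb_atomic_flows;
	uint32_t nb_atomic_order_sequences;
	enum queue_sched sched;
};

static bool
in_range(uint32_t v, uint32_t max)
{
	return v >= 1 && v <= max;
}

/* Each field is checked only when the queue's mode actually uses it. */
static bool
queue_conf_valid(const struct queue_conf_model *c,
		 uint32_t nb_event_queue_flows)
{
	bool atomic = c->sched == Q_ALL_TYPES || c->sched == Q_ATOMIC_ONLY;
	bool ordered = c->sched == Q_ALL_TYPES || c->sched == Q_ORDERED_ONLY;

	assert(c != 0);

	if (atomic && !in_range(c->nb_atomic_flows, nb_event_queue_flows))
		return false;
	if (ordered && !in_range(c->nb_atomic_order_sequences,
				 nb_event_queue_flows))
		return false;
	return true;
}
```

Under this reading, a parallel-only queue may leave both fields at zero, while an all-types queue must set both.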


Re: [dpdk-dev] [PATCH v4 5/8] arch/arm: add vcopyq_laneq_u32 for old version of gcc

2017-05-14 Thread Jerin Jacob
-Original Message-
> Date: Mon, 15 May 2017 11:34:53 +0800
> From: Jianbo Liu 
> To: dev@dpdk.org, tomasz.kante...@intel.com,
>  jerin.ja...@caviumnetworks.com, ashwin.sek...@caviumnetworks.com
> CC: Jianbo Liu 
> Subject: [PATCH v4 5/8] arch/arm: add vcopyq_laneq_u32 for old version of
>  gcc
> X-Mailer: git-send-email 1.8.3.1
> 
> Implement vcopyq_laneq_u32 if gcc version is lower than 7.
> 
> Signed-off-by: Jianbo Liu 

Acked-by: Jerin Jacob 

> ---
>  lib/librte_eal/common/include/arch/arm/rte_vect.h | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h 
> b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> index 4107c99..d9fb4d0 100644
> --- a/lib/librte_eal/common/include/arch/arm/rte_vect.h
> +++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
> @@ -78,6 +78,15 @@
>  }
>  #endif
>  
> +#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 7)
> +static inline uint32x4_t
> +vcopyq_laneq_u32(uint32x4_t a, const int lane_a,
> +  uint32x4_t b, const int lane_b)
> +{
> + return vsetq_lane_u32(vgetq_lane_u32(b, lane_b), a, lane_a);
> +}
> +#endif
> +
>  #ifdef __cplusplus
>  }
>  #endif
> -- 
> 1.8.3.1
> 
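
For readers without an ARM toolchain at hand, the semantics of the fallback (read lane lane_b of b, write it into lane lane_a of a, return the result) can be modelled in plain C. The type u32x4_model and the function copy_lane_model() are hypothetical scalar stand-ins for uint32x4_t and vcopyq_laneq_u32; this sketch does not use NEON.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for the NEON uint32x4_t vector type. */
typedef struct {
	uint32_t lane[4];
} u32x4_model;

/*
 * Return a copy of a in which lane lane_a is replaced by lane lane_b
 * of b. All other lanes of a are kept; b is not modified.
 */
static inline u32x4_model
copy_lane_model(u32x4_model a, int lane_a, u32x4_model b, int lane_b)
{
	assert(lane_a >= 0 && lane_a < 4);
	assert(lane_b >= 0 && lane_b < 4);

	a.lane[lane_a] = b.lane[lane_b];
	return a;
}
```

Because the vector is passed by value, the caller's copy of a is untouched, matching the value semantics of the intrinsic-based version in the patch.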


Re: [dpdk-dev] [PATCH v4 6/8] examples/l3fwd: add neon support for l3fwd

2017-05-14 Thread Sekhar, Ashwin
On Mon, 2017-05-15 at 11:34 +0800, Jianbo Liu wrote:
> Use ARM NEON intrinsics to accelerate l3 forwarding.
> 
> Signed-off-by: Jianbo Liu 
Acked-by: Ashwin Sekhar T K 
> ---
>  examples/l3fwd/l3fwd_em.c|   4 +-
>  examples/l3fwd/l3fwd_em_hlm.h|  17 ++-
>  examples/l3fwd/l3fwd_em_hlm_neon.h   |  74 ++
>  examples/l3fwd/l3fwd_em_sequential.h |  18 ++-
>  examples/l3fwd/l3fwd_lpm.c   |   4 +-
>  examples/l3fwd/l3fwd_lpm_neon.h  | 193
> ++
>  examples/l3fwd/l3fwd_neon.h  | 259
> +++
>  7 files changed, 563 insertions(+), 6 deletions(-)
>  create mode 100644 examples/l3fwd/l3fwd_em_hlm_neon.h
>  create mode 100644 examples/l3fwd/l3fwd_lpm_neon.h
>  create mode 100644 examples/l3fwd/l3fwd_neon.h
> 
> diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
> index ba844b2..da96cfd 100644
> --- a/examples/l3fwd/l3fwd_em.c
> +++ b/examples/l3fwd/l3fwd_em.c
> @@ -328,7 +328,7 @@ struct ipv6_l3fwd_em_route {
>   return (uint8_t)((ret < 0) ? portid :
> ipv6_l3fwd_out_if[ret]);
>  }
>  
> -#if defined(__SSE4_1__)
> +#if defined(__SSE4_1__) || defined(RTE_MACHINE_CPUFLAG_NEON)
>  #if defined(NO_HASH_MULTI_LOOKUP)
>  #include "l3fwd_em_sequential.h"
>  #else
> @@ -709,7 +709,7 @@ struct ipv6_l3fwd_em_route {
>   if (nb_rx == 0)
>   continue;
>  
> -#if defined(__SSE4_1__)
> +#if defined(__SSE4_1__) || defined(RTE_MACHINE_CPUFLAG_NEON)
>   l3fwd_em_send_packets(nb_rx, pkts_burst,
>   portid,
> qconf);
>  #else
> diff --git a/examples/l3fwd/l3fwd_em_hlm.h
> b/examples/l3fwd/l3fwd_em_hlm.h
> index 636dea4..b9163e3 100644
> --- a/examples/l3fwd/l3fwd_em_hlm.h
> +++ b/examples/l3fwd/l3fwd_em_hlm.h
> @@ -35,8 +35,13 @@
>  #ifndef __L3FWD_EM_HLM_H__
>  #define __L3FWD_EM_HLM_H__
>  
> +#if defined(__SSE4_1__)
>  #include "l3fwd_sse.h"
>  #include "l3fwd_em_hlm_sse.h"
> +#elif defined(RTE_MACHINE_CPUFLAG_NEON)
> +#include "l3fwd_neon.h"
> +#include "l3fwd_em_hlm_neon.h"
> +#endif
>  
>  static inline __attribute__((always_inline)) void
>  em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf
> *m[8],
> @@ -238,7 +243,7 @@ static inline __attribute__((always_inline))
> uint16_t
>  l3fwd_em_send_packets(int nb_rx, struct rte_mbuf **pkts_burst,
>   uint8_t portid, struct lcore_conf *qconf)
>  {
> - int32_t j;
> + int32_t i, j, pos;
>   uint16_t dst_port[MAX_PKT_BURST];
>  
>   /*
> @@ -247,6 +252,11 @@ static inline __attribute__((always_inline))
> uint16_t
>    */
>   int32_t n = RTE_ALIGN_FLOOR(nb_rx, 8);
>  
> + for (j = 0; j < 8 && j < nb_rx; j++) {
> + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
> +    struct ether_hdr *) + 1);
> + }
> +
>   for (j = 0; j < n; j += 8) {
>  
>   uint32_t pkt_type =
> @@ -263,6 +273,11 @@ static inline __attribute__((always_inline))
> uint16_t
>   uint32_t tcp_or_udp = pkt_type &
>   (RTE_PTYPE_L4_TCP | RTE_PTYPE_L4_UDP);
>  
> + for (i = 0, pos = j + 8; i < 8 && pos < nb_rx; i++, pos++) {
> + rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[pos],
> +    struct ether_hdr *) + 1);
> + }
> +
>   if (tcp_or_udp && (l3_type == RTE_PTYPE_L3_IPV4)) {
>  
>   em_get_dst_port_ipv4x8(qconf,
> &pkts_burst[j], portid,
> diff --git a/examples/l3fwd/l3fwd_em_hlm_neon.h
> b/examples/l3fwd/l3fwd_em_hlm_neon.h
> new file mode 100644
> index 000..dae1acf
> --- /dev/null
> +++ b/examples/l3fwd/l3fwd_em_hlm_neon.h
> @@ -0,0 +1,74 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 Intel Corporation. All rights reserved.
> + *   Copyright(c) 2017, Linaro Limited
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or
> without
> + *   modification, are permitted provided that the following
> conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above
> copyright
> + *   notice, this list of conditions and the following
> disclaimer.
> + * * Redistributions in binary form must reproduce the above
> copyright
> + *   notice, this list of conditions and the following
> disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products
> derived
> + *   from this software without specific prior written
> permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
> CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT
> NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES