On 5 January 2017 at 01:33, wrote:
> From: Zbigniew Bodek
>
> This patch introduces crypto poll mode driver
> using ARMv8 cryptographic extensions.
> CPU compatibility with this driver is detected in
> run-time and virtual crypto device will not be
> created if CPU doesn't provide:
> AES, SHA1,
On 12 January 2017 at 21:12, Zbigniew Bodek
wrote:
> Hello Jianbo Liu,
>
> Thanks for the review. Please check my answers in-line.
>
> Kind regards
> Zbigniew
>
>
> On 06.01.2017 03:45, Jianbo Liu wrote:
>>
>> On 5 January 2017 at 01:33, wrote:
>>>
On 13 January 2017 at 16:16, Hemant Agrawal wrote:
> On 1/4/2017 11:03 PM, zbigniew.bo...@caviumnetworks.com wrote:
>>
>> From: Zbigniew Bodek
>>
>> Add type and name for ARMv8 crypto PMD
>>
>> Signed-off-by: Zbigniew Bodek
>> ---
>> lib/librte_cryptodev/rte_cryptodev.h | 3 +++
>> 1 file chang
On 2 February 2017 at 00:19, Ananyev, Konstantin
wrote:
> Hi,
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo@linaro.org]
>> Sent: Monday, December 19, 2016 6:09 AM
>> To: dev@dpdk.org; Zhang, Helin ; Ananyev, Konstantin
>> ;
>> je
On 3 February 2017 at 19:38, Ananyev, Konstantin
wrote:
>
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo@linaro.org]
>> Sent: Friday, February 3, 2017 6:22 AM
>> To: Ananyev, Konstantin
>> Cc: dev@dpdk.org; Zhang, Helin ;
>> jer
.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 16
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index f96cc85..0b1338d 100644
--- a/drivers/net/ixgbe
, and stops when meeting the first packet with DD bit unset.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx.c | 16 +---
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 36f1c02..613890e 100644
, and stops when meeting the first packet with DD bit unset.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx.c | 16 +---
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 36f1c02..613890e 100644
.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 30 +++---
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index f96cc85..2a61322 100644
--- a/drivers/net
On 5 February 2017 at 00:37, Jianbo Liu wrote:
> To get better performance, the Rx bulk alloc receive function scans 8
> descriptors at a time, but the statuses are not consistent on ARM platforms
> because the memory allocated for Rx descriptors is cacheable hugepages.
> This patch is to c
On 9 February 2017 at 03:53, Ananyev, Konstantin
wrote:
>
>
>> -----Original Message-----
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
>> Sent: Wednesday, February 8, 2017 6:54 PM
>> To: Yigit, Ferruh ; Jianbo Liu
>> ; dev@
, and stops when meeting the first packet with DD bit unset.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx.c | 16 +---
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 36f1c02..613890e 100644
.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 29 +
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index f96cc85..e2715cb 100644
--- a/drivers/net
Hi Shreyansh,
On 7 December 2016 at 21:10, Shreyansh Jain wrote:
> On Wednesday 07 December 2016 05:47 PM, David Marchand wrote:
>>
>> Hello Shreyansh,
>>
>> On Wed, Dec 7, 2016 at 10:55 AM, Shreyansh Jain
>> wrote:
>>>
>>> On Wednesday 07 December 2016 02:22 AM, David Marchand wrote:
>
On 14 December 2016 at 09:55, Jerin Jacob
wrote:
> dmb instruction based barrier is used for smp version of memory barrier.
>
> Signed-off-by: Jerin Jacob
> ---
> lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a
On 14 December 2016 at 09:55, Jerin Jacob
wrote:
> From: Santosh Shukla
>
> Replace the raw I/O device memory read/write access with eal
> abstraction for I/O device memory read/write access to fix
> portability issues across different architectures.
>
> Signed-off-by: Santosh Shukla
> Signed-of
On 14 December 2016 at 09:55, Jerin Jacob
wrote:
> Override the generic I/O device memory read/write access and implement it
> using armv8 instructions for arm64.
>
> Signed-off-by: Jerin Jacob
> ---
> lib/librte_eal/common/include/arch/arm/rte_io.h| 4 +
> lib/librte_eal/common/include/ar
On 15 December 2016 at 18:04, Jerin Jacob
wrote:
> On Thu, Dec 15, 2016 at 05:53:05PM +0800, Jianbo Liu wrote:
>> On 14 December 2016 at 09:55, Jerin Jacob
>> wrote:
>> > Override the generic I/O device memory read/write access and implement it
>> > u
On 15 December 2016 at 19:08, Jerin Jacob
wrote:
> On Thu, Dec 15, 2016 at 06:17:32PM +0800, Jianbo Liu wrote:
>> On 15 December 2016 at 18:04, Jerin Jacob
>> wrote:
>> > On Thu, Dec 15, 2016 at 05:53:05PM +0800, Jianbo Liu wrote:
>> >> On 14 December 201
unset.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 16
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
index f96cc85..0b1338d 100644
--- a/drivers/net/ixgbe
sequentially, and stops when meeting the first packet with DD bit unset.
Signed-off-by: Jianbo Liu
---
drivers/net/ixgbe/ixgbe_rxtx.c | 12
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index b2d9f45..2866bdb
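The sequential DD-bit check described in the patch summary above can be illustrated with a minimal sketch. It is not the ixgbe driver code: the `RXD_STAT_DD` value and the `count_done_sequential` helper are hypothetical names chosen for illustration, and the descriptor status words are modelled as a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the descriptor-done (DD) status bit. */
#define RXD_STAT_DD 0x1u

/*
 * Count how many descriptors at the start of `status` are complete.
 * The scan is sequential and stops at the first descriptor whose DD
 * bit is unset, so a later descriptor that already shows DD set is
 * never counted ahead of an earlier incomplete one.
 */
static inline size_t
count_done_sequential(const uint32_t *status, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!(status[i] & RXD_STAT_DD))
			break;
	}
	return i;
}
```

With statuses {1, 1, 1, 0, 1, 1, 0, 0} the helper reports 3 completed descriptors, even though the fifth and sixth entries also have DD set.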
Hi Jerin,
On 21 December 2016 at 18:08, Jerin Jacob
wrote:
> On Mon, Dec 19, 2016 at 11:39:18AM +0530, Jianbo Liu wrote:
>
> Hi Jianbo,
>
>> vPMD will check 4 descriptors in one time, but the statuses are not
>> consistent
>> because the memory allocated f
On 21 December 2016 at 19:03, Bruce Richardson
wrote:
> On Wed, Dec 21, 2016 at 03:38:51PM +0530, Jerin Jacob wrote:
>> On Mon, Dec 19, 2016 at 11:39:18AM +0530, Jianbo Liu wrote:
>>
>> Hi Jianbo,
>>
>> > vPMD will check 4 descriptors in one time, but t
Hi Santosh,
On 22 December 2016 at 20:36, Santosh Shukla
wrote:
> Hi Jiangbo,
>
> On Thu, Dec 15, 2016 at 08:40:19PM -0800, Santosh Shukla wrote:
>> On Thu, Dec 15, 2016 at 04:37:12PM +0800, Jianbo Liu wrote:
>> > On 14 December 2016 at 09:55, Jerin Jacob
>> > w
On 27 December 2016 at 17:49, Jerin Jacob
wrote:
> dsb instruction based barrier is used for non smp
> version of memory barrier.
>
> Fixes: d708f01b7102 ("eal/arm: add atomic operations for ARMv8")
>
> CC: Jianbo Liu
> CC: sta...@dpdk.org
> Signed-off-by: Je
On 27 December 2016 at 17:49, Jerin Jacob
wrote:
> CC: Jianbo Liu
> Signed-off-by: Jerin Jacob
> ---
> lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 6 ++
> 1 file changed, 6 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/arch/arm/rte
On 27 December 2016 at 17:49, Jerin Jacob
wrote:
> Change rte_?wb definitions to macros in order to
use rte_*mb?
> keep consistent with other barrier definitions in
> the file.
>
> Suggested-by: Jianbo Liu
> Signed-off-by: Jerin Jacob
> ---
> .../common/include/
On 4 January 2017 at 18:01, Jerin Jacob wrote:
> On Tue, Jan 03, 2017 at 03:48:32PM +0800, Jianbo Liu wrote:
>> On 27 December 2016 at 17:49, Jerin Jacob
>> wrote:
>> > CC: Jianbo Liu
>> > Signed-off-by: Jerin Jacob
>> > ---
>> > lib/lib
On 5 January 2017 at 14:24, Jerin Jacob wrote:
> On Thu, Jan 05, 2017 at 01:31:44PM +0800, Jianbo Liu wrote:
>> On 4 January 2017 at 18:01, Jerin Jacob
>> wrote:
>> > On Tue, Jan 03, 2017 at 03:48:32PM +0800, Jianbo Liu wrote:
>> >> On 27 December 201
in git log
- Ashwin's suggestions for performance on ThunderX
v2:
- change name of l3fwd_em_sse.h to l3fwd_em_sequential.h
- add the times of hash multi-lookup for different Archs
- performance tuning on ThunderX: prefetching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (8):
examples/l3fw
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 276
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
Implement vcopyq_laneq_u32 if gcc version is lower than 7.
Signed-off-by: Jianbo Liu
---
lib/librte_eal/common/include/arch/arm/rte_vect.h | 9 +
1 file changed, 9 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h
b/lib/librte_eal/common/include/arch/arm
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 261
Use ARM NEON intrinsics to accelerate L3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 17 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 18 ++-
examples/l3fwd
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
Add a new macro that defines how many hash lookups are done at one time;
this makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd
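As a rough sketch of the idea in the commit message above, a single macro can set the lookup batch size and the burst can be split into full batches plus a remainder. `LOOKUP_BATCH`, `struct batch_plan`, and `plan_batches` are hypothetical names for illustration only; the actual macro in the patch is not visible in this snippet.

```c
#include <stddef.h>

/* Hypothetical batch size: how many hash lookups are issued at once. */
#define LOOKUP_BATCH 8

/* How a received burst is divided between the two lookup paths. */
struct batch_plan {
	size_t full_batches; /* groups handled by the multi-lookup path */
	size_t leftover;     /* packets handled one by one afterwards */
};

/* Split a burst of nb_rx packets according to LOOKUP_BATCH. */
static inline struct batch_plan
plan_batches(size_t nb_rx)
{
	struct batch_plan p;

	p.full_batches = nb_rx / LOOKUP_BATCH;
	p.leftover = nb_rx % LOOKUP_BATCH;
	return p;
}
```

Because the loops are written against the macro, changing the batch size for another architecture would only require changing one definition.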
On 4 July 2017 at 21:55, De Lara Guarch, Pablo
wrote:
>
>
>> -----Original Message-----
>> From: Thomas Monjalon [mailto:tho...@monjalon.net]
>> Sent: Tuesday, July 4, 2017 12:26 AM
>> To: Dumitrescu, Cristian ; De Lara Guarch,
>> Pablo
>>
uint64x1_t o = vget_low_u64(n) + vget_high_u64(n);
> +
> + return vget_lane_u32((uint32x2_t)o, 0);
> +}
> +
> #endif
>
> #if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 7)
> --
> 2.13.2
>
Acked-by: Jianbo Liu
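The fragment above ends an across-lanes add by summing the low and high halves of a vector. A portable scalar model of that kind of reduction is sketched below. The pairwise widening steps are an assumption, since the start of the function is not shown in the snippet, and `hadd_u16x8_model` is a hypothetical name. The result is truncated to 16 bits, which matches the `uint16_t` return type of `vaddvq_u16()`.

```c
#include <stdint.h>

/*
 * Scalar model of summing eight 16-bit lanes:
 *   step 1: add adjacent u16 lanes into four u32 sums
 *   step 2: add adjacent u32 sums into two u64 sums
 *   step 3: add the low and high halves
 * The final value is truncated to 16 bits.
 */
static inline uint16_t
hadd_u16x8_model(const uint16_t v[8])
{
	uint32_t s32[4];
	uint64_t s64[2];
	int i;

	for (i = 0; i < 4; i++)
		s32[i] = (uint32_t)v[2 * i] + v[2 * i + 1];
	for (i = 0; i < 2; i++)
		s64[i] = (uint64_t)s32[2 * i] + s32[2 * i + 1];

	return (uint16_t)(s64[0] + s64[1]);
}
```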
On 9 July 2017 at 01:08, Thomas Monjalon wrote:
> 07/07/2017 18:26, Jerin Jacob:
>> vaddvq_u16() is not available for armv7.
>> Emulate the vaddvq_u16() using armv7 NEON intrinsics.
>
> After implementing this function, another missing function appears:
>
> lib/librte_sched/rte_sched.c:174
-
> - for (i = 0; i <
> TEST_ADAPTIVE_TRANSMIT_LOAD_BALANCING_RX_BURST_SLAVE_COUNT; i++) {
> - for (j = 0; j < MAX_PKT_BURST; j++) {
> - if (pkt_burst[i][j] != NULL) {
> - rte_pktmbuf_free(pkt_burst[i][j]);
> - pkt_burst[i][j] = NULL;
> - }
> - }
> - }
> -
> -
> /* Clean up and remove slaves from bonded device */
> return remove_slaves_and_stop_bonded_device();
> }
> --
> 1.8.3.1
>
Acked-by: Jianbo Liu
+#endif
> +
> /* NEON intrinsic vreinterpretq_u64_p128() is supported since GCC version 7
> */
> static inline uint64x2_t
> vreinterpretq_u64_p128(poly128_t x)
> --
> 1.8.3.1
>
Acked-by: Jianbo Liu
@@ -2789,7 +2789,7 @@ struct rte_fdir_conf fdir_conf = {
> static int
> test_balance_l23_tx_burst_ipv4_toggle_ip_addr(void)
> {
> - return balance_l23_tx_burst(0, 1, 1, 0);
> + return balance_l23_tx_burst(0, 1, 0, 1);
> }
>
> static int
> --
> 1.8.3.1
>
Acked-by: Jianbo Liu
Hi Hemant,
The 03/17/2017 18:17, Hemant Agrawal wrote:
> DPAA2 Hardware Mempool handlers allow enqueue/dequeue from NXP's
> QBMAN hardware block.
> CONFIG_RTE_MBUF_DEFAULT_MEMPOOL_OPS is set to 'dpaa2', if the pool
> is enabled.
>
> This memory pool currently supports packet mbuf type blocks only.
i40e_rxd_pkt_type_mapping(ptype);
> + rx_pkts[i]->packet_type = i40e_rxd_pkt_type_mapping(ptype);
> }
>
> }
Acked-by: Jianbo Liu
---
> 4 files changed, 149 insertions(+), 85 deletions(-)
>
Reviewed-by: Jianbo Liu
On 9 September 2016 at 16:43, Shreyansh Jain wrote:
> Introduction:
> =============
>
> This patch set is direct derivative of Jan's original series [1],[2].
>
> - As this deviates substantially from original series, if need be I can
>post it as a separate patch rather than v2. Please suggest
On 18 September 2016 at 15:22, Jan Viktorin wrote:
> On Sun, 18 Sep 2016 13:58:50 +0800
> Jianbo Liu wrote:
>
>> On 9 September 2016 at 16:43, Shreyansh Jain
>> wrote:
>> > Introduction:
>> > =============
>> >
>> > This patch
Hi Maxime,
On 22 August 2016 at 16:11, Maxime Coquelin
wrote:
> Hi Zhihong,
>
> On 08/19/2016 07:43 AM, Zhihong Wang wrote:
>>
>> This patch set optimizes the vhost enqueue function.
>>
...
>
> My setup consists of one host running a guest.
> The guest generates as much 64bytes packets as possi
On 21 September 2016 at 17:27, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
>> Sent: Wednesday, September 21, 2016 4:50 PM
>> To: Maxime Coquelin
>> Cc: Wang, Zhihong ; dev at dpdk.org;
>>
On 22 September 2016 at 10:29, Yuanhan Liu
wrote:
> On Wed, Sep 21, 2016 at 08:54:11PM +0800, Jianbo Liu wrote:
>> >> > My setup consists of one host running a guest.
>> >> > The guest generates as much 64bytes packets as possible using
>> >>
>
On 22 September 2016 at 14:58, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
>> Sent: Thursday, September 22, 2016 1:48 PM
>> To: Yuanhan Liu
>> Cc: Wang, Zhihong ; Maxime Coquelin
>> ;
On 20 September 2016 at 10:00, Zhihong Wang wrote:
> This patch implements the vhost logic from scratch into a single function
> to improve maintainability. This is the baseline version of the new code,
> more optimization will be added in the following patches in this patch set.
>
> In the existi
On 22 September 2016 at 18:04, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
>> Sent: Thursday, September 22, 2016 5:02 PM
>> To: Wang, Zhihong
>> Cc: Yuanhan Liu ; Maxime Coquelin
>> ;
On 23 September 2016 at 10:56, Wang, Zhihong wrote:
.
> This is expected because the 2nd patch is just a baseline and all optimization
> patches are organized in the rest of this patch set.
>
> I think you can do bottleneck analysis on ARM to see what's slowing down the
> perf, there might be
Hi Thomas,
On 23 September 2016 at 21:41, Thomas Monjalon
wrote:
> 2016-09-23 18:41, Jianbo Liu:
>> On 23 September 2016 at 10:56, Wang, Zhihong
>> wrote:
>> .
>> > This is expected because the 2nd patch is just a baseline and all
>> > optimization
On 25 September 2016 at 13:41, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
>> Sent: Friday, September 23, 2016 9:41 PM
>> To: Jianbo Liu
>> Cc: dev at dpdk.org; Wang, Zhihon
On 26 September 2016 at 13:25, Wang, Zhihong wrote:
>
>
>> -----Original Message-----
>> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
>> Sent: Monday, September 26, 2016 1:13 PM
>> To: Wang, Zhihong
>> Cc: Thomas Monjalon ; dev at dpdk.org; Yuanhan
>
On 26 September 2016 at 13:37, Luke Gorrie wrote:
> On 22 September 2016 at 11:01, Jianbo Liu wrote:
>>
>> Tested with testpmd, host: txonly, guest: rxonly
>> size (bytes) improvement (%)
>> 64 4.12
>> 128 6
>
On 5 June 2017 at 16:58, Jerin Jacob wrote:
> CC: Jianbo Liu
> Signed-off-by: Jerin Jacob
> ---
> v2:
> - Removed YIELD instruction comment, as it is implementation
> specific (Jianbo)
> ---
> lib/librte_eal/common/include/arch/arm/rte_pause.h | 4 ++
>
NEON and Altivec code
> - it does not compile on ARM
> - there is no Ack from NEON or Altivec maintainers (they were not
> Cc'ed)
> I really doubt it has been tested.
> That's why it won't be in RC1.
>
> If NEON and Altivec maintainers agree, we can give it a chance for RC2.
>
Other than the above error on ARM:
Acked-by: Jianbo Liu
> PS: please use --in-reply-to to let us check the discussion history.
On 3 April 2017 at 22:39, Bruce Richardson wrote:
> this set is based upon Olivier's mbuf rework patchset, and makes some
> improvement to the i40e driver taking account of the rework. It also
> removes a build-time option that seems unnecessary.
>
> Bruce Richardson (2):
> net/i40e: eliminate m
xtx_vec_neon.c | 11
> drivers/net/i40e/i40e_rxtx_vec_sse.c| 50
> -
> 5 files changed, 24 insertions(+), 51 deletions(-)
Acked-by: Jianbo Liu
And I'll send a patch to do the same change for i40e neon implementation.
Port two changes from the x86 SSE implementation.
net/i40e: fix checksum flag in x86 vector Rx
net/i40e: eliminate mbuf write on rearm
Signed-off-by: Jianbo Liu
---
drivers/net/i40e/i40e_rxtx_vec_neon.c | 68 +--
1 file changed, 42 insertions(+), 26
app-gcc
> b/config/defconfig_arm64-xgene1-linuxapp-gcc
> index f096166b7..d8e544728 100644
> --- a/config/defconfig_arm64-xgene1-linuxapp-gcc
> +++ b/config/defconfig_arm64-xgene1-linuxapp-gcc
> @@ -32,3 +32,4 @@
> #include "defconfig_arm64-armv8a-linuxapp-gcc"
>
> CONFIG_RTE_MACHINE="xgene1"
> +CONFIG_RTE_CACHE_LINE_SIZE=64
> --
Acked-by: Jianbo Liu
= vld1q_u8((uint8_t const *)orig->src_addr);
> + vst1q_u8((uint8_t *)targ->v6.src_addr, vrev32q_u8(ipv6));
> + ipv6 = vld1q_u8((uint8_t const *)orig->dst_addr);
> + vst1q_u8((uint8_t *)targ->v6.dst_addr, vrev32q_u8(ipv6));
> #else
> int i;
> for (i = 0; i < 4; i++) {
> --
> 2.7.4
>
Acked-by: Jianbo Liu
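The NEON lines in the fragment above load a 16-byte IPv6 address, apply `vrev32q_u8()`, and store the result. A scalar model of that byte reversal within each 32-bit word is sketched below; `rev32_bytes` is a hypothetical helper name and the addresses are modelled as plain byte arrays.

```c
#include <stdint.h>

/*
 * Scalar model of vrev32q_u8(): within each of the four 32-bit words
 * of a 16-byte value, reverse the order of the four bytes. The words
 * themselves stay in place.
 */
static inline void
rev32_bytes(const uint8_t src[16], uint8_t dst[16])
{
	int w, b;

	for (w = 0; w < 4; w++) {
		for (b = 0; b < 4; b++)
			dst[4 * w + b] = src[4 * w + (3 - b)];
	}
}
```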
On 27 April 2017 at 21:00, Ashwin Sekhar T K
wrote:
> * Enabled CONFIG_RTE_SCHED_VECTOR for arm64
> * Verified the changes with sched_autotest unit test case
>
> Signed-off-by: Ashwin Sekhar T K
> ---
> config/defconfig_arm64-armv8a-linuxapp-gcc | 2 +-
> lib/librte_sched/rte_sched.c
On 28 April 2017 at 13:27, Sekhar, Ashwin wrote:
> On Friday 28 April 2017 09:20 AM, Jianbo Liu wrote:
>> On 27 April 2017 at 21:00, Ashwin Sekhar T K
>> wrote:
>>> * Enabled CONFIG_RTE_SCHED_VECTOR for arm64
>>> * Verified the changes with sched_autotest uni
> + return 1;
> +
> + pipes = vld1q_u32(pos + 4);
> + if (!vminvq_u32(veorq_u32(pipes, index)))
> + return 1;
> +
> + return 0;
> +}
> +
> #else
>
> static inline int
> --
> 2.7.4
>
Acked-by: Jianbo Liu
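The check `!vminvq_u32(veorq_u32(pipes, index))` in the fragment above works because XOR yields zero exactly in the lanes that are equal, so the minimum across lanes is zero if and only if at least one lane matches. A scalar model is sketched below; `any_lane_equals` is a hypothetical name, and `index` is assumed to be the same value replicated in every lane.

```c
#include <stdint.h>

/*
 * Scalar model of !vminvq_u32(veorq_u32(lanes, index)):
 * XOR each lane with the index, take the minimum of the results,
 * and report a match when that minimum is zero.
 */
static inline int
any_lane_equals(const uint32_t lanes[4], uint32_t index)
{
	uint32_t min = UINT32_MAX;
	int i;

	for (i = 0; i < 4; i++) {
		uint32_t x = lanes[i] ^ index;

		if (x < min)
			min = x;
	}
	return min == 0;
}
```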
On 27 April 2017 at 20:44, Ashwin Sekhar T K
wrote:
> * Added file lib/librte_efd/rte_efd_arm64.h to hold arm64
> specific definitions
> * Verified the changes with efd_autotest unit test case
>
> Signed-off-by: Ashwin Sekhar T K
> ---
> v2:
> * Slightly modified the content of the commit messa
On 28 April 2017 at 18:38, Sekhar, Ashwin wrote:
> On Friday 28 April 2017 03:36 PM, Jianbo Liu wrote:
>> On 27 April 2017 at 20:44, Ashwin Sekhar T K
>> wrote:
>>> * Added file lib/librte_efd/rte_efd_arm64.h to hold arm64
>>> specific definitions
>>>
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 280
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for single hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Signed-off-by: Jianbo Liu
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
Use ARM NEON intrinsics to accelerate L3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd.h | 4 -
examples/l3fwd/l3fwd_em.c | 4 +-
examples/l3fwd/l3fwd_em_hlm.h | 5 +
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 +++
examples/l3fwd
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
On 2 May 2017 at 14:41, Jerin Jacob wrote:
> -----Original Message-----
>> Date: Mon, 1 May 2017 22:59:53 -0700
>> From: Ashwin Sekhar T K
>> To: byron.mar...@intel.com, pablo.de.lara.gua...@intel.com,
>> jerin.ja...@caviumnetworks.com, jianbo@linaro.org
>> Cc: dev@dpdk.org, Ashwin Sekhar T
Hi Ashwin,
On 2 May 2017 at 19:47, Sekhar, Ashwin wrote:
> Hi Jianbo,
>
> I tested your neon changes on thunderx. I am seeing a performance
> regression of ~10% for LPM case and ~20% for EM case with your changes.
> Did you see improvement on any arm64 platform with these changes. If
> yes, how m
Hi Ashwin,
On 3 May 2017 at 13:24, Jianbo Liu wrote:
> Hi Ashwin,
>
> On 2 May 2017 at 19:47, Sekhar, Ashwin wrote:
>> Hi Jianbo,
>>
>> I tested your neon changes on thunderx. I am seeing a performance
>> regression of ~10% for LPM case and ~20% for EM case
On 5 May 2017 at 12:24, Sekhar, Ashwin wrote:
> On Thu, 2017-05-04 at 16:42 +0800, Jianbo Liu wrote:
>> Hi Ashwin,
>>
>> On 3 May 2017 at 13:24, Jianbo Liu wrote:
>> >
>> > Hi Ashwin,
>> >
>> > On 2 May 2017 at 19:47, Sekhar, Ashwin
>
v2:
- change name of l3fwd_em_sse.h to l3fwd_em_sequential.h
- add the times of hash multi-lookup for different Archs
- performance tuning on ThunderX: prefetching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (7):
examples/l3fwd: extract arch independent code from multi hash lookup
examples
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 280
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
Use ARM NEON intrinsics to accelerate L3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 19 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 20 ++-
examples/l3fwd
Add a new macro that defines how many hash lookups are done at one time;
this makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd
Signed-off-by: Jianbo Liu
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
Hi Ashwin,
On 9 May 2017 at 16:10, Sekhar, Ashwin wrote:
> On Fri, 2017-05-05 at 13:43 +0800, Jianbo Liu wrote:
>> On 5 May 2017 at 12:24, Sekhar, Ashwin
>> wrote:
>> >
>> > On Thu, 2017-05-04 at 16:42 +0800, Jianbo Liu wrote:
>> > >
>>
which helped improve performance on my
> Thunderx setup. For details see comments inline.
>
>
> On Wed, 2017-05-10 at 10:30 +0800, Jianbo Liu wrote:
>> Use ARM NEON intrinsics to accelerate L3 forwarding.
>>
>> Signed-off-by: Jianbo Liu
>> ---
>> e
On 11 May 2017 at 12:27, Sekhar, Ashwin wrote:
>
> On Thu, 2017-05-11 at 04:14 +, Sekhar, Ashwin wrote:
> ...
>> > > Combining all the above comments, I made some changes on top of
>> > > your
>> > > patch. These changes are giving 3-4% improvement over your
>> > > version.
>> > >
>> > > You m
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 280
Signed-off-by: Jianbo Liu
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
ching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (7):
examples/l3fwd: extract arch independent code from multi hash lookup
examples/l3fwd: rename l3fwd_em_sse.h to l3fwd_em_sequential.h
examples/l3fwd: extract common code from multi packet send
examples/l3fwd: rearrange the code for lpm_
Use ARM NEON intrinsics to accelerate L3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 17 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 18 ++-
examples/l3fwd
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
Add a new macro that defines how many hash lookups are done at one time;
this makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd