100% CPU load when generating traffic to destination network that nexthop is not reachable

2017-08-15 Thread Paweł Staszewski
Hi, Doing some tests I discovered that when traffic is sent by pktgen to a forwarding host where the nexthop for the destination network on the forwarding router is not reachable, I have 100% CPU on all cores and perf top shows mostly: 77.19% [kernel] [k] queued_spin_lock_slowpath 10.20%

Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable

2017-08-15 Thread Paweł Staszewski
8-15 at 18:30 +0200, Paweł Staszewski wrote: Hi, Doing some tests I discovered that when traffic is sent by pktgen to a forwarding host where the nexthop for the destination network on the forwarding router is not reachable, I have 100% CPU on all cores and perf top shows mostly: 77.19% [kernel]

Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable

2017-08-15 Thread Paweł Staszewski
, Eric Dumazet wrote: On Tue, 2017-08-15 at 19:42 +0200, Paweł Staszewski wrote: # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 2M of event 'cycles' # Event count (approx.): 1585571545969 # # Children Sel

Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable

2017-08-15 Thread Paweł Staszewski
: 0381a33a9c4a [ 3855.326371] cpuidle_enter+0x12/0x14 [ 3855.326372] do_idle+0x113/0x16b [ 3855.326373] cpu_startup_entry+0x1a/0x1c [ 3855.326376] start_secondary+0xd0/0xd3 [ 3855.326379] secondary_startup_64+0xa5/0xa5 On 2017-08-15 at 22:53, Paweł Staszewski wrote: Hi Patch applied but

Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable

2017-08-16 Thread Paweł Staszewski
Hi Patch applied - but no big change - from 0.7Mpps per vlan to 1.2Mpps per vlan previously (without patch) 100% cpu load: bwm-ng v0.6.1 (probing every 0.500s), press 'h' for help input: /proc/net/dev type: rate | iface Rx Tx Total ==

Re: 100% CPU load when generating traffic to destination network that nexthop is not reachable

2017-08-17 Thread Paweł Staszewski
e udp stream (iptv or other filesystem syncing protocol) to the other via a forwarding Linux router - if the receiving server goes down and disappears from ARP, all other streams that are forwarded by the Linux router will suffer from this. On 2017-08-16 at 12:07, Paweł Staszewski wrote: Hi

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-15 Thread Paweł Staszewski
the fix was just pushed by Jeff Kirsher a few days ago. The issue should be fixed in the following commit: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/drivers/net/ethernet/intel/i40e/i40e_txrx.c?id=2b9478ffc550f17c6cd8c69057234e91150f5972 Thanks. - Alex On Sat, Oct 1

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-16 Thread Paweł Staszewski
On 2017-10-16 at 13:20, Pavlos Parissis wrote: On 15/10/2017 02:58 AM, Alexander Duyck wrote: Hi Pawel, To clarify is that Dave Miller's tree or Linus's that you are talking about? If it is Dave's tree how long ago was it you pulled it since I think the fix was just pushed by Jeff Kirsher

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-16 Thread Paweł Staszewski
On 2017-10-16 at 18:26, Paweł Staszewski wrote: On 2017-10-16 at 13:20, Pavlos Parissis wrote: On 15/10/2017 02:58 AM, Alexander Duyck wrote: Hi Pawel, To clarify is that Dave Miller's tree or Linus's that you are talking about? If it is Dave's tree how long ago was i

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-16 Thread Paweł Staszewski
On 2017-10-17 at 01:56, Alexander Duyck wrote: On Mon, Oct 16, 2017 at 4:34 PM, Paweł Staszewski wrote: On 2017-10-16 at 18:26, Paweł Staszewski wrote: On 2017-10-16 at 13:20, Pavlos Parissis wrote: On 15/10/2017 02:58 AM, Alexander Duyck wrote: Hi Pawel, To clarify is that

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 02:44, Paweł Staszewski wrote: On 2017-10-17 at 01:56, Alexander Duyck wrote: On Mon, Oct 16, 2017 at 4:34 PM, Paweł Staszewski wrote: On 2017-10-16 at 18:26, Paweł Staszewski wrote: On 2017-10-16 at 13:20, Pavlos Parissis wrote: On 15/10/2017 02:58 AM

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 11:48, Paweł Staszewski wrote: On 2017-10-17 at 02:44, Paweł Staszewski wrote: On 2017-10-17 at 01:56, Alexander Duyck wrote: On Mon, Oct 16, 2017 at 4:34 PM, Paweł Staszewski wrote: On 2017-10-16 at 18:26, Paweł Staszewski wrote: On 2017-10-16 at 13

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 12:20, Paweł Staszewski wrote: On 2017-10-17 at 11:48, Paweł Staszewski wrote: On 2017-10-17 at 02:44, Paweł Staszewski wrote: On 2017-10-17 at 01:56, Alexander Duyck wrote: On Mon, Oct 16, 2017 at 4:34 PM, Paweł Staszewski wrote: On 2017-10-16 at 18

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 12:51, Paweł Staszewski wrote: On 2017-10-17 at 12:20, Paweł Staszewski wrote: On 2017-10-17 at 11:48, Paweł Staszewski wrote: On 2017-10-17 at 02:44, Paweł Staszewski wrote: On 2017-10-17 at 01:56, Alexander Duyck wrote: On Mon, Oct 16, 2017 at 4:34

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 12:59, Paweł Staszewski wrote: On 2017-10-17 at 12:51, Paweł Staszewski wrote: On 2017-10-17 at 12:20, Paweł Staszewski wrote: On 2017-10-17 at 11:48, Paweł Staszewski wrote: On 2017-10-17 at 02:44, Paweł Staszewski wrote: On 2017-10-17 at 01:56

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 13:05, Paweł Staszewski wrote: On 2017-10-17 at 12:59, Paweł Staszewski wrote: On 2017-10-17 at 12:51, Paweł Staszewski wrote: On 2017-10-17 at 12:20, Paweł Staszewski wrote: On 2017-10-17 at 11:48, Paweł Staszewski wrote: On 2017-10-17 at 02:44

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-17 Thread Paweł Staszewski
On 2017-10-17 at 13:52, Paweł Staszewski wrote: On 2017-10-17 at 13:05, Paweł Staszewski wrote: On 2017-10-17 at 12:59, Paweł Staszewski wrote: On 2017-10-17 at 12:51, Paweł Staszewski wrote: On 2017-10-17 at 12:20, Paweł Staszewski wrote: On 2017-10-17 at 11:48

Re: Linux 4.12+ memory leak on router with i40e NICs

2017-10-18 Thread Paweł Staszewski
On 2017-10-17 at 16:08, Paweł Staszewski wrote: On 2017-10-17 at 13:52, Paweł Staszewski wrote: On 2017-10-17 at 13:05, Paweł Staszewski wrote: On 2017-10-17 at 12:59, Paweł Staszewski wrote: On 2017-10-17 at 12:51, Paweł Staszewski wrote: On 2017-10-17 at 12:20

Latest net-next from GIT panic

2017-09-19 Thread Paweł Staszewski
Just tried the latest net-next git and found a kernel panic. Link to bugzilla below. https://bugzilla.kernel.org/attachment.cgi?id=258499

Re: Latest net-next from GIT panic

2017-09-19 Thread Paweł Staszewski
Added a few more screenshots from kernels 4.14-rc1 (net-next) and 4.14-rc1 (linux-next) https://bugzilla.kernel.org/show_bug.cgi?id=197005 On 2017-09-20 at 00:35, Paweł Staszewski wrote: Just tried the latest net-next git and found a kernel panic. Link to bugzilla below. https

Re: Latest net-next from GIT panic

2017-09-19 Thread Paweł Staszewski
. Also, when I run this server without turning on BGP and push traffic through it with pktgen, there is no panic. Just after it learns routes, it panics. On 2017-09-20 at 01:45, Paweł Staszewski wrote: Added a few more screenshots from kernels 4.14-rc1 (net-next) and 4.14-rc1 (linux-next

Re: Latest net-next from GIT panic

2017-09-19 Thread Paweł Staszewski
Just checked kernel 4.13.2 and it has the same problem. Just after all 6 BGP sessions start and the kernel starts to learn routes, it panics. https://bugzilla.kernel.org/attachment.cgi?id=258509 On 2017-09-20 at 02:01, Paweł Staszewski wrote: Some information about the environment: Server is acting as a ip

Re: Latest net-next from GIT panic

2017-09-19 Thread Paweł Staszewski
Latest working kernel with the same configuration and kernel config: 4.12.13. There is no panic after routes from all 6 BGP sessions are learned. ip r | wc -l 653112 On 2017-09-20 at 02:06, Paweł Staszewski wrote: Just checked kernel 4.13.2 and it has the same problem. Just after start all 6 bgp

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
Hi, Will try bisecting tonight. On 2017-09-20 at 05:24, Eric Dumazet wrote: On Wed, 2017-09-20 at 02:06 +0200, Paweł Staszewski wrote: Just checked kernel 4.13.2 and it has the same problem. Just after all 6 BGP sessions start and the kernel starts to learn routes, it panics. https

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
Trying to make a video from IPMI :) with these results: https://bugzilla.kernel.org/attachment.cgi?id=258521 Caught two more lines where it starts - panic from 4.13.2. Now will try to do some bisection. On 2017-09-20 at 09:58, Paweł Staszewski wrote: Hi, Will try bisecting tonight. W

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
going to: git bisect good Bisecting: 1787 revisions left to test after this (roughly 11 steps) On 2017-09-20 at 10:44, Paweł Staszewski wrote: Trying to make a video from IPMI :) with these results: https://bugzilla.kernel.org/attachment.cgi?id=258521 Caught two more lines where it
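
Editor's sketch: the step counts quoted in this thread (1787 revisions, roughly 11 steps; 6 revisions, roughly 3 steps; 1 revision, roughly 1 step) are consistent with counting how many times the remaining range can be halved. The function name below is hypothetical, and this is only an illustration of that halving arithmetic, not git's actual estimator.

```python
def halvings_until_empty(revisions_left: int) -> int:
    """Count how many floor-halvings reduce revisions_left to zero.

    Illustrative assumption: each bisect test discards about half of the
    remaining candidate revisions. This is NOT git's real estimator.
    """
    if revisions_left < 0:
        raise ValueError("revisions_left must be non-negative")
    steps = 0
    while revisions_left > 0:
        revisions_left //= 2  # one good/bad test halves the range
        steps += 1
    return steps


if __name__ == "__main__":
    # Figures taken from the bisect output quoted in this thread.
    for left in (1787, 6, 1):
        print(left, "revisions left ->", halvings_until_empty(left), "steps")
```

This explains why a range of almost 1800 commits between a good and a bad kernel could be narrowed down in about a dozen build-and-boot cycles.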

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
, Paweł Staszewski wrote: Ok looks like ending bisection Latest bisected kernel when there is no kernel panic 4.12.0+ (from next) - but only this warning: [ 309.030019] NETDEV WATCHDOG: enp4s0f0 (ixgbe): transmit queue 0 timed out [ 309.030034] [ cut here

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
/jmorris/linux-security git bisect good e24dd9ee5399747b71c1d982a484fc7601795f31 On 2017-09-20 at 12:21, Paweł Staszewski wrote: Ok, the kernel crashed with a different panic that I didn't catch when I was doing the bisect, and now my bisection is broken :) git bisect good Bisecting: 1787 revisions left to test after t

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
' git bisect good 073cf9e20c333ab29744717a23f9e43ec7512a20 On 2017-09-20 at 12:22, Paweł Staszewski wrote: So far bisected and marked: git bisect start # bad: [07dd6cc1fff160143e82cf5df78c1db0b6e03355] Linux 4.13.2 git bisect bad 07dd6cc1fff160143e82cf5df78c1db0b6e03

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
Almost there Bisecting: 6 revisions left to test after this (roughly 3 steps) [ad65a2f05695aced349e308193c6e2a6b1d87112] ipv6: call dst_hold_safe() properly On 2017-09-20 at 13:02, Paweł Staszewski wrote: Ok resumed and so far: Panic: # bad: [9cc9a5cb176ccb4f2cda5ac34da5a659926f125f

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
ter this (roughly 0 steps) [b838d5e1c5b6e57b10ec8af2268824041e3ea911] ipv4: mark DST_NOGC and remove the operation of dst_free() On 2017-09-20 at 14:23, Paweł Staszewski wrote: Almost there Bisecting: 6 revisions left to test after this (roughly 3 st

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
try to get again bisect without panic. On 2017-09-20 at 14:49, Paweł Staszewski wrote: And the last one git bisect good Bisecting: 1 revision left to test after this (roughly 1 step) [1cfb71eeb12047bcdbd3e6730ffed66e810a0855] ipv6: take dst->__refcnt for insertion into fib6 tree W

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
8af2268824041e3ea911] ipv4: mark DST_NOGC and remove the operation of dst_free() On 2017-09-20 at 15:05, Paweł Staszewski wrote: hmm But after b838d5e1c5b6e57b10ec8af2268824041e3ea911 is the first bad commit commit b838d5e1c5b6e57b10ec8af2268824041e3ea911 Author: Wei Wang Date: Sat Ju

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
dev is not NULL, but netdev->pcpu_refcnt is NULL 65 ff 08 decl %gs:(%rax) // CRASH since rax = NULL Pawel, please share your netdevices and routing setup ? Thanks ! On Wed, 2017-09-20 at 14:49 +0200, Paweł Staszewski wrote: And the last one git bisect good Bisecting:

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 15:34, Eric Dumazet wrote: Could you try this debug patch ? diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f535779d9dc1dfe36934c2abba4e43d053ac5d6f..1eaa3553a724dc8c048f67b556337072d5addc82 100644 --- a/include/linux/netdevice.h +++ b/include/lin

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
Not much more after adding this patch https://bugzilla.kernel.org/attachment.cgi?id=258529 On 2017-09-20 at 15:44, Eric Dumazet wrote: On Wed, 2017-09-20 at 15:39 +0200, Paweł Staszewski wrote: On 2017-09-20 at 15:34, Eric Dumazet wrote: Could you try this debug patch ? diff --git a

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 16:40, Eric Dumazet wrote: On Wed, 2017-09-20 at 16:03 +0200, Paweł Staszewski wrote: Not much more after adding this patch https://bugzilla.kernel.org/attachment.cgi?id=258529 This is why I suggested to replace the BUG() in another mail So : diff --git a/include/linux

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
v6 bgp sessions - so not many ipv6 prefixes and ipv6 fib is almost empty ip -6 r ls | wc -l 57 Thanks. Wei 6:03 +0200, Paweł Staszewski wrote: Not much more after adding this patch https://bugzilla.kernel.org/attachment.cgi?id=258529 This is why I suggested to replace the BUG() in another

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 20:36, Cong Wang wrote: On Wed, Sep 20, 2017 at 11:30 AM, Eric Dumazet wrote: On Wed, 2017-09-20 at 11:22 -0700, Cong Wang wrote: but dmesg at this time shows nothing about interfaces or flaps. This is very odd. We only free netdevice in free_netdev() and it is only cal

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 21:13, Paweł Staszewski wrote: On 2017-09-20 at 20:36, Cong Wang wrote: On Wed, Sep 20, 2017 at 11:30 AM, Eric Dumazet wrote: On Wed, 2017-09-20 at 11:22 -0700, Cong Wang wrote: but dmesg at this time shows nothing about interfaces or flaps. This is very odd. We

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 21:23, Paweł Staszewski wrote: On 2017-09-20 at 21:13, Paweł Staszewski wrote: On 2017-09-20 at 20:36, Cong Wang wrote: On Wed, Sep 20, 2017 at 11:30 AM, Eric Dumazet wrote: On Wed, 2017-09-20 at 11:22 -0700, Cong Wang wrote: but dmesg at this time shows

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 23:10, Paweł Staszewski wrote: On 2017-09-20 at 21:23, Paweł Staszewski wrote: On 2017-09-20 at 21:13, Paweł Staszewski wrote: On 2017-09-20 at 20:36, Cong Wang wrote: On Wed, Sep 20, 2017 at 11:30 AM, Eric Dumazet wrote: On Wed, 2017-09-20 at 11:22

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 23:24, Paweł Staszewski wrote: On 2017-09-20 at 23:10, Paweł Staszewski wrote: On 2017-09-20 at 21:23, Paweł Staszewski wrote: On 2017-09-20 at 21:13, Paweł Staszewski wrote: On 2017-09-20 at 20:36, Cong Wang wrote: On Wed, Sep 20, 2017 at 11:30 AM

Re: Latest net-next from GIT panic

2017-09-20 Thread Paweł Staszewski
On 2017-09-20 at 23:25, Paweł Staszewski wrote: On 2017-09-20 at 23:24, Paweł Staszewski wrote: On 2017-09-20 at 23:10, Paweł Staszewski wrote: On 2017-09-20 at 21:23, Paweł Staszewski wrote: On 2017-09-20 at 21:13, Paweł Staszewski wrote: On 2017-09-20 at 20:36

Re: Latest net-next from GIT panic

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 03:17, Eric Dumazet wrote: On Wed, 2017-09-20 at 18:09 -0700, Wei Wang wrote: Thanks very much Pawel for the feedback. I was looking into the code (specifically IPv4 part) and found that in free_fib_info_rcu(), we call free_nh_exceptions() without holding the fnhe_lock. I

Re: Latest net-next from GIT panic

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 13:03, Eric Dumazet wrote: On Thu, 2017-09-21 at 11:06 +0200, Paweł Staszewski wrote: On 2017-09-21 at 03:17, Eric Dumazet wrote: On Wed, 2017-09-20 at 18:09 -0700, Wei Wang wrote: Thanks very much Pawel for the feedback. I was looking into the code (specifically

Re: Latest net-next from GIT panic

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 13:12, Paweł Staszewski wrote: On 2017-09-21 at 13:03, Eric Dumazet wrote: On Thu, 2017-09-21 at 11:06 +0200, Paweł Staszewski wrote: On 2017-09-21 at 03:17, Eric Dumazet wrote: On Wed, 2017-09-20 at 18:09 -0700, Wei Wang wrote: Thanks very much Pawel for the

Re: Latest net-next from GIT panic

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 13:03, Eric Dumazet wrote: OK we have two problems here 1) We need to unify skb_dst_force() ( for net tree ) 2) Vlan devices should try to correctly handle IFF_XMIT_DST_RELEASE from lower device. This will considerably help your performance. For 1), this is what I had i

Re: Latest net-next from GIT panic

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 13:31, Paweł Staszewski wrote: On 2017-09-21 at 13:03, Eric Dumazet wrote: OK we have two problems here 1) We need to unify skb_dst_force() ( for net tree ) 2) Vlan devices should try to correctly handle IFF_XMIT_DST_RELEASE from lower device. This will

Re: Kernel 4.13.0-rc4-next-20170811 - IP Routing / Forwarding performance vs Core/RSS number / HT on

2017-09-21 Thread Paweł Staszewski
On 2017-08-15 at 11:11, Paweł Staszewski wrote: diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 5e831de3103e2f7092c7fa15534def403bc62fb4..9472de846d5c0960996261cb2843032847fa4bf7 100644 --- a/net/8021q/vlan_netlink.c +++ b/net/8021q/vlan_netlink.c @@ -143,6 +143,7

Re: Kernel 4.13.0-rc4-next-20170811 - IP Routing / Forwarding performance vs Core/RSS number / HT on

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 23:34, Eric Dumazet wrote: On Thu, 2017-09-21 at 23:26 +0200, Paweł Staszewski wrote: On 2017-08-15 at 11:11, Paweł Staszewski wrote: diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 5e831de3103e2f7092c7fa15534def403bc62fb4

Re: Kernel 4.13.0-rc4-next-20170811 - IP Routing / Forwarding performance vs Core/RSS number / HT on

2017-09-21 Thread Paweł Staszewski
On 2017-09-21 at 23:41, Florian Fainelli wrote: On 09/21/2017 02:26 PM, Paweł Staszewski wrote: On 2017-08-15 at 11:11, Paweł Staszewski wrote: diff --git a/net/8021q/vlan_netlink.c b/net/8021q/vlan_netlink.c index 5e831de3103e2f7092c7fa15534def403bc62fb4

Latest kernel net-next - 4.14-rc1+ / WARNING: CPU: 16 PID: 0 at net/sched/sch_hfsc.c:1385 hfsc_dequeue+0x241/0x269

2017-09-26 Thread Paweł Staszewski
[50102.787542] [ cut here ] [50102.787545] WARNING: CPU: 16 PID: 0 at net/sched/sch_hfsc.c:1385 hfsc_dequeue+0x241/0x269 [50102.787545] Modules linked in: ipmi_si x86_pkg_temp_thermal [50102.787547] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G W   4.14.0-rc1+ #10 [501

Re: Latest kernel net-next - 4.14-rc1+ / WARNING: CPU: 16 PID: 0 at net/sched/sch_hfsc.c:1385 hfsc_dequeue+0x241/0x269

2017-09-26 Thread Paweł Staszewski
trace 8558fb6f1ca3beb0 ]--- On 2017-09-26 at 14:00, Paweł Staszewski wrote: [50102.787542] [ cut here ] [50102.787545] WARNING: CPU: 16 PID: 0 at net/sched/sch_hfsc.c:1385 hfsc_dequeue+0x241/0x269 [50102.787545] Modules linked in: ipmi_si x86_pkg_temp_thermal [50102.78754

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-20 Thread Paweł Staszewski
On 19.11.2018 at 22:59, David Ahern wrote: On 11/9/18 5:06 PM, David Ahern wrote: On 11/9/18 9:21 AM, David Ahern wrote: Is it possible to add only counters from xdp for vlans ? This will help me in testing. I will take a look today at adding counters that you can dump using bpftool. I

Re: consistency for statistics with XDP mode

2018-11-21 Thread Paweł Staszewski
On 21.11.2018 at 22:14, Toke Høiland-Jørgensen wrote: David Ahern writes: Paweł ran some more XDP tests yesterday and from it found a couple of issues. One is a panic in the mlx5 driver unloading the bpf program (mlx5e_xdp_xmit); he will send a separate email for that problem. Sam

Re: [Patch net] net: invert the check of detecting hardware RX checksum fault

2018-11-21 Thread Paweł Staszewski
On 16.11.2018 at 21:06, Cong Wang wrote: On Thu, Nov 15, 2018 at 8:50 PM Herbert Xu wrote: On Thu, Nov 15, 2018 at 06:23:38PM -0800, Cong Wang wrote: Normally if the hardware's partial checksum is valid then we just trust it and send the packet along. However, if the partial checksum is

Weird traces 4.20.0-rc3+ / RIP: 0010:fib6_walk_continue+0x37/0xe6

2018-11-30 Thread Paweł Staszewski
Traces attached below: [310658.536190] rcu: INFO: rcu_sched self-detected stall on CPU [310658.536195] rcu:    15-: (322 ticks this GP) idle=fca/1/0x4002 softirq=50617185/50617185 fqs=64 [310658.536195] rcu: (t=15049 jiffies g=84272013 q=4728) [310658.536200] NMI backtra

Re: Latest net-next kernel 4.19.0+

2018-10-29 Thread Paweł Staszewski
00 R15: [ 342.190929] do_idle+0x1a3/0x1c0 [ 342.190931] cpu_startup_entry+0x14/0x20 [ 342.190934] start_secondary+0x165/0x190 [ 342.190939] secondary_startup_64+0xa4/0xb0 On 30.10.2018 at 01:10, Paweł Staszewski wrote: Hi Just checked in test lab latest kernel and have weird

Latest net-next kernel 4.19.0+

2018-10-29 Thread Paweł Staszewski
Hi Just checked in test lab latest kernel and have weird traces: [  219.888673] CPU: 52 PID: 0 Comm: swapper/52 Not tainted 4.19.0+ #1 [  219.888674] Call Trace: [  219.888676]  [  219.888685]  dump_stack+0x46/0x5b [  219.888691]  __skb_checksum_complete+0x9a/0xa0 [  219.888694]  tcp_v4_rcv+0x

Re: Latest net-next kernel 4.19.0+

2018-10-29 Thread Paweł Staszewski
On 30.10.2018 at 01:11, Paweł Staszewski wrote: Sorry not complete - followed by hw csum: [ 342.190831] vlan1490: hw csum failure [ 342.190835] CPU: 52 PID: 0 Comm: swapper/52 Not tainted 4.19.0+ #1 [ 342.190836] Call Trace: [ 342.190839] [ 342.190849] dump_stack+0x46/0x5b

Re: Latest net-next kernel 4.19.0+

2018-10-30 Thread Paweł Staszewski
On 30.10.2018 at 08:29, Eric Dumazet wrote: On 10/29/2018 11:09 PM, Dimitris Michailidis wrote: Indeed this is a bug. I would expect it to produce frequent errors though as many odd-length packets would trigger it. Do you have RXFCS? Regardless, how frequently do you see the problem? O

Re: Latest net-next kernel 4.19.0+

2018-10-31 Thread Paweł Staszewski
On 31.10.2018 at 22:05, Saeed Mahameed wrote: On Tue, 2018-10-30 at 10:32 -0700, Cong Wang wrote: On Tue, Oct 30, 2018 at 7:16 AM Eric Dumazet wrote: On 10/30/2018 01:09 AM, Paweł Staszewski wrote: On 30.10.2018 at 08:29, Eric Dumazet wrote: On 10/29/2018 11:09 PM, Dimitris

Re: Latest net-next kernel 4.19.0+

2018-10-31 Thread Paweł Staszewski
On 30.10.2018 at 15:16, Eric Dumazet wrote: On 10/30/2018 01:09 AM, Paweł Staszewski wrote: On 30.10.2018 at 08:29, Eric Dumazet wrote: On 10/29/2018 11:09 PM, Dimitris Michailidis wrote: Indeed this is a bug. I would expect it to produce frequent errors though as many odd-length

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-10-31 Thread Paweł Staszewski
On 31.10.2018 at 23:09, Eric Dumazet wrote: On 10/31/2018 02:57 PM, Paweł Staszewski wrote: Hi So maybe someone will be interested in how the Linux kernel handles normal traffic (not pktgen :) ) Server HW configuration: CPU : Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz NIC's: 2x

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-10-31 Thread Paweł Staszewski
On 31.10.2018 at 23:20, Paweł Staszewski wrote: On 31.10.2018 at 23:09, Eric Dumazet wrote: On 10/31/2018 02:57 PM, Paweł Staszewski wrote: Hi So maybe someone will be interested in how the Linux kernel handles normal traffic (not pktgen :) ) Server HW configuration: CPU : Intel(R

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 10:22, Jesper Dangaard Brouer wrote: On Wed, 31 Oct 2018 23:20:01 +0100 Paweł Staszewski wrote: On 31.10.2018 at 23:09, Eric Dumazet wrote: On 10/31/2018 02:57 PM, Paweł Staszewski wrote: Hi So maybe someone will be interested in how the Linux kernel handles normal

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 11:55, Jesper Dangaard Brouer wrote: On Wed, 31 Oct 2018 21:37:16 -0600 David Ahern wrote: This is mainly a forwarding use case? Seems so based on the perf report. I suspect forwarding with XDP would show pretty good improvement. Yes, significant performance improvement

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 12:09, Paweł Staszewski wrote: rx_cqe_compress_pkts: 0 If this is a pcie bottleneck it might be useful to enable CQE compression (to reduce PCIe completion descriptors transactions) you should see the above rx_cqe_compress_pkts increasing when enabled. $ ethtool --set

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 18:23, David Ahern wrote: On 11/1/18 7:52 AM, Paweł Staszewski wrote: On 01.11.2018 at 11:55, Jesper Dangaard Brouer wrote: On Wed, 31 Oct 2018 21:37:16 -0600 David Ahern wrote: This is mainly a forwarding use case? Seems so based on the perf report. I suspect

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 21:37, Saeed Mahameed wrote: On Thu, 2018-11-01 at 12:09 +0100, Paweł Staszewski wrote: On 01.11.2018 at 10:50, Saeed Mahameed wrote: On Wed, 2018-10-31 at 22:57 +0100, Paweł Staszewski wrote: Hi So maybe someone will be interested in how the Linux kernel handles normal

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 22:18, Paweł Staszewski wrote: On 01.11.2018 at 21:37, Saeed Mahameed wrote: On Thu, 2018-11-01 at 12:09 +0100, Paweł Staszewski wrote: On 01.11.2018 at 10:50, Saeed Mahameed wrote: On Wed, 2018-10-31 at 22:57 +0100, Paweł Staszewski wrote: Hi So maybe

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-01 Thread Paweł Staszewski
On 01.11.2018 at 22:24, Paweł Staszewski wrote: On 01.11.2018 at 22:18, Paweł Staszewski wrote: On 01.11.2018 at 21:37, Saeed Mahameed wrote: On Thu, 2018-11-01 at 12:09 +0100, Paweł Staszewski wrote: On 01.11.2018 at 10:50, Saeed Mahameed wrote: On Wed, 2018-10-31 at 22

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-02 Thread Paweł Staszewski
On 02.11.2018 at 15:20, Aaron Lu wrote: On Fri, Nov 02, 2018 at 12:40:37PM +0100, Jesper Dangaard Brouer wrote: On Fri, 2 Nov 2018 13:23:56 +0800 Aaron Lu wrote: On Thu, Nov 01, 2018 at 08:23:19PM +, Saeed Mahameed wrote: On Thu, 2018-11-01 at 23:27 +0800, Aaron Lu wrote: On Thu,

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-02 Thread Paweł Staszewski
On 02.11.2018 at 20:02, Paweł Staszewski wrote: On 02.11.2018 at 15:20, Aaron Lu wrote: On Fri, Nov 02, 2018 at 12:40:37PM +0100, Jesper Dangaard Brouer wrote: On Fri, 2 Nov 2018 13:23:56 +0800 Aaron Lu wrote: On Thu, Nov 01, 2018 at 08:23:19PM +, Saeed Mahameed wrote: On

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-02 Thread Paweł Staszewski
On 01.11.2018 at 21:37, Saeed Mahameed wrote: On Thu, 2018-11-01 at 12:09 +0100, Paweł Staszewski wrote: On 01.11.2018 at 10:50, Saeed Mahameed wrote: On Wed, 2018-10-31 at 22:57 +0100, Paweł Staszewski wrote: Hi So maybe someone will be interested in how the Linux kernel handles normal

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-03 Thread Paweł Staszewski
On 03.11.2018 at 01:16, Paweł Staszewski wrote: On 02.11.2018 at 20:02, Paweł Staszewski wrote: On 02.11.2018 at 15:20, Aaron Lu wrote: On Fri, Nov 02, 2018 at 12:40:37PM +0100, Jesper Dangaard Brouer wrote: On Fri, 2 Nov 2018 13:23:56 +0800 Aaron Lu wrote: On Thu, Nov 01

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-03 Thread Paweł Staszewski
On 03.11.2018 at 13:58, Jesper Dangaard Brouer wrote: On Sat, 3 Nov 2018 01:16:08 +0100 Paweł Staszewski wrote: On 02.11.2018 at 20:02, Paweł Staszewski wrote: On 02.11.2018 at 15:20, Aaron Lu wrote: On Fri, Nov 02, 2018 at 12:40:37PM +0100, Jesper Dangaard Brouer wrote: On

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-03 Thread Paweł Staszewski
On 03.11.2018 at 16:23, Paweł Staszewski wrote: On 03.11.2018 at 13:58, Jesper Dangaard Brouer wrote: On Sat, 3 Nov 2018 01:16:08 +0100 Paweł Staszewski wrote: On 02.11.2018 at 20:02, Paweł Staszewski wrote: On 02.11.2018 at 15:20, Aaron Lu wrote: On Fri, Nov 02, 2018 at

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-03 Thread Paweł Staszewski
n On 03.11.2018 at 18:32, David Ahern wrote: On 11/1/18 11:30 AM, Paweł Staszewski wrote: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/xdp_fwd_kern.c I can try some tests on same hw but testlab configuration - will give it a try :) That

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-07 Thread Paweł Staszewski
On 05.11.2018 at 21:17, Jesper Dangaard Brouer wrote: On Sun, 4 Nov 2018 01:24:03 +0100 Paweł Staszewski wrote: And today again, after applying the patch for the page allocator, reached 64/64 Gbit/s again with only 50-60% CPU load Great. Today no slowpath hit for networking :) But again

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-07 Thread Paweł Staszewski
On 08.11.2018 at 01:59, Paweł Staszewski wrote: On 05.11.2018 at 21:17, Jesper Dangaard Brouer wrote: On Sun, 4 Nov 2018 01:24:03 +0100 Paweł Staszewski wrote: And today again, after applying the patch for the page allocator, reached 64/64 Gbit/s again with only 50-60% CPU load Great

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 07.11.2018 at 22:06, David Ahern wrote: On 11/3/18 6:24 PM, Paweł Staszewski wrote: Does your setup have any other device types besides physical ports with VLANs (e.g., any macvlans or bonds)? no. just phy(mlnx)->vlans only config VLAN and non-VLAN (and a mix) seem to work

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 08.11.2018 at 01:59, Paweł Staszewski wrote: On 05.11.2018 at 21:17, Jesper Dangaard Brouer wrote: On Sun, 4 Nov 2018 01:24:03 +0100 Paweł Staszewski wrote: And today again, after applying the patch for the page allocator, reached 64/64 Gbit/s again with only 50-60% CPU load Great

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 08.11.2018 at 17:06, David Ahern wrote: On 11/8/18 6:33 AM, Paweł Staszewski wrote: On 07.11.2018 at 22:06, David Ahern wrote: On 11/3/18 6:24 PM, Paweł Staszewski wrote: Does your setup have any other device types besides physical ports with VLANs (e.g., any macvlans or bonds

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 08.11.2018 at 17:25, Paweł Staszewski wrote: On 08.11.2018 at 17:06, David Ahern wrote: On 11/8/18 6:33 AM, Paweł Staszewski wrote: On 07.11.2018 at 22:06, David Ahern wrote: On 11/3/18 6:24 PM, Paweł Staszewski wrote: Does your setup have any other device types besides

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 08.11.2018 at 17:32, David Ahern wrote: On 11/8/18 9:27 AM, Paweł Staszewski wrote: What hardware is this? mellanox connectx 4 ethtool -i enp175s0f0 driver: mlx5_core version: 5.0-0 firmware-version: 12.21.1000 (SM_200101033) expansion-rom-version: bus-info: :af:00.0 supports

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 03.11.2018 at 01:18, Paweł Staszewski wrote: On 01.11.2018 at 21:37, Saeed Mahameed wrote: On Thu, 2018-11-01 at 12:09 +0100, Paweł Staszewski wrote: On 01.11.2018 at 10:50, Saeed Mahameed wrote: On Wed, 2018-10-31 at 22:57 +0100, Paweł Staszewski wrote: Hi So maybe

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-08 Thread Paweł Staszewski
On 08.11.2018 at 17:32, David Ahern wrote: On 11/8/18 9:27 AM, Paweł Staszewski wrote: What hardware is this? mellanox connectx 4 ethtool -i enp175s0f0 driver: mlx5_core version: 5.0-0 firmware-version: 12.21.1000 (SM_200101033) expansion-rom-version: bus-info: :af:00.0 supports

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-09 Thread Paweł Staszewski
On 09.11.2018 at 05:52, Saeed Mahameed wrote: On Thu, 2018-11-08 at 17:42 -0700, David Ahern wrote: On 11/8/18 5:40 PM, Paweł Staszewski wrote: On 08.11.2018 at 17:32, David Ahern wrote: On 11/8/18 9:27 AM, Paweł Staszewski wrote: What hardware is this? mellanox connectx 4 ethtool

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-09 Thread Paweł Staszewski
On 08.11.2018 at 17:06, David Ahern wrote: On 11/8/18 6:33 AM, Paweł Staszewski wrote: On 07.11.2018 at 22:06, David Ahern wrote: On 11/3/18 6:24 PM, Paweł Staszewski wrote: Does your setup have any other device types besides physical ports with VLANs (e.g., any macvlans or bonds

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-09 Thread Paweł Staszewski
On 09.11.2018 at 17:21, David Ahern wrote: On 11/9/18 3:20 AM, Paweł Staszewski wrote: I just caught some weird behavior :) All was working fine for about 20k packets Then after xdp start to forward every 10 packets Interesting. Any counter showing drops? nothing that will fit NIC

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-09 Thread Paweł Staszewski
On 08.11.2018 at 20:12, Paweł Staszewski wrote: CPU load is lower than for connectx4 - but it looks like bandwidth limit is the same :) But also after reaching 60Gbit/60Gbit bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help input: /proc/net/dev type: rate -

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 01:06, David Ahern wrote: On 11/9/18 9:21 AM, David Ahern wrote: Is it possible to add only counters from xdp for vlans ? This will help me in testing. I will take a look today at adding counters that you can dump using bpftool. It will be a temporary solution for this

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote: On Fri, 9 Nov 2018 23:20:38 +0100 Paweł Staszewski wrote: On 08.11.2018 at 20:12, Paweł Staszewski wrote: CPU load is lower than for connectx4 - but it looks like bandwidth limit is the same :) But also after reaching 60Gbit

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 20:49, Paweł Staszewski wrote: On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote: On Fri, 9 Nov 2018 23:20:38 +0100 Paweł Staszewski wrote: On 08.11.2018 at 20:12, Paweł Staszewski wrote: CPU load is lower than for connectx4 - but it looks like bandwidth

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote: I want you to experiment with: ethtool --set-priv-flags DEVICE rx_striding_rq off just checked that previously connectx4 had this disabled: ethtool --show-priv-flags enp175s0f0 Private flags for enp175s0f0: rx_cqe_moder :

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 22:01, Jesper Dangaard Brouer wrote: On Sat, 10 Nov 2018 21:02:10 +0100 Paweł Staszewski wrote: On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote: I want you to experiment with: ethtool --set-priv-flags DEVICE rx_striding_rq off just checked that previously

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 22:53, Paweł Staszewski wrote: On 10.11.2018 at 22:01, Jesper Dangaard Brouer wrote: On Sat, 10 Nov 2018 21:02:10 +0100 Paweł Staszewski wrote: On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote: I want you to experiment with: ethtool --set-priv-flags

Re: Kernel 4.19 network performance - forwarding/routing normal users traffic

2018-11-10 Thread Paweł Staszewski
On 10.11.2018 at 23:06, Jesper Dangaard Brouer wrote: On Sat, 10 Nov 2018 20:56:02 +0100 Paweł Staszewski wrote: On 10.11.2018 at 20:49, Paweł Staszewski wrote: On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote: On Fri, 9 Nov 2018 23:20:38 +0100 Paweł Staszewski wrote
