Re: [PATCH v2 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-22 Thread Soheil Hassas Yeganeh
On Thu, Mar 21, 2019 at 9:55 PM Willem de Bruijn wrote: > > On Thu, Mar 21, 2019 at 8:16 PM Eric Dumazet wrote: > > > > On hosts with many cpus we can observe a very serious contention > > on spinlocks used in mm slab layer. > > > > The following can happen quite often : > > > > 1) TX path > >

[RFC 1/1] net/tun: fdinfo to relate TAP/TUN network interfaces unambiguously with their serving processes

2019-03-22 Thread Harald Albrecht
This is a first for me to appear on the Linux kernel netdev list, so please bear with me while killing me... For diagnosing Linux virtual networking I need to relate TAP/TUN network interfaces to the user space processes they are served by. I'm aware of Kirill Tkhai's TUN SIOCGSKNS ioctl patch

[PATCH net-next v2 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc

2019-03-22 Thread Paolo Abeni
With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array") pfifo_fast no longer benefits from the TCQ_F_CAN_BYPASS optimization. Due to retpolines the cost of the enqueue()/dequeue() pair has become relevant and we observe a measurable regression for the uncontended scenario when the packet-rat

[PATCH net-next v2 1/2] net: sched: add empty status flag for NOLOCK qdisc

2019-03-22 Thread Paolo Abeni
The queue is marked not empty after acquiring the seqlock, and it's up to the NOLOCK qdisc clearing such flag on dequeue. Since the empty status lays on the same cache-line of the seqlock, it's always hot on cache during the updates. This makes the empty flag update a little bit loosy. Given the l
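The flag handling described above can be modelled in userspace. This is only a minimal sketch under stated assumptions: the preview is cut off, so the real patch may differ, and every name here (`toy_qdisc`, `toy_run_begin`, and so on) is hypothetical. A plain `bool` stands in for the seqlock and the model is single-threaded, so it shows the ordering of the flag updates rather than any real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QLEN 4

/* Toy stand-in for a NOLOCK qdisc; all names are hypothetical. */
struct toy_qdisc {
	bool running;        /* models the seqlock: true while one CPU owns the queue */
	bool empty;          /* the status flag discussed in the patch */
	int ring[TOY_QLEN];
	size_t head, tail;   /* head == tail means nothing is queued */
};

void toy_init(struct toy_qdisc *q)
{
	q->running = false;
	q->empty = true;
	q->head = q->tail = 0;
}

/* Try to take ownership; on success the queue is marked not empty. */
bool toy_run_begin(struct toy_qdisc *q)
{
	if (q->running)
		return false;
	q->running = true;
	q->empty = false;
	return true;
}

void toy_run_end(struct toy_qdisc *q)
{
	assert(q->running);  /* only the owner may release */
	q->running = false;
}

bool toy_enqueue(struct toy_qdisc *q, int pkt)
{
	if (q->tail - q->head == TOY_QLEN)
		return false;    /* ring is full */
	q->ring[q->tail++ % TOY_QLEN] = pkt;
	return true;
}

/* The qdisc itself restores the empty mark once a dequeue finds nothing. */
bool toy_dequeue(struct toy_qdisc *q, int *pkt)
{
	if (q->head == q->tail) {
		q->empty = true;
		return false;
	}
	*pkt = q->ring[q->head++ % TOY_QLEN];
	return true;
}
```

Note that in this model the flag stays "not empty" after the last packet is removed and only flips back on the next dequeue attempt that finds the ring drained, which is one way to read the remark that the update is not exact.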

[PATCH net-next v2 0/2] net: dev: BYPASS for lockless qdisc

2019-03-22 Thread Paolo Abeni
This patch series is aimed at improving xmit performance of lockless qdiscs in the uncontended scenario. After the lockless refactor pfifo_fast can't leverage the BYPASS optimization. Due to retpolines the overhead for the avoidable enqueue and dequeue operations has increased and we see measurab

Re: [patch net-next 02/11] bnxt: add missing net/devlink.h include

2019-03-22 Thread Vasundhara Volam
On Thu, Mar 21, 2019 at 6:50 PM Jiri Pirko wrote: > > From: Jiri Pirko > > devlink functions are in use, so include the related header file. > > Signed-off-by: Jiri Pirko > --- > drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/n

Re: [PATCH net-next v2 0/2] net: dev: BYPASS for lockless qdisc

2019-03-22 Thread Ivan Vecera
On 22. 03. 19 9:30, Paolo Abeni wrote: This patch series is aimed at improving xmit performance of lockless qdiscs in the uncontended scenario. After the lockless refactor pfifo_fast can't leverage the BYPASS optimization. Due to retpolines the overhead for the avoidable enqueue and dequeue ope

Re: [PATCH v2 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-22 Thread Michael S. Tsirkin
On Thu, Mar 21, 2019 at 05:14:41PM -0700, Eric Dumazet wrote: > On hosts with many cpus we can observe a very serious contention > on spinlocks used in mm slab layer. > > The following can happen quite often : > > 1) TX path > sendmsg() allocates one (fclone) skb on CPU A, sends a clone. > AC

[PATCH net v2 1/1] net: sched: fix cleanup NULL pointer exception in act_mirr

2019-03-22 Thread John Hurley
A new mirred action is created by the tcf_mirred_init function. This contains a list head struct which is inserted into a global list on successful creation of a new action. However, after a creation, it is still possible to error out and call the tcf_idr_release function. This, in turn, calls the

Re: [PATCH v2 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-22 Thread Eric Dumazet
On Fri, Mar 22, 2019 at 4:28 AM Michael S. Tsirkin wrote: > > > Just a thought: would it make sense to flush the cache > in enter_memory_pressure? > Good question, thanks ! Willem asked me something similar yesterday. The argument of keeping one skb for tx and one for rx makes some sense to me,

Re: [BUG] BPF splat on latest kernels

2019-03-22 Thread Eric Dumazet
On Thu, Mar 21, 2019 at 8:36 PM Alexei Starovoitov wrote: > > On Wed, Mar 20, 2019 at 09:49:34PM -0700, Eric Dumazet wrote: > > > > > > On 03/08/2019 04:29 PM, Alexei Starovoitov wrote: > > > On Fri, Mar 8, 2019 at 12:33 PM Eric Dumazet > > > wrote: > > >> > > >> Running test_progs on a LOCKDEP

[PATCH net-next] ipv6: Move ipv6 stubs to a separate header file

2019-03-22 Thread David Ahern
From: David Ahern The number of stubs is growing and has nothing to do with addrconf. Move the definition of the stubs to a separate header file and update users. In the move, drop the vxlan specific comment before ipv6_stub. Code move only; no functional change intended. Signed-off-by: David A

[RFC v2 6/6] nl80211: tag policies with strict_start_type

2019-03-22 Thread Johannes Berg
From: Johannes Berg Tag all the nl80211 policies with strict_start_type so that strict validation is done for all types that we have a policy for. Signed-off-by: Johannes Berg --- net/wireless/nl80211.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/net/wireless/nl

[RFC v2 1/6] netlink: add NLA_MIN_LEN

2019-03-22 Thread Johannes Berg
From: Johannes Berg Rather than using NLA_UNSPEC for this type of thing, use NLA_MIN_LEN so we can make NLA_UNSPEC be NLA_REJECT under certain conditions for future attributes. While at it, also use NLA_EXACT_LEN for the struct example. Signed-off-by: Johannes Berg --- include/net/netlink.h |

[RFC v2 4/6] netlink: add strict parsing for future attributes

2019-03-22 Thread Johannes Berg
From: Johannes Berg Unfortunately, we cannot add strict parsing for all attributes, as that would break existing userspace. We currently warn about it, but that's about all we can do. For new attributes, however, the story is better: nobody is using them, so we can reject bad sizes. Also, for n

[RFC v2 0/6] strict netlink validation

2019-03-22 Thread Johannes Berg
This version gets us to where I wanted to be:

                     strict parsing
command   attribute  attribute  message
old       old        -          -
old       new        X          -
new       *          X          X

Additionally, it does lots of cross-tree renam

[RFC v2 3/6] netlink: re-add parse/validate functions in strict mode

2019-03-22 Thread Johannes Berg
From: Johannes Berg This re-adds the parse and validate functions like nla_parse() that are now actually strict after the previous rename and were just split out to make sure everything is converted (and if not compilation of the previous patch would fail.) Signed-off-by: Johannes Berg --- inc

[RFC v2 5/6] genetlink: optionally validate strictly/dumps

2019-03-22 Thread Johannes Berg
From: Johannes Berg Add options to strictly validate messages and dump messages; sometimes validating dump messages non-strictly may be required, so add an option for that as well. Since none of this can really be applied to existing commands, set the options everywhere using the followin

Re: [PATCH net v3 0/2] vti4: ipip tunnel fixes

2019-03-22 Thread Steffen Klassert
On Tue, Mar 19, 2019 at 03:39:19PM +, Jeremy Sowden wrote: > Some fixes for the initialization and clean-up of the ipip tunnel. > > Jeremy Sowden (2): > vti4: ipip tunnel deregistration fixes. > vti4: removed duplicate log message. Both patches applied, thanks!

[PATCH net-next] tcp: remove conditional branches from tcp_mstamp_refresh()

2019-03-22 Thread Eric Dumazet
tcp_clock_ns() (aka ktime_get_ns()) uses a monotonic clock, so the checks we had in tcp_mstamp_refresh() are no longer relevant. This patch removes a CPU stall (when the cache line is not hot). Signed-off-by: Eric Dumazet --- net/ipv4/tcp_output.c | 8 ++-- 1 file changed, 2 insertions(+), 6
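To illustrate why a monotonic clock makes such checks unnecessary, here is a small userspace sketch, assuming the refresh simply stores the latest sample in two units. The struct, field, and function names are hypothetical, and POSIX `clock_gettime(CLOCK_MONOTONIC, ...)` stands in for the kernel clock; this is not the patch itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define TOY_NSEC_PER_USEC 1000ull
#define TOY_NSEC_PER_SEC  1000000000ull

/* Hypothetical per-socket timestamps, loosely modelled on the description. */
struct toy_tcp_sock {
	uint64_t clock_cache_ns;  /* last clock sample, in nanoseconds */
	uint64_t mstamp_us;       /* the same sample, in microseconds */
};

/* Userspace stand-in for a monotonic nanosecond clock. */
uint64_t toy_clock_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)ts.tv_sec * TOY_NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/*
 * Because the clock never goes backwards, the new sample can be stored
 * unconditionally: no "is it newer than the old value?" branch is needed,
 * and the old value does not have to be read at all.
 */
void toy_mstamp_refresh(struct toy_tcp_sock *tp)
{
	uint64_t now = toy_clock_ns();

	tp->clock_cache_ns = now;
	tp->mstamp_us = now / TOY_NSEC_PER_USEC;
}
```

Skipping the read of the old value is plausibly what avoids the stall mentioned above when the cache line holding it is cold, though the truncated preview does not spell this out.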

Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports

2019-03-22 Thread Jiri Pirko
Thu, Mar 21, 2019 at 06:42:55PM CET, pa...@mellanox.com wrote: > > >> -Original Message- >> From: Jiri Pirko >> Sent: Thursday, March 21, 2019 12:24 PM >> To: Parav Pandit >> Cc: Jakub Kicinski ; Samudrala, Sridhar >> ; da...@davemloft.net; >> netdev@vger.kernel.org; oss-driv...@netronome

Re: [patch net-next 02/11] bnxt: add missing net/devlink.h include

2019-03-22 Thread Jiri Pirko
Fri, Mar 22, 2019 at 10:16:34AM CET, vasundhara-v.vo...@broadcom.com wrote: >On Thu, Mar 21, 2019 at 6:50 PM Jiri Pirko wrote: >> >> From: Jiri Pirko >> >> devlink functions are in use, so include the related header file. >> >> Signed-off-by: Jiri Pirko >> --- >> drivers/net/ethernet/broadcom/b

Re: [patch net-next 06/11] net: devlink: don't take devlink_mutex for devlink_compat_*

2019-03-22 Thread Jiri Pirko
Thu, Mar 21, 2019 at 08:08:24PM CET, jakub.kicin...@netronome.com wrote: >On Thu, 21 Mar 2019 14:20:14 +0100, Jiri Pirko wrote: >> From: Jiri Pirko >> >> The netdevice is guaranteed to not disappear so we can rely that >> devlink_port and devlink won't disappear as well. No need to take >> devlin

Re: [PATCH net-next v2 1/2] net: sched: add empty status flag for NOLOCK qdisc

2019-03-22 Thread Eric Dumazet
On 03/22/2019 01:30 AM, Paolo Abeni wrote: > The queue is marked not empty after acquiring the seqlock, > and it's up to the NOLOCK qdisc clearing such flag on dequeue. > Since the empty status lays on the same cache-line of the > seqlock, it's always hot on cache during the updates. > > >

Re: [PATCH net-next v2 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc

2019-03-22 Thread Eric Dumazet
On 03/22/2019 01:30 AM, Paolo Abeni wrote: > With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array") > pfifo_fast no longer benefits from the TCQ_F_CAN_BYPASS optimization. > Due to retpolines the cost of the enqueue()/dequeue() pair has become > relevant and we observe a measurable regre

[RFC PATCH 1/1 v2] net: sched: Introduce conndscp action

2019-03-22 Thread Kevin 'ldir' Darbyshire-Bryant
Conndscp is a new tc filter action module. It is designed to copy DSCPs to conntrack marks and the reverse operation of conntrack mark contained DSCPs to the diffserv field of suitable skbs. The feature is intended for use and has been found useful for restoring ingress classifications based on e

Re: [PATCH net-next v2] tls: Add SOL_TLS to uapi

2019-03-22 Thread David Miller
From: Boris Pismenny Date: Thu, 21 Mar 2019 16:29:02 +0200 > User applications are forced to define the SOL_TLS manually at the > moment, which is inappropriate. Add SOL_TLS to the uapi. > > Other protocols handle this similarly. For example see SOL_TIPC. > > Signed-off-by: Boris Pismenny > --

Re: [PATCH v2 net-next] net: phy: aquantia: add downshift support

2019-03-22 Thread David Miller
From: Heiner Kallweit Date: Thu, 21 Mar 2019 21:08:35 +0100 > Aquantia PHY's of the AQR107 family support the downshift feature. > Add support for it as standard PHY tunable so that it can be controlled > via ethtool. > The AQCS109 supports a proprietary 2-pair 1Gbps mode. If two such PHY's > are

Re: [PATCH net-next v2] tls: Add SOL_TLS to uapi

2019-03-22 Thread Boris Pismenny
On 3/22/2019 3:24 PM, David Miller wrote: > From: Boris Pismenny > Date: Thu, 21 Mar 2019 16:29:02 +0200 > >> User applications are forced to define the SOL_TLS manually at the >> moment, which is inappropriate. Add SOL_TLS to the uapi. >> >> Other protocols handle this similarly. For example s

Re: [PATCH net] r8169: don't read interrupt mask register in interrupt handler

2019-03-22 Thread David Miller
From: Heiner Kallweit Date: Thu, 21 Mar 2019 21:23:14 +0100 > After the original patch network starts to crash on heavy load. > It's not fully clear why this additional register read has such side > effects, but removing it fixes the issue. > > Thanks also to Alex for his contribution and hints.

Re: [PATCH net-next] r8169: use netif_start_queue instead of netif_wake_queue in rtl8169_start_xmit

2019-03-22 Thread David Miller
From: Heiner Kallweit Date: Thu, 21 Mar 2019 21:41:48 +0100 > Replace the call to netif_wake_queue in rtl8169_start_xmit with > netif_start_queue as we don't need to actually wake up the queue since > we are still in mid transmit so we just need to reset the bit so it > doesn't prevent the next t

Re: [PATCH] genetlink: make policy common to family

2019-03-22 Thread David Miller
From: Johannes Berg Date: Thu, 21 Mar 2019 22:51:02 +0100 > From: Johannes Berg > > Since maxattr is common, the policy can't really differ sanely, > so make it common as well. > > The only user that did in fact manage to make a non-common policy > is taskstats, which has to be really careful

Re: [PATCH v2 net-next 3/3] tcp: add one skb cache for rx

2019-03-22 Thread kbuild test robot
Hi Eric, I love your patch! Perhaps something to improve: [auto build test WARNING on net-next/master] url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-rx-tx-cache-to-reduce-lock-contention/20190322-215506 config: i386-randconfig-x005-201911 (attached as .config) compiler

Re: macb: MID register on SAMA5D2 series?

2019-03-22 Thread Nicolas.Ferre
On 22/03/2019 at 11:49, Alexander Dahl wrote: > External E-Mail > > > Hei hei, > > while bringing up support for a new SAMA5D27 based board I noticed something > strange in the macb driver in both U-Boot and Linux. There's a function in > both to determine if or not the IP block in the SoC is th

[PATCH net-next] tcp: add documentation for tcp_ca_state

2019-03-22 Thread Soheil Hassas Yeganeh
From: Soheil Hassas Yeganeh Add documentation to the tcp_ca_state enum, since this enum is exposed in uapi. Signed-off-by: Neal Cardwell Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Signed-off-by: Soheil Hassas Yeganeh Cc: Sowmini Varadhan --- include/uapi/linux/tcp.h | 27

Re: [PATCH v2 net-next 3/3] tcp: add one skb cache for rx

2019-03-22 Thread kbuild test robot
Hi Eric, I love your patch! Yet something to improve: [auto build test ERROR on net-next/master] url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-rx-tx-cache-to-reduce-lock-contention/20190322-215506 config: x86_64-randconfig-x016-201911 (attached as .config) compiler: gcc

[PATCH net-next v3 0/2] net: dev: BYPASS for lockless qdisc

2019-03-22 Thread Paolo Abeni
This patch series is aimed at improving xmit performance of lockless qdiscs in the uncontended scenario. After the lockless refactor pfifo_fast can't leverage the BYPASS optimization. Due to retpolines the overhead for the avoidable enqueue and dequeue operations has increased and we see measurab

[PATCH net-next v3 2/2] net: dev: introduce support for sch BYPASS for lockless qdisc

2019-03-22 Thread Paolo Abeni
With commit c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array") pfifo_fast no longer benefits from the TCQ_F_CAN_BYPASS optimization. Due to retpolines the cost of the enqueue()/dequeue() pair has become relevant and we observe a measurable regression for the uncontended scenario when the packet-rat
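The general idea of a bypass path can be sketched as follows. This is a toy, single-threaded model with hypothetical names (`toy_txq`, `toy_xmit`), written under the assumption that "bypass" means handing a packet directly to the driver when the queue is idle and known to be empty; the actual patch is not visible in the truncated preview and may work differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy transmit queue; all names and fields are hypothetical. */
struct toy_txq {
	bool running;   /* models "some CPU is currently running this qdisc" */
	bool empty;     /* models the empty status flag */
	int backlog;    /* packets sitting in the queue */
	int sent;       /* packets handed to the (imaginary) driver */
	int bypassed;   /* how many of those skipped enqueue/dequeue */
};

/* Drain the backlog; the caller must own the queue. */
void toy_drain(struct toy_txq *q)
{
	assert(q->running);
	while (q->backlog > 0) {
		q->backlog--;
		q->sent++;
	}
	q->empty = true;   /* a dequeue attempt found nothing left */
}

void toy_xmit(struct toy_txq *q)
{
	if (q->empty && !q->running) {
		/* Bypass: idle and empty, so skip the enqueue()/dequeue() pair. */
		q->running = true;
		q->empty = false;
		q->sent++;
		q->bypassed++;
		q->running = false;
		return;
	}

	/* Slow path: queue the packet, then drain unless someone else is running. */
	q->backlog++;
	if (!q->running) {
		q->running = true;
		q->empty = false;
		toy_drain(q);
		q->running = false;
	}
}
```

In this model the flag is only restored by a drain, so a bypassed packet is followed by one slow-path packet before the bypass becomes available again. That is an artifact of the toy, chosen to keep the flag handling conservative, not a claim about the real code.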

[PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc

2019-03-22 Thread Paolo Abeni
The queue is marked not empty after acquiring the seqlock, and it's up to the NOLOCK qdisc clearing such flag on dequeue. Since the empty status lays on the same cache-line of the seqlock, it's always hot on cache during the updates. This makes the empty flag update a little bit loosy. Given the l

Re: [PATCH net-next] tcp: add documentation for tcp_ca_state

2019-03-22 Thread Sowmini Varadhan
On (03/22/19 10:59), Soheil Hassas Yeganeh wrote: > > Add documentation to the tcp_ca_state enum, since this enum is > exposed in uapi. Acked-by: Sowmini Varadhan

[PATCH bpf-next v2 00/13] bpf tc tunneling

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn BPF allows for dynamic tunneling, choosing the tunnel destination and features on-demand. Extend bpf_skb_adjust_room to allow for efficient tunneling at the TC hooks. Most features are required for large packets with GSO, as these will be modified after this patch. Patch

[PATCH bpf-next v2 01/13] bpf: in bpf_skb_adjust_room avoid copy in tx fast path

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn bpf_skb_adjust_room calls skb_cow on grow. This expensive operation can be avoided in the fast path when the only other clone has released the header. This is the common case for TCP, where one headerless clone is kept on the retransmit queue. It is safe to do so even whe

[PATCH bpf-next v2 03/13] selftests/bpf: expand bpf tunnel test with decap

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn The bpf tunnel test encapsulates using bpf, then decapsulates using a standard tunnel device to verify correctness. Once encap is verified, also test decap, by replacing the tunnel device on decap with another bpf program. Signed-off-by: Willem de Bruijn --- .../selftes

[PATCH bpf-next v2 07/13] bpf: add bpf_skb_adjust_room mode BPF_ADJ_ROOM_MAC

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn bpf_skb_adjust_room allows inserting room in an skb. Existing mode BPF_ADJ_ROOM_NET inserts room after the network header by pulling the skb, moving the network header forward and zeroing the new space. Add new mode BPF_ADJ_ROOM_MAC that inserts room after the mac

[PATCH bpf-next v2 05/13] selftests/bpf: extend bpf tunnel test with gre

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn GRE is a commonly used protocol. Add GRE cases for both IPv4 and IPv6. It also inserts different sized headers, which can expose some unexpected edge cases. Signed-off-by: Willem de Bruijn --- .../selftests/bpf/progs/test_tc_tunnel.c | 148 +- tools

[PATCH bpf-next v2 10/13] bpf: Sync bpf.h to tools

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn Sync include/uapi/linux/bpf.h with tools/ Changes v1->v2: - BPF_F_ADJ_ROOM_MASK moved, no longer in this commit Signed-off-by: Willem de Bruijn --- tools/include/uapi/linux/bpf.h | 32 +--- 1 file changed, 29 insertions(+), 3 deletions(-)

[PATCH bpf-next v2 04/13] selftests/bpf: expand bpf tunnel test to ipv6

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn The test only uses ipv4 so far, expand to ipv6. This is mostly a boilerplate near copy of the ipv4 path. Signed-off-by: Willem de Bruijn --- tools/testing/selftests/bpf/config| 2 + .../selftests/bpf/progs/test_tc_tunnel.c | 116 +++--- too

[PATCH bpf-next v2 06/13] selftests/bpf: extend bpf tunnel test with tso

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn Segmentation offload takes a longer path. Verify that the feature works with large packets. The test succeeds if not setting dodgy in bpf_skb_adjust_room, as veth TSO is permissive. If not setting SKB_GSO_DODGY, this enables tunneled TSO offload on supporting NICs. The f

[PATCH bpf-next v2 09/13] bpf: add bpf_skb_adjust_room encap flags

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn When pushing tunnel headers, annotate skbs in the same way as tunnel devices. For GSO packets, the network stack requires certain fields set to segment packets with tunnel headers. gre_gso_segment depends on transport and inner mac header, for instance. Add an option to p

[PATCH bpf-next v2 08/13] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_FIXED_GSO

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn bpf_skb_adjust_room adjusts gso_size of gso packets to account for the pushed or popped header room. This is not allowed with UDP, where gso_size delineates datagrams. Add an option to avoid these updates and allow this call for datagrams. It can also be used with TCP, wh

[PATCH bpf-next v2 02/13] selftests/bpf: bpf tunnel encap test

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn Validate basic tunnel encapsulation using ipip. Set up two namespaces connected by veth. Connect a client and server. Do this with and without bpf encap. Signed-off-by: Willem de Bruijn --- tools/testing/selftests/bpf/Makefile | 3 +- .../selftests/bpf/progs/t

[PATCH bpf-next v2 11/13] selftests/bpf: convert bpf tunnel test to BPF_ADJ_ROOM_MAC

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn Avoid moving the network layer header when prefixing tunnel headers. This avoids an explicit call to bpf_skb_store_bytes and an implicit move of the network header bytes in bpf_skb_adjust_room. Signed-off-by: Willem de Bruijn --- .../selftests/bpf/progs/test_tc_tunnel.c

[PATCH bpf-next v2 12/13] selftests/bpf: convert bpf tunnel test to BPF_F_ADJ_ROOM_FIXED_GSO

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn Lower route MTU to ensure packets fit in device MTU after encap, then skip the gso_size changes. Signed-off-by: Willem de Bruijn --- tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 11 --- tools/testing/selftests/bpf/test_tc_tunnel.sh | 6 ++ 2 fil

[PATCH bpf-next v2 13/13] selftests/bpf: convert bpf tunnel test to encap modes

2019-03-22 Thread Willem de Bruijn
From: Willem de Bruijn Make the tests correctly annotate skbs with tunnel metadata. This makes the gso tests succeed. Enable them. Signed-off-by: Willem de Bruijn --- .../selftests/bpf/progs/test_tc_tunnel.c | 19 +++ tools/testing/selftests/bpf/test_tc_tunnel.sh | 10 +++

Re: [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc

2019-03-22 Thread Eric Dumazet
On 03/22/2019 08:01 AM, Paolo Abeni wrote: > The queue is marked not empty after acquiring the seqlock, > and it's up to the NOLOCK qdisc clearing such flag on dequeue. > Since the empty status lays on the same cache-line of the > seqlock, it's always hot on cache during the updates. > > This m

Re: [PATCH v2 net-next 3/3] tcp: add one skb cache for rx

2019-03-22 Thread Eric Dumazet
On 03/22/2019 08:00 AM, kbuild test robot wrote: > Hi Eric, > > I love your patch! Yet something to improve: > > [auto build test ERROR on net-next/master] > > url: > https://github.com/0day-ci/linux/commits/Eric-Dumazet/tcp-add-rx-tx-cache-to-reduce-lock-co

Re: [PATCH net-next v3 1/2] net: sched: add empty status flag for NOLOCK qdisc

2019-03-22 Thread Ivan Vecera
On 22. 03. 19 16:01, Paolo Abeni wrote: The queue is marked not empty after acquiring the seqlock, and it's up to the NOLOCK qdisc clearing such flag on dequeue. Since the empty status lays on the same cache-line of the seqlock, it's always hot on cache during the updates. This makes the empty f

Re: [PATCH bpf-next v2 09/13] bpf: add bpf_skb_adjust_room encap flags

2019-03-22 Thread Alexei Starovoitov
On Fri, Mar 22, 2019 at 8:15 AM Willem de Bruijn wrote: > > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1) > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2) > +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK (BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 | \ > +BPF_F_ADJ_ROO

Re: [RFC 1/1] net/tun: fdinfo to relate TAP/TUN network interfaces unambiguously with their serving processes

2019-03-22 Thread Stephen Hemminger
On Fri, 22 Mar 2019 08:40:17 +0100 "Harald Albrecht" wrote: > This is a first for me to appear on the Linux kernel netdev list, so please > bear with me while killing me... > > For diagnosing Linux virtual networking I need to relate TAP/TUN network > interfaces to the user space processes th

Fw: [Bug 202997] New: High UDP traffic results in packet receive errors and system-wide UDP failure

2019-03-22 Thread Stephen Hemminger
Not sure if this is a new problem, or just another case of "don't expect fragmented packets to be reliable". Begin forwarded message: Date: Thu, 21 Mar 2019 22:42:41 + From: bugzilla-dae...@bugzilla.kernel.org To: step...@networkplumber.org Subject: [Bug 202997] New: High UDP traffic results in

Re: [PATCH bpf-next v2 09/13] bpf: add bpf_skb_adjust_room encap flags

2019-03-22 Thread Willem de Bruijn
On Fri, Mar 22, 2019 at 11:44 AM Alexei Starovoitov wrote: > > On Fri, Mar 22, 2019 at 8:15 AM Willem de Bruijn > wrote: > > > > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1) > > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6 (1ULL << 2) > > +#define BPF_F_ADJ_ROOM_ENCAP_L3_MASK (BPF_F_ADJ_ROOM_

[BACKPORT 4.4.y 18/25] tcp/dccp: drop SYN packets if accept queue is full

2019-03-22 Thread Arnd Bergmann
From: Eric Dumazet Per listen(fd, backlog) rules, there is really no point accepting a SYN, sending a SYNACK, and dropping the following ACK packet if accept queue is full, because application is not draining accept queue fast enough. This behavior is fooling TCP clients that believe they establ
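The reasoning above can be made concrete with a small userspace model. It is a sketch only: the names are hypothetical, and the "full" test uses a simple `>=` comparison as an assumption, which need not match the exact comparison the kernel performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy listening socket; names are hypothetical. */
struct toy_listener {
	unsigned int acceptq_len;  /* established connections waiting for accept() */
	unsigned int backlog;      /* limit requested via listen(fd, backlog) */
};

bool toy_acceptq_is_full(const struct toy_listener *l)
{
	return l->acceptq_len >= l->backlog;
}

/* Returns true if the SYN is answered with a SYNACK, false if it is dropped. */
bool toy_handle_syn(const struct toy_listener *l)
{
	if (toy_acceptq_is_full(l))
		return false;  /* drop early rather than promise a connection we cannot queue */
	return true;
}

/* The final ACK of the handshake moves the connection into the accept queue. */
bool toy_handle_ack(struct toy_listener *l)
{
	if (toy_acceptq_is_full(l))
		return false;
	l->acceptq_len++;
	return true;
}

/* The application draining one connection with accept(). */
void toy_accept(struct toy_listener *l)
{
	assert(l->acceptq_len > 0);
	l->acceptq_len--;
}
```

Dropping at SYN time means the client never sees a SYNACK for a connection the server would discard anyway, so it retries the SYN instead of believing the connection is established.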

[PATCH v3 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-22 Thread Eric Dumazet
On hosts with many cpus we can observe a very serious contention on spinlocks used in mm slab layer. The following can happen quite often : 1) TX path sendmsg() allocates one (fclone) skb on CPU A, sends a clone. ACK is received on CPU B, and consumes the skb that was in the retransmit queu

[PATCH v3 net-next 1/3] net: convert rps_needed and rfs_needed to new static branch api

2019-03-22 Thread Eric Dumazet
We prefer static_branch_unlikely() over static_key_false() these days. Signed-off-by: Eric Dumazet Acked-by: Soheil Hassas Yeganeh Acked-by: Willem de Bruijn --- drivers/net/tun.c | 2 +- include/linux/netdevice.h | 4 ++-- include/net/sock.h | 2 +- net/core/dev.c

[PATCH v3 net-next 2/3] tcp: add one skb cache for tx

2019-03-22 Thread Eric Dumazet
On hosts with a lot of cores, RPC workloads suffer from heavy contention on slab spinlocks. 20.69% [kernel] [k] queued_spin_lock_slowpath 5.64% [kernel] [k] _raw_spin_lock 3.83% [kernel] [k] syscall_return_via_sysret 3.48% [kernel] [k] __entry_text_s
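The preview is cut off before the mechanism is described, so the following is only a guess at the general shape of a "one buffer per socket" cache, written as a userspace sketch. All names (`toy_sock`, `toy_buf_alloc`, `toy_buf_free`) are hypothetical, and `malloc`/`free` stand in for the slab allocator whose lock contention the series is about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_buf {
	char data[256];
};

/* Toy socket holding at most one parked buffer; names are hypothetical. */
struct toy_sock {
	struct toy_buf *cache;          /* NULL when the slot is free */
	unsigned long allocator_calls;  /* trips to malloc()/free() */
};

/* Prefer the parked buffer; fall back to the shared allocator. */
struct toy_buf *toy_buf_alloc(struct toy_sock *sk)
{
	struct toy_buf *b = sk->cache;

	if (b) {
		sk->cache = NULL;
		return b;
	}
	sk->allocator_calls++;
	return malloc(sizeof(*b));
}

/* Park the buffer if the slot is free; otherwise really free it. */
void toy_buf_free(struct toy_sock *sk, struct toy_buf *b)
{
	assert(sk != NULL);
	if (!b)
		return;
	if (!sk->cache) {
		sk->cache = b;
		return;
	}
	sk->allocator_calls++;
	free(b);
}

/* Release whatever is still parked when the socket goes away. */
void toy_sock_destroy(struct toy_sock *sk)
{
	free(sk->cache);
	sk->cache = NULL;
}
```

For a request/response workload that frees one buffer and then allocates one, this model touches the shared allocator only on the first allocation, which is the kind of saving the cover letter hints at.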

[PATCH v3 net-next 3/3] tcp: add one skb cache for rx

2019-03-22 Thread Eric Dumazet
Often times, recvmsg() system calls and BH handling for a particular TCP socket are done on different cpus. This means the incoming skb had to be allocated on a cpu, but freed on another. This incurs a high spinlock contention in slab layer for small rpc, but also a high number of cache line ping

Re: [PATCH net-next 1/3] net: convert rps_needed and rfs_needed to new static branch api

2019-03-22 Thread kbuild test robot
Hi Eric, I love your patch! Yet something to improve: [auto build test ERROR on net-next/master] url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/net-convert-rps_needed-and-rfs_needed-to-new-static-branch-api/20190322-211954 config: ia64-allmodconfig (attached as .config) compiler

Re: [PATCH bpf-next v2 09/13] bpf: add bpf_skb_adjust_room encap flags

2019-03-22 Thread Alexei Starovoitov
On Fri, Mar 22, 2019 at 8:48 AM Willem de Bruijn wrote: > > On Fri, Mar 22, 2019 at 11:44 AM Alexei Starovoitov > wrote: > > > > On Fri, Mar 22, 2019 at 8:15 AM Willem de Bruijn > > wrote: > > > > > > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV4 (1ULL << 1) > > > +#define BPF_F_ADJ_ROOM_ENCAP_L3_IPV6

Re: [pull request][net-next V2 00/15] Mellanox, mlx5 updates 2019-03-20

2019-03-22 Thread Saeed Mahameed
On Thu, Mar 21, 2019 at 3:51 PM Saeed Mahameed wrote: > > Hi Dave, > > This series includes mlx5 updates. > For more information please see tag log below. > > Please pull and let me know if there is any problem. > Hi Dave, I would like to re-spin and send V3 to move one patch out of this series i

Re: [PATCH net-next] ipv6: Move ipv6 stubs to a separate header file

2019-03-22 Thread Alexei Starovoitov
On Fri, Mar 22, 2019 at 06:06:09AM -0700, David Ahern wrote: > From: David Ahern > > The number of stubs is growing and has nothing to do with addrconf. > Move the definition of the stubs to a separate header file and update > users. In the move, drop the vxlan specific comment before ipv6_stub.

Re: [PATCH net-next] ipv6: Move ipv6 stubs to a separate header file

2019-03-22 Thread David Ahern
On 3/22/19 5:14 PM, Alexei Starovoitov wrote: > On Fri, Mar 22, 2019 at 06:06:09AM -0700, David Ahern wrote: >> From: David Ahern >> >> The number of stubs is growing and has nothing to do with addrconf. >> Move the definition of the stubs to a separate header file and update >> users. In the move

Re: [PATCH bpf-next v2 09/13] bpf: add bpf_skb_adjust_room encap flags

2019-03-22 Thread Willem de Bruijn
On Fri, Mar 22, 2019 at 12:11 PM Alexei Starovoitov wrote: > > On Fri, Mar 22, 2019 at 8:48 AM Willem de Bruijn > wrote: > > > > On Fri, Mar 22, 2019 at 11:44 AM Alexei Starovoitov > > wrote: > > > > > > On Fri, Mar 22, 2019 at 8:15 AM Willem de Bruijn > > > wrote: > > > > > > > > +#define BPF_

Re: [PATCH v3 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-22 Thread Tariq Toukan
On 3/22/2019 5:56 PM, Eric Dumazet wrote: On hosts with many cpus we can observe a very serious contention on spinlocks used in mm slab layer. The following can happen quite often : 1) TX path sendmsg() allocates one (fclone) skb on CPU A, sends a clone. ACK is received on CPU B, and c

Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports

2019-03-22 Thread Jiri Pirko
Thu, Mar 21, 2019 at 06:34:22PM CET, pa...@mellanox.com wrote: > > >> -Original Message- >> From: Jiri Pirko >> Sent: Thursday, March 21, 2019 12:21 PM >> To: Parav Pandit >> Cc: Jakub Kicinski ; Samudrala, Sridhar >> ; da...@davemloft.net; >> netdev@vger.kernel.org; oss-driv...@netronome

Re: [PATCH v3 net-next 0/3] tcp: add rx/tx cache to reduce lock contention

2019-03-22 Thread Eric Dumazet
On Fri, Mar 22, 2019 at 9:37 AM Tariq Toukan wrote: > > > > Hi Eric, > > Does this have any effect on non tcp traffic? Of course not :)

[patch net-next v2 04/15] bnxt: set devlink port attrs properly

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Set the attrs properly so devlink has enough info to generate physical port names. Signed-off-by: Jiri Pirko --- drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/

[patch net-next v2 01/15] net: devlink: add couple of missing mutex_destroy() calls

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Add missing calls to mutex_destroy() for two mutexes used in devlink code. Signed-off-by: Jiri Pirko --- net/core/devlink.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/core/devlink.c b/net/core/devlink.c index 78e22cea4cc7..3dc51ddf7451 100644 --- a/net/core/dev

[patch net-next v2 00/15] devlink: small spring cleanup

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Mostly cosmetics and janitor work. Jiri Pirko (15): net: devlink: add couple of missing mutex_destroy() calls bnxt: add missing net/devlink.h include dsa: add missing net/devlink.h include bnxt: set devlink port attrs properly bnxt: call devlink_port_type_eth_set() bef

[patch net-next v2 02/15] bnxt: add missing net/devlink.h include

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko devlink functions are in use, so include the related header file. Signed-off-by: Jiri Pirko --- drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broad

[patch net-next v2 03/15] dsa: add missing net/devlink.h include

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko devlink functions are in use, so include the related header file. Signed-off-by: Jiri Pirko Reviewed-by: Andrew Lunn --- net/dsa/dsa2.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c index c00ee464afc7..4558de672b4f 100644 --- a/net/dsa/ds

[patch net-next v2 07/15] net: devlink: don't pass return value of __devlink_port_type_set()

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko __devlink_port_type_set() returns void, it makes no sense to pass it on, so don't do that. Signed-off-by: Jiri Pirko --- net/core/devlink.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/net/core/devlink.c b/net/core/devlink.c index 1e125c3b890c..

[patch net-next v2 10/15] net: devlink: disallow port_attrs_set() to be called before register

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Since the port attributes are static and cannot change during the port lifetime, WARN_ON if some driver calls it after registration. Also, no need to call notifications as it is noop anyway due to check of devlink_port->registered there. Signed-off-by: Jiri Pirko --- net/core/

[patch net-next v2 06/15] net: devlink: don't take devlink_mutex for devlink_compat_*

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko The netdevice is guaranteed not to disappear, so we can rely on devlink_port and devlink not disappearing either. There is no need to take devlink_mutex, so don't take it here. Signed-off-by: Jiri Pirko --- net/core/devlink.c | 18 -- 1 file changed, 8 insertions(+), 10

[patch net-next v2 09/15] dsa: move devlink_port_attrs_set() call before register

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Since attrs are static during the existence of devlink port, set them before registration of the port. Signed-off-by: Jiri Pirko --- v1->v2: - fixed comment for port numbering --- net/dsa/dsa2.c | 47 ++- 1 file changed, 26 insertions

[patch net-next v2 05/15] bnxt: call devlink_port_type_eth_set() before port register

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Call devlink_port_type_eth_set() before devlink_port_register(). Bnxt instances won't change type during lifetime. This avoids one extra userspace devlink notification. Signed-off-by: Jiri Pirko --- drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 2 +- 1 file changed, 1 in

[patch net-next v2 14/15] net: devlink: add port type spinlock

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Add spinlock to protect port type and type_dev pointer consistency. Without that, userspace may see inconsistent type and type_dev combinations. Signed-off-by: Jiri Pirko v1->v2: - rebased --- include/net/devlink.h | 4 net/core/devlink.c| 17 + 2 fil

[patch net-next v2 15/15] net: devlink: select NET_DEVLINK from drivers

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Some drivers are becoming more dependent on NET_DEVLINK being selected in configuration. With upcoming compat functions, the behavior would be wrong in case devlink was not compiled in. So make the drivers select NET_DEVLINK and rely on the functions being there, not just stubs.

[patch net-next v2 11/15] nfp: move devlink port type set after netdev registration

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Similar to other drivers, move the port type set after netdev registration is done. Along with that, clear the type before unregistration. Signed-off-by: Jiri Pirko --- v1->v2: - new patch --- drivers/net/ethernet/netronome/nfp/nfp_devlink.c | 11 ++- drivers/net/ether

[patch net-next v2 12/15] bnxt: set devlink port type after registration

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Move the type set of devlink port after it is registered. Signed-off-by: Jiri Pirko --- v1->v2: - new patch --- drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink

[patch net-next v2 08/15] mlxsw: Move devlink_port_attrs_set() call before register

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko Since attrs are static during the existence of devlink port, set them before registration of the port. Signed-off-by: Jiri Pirko --- drivers/net/ethernet/mellanox/mlxsw/core.c | 12 ++-- drivers/net/ethernet/mellanox/mlxsw/core.h | 8 drivers/net/ether

[patch net-next v2 13/15] net: devlink: warn on setting type on unregistered port

2019-03-22 Thread Jiri Pirko
From: Jiri Pirko The port needs to be registered before the type is set. Warn and bail out if it is not. Signed-off-by: Jiri Pirko --- v1->v2: - new patch --- net/core/devlink.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/core/devlink.c b/net/core/devlink.c index f77a68f72

Re: [PATCH net-next] ipv6: Move ipv6 stubs to a separate header file

2019-03-22 Thread Alexei Starovoitov
On Fri, Mar 22, 2019 at 05:17:52PM +0100, David Ahern wrote: > On 3/22/19 5:14 PM, Alexei Starovoitov wrote: > > On Fri, Mar 22, 2019 at 06:06:09AM -0700, David Ahern wrote: > >> From: David Ahern > >> > >> The number of stubs is growing and has nothing to do with addrconf. > >> Move the definitio

Re: [iproute2 PATCH] ip: bridge: add mcast to unicast config flag

2019-03-22 Thread Stephen Hemminger
On Thu, 21 Mar 2019 09:32:39 +0100 Tobias Jungel wrote: > This adds configuration for the IFLA_BRPORT_MCAST_TO_UCAST flag that > allows multicast packets to be replicated as unicast packets. > > Signed-off-by: Tobias Jungel Applied. And this motivated me to fix the long lines in the man pages

Re: [PATCH net v2 1/1] net: sched: fix cleanup NULL pointer exception in act_mirr

2019-03-22 Thread Cong Wang
On Fri, Mar 22, 2019 at 5:37 AM John Hurley wrote: > > A new mirred action is created by the tcf_mirred_init function. This > contains a list head struct which is inserted into a global list on > successful creation of a new action. However, after a creation, it is > still possible to error out an

[net-next 13/15] ice: Set LAN_EN for all directional rules

2019-03-22 Thread Jeff Kirsher
From: Yashaswini Raghuram Prathivadi Bhayankaram The LAN_EN bit for a switch rule determines if the packet can go out on the wire or not. Set the LAN_EN flag in the switch action for all directional rules. Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram Signed-off-by: Anirudh Venkat

[net-next 14/15] ice: Don't let VF know that it is untrusted

2019-03-22 Thread Jeff Kirsher
From: Akeem G Abodunrin Don't let the VF know it's not trusted when it tries to add more than permitted additional MAC addresses. Signed-off-by: Akeem G Abodunrin Signed-off-by: Anirudh Venkataramanan Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/ice/ic

[net-next 00/15][pull request] 100GbE Intel Wired LAN Driver Updates 2019-03-22

2019-03-22 Thread Jeff Kirsher
This series contains updates to the ice driver only. Akeem enables MAC anti-spoofing by default when a new VSI is being created. Fixes an issue when reclaiming VF resources back to the pool after reset, by freeing VF resources separately using the first VF vector index to traverse the list, instead o

[net-next 09/15] ice: code cleanup in ice_sched.c

2019-03-22 Thread Jeff Kirsher
From: Victor Raj This patch does some clean up in the Tx scheduler code: 1. Adjust the stack variable usage 2. Modify the debug prints to display the FW error 3. Add additional debug prints while adding/removing VSIs Signed-off-by: Victor Raj Reviewed-by: Bruce Allan Signed-off-by: Anirudh Ve

[net-next 07/15] ice: fix some function prototype and signature style issues

2019-03-22 Thread Jeff Kirsher
From: Bruce Allan Put the return type on a separate line for function prototypes and signatures that would exceed the 80-character limit if both were on the same line. Signed-off-by: Bruce Allan Signed-off-by: Anirudh Venkataramanan Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- d

[net-next 01/15] ice: Enable MAC anti-spoof by default

2019-03-22 Thread Jeff Kirsher
From: Akeem G Abodunrin This patch enables MAC anti-spoof by default, with creation of VF VSIs or when the VF VSIs are being re-initialized. Signed-off-by: Akeem G Abodunrin Signed-off-by: Anirudh Venkataramanan Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/i
