On Mon, Oct 22, 2018 at 6:13 AM Paolo Abeni <pab...@redhat.com> wrote: > > On Sun, 2018-10-21 at 16:06 -0400, Willem de Bruijn wrote: > > On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni <pab...@redhat.com> wrote: > > > > > > This is the RX counterpart of commit bec1f6f69736 ("udp: generate gso > > > with UDP_SEGMENT"). When UDP_GRO is enabled, such socket is also > > > eligible for GRO in the rx path: UDP segments directed to such socket > > > are assembled into a larger GSO_UDP_L4 packet. > > > > > > The core UDP GRO support is enabled with setsockopt(UDP_GRO). > > > > > > Initial benchmark numbers: > > > > > > Before: > > > udp rx: 1079 MB/s 769065 calls/s > > > > > > After: > > > udp rx: 1466 MB/s 24877 calls/s > > > > > > > > > This change introduces a side effect in respect to UDP tunnels: > > > after a UDP tunnel creation, now the kernel performs a lookup per ingress > > > UDP packet, while before such lookup happened only if the ingress packet > > > carried a valid internal header csum. > > > > > > v1 -> v2: > > > - use a new option to enable UDP GRO > > > - use static keys to protect the UDP GRO socket lookup > > > > > > Signed-off-by: Paolo Abeni <pab...@redhat.com> > > > --- > > > include/linux/udp.h | 3 +- > > > include/uapi/linux/udp.h | 1 + > > > net/ipv4/udp.c | 7 +++ > > > net/ipv4/udp_offload.c | 109 +++++++++++++++++++++++++++++++-------- > > > net/ipv6/udp_offload.c | 6 +-- > > > 5 files changed, 98 insertions(+), 28 deletions(-) > > > > > > diff --git a/include/linux/udp.h b/include/linux/udp.h > > > index a4dafff407fb..f613b329852e 100644 > > > --- a/include/linux/udp.h > > > +++ b/include/linux/udp.h > > > @@ -50,11 +50,12 @@ struct udp_sock { > > > __u8 encap_type; /* Is this an Encapsulation > > > socket? */ > > > unsigned char no_check6_tx:1,/* Send zero UDP6 checksums on > > > TX? */ > > > no_check6_rx:1,/* Allow zero UDP6 checksums on > > > RX? */ > > > - encap_enabled:1; /* This socket enabled encap > > > + encap_enabled:1, /* This socket enabled encap > > > * processing; UDP tunnels and > > > * different encapsulation > > > layer set > > > * this > > > */ > > > + gro_enabled:1; /* Can accept GRO packets */ > > > > > > /* > > > * Following member retains the information to create a UDP header > > > * when the socket is uncorked. > > > diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h > > > index 09502de447f5..30baccb6c9c4 100644 > > > --- a/include/uapi/linux/udp.h > > > +++ b/include/uapi/linux/udp.h > > > @@ -33,6 +33,7 @@ struct udphdr { > > > #define UDP_NO_CHECK6_TX 101 /* Disable sending checksum for UDP6X */ > > > #define UDP_NO_CHECK6_RX 102 /* Disable accpeting checksum for UDP6 */ > > > #define UDP_SEGMENT 103 /* Set GSO segmentation size */ > > > +#define UDP_GRO 104 /* This socket can receive UDP > > > GRO packets */ > > > > > > /* UDP encapsulation types */ > > > #define UDP_ENCAP_ESPINUDP_NON_IKE 1 /* > > > draft-ietf-ipsec-nat-t-ike-00/01 */ > > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > > > index 9fcb5374e166..3c277378814f 100644 > > > --- a/net/ipv4/udp.c > > > +++ b/net/ipv4/udp.c > > > @@ -115,6 +115,7 @@ > > > #include "udp_impl.h" > > > #include <net/sock_reuseport.h> > > > #include <net/addrconf.h> > > > +#include <net/udp_tunnel.h> > > > > > > struct udp_table udp_table __read_mostly; > > > EXPORT_SYMBOL(udp_table); > > > @@ -2459,6 +2460,12 @@ int udp_lib_setsockopt(struct sock *sk, int level, > > > int optname, > > > up->gso_size = val; > > > break; > > > > > > + case UDP_GRO: > > > + if (valbool) > > > + udp_tunnel_encap_enable(sk->sk_socket); > > > + up->gro_enabled = valbool; > > > > The socket lock is not held here, so multiple updates to > > up->gro_enabled and the up->encap_enabled and the static branch can > > race. Syzkaller is adept at generating those. > > Good catch. I was fooled by the current existing code. I think there > are potentially similar issues for UDP_ENCAP, UDPLITE_SEND_CSCOV, ... > > Since the rx path don't take it anymore and we don't risk starving, I > think we should could/always acquire the socket lock on setsockopt, > wdyt?
Agreed. We had to add a lot of those in packet_setsockopt for the same reason.