On Tue, Mar 22, 2016 at 2:45 PM, Alexander Duyck <alexander.du...@gmail.com> wrote: > On Tue, Mar 22, 2016 at 2:10 PM, Tom Herbert <t...@herbertland.com> wrote: >> On Tue, Mar 22, 2016 at 1:20 PM, Alexander Duyck >> <alexander.du...@gmail.com> wrote: >>> On Tue, Mar 22, 2016 at 12:19 PM, Tom Herbert <t...@herbertland.com> wrote: >>>> In skb_segment the check of whether or not to perform the checksum on >>>> host was changed to not consider rather remote checksum offload is >>>> in use. In the case that can_checksum_protocol fails the checksum >>>> is computed regardless. __skb_udp_tunnel_segment was modified in a >>>> related patch to add NETIF_F_HW_CSUM into features when grabbing >>>> the enc_features and remote checksum offload is being done. The >>>> problem is that this bit can be cleared in lower GSO layers that >>>> are also doing tunneling (e.g. ipip, GRE when used with GUE), >>>> so when we get to skb_segment that intent has been lost and >>>> can_checksum_protocol fails. >>> >>> So what you are describing sounds like a tunnel in tunnel scenario. >>> It might work better to just skip masking the features if >>> skb->remcsum_offload is set rather than trying to change how we >>> perform the offload. >>> >> To be clear, my patch is restoring the old behavior not implementing a new >> one. > > Yes, but the old behavior could lead to kernel panics under certain > circumstances. > >>> I'm pretty sure this will cause data corruption and maybe a kernel >>> panic if Tx checksum offload is disabled. >>> >> Nope, working fine for me. > > What are the options you used to create the tunnel? Did you enable > both remcsum and udpcsum? > modprobe fou ./ip fou add port 6080 gue ./ip link add name tun1 type ipip remote 10.1.1.2 local 10.1.1.2 ttl 225 encap gue encap-sport auto encap-dport 6080 encap-csum encap-remcsum ifconfig tun1 192.168.1.1 ip route add 192.168.1.0/24 dev tun1
Thanks, Tom >> The outer checksum would be computed in >>>> This patch: >>>> >>>> - Restores the check in skb_segment to look at remote checksum offload. >>>> - Removes the code in __skb_udp_tunnel_segment to force the >>>> NETIF_F_HW_CSUM feature since this is no longer useful with above >>>> change. >>>> - Removes check for remote checksum offload in gso_reset_checksum. >>>> A special case should not be needed here. >>>> >>>> Tested: Single netperf STREAM over GUE-ipip >>>> >>>> Before fix: >>>> 5625 Mbps >>>> After fix: >>>> 6410 Mbps >>>> >>>> Fixes: 76443456227097179c1482 ("net: Move GSO csum into SKB_GSO_CB") >>>> Signed-off-by: Tom Herbert <t...@herbertland.com> >>>> --- >>>> include/linux/skbuff.h | 4 ---- >>>> net/core/skbuff.c | 5 ++--- >>>> net/ipv4/udp_offload.c | 10 ---------- >>>> 3 files changed, 2 insertions(+), 17 deletions(-) >>>> >>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h >>>> index 15d0df9..f6fe8a5 100644 >>>> --- a/include/linux/skbuff.h >>>> +++ b/include/linux/skbuff.h >>>> @@ -3615,10 +3615,6 @@ static inline int gso_pskb_expand_head(struct >>>> sk_buff *skb, int extra) >>>> >>>> static inline void gso_reset_checksum(struct sk_buff *skb, __wsum res) >>>> { >>>> - /* Do not update partial checksums if remote checksum is enabled. >>>> */ >>>> - if (skb->remcsum_offload) >>>> - return; >>>> - >>>> SKB_GSO_CB(skb)->csum = res; >>>> SKB_GSO_CB(skb)->csum_start = skb_checksum_start(skb) - skb->head; >>>> } >>> >>> I'm pretty sure this part here will break things when you don't have >>> an outer offload enabled. What NIC did you test this on? Did you try >>> disabling the Tx checksum support in the hardware to see what would >>> happen? >>> >> Mellanox. >> >> When TX checksum is disabled the outer checksum is computed in >> __skb_udp_tunnel_segment. > > Yes, but that uses gso_make_checksum which requires the csum and > csum_start fields be initialized in the SKB_GSO_CB(). > >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>> index f044f97..e4eb78d 100644 >>>> --- a/net/core/skbuff.c >>>> +++ b/net/core/skbuff.c >>>> @@ -3259,14 +3259,13 @@ skip_fraglist: >>>> nskb->truesize += nskb->data_len; >>>> >>>> perform_csum_check: >>>> - if (!csum) { >>>> + if (!csum && !nskb->remcsum_offload) { >>>> if (skb_has_shared_frag(nskb)) { >>>> err = __skb_linearize(nskb); >>>> if (err) >>>> goto err; >>>> } >>>> - if (!nskb->remcsum_offload) >>>> - nskb->ip_summed = CHECKSUM_NONE; >>>> + nskb->ip_summed = CHECKSUM_NONE; >>>> SKB_GSO_CB(nskb)->csum = >>>> skb_checksum(nskb, doffset, >>>> nskb->len - doffset, 0); >>> >>> I'm pretty sure this is going to cause a huge mess if you are >>> requesting remote checksum but cannot perform an outer checksum. One >>> of the reasons I merged these features together the way I did was >>> because we needed to perform the initial checksum to avoid causing a >>> kernel panic later on. Otherwise we don't have the SKB_GSO_CB()->csum >>> and SKB_GSO_CB()->csum_start fields populated. >>> >> Nope, if we can't perform outer checksum it is done in the host. I'm >> not seeing any problem. > > Are you running remcsum with udpcsum set or is udpcsum not set? If > you don't set udpcsum it doesn't actually compute an outer checksum > value. > >>>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c >>>> index 8a3405a..f86f1e1 100644 >>>> --- a/net/ipv4/udp_offload.c >>>> +++ b/net/ipv4/udp_offload.c >>>> @@ -78,16 +78,6 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct >>>> sk_buff *skb, >>>> >>>> features &= skb->dev->hw_enc_features; >>>> >>>> - /* The only checksum offload we care about from here on out is the >>>> - * outer one so strip the existing checksum feature flags and >>>> - * instead set the flag based on our outer checksum offload value. >>>> - */ >>>> - if (remcsum || ufo) { >>>> - features &= ~NETIF_F_CSUM_MASK; >>>> - if (!need_csum || offload_csum) >>>> - features |= NETIF_F_HW_CSUM; >>>> - } >>>> - >>>> /* segment inner packet. */ >>>> segs = gso_inner_segment(skb, features); >>>> if (IS_ERR_OR_NULL(segs)) { >>> >>> This part breaks UDP fragmentation I am pretty sure. We need this bit >>> for UFO to avoid having to perform a checksum over the data twice if >>> we are offloading the outer UDP checksum. >> >> The maybe leave this for ufo if it is a problem (please test your >> suppositions!), but this is not good for remcsum offload. > > I suspect we are supposed to be setting encap_level to 0 when we set > skb->encapsulation to 0 before processing inner headers in the case of > UDP and GRE tunnel offloads. I'll try to test and submit something in > the next hour or two. > > - Alex