The check in harmonize_features() is supposed to match the skb against the features of the device in question, and prevent us from handing a skb to a device which can't handle it.
It doesn't work correctly. A device with NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM capabilities is only required to checksum TCP or UDP, on Legacy IP and IPv6 respectively. But the existing check will allow us to pass it *any* ETH_P_IP/ETH_P_IPV6 packets for hardware checksum offload. Depending on the driver in use, this leads to a BUG, a WARNING, or just silent data corruption. This is one approach to fixing that, and my test program at http://bombadil.infradead.org/~dwmw2/raw.c can no longer trivially reproduce the problem. The test does now have false *negatives*, but those shouldn't happen for locally-generated packets; only packets injected from af_packet, tun, virtio_net and other places that allow us to inject CHECKSUM_PARTIAL packets in order to make use of hardware offload features. And false negatives aren't anywhere near as much of a problem as false positives are — we just finish the checksum in software and send the packet anyway. It would be possible to fix those false negatives, if we really wanted to — perhaps by adding an additional bit in the skbuff which indicates that it *is* a TCP or UDP packet, rather than using ->sk->sk_protocol. Then that bit could be set if appropriate in skb_partial_csum_set(), as well as the places where we locally generate such packets. And the check in can_checksum_protocol() would just check for that bit. Signed-off-by: David Woodhouse <david.woodho...@intel.com> --- Since can_checksum_protocol is inline, the compiler ought to know that it doesn't even need to dereference skb->sk in the case where the device has the NETIF_F_GEN_CSUM feature. So the additional check should not slow down the (hopefully) common case in the fast path. diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 2d15e38..76c8330 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3628,15 +3628,23 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, netdev_features_t features) __be16 skb_network_protocol(struct sk_buff *skb, int *depth); static inline bool can_checksum_protocol(netdev_features_t features, - __be16 protocol) -{ - return ((features & NETIF_F_GEN_CSUM) || - ((features & NETIF_F_V4_CSUM) && - protocol == htons(ETH_P_IP)) || - ((features & NETIF_F_V6_CSUM) && - protocol == htons(ETH_P_IPV6)) || - ((features & NETIF_F_FCOE_CRC) && - protocol == htons(ETH_P_FCOE))); + __be16 protocol, u8 sk_protocol) +{ + if ((features & NETIF_F_GEN_CSUM) || + ((features & NETIF_F_FCOE_CRC) && protocol == htons(ETH_P_FCOE))) + return 1; + + /* NETIF_F_V[46]_CSUM are defined to work only on TCP and UDP. + * That is, when it needs to start checksumming at the transport + * header, and place the result at an offset of either 6 (for UDP) + * or 16 (for TCP). + */ + if ((((features & NETIF_F_V4_CSUM) && protocol == htons(ETH_P_IP)) || + ((features & NETIF_F_V6_CSUM) && protocol == htons(ETH_P_IPV6))) && + (sk_protocol == IPPROTO_TCP || sk_protocol == IPPROTO_UDP)) + return 1; + + return 0; } #ifdef CONFIG_BUG diff --git a/net/core/dev.c b/net/core/dev.c index 6bb6470..3c40957 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2613,7 +2613,8 @@ static netdev_features_t harmonize_features(struct sk_buff *skb, features = net_mpls_features(skb, features, type); if (skb->ip_summed != CHECKSUM_NONE && - !can_checksum_protocol(features, type)) { + !can_checksum_protocol(features, type, + skb->sk ? skb->sk->sk_protocol : 0)) { features &= ~NETIF_F_ALL_CSUM; } else if (illegal_highdma(skb->dev, skb)) { features &= ~NETIF_F_SG; diff --git a/net/core/skbuff.c b/net/core/skbuff.c index dad4dd3..9126c6f 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3004,7 +3004,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb, return ERR_PTR(-EINVAL); csum = !head_skb->encap_hdr_csum && - !!can_checksum_protocol(features, proto); + !!can_checksum_protocol(features, proto, + head_skb->sk ? head_skb->sk->sk_protocol : 0); headroom = skb_headroom(head_skb); pos = skb_headlen(head_skb); -- David Woodhouse Open Source Technology Centre david.woodho...@intel.com Intel Corporation
smime.p7s
Description: S/MIME cryptographic signature