On 1/14/21 3:36 PM, Jesper Dangaard Brouer wrote:
[...]
+BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb,
+          u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags)
+{
+       int ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
+       struct net_device *dev = skb->dev;
+       int skb_len, dev_len;
+       int mtu;
+
+       if (unlikely(flags & ~(BPF_MTU_CHK_SEGS)))
+               return -EINVAL;
+
+       dev = __dev_via_ifindex(dev, ifindex);
+       if (unlikely(!dev))
+               return -ENODEV;
+
+       mtu = READ_ONCE(dev->mtu);
+
+       dev_len = mtu + dev->hard_header_len;
+       skb_len = skb->len + len_diff; /* minus result pass check */
+       if (skb_len <= dev_len) {
+               ret = BPF_MTU_CHK_RET_SUCCESS;
+               goto out;
+       }
+       /* At this point, skb->len exceed MTU, but as it include length of all
+        * segments, it can still be below MTU.  The SKB can possibly get
+        * re-segmented in transmit path (see validate_xmit_skb).  Thus, user
+        * must choose if segs are to be MTU checked.  Last SKB "headlen" is
+        * checked against MTU.
+        */
+       if (skb_is_gso(skb)) {
+               ret = BPF_MTU_CHK_RET_SUCCESS;
+
+               if (!(flags & BPF_MTU_CHK_SEGS))
+                       goto out;
+
+               if (!skb_gso_validate_network_len(skb, mtu)) {
+                       ret = BPF_MTU_CHK_RET_SEGS_TOOBIG;
+                       goto out;
+               }
+
+               skb_len = skb_headlen(skb) + len_diff;
+               if (skb_len > dev_len) {
[...]
Do you have a particular use case for the BPF_MTU_CHK_SEGS?

The complaint from Maze (and others) were that when skb_is_gso then all
the MTU checks are bypassed.  This flag enables checking the GSO part
via skb_gso_validate_network_len().  We cannot enable it per default,
as you say, it is universally correct in all cases.

If there is a desire to have access to the skb_gso_validate_network_len(), I'd
keep that behind the flag then, but would drop the skb_headlen(skb) + len_diff
case given the mentioned case on rx where it would yield misleading results to
users that might be unintuitive & hard to debug.

I also don't see the flag being used anywhere in your selftests, so I presume
not as otherwise you would have added an example there?

I'm using the flag in the bpf-examples code[1], this is how I've tested
the code path.

I've not found a way to generate GSO packet via the selftests
infrastructure via bpf_prog_test_run_xattr().  I'm

[1] 
https://github.com/xdp-project/bpf-examples/blob/master/MTU-tests/tc_mtu_enforce.c

Haven't checked but likely something as prog_tests/skb_ctx.c might not be 
sufficient
to pass it into the helper. For real case you might need a netns + veth setup 
like
some of the other tests are doing and then generating TCP stream from one end 
to the
other.

Reply via email to