On Mon, Jan 14, 2019 at 10:15 AM Willem de Bruijn <willemdebruijn.ker...@gmail.com> wrote: > > On Mon, Jan 14, 2019 at 9:51 AM Nicolas Dichtel > <nicolas.dich...@6wind.com> wrote: > > > > Le 12/01/2019 à 19:00, Willem de Bruijn a écrit : > > > On Fri, Jan 11, 2019 at 4:29 PM Willem de Bruijn > > > <willemdebruijn.ker...@gmail.com> wrote: > > >> > > >> On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel > > >> <nicolas.dich...@6wind.com> wrote: > > >>> > > >>> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in > > >>> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel: > > >>> > > >>> Here is a example of the setup: > > >>> $ ip link set ntfp2 up > > >>> $ ip addr add 10.125.0.1/24 dev ntfp2 > > >>> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 > > >>> dev ntfp2 > > >>> $ ip addr add fd00:cafe:cafe::1/128 dev tun1 > > >>> $ ip link set dev tun1 up > > >>> $ ip route add fd00:200::/64 dev tun1 > > >>> $ scapy > > >>>>>> p = [] > > >>>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest() > > >>>>>> send(p, count=1, inter=0.1) > > >>>>>> quit() > > >>> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors" > > >>> TX: bytes packets errors dropped carrier collsns > > >>> 0 0 1 0 0 0 > > >>> > > >>> The problem is that the network offset is set to the hard_header_len of > > >>> the > > >>> output device (tun1, ie 14 + 20) and in our case, because the packet is > > >>> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 > > >>> bytes > > >>> (ipv6 header) starting from the network offset). Let's reset the network > > >>> offset so that it points to the beginning of the L3 data, ie skb->data. > > >> > > >> This only just landed in my inbox, sorry for the delay. So far I'm not > > >> reproducing the issue, but I'm trying and am having a look. > > >> > > >> Which pskb_inet_may_pull() is failing? > > > > > > Okay. I see what's going on. > > > > > > Commit cb9f1b783850 converted a pskb_may_pull ipv6hdr, pulling from > > > skb->data, to pskb_network_may_pull, pulling from skb->network_header. > > > > > > Normally packet sockets with SOCK_RAW write a fixed size hardware > > > header and place skb->network_header directly after that. Dropping > > > these in sit_tunnel_xmit if they fail pskb_inet_may_pull is correct. > > > > > > But with devices with variable length hardware headers like sit, it is > > > possible to cook packets that are shorter than the upper bound hhlen. > > > In that case, we have no way of knowing where the variable length > > > header ends (short of device specific parsing if implemented, see also > > > header_ops.validate), so we set the network header to the same as the > > > mac header as of commit 993675a3100b1. > > > > > > This is not a good thing to do in general, and not needed in the > > > common case. > > Sorry, but I fail to understand what we try to achieve here. In this case, > > the > > packet socket builds a packet without a hard header: > > > > <-------48B---------> > > <--40B---> > > +--------+----------+ > > |IPv6 hdr|ICMPv6 hdr| > > +--------+----------+ > > ^ skb->data > > ^ nh offset: 34 (ie dev->hard_header_len, ie ethernet hdr + ipv4 hdr) > > > > The nh offset does not match any data in the packet. > > > > dev_validate_header() checks that datalen (ie 48) is >= > > dev->hard_header_len (ie > > 34). Still no hard header here. > > > > Then, dev->xmit (ie sit_tunnel_xmit()) is called. This function calls > > pskb_inet_may_pull() which tries to pull 40B from the nh offset (34). But > > the nh > > offset still points somewhere in the ipv6 hdr, thus the test fails (34 + 40 > > > 48). > > I don't understand why it is wrong to set the nh offset to 0 (skb->data), ie > > exactly where the ipv6 header starts. > > It is wrong because for other devices l2 header length is not zero, so > this incorrectly sets skb->network_header to the start of the link > layer header on all those devices. > > A one line variant of the above would be > > - if (len < reserve) > + if (len < reserve + sizeof(struct ipv6hdr))
and exclude the majority of devices with fixed hard header len. Those require len to be >= reserve, so this workaround does not apply to them: if (len < reserve + sizeof(struct ipv6hdr) && dev->min_header_len != dev->hard_header_len)