On Sun, Oct 11, 2020 at 2:01 PM Willem de Bruijn <willemdebruijn.ker...@gmail.com> wrote:
>
> There is agreement that hard_header_len should be the length of link
> layer headers visible to the upper layers, needed_headroom the
> additional room required for headers that are not exposed, i.e., those
> pushed inside ndo_start_xmit.
>
> The link layer header length also has to agree with the interface
> hardware type (ARPHRD_..).
>
> Tunnel devices have not always been consistent in this, but today
> "bare" ip tunnel devices without additional headers (ipip, sit, ..) do
> match this and advertise 0 byte hard_header_len. Bareudp, vxlan and
> geneve also conform to this. Known exception that probably needs to be
> addressed is sit, which still advertises LL_MAX_HEADER and so has
> exposed quite a few syzkaller issues. Side note, it is not entirely
> clear to me what sets ARPHRD_TUNNEL et al apart from ARPHRD_NONE and
> why they are needed.
>
> GRE devices advertise ARPHRD_IPGRE and GRETAP advertise ARPHRD_ETHER.
> The second makes sense, as it appears as an Ethernet device. The first
> should match "bare" ip tunnel devices, if following the above logic.
> Indeed, this is what commit e271c7b4420d ("gre: do not keep the GRE
> header around in collect medata mode") implements. It changes
> dev->type to ARPHRD_NONE in collect_md mode.
>
> Some of the inconsistency comes from the various modes of the GRE
> driver. Which brings us to ipgre_header_ops. It is set only in two
> special cases.
>
> Commit 6a5f44d7a048 ("[IPV4] ip_gre: sendto/recvfrom NBMA address")
> added ipgre_header_ops.parse to be able to receive the inner ip source
> address with PF_PACKET recvfrom. And apparently relies on
> ipgre_header_ops.create to be able to set an address, which implies
> SOCK_DGRAM.
>
> The other special case, CONFIG_NET_IPGRE_BROADCAST, predates git. Its
> implementation starts with the beautiful comment "/* Nice toy.
> Unfortunately, useless in real life :-)". From the rest of that
> detailed comment, it is not clear to me why it would need to expose
> the headers. The example does not use packet sockets.
>
> A packet socket cannot know devices details such as which configurable
> mode a device may be in. And different modes conflict with the basic
> rule that for a given well defined link layer type, i.e., dev->type,
> header length can be expected to be consistent. In an ideal world
> these exceptions would not exist, therefore.

Nice explanation of the situation. I agree with you. Thanks!