On Mon, Jun 29, 2020 at 8:37 PM Tom Herbert <t...@herbertland.com> wrote:
>
> On Mon, Jun 29, 2020 at 4:07 PM Eric Dumazet <eric.duma...@gmail.com> wrote:
> >
> >
> >
> > On 6/29/20 2:30 PM, Willem de Bruijn wrote:
> > > On Mon, Jun 29, 2020 at 5:15 PM Eric Dumazet <eric.duma...@gmail.com> 
> > > wrote:
> > >>
> > >>
> > >>
> > >> On 6/29/20 9:57 AM, Willem de Bruijn wrote:
> > >>> From: Willem de Bruijn <will...@google.com>
> > >>>
> > >>> ICMP messages may include an extension structure after the original
> > >>> datagram. RFC 4884 standardized this behavior.
> > >>>
> > >>> It introduces an explicit original datagram length field in the ICMP
> > >>> header to delineate the original datagram from the extension struct.
> > >>>
> > >>> Return this field when reading an ICMP error from the error queue.
> > >>
> > >> RFC mentions a 'length' field of 8 bits, your patch chose to export the 
> > >> whole
> > >> second word of icmp header.
> > >>
> > >> Why is this field mapped to a prior one (icmp_hdr(skb)->un.gateway) ?
> > >>
> > >> Should we add an element in the union to make this a little bit more 
> > >> explicit/readable ?
> > >>
> > >> diff --git a/include/uapi/linux/icmp.h b/include/uapi/linux/icmp.h
> > >> index 
> > >> 5589eeb791ca580bb182e1dc38c05eab1c75adb9..427ed5a6765316a4c1e2fa06f3b6618447c01564
> > >>  100644
> > >> --- a/include/uapi/linux/icmp.h
> > >> +++ b/include/uapi/linux/icmp.h
> > >> @@ -76,6 +76,7 @@ struct icmphdr {
> > >>                 __be16  sequence;
> > >>         } echo;
> > >>         __be32  gateway;
> > >> +       __be32  second_word; /* RFC 4884 4.[123] : 
> > >> <unused:8>,<length:8>,<mtu:16> */
> > >>         struct {
> > >>                 __be16  __unused;
> > >>                 __be16  mtu;
> > >
> > > Okay. How about a variant of the existing struct frag?
> > >
> > > @@ -80,6 +80,11 @@ struct icmphdr {
> > >                 __be16  __unused;
> > >                 __be16  mtu;
> > >         } frag;
> > > +       struct {
> > > +               __u8    __unused;
> > > +               __u8    length;
> > > +               __be16  mtu;
> > > +       } rfc_4884;
> > >         __u8    reserved[4];
> > >    } un;
> > >
> >
> > Sure, but my point was later in the code :
> >
> > >>> +     if (inet_sk(sk)->recverr_rfc4884)
> > >>> +             info = ntohl(icmp_hdr(skb)->un.gateway);
> > >>
> > >> ntohl(icmp_hdr(skb)->un.second_word);
> >
> > If you leave there "info = ntohl(icmp_hdr(skb)->un.gateway)" it is a bit 
> > hard for someone
> > reading linux kernel code to understand why we do this.
> >
> It's also potentially problematic. The other bits are Unused, which
> isn't the same thing as necessarily being zero. Userspace might assume
> that info is the length without checking its bounded.

It shouldn't. The icmp type and code are passed in sock_extended_err
as ee_type and ee_code. So it can demultiplex the meaning of the rest
of the icmp header.

It just needs access to the other 32-bits, which indeed are context
sensitive. It makes more sense to me to let userspace demultiplex this
in one place, rather than demultiplex in the kernel and define a new,
likely no simpler, data structure to share with userspace.

Specific to RFC 4884, the 8-bit length field coexists with the
16-bit mtu field in case of ICMP_FRAG_NEEDED, so we cannot just pass
the first as ee_info in RFC 4884 mode. sock_extended_err additionally
has ee_data, but after that we're out of fields, too, so this approach
is not very future proof to additional ICMP extensions.

On your previous point, it might be useful to define struct rfc_4884
equivalent outside struct icmphdr, so that an application can easily
cast to that. RFC 4884 itself does not define any extension objects.
That is out of scope there, and in my opinion, here. Again, better
left to userspace. Especially because as it describes, it standardized
the behavior after observing non-compliant, but existing in the wild,
proprietary extension variants. Users may have to change how they
interpret the fields based on what they have deployed.

Reply via email to