Adding a few virtio people. Please see the virtio spec/flag question 
below.

On Tue, 12 Nov 2024, Chia-Yu Chang (Nokia) wrote:

> >From: Ilpo Järvinen <i...@kernel.org> 
> >Sent: Thursday, November 7, 2024 8:28 PM
> >To: Eric Dumazet <eduma...@google.com>
> >Cc: Chia-Yu Chang (Nokia) <chia-yu.ch...@nokia-bell-labs.com>; 
> >netdev@vger.kernel.org; dsah...@gmail.com; da...@davemloft.net; 
> >dsah...@kernel.org; pab...@redhat.com; joel.grana...@kernel.org; 
> >k...@kernel.org; andrew+net...@lunn.ch; ho...@kernel.org; 
> >pa...@netfilter.org; kad...@netfilter.org; netfilter-de...@vger.kernel.org; 
> >coret...@netfilter.org; ncardw...@google.com; Koen De Schepper (Nokia) 
> ><koen.de_schep...@nokia-bell-labs.com>; g.wh...@cablelabs.com; 
> >ingemar.s.johans...@ericsson.com; mirja.kuehlew...@ericsson.com; 
> >chesh...@apple.com; rs.i...@gmx.at; jason_living...@comcast.com; 
> >vidhi_g...@apple.com
> >Subject: Re: [PATCH v5 net-next 09/13] gro: prevent ACE field corruption & 
> >better AccECN handling
> >
> >
> >CAUTION: This is an external email. Please be very careful when clicking 
> >links or opening attachments. See the URL nok.it/ext for additional 
> >information.
> >
> >
> >
> >On Thu, 7 Nov 2024, Eric Dumazet wrote:
> >
> >>On Tue, Nov 5, 2024 at 11:07 AM <chia-yu.ch...@nokia-bell-labs.com> wrote:
> >>>
> >>> From: Ilpo Järvinen <i...@kernel.org>
> >>>
> >>> There are important differences in how the CWR field behaves in 
> >>> RFC3168 and AccECN. With AccECN, CWR flag is part of the ACE counter 
> >>> and its changes are important so adjust the flags changed mask 
> >>> accordingly.
> >>>
> >>> Also, if CWR is there, set the Accurate ECN GSO flag to avoid 
> >>> corrupting CWR flag somewhere.
> >>>
> >>> Signed-off-by: Ilpo Järvinen <i...@kernel.org>
> >>> Signed-off-by: Chia-Yu Chang <chia-yu.ch...@nokia-bell-labs.com>
> >>> ---
> >>>  net/ipv4/tcp_offload.c | 4 ++--
> >>>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index 
> >>> 0b05f30e9e5f..f59762d88c38 100644
> >>> --- a/net/ipv4/tcp_offload.c
> >>> +++ b/net/ipv4/tcp_offload.c
> >>> @@ -329,7 +329,7 @@ struct sk_buff *tcp_gro_receive(struct list_head 
> >>> *head, struct sk_buff *skb,
> >>>         th2 = tcp_hdr(p);
> >>>         flush = (__force int)(flags & TCP_FLAG_CWR);
> >>>         flush |= (__force int)((flags ^ tcp_flag_word(th2)) &
> >>> -                 ~(TCP_FLAG_CWR | TCP_FLAG_FIN | TCP_FLAG_PSH));
> >>> +                 ~(TCP_FLAG_FIN | TCP_FLAG_PSH));
> >>>         flush |= (__force int)(th->ack_seq ^ th2->ack_seq);
> >>>         for (i = sizeof(*th); i < thlen; i += 4)
> >>>                 flush |= *(u32 *)((u8 *)th + i) ^ @@ -405,7 +405,7 
> >>> @@ void tcp_gro_complete(struct sk_buff *skb)
> >>>         shinfo->gso_segs = NAPI_GRO_CB(skb)->count;
> >>>
> >>>         if (th->cwr)
> >>> -               shinfo->gso_type |= SKB_GSO_TCP_ECN;
> >>> +               shinfo->gso_type |= SKB_GSO_TCP_ACCECN;
> >>>  }
> >>>  EXPORT_SYMBOL(tcp_gro_complete);
> >>>
> >>
> >> I do not really understand this patch. How a GRO engine can know which 
> >> ECN variant the peers are using ?
> >
> >Hi Eric,
> >
> >Thanks for taking a look.
> >
> >GRO doesn't know. Setting SKB_GSO_TCP_ECN in case of not knowing can result 
> >in header change that corrupts ACE field. Thus, GRO has to assume the 
> >RFC3168 CWR offloading trick cannot be used anymore (unless it tracks the 
> >connection and knows the bits are used for RFC3168 which is something nobody 
> >is going to do).
> >
> >The main point of SKB_GSO_TCP_ACCECN is to prevent SKB_GSO_TCP_ECN or 
> >NETIF_F_TSO_ECN offloading to be used for the skb as it would corrupt ACE 
> >field value.
> 
> Hi Eric and Ilpo,
> 
> From my understanding of another email thread (patch 08/13), it seems the 
> conclusions that SKB_GSO_TCP_ACCECN will still be needed.
> 
> >
> >SKB_GSO_TCP_ACCECN doesn't allow CWR bits change within a super-skb but the 
> >same CWR flag should be repeated for all segments. In a sense, it's simpler 
> >than RFC3168 offloading.
> >
> >> SKB_GSO_TCP_ECN is also used from other points, what is your plan ?
> >>
> >> git grep -n SKB_GSO_TCP_ECN
> >> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:3888:
> >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> >> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1291:
> >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> >> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1312:
> >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> >
> >These looked like they should be just changed to set SKB_GSO_TCP_ACCECN 
> >instead.
> 
> I agree with these changes and will apply them in the next version.
> 
> >
> >I don't anymore recall why I didn't change those when I made this patch long 
> >time ago, perhaps it was just an oversight or things have changed somehow 
> >since then.
> >
> >> include/linux/netdevice.h:5061: BUILD_BUG_ON(SKB_GSO_TCP_ECN != 
> >> (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
> >> include/linux/skbuff.h:664:     SKB_GSO_TCP_ECN = 1 << 2,
> >
> >Not relevant.
> >
> >> include/linux/virtio_net.h:88:                  gso_type |= 
> >> SKB_GSO_TCP_ECN;
> >> include/linux/virtio_net.h:161:         switch (gso_type & 
> >> ~SKB_GSO_TCP_ECN) {
> >> include/linux/virtio_net.h:226:         if (sinfo->gso_type & 
> >> SKB_GSO_TCP_ECN)
> >
> >These need to be looked further what's going on as UAPI is also 
> > involved here. 
> 
> I had a look at these parts, and only the 1st and 3rd ones are relevant.
> Related to the 1st one, I propose to change
> from
> 
>                 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
>                         gso_type |= SKB_GSO_TCP_ECN;
> 
> to
> 
>                 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
>                         gso_type |= SKB_GSO_TCP_ACCECN;
> 
> The reason behind this proposed change is similar as the above changes 
> in en_rx.c.

But en_rx.c is based one CWR flag, there's no similar check here.

> For the 3rd one, I suggest to change from
> 
>                 if (sinfo->gso_type & SKB_GSO_TCP_ECN)
>                         hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
>
> to
> 
>                 if (sinfo->gso_type & (SKB_GSO_TCP_ECN | SKB_GSO_TCP_ACCECN))
>                         hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> 
> This proposed change is because VIRTIO_NET_HDR_GSO_ECN must be set to 
> allow TCP packets requiring segmentation offload which have ECN bit set.
> So, no matter whether skb gso_type have GSO_TCP_ECN flag or 
> GSO_TCP_ACCECN flag, the corresponding hdr gso_type shall be set.
> 
> But, I wonder what would the driver do when with VIRTIO_NET_HDR_GSO_ECN 
> flag. Will it corrupts CWR or not?

I'm starting to heavily question what is the meaning of that 
VIRTIO_NET_HDR_GSO_ECN flag and if it's even consistent...

https://github.com/qemu/qemu/blob/master/net/eth.c

That sets VIRTIO_NET_HDR_GSO_ECN based on CE?? (Whereas kernel associates 
the related SKB_GSO_TCP_ECN to CWR offloading.)

The virtio spec is way too vague to be useful so it would not be 
surprising if there are diverging interpretations from implementers:

"If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature, the 
VIRTIO_NET_HDR_GSO_ECN bit in gso_type indicates that the TCP packet has 
the ECN bit set."

What is "the ECN bit" in terms used by TCP (or IP)? Could some virtio 
folks please explain?


-- 
 i.

Reply via email to