>From: Jason Wang <jasow...@redhat.com> 
>Sent: Wednesday, November 13, 2024 2:43 AM
>To: Ilpo Järvinen <i...@kernel.org>
>Cc: Chia-Yu Chang (Nokia) <chia-yu.ch...@nokia-bell-labs.com>; Michael S. 
>Tsirkin <m...@redhat.com>; virtualizat...@lists.linux.dev; Eric Dumazet 
><eduma...@google.com>; netdev@vger.kernel.org; dsah...@gmail.com; 
>da...@davemloft.net; dsah...@kernel.org; pab...@redhat.com; 
>joel.grana...@kernel.org; k...@kernel.org; andrew+net...@lunn.ch; 
>ho...@kernel.org; pa...@netfilter.org; kad...@netfilter.org; 
>netfilter-de...@vger.kernel.org; coret...@netfilter.org; ncardw...@google.com; 
>Koen De Schepper (Nokia) <koen.de_schep...@nokia-bell-labs.com>; 
>g.wh...@cablelabs.com; ingemar.s.johans...@ericsson.com; 
>mirja.kuehlew...@ericsson.com; chesh...@apple.com; rs.i...@gmx.at; 
>jason_living...@comcast.com; vidhi_g...@apple.com
>Subject: Re: [PATCH v5 net-next 09/13] gro: prevent ACE field corruption & 
>better AccECN handling
>
>[You don't often get email from jasow...@redhat.com. Learn why this is 
>important at https://aka.ms/LearnAboutSenderIdentification ]
>
>CAUTION: This is an external email. Please be very careful when clicking links 
>or opening attachments. See the URL nok.it/ext for additional information.
>
>
>
>On Wed, Nov 13, 2024 at 5:19 AM Ilpo Järvinen <i...@kernel.org> wrote:
>>
>> Adding a few virtio people. Please see the virtio spec/flag question 
>> below.
>>
>> On Tue, 12 Nov 2024, Chia-Yu Chang (Nokia) wrote:
>>
>> > >From: Ilpo Järvinen <i...@kernel.org>
>> > >Sent: Thursday, November 7, 2024 8:28 PM
>> > >To: Eric Dumazet <eduma...@google.com>
>> > >Cc: Chia-Yu Chang (Nokia) <chia-yu.ch...@nokia-bell-labs.com>; 
>> > >netdev@vger.kernel.org; dsah...@gmail.com; da...@davemloft.net; 
>> > >dsah...@kernel.org; pab...@redhat.com; joel.grana...@kernel.org; 
>> > >k...@kernel.org; andrew+net...@lunn.ch; ho...@kernel.org; 
>> > >pa...@netfilter.org; kad...@netfilter.org; 
>> > >netfilter-de...@vger.kernel.org; coret...@netfilter.org; 
>> > >ncardw...@google.com; Koen De Schepper (Nokia) 
>> > ><koen.de_schep...@nokia-bell-labs.com>; g.wh...@cablelabs.com; 
>> > >ingemar.s.johans...@ericsson.com; mirja.kuehlew...@ericsson.com; 
>> > >chesh...@apple.com; rs.i...@gmx.at; jason_living...@comcast.com; 
>> > >vidhi_g...@apple.com
>> > >Subject: Re: [PATCH v5 net-next 09/13] gro: prevent ACE field 
>> > >corruption & better AccECN handling
>> > >
>> > >
>> > >CAUTION: This is an external email. Please be very careful when clicking 
>> > >links or opening attachments. See the URL nok.it/ext for additional 
>> > >information.
>> > >
>> > >
>> > >
>> > >On Thu, 7 Nov 2024, Eric Dumazet wrote:
>> > >
>> > >>On Tue, Nov 5, 2024 at 11:07 AM <chia-yu.ch...@nokia-bell-labs.com> 
>> > >>wrote:
>> > >>>
>> > >>> From: Ilpo Järvinen <i...@kernel.org>
>> > >>>
>> > >>> There are important differences in how the CWR field behaves in
>> > >>> RFC3168 and AccECN. With AccECN, CWR flag is part of the ACE 
>> > >>> counter and its changes are important so adjust the flags 
>> > >>> changed mask accordingly.
>> > >>>
>> > >>> Also, if CWR is there, set the Accurate ECN GSO flag to avoid 
>> > >>> corrupting CWR flag somewhere.
>> > >>>
>> > >>> Signed-off-by: Ilpo Järvinen <i...@kernel.org>
>> > >>> Signed-off-by: Chia-Yu Chang <chia-yu.ch...@nokia-bell-labs.com>
>> > >>> ---
>> > >>>  net/ipv4/tcp_offload.c | 4 ++--
>> > >>>  1 file changed, 2 insertions(+), 2 deletions(-)
>> > >>>
>> > >>> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c 
>> > >>> index
>> > >>> 0b05f30e9e5f..f59762d88c38 100644
>> > >>> --- a/net/ipv4/tcp_offload.c
>> > >>> +++ b/net/ipv4/tcp_offload.c
>> > >>> @@ -329,7 +329,7 @@ struct sk_buff *tcp_gro_receive(struct list_head 
>> > >>> *head, struct sk_buff *skb,
>> > >>>         th2 = tcp_hdr(p);
>> > >>>         flush = (__force int)(flags & TCP_FLAG_CWR);
>> > >>>         flush |= (__force int)((flags ^ tcp_flag_word(th2)) &
>> > >>> -                 ~(TCP_FLAG_CWR | TCP_FLAG_FIN | TCP_FLAG_PSH));
>> > >>> +                 ~(TCP_FLAG_FIN | TCP_FLAG_PSH));
>> > >>>         flush |= (__force int)(th->ack_seq ^ th2->ack_seq);
>> > >>>         for (i = sizeof(*th); i < thlen; i += 4)
>> > >>>                 flush |= *(u32 *)((u8 *)th + i) ^ @@ -405,7 
>> > >>> +405,7 @@ void tcp_gro_complete(struct sk_buff *skb)
>> > >>>         shinfo->gso_segs = NAPI_GRO_CB(skb)->count;
>> > >>>
>> > >>>         if (th->cwr)
>> > >>> -               shinfo->gso_type |= SKB_GSO_TCP_ECN;
>> > >>> +               shinfo->gso_type |= SKB_GSO_TCP_ACCECN;
>> > >>>  }
>> > >>>  EXPORT_SYMBOL(tcp_gro_complete);
>> > >>>
>> > >>
>> > >> I do not really understand this patch. How a GRO engine can know 
>> > >> which ECN variant the peers are using ?
>> > >
>> > >Hi Eric,
>> > >
>> > >Thanks for taking a look.
>> > >
>> > >GRO doesn't know. Setting SKB_GSO_TCP_ECN in case of not knowing can 
>> > >result in header change that corrupts ACE field. Thus, GRO has to assume 
>> > >the RFC3168 CWR offloading trick cannot be used anymore (unless it tracks 
>> > >the connection and knows the bits are used for RFC3168 which is something 
>> > >nobody is going to do).
>> > >
>> > >The main point of SKB_GSO_TCP_ACCECN is to prevent SKB_GSO_TCP_ECN or 
>> > >NETIF_F_TSO_ECN offloading to be used for the skb as it would corrupt ACE 
>> > >field value.
>> >
>> > Hi Eric and Ilpo,
>> >
>> > From my understanding of another email thread (patch 08/13), it seems the 
>> > conclusions that SKB_GSO_TCP_ACCECN will still be needed.
>> >
>> > >
>> > >SKB_GSO_TCP_ACCECN doesn't allow CWR bits change within a super-skb but 
>> > >the same CWR flag should be repeated for all segments. In a sense, it's 
>> > >simpler than RFC3168 offloading.
>> > >
>> > >> SKB_GSO_TCP_ECN is also used from other points, what is your plan ?
>> > >>
>> > >> git grep -n SKB_GSO_TCP_ECN
>> > >> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:3888:
>> > >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
>> > >> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1291:
>> > >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
>> > >> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1312:
>> > >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
>> > >
>> > >These looked like they should be just changed to set SKB_GSO_TCP_ACCECN 
>> > >instead.
>> >
>> > I agree with these changes and will apply them in the next version.
>> >
>> > >
>> > >I don't anymore recall why I didn't change those when I made this patch 
>> > >long time ago, perhaps it was just an oversight or things have changed 
>> > >somehow since then.
>> > >
>> > >> include/linux/netdevice.h:5061: BUILD_BUG_ON(SKB_GSO_TCP_ECN != 
>> > >> (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
>> > >> include/linux/skbuff.h:664:     SKB_GSO_TCP_ECN = 1 << 2,
>> > >
>> > >Not relevant.
>> > >
>> > >> include/linux/virtio_net.h:88:                  gso_type |= 
>> > >> SKB_GSO_TCP_ECN;
>> > >> include/linux/virtio_net.h:161:         switch (gso_type & 
>> > >> ~SKB_GSO_TCP_ECN) {
>> > >> include/linux/virtio_net.h:226:         if (sinfo->gso_type & 
>> > >> SKB_GSO_TCP_ECN)
>> > >
>> > >These need to be looked further what's going on as UAPI is also  
>> > >involved here.
>> >
>> > I had a look at these parts, and only the 1st and 3rd ones are relevant.
>> > Related to the 1st one, I propose to change from
>> >
>> >                 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
>> >                         gso_type |= SKB_GSO_TCP_ECN;
>> >
>> > to
>> >
>> >                 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
>> >                         gso_type |= SKB_GSO_TCP_ACCECN;
>> >
>> > The reason behind this proposed change is similar as the above 
>> > changes in en_rx.c.
>>
>> But en_rx.c is based one CWR flag, there's no similar check here.
>>
>> > For the 3rd one, I suggest to change from
>> >
>> >                 if (sinfo->gso_type & SKB_GSO_TCP_ECN)
>> >                         hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
>> >
>> > to
>> >
>> >                 if (sinfo->gso_type & (SKB_GSO_TCP_ECN | 
>> > SKB_GSO_TCP_ACCECN))
>> >                         hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
>> >
>> > This proposed change is because VIRTIO_NET_HDR_GSO_ECN must be set 
>> > to allow TCP packets requiring segmentation offload which have ECN bit set.
>> > So, no matter whether skb gso_type have GSO_TCP_ECN flag or 
>> > GSO_TCP_ACCECN flag, the corresponding hdr gso_type shall be set.
>> >
>> > But, I wonder what would the driver do when with 
>> > VIRTIO_NET_HDR_GSO_ECN flag. Will it corrupts CWR or not?
>>
>> I'm starting to heavily question what is the meaning of that 
>> VIRTIO_NET_HDR_GSO_ECN flag and if it's even consistent...
>>
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
>> ub.com%2Fqemu%2Fqemu%2Fblob%2Fmaster%2Fnet%2Feth.c&data=05%7C02%7Cchia
>> -yu.chang%40nokia-bell-labs.com%7C64e73bacd5eb4647905208dd03848651%7C5
>> d4717519675428d917b70f44f9630b0%7C0%7C0%7C638670589949336838%7CUnknown
>> %7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4
>> zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=iEeGdA4mcq1WC
>> unKWvxNZ6yt213xKWKOpjyYdBzLzTk%3D&reserved=0
>>
>> That sets VIRTIO_NET_HDR_GSO_ECN based on CE?? (Whereas kernel 
>> associates the related SKB_GSO_TCP_ECN to CWR offloading.)
>>
>> The virtio spec is way too vague to be useful so it would not be 
>> surprising if there are diverging interpretations from implementers:
>>
>> "If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature, the 
>> VIRTIO_NET_HDR_GSO_ECN bit in gso_type indicates that the TCP packet 
>> has the ECN bit set."
>>
>> What is "the ECN bit" in terms used by TCP (or IP)? Could some virtio 
>> folks please explain?
>
>According to the current Linux implementation in virtio_net_hdr_to_skb():
>
>                if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
>                        gso_type |= SKB_GSO_TCP_ECN;
>
>It mapps to SKB_GSO_TCP_ECN which is:
>
>        /* This indicates the tcp segment has CWR set. */
>        SKB_GSO_TCP_ECN = 1 << 2,
>
>Thanks

This patch series is related to adding new SKB_GSO_TCP_ACCECN.
The reason behind is because how the CWR field handling is different between 
RFC3168 and AccECN.
You can to find the original patches in below:
https://lore.kernel.org/netdev/20241105100647.117346-9-chia-yu.ch...@nokia-bell-labs.com/
https://lore.kernel.org/netdev/20241105100647.117346-10-chia-yu.ch...@nokia-bell-labs.com/

When reviewing all occurrence of SKB_GSO_TCP_ECN, we see it's related to 
VIRTIO_NET_HDR_GSO_ECN.
But after checking the virtio spec, it's quite unclear about 
VIRTIO_NET_HDR_GSO_ECN:
"If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature, the 
VIRTIO_NET_HDR_GSO_ECN bit in gso_type indicates that the TCP packet has the 
ECN bit set."

In virtnet_probe(), it looks like VIRTIO_NET_F_HOST_ECN is used to enable 
NETIF_F_TSO_ECN support:
                if (virtio_has_feature(vdev, VIRTIO_NET_F_HOST_ECN))
                        dev->hw_features |= NETIF_F_TSO_ECN;

So the question is that is its TSO engine can be set to not touch CWR flag even 
with NETIF_F_TSO_ECN?

Reply via email to