Hi Konstantin,

I thought your comments make sense, and addressed them with a patchset 
https://patches.dpdk.org/project/dpdk/list/?series=29709,
please check again.
Thanks,


Trevor Tao

At 2023-09-26 21:49:05, "Konstantin Ananyev" <konstantin.v.anan...@yandex.ru> 
wrote:
>
>Hi Trevor,
>>>
>>>
>>>> 
>>>> At 2023-09-18 02:04:19, "Konstantin Ananyev" 
>>>> <konstantin.v.anan...@yandex.ru> wrote:
>>>>>03/09/2023 05:01, Trevor Tao пишет:
>>>>>> Now the port Rx mq_mode had been set to RTE_ETH_MQ_RX_RSS, and offload
>>>>>> mode set to RTE_ETH_RX_OFFLOAD_CHECKSUM by default, but some hardware
>>>>>> and/or virtual interface does not support the RSS and offload mode
>>>>>> presupposed, e.g., some virtio interfaces in the cloud don't support
>>>>>> RSS and may only partly support RTE_ETH_RX_OFFLOAD_UDP_CKSUM/
>>>>>> RTE_ETH_RX_OFFLOAD_TCP_CKSUM,
>>>>>> but not RTE_ETH_RX_OFFLOAD_IPV4_CKSUM, and the error msg here:
>>>>>> 
>>>>>> virtio_dev_configure(): RSS support requested but not supported by
>>>>>> the device
>>>>>> Port0 dev_configure = -95
>>>>>> 
>>>>>> and:
>>>>>> Ethdev port_id=0 requested Rx offloads 0xe does not match Rx offloads
>>>>>> capabilities 0x201d in rte_eth_dev_configure()
>>>>>> 
>>>>>> So to enable the l3fwd running in that environment, the Rx mode 
>>>>>> requirement
>>>>>> can be relaxed to reflect the hardware feature reality here, and the 
>>>>>> l3fwd
>>>>>> can run smoothly then.
>>>>>> A warning msg would be provided to user in case it happens here.
>>>>>> 
>>>>>> On the other side, enabling the software cksum check in case the
>>>>>> hw support missing.
>>>>>> 
>>>>>> Fixes: af75078fece3 ("first public release")
>>>>>> Cc: sta...@dpdk.org
>>>>>
>>>>>I don't think there was abug here.
>>>>>We are talking about changing current requirements for the app.
>>>>>So not sure it is a real fix and that such change can be
>>>> 
>>>>>propagated to stable releases.
>>>> Trevor: I think it's not a bug fix but a feature enhancement, it would 
>>>> enable l3fwd to work smoothly on the HW/virtual interfaces which don't 
>>>> support RSS and/or cksum offloading.
>>>
>>>
>>>Yes. it seems like sort of an enhancement.
>>>While 'Fixes: ...' are for bugs only.
>>>AFAIK, only bug-fixes are take for backporting by stable releases.
>>>That's why there seems no point to add CC: sta...@dpdk.org
>>>
>>>Another generic things:
>>  >- l3fwd doc and release notes probably need to be updated
>> *Trevor>>I think it's ok to update the l3fwd doc and release notes, but 
>> I would like to know which part of the doc/notes is approriate to add 
>> the enhancement declaration. *
>
>  think both:
>http://doc.dpdk.org/guides/sample_app_ug/l3_forward.html
>and elease notes in doc/guides/rel_notes/ need to be updated.
>
>>>- as you areintroducing 2 distinct features: no-rss and no-ipv4-cksum
>>>   it is probably better to split it into 2 different patches (in the 
>>  >same series).
>> *Trevor>>I think it's ok to split it into 2 patches here in the same 
>> series, if you would like to.*
>> *Thanks.*
>
>That is not my own desire, but usual contrution practise we all try
>to comply with.
>You can find more details at:
>https://doc.dpdk.org/guides/contributing/patches.html
>
>Thanks
>Konstantin
>
>
>>>
>>>> 
>>>> 
>>>>>
>>>>>> 
>>>>>> Signed-off-by: Trevor Tao <taozj...@163.com>
>>>>>> ---
>>>>>>   examples/l3fwd/l3fwd.h | 12 +++++++++++-
>>>>>>   examples/l3fwd/main.c  | 21 +++++++++++++++++++--
>>>>>>   2 files changed, 30 insertions(+), 3 deletions(-)
>>>>>> 
>>>>>> diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
>>>>>> index b55855c932..cc10643c4b 100644
>>>>>> --- a/examples/l3fwd/l3fwd.h
>>>>>> +++ b/examples/l3fwd/l3fwd.h
>>>>>> @@ -115,6 +115,8 @@ extern struct acl_algorithms acl_alg[];
>>>>>>   
>>>>>>   extern uint32_t max_pkt_len;
>>>>>>   
>>>>>> +extern struct rte_eth_conf port_conf;
>>>>>> +
>>>>>>   /* Send burst of packets on an output interface */
>>>>>>   static inline int
>>>>>>   send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t port)
>>>>>> @@ -170,7 +172,15 @@ is_valid_ipv4_pkt(struct rte_ipv4_hdr *pkt, 
>>>>>> uint32_t link_len)
>>>>>>                  return -1;
>>>>>>   
>>>>>>          /* 2. The IP checksum must be correct. */
>>>>>> -        /* this is checked in H/W */
>>>>>> +        /* if this is not checked in H/W, check it. */
>>>>>> +        if ((port_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_IPV4_CKSUM) 
>>>>>> == 0) {
>>>>>
>>>>>Might be better to check particular mbuf flag:
>>>>>if ((mbuf->ol_flags & RTE_MBUF_F_RX_IP_CKSUM_MASK) == 
>>>> 
>>>>>TE_MBUF_F_RX_IP_CKSUM_UNKNOWN) {...}
>>>> Trevor: the utility function is_valid_ipv4_pkt is just against an IPv4 
>>>> pkt, and there's no mbuf information, and if needed, there would be an 
>>>> extra ol_flags added here to check if it was already done by the ethernet 
>>>> device, but look for a sample in:
>>>> https://github.com/DPDK/dpdk/blob/main/examples/l3fwd-power/main.c#L487
>>>> so I think it's ok to just use the port_conf here. If you still think it's 
>>>> better to use m->ol_flags, please tell me.
>>>
>>>
>>>Yep, passing ol_flags, or mbuf itself seems like a proper way to do it.
>>>Aproach taken in l3fwd-power doesn't look right to me, see below.
>>>
>>>>>
>>>>>> +                uint16_t actual_cksum, expected_cksum;
>>>>>> +                actual_cksum = pkt->hdr_checksum;
>>>>>> +                pkt->hdr_checksum = 0;
>>>>>> +                expected_cksum = rte_ipv4_cksum(pkt);
>>>>>> +                if (actual_cksum != expected_cksum)
>>>>>> +                        return -2;
>>>>>> +        }
>>>>>>   
>>>>>>          /*
>>>>>>           * 3. The IP version number must be 4. If the version number is 
>>>>>> not 4
>>>>>> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
>>>>>> index 6063eb1399..37aec64718 100644
>>>>>> --- a/examples/l3fwd/main.c
>>>>>> +++ b/examples/l3fwd/main.c
>>>>>> @@ -117,7 +117,7 @@ static struct lcore_params * lcore_params = 
>>>>>> lcore_params_array_default;
>>>>>>   static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
>>>>>>                                  sizeof(lcore_params_array_default[0]);
>>>>>>   
>>>>>> -static struct rte_eth_conf port_conf = {
>>>>>> +struct rte_eth_conf port_conf = {
>>>>>>          .rxmode = {
>>>>>>                  .mq_mode = RTE_ETH_MQ_RX_RSS,
>>>>>>                  .offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
>>>>>> @@ -1257,8 +1257,12 @@ l3fwd_poll_resource_setup(void)
>>>>>>                  local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
>>>>>>                          dev_info.flow_type_rss_offloads;
>>>>>>   
>>>>>> -                if (dev_info.max_rx_queues == 1)
>>>>>> +                /* relax the rx rss requirement */
>>>>>> +                if (dev_info.max_rx_queues == 1 || 
>>>>>> !local_port_conf.rx_adv_conf.rss_conf.rss_hf) {
>>>>>> +                        printf("warning: modified the rx mq_mode to 
>>>>>> RTE_ETH_MQ_RX_NONE base on"
>>>>>> +                                " device capability\n");
>>>>>>                          local_port_conf.rxmode.mq_mode = 
>>>>>> RTE_ETH_MQ_RX_NONE;
>>>>>
>>>>>Should we probably instead have a new commnad-line option to explicitly 
>>>>>disable RSS?
>>>> 
>>>>>Something like: '--no-rss' or so?
>>>> Trevor: the RSS capability for a certain port was got by the 
>>>> rte_eth_dev_info_get() automatically, and we think the user should not 
>>>> care about its status beforehand, but if it's missing, a warning 
>>>> notification for the degrade here would be proposed to make it run 
>>>> smoothly.
>>>
>>>Personally, I still think it would be better the user will
>>>have an ability to disable it explicitly.
>>>Same as l3fwd does now with 'parse-ptype'.
>>>
>>>>>
>>>>>> +                }
>>>>>>   
>>>>>>                  if (local_port_conf.rx_adv_conf.rss_conf.rss_hf !=
>>>>>>                                  port_conf.rx_adv_conf.rss_conf.rss_hf) {
>>>>>> @@ -1269,6 +1273,19 @@ l3fwd_poll_resource_setup(void)
>>>>>>                                  
>>>>>> local_port_conf.rx_adv_conf.rss_conf.rss_hf);
>>>>>>                  }
>>>>>>   
>>>>>> +                /* relax the rx offload requirement */
>>>>>> +                if ((local_port_conf.rxmode.offloads & 
>>>>>> dev_info.rx_offload_capa) !=
>>>>>> +                        local_port_conf.rxmode.offloads) {
>>>>>> +                        printf("Port %u requested Rx offloads 
>>>>>> 0x%"PRIx64" does not"
>>>>>> +                                " match Rx offloads capabilities 
>>>>>> 0x%"PRIx64"\n",
>>>>>> +                                portid, local_port_conf.rxmode.offloads,
>>>>>> +                                dev_info.rx_offload_capa);
>>>>>> +                        local_port_conf.rxmode.offloads &= 
>>>>>> dev_info.rx_offload_capa;
>>>>>> +                        port_conf.rxmode.offloads = 
>>>>>> local_port_conf.rxmode.offloads;
>>>>>
>>>>>Why to remove offloads in port_conf?
>>>>>There could be multiple ports, and on others desired HW offloads might 
>>>> 
>>>>>be supported.
>>>> Trevor: Yes, there would be multiple ports, so if one of the ports lack HW 
>>>> offload, it would be ok to just use the relaxed requirement here, like we 
>>>> previously talked in 
>>>> https://github.com/DPDK/dpdk/blob/main/examples/l3fwd-power/main.c#L487, 
>>>> if you still think it's needed to use the per-port case, it would be ok to 
>>>> use the ol_flags as talked previously.
>>>
>>>
>>>But then, depending on the ports order you can end-up with IP_CKSUM 
>>>offload enabled on some ports (but not used), while completely disable 
>>>on other ports - even if these ports do support IP_CKSUM.
>>>I think the better way would be not to touch port_conf here, and above
>>>use ol_flags to decide should we compute cksum in SW or not.
>>>
>>>
>>>>>
>>>>>> +                        printf("warning: modified the rx offload to 
>>>>>> 0x%"PRIx64" based on device"
>>>>>> +                                " capability\n", 
>>>>>> local_port_conf.rxmode.offloads);
>>>>>> +                }
>>>>>> +
>>>>>>                  ret = rte_eth_dev_configure(portid, nb_rx_queue,
>>>>>>                                          (uint16_t)n_tx_queue, 
>>>>>> &local_port_conf);
>>>>>>                  if (ret < 0)

Reply via email to