Hi Konstantin,

I think your comments make sense, and I have addressed them in a patchset:
https://patches.dpdk.org/project/dpdk/list/?series=29709
Please check again.

Thanks,
Trevor Tao

At 2023-09-26 21:49:05, "Konstantin Ananyev" <konstantin.v.anan...@yandex.ru> wrote:
>
>Hi Trevor,
>>>
>>>
>>>>
>>>> At 2023-09-18 02:04:19, "Konstantin Ananyev"
>>>> <konstantin.v.anan...@yandex.ru> wrote:
>>>>>03/09/2023 05:01, Trevor Tao wrote:
>>>>>> Now the port Rx mq_mode had been set to RTE_ETH_MQ_RX_RSS, and offload
>>>>>> mode set to RTE_ETH_RX_OFFLOAD_CHECKSUM by default, but some hardware
>>>>>> and/or virtual interface does not support the RSS and offload mode
>>>>>> presupposed, e.g., some virtio interfaces in the cloud don't support
>>>>>> RSS and may only partly support RTE_ETH_RX_OFFLOAD_UDP_CKSUM/
>>>>>> RTE_ETH_RX_OFFLOAD_TCP_CKSUM,
>>>>>> but not RTE_ETH_RX_OFFLOAD_IPV4_CKSUM, and the error msg here:
>>>>>>
>>>>>> virtio_dev_configure(): RSS support requested but not supported by
>>>>>> the device
>>>>>> Port0 dev_configure = -95
>>>>>>
>>>>>> and:
>>>>>> Ethdev port_id=0 requested Rx offloads 0xe does not match Rx offloads
>>>>>> capabilities 0x201d in rte_eth_dev_configure()
>>>>>>
>>>>>> So to enable the l3fwd running in that environment, the Rx mode
>>>>>> requirement can be relaxed to reflect the hardware feature reality
>>>>>> here, and the l3fwd can run smoothly then.
>>>>>> A warning msg would be provided to user in case it happens here.
>>>>>>
>>>>>> On the other side, enabling the software cksum check in case the
>>>>>> hw support missing.
>>>>>>
>>>>>> Fixes: af75078fece3 ("first public release")
>>>>>> Cc: sta...@dpdk.org
>>>>>
>>>>>I don't think there was a bug here.
>>>>>We are talking about changing current requirements for the app.
>>>>>So not sure it is a real fix and that such change can be
>>>>>propagated to stable releases.
>>>> Trevor: I think it's not a bug fix but a feature enhancement, it would
>>>> enable l3fwd to work smoothly on the HW/virtual interfaces which don't
>>>> support RSS and/or cksum offloading.
>>>
>>>
>>>Yes, it seems like a sort of enhancement.
>>>While 'Fixes: ...' tags are for bugs only.
>>>AFAIK, only bug fixes are taken for backporting by stable releases.
>>>That's why there seems no point to add CC: sta...@dpdk.org
>>>
>>>Some other generic things:
>>>- l3fwd doc and release notes probably need to be updated
>> Trevor>>I think it's ok to update the l3fwd doc and release notes, but
>> I would like to know which part of the doc/notes is appropriate to add
>> the enhancement declaration.
>
>I think both:
>http://doc.dpdk.org/guides/sample_app_ug/l3_forward.html
>and release notes in doc/guides/rel_notes/ need to be updated.
>
>>>- as you are introducing 2 distinct features: no-rss and no-ipv4-cksum
>>> it is probably better to split it into 2 different patches (in the
>>>same series).
>> Trevor>>I think it's ok to split it into 2 patches here in the same
>> series, if you would like to.
>> Thanks.
>
>That is not my own desire, but the usual contribution practice we all try
>to comply with.
>You can find more details at:
>https://doc.dpdk.org/guides/contributing/patches.html
>
>Thanks
>Konstantin
>
>
>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Trevor Tao <taozj...@163.com>
>>>>>> ---
>>>>>>  examples/l3fwd/l3fwd.h | 12 +++++++++++-
>>>>>>  examples/l3fwd/main.c  | 21 +++++++++++++++++++--
>>>>>>  2 files changed, 30 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
>>>>>> index b55855c932..cc10643c4b 100644
>>>>>> --- a/examples/l3fwd/l3fwd.h
>>>>>> +++ b/examples/l3fwd/l3fwd.h
>>>>>> @@ -115,6 +115,8 @@ extern struct acl_algorithms acl_alg[];
>>>>>>
>>>>>>  extern uint32_t max_pkt_len;
>>>>>>
>>>>>> +extern struct rte_eth_conf port_conf;
>>>>>> +
>>>>>>  /* Send burst of packets on an output interface */
>>>>>>  static inline int
>>>>>>  send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t port)
>>>>>> @@ -170,7 +172,15 @@ is_valid_ipv4_pkt(struct rte_ipv4_hdr *pkt, uint32_t link_len)
>>>>>>      return -1;
>>>>>>
>>>>>>      /* 2. The IP checksum must be correct. */
>>>>>> -    /* this is checked in H/W */
>>>>>> +    /* if this is not checked in H/W, check it. */
>>>>>> +    if ((port_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_IPV4_CKSUM) == 0) {
>>>>>
>>>>>Might be better to check particular mbuf flag:
>>>>>if ((mbuf->ol_flags & RTE_MBUF_F_RX_IP_CKSUM_MASK) ==
>>>>>RTE_MBUF_F_RX_IP_CKSUM_UNKNOWN) {...}
>>>> Trevor: the utility function is_valid_ipv4_pkt is just against an IPv4
>>>> pkt, and there's no mbuf information, and if needed, there would be an
>>>> extra ol_flags added here to check if it was already done by the ethernet
>>>> device, but look for a sample in:
>>>> https://github.com/DPDK/dpdk/blob/main/examples/l3fwd-power/main.c#L487
>>>> so I think it's ok to just use the port_conf here. If you still think it's
>>>> better to use m->ol_flags, please tell me.
>>>
>>>
>>>Yep, passing ol_flags, or mbuf itself seems like a proper way to do it.
>>>The approach taken in l3fwd-power doesn't look right to me, see below.
>>>
>>>>>
>>>>>> +        uint16_t actual_cksum, expected_cksum;
>>>>>> +        actual_cksum = pkt->hdr_checksum;
>>>>>> +        pkt->hdr_checksum = 0;
>>>>>> +        expected_cksum = rte_ipv4_cksum(pkt);
>>>>>> +        if (actual_cksum != expected_cksum)
>>>>>> +            return -2;
>>>>>> +    }
>>>>>>
>>>>>>      /*
>>>>>>       * 3. The IP version number must be 4. If the version number is not 4
>>>>>> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
>>>>>> index 6063eb1399..37aec64718 100644
>>>>>> --- a/examples/l3fwd/main.c
>>>>>> +++ b/examples/l3fwd/main.c
>>>>>> @@ -117,7 +117,7 @@ static struct lcore_params * lcore_params = lcore_params_array_default;
>>>>>>  static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
>>>>>>                  sizeof(lcore_params_array_default[0]);
>>>>>>
>>>>>> -static struct rte_eth_conf port_conf = {
>>>>>> +struct rte_eth_conf port_conf = {
>>>>>>      .rxmode = {
>>>>>>          .mq_mode = RTE_ETH_MQ_RX_RSS,
>>>>>>          .offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
>>>>>> @@ -1257,8 +1257,12 @@ l3fwd_poll_resource_setup(void)
>>>>>>          local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
>>>>>>              dev_info.flow_type_rss_offloads;
>>>>>>
>>>>>> -        if (dev_info.max_rx_queues == 1)
>>>>>> +        /* relax the rx rss requirement */
>>>>>> +        if (dev_info.max_rx_queues == 1 || !local_port_conf.rx_adv_conf.rss_conf.rss_hf) {
>>>>>> +            printf("warning: modified the rx mq_mode to RTE_ETH_MQ_RX_NONE base on"
>>>>>> +                " device capability\n");
>>>>>>              local_port_conf.rxmode.mq_mode = RTE_ETH_MQ_RX_NONE;
>>>>>
>>>>>Should we probably instead have a new command-line option to explicitly
>>>>>disable RSS?
>>>>>Something like: '--no-rss' or so?
>>>> Trevor: the RSS capability for a certain port is obtained automatically
>>>> by rte_eth_dev_info_get(), and we think the user should not have to
>>>> care about its status beforehand, but if it's missing, a warning
>>>> notification about the downgrade is printed here so that it can run
>>>> smoothly.
>>>
>>>Personally, I still think it would be better if the user had
>>>the ability to disable it explicitly.
>>>Same as l3fwd does now with 'parse-ptype'.
>>>
>>>>>
>>>>>> +        }
>>>>>>
>>>>>>          if (local_port_conf.rx_adv_conf.rss_conf.rss_hf !=
>>>>>>                  port_conf.rx_adv_conf.rss_conf.rss_hf) {
>>>>>> @@ -1269,6 +1273,19 @@ l3fwd_poll_resource_setup(void)
>>>>>>                  local_port_conf.rx_adv_conf.rss_conf.rss_hf);
>>>>>>          }
>>>>>>
>>>>>> +        /* relax the rx offload requirement */
>>>>>> +        if ((local_port_conf.rxmode.offloads & dev_info.rx_offload_capa) !=
>>>>>> +            local_port_conf.rxmode.offloads) {
>>>>>> +            printf("Port %u requested Rx offloads 0x%"PRIx64" does not"
>>>>>> +                " match Rx offloads capabilities 0x%"PRIx64"\n",
>>>>>> +                portid, local_port_conf.rxmode.offloads,
>>>>>> +                dev_info.rx_offload_capa);
>>>>>> +            local_port_conf.rxmode.offloads &= dev_info.rx_offload_capa;
>>>>>> +            port_conf.rxmode.offloads = local_port_conf.rxmode.offloads;
>>>>>
>>>>>Why remove offloads in port_conf?
>>>>>There could be multiple ports, and on others desired HW offloads might
>>>>>be supported.
>>>> Trevor: Yes, there would be multiple ports, so if one of the ports lacks HW
>>>> offload, it would be ok to just use the relaxed requirement here, like we
>>>> previously talked about in
>>>> https://github.com/DPDK/dpdk/blob/main/examples/l3fwd-power/main.c#L487,
>>>> if you still think it's needed to use the per-port case, it would be ok to
>>>> use the ol_flags as discussed previously.
>>>
>>>
>>>But then, depending on the port order, you can end up with IP_CKSUM
>>>offload enabled on some ports (but not used), while completely disabled
>>>on other ports - even if these ports do support IP_CKSUM.
>>>I think the better way would be not to touch port_conf here, and above
>>>use ol_flags to decide whether we should compute cksum in SW or not.
>>>
>>>
>>>>>
>>>>>> +            printf("warning: modified the rx offload to 0x%"PRIx64" based on device"
>>>>>> +                " capability\n", local_port_conf.rxmode.offloads);
>>>>>> +        }
>>>>>> +
>>>>>>          ret = rte_eth_dev_configure(portid, nb_rx_queue,
>>>>>>                      (uint16_t)n_tx_queue, &local_port_conf);
>>>>>>          if (ret < 0)
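
For reference, the per-packet ol_flags approach discussed in this thread could look roughly like the sketch below. It is plain C with no DPDK headers, so that it can be compiled on its own: the RX_IP_CKSUM_* constants are stand-ins for the RTE_MBUF_F_RX_IP_CKSUM_* mbuf flags with values chosen only for illustration, and ipv4_hdr_cksum_is_valid() and check_ipv4_cksum() are hypothetical helper names, not existing l3fwd functions. It is only meant to show the decision logic (trust the hardware verdict when there is one, fall back to a software check when the status is unknown), not the code from the patchset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for DPDK's RTE_MBUF_F_RX_IP_CKSUM_* flags (illustrative values). */
#define RX_IP_CKSUM_UNKNOWN 0ULL
#define RX_IP_CKSUM_BAD     (1ULL << 4)
#define RX_IP_CKSUM_GOOD    (1ULL << 7)
#define RX_IP_CKSUM_MASK    (RX_IP_CKSUM_BAD | RX_IP_CKSUM_GOOD)

/*
 * Software check (hypothetical helper): the one's-complement sum of a valid
 * IPv4 header, checksum field included, folds to 0xffff.
 * Returns 1 if the header checksum is valid, 0 otherwise.
 */
static int
ipv4_hdr_cksum_is_valid(const uint8_t *hdr, size_t hdr_len)
{
	uint32_t sum = 0;
	size_t i;

	/* An IPv4 header is 20..60 bytes long and a multiple of 4 bytes. */
	assert(hdr_len >= 20 && hdr_len % 4 == 0);

	for (i = 0; i < hdr_len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum == 0xffff;
}

/*
 * Per-packet decision (hypothetical helper): returns 0 if the packet can be
 * accepted, -2 on a bad checksum (the same code the patch returns).
 */
static int
check_ipv4_cksum(const uint8_t *hdr, size_t hdr_len, uint64_t ol_flags)
{
	uint64_t status = ol_flags & RX_IP_CKSUM_MASK;

	if (status == RX_IP_CKSUM_UNKNOWN)	/* hardware did not check it */
		return ipv4_hdr_cksum_is_valid(hdr, hdr_len) ? 0 : -2;
	if (status == RX_IP_CKSUM_BAD)		/* hardware found it wrong */
		return -2;
	return 0;				/* hardware vouched for the header */
}
```

With this shape the decision is made per packet, so a port that supports IP checksum offload keeps using it while a port that does not falls back to the software check, and the global port_conf does not have to be modified.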