Hi Trevor,
At 2023-09-18 02:04:19, "Konstantin Ananyev" <konstantin.v.anan...@yandex.ru>
wrote:
03/09/2023 05:01, Trevor Tao wrote:
Currently the port Rx mq_mode is set to RTE_ETH_MQ_RX_RSS and the offload
mode to RTE_ETH_RX_OFFLOAD_CHECKSUM by default, but some hardware
and/or virtual interfaces do not support the presupposed RSS and offload
modes. For example, some virtio interfaces in the cloud don't support
RSS and may only partly support RTE_ETH_RX_OFFLOAD_UDP_CKSUM/
RTE_ETH_RX_OFFLOAD_TCP_CKSUM,
but not RTE_ETH_RX_OFFLOAD_IPV4_CKSUM, which leads to errors such as:
virtio_dev_configure(): RSS support requested but not supported by
the device
Port0 dev_configure = -95
and:
Ethdev port_id=0 requested Rx offloads 0xe does not match Rx offloads
capabilities 0x201d in rte_eth_dev_configure()
So to let l3fwd run in that environment, the Rx mode requirement
can be relaxed to reflect what the hardware actually supports, and l3fwd
can then run smoothly.
A warning message is printed to the user when this happens.
In addition, the software cksum check is enabled when
hw support is missing.
Fixes: af75078fece3 ("first public release")
Cc: sta...@dpdk.org
I don't think there was a bug here.
We are talking about changing current requirements for the app.
So not sure it is a real fix and that such change can be
propagated to stable releases.
Trevor: I think it's not a bug fix but a feature enhancement; it would enable
l3fwd to work smoothly on the HW/virtual interfaces which don't support RSS
and/or cksum offloading.
Yes, it seems like a sort of enhancement,
while 'Fixes: ...' tags are for bugs only.
AFAIK, only bug fixes are taken for backporting to stable releases.
That's why there seems to be no point in adding CC: sta...@dpdk.org
Other generic things:
- l3fwd doc and release notes probably need to be updated
- as you are introducing 2 distinct features: no-rss and no-ipv4-cksum
it is probably better to split it into 2 different patches (in the
same series).
Signed-off-by: Trevor Tao <taozj...@163.com>
---
examples/l3fwd/l3fwd.h | 12 +++++++++++-
examples/l3fwd/main.c | 21 +++++++++++++++++++--
2 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index b55855c932..cc10643c4b 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -115,6 +115,8 @@ extern struct acl_algorithms acl_alg[];
extern uint32_t max_pkt_len;
+extern struct rte_eth_conf port_conf;
+
/* Send burst of packets on an output interface */
static inline int
send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t port)
@@ -170,7 +172,15 @@ is_valid_ipv4_pkt(struct rte_ipv4_hdr *pkt, uint32_t
link_len)
return -1;
/* 2. The IP checksum must be correct. */
- /* this is checked in H/W */
+ /* if this is not checked in H/W, check it. */
+ if ((port_conf.rxmode.offloads & RTE_ETH_RX_OFFLOAD_IPV4_CKSUM) == 0) {
Might be better to check particular mbuf flag:
if ((mbuf->ol_flags & RTE_MBUF_F_RX_IP_CKSUM_MASK) ==
RTE_MBUF_F_RX_IP_CKSUM_UNKNOWN) {...}
Trevor: the utility function is_valid_ipv4_pkt only takes an IPv4 header, so
there is no mbuf information available. If needed, an extra ol_flags parameter
could be added to check whether the Ethernet device already verified the
checksum, but see the sample in:
https://github.com/DPDK/dpdk/blob/main/examples/l3fwd-power/main.c#L487
so I think it's OK to just use port_conf here. If you still think it's better
to use m->ol_flags, please tell me.
Yep, passing ol_flags or the mbuf itself seems like the proper way to do it.
The approach taken in l3fwd-power doesn't look right to me, see below.
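To make the ol_flags suggestion concrete, here is a minimal standalone sketch, not the actual l3fwd code. The DEMO_RX_IP_CKSUM_* macros are stand-ins for the RTE_MBUF_F_RX_IP_CKSUM_* flags (their values here are illustrative only), and demo_ipv4_hdr_sum() and demo_check_ipv4_cksum() are hypothetical helpers. It assumes the caller passes the raw IPv4 header bytes together with the mbuf's ol_flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for RTE_MBUF_F_RX_IP_CKSUM_*; values are illustrative only. */
#define DEMO_RX_IP_CKSUM_MASK    ((1ULL << 4) | (1ULL << 7))
#define DEMO_RX_IP_CKSUM_UNKNOWN 0ULL
#define DEMO_RX_IP_CKSUM_BAD     (1ULL << 4)
#define DEMO_RX_IP_CKSUM_GOOD    (1ULL << 7)

/* One's-complement sum of the header's 16-bit words, folded to 16 bits. */
static uint16_t
demo_ipv4_hdr_sum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Returns 0 when the IPv4 header checksum is acceptable, -2 otherwise. */
static int
demo_check_ipv4_cksum(const uint8_t *hdr, size_t len, uint64_t ol_flags)
{
	uint64_t status = ol_flags & DEMO_RX_IP_CKSUM_MASK;

	if (status == DEMO_RX_IP_CKSUM_GOOD)
		return 0;	/* the device already verified it */
	if (status == DEMO_RX_IP_CKSUM_BAD)
		return -2;	/* the device already rejected it */

	/*
	 * Anything else falls back to software in this sketch. A valid
	 * header, summed with its checksum field included, folds to 0xffff.
	 */
	return demo_ipv4_hdr_sum(hdr, len) == 0xffff ? 0 : -2;
}
```

Because the decision is taken per packet, ports with and without the HW offload can coexist. The software path also only reads the header, so hdr_checksum is never zeroed and does not need to be restored afterwards.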
+ uint16_t actual_cksum, expected_cksum;
+ actual_cksum = pkt->hdr_checksum;
+ pkt->hdr_checksum = 0;
+ expected_cksum = rte_ipv4_cksum(pkt);
+ if (actual_cksum != expected_cksum)
+ return -2;
+ }
/*
* 3. The IP version number must be 4. If the version number is not 4
diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 6063eb1399..37aec64718 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -117,7 +117,7 @@ static struct lcore_params * lcore_params =
lcore_params_array_default;
static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
sizeof(lcore_params_array_default[0]);
-static struct rte_eth_conf port_conf = {
+struct rte_eth_conf port_conf = {
.rxmode = {
.mq_mode = RTE_ETH_MQ_RX_RSS,
.offloads = RTE_ETH_RX_OFFLOAD_CHECKSUM,
@@ -1257,8 +1257,12 @@ l3fwd_poll_resource_setup(void)
local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
dev_info.flow_type_rss_offloads;
- if (dev_info.max_rx_queues == 1)
+ /* relax the rx rss requirement */
+ if (dev_info.max_rx_queues == 1 ||
!local_port_conf.rx_adv_conf.rss_conf.rss_hf) {
+ printf("warning: modified the rx mq_mode to
RTE_ETH_MQ_RX_NONE base on"
+ " device capability\n");
local_port_conf.rxmode.mq_mode = RTE_ETH_MQ_RX_NONE;
Should we instead have a new command-line option to explicitly
disable RSS?
Something like: '--no-rss' or so?
Trevor: the RSS capability of a given port is obtained automatically through
rte_eth_dev_info_get(), and we think the user should not have to care about it
beforehand. If it is missing, a warning is printed about the downgrade so that
l3fwd can still run smoothly.
Personally, I still think it would be better if the user
had the ability to disable it explicitly,
same as l3fwd does now with 'parse-ptype'.
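As a rough illustration of the explicit-option idea, the sketch below parses a hypothetical '--no-rss' long option with getopt_long() and folds it into the mq_mode decision. None of these names exist in l3fwd today: DEMO_OPT_NO_RSS, struct demo_opts, demo_parse_args() and demo_pick_mq_mode() are assumptions for illustration, and enum demo_mq_mode stands in for RTE_ETH_MQ_RX_NONE / RTE_ETH_MQ_RX_RSS.

```c
#include <getopt.h>
#include <stdio.h>

/* Hypothetical option name; not an existing l3fwd flag. */
#define DEMO_OPT_NO_RSS "no-rss"

/* Stand-in for RTE_ETH_MQ_RX_NONE / RTE_ETH_MQ_RX_RSS. */
enum demo_mq_mode { DEMO_MQ_RX_NONE, DEMO_MQ_RX_RSS };

struct demo_opts {
	int no_rss;	/* 1 when the user explicitly disabled RSS */
};

/* Parses the command line; returns 0 on success, -1 on an unknown option. */
static int
demo_parse_args(int argc, char **argv, struct demo_opts *opts)
{
	static const struct option lgopts[] = {
		{ DEMO_OPT_NO_RSS, no_argument, NULL, 1 },
		{ NULL, 0, NULL, 0 },
	};
	int opt;

	opts->no_rss = 0;
	optind = 1;	/* restart scanning so the helper can be called again */
	while ((opt = getopt_long(argc, argv, "", lgopts, NULL)) != -1) {
		switch (opt) {
		case 1:
			opts->no_rss = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}

/* RSS stays on unless the user opted out or the port has a single Rx queue. */
static enum demo_mq_mode
demo_pick_mq_mode(const struct demo_opts *opts, unsigned int max_rx_queues)
{
	if (opts->no_rss || max_rx_queues == 1)
		return DEMO_MQ_RX_NONE;
	return DEMO_MQ_RX_RSS;
}
```

With this shape the default behaviour is unchanged, and a device that cannot do RSS still fails loudly at configure time unless the user asked for the relaxed mode.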
+ }
if (local_port_conf.rx_adv_conf.rss_conf.rss_hf !=
port_conf.rx_adv_conf.rss_conf.rss_hf) {
@@ -1269,6 +1273,19 @@ l3fwd_poll_resource_setup(void)
local_port_conf.rx_adv_conf.rss_conf.rss_hf);
}
+ /* relax the rx offload requirement */
+ if ((local_port_conf.rxmode.offloads &
dev_info.rx_offload_capa) !=
+ local_port_conf.rxmode.offloads) {
+ printf("Port %u requested Rx offloads 0x%"PRIx64" does
not"
+ " match Rx offloads capabilities 0x%"PRIx64"\n",
+ portid, local_port_conf.rxmode.offloads,
+ dev_info.rx_offload_capa);
+ local_port_conf.rxmode.offloads &=
dev_info.rx_offload_capa;
+ port_conf.rxmode.offloads =
local_port_conf.rxmode.offloads;
Why remove offloads in port_conf?
There could be multiple ports, and on others the desired HW offloads might
be supported.
Trevor: Yes, there could be multiple ports, so if one of the ports lacks a HW
offload, it would be OK to just use the relaxed requirement here, as we
discussed earlier regarding
https://github.com/DPDK/dpdk/blob/main/examples/l3fwd-power/main.c#L487. If you
still think a per-port decision is needed, it would be OK to use
ol_flags as discussed previously.
But then, depending on the port order, you can end up with IP_CKSUM
offload enabled on some ports (but not used), while completely disabled
on other ports, even if these ports do support IP_CKSUM.
I think the better way would be not to touch port_conf here, and above
use ol_flags to decide whether we should compute the cksum in SW or not.
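A small sketch of the per-port alternative, using the numbers from the error message in the commit log (requested 0xe, capabilities 0x201d). demo_port_rx_offloads() is a hypothetical helper and DEMO_RX_OFFLOAD_CHECKSUM is a stand-in for RTE_ETH_RX_OFFLOAD_CHECKSUM. The point is only that the mask is applied to a per-port value while the shared default is passed by value and never modified.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for RTE_ETH_RX_OFFLOAD_CHECKSUM, using the 0xe from the log. */
#define DEMO_RX_OFFLOAD_CHECKSUM UINT64_C(0xe)

/*
 * Returns the Rx offloads to request on one port. 'wanted' is the shared
 * default and is passed by value, so other ports still see all of it.
 */
static uint64_t
demo_port_rx_offloads(uint16_t portid, uint64_t wanted, uint64_t capa)
{
	uint64_t missing = wanted & ~capa;

	if (missing != 0)
		printf("warning: port %u lacks Rx offloads 0x%" PRIx64
			", requesting 0x%" PRIx64 " instead\n",
			(unsigned int)portid, missing, wanted & capa);
	return wanted & capa;
}
```

For the values in the log, 0xe & 0x201d gives 0xc and the missing bit is 0x2, which according to the commit message is the IPv4 checksum offload. A second port whose capabilities include that bit would still be configured with the full 0xe.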
+ printf("warning: modified the rx offload to 0x%"PRIx64"
based on device"
+ " capability\n",
local_port_conf.rxmode.offloads);
+ }
+
ret = rte_eth_dev_configure(portid, nb_rx_queue,
(uint16_t)n_tx_queue, &local_port_conf);
if (ret < 0)