On 10/21/2020 10:47 AM, Ananyev, Konstantin wrote:
On 10/20/2020 10:07 AM, Ananyev, Konstantin wrote:testpmd will initialize default max packet length to 1518 which doesn't include vlan tag size in ether overheader. Once, send the max mtu length packet with vlan tag, the max packet length will exceed 1518 that will cause packets dropped directly from NIC hwside.ice can support dual vlan tags that need more 8 bytes for max packet size, so, configures the correct max packet size in dev_configops.Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size") Signed-off-by: SteveX Yang <stevex.y...@intel.com> --- drivers/net/ice/ice_ethdev.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c index cfd357b05..6b7098444 100644 --- a/drivers/net/ice/ice_ethdev.c +++ b/drivers/net/ice/ice_ethdev.c @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev*dev)struct ice_adapter *ad = ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); struct ice_pf *pf = ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); +uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD; int ret; /* Initialize to TRUE. If any of Rx queues doesn't meet the @@ -3157,6 +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev) if (dev->data->dev_conf.rxmode.mq_mode Ð_MQ_RX_RSS_FLAG)dev->data->dev_conf.rxmode.offloads |=DEV_RX_OFFLOAD_RSS_HASH;+/** + * Considering QinQ packet, max frame size should be equal or + * larger than total size of MTU and Ether overhead. + */+if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) {Why we need this check? Can we just call ice_mtu_set directlyI think that without that check we can silently overwrite provided by user dev_conf.rxmode.max_rx_pkt_len value.OK, I see But still have one question dev->data->mtu is initialized to 1518 as default , but if dev->data->application set dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure. does that mean we will still will set mtu to 1518, is this expected?max_rx_pkt_len should be larger than mtu at least, so we should raise the max_rx_pkt_len (e.g.:1518) to hold expected mtu value (e.g.: 1500).Ok, this describe the problem more general and better to replace existcode comment and commit log for easy understanding.Please send a new version for rewordI didn't really get this set. Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame bigger than this size is dropped.Sure, it is normal case for dropping oversize data.Isn't this what should be, why we are trying to overwrite user configuration in PMD to prevent this?But it is a confliction that application/user sets mtu & max_rx_pkt_len at the same time. This fix will make a decision when confliction occurred. MTU value will come from user operation (e.g.: port config mtu 0 1500) directly, so, the max_rx_pkt_len will resize itself to adapt expected MTU value if its size is smaller than MTU + Ether overhead.During eth_dev allocation, mtu set to default '1500', by ethdev layer. And testpmd sets 'max_rx_pkt_len' by default to '1518'. I think Qi's concern above is valid, what is user set 'max_rx_pkt_len' to '1000' and mean it? PMD will not honor the user config.I'm not sure when set 'mtu' to '1500' and 'max_rx_pkt_len' to '1000', what's the behavior expected? If still keep the 'max_rx_pkt_len' value, that means the larger 'mtu' will be invalid.Why not simply increase the default 'max_rx_pkt_len' in testpmd?The default 'max_rx_pkt_len' has been initialized to generical value (1518) and default 'mtu' is '1500' in testpmd, But it isn't suitable to those NIC drivers which Ether overhead is larger than 18. (e.g.: ice, i40e) if 'mtu' value is preferable.And I guess even better what we need is to tell to the application what the frame overhead PMD accepts. So the application can set proper 'max_rx_pkt_len' value per port for a given/requested MTU value. @Ian, cc'ed, was complaining almost same thing years ago, these PMD overhead macros and 'max_mtu'/'min_mtu' added because of that, perhaps he has a solution now?From my perspective the main problem here: We have 2 different variables for nearly the same thing: rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len. and 2 different API to update them: dev_mtu_set() and dev_configure().According API 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled' Although not sure that is practically what is done for all drivers.I think most of Intel PMDs use it unconditionally.And inside majority of Intel PMDs we don't keep these 2 variables in sync: - mtu_set() will update both variables. - dev_configure() will update only max_rx_pkt_len, but will keep mtu intact. This patch fixes this inconsistency, which I think is a good thing. Though yes, it introduces change in behaviour. Let say the code: rte_eth_dev_set_mtu(port, 1500); dev_conf.max_rx_pkt_len = 1000; rte_eth_dev_configure(port, 1, 1, &dev_conf);'rte_eth_dev_configure()' is one of the first APIs called, it is called before 'rte_eth_dev_set_mtu().Usually yes. But you can still do sometimes later: dev_mtu_set(); ...; dev_stop(); dev_configure(); dev_start();When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by default by ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is.See above. PMD doesn't know where this MTU value came from (default ethdev value or user specified value) and probably it shouldn't care.And later, when 'rte_eth_dev_set_mtu()' is called, but MTU and 'max_rx_pkt_len' are updated (mostly).Yes, in mtu_set() we update both. But we don't update MTU in dev_configure(), only max_rx_pkt_len. That what this patch tries to fix (as I understand it).To be more precise - it doesn't change MTU value in dev_configure(), but instead doesn't allow max_rx_pkt_len to become smaller then MTU + OVERHEAD. Probably changing MTU value instead is a better choice.+1 to change mtu for this case. And this is what happens in practice when there is no 'rte_eth_dev_set_mtu()' call, since PMD is using ('max_rx_pkt_len' - OVERHEAD) to set MTU.Hmm, I don't see that happens within Intel PMDs. As I can read the code: if user never call mtu_set(), then MTU value is left intact.I was checking ice, in 'ice_dev_start()', 'rxmode.max_rx_pkt_len' is used to configure the device.Yes, I am not arguing with that. What I am saying - dev_config() doesn't update MTU based on max_rx_pkt_len. While it probably should.
Yes 'dev_configure()' doesn't update the 'dev->data->mtu' and 'max_rx_pkt_len' & 'dev->data->mtu' may diverge there.
I think best place to update 'dev->data->mtu' is where the device is actually updated, but to prevent the diversion above we can update 'dev->data->mtu' in ethdev layer, in 'rte_eth_dev_configure()' based on 'max_rx_pkt_len', will it work?
Only concern I see is if user reads the MTU ('rte_eth_dev_get_mtu()') after 'rte_eth_dev_configure()' but before device configured, user will get the wrong value, I guess that problem was already there but changing default value may make it more visible.
But this won't solve the problem Steve is trying to solve.You mean we still need to update test-pmd code to calculate max_rx_pkt_len properly for default mtu value?Yes. Because target of this set is able to receive packets with payload size 'RTE_ETHER_MTU', if MTU is updated according to the provided 'max_rx_pkt_len', device still won't able to receive those packets.Agree.Before the patch will result: mtu==1500, max_rx_pkt_len=1000; //out of sync looks wrong to me After the patch: mtu=1500, max_rx_ptk_len=1518; // in sync, change in behaviour. If you think we need to preserve current behaviour, then I suppose the easiest thing would be to change dev_config() code to update mtu value based on max_rx_pkt_len. I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...} So the code snippet above will result: mtu=982,max_rx_pkt_len=1000;The 'max_rx_ptk_len' is annoyance for a long time, what do you think to just drop it? By default device will be up with default MTU (1500), later 'rte_eth_dev_set_mtu' can be used to set the MTU, no frame size setting at all. Will this work?I think it might, but that's a big change, probably too risky at that stage...Defintely, I was thinking for 21.11. Let me send a deprecation notice and see what happens.And for short term, for above Intel PMDs, there must be a place this 'max_rx_pkt_len' value taken into account (mostly 'start()' dev_ops), that function can be updated to take 'max_rx_pkt_len' only if JUMBO_FRAME set, otherwise use the 'MTU' value.Even if we'll use max_rx_pkt_len only when if JUMBO_FRAME is set, I think we still need to keep max_rx_pkt_len and MTU values in sync.Without 'start()' updated the current logic won't work after stop & start anyway.And why this same thing can't happen to other PMDs? If this is a problem for all PMDs, we should solve in other level, not for only some PMDs.No, all PMDs exist the same issue, another proposal: - rte_ethdev provides the unique resize 'max_rx_pkt_len' in rte_eth_dev_configure(); - provide the uniform API for fetching the NIC's supported Ether Overhead size; Is it feasible?Generally, the mtu value can be adjustable from user (e.g.: ip link set ens801f0 mtu 1400), hence, we just adjust the max_rx_pkt_len to satisfy mtu requirement.Should we just call ice_mtu_set(dev, dev_conf.rxmode.max_rx_pkt_len) here?ice_mtu_set(dev, mtu) will append ether overhead to frame_size/max_rx_pkt_len, so we need pass the mtu value as the 2nd parameter, or not the max_rx_pkt_len.And please remove above comment, since ether overhead is alreadyconsidered in ice_mtu_set.Ether overhead is already considered in ice_mtu_set, but it also should be considered as the adjustment condition that if ice_mtu_setneed be invoked.So, it perhaps should remain this comment before this if() condition.+ret = ice_mtu_set(dev, dev->data->mtu); if (ret != 0) return +ret; } + ret = ice_init_rss(pf); if (ret) { PMD_DRV_LOG(ERR, "Failed to enable rss for PF"); -- 2.17.1