> >>>>>>>>>>>> > >>>>>>>>>>>> testpmd will initialize default max packet length to 1518 which > >>>>>>>>>>>> doesn't include vlan tag size in ether overheader. Once, send the > >>>>>>>>>>>> max mtu length packet with vlan tag, the max packet length will > >>>>>>>>>>>> exceed 1518 that will cause packets dropped directly from NIC hw > >>>>>>>> side. > >>>>>>>>>>>> > >>>>>>>>>>>> ice can support dual vlan tags that need more 8 bytes for max > >>>>>>>>>>>> packet size, so, configures the correct max packet size in > >>>>>>>>>>>> dev_config > >>>>>>>>> ops. > >>>>>>>>>>>> > >>>>>>>>>>>> Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size") > >>>>>>>>>>>> > >>>>>>>>>>>> Signed-off-by: SteveX Yang <stevex.y...@intel.com> > >>>>>>>>>>>> --- > >>>>>>>>>>>> drivers/net/ice/ice_ethdev.c | 11 +++++++++++ > >>>>>>>>>>>> 1 file changed, 11 insertions(+) > >>>>>>>>>>>> > >>>>>>>>>>>> diff --git a/drivers/net/ice/ice_ethdev.c > >>>>>>>>>>>> b/drivers/net/ice/ice_ethdev.c index > >>>>>>>>>>>> cfd357b05..6b7098444 100644 > >>>>>>>>>>>> --- a/drivers/net/ice/ice_ethdev.c > >>>>>>>>>>>> +++ b/drivers/net/ice/ice_ethdev.c > >>>>>>>>>>>> @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev > >>>>>> *dev) > >>>>>>>>>>>> struct ice_adapter *ad = > >>>>>>>>>>>> ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); > >>>>>>>>>>>> struct ice_pf *pf = > >>>>>>>>>>>> ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); > >>>>>>>>>>>> +uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD; > >>>>>>>>>>>> int ret; > >>>>>>>>>>>> > >>>>>>>>>>>> /* Initialize to TRUE. If any of Rx queues doesn't meet the > >>>>>>>>>>>> @@ > >>>>>>>>>>>> -3157,6 > >>>>>>>>>>>> +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev) > >>>>>>>>>>>> if (dev->data->dev_conf.rxmode.mq_mode & > >>>>>> ETH_MQ_RX_RSS_FLAG) > >>>>>>>>>>>> dev->data->dev_conf.rxmode.offloads |= > >>>>>>>>> DEV_RX_OFFLOAD_RSS_HASH; > >>>>>>>>>>>> > >>>>>>>>>>>> +/** > >>>>>>>>>>>> + * Considering QinQ packet, max frame size should be equal or > >>>>>>>>>>>> + * larger than total size of MTU and Ether overhead. > >>>>>>>>>>>> + */ > >>>>>>>>>>> > >>>>>>>>>>>> +if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) { > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Why we need this check? > >>>>>>>>>>> Can we just call ice_mtu_set directly > >>>>>>>>>> > >>>>>>>>>> I think that without that check we can silently overwrite provided > >>>>>>>>>> by user dev_conf.rxmode.max_rx_pkt_len value. > >>>>>>>>> > >>>>>>>>> OK, I see > >>>>>>>>> > >>>>>>>>> But still have one question > >>>>>>>>> dev->data->mtu is initialized to 1518 as default , but if > >>>>>>>>> dev->data->application set > >>>>>>>>> dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure. > >>>>>>>>> does that mean we will still will set mtu to 1518, is this expected? > >>>>>>>>> > >>>>>>>> > >>>>>>>> max_rx_pkt_len should be larger than mtu at least, so we should raise > >>>>>>>> the max_rx_pkt_len (e.g.:1518) to hold expected mtu value (e.g.: > >>>>>>>> 1500). > >>>>>>> > >>>>>>> Ok, this describe the problem more general and better to replace exist > >>>>>> code comment and commit log for easy understanding. > >>>>>>> Please send a new version for reword > >>>>>>> > >>>>>> > >>>>>> I didn't really get this set. > >>>>>> > >>>>>> Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame > >>>>>> bigger than > >>>>>> this size is dropped. > >>>>> > >>>>> Sure, it is normal case for dropping oversize data. > >>>>> > >>>>>> Isn't this what should be, why we are trying to overwrite user > >>>>>> configuration > >>>>>> in PMD to prevent this? > >>>>>> > >>>>> > >>>>> But it is a confliction that application/user sets mtu & max_rx_pkt_len > >>>>> at the same time. > >>>>> This fix will make a decision when confliction occurred. > >>>>> MTU value will come from user operation (e.g.: port config mtu 0 1500) > >>>>> directly, > >>>>> so, the max_rx_pkt_len will resize itself to adapt expected MTU value > >>>>> if its size is smaller than MTU + Ether overhead. > >>>>> > >>>>>> During eth_dev allocation, mtu set to default '1500', by ethdev layer. > >>>>>> And testpmd sets 'max_rx_pkt_len' by default to '1518'. > >>>>>> I think Qi's concern above is valid, what is user set 'max_rx_pkt_len' > >>>>>> to '1000' > >>>>>> and mean it? PMD will not honor the user config. > >>>>> > >>>>> I'm not sure when set 'mtu' to '1500' and 'max_rx_pkt_len' to '1000', > >>>>> what's the behavior expected? > >>>>> If still keep the 'max_rx_pkt_len' value, that means the larger 'mtu' > >>>>> will be invalid. > >>>>> > >>>>>> > >>>>>> Why not simply increase the default 'max_rx_pkt_len' in testpmd? > >>>>>> > >>>>> The default 'max_rx_pkt_len' has been initialized to generical value > >>>>> (1518) and default 'mtu' is '1500' in testpmd, > >>>>> But it isn't suitable to those NIC drivers which Ether overhead is > >>>>> larger than 18. (e.g.: ice, i40e) if 'mtu' value is preferable. > >>>>> > >>>>>> And I guess even better what we need is to tell to the application > >>>>>> what the > >>>>>> frame overhead PMD accepts. > >>>>>> So the application can set proper 'max_rx_pkt_len' value per port for a > >>>>>> given/requested MTU value. > >>>>>> @Ian, cc'ed, was complaining almost same thing years ago, these PMD > >>>>>> overhead macros and 'max_mtu'/'min_mtu' added because of that, perhaps > >>>>>> he has a solution now? > >>>> > >>>> From my perspective the main problem here: > >>>> We have 2 different variables for nearly the same thing: > >>>> rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len. > >>>> and 2 different API to update them: dev_mtu_set() and dev_configure(). > >>> > >>> According API 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled' > >>> Although not sure that is practically what is done for all drivers. > >> > >> I think most of Intel PMDs use it unconditionally. > >> > >>> > >>>> And inside majority of Intel PMDs we don't keep these 2 variables in > >>>> sync: > >>>> - mtu_set() will update both variables. > >>>> - dev_configure() will update only max_rx_pkt_len, but will keep mtu > >>>> intact. > >>>> > >>>> This patch fixes this inconsistency, which I think is a good thing. > >>>> Though yes, it introduces change in behaviour. > >>>> > >>>> Let say the code: > >>>> rte_eth_dev_set_mtu(port, 1500); > >>>> dev_conf.max_rx_pkt_len = 1000; > >>>> rte_eth_dev_configure(port, 1, 1, &dev_conf); > >>>> > >>> > >>> 'rte_eth_dev_configure()' is one of the first APIs called, it is called > >>> before > >>> 'rte_eth_dev_set_mtu(). > >> > >> Usually yes. > >> But you can still do sometimes later: dev_mtu_set(); ...; dev_stop(); > >> dev_configure(); dev_start(); > >> > >>> > >>> When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by default > >>> by > >>> ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is. > >> > >> See above. > >> PMD doesn't know where this MTU value came from (default ethdev value or > >> user specified value) > >> and probably it shouldn't care. > >> > >>> > >>> And later, when 'rte_eth_dev_set_mtu()' is called, but MTU and > >>> 'max_rx_pkt_len' > >>> are updated (mostly). > >> > >> Yes, in mtu_set() we update both. > >> But we don't update MTU in dev_configure(), only max_rx_pkt_len. > >> That what this patch tries to fix (as I understand it). > > > > To be more precise - it doesn't change MTU value in dev_configure(), > > but instead doesn't allow max_rx_pkt_len to become smaller > > then MTU + OVERHEAD. > > Probably changing MTU value instead is a better choice. > > > > +1 to change mtu for this case. > And this is what happens in practice when there is no 'rte_eth_dev_set_mtu()' > call, since PMD is using ('max_rx_pkt_len' - OVERHEAD) to set MTU.
Hmm, I don't see that happens within Intel PMDs. As I can read the code: if user never call mtu_set(), then MTU value is left intact. > But this won't solve the problem Steve is trying to solve. You mean we still need to update test-pmd code to calculate max_rx_pkt_len properly for default mtu value? > >>> > >>> > >>>> Before the patch will result: > >>>> mtu==1500, max_rx_pkt_len=1000; //out of sync looks wrong to me > >>>> > >>>> After the patch: > >>>> mtu=1500, max_rx_ptk_len=1518; // in sync, change in behaviour. > >>>> > >>>> If you think we need to preserve current behaviour, > >>>> then I suppose the easiest thing would be to change dev_config() code > >>>> to update mtu value based on max_rx_pkt_len. > >>>> I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...} > >>>> So the code snippet above will result: > >>>> mtu=982,max_rx_pkt_len=1000; > >>>> > >>> > >>> The 'max_rx_ptk_len' is annoyance for a long time, what do you think to > >>> just > >>> drop it? > >>> > >>> By default device will be up with default MTU (1500), later > >>> 'rte_eth_dev_set_mtu' can be used to set the MTU, no frame size setting > >>> at all. > >>> > >>> Will this work? > >> > >> I think it might, but that's a big change, probably too risky at that > >> stage... > >> > > Defintely, I was thinking for 21.11. Let me send a deprecation notice and see > what happens. > > >> > >>> > >>> > >>> And for short term, for above Intel PMDs, there must be a place this > >>> 'max_rx_pkt_len' value taken into account (mostly 'start()' dev_ops), that > >>> function can be updated to take 'max_rx_pkt_len' only if JUMBO_FRAME set, > >>> otherwise use the 'MTU' value. > >> > >> Even if we'll use max_rx_pkt_len only when if JUMBO_FRAME is set, > >> I think we still need to keep max_rx_pkt_len and MTU values in sync. > >> > >>> > >>> Without 'start()' updated the current logic won't work after stop & start > >>> anyway. > >>> > >>> > >>>> > >>>> > >>>> > >>>>> > >>>>>> > >>>>>> And why this same thing can't happen to other PMDs? If this is a > >>>>>> problem for > >>>>>> all PMDs, we should solve in other level, not for only some PMDs. > >>>>>> > >>>>> No, all PMDs exist the same issue, another proposal: > >>>>> - rte_ethdev provides the unique resize 'max_rx_pkt_len' in > >>>>> rte_eth_dev_configure(); > >>>>> - provide the uniform API for fetching the NIC's supported Ether > >>>>> Overhead size; > >>>>> Is it feasible? > >>>>> > >>>>>>> > >>>>>>>> Generally, the mtu value can be adjustable from user (e.g.: ip link > >>>>>>>> set ens801f0 mtu 1400), hence, we just adjust the max_rx_pkt_len to > >>>>>>>> satisfy mtu requirement. > >>>>>>>> > >>>>>>>>> Should we just call ice_mtu_set(dev, dev_conf.rxmode.max_rx_pkt_len) > >>>>>>>>> here? > >>>>>>>> ice_mtu_set(dev, mtu) will append ether overhead to > >>>>>>>> frame_size/max_rx_pkt_len, so we need pass the mtu value as the 2nd > >>>>>>>> parameter, or not the max_rx_pkt_len. > >>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>> And please remove above comment, since ether overhead is already > >>>>>>>>>> considered in ice_mtu_set. > >>>>>>>> Ether overhead is already considered in ice_mtu_set, but it also > >>>>>>>> should be considered as the adjustment condition that if ice_mtu_set > >>>>>> need be invoked. > >>>>>>>> So, it perhaps should remain this comment before this if() condition. > >>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>>> +ret = ice_mtu_set(dev, dev->data->mtu); if (ret != 0) return > >>>>>>>>>>>> +ret; } > >>>>>>>>>>>> + > >>>>>>>>>>>> ret = ice_init_rss(pf); > >>>>>>>>>>>> if (ret) { > >>>>>>>>>>>> PMD_DRV_LOG(ERR, "Failed to enable rss for PF"); > >>>>>>>>>>>> -- > >>>>>>>>>>>> 2.17.1 > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>> > >