> > > >>>>>>>>> > > >>>>>>>>> testpmd will initialize default max packet length to 1518 which > > >>>>>>>>> doesn't include vlan tag size in ether overheader. Once, send the > > >>>>>>>>> max mtu length packet with vlan tag, the max packet length will > > >>>>>>>>> exceed 1518 that will cause packets dropped directly from NIC hw > > >>>>> side. > > >>>>>>>>> > > >>>>>>>>> ice can support dual vlan tags that need more 8 bytes for max > > >>>>>>>>> packet size, so, configures the correct max packet size in > > >>>>>>>>> dev_config > > >>>>>> ops. > > >>>>>>>>> > > >>>>>>>>> Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size") > > >>>>>>>>> > > >>>>>>>>> Signed-off-by: SteveX Yang <stevex.y...@intel.com> > > >>>>>>>>> --- > > >>>>>>>>> drivers/net/ice/ice_ethdev.c | 11 +++++++++++ > > >>>>>>>>> 1 file changed, 11 insertions(+) > > >>>>>>>>> > > >>>>>>>>> diff --git a/drivers/net/ice/ice_ethdev.c > > >>>>>>>>> b/drivers/net/ice/ice_ethdev.c index > > >>>>>>>>> cfd357b05..6b7098444 100644 > > >>>>>>>>> --- a/drivers/net/ice/ice_ethdev.c > > >>>>>>>>> +++ b/drivers/net/ice/ice_ethdev.c > > >>>>>>>>> @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev > > >>> *dev) > > >>>>>>>>> struct ice_adapter *ad = > > >>>>>>>>> ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private); > > >>>>>>>>> struct ice_pf *pf = > > >>>>>>>>> ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private); > > >>>>>>>>> +uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD; > > >>>>>>>>> int ret; > > >>>>>>>>> > > >>>>>>>>> /* Initialize to TRUE. If any of Rx queues doesn't meet the @@ > > >>>>>>>>> -3157,6 > > >>>>>>>>> +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev) > > >>>>>>>>> if (dev->data->dev_conf.rxmode.mq_mode & > > >>> ETH_MQ_RX_RSS_FLAG) > > >>>>>>>>> dev->data->dev_conf.rxmode.offloads |= > > >>>>>> DEV_RX_OFFLOAD_RSS_HASH; > > >>>>>>>>> > > >>>>>>>>> +/** > > >>>>>>>>> + * Considering QinQ packet, max frame size should be equal or > > >>>>>>>>> + * larger than total size of MTU and Ether overhead. > > >>>>>>>>> + */ > > >>>>>>>> > > >>>>>>>>> +if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) { > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> Why we need this check? > > >>>>>>>> Can we just call ice_mtu_set directly > > >>>>>>> > > >>>>>>> I think that without that check we can silently overwrite provided > > >>>>>>> by user dev_conf.rxmode.max_rx_pkt_len value. > > >>>>>> > > >>>>>> OK, I see > > >>>>>> > > >>>>>> But still have one question > > >>>>>> dev->data->mtu is initialized to 1518 as default , but if > > >>>>>> dev->data->application set > > >>>>>> dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure. > > >>>>>> does that mean we will still will set mtu to 1518, is this expected? > > >>>>>> > > >>>>> > > >>>>> max_rx_pkt_len should be larger than mtu at least, so we should raise > > >>>>> the max_rx_pkt_len (e.g.:1518) to hold expected mtu value (e.g.: > > >>>>> 1500). > > >>>> > > >>>> Ok, this describe the problem more general and better to replace exist > > >>> code comment and commit log for easy understanding. > > >>>> Please send a new version for reword > > >>>> > > >>> > > >>> I didn't really get this set. > > >>> > > >>> Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame > > >>> bigger than > > >>> this size is dropped. > > >> > > >> Sure, it is normal case for dropping oversize data. > > >> > > >>> Isn't this what should be, why we are trying to overwrite user > > >>> configuration > > >>> in PMD to prevent this? > > >>> > > >> > > >> But it is a confliction that application/user sets mtu & max_rx_pkt_len > > >> at the same time. > > >> This fix will make a decision when confliction occurred. > > >> MTU value will come from user operation (e.g.: port config mtu 0 1500) > > >> directly, > > >> so, the max_rx_pkt_len will resize itself to adapt expected MTU value if > > >> its size is smaller than MTU + Ether overhead. > > >> > > >>> During eth_dev allocation, mtu set to default '1500', by ethdev layer. > > >>> And testpmd sets 'max_rx_pkt_len' by default to '1518'. > > >>> I think Qi's concern above is valid, what is user set 'max_rx_pkt_len' > > >>> to '1000' > > >>> and mean it? PMD will not honor the user config. > > >> > > >> I'm not sure when set 'mtu' to '1500' and 'max_rx_pkt_len' to '1000', > > >> what's the behavior expected? > > >> If still keep the 'max_rx_pkt_len' value, that means the larger 'mtu' > > >> will be invalid. > > >> > > >>> > > >>> Why not simply increase the default 'max_rx_pkt_len' in testpmd? > > >>> > > >> The default 'max_rx_pkt_len' has been initialized to generical value > > >> (1518) and default 'mtu' is '1500' in testpmd, > > >> But it isn't suitable to those NIC drivers which Ether overhead is > > >> larger than 18. (e.g.: ice, i40e) if 'mtu' value is preferable. > > >> > > >>> And I guess even better what we need is to tell to the application what > > >>> the > > >>> frame overhead PMD accepts. > > >>> So the application can set proper 'max_rx_pkt_len' value per port for a > > >>> given/requested MTU value. > > >>> @Ian, cc'ed, was complaining almost same thing years ago, these PMD > > >>> overhead macros and 'max_mtu'/'min_mtu' added because of that, perhaps > > >>> he has a solution now? > > > > > > From my perspective the main problem here: > > > We have 2 different variables for nearly the same thing: > > > rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len. > > > and 2 different API to update them: dev_mtu_set() and dev_configure(). > > > > According API 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled' > > Although not sure that is practically what is done for all drivers. > > I think most of Intel PMDs use it unconditionally. > > > > > > And inside majority of Intel PMDs we don't keep these 2 variables in sync: > > > - mtu_set() will update both variables. > > > - dev_configure() will update only max_rx_pkt_len, but will keep mtu > > > intact. > > > > > > This patch fixes this inconsistency, which I think is a good thing. > > > Though yes, it introduces change in behaviour. > > > > > > Let say the code: > > > rte_eth_dev_set_mtu(port, 1500); > > > dev_conf.max_rx_pkt_len = 1000; > > > rte_eth_dev_configure(port, 1, 1, &dev_conf); > > > > > > > 'rte_eth_dev_configure()' is one of the first APIs called, it is called > > before > > 'rte_eth_dev_set_mtu(). > > Usually yes. > But you can still do sometimes later: dev_mtu_set(); ...; dev_stop(); > dev_configure(); dev_start(); > > > > > When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by default by > > ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is. > > See above. > PMD doesn't know where this MTU value came from (default ethdev value or user > specified value) > and probably it shouldn't care. > > > > > And later, when 'rte_eth_dev_set_mtu()' is called, but MTU and > > 'max_rx_pkt_len' > > are updated (mostly). > > Yes, in mtu_set() we update both. > But we don't update MTU in dev_configure(), only max_rx_pkt_len. > That what this patch tries to fix (as I understand it).
To be more precise - it doesn't change MTU value in dev_configure(), but instead doesn't allow max_rx_pkt_len to become smaller then MTU + OVERHEAD. Probably changing MTU value instead is a better choice. > > > > > > > Before the patch will result: > > > mtu==1500, max_rx_pkt_len=1000; //out of sync looks wrong to me > > > > > > After the patch: > > > mtu=1500, max_rx_ptk_len=1518; // in sync, change in behaviour. > > > > > > If you think we need to preserve current behaviour, > > > then I suppose the easiest thing would be to change dev_config() code > > > to update mtu value based on max_rx_pkt_len. > > > I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...} > > > So the code snippet above will result: > > > mtu=982,max_rx_pkt_len=1000; > > > > > > > The 'max_rx_ptk_len' is annoyance for a long time, what do you think to just > > drop it? > > > > By default device will be up with default MTU (1500), later > > 'rte_eth_dev_set_mtu' can be used to set the MTU, no frame size setting at > > all. > > > > Will this work? > > I think it might, but that's a big change, probably too risky at that stage... > > > > > > > > And for short term, for above Intel PMDs, there must be a place this > > 'max_rx_pkt_len' value taken into account (mostly 'start()' dev_ops), that > > function can be updated to take 'max_rx_pkt_len' only if JUMBO_FRAME set, > > otherwise use the 'MTU' value. > > Even if we'll use max_rx_pkt_len only when if JUMBO_FRAME is set, > I think we still need to keep max_rx_pkt_len and MTU values in sync. > > > > > Without 'start()' updated the current logic won't work after stop & start > > anyway. > > > > > > > > > > > > > > > >> > > >>> > > >>> And why this same thing can't happen to other PMDs? If this is a > > >>> problem for > > >>> all PMDs, we should solve in other level, not for only some PMDs. > > >>> > > >> No, all PMDs exist the same issue, another proposal: > > >> - rte_ethdev provides the unique resize 'max_rx_pkt_len' in > > >> rte_eth_dev_configure(); > > >> - provide the uniform API for fetching the NIC's supported Ether > > >> Overhead size; > > >> Is it feasible? > > >> > > >>>> > > >>>>> Generally, the mtu value can be adjustable from user (e.g.: ip link > > >>>>> set ens801f0 mtu 1400), hence, we just adjust the max_rx_pkt_len to > > >>>>> satisfy mtu requirement. > > >>>>> > > >>>>>> Should we just call ice_mtu_set(dev, dev_conf.rxmode.max_rx_pkt_len) > > >>>>>> here? > > >>>>> ice_mtu_set(dev, mtu) will append ether overhead to > > >>>>> frame_size/max_rx_pkt_len, so we need pass the mtu value as the 2nd > > >>>>> parameter, or not the max_rx_pkt_len. > > >>>>> > > >>>>>> > > >>>>>> > > >>>>>>> > > >>>>>>>> And please remove above comment, since ether overhead is already > > >>>>>>> considered in ice_mtu_set. > > >>>>> Ether overhead is already considered in ice_mtu_set, but it also > > >>>>> should be considered as the adjustment condition that if ice_mtu_set > > >>> need be invoked. > > >>>>> So, it perhaps should remain this comment before this if() condition. > > >>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>>> +ret = ice_mtu_set(dev, dev->data->mtu); if (ret != 0) return > > >>>>>>>>> +ret; } > > >>>>>>>>> + > > >>>>>>>>> ret = ice_init_rss(pf); > > >>>>>>>>> if (ret) { > > >>>>>>>>> PMD_DRV_LOG(ERR, "Failed to enable rss for PF"); > > >>>>>>>>> -- > > >>>>>>>>> 2.17.1 > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>>> > > >>>> > > >