> 
> On 10/20/2020 10:07 AM, Ananyev, Konstantin wrote:
> >
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> testpmd will initialize default max packet length to 1518 which
> >>>>>>>>>>>>>> doesn't include vlan tag size in ether overheader. Once, send 
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>> max mtu length packet with vlan tag, the max packet length will
> >>>>>>>>>>>>>> exceed 1518 that will cause packets dropped directly from NIC 
> >>>>>>>>>>>>>> hw
> >>>>>>>>>> side.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> ice can support dual vlan tags that need more 8 bytes for max
> >>>>>>>>>>>>>> packet size, so, configures the correct max packet size in
> >>>>>>>>>>>>>> dev_config
> >>>>>>>>>>> ops.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size")
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Signed-off-by: SteveX Yang <stevex.y...@intel.com>
> >>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>      drivers/net/ice/ice_ethdev.c | 11 +++++++++++
> >>>>>>>>>>>>>>      1 file changed, 11 insertions(+)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> diff --git a/drivers/net/ice/ice_ethdev.c
> >>>>>>>>>>>>>> b/drivers/net/ice/ice_ethdev.c index
> >>>>>>>>>>>>>> cfd357b05..6b7098444 100644
> >>>>>>>>>>>>>> --- a/drivers/net/ice/ice_ethdev.c
> >>>>>>>>>>>>>> +++ b/drivers/net/ice/ice_ethdev.c
> >>>>>>>>>>>>>> @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev
> >>>>>>>> *dev)
> >>>>>>>>>>>>>> struct ice_adapter *ad =
> >>>>>>>>>>>>>> ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
> >>>>>>>>>>>>>>      struct ice_pf *pf =
> >>>>>>>>>>>>>> ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> >>>>>>>>>>>>>> +uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD;
> >>>>>>>>>>>>>>      int ret;
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>      /* Initialize to TRUE. If any of Rx queues doesn't meet 
> >>>>>>>>>>>>>> the @@
> >>>>>>>>>>>>>> -3157,6
> >>>>>>>>>>>>>> +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev)
> >>>>>>>>>>>>>>      if (dev->data->dev_conf.rxmode.mq_mode &
> >>>>>>>> ETH_MQ_RX_RSS_FLAG)
> >>>>>>>>>>>>>> dev->data->dev_conf.rxmode.offloads |=
> >>>>>>>>>>> DEV_RX_OFFLOAD_RSS_HASH;
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> +/**
> >>>>>>>>>>>>>> + * Considering QinQ packet, max frame size should be equal or
> >>>>>>>>>>>>>> + * larger than total size of MTU and Ether overhead.
> >>>>>>>>>>>>>> + */
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> +if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) {
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Why we need this check?
> >>>>>>>>>>>>> Can we just call ice_mtu_set directly
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think that without that check we can silently overwrite 
> >>>>>>>>>>>> provided
> >>>>>>>>>>>> by user dev_conf.rxmode.max_rx_pkt_len value.
> >>>>>>>>>>>
> >>>>>>>>>>> OK, I see
> >>>>>>>>>>>
> >>>>>>>>>>> But still have one question
> >>>>>>>>>>> dev->data->mtu is initialized to 1518 as default , but if
> >>>>>>>>>>> dev->data->application set
> >>>>>>>>>>> dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure.
> >>>>>>>>>>> does that mean we will still will set mtu to 1518, is this 
> >>>>>>>>>>> expected?
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> max_rx_pkt_len should be larger than mtu at least, so we should 
> >>>>>>>>>> raise
> >>>>>>>>>> the max_rx_pkt_len (e.g.:1518) to hold expected mtu value (e.g.: 
> >>>>>>>>>> 1500).
> >>>>>>>>>
> >>>>>>>>> Ok, this describe the problem more general and better to replace 
> >>>>>>>>> exist
> >>>>>>>> code comment and commit log for easy understanding.
> >>>>>>>>> Please send a new version for reword
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I didn't really get this set.
> >>>>>>>>
> >>>>>>>> Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame 
> >>>>>>>> bigger than
> >>>>>>>> this size is dropped.
> >>>>>>>
> >>>>>>> Sure, it is normal case for dropping oversize data.
> >>>>>>>
> >>>>>>>> Isn't this what should be, why we are trying to overwrite user 
> >>>>>>>> configuration
> >>>>>>>> in PMD to prevent this?
> >>>>>>>>
> >>>>>>>
> >>>>>>> But it is a confliction that application/user sets mtu & 
> >>>>>>> max_rx_pkt_len at the same time.
> >>>>>>> This fix will make a decision when confliction occurred.
> >>>>>>> MTU value will come from user operation (e.g.: port config mtu 0 
> >>>>>>> 1500) directly,
> >>>>>>> so, the max_rx_pkt_len will resize itself to adapt expected MTU value 
> >>>>>>> if its size is smaller than MTU + Ether overhead.
> >>>>>>>
> >>>>>>>> During eth_dev allocation, mtu set to default '1500', by ethdev 
> >>>>>>>> layer.
> >>>>>>>> And testpmd sets 'max_rx_pkt_len' by default to '1518'.
> >>>>>>>> I think Qi's concern above is valid, what is user set 
> >>>>>>>> 'max_rx_pkt_len' to '1000'
> >>>>>>>> and mean it? PMD will not honor the user config.
> >>>>>>>
> >>>>>>> I'm not sure when set 'mtu' to '1500' and 'max_rx_pkt_len' to '1000', 
> >>>>>>> what's the behavior expected?
> >>>>>>> If still keep the 'max_rx_pkt_len' value, that means the larger 'mtu' 
> >>>>>>> will be invalid.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> Why not simply increase the default 'max_rx_pkt_len' in testpmd?
> >>>>>>>>
> >>>>>>> The default 'max_rx_pkt_len' has been initialized to generical value 
> >>>>>>> (1518) and default 'mtu' is '1500' in testpmd,
> >>>>>>> But it isn't suitable to those NIC drivers which Ether overhead is 
> >>>>>>> larger than 18. (e.g.: ice, i40e) if 'mtu' value is preferable.
> >>>>>>>
> >>>>>>>> And I guess even better what we need is to tell to the application 
> >>>>>>>> what the
> >>>>>>>> frame overhead PMD accepts.
> >>>>>>>> So the application can set proper 'max_rx_pkt_len' value per port 
> >>>>>>>> for a
> >>>>>>>> given/requested MTU value.
> >>>>>>>> @Ian, cc'ed, was complaining almost same thing years ago, these PMD
> >>>>>>>> overhead macros and 'max_mtu'/'min_mtu' added because of that, 
> >>>>>>>> perhaps
> >>>>>>>> he has a solution now?
> >>>>>>
> >>>>>>    From my perspective the main problem here:
> >>>>>> We have 2 different variables for nearly the same thing:
> >>>>>> rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len.
> >>>>>> and 2 different API to update them: dev_mtu_set() and dev_configure().
> >>>>>
> >>>>> According API 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled'
> >>>>> Although not sure that is practically what is done for all drivers.
> >>>>
> >>>> I think most of Intel PMDs use it unconditionally.
> >>>>
> >>>>>
> >>>>>> And inside majority of Intel PMDs we don't keep these 2 variables in 
> >>>>>> sync:
> >>>>>> - mtu_set() will update both variables.
> >>>>>> - dev_configure() will update only max_rx_pkt_len, but will keep mtu 
> >>>>>> intact.
> >>>>>>
> >>>>>> This patch fixes this inconsistency, which I think is a good thing.
> >>>>>> Though yes, it introduces change in behaviour.
> >>>>>>
> >>>>>> Let say the code:
> >>>>>> rte_eth_dev_set_mtu(port, 1500);
> >>>>>> dev_conf.max_rx_pkt_len = 1000;
> >>>>>> rte_eth_dev_configure(port, 1, 1, &dev_conf);
> >>>>>>
> >>>>>
> >>>>> 'rte_eth_dev_configure()' is one of the first APIs called, it is called 
> >>>>> before
> >>>>> 'rte_eth_dev_set_mtu().
> >>>>
> >>>> Usually yes.
> >>>> But you can still do sometimes later: dev_mtu_set(); ...; dev_stop(); 
> >>>> dev_configure(); dev_start();
> >>>>
> >>>>>
> >>>>> When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by 
> >>>>> default by
> >>>>> ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is.
> >>>>
> >>>> See above.
> >>>> PMD doesn't know where this MTU value came from (default ethdev value or 
> >>>> user specified value)
> >>>> and probably it shouldn't care.
> >>>>
> >>>>>
> >>>>> And later, when 'rte_eth_dev_set_mtu()' is called, but MTU and 
> >>>>> 'max_rx_pkt_len'
> >>>>> are updated (mostly).
> >>>>
> >>>> Yes, in mtu_set() we update both.
> >>>> But we don't update MTU in dev_configure(), only max_rx_pkt_len.
> >>>> That what this patch tries to fix (as I understand it).
> >>>
> >>> To be more precise - it doesn't change MTU value in dev_configure(),
> >>> but instead doesn't allow max_rx_pkt_len to become smaller
> >>> then MTU + OVERHEAD.
> >>> Probably changing MTU value instead is a better choice.
> >>>
> >>
> >> +1 to change mtu for this case.
> >> And this is what happens in practice when there is no 
> >> 'rte_eth_dev_set_mtu()'
> >> call, since PMD is using ('max_rx_pkt_len' - OVERHEAD) to set MTU.
> >
> > Hmm, I don't see that happens within Intel PMDs.
> > As I can read the code: if user never call mtu_set(), then MTU value is 
> > left intact.
> >
> 
> I was checking ice,
> in 'ice_dev_start()', 'rxmode.max_rx_pkt_len' is used to configure the device.

Yes, I am not arguing with that.
What I am saying - dev_config() doesn't update MTU based on max_rx_pkt_len.
While it probably should. 

> 
> >> But this won't solve the problem Steve is trying to solve.
> >
> > You mean we still need to update test-pmd code to calculate max_rx_pkt_len
> > properly for default mtu value?
> >
> 
> Yes.
> Because target of this set is able to receive packets with payload size
> 'RTE_ETHER_MTU', if MTU is updated according to the provided 'max_rx_pkt_len',
> device still won't able to receive those packets.

Agree.

> 
> >>>>>
> >>>>>
> >>>>>> Before the patch will result:
> >>>>>> mtu==1500, max_rx_pkt_len=1000;  //out of sync looks wrong to me
> >>>>>>
> >>>>>> After the patch:
> >>>>>> mtu=1500, max_rx_ptk_len=1518; // in sync, change in behaviour.
> >>>>>>
> >>>>>> If you think we need to preserve current behaviour,
> >>>>>> then I suppose the easiest thing would be to change dev_config() code
> >>>>>> to update mtu value based on max_rx_pkt_len.
> >>>>>> I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...}
> >>>>>> So the code snippet above will result:
> >>>>>> mtu=982,max_rx_pkt_len=1000;
> >>>>>>
> >>>>>
> >>>>> The 'max_rx_ptk_len' is annoyance for a long time, what do you think to 
> >>>>> just
> >>>>> drop it?
> >>>>>
> >>>>> By default device will be up with default MTU (1500), later
> >>>>> 'rte_eth_dev_set_mtu' can be used to set the MTU, no frame size setting 
> >>>>> at all.
> >>>>>
> >>>>> Will this work?
> >>>>
> >>>> I think it might, but that's a big change, probably too risky at that 
> >>>> stage...
> >>>>
> >>
> >> Defintely, I was thinking for 21.11. Let me send a deprecation notice and 
> >> see
> >> what happens.
> >>
> >>>>
> >>>>>
> >>>>>
> >>>>> And for short term, for above Intel PMDs, there must be a place this
> >>>>> 'max_rx_pkt_len' value taken into account (mostly 'start()' dev_ops), 
> >>>>> that
> >>>>> function can be updated to take 'max_rx_pkt_len' only if JUMBO_FRAME 
> >>>>> set,
> >>>>> otherwise use the 'MTU' value.
> >>>>
> >>>> Even if we'll use max_rx_pkt_len only when if JUMBO_FRAME is set,
> >>>> I think we still need to keep max_rx_pkt_len and MTU values in sync.
> >>>>
> >>>>>
> >>>>> Without 'start()' updated the current logic won't work after stop & 
> >>>>> start anyway.
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>> And why this same thing can't happen to other PMDs? If this is a 
> >>>>>>>> problem for
> >>>>>>>> all PMDs, we should solve in other level, not for only some PMDs.
> >>>>>>>>
> >>>>>>> No, all PMDs exist the same issue, another proposal:
> >>>>>>>     -  rte_ethdev provides the unique resize 'max_rx_pkt_len' in 
> >>>>>>> rte_eth_dev_configure();
> >>>>>>>     - provide the uniform API for fetching the NIC's supported Ether 
> >>>>>>> Overhead size;
> >>>>>>> Is it feasible?
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>> Generally, the mtu value can be adjustable from user (e.g.: ip link
> >>>>>>>>>> set ens801f0 mtu 1400), hence, we just adjust the max_rx_pkt_len to
> >>>>>>>>>> satisfy mtu requirement.
> >>>>>>>>>>
> >>>>>>>>>>> Should we just call ice_mtu_set(dev, 
> >>>>>>>>>>> dev_conf.rxmode.max_rx_pkt_len)
> >>>>>>>>>>> here?
> >>>>>>>>>> ice_mtu_set(dev, mtu) will append ether overhead to
> >>>>>>>>>> frame_size/max_rx_pkt_len, so we need pass the mtu value as the 2nd
> >>>>>>>>>> parameter, or not the max_rx_pkt_len.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> And please remove above comment, since ether overhead is already
> >>>>>>>>>>>> considered in ice_mtu_set.
> >>>>>>>>>> Ether overhead is already considered in ice_mtu_set, but it also
> >>>>>>>>>> should be considered as the adjustment condition that if 
> >>>>>>>>>> ice_mtu_set
> >>>>>>>> need be invoked.
> >>>>>>>>>> So, it perhaps should remain this comment before this if() 
> >>>>>>>>>> condition.
> >>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> +ret = ice_mtu_set(dev, dev->data->mtu); if (ret != 0) return
> >>>>>>>>>>>>>> +ret; }
> >>>>>>>>>>>>>> +
> >>>>>>>>>>>>>>      ret = ice_init_rss(pf);
> >>>>>>>>>>>>>>      if (ret) {
> >>>>>>>>>>>>>>      PMD_DRV_LOG(ERR, "Failed to enable rss for PF");
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> 2.17.1
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> >

Reply via email to