On 6/24/20 9:32 PM, Ferruh Yigit wrote: > On 6/24/2020 9:52 AM, Ferruh Yigit wrote: >> On 6/24/2020 4:48 AM, Chengchang Tang wrote: >>> >>> On 2020/6/23 17:30, Andrew Rybchenko wrote: >>>> On 6/23/20 9:48 AM, Chengchang Tang wrote: >>>>> In common practice, PMD configure the rx_buf_size according to the data >>>>> room size of the object in mempool. But in fact the final value is related >>>>> to the specifications of hw, and its values will affect the number of >>>>> fragments in recieving pkts. >>>>> >>>>> At present, we seem to have no way to espose relevant information to upper >>>>> layer users. >>>>> >>>>> Add a field named rx_bufsize in rte_eth_rxq_info to indicate the buffer >>>>> size used in recieving pkts for hw. >>>>> >>>> >>>> I'm OK with the change in general. >>>> I'm unsure which name to use: 'rx_buf_size' or 'rx_bursize', >>>> since I found both 'min_rx_buf_size' and 'min_rx_bufsize' in >>>> ethdev. >>>> >>>> I think it is important to update PMDs which provides the >>>> information to fill the field in. >>> >>> My plan is to divide the subsequent series into two patches, >>> one to modify rte_eth_rxq_info, and one to add our hns3 PMD >>> implementation of rxq_info_get. Should i update all the PMDs >>> that provide this information and test programs such as >>> testpmd at the same time? >> >> Hi Chengchang, Andrew, >> >> No objection to the change, but it should be crystal clear what is added. >> These >> are for PMD developers to implement and when it is not clear we end up having >> different implementations and inconsistencies. >> >> There is already some confusion for the Rx packet size etc.. my concern is >> adding more to it, here all we have is "size of RX buffer." comment, I think >> we >> need more. > > cc'ed a few more people. > > Back to this favorite topic, how to configure/limit the packet size. > > Can you please help to have a common/correct understanding? I tried to clarify > as much as I got it, any comment welcome. (I know it is long, please bare > with me) > > > The related config options I can see, > 1) rte_eth_conf->rxmode->max_rx_pkt_len > 2) rte_eth_dev_info->max_rx_pktlen > 3) DEV_RX_OFFLOAD_JUMBO_FRAME > 4) rte_eth_dev->data->mtu > 5) DEV_RX_OFFLOAD_SCATTER > 6) dev->data->scattered_rx > 7) rte_eth_dev_info->min_mtu, rte_eth_dev_info->max_mtu > > 'mtu' (4): Both for Tx and Rx. The network layer payload length. Default value > 'RTE_ETHER_MTU'. > > 'max_rx_pkt_len' (1): Only for Rx, maximum Rx frame length configured by > application. > > 'max_rx_pktlen' (2): Device reported value on what maximum Rx frame length it > can receive. Application shouldn't set Rx frame length more than this value. > > 'DEV_RX_OFFLOAD_JUMBO_FRAME' (3): Device Jumbo Frame capability. > When not enabled the Rx frame length is 'MTU' + overhead > When enabled Rx frame length is 'max_rx_pkt_len' > > 'DEV_RX_OFFLOAD_SCATTER' (5): Capability to scatter packet to multiple > descriptor by device and driver converting this to chained mbuf. > > 'dev->data->scattered_rx' (6): The current status of driver scattered Rx, in > device data mostly for PMD internal usage. > > 'rte_eth_dev_info->min_mtu' & 'rte_eth_dev_info->max_mtu' (7): minimum and > maximum MTU values device supported. > 'max_mtu' == 'max_rx_pkt_len' - L2_OVERHEAD. > > > I can see two different limits, > a) The Rx frame length limit that device can receive from wire. Any packet > larger than this size will be dropped by device in an early stage. > b) The Rx buffer length limit that received packets are written to. Device > shouldn't DMA larger than reserved buffer size. > > If device supports scattered Rx to multiple descriptors, it can be possible to > configure (a) > (b). > Otherwise configuration have to be (b) >= (a). > > For example if the mbuf size is 2Kb and the device can receive up to 9000 > bytes. > Options are: > - If device supports it, large packet will be scattered on multiple mbufs > - or need to configure device Rx frame length to 2K (mbuf size) > - or need to allocate mbuf big enough to get largest possible packet (9000) > > > > Issues I see: > ------------- > > i) Although the code clearly says 'max_rx_pkt_len' is only valid when jumbo > frames enabled, some drivers are taking it account always.
Ack, that's not good. > ii) Some drivers enable 'scattered_rx' & 'jumbo frame' implicitly, without > having 'DEV_RX_OFFLOAD_JUMBO_FRAME' or 'DEV_RX_OFFLOAD_SCATTER' requested by > application. Ack that it is a problem especially for scatter. > iii) Having both 'mtu' & 'max_rx_pkt_len' are confusing, although they are not > exactly same thing they are related. Difference is MTU applies for Tx too, and > L2 network layer overhead is not included. > 'MTU' can be more interested by upper layers, 'max_rx_pkt_len' is more driver > level information. And driver should know how to convert one to another. Agree > iv) 'max_rx_pkt_len' provided as part of 'rte_eth_dev_configure()' and there > is > no API to update it later. > 'mtu' is not part of 'rte_eth_dev_configure()', it can only be updated later > with specific API. > But driver have to keep these two values consistent. Agree > v) 'max_rx_pktlen' & 'max_mtu' reports from driver are redundant information. > Again they are not same thing, but correlated. Agree > > Suggested changes: > ----------------- > > Overall unify 'max_rx_pkt_len' & 'mtu' as much as possible, at first step: > > i) Make 'max_rx_pkt_len' independent from 'DEV_RX_OFFLOAD_JUMBO_FRAME', so > 'max_rx_pkt_len' value will be always valid, jumbo frame enabled or not. It will make handling of the max_rx_pkt_len mandatory in all network PMDs. > > ii) in '.dev_configure' convert 'max_rx_pkt_len' value to 'mtu' value, this > will > be only point 'max_rx_pkt_len' is used, after that point PMD will always use > 'mtu' value. I'm not sure that it is a right direction. Above you say that 'max_rx_pkt_len' is more driver level and I agree with it. I guess most drivers operate it finally (i.e. configure underlying HW in terms of max_rx_pkt_len, not MTU). So, converted from max_rx_pkt_len to MTU on ethdev level and covered back from MTU to max_rx_pkt_len in drivers. > Even don't reflect 'rte_eth_dev_set_mtu()' changes to 'max_rx_pkt_len' > anymore. Not sure that I get it. max_rx_pkt_len is used on dev_configure only. Is it reported on get somewhere? > > iii) Don't make 'max_rx_pkt_len' a mandatory config option, let it be '0' by > application, in that case 'rte_eth_dev_configure()' will set > "'max_rx_pkt_len' = RTE_ETHER_MAX_LEN" if 'DEV_RX_OFFLOAD_JUMBO_FRAME' > disabled > "'max_rx_pkt_len' = 9000 if 'DEV_RX_OFFLOAD_JUMBO_FRAME' enabled Why 9000? IMHO, if max_rx_pkt_len is 0, just use value derived from MTU. > > iv) Allow implicit update of 'DEV_RX_OFFLOAD_JUMBO_FRAME' on MTU set, since > setting a large MTU implies the jumbo frame request. And there is no harm to > application. Yes and I'd deprecate DEV_RX_OFFLOAD_JUMBO_FRAME. > > v) Do NOT allow implicit update of 'DEV_RX_OFFLOAD_SCATTER' on MTU set (when > Rx > frame length > Rx buffer length), since application may not be capable of > parsing chained mbufs. Instead fails the MTU set in that case. > [This can break some applications, relying on this implicit set.] Yes, definitely. > > > Any comments? > > > > Additional details: > ------------------- > > Behavior of some drivers: > > What igb & ixgbe does > - Set Rx frame limit (a) using 'max_rx_pkt_len' (1) > - Set Rx buffer limit (b) using mbuf data size > - Enable Scattered Rx (5 & 6) if the Rx frame limit (a) bigger than Rx buffer > limit (b) (even user not requested for it) > > What i40e does same as above, only differences > - Return error if jumbo frame enabled and 'max_rx_pkt_len' < RTE_ETHER_MAX_LEN > > sfc: > - Set Rx frame limit (a) > - using 'max_rx_pkt_len' (1) when jumbo frame enabled > - using 'mtu' when jumbo frame not enabled. > - Set Rx buffer limit (b) using mbuf data size > - If Rx frame limit (a) bigger than Rx buffer limit (b), and user not > requested > 'DEV_RX_OFFLOAD_SCATTER', return error. Ack > > octeontx2: > - Set Rx frame limit (a) using 'max_rx_pkt_len' (1). Implicitly enable jumbo > frame based on 'max_rx_pkt_len'. > - I can't able find how Rx buffer limit (b) set > - Enable Scattered Rx (5) if the Rx frame limit (a) bigger than Rx buffer > limit > (b) (even user not requested for it). 'dev->data->scattered_rx' not set at > all. > > >> Adding a PMD implementation and testpmd updates helps to clarify the >> intention/usage, so I suggest sending them as a single patch with this one. >> >> Updating all PMDs is a bigger ask and sometimes too hard because of lack of >> knowledge on the internals of other PMDs, although this is causing feature >> gaps >> time to time, we are not mandating this to developers, so please update as >> many >> PMD as you can, that you are confident, rest should be done by their >> maintainers. >> >>>> >>>>> Signed-off-by: Chengchang Tang <tangchengch...@huawei.com> >>>> >>>> Acked-by: Andrew Rybchenko <arybche...@solarflare.com> >>>> >>>>> --- >>>>> lib/librte_ethdev/rte_ethdev.h | 1 + >>>>> 1 file changed, 1 insertion(+) >>>>> >>>>> diff --git a/lib/librte_ethdev/rte_ethdev.h >>>>> b/lib/librte_ethdev/rte_ethdev.h >>>>> index 0f6d053..82b7e98 100644 >>>>> --- a/lib/librte_ethdev/rte_ethdev.h >>>>> +++ b/lib/librte_ethdev/rte_ethdev.h >>>>> @@ -1306,6 +1306,7 @@ struct rte_eth_rxq_info { >>>>> struct rte_eth_rxconf conf; /**< queue config parameters. */ >>>>> uint8_t scattered_rx; /**< scattered packets RX supported. */ >>>>> uint16_t nb_desc; /**< configured number of RXDs. */ >>>>> + uint16_t rx_bufsize; /**< size of RX buffer. */ >>>>> } __rte_cache_min_aligned; >>>>> >>>>> /**