Hi, Few comments below.
On Mon, Dec 16, 2019 at 11:39:05AM +0300, Andrew Rybchenko wrote: > On 12/10/19 9:07 PM, Ferruh Yigit wrote: > > On 1/23/2018 2:34 PM, Shahaf Shuler wrote: > >> Tuesday, January 23, 2018 3:53 PM, Olivier Matz: > > > > <...> > > > >>> > >>> 2/ meaning of rxmode.jumbo_frame, rxmode.enable_scatter, > >>> rxmode.max_rx_pkt_len > >>> > >>> While it's not related to the new API, it is probably a good opportunity > >>> to clarify the meaning of these flags. I'm not able to find a good > >>> documentation about them. > >>> > >>> Here is my understanding, the configuration only depends on: - the maximum > >>> rx frame length - the amount of data available in a mbuf (minus headroom) > >>> > >>> Flags to set in rxmode (example): > >>> +---------------+----------------+----------------+-----------------+ | > >>> |mbuf_data_len=1K|mbuf_data_len=2K|mbuf_data_len=16K| > >>> +---------------+----------------+----------------+-----------------+ > >>> |max_rx_len=1500|enable_scatter | | | > >>> +---------------+----------------+----------------+-----------------+ > >>> |max_rx_len=9000|enable_scatter, |enable_scatter, |jumbo_frame | | > >>> |jumbo_frame |jumbo_frame | | > >>> +---------------+----------------+----------------+-----------------+ Due to successive quotes, the table was not readable in my mail client, here it is again (narrower): +------------+---------------+---------------+---------------+ | |mbuf_data_len= |mbuf_data_len= |mbuf_data_len= | | |1K |2K |16K | +------------+---------------+---------------+---------------+ |max_rx_len= |enable_scatter | | | |1500 | | | | +------------+---------------+---------------+---------------+ |max_rx_len= |enable_scatter,|enable_scatter,|jumbo_frame | |9000 |jumbo_frame |jumbo_frame | | +------------+---------------+---------------+---------------+ > >>> If this table is correct, the flag jumbo_frame would be equivalent to > >>> check > >>> if max_rx_pkt_len is above a threshold. > >>> > >>> And enable_scatter could be deduced from the mbuf size of the given rxq > >>> (which is a bit harder but maybe doable). > >> > >> I glad you raised this subject. We had a lot of discussion on it internally > >> in Mellanox. > >> > >> I fully agree. All application needs is to specify the maximum packet size > >> it > >> wants to receive. > >> > >> I think also the lack of documentation is causing PMDs to use those flags > >> wrongly. For example - some PMDs set the jumbo_frame flag internally > >> without > >> it being set by the application. > >> > >> I would like to add one more item : MTU. What is the relation (if any) > >> between setting MTU and the max_rx_len ? I know MTU stands for Max Transmit > >> Unit, however at least in Linux it is the same for the Send and the > >> receive. > >> > >> > > > > (Resurrecting the thread after two years, I will reply again with latest > > understanding.) > > > > Thanks Olivier for above summary and table, and unfortunately usage still > > not > > consistent between PMDs. According my understanding: > > > > 'max_rx_pkt_len' is user configuration value, to limit the size packet that > > is > > shared with host, but this doesn't limit the size of packet that NIC > > receives. When you say the size of packet shared with the host, do you mean for instance that the NIC will receive a 1500B packet and will only write 128 bytes of data in the mbuf? If yes, this was not my understanding. I suppose it could be used for monitoring. What should be the value for rx offload infos like checksum or packet type if the packet (or the header) is truncated? > Also comment in lib/librte_ethdev/rte_ethdev.h says that the > rxmode field is used if (and I think only if) JUMBO_FRAME is > enabled. So, if user wants to set it on device configure stage, > device *must* support JUMBO_FRAME offload which mean that > driver code handles rxmode.max_rx_pkt_len and either accept it > and configures HW appropriately or return an error if specified > value is wrong. Basically it is written in jumbo frame feature > definition in features.rst. User has max_rx_pktlen in dev_info > to find out maximum supported value for max_rx_pkt_len. > > > Like if the mbuf size of the mempool used by a queue is 1024 bytes, we don't > > want packets bigger than buffer size, but if NIC supports it is possible > > receive > > 6000 bytes packet and split data into multiple buffers, and we can use multi > > segment packets to represent it. > > So what we need is NIC ability to limit the size of data to share to host > > and > > scattered Rx support (device + driver). > > It requires RX_SCATTER offload enabled and it must be > controlled by the user only (not PMD) since it basically > mean if the application is ready to handle multi-segment > packets (have code which takes a look at the number of > segments and next pointers etc). Moreover, application > may disable MULTI_SEG Tx offload (and drivers may ignore > number of segments and next pointer as well). Agree, I think it is important that the application can control the enabling of rx scatter, either by a flag, or simply by passing max_rx_len <= mbuf_data_len. > > But MTU limits the size of the packet that NIC receives. > > Personally I've never treated it this way. For me the only > difference between max_rx_pkt_len and MTU is: > - max_rx_pkt_len is entire packet with all L2 headers and > even FCS (basically everything which could be provided > to application in mbuf) > - MTU does not cover L2 (and VLANs, I'm not sure about MPLS) > > > Assuming above are correct J, > > > > Using mbuf data size as 'max_rx_pkt_len' without asking from user is an > > option, > > but perhaps user has different reason to limit packet size, so I think > > better to > > keep as different config option. > > > > I think PMD itself enabling "jumbo frame" offload is not too bad, and > > acceptable, since providing a large MTU already implies it. > > Yes +1 > > But not sure about PMD enabling scattered Rx, application may want to force > > to > > receive single segment mbufs, for that case PMD enabling this config on its > > own > > looks like a problem. > > Yes +1 > > But user really needs this when a packet doesn't fit to the mbuf, so > > providing a > > MTU larger than 'max_rx_pkt_len' _may_ imply enabling scattered Rx, I assume > > this is the logic in some PMDs which looks acceptable. > > I don't think so. IMO auto enabling Rx scatter from PMD is a > breakage of a contract between application and driver. > As stated above the application may be simply not ready to > handle multi-segment packets correctly. > > I think that providing an MTU larger than 'max_rx_pkt_len' is simply a > change of max_rx_pkt_len = (MTU plus space for L2+). As VLAN(s) are not taken in account in MTU, it means that if MTU is 1500, max Ethernet len is 1500 + 14 (eth hdr) + 4 (vlan) + 4 (2nd vlan / qinq) + 4 (crc) = 1522. Shouldn't we only use a L2 lengths instead of MTU? I don't know what is usually expected by different hardware (mtu or l2 len). > > And PMD behavior should be according for mentioned configs: > > > > 1) Don't change user provided 'max_rx_pkt_len' value > I have no strong opinion. However, it is important to clarify > which understanding of max_rx_pkt_len vs MTU is the right one. > > > 2) If jumbo frame is not enabled, don't limit the size of packets to the > > host (I > > think this is based on assumption that mbuf size always will be > 1514) > > I think that JUMBO_FRAME is not relevant here. It is just a > promise to take a look at max_rx_pkt_len on configure or > start stage. > > > 3) When user request to set the MTU bigger than ETH_MAX, PMD enable jumbo > > frame > > support (if it is not enabled by user already and supported by HW). If HW > > doesn't support if of course it should fail. > > I'm not sure which ETH_MAX is mentioned above. > #define ETH_MAX_MTU 0xFFFFU /* 65535, same as IP_MAX_MTU */ > or do you mean > #define ETH_FRAME_LEN 1514 /* Max. octets in frame sans FCS */ > or even > #define ETH_DATA_LEN 1500 /* Max. octets in payload */ > > We should be careful when we talk about Ethernet lengths and > MTU. > > > 4) When user request to set MTU bigger than 'max_rx_pkt_len' > > I think the second parameter to consider here is not > max_rx_pkt_len, but amount of space for data in single > mbuf (for all Rx queues). > > > 4a) if "scattered Rx" is enabled, configure the MTU and limit packet size to > > host to 'max_rx_pkt_len' > > Yes and no. IMO configure the MTU and bump max_rx_pkt_len. > > > 4b) if "scattered Rx" is not enabled but HW supports it, enable "scattered > > Rx" > > by PMD, configure the MTU and limit packet size to host to 'max_rx_pkt_len' > > No, I think it is wrong to enable Rx scatter from PMD. > > > 4c) if "scattered Rx" is not enabled and not supported by HW, fail MTU set. > > Yes, regardless support in HW. > > > 4d) if HW doesn't support to limit the packet size to host, but requested > > MTU > > bigger than 'max_rx_pkt_len' it should fail. > > I would rephrase it as impossibility to disable Rx scatter. > If so, it must be driver responsibility to drop scattered > packets if Rx scatter offload is not enabled. > > > Btw, I am aware of that some PMDs have a larger MTU by default and can't > > limit > > the packet size to host to 'max_rx_pkt_len' value, I don't know what to do > > in > > that case, fail in configure? Or at least be sure configured mempool's mbuf > > size > > is big enough? > > See above. > > Thanks for reminder about the topic. I have the impression that what we want can be done with these 3 values: - max_rx_pkt_size: maximum size of received packet, larger ones are dropped - max_rx_data_size: maximum size of data copied in a mbuf chain, larger packets are truncated - max_rx_seg_size: maximum size written in a segment (this can be retrieved from pool private info = rte_pktmbuf_data_room_size() - RTE_PKTMBUF_HEADROOM) I think the first 2 values should be L2 lengths, including CRC if CRC-receive is enabled. if max_rx_data_size <= max_rx_seg_size, scatter is disabled if max_rx_data_size < max_rx_pkt_size, packets can be truncated In case a PMD is not able to limit a packet size, it can be advertised by a capability, and it would be up to the application to do it by sw. Olivier