> >>>>>>> TSO breaks when MSS spans more than 8 data fragments. Those > >>>>>>> packets will be dropped by Tx preparation API, but it will > >>> cause > >>>>>>> MDD event if txonly forwarding engine does not call the Tx > >>>>> preparation > >>>>>>> API before transmitting packets. > >>>>>>> > >>>>>> > >>>>>> txonly is used commonly, adding Tx prepare for a specific case > >>> may > >>>>>> impact performance for users. > >>>>>> > >>>>>> What happens when driver throws MDD (Malicious Driver Detection) > >>>>> event, > >>>>>> can't it be ignored? As you are already OK to drop the packet, > >>> can > >>>>>> device be configured to drop these packages? > >>>>>> > >>>>>> > >>>>>> Or as Jerin suggested adding a new forwarding engine is a > >>> solution, > >>>>> but > >>>>>> that will create code duplication, I prefer to not have it if > >>> this > >>>>> can > >>>>>> be handled in device level. > >>>>> > >>>>> Actually I am agree with the author of the patch - when TX offloads > >>>>> and/or multisegs are enabled, > >>>>> user supposed to invoke eth_tx_prepare(). > >>>>> Not doing that seems like a bug to me. > >>>> > >>>> I strongly disagree with that statement, Konstantin! > >>>> It is not documented anywhere that using TX offloads and/or multisegs > >>> requires calling rte_eth_tx_prepare() before > >>>> rte_eth_tx_burst(). And none of the examples do it. > >>> > >>> In fact, we do use it for test-pmd/csumonly.c. > >>> About other sample apps: > >>> AFAIK, not many of other DPDK apps do use L4 offloads. > >>> Right now special treatment (pseudo-header cksum calculation) is needed > >>> only for L4 offloads (CKSUM, SEG). > >>> So, majority of our apps who rely on other TX offloads (multi-seg, ipv4 > >>> cksum, vlan insertion) happily run without > >>> calling tx_prepare(), even though it is not the safest way. > >>> > >>>> > >>>> In my opinion: > >>>> If some driver has limitations for a feature, e.g. max 8 fragments, > >>> it should be documented for that driver, so the application > >>>> developer can make the appropriate decisions when designing the > >>> application. > >>>> Furthermore, we have APIs for the drivers to expose to the > >>> applications what the driver supports, so the application can configure > >>>> itself optimally at startup. Perhaps those APIs need to be expanded. > >>>> And if a feature limitation is common across the majority of drivers, > >>> that limitation should be mentioned in the documentation of the > >>>> feature itself. > >>> > >>> Many of such limitations *are* documented and in fact we do have an API > >>> to check max segments that each driver support, > >>> see struct rte_eth_desc_lim. > >> > >> Yes, this is the kind of API we should provide, so the application can > >> configure itself appropriately. > >> > >>> The problem is: > >>> - none of our sample app does proper check on these values, so users > >>> don't have a good example how to do it. > >> > >> Agreed. > >> Adding an example showing how to do it properly would be the best solution. > >> Calling tx_prepare() in the examples is certainly not the solution. > >> > >>> - with current DPDK API not all of HW/PMD requirements could be > >>> extracted programmatically: > >>> let say majority of Intel PMDs for TCP offloads expect pseudo-header > >>> cksum to be pre-calculated by the SW. > >> > >> I hope this requirement is documented somewhere. > >> > >>> another example, some HW expects pkt_len to be bigger then some > >>> threshold value, otherwise HW hang may appear. > >> > >> I hope this requirement is also documented somewhere. > > > > No idea, I found it only in the code. > > IMHO Tx burst must check such limitations. If you made your HW simpler > (or just lost it on initial testing), pay in your drivers (or your > HW+driver will be unusable because of such problems). > > >> Generally, if the requirements cannot be extracted programmatically, they > >> must be prominently documented, like this note to > >> rte_eth_rx_burst(): > > > > Obviously, more detailed documentation is always good, but... > > Right now we have 50+ different PMDs from different vendors. > > Even if each and every of them will carefully document all possible > > limitations and necessary preparation steps, > > how DPDK app developer supposed to deal with all that? > > Do you expect everyone, to read carefully through all of them, and handle > > all of them properly oh his own > > in each and every DPDK app he is going to write? > > That seems unrealistic. > > Again what to do with backward compatibility: when new driver (with new > > limitations) will arise > > *after* your app is already written and tested? > > +1 > > >> > >> * @note > >> * Some drivers using vector instructions require that *nb_pkts* is > >> * divisible by 4 or 8, depending on the driver implementation. > > I'm wondering what application should do if it needs to send just one > packet and do it now. IMHO, such limitations are not acceptable. > > >> > >>> - As new HW and PMD keep appearing it is hard to predict what extra > >>> limitations/requirements will arise, > >>> that's why tx_prepare() was introduced as s driver op. > >>> > >>>> > >>>> We don't want to check in the fast path what can be checked at > >>> startup or build time! > >>> > >>> If your app supposed to work with just a few, known in advance, NIC > >>> models, then sure, you can do that. > >>> For apps that supposed to work 'in general' with any possible PMDs > >>> that DPDK supports - that might be a problem. > >>> That's why tx_prepare() was introduced and it is strongly recommended > >>> to use it by the apps that do use TX offloads. > >>> Probably tx_prepare() is not the best possible approach, but right now > >>> there are not many alternatives within DPDK. > >> > >> What exactly is an application supposed to do if tx_prepare() doesn't > >> accept the full burst? It doesn't return information about > what is > >> wrong. > > > > It provides some information, but it is *very* limited: just index of 'bad' > > packet and error code. > > In theory app can try to handle it in a 'smart' way: let say if ENOTSUP is > > returned, then try to disable all HW offloads > > and do all in SW. But again, it is much better to do so *before* submitting > > packets for TX, so in practice everyone > > just drop such 'bad' packets. > > > > Dropping the packets might not be an option, e.g. for applications used > > in life support or tele-medicine. > > Critical applications should be able to do all Tx offloads in SW and > retry. Of course, various statistics and logs should help to improve the > application. > > > If the packet is 'bad', then it is much better to drop it, then TX > > corrupted packet or even hang NIC HW completely. > > IMHO Tx burst must drop packet which could hang NIC HW completely. I > realize that it is an extra checks and performance drop, but vendor > should pay in performance if HW is not good enough. > > > Though off-course it is much better to have an app that would check for > > limitations that can be checked by API provided > > and enable only supported offloads. > > Yes, that's why API to get limitation is much better than documentation. > > >> If limitations are documented, an application can use the lowest common > >> denominator of the NICs it supports. And if the > application is > >> supposed to work in general, that becomes the lowest common denominator of > >> all NICs. > > > > I agree: for limitations that can be extracted with generic API, like: > > number of segments per packet, supported TX offloads, mbuf fileds that must > > be provided for each TX offload, etc. - > > it is responsibility of well-written application to obey all of them. > > Yes, many tx_prepare() implementations do such checks anyway, but from my > > perspective it is sort of last-line of defense. > > For well written application that should just never happen. > > But there is one more important responsibility of tx_prepare() - it > > performs PMD specific packet modifications for requested > offloads. > > As I already mentioned for Intel NICs - it does pseudo-header cksum > > calcucation, for packets with size less then minimal, it can > > probably do padding (even if doesn't do it right now), for some other PMDs > > - might be something else, I didn't check. > > Obviously it saves app developer from a burden to do all these things on > > his own. > > > >> It looks like tx_prepare() has become a horrible workaround for > >> undocumented limitations. > > I strongly disagree. Documentation is never a solution for a generic > application which is intended to work on any HW and IMHO it is the > goal to have more and more applications which work on any HW. > > >> Limitations due to hardware and/or software tradeoffs are unavoidable, so > >> we have to live with them; but we should not accept > >> PMDs with undocumented limitations. > > > > As I already said, more documentation never hurts, but for that case, I > > think it is not enough. > > I expect PMD to provide a tx_prepare() implementation that would deal with > > such specific things. > > +1 > > Anyway, back to the original patch - I looked at it once again, and > > realized that the problem > > is just in the unsupported number of segments. > > As we discussed above - such limitations should be handled by well written > > app, > > but none of DPDK apps does it right now. > > So probably it is good opportunity to do things in a proper way and > > introduce such checks > > in testpmd ;) > > +1 since we already have fields in device information to report such > limitations, but it does not say that Tx prepare should be dropped. > Drivers which don't need Tx prepare keep it NULL and it returns > immediately from ethdev. Since it is done per-burst, it should not > affect performance a lot. >
100% agree. I think tx_prepare needs to stay, and in general has to be strongly recommended for apps that do use TX offloads/multi-segs.