> -----Original Message----- > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of David Marchand > Sent: Friday, February 15, 2019 5:32 PM > To: Thomas Monjalon <tho...@monjalon.net> > Cc: Richardson, Bruce <bruce.richard...@intel.com>; dev@dpdk.org; Lu, Wenzhuo > <wenzhuo...@intel.com>; Wu, Jingjing > <jingjing...@intel.com>; Iremonger, Bernard <bernard.iremon...@intel.com>; > Maxime Coquelin <mcoqu...@redhat.com>; Yigit, Ferruh > <ferruh.yi...@intel.com>; Andrew Rybchenko <arybche...@solarflare.com>; > Wiles, Keith <keith.wi...@intel.com> > Subject: Re: [dpdk-dev] [PATCH 3/5] app/testpmd: add missing transmit errors > stats > > On Fri, Feb 15, 2019 at 5:19 PM Thomas Monjalon <tho...@monjalon.net> wrote: > > > 15/02/2019 16:04, David Marchand: > > > On Fri, Feb 15, 2019 at 3:05 PM Bruce Richardson < > > bruce.richard...@intel.com> > > > wrote: > > > > > > > On Fri, Feb 15, 2019 at 10:33:47AM +0100, David Marchand wrote: > > > > > On Fri, Feb 15, 2019 at 9:58 AM Thomas Monjalon > > > > > <[1]tho...@monjalon.net> wrote: > > > > > > > > > > 14/02/2019 19:51, David Marchand: > > > > > > What is the purpose of oerrors ? > > > > > > > > > > > > Since the drivers (via rte_eth_tx_burst return value) report > > the > > > > > numbers of > > > > > > packets successfully transmitted, the application can try to > > > > > retransmit the > > > > > > packets that did not make it and counts this. > > > > > > If the driver counts such "missed" packets, then it does the > > job > > > > > the > > > > > > application will do anyway (wasting some cycles). > > > > > > But what is more a problem is that the application does not > > know > > > > > if the > > > > > > packets in oerrors are its own retries or problems that the > > driver > > > > > can not > > > > > > detect (hw problems) but the hw can. > > > > > > > > > > > > So the best option is that oerrors does not report the > > packets the > > > > > driver > > > > > > refuses (and I can see some drivers that do not comply to > > this) > > > > > but only > > > > > > "external" errors from the driver pov. > > > > > I can see the benefit of having driver errors in the stats, > > > > > so it is generically stored for later analysis or print. > > > > > It could be managed at ethdev level instead of the application > > > > > doing the computation. > > > > > What about splitting the Tx errors in 2 fields? oerrors / ofull > > ? > > > > > Who said it's awful? sorry Bruce for anticipating ;) > > > > > > > > > > Summary, correct me if we are not aligned :-) > > > > > - ofull (maybe ofifoerrors?) is actually a count of SW failed > > > > transmits > > > > > - it would be handled in rte_eth_tx_burst() itself in a generic > > way > > > > > - the drivers do not need to track such SW failed transmits > > > > > - oerrors only counts packets HW failed transmits, dropped out of > > the > > > > > driver tx_pkt_burst() knowledge > > > > > - the application does not have to track SW failed transmits > > since the > > > > > stats is in ethdev > > > > > It sounds good to me, this means an ethdev abi breakage. > > > > > > > > Hang on, why do we need ethdev to track this at all, given that it's > > > > trivial for apps to track this themselves. Would we not be better just > > to > > > > add this tracking into testpmd and leave ethdev and drivers alone? > > Perhaps > > > > I'm missing something? > > > > > > > > > > This was my first intention but Thomas hopped in ;-) > > > > I was just opening the discussion :) > > > > > testpmd does it already via the fs->fwd_dropped stats and ovs has its > > > tx_dropped stat. > > > > > > The problem is that all drivers have different approach about this. > > > Some drivers only count real hw errors in oerrors. > > > But others count the packets it can't send in oerrors (+ there are some > > > cases that seem buggy to me where the driver will always refuse the mbufs > > > for reason X and the application can retry indefinitely to send...). > > > > We have 3 options: > > 1/ status quo = oerrors is inconsistent across drivers > > 2/ API break = oerrors stop being incremented for temporary > > unavailability (i.e. queue full, kind of ERETRY), > > report only packets which will be never sent, > > may be a small performance gain for some drivers > > 3/ API + ABI break = same as 2/ + > > report ERETRY errors in ofull (same as tx_burst() delta) > > > > Note that the option 2 is a light API break which does not require > > any deprecation notice because the original definition of oerrors > > is really vague: "failed transmitted packets" > > By changing the definition of errors to "packets lost", we can count > > HW errors + packets not matching requirements. > > As David suggests, the packets not matching requirements can be freed > > as it is done for packets successfully transmitted to the HW. > > We need also to update the definition of the return value of > > rte_eth_tx_burst(): "packets actually stored in transmit descriptors". > > We should also count the bad packets rejected by the driver. > > Then the number of bad packets would be the difference between > > the return value of rte_eth_tx_burst() and opackets counter. > > This solution is fixing some bugs and enforce a consistent behaviour. > > > > I am also for option 2 especially because of this. > A driver that refuses a packet for reason X (which is a limitation, or an > incorrect config or whatever that is not a transient condition) but gives > it back to the application is a bad driver.
Why? What.s wrong to leave it to the upper layer to decide what to do with the packets that can't be sent (by one reason or another)? > > > > The option 3 is breaking the ABI and may degrade the performances. > > The only benefit is convenience or semantic: ofull would be the > > equivalent of imissed. > > The application can count the same by making the difference between > > the burst size and the return of rte_eth_tx_burst. > > > > My vote is for the option 2. > > > > > > -- > David Marchand