On Sun, Sep 2, 2018 at 7:34 AM Matan Azrad <ma...@mellanox.com> wrote:
>
> Hi Luca\Chas
>
> From: Luca Boccassi
> > On Wed, 2018-08-29 at 15:20 +0000, Matan Azrad wrote:
> > >
> > > From: Chas Williams
> > > > On Tue, Aug 28, 2018 at 5:51 AM Matan Azrad <mailto:matan@mellanox.
> > > > com> wrote:
> > > >
> > > >
> > > > From: Chas Williams
> > > > > On Mon, Aug 27, 2018 at 11:30 AM Matan Azrad <mailto:mailto:matan
> > > > > @mellanox.com> wrote:
> > > >
> > > > <snip>
> > > > > > > Because rings are generally quite efficient.
> > > > > >
> > > > > > But you are using a ring in addition to regular array
> > > > > > management, it must hurt performance of the bonding PMD (means
> > > > > > the bonding itself - not the slaves PMDs which are called from
> > > > > > the bonding)
> > > > >
> > > > > It adds latency.
> > > >
> > > > And by that hurts the application performance because it takes more
> > > > CPU time in the bonding PMD.
> > > >
> > > > No, as I said before it takes _less_ CPU time in the bonding PMD
> > > > because we use a more optimal read from the slaves.
> > >
> > > Each packet pointer should be copied more 2 times because of this
> > > patch + some management(the ring overhead) So in the bonding code you
> > > lose performance.
> > >
> > > >
> > > > > It increases performance because we spend less CPU time reading
> > > > > from the PMDs.
> > > >
> > > > So, it's hack in the bonding PMD to improve some slaves code
> > > > performance but hurt the bonding code performance, Over all the
> > > > performance we gain for those slaves improves the application
> > > > performance only when working with those slaves.
> > > > But may hurt the application performance when working with other
> > > > slaves.
> > > >
> > > > What is your evidence that is hurts bonding performance? Your
> > > > argument is purely theoretical.
> > >
> > > Yes, we cannot test all the scenarios cross the PMDs.
> >
> > Chas has evidence that this helps, a _lot_, in some very common cases.
> > We haven't seen evidence of negative impact anywhere in 2 years. Given
> > this, surely it's not unreasonable to ask to substantiate theoretical
> > arguments
> > with some testing?
>
> What is the common cases of the bond usage?
> Do you really know all the variance of the bond usages spreading all over the
> world?

We actually have a fairly large number of deployments using this bonding
code across a couple of different adapter types (mostly Intel, though,
plus some virtual usage). The patch was designed to address starvation
of slaves caused by the way vector receives tend to work on the Intel
PMDs. If there isn't enough space to attempt a vector receive (a minimum
of 4 buffers), then the rx burst will return a value of 0 -- no buffers
read. The rx burst in bonding then moves on to the next adapter. So this
tends to starve any slaves that aren't first in line. The issue doesn't
really show up with single streams. You need to run multiple streams
that multiplex across all the slaves.

>
> I’m saying that using a hack in the bond code which helps for some slaves
> PMDs\application scenarios (your common cases) but hurting
> the bond code performance and latency is not the right thing to do because it
> may hurt other scenarios\PMDs using the bond.

What do you think about the attached patch? It implements an explicit
round-robin for the "first" slave in order to enforce some sort of
fairness. Some limited testing has shown that with this change our
application can scale polling to read the PMDs fast enough. Note that I
have only tested the 802.3ad paths. The same changes are likely
necessary for the other RX burst routines, since they should suffer from
the same issue.
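
To make the starvation and the round-robin idea concrete, here is a toy
model in plain C. It is not the attached patch and it does not use any
DPDK API: mock_rx_burst(), bond_rx_burst(), simulate(), and the BURST
and PENDING values are all hypothetical, chosen only for illustration.
The one thing taken from the description above is the rule that a slave
returns 0 when fewer than 4 buffers of space are left.

```c
#define NUM_SLAVES 2
#define VEC_MIN    4   /* minimum space for a vector receive (from the text) */
#define BURST      32  /* assumed burst size requested by the application */
#define PENDING    30  /* assumed packets waiting on each slave per poll */

/* Hypothetical stand-in for a slave's rx burst; every slave is equally busy. */
static unsigned mock_rx_burst(unsigned slave, unsigned space)
{
    (void)slave;
    if (space < VEC_MIN)
        return 0;                       /* too little room: no buffers read */
    return space < PENDING ? space : PENDING;
}

/* Poll the slaves in order, beginning at `start`, until the burst is full. */
static void bond_rx_burst(unsigned start, unsigned long counts[NUM_SLAVES])
{
    unsigned remaining = BURST;

    for (unsigned i = 0; i < NUM_SLAVES && remaining > 0; i++) {
        unsigned s = (start + i) % NUM_SLAVES;
        unsigned n = mock_rx_burst(s, remaining);

        counts[s] += n;
        remaining -= n;
    }
}

/* Run `polls` bursts; if round_robin is nonzero, rotate the first slave. */
static void simulate(int round_robin, unsigned polls,
                     unsigned long counts[NUM_SLAVES])
{
    for (unsigned p = 0; p < polls; p++)
        bond_rx_burst(round_robin ? p % NUM_SLAVES : 0, counts);
}
```

In this model, always starting at slave 0 leaves only 2 slots after the
first slave's 30 packets, so the second slave returns 0 on every poll.
Rotating the starting slave lets each slave take a turn at the full
burst, so the packet counts even out over time. Real traffic will not be
this regular, so treat it as an illustration of the mechanism only.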