Hi Chas

From: Chas Williams
> On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad <ma...@mellanox.com> wrote:
> Hi Chas
>
> From: Chas Williams
>> On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad <ma...@mellanox.com> wrote:
>> Hi
>>
>> From: Chas Williams
>> > This will need to be implemented for some of the other RX burst
>> > methods at some point for other modes to see this performance
>> > improvement (with the exception of active-backup).
>>
>> Yes, I think it should be done at least for
>> bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for now.
>>
>> There is some duplicated code between the various RX paths.
>> I would like to eliminate that as much as possible, so I was going
>> to give that some thought first.
>
> There is no reason to leave this function as is while its twin is changed.
>
> Unfortunately, this is all the patch I have at this time.
>
>> > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi <bl...@debian.org> wrote:
>> >
>> > > During bond 802.3ad receive, a burst of packets is fetched from each
>> > > slave into a local array and appended to a per-slave ring buffer.
>> > > Packets are taken from the head of the ring buffer and returned to
>> > > the caller. The number of mbufs provided to each slave is
>> > > sufficient to meet the requirements of the ixgbe vector receive.
>>
>> Luca,
>>
>> Can you explain these requirements of ixgbe?
>>
>> The ixgbe (and some other Intel PMDs) have vectorized RX routines that
>> are more efficient (if not faster), taking advantage of some advanced
>> CPU instructions. I think you need to be receiving at least 32 packets.
>
> So why do it in bonding, which is a generic driver for all vendors' PMDs?
> If it is better for ixgbe and other Intel NICs, you can force those PMDs
> to always receive 32 packets and to manage a ring by themselves.
>
> The drawback of the ring is some additional latency on the receive path.
> In testing, the additional latency hasn't been an issue for bonding.
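Just so we are looking at the same thing: as I understand the patch
description, every receive on the bonding port now goes through an extra
enqueue/dequeue on the intermediate ring, roughly like the sketch below.
This is only my simplified illustration of the flow, not the patch code;
bonding_rx_with_ring(), struct slave_rx_ctx and SLAVE_BURST are made-up
names, only the rte_ethdev/rte_ring calls are real.

/* Simplified sketch of the per-slave intermediate ring RX path as I read
 * the patch description.  Names are illustrative, not bonding internals. */
#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

#define SLAVE_BURST 32  /* enough to let ixgbe take its vector RX path */

struct slave_rx_ctx {
    uint16_t port_id;
    uint16_t queue_id;
    struct rte_ring *ring;  /* per-slave intermediate ring */
};

static uint16_t
bonding_rx_with_ring(struct slave_rx_ctx *slaves, uint16_t n_slaves,
                     struct rte_mbuf **bufs, uint16_t nb_pkts)
{
    struct rte_mbuf *local[SLAVE_BURST];
    uint16_t total = 0;

    for (uint16_t i = 0; i < n_slaves && total < nb_pkts; i++) {
        struct slave_rx_ctx *s = &slaves[i];

        /* Always ask the slave for a full vector-sized burst ... */
        if (rte_ring_free_count(s->ring) >= SLAVE_BURST) {
            uint16_t got = rte_eth_rx_burst(s->port_id, s->queue_id,
                                            local, SLAVE_BURST);
            /* ... and park the result in the per-slave ring. */
            if (got > 0)
                rte_ring_enqueue_burst(s->ring, (void **)local, got, NULL);
        }

        /* The caller is then served from the head of the ring. */
        total += rte_ring_dequeue_burst(s->ring, (void **)&bufs[total],
                                        nb_pkts - total, NULL);
    }
    return total;
}

So every mbuf pointer takes an extra trip into and out of a ring, on top of
the normal bonding bookkeeping.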
When bonding does its processing more slowly, it may become a bottleneck in
the packet processing of some applications.

> The bonding PMD has a fair bit of overhead associated with the RX and TX
> path calculations. Most applications can just arrange to call the RX path
> with a sufficiently large receive. Bonding can't do this.

I wasn't talking about the application, I was talking about the slave PMDs.
A slave PMD can manage a ring by itself if it helps its own performance.
Bonding should not be oriented to specific PMDs.

>> Did you check for other vendor PMDs? It may hurt performance there.
>>
>> I don't know, but I suspect probably not. For the most part you are
>> typically reading almost up to the vector requirement. But if one slave
>> has just a single packet, then you can't vectorize on the next slave.
>
> I don't think the ring overhead is any better for PMDs which do not use
> the vectorized instructions.
>
> The non-vectorized PMDs are usually quite slow. The additional overhead
> doesn't make a difference in their performance.

We should not make things worse than they are.
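To be concrete about "manage a ring by itself": a vendor PMD that needs a
minimum burst for its vector path could hide that behind its own RX burst
function, something like the very rough sketch below. All names here
(vendor_rx_burst(), vendor_rxq, vendor_recv_vec(), VEC_BURST) are invented
for illustration; this is not ixgbe code, and the hardware receive is
stubbed out.

/* Very rough sketch: the vendor PMD hides its minimum vector burst inside
 * its own RX burst function, so generic code such as bonding never has to
 * know about it.  vendor_recv_vec() stands in for the PMD's real
 * vectorized receive from hardware. */
#include <stdint.h>
#include <string.h>
#include <rte_common.h>
#include <rte_mbuf.h>

#define VEC_BURST 32

struct vendor_rxq {
    struct rte_mbuf *stash[VEC_BURST];  /* packets already pulled from HW */
    uint16_t stash_len;
    /* ... the rest of the real queue state ... */
};

static uint16_t
vendor_recv_vec(struct vendor_rxq *q, struct rte_mbuf **bufs, uint16_t n)
{
    /* Placeholder for the PMD's vectorized HW receive. */
    RTE_SET_USED(q);
    RTE_SET_USED(bufs);
    RTE_SET_USED(n);
    return 0;
}

static uint16_t
vendor_rx_burst(void *rx_queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
{
    struct vendor_rxq *q = rx_queue;

    /* Refill the stash with a full vector-sized burst when it runs dry,
     * even if the caller asked for fewer than VEC_BURST packets. */
    if (q->stash_len == 0)
        q->stash_len = vendor_recv_vec(q, q->stash, VEC_BURST);

    uint16_t n = RTE_MIN(nb_pkts, q->stash_len);
    memcpy(bufs, q->stash, n * sizeof(*bufs));

    /* Shift the leftovers down; a real PMD would keep head/tail indexes
     * instead of moving pointers around. */
    q->stash_len -= n;
    memmove(q->stash, &q->stash[n], q->stash_len * sizeof(*q->stash));

    return n;
}

That way only the PMDs which actually benefit from the 32-packet minimum pay
for the extra buffering, and the bonding code stays generic.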