Hi,

2015-09-24 22:10, Arnon Warshavsky:
> Moving from dpdk 1.5 to 2.0 we observed a PPS performance degradation of
> ~30%.
> After chasing this one for a while we found the problem:
>
> A) Between the 2 versions rte_mbuf was increased in size from 1 to 2 cache
> lines.
> B) The standard (non-vector) rx function does not perform a prefetch for
> the 2nd cache line of the mbuf (I see this bug exists in 2.1 as well) and
> it touches it setting the next pointer to NULL.
> I tested it in ixgbe, but it looks like it exists in all drivers in the
> *_rx_recv_pkts() and *_rx_recv_scattered_pkts() functions.
> Once added the prefetch for the 2nd line, we were back in our previous
> numbers.
>
> I believe this one slipped under the radar as the vector mode is now the
> default.
> We stumbled into it because we work in non-vector mode due to a different
> mempool bug in 2.0 which sometimes crashes the application upon port stop.
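
For readers following the thread, here is a rough sketch of the kind of fix
described in (B). It is an illustration only, not the ixgbe code: the
struct demo_mbuf layout, the 64-byte cache line, and the names demo_prefetch()
and demo_rx_burst() are all assumptions made up for this example. It relies on
the GCC/Clang builtin __builtin_prefetch(); in a DPDK driver the rte_prefetch0()
helper would presumably be used instead. Where exactly the prefetch was placed
in the reported fix is not stated, so prefetching one buffer ahead is just one
plausible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical two-cache-line buffer header; NOT the real struct rte_mbuf. */
struct demo_mbuf {
    /* Cache line 0: fields the RX path writes for every packet. */
    _Alignas(DEMO_CACHE_LINE) void *buf_addr;
    uint32_t pkt_len;
    uint16_t data_len;
    /* Cache line 1: the chaining pointer lives here. */
    _Alignas(DEMO_CACHE_LINE) struct demo_mbuf *next;
};

static_assert(offsetof(struct demo_mbuf, next) == DEMO_CACHE_LINE,
              "next must sit at the start of the second cache line");
static_assert(sizeof(struct demo_mbuf) == 2 * DEMO_CACHE_LINE,
              "header must span exactly two cache lines");

/* Hint that the line holding p will be written soon (rw = 1, high locality). */
static inline void demo_prefetch(const void *p)
{
    __builtin_prefetch(p, 1, 3);
}

/*
 * Simulated scalar RX loop: fills in n buffers and returns the number handled.
 * While buffer i is being processed, the second cache line of buffer i + 1 is
 * prefetched, so the later store to ->next is less likely to miss.
 */
static inline unsigned demo_rx_burst(struct demo_mbuf **ring, unsigned n,
                                     uint16_t len)
{
    for (unsigned i = 0; i < n; i++) {
        struct demo_mbuf *m = ring[i];

        if (i + 1 < n)
            demo_prefetch(&ring[i + 1]->next);

        m->pkt_len = len;  /* first cache line */
        m->data_len = len; /* first cache line */
        m->next = NULL;    /* second cache line: the store the report refers to */
    }
    return n;
}
```

The prefetch is only a hint, so the function's results are the same with or
without it; any gain would have to be measured on the target hardware.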

Big thanks for this double bug report!

> I have 2 questions
> 1)
> Could anyone tell if the regression tests are comparing performance while
> building DPDK with the default set of flags alone, or are multiple options
> examined?

There is no official regression test of performance.
Though Intel is probably monitoring it for their hardware.
By the way, it would be a good improvement to have such a standard benchmark
in DTS or elsewhere.

> 2)
> How are issues like that being tracked and later associated to a patch?

In general, it is followed by discussion and a patch on this mailing list.
The patch must track the fixed issue in the release notes.
In order to give better exposure to current bugs, we could set up a
bug tracker. I think it's time to think about it seriously.
Let's discuss the possible solutions in another thread.

Thanks again to you and all the Qwilt team.

PS: it would be nice to hear about your DPDK deployment and results.