On Wed, 2017-02-22 at 18:06 -0800, Eric Dumazet wrote:
> On Wed, 2017-02-22 at 17:08 -0800, Alexander Duyck wrote:
>
> > Right but you were talking about using both halves one after the
> > other. If that occurs you have nothing left that you can reuse. That
> > was what I was getting at. If you use up both halves you end up
> > having to unmap the page.
>
> You must have misunderstood me.
>
> Once we use both halves of a page, we _keep_ the page, we do not unmap
> it.
>
> We save the page pointer in a ring buffer of pages.
> Call it the 'quarantine'
>
> When we _need_ to replenish the RX desc, we take a look at the oldest
> entry in the quarantine ring.
>
> If page count is 1 (or pagecnt_bias if needed) -> we immediately reuse
> this saved page.
>
> If not, _then_ we unmap and release the page.
>
> Note that we would have received 4096 frames before looking at the page
> count, so there is high chance both halves were consumed.
>
> To recap on x86 :
>
> 2048 active pages would be visible by the device, because 4096 RX desc
> would contain dma addresses pointing to the 4096 halves.
>
> And 2048 pages would be in the reserve.
>
> > The whole idea behind using only half the page per descriptor is to
> > allow us to loop through the ring before we end up reusing it again.
> > That buys us enough time that usually the stack has consumed the frame
> > before we need it again.
>
> The same will happen really.
>
> Best maybe is for me to send the patch ;)

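
For readers following the thread, the quarantine scheme quoted above can be sketched in plain user-space C. This is only an illustration of the idea, not the mlx4 patch: struct fake_page, struct quarantine, QUARANTINE_SIZE and both helper functions are hypothetical names, the reference count is a plain int rather than a real page count, and the pagecnt_bias variant is not modeled. With 4096 RX descriptors and two halves per page, the mail implies 2048 quarantine entries on x86; the sketch uses a tiny ring to keep it readable.

```c
#include <assert.h>
#include <stddef.h>

#define QUARANTINE_SIZE 4   /* hypothetical; the mail implies 2048 on x86 */

/* Stand-in for struct page: only the reference count matters here. */
struct fake_page {
    int refcount;           /* 1 means only the driver still holds the page */
};

/* FIFO of pages whose two halves have both been handed to the stack. */
struct quarantine {
    struct fake_page *slot[QUARANTINE_SIZE];
    size_t head;            /* index of the oldest entry */
    size_t count;           /* number of entries currently parked */
};

/* Park a fully used page. Returns 0 on success, -1 if the ring is full. */
static int quarantine_push(struct quarantine *q, struct fake_page *p)
{
    if (q->count == QUARANTINE_SIZE)
        return -1;
    q->slot[(q->head + q->count) % QUARANTINE_SIZE] = p;
    q->count++;
    return 0;
}

/*
 * Called when an RX descriptor must be replenished: pop the oldest entry.
 * If the driver is the only owner left, return the page for immediate reuse.
 * Otherwise store it in *released (the caller would unmap and release it)
 * and return NULL. An empty ring returns NULL with *released == NULL.
 */
static struct fake_page *quarantine_pop(struct quarantine *q,
                                        struct fake_page **released)
{
    struct fake_page *p;

    *released = NULL;
    if (q->count == 0)
        return NULL;

    p = q->slot[q->head];
    q->head = (q->head + 1) % QUARANTINE_SIZE;
    q->count--;

    if (p->refcount == 1)
        return p;           /* stack is done with both halves: reuse */

    *released = p;          /* still referenced elsewhere: give it up */
    return NULL;
}
```

The point of the FIFO order is that the page being inspected is always the one that has waited longest, which is why the count check is likely to succeed.
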
Excellent results so far: performance on PowerPC is back, and x86 gets a gain as well.

The problem is XDP TX: I do not see any guarantee that mlx4_en_recycle_tx_desc() runs while the NAPI RX is owned by the current cpu.

Since TX completion uses a different NAPI, I really do not believe we can avoid an atomic operation, such as a spinlock, to protect the list of pages ( ring->page_cache ).
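
To illustrate the concern, here is a minimal user-space sketch of a page cache that two contexts may touch at once, guarded by a test-and-set lock built on C11 atomic_flag. Everything in it is an assumption made for illustration: struct page_cache, PAGE_CACHE_SIZE and the cache_* helpers are hypothetical and do not reproduce the real layout of ring->page_cache, and kernel code would use the kernel's own spinlock primitives rather than this hand-rolled lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PAGE_CACHE_SIZE 8   /* hypothetical capacity */

/* Hypothetical cache shared by the RX path and the TX completion path. */
struct page_cache {
    atomic_flag lock;       /* minimal test-and-set spinlock */
    size_t count;           /* number of cached pages */
    void *pages[PAGE_CACHE_SIZE];
};

static void cache_init(struct page_cache *c)
{
    atomic_flag_clear(&c->lock);
    c->count = 0;
}

static void cache_lock(struct page_cache *c)
{
    /* Spin until the flag was previously clear, i.e. we took the lock. */
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;
}

static void cache_unlock(struct page_cache *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * TX completion side: try to recycle a page into the cache.
 * Returns 1 if stored, 0 if the cache is full (the caller would free it).
 */
static int cache_put(struct page_cache *c, void *page)
{
    int stored = 0;

    cache_lock(c);
    if (c->count < PAGE_CACHE_SIZE) {
        c->pages[c->count++] = page;
        stored = 1;
    }
    cache_unlock(c);
    return stored;
}

/* RX side: take the most recently cached page, or NULL if the cache is empty. */
static void *cache_get(struct page_cache *c)
{
    void *page = NULL;

    cache_lock(c);
    if (c->count > 0)
        page = c->pages[--c->count];
    cache_unlock(c);
    return page;
}
```

Without the lock, a put from TX completion and a get from RX running on different cpus could both read and update count at the same time, losing or duplicating a page pointer.
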