On Sun, 2017-02-12 at 23:38 +0100, Jesper Dangaard Brouer wrote:
> Just so others understand this: The number of RX queue slots is
> indirectly the size of the page-recycle "cache" in this scheme (that
> depend on refcnt tricks to see if page can be reused).
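
For those less familiar with the scheme: the refcnt trick is, roughly, that
the driver only reuses a page if it holds the last reference by the time the
ring slot is refilled. A rough sketch of the idea (illustrative only, not the
exact mlx4 code; the helper names are made up):

	/*
	 * Sketch of a refcnt based recycle check.  The driver keeps one
	 * reference on each page posted in the RX ring and hands a second
	 * one to the stack with the skb.  When the slot is refilled, the
	 * page can be reused only if the stack has already dropped its
	 * reference, i.e. page_count() is back to 1.
	 */
	static bool rx_page_can_be_recycled(struct page *page)
	{
		if (page_count(page) != 1)	/* still held by some skb/socket */
			return false;
		if (page_to_nid(page) != numa_mem_id())	/* don't keep remote pages */
			return false;
		return true;
	}

	static struct page *rx_refill_slot(struct page *old)
	{
		struct page *page;

		if (old && rx_page_can_be_recycled(old)) {
			page = old;		/* reuse in place, no allocation */
		} else {
			if (old)
				put_page(old);	/* give up the driver's reference */
			page = dev_alloc_page();	/* fresh order-0 page */
			if (!page)
				return NULL;
		}

		page_ref_inc(page);	/* one ref for the driver, one for the skb */
		return page;
	}

Since the test is done per ring slot at refill time, the number of RX slots
is indeed an upper bound on how many pages can be recycle candidates at any
given moment.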
Note that the page recycle tricks only work on some occasions. To correctly
provision hosts dealing with TCP flows, one should not rely on page recycling
or any other opportunistic (non-guaranteed) behavior.

Page recycling, _if_ possible, will help to reduce system load and thus lower
latencies.

> > A single TCP flow easily can have more than 1024 MSS waiting in its
> > receive queue (typical receive window on linux is 6MB/2)
>
> So, you do need to increase the page-"cache" size, and need this for
> real-life cases, interesting.

I believe this sizing was done mostly to cope with normal system scheduling
constraints [1], reducing packet losses under incast blasts.

Sizing happened before I did my patches to switch to order-0 pages anyway.
The fact that it allowed page recycling to happen more often was nice, of
course.

[1]
- One cannot really assume the host will always have the ability to process
  the RX ring in time, unless maybe CPUs are fully dedicated to the napi
  polling logic.
- Recent work to shift softirqs to ksoftirqd is potentially magnifying the
  problem.
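
To put numbers on the quoted "more than 1024 MSS" point, a back-of-the-envelope
calculation (assumptions are mine: tcp_rmem[2] of 6 MB with about half usable
as receive window, as per the 6MB/2 figure above, a 1460 byte MSS, and a
1024-slot RX ring):

	#include <stdio.h>

	int main(void)
	{
		const unsigned long window = 6UL * 1024 * 1024 / 2;	/* ~3 MB */
		const unsigned long mss = 1460;
		const unsigned long ring_slots = 1024;
		unsigned long segs = window / mss;			/* ~2154 */

		printf("segments one flow can queue: %lu\n", segs);
		printf("RX ring slots: %lu (%.0f%% of one flow's window)\n",
		       ring_slots, 100.0 * ring_slots / segs);
		return 0;
	}

Under those assumptions a single well-fed flow can queue roughly twice as many
segments as a 1024-slot ring holds, so every page in the ring may be sitting
in one socket's receive queue with nothing left to recycle.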