On Thu, Nov 01, 2018 at 08:23:19PM +0000, Saeed Mahameed wrote:
> On Thu, 2018-11-01 at 23:27 +0800, Aaron Lu wrote:
> > On Thu, Nov 01, 2018 at 10:22:13AM +0100, Jesper Dangaard Brouer
> > wrote:
> > ... ...
> > > Section copied out:
> > >
> > >   mlx5e_poll_tx_cq
> > >   |
> > >    --16.34%--napi_consume_skb
> > >              |
> > >              |--12.65%--__free_pages_ok
> > >              |          |
> > >              |           --11.86%--free_one_page
> > >              |                     |
> > >              |                     |--10.10%--queued_spin_lock_slowpath
> > >              |                     |
> > >              |                      --0.65%--_raw_spin_lock
> >
> > This callchain looks like it is freeing higher order pages than order
> > 0:
> > __free_pages_ok is only called for pages whose order are bigger than
> > 0.
>
> mlx5 rx uses only order 0 pages, so i don't know where these high order
> tx SKBs are coming from..

Perhaps here:
__netdev_alloc_skb(), __napi_alloc_skb(), __netdev_alloc_frag() and
__napi_alloc_frag() all call page_frag_alloc(), which uses
__page_frag_cache_refill() to get an order 3 page if possible, and falls
back to an order 0 page if an order 3 page is not available.

I'm not sure whether your workload uses the above code path, though.
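
To make the fallback concrete, here is a minimal userspace sketch of the
"try order 3, else order 0" pattern. It is not the kernel code: every
sketch_* name, the 4 KiB page size, and the use of malloc() in place of the
page allocator are assumptions made for illustration only. The last helper
just encodes the point made earlier in the thread, that only pages of order
bigger than 0 show up under __free_pages_ok.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE      4096u /* assumption: 4 KiB pages */
#define SKETCH_FRAG_MAX_ORDER 3u    /* order 3 = 8 pages = 32 KiB */

/* Hypothetical stand-in for the page allocator state. */
struct sketch_allocator {
	bool high_order_available; /* false models fragmentation/pressure */
};

/* Hypothetical stand-in for a page allocation of the given order.
 * Returns NULL when a high-order request cannot be satisfied. */
void *sketch_alloc_pages(const struct sketch_allocator *a, unsigned int order)
{
	if (order > 0 && !a->high_order_available)
		return NULL;
	return malloc((size_t)SKETCH_PAGE_SIZE << order);
}

/* Models the refill step: try an order 3 page first, then fall back to
 * order 0. The order actually obtained is reported via *order_out. */
void *sketch_frag_cache_refill(const struct sketch_allocator *a,
			       unsigned int *order_out)
{
	void *page;

	assert(a != NULL && order_out != NULL);

	*order_out = SKETCH_FRAG_MAX_ORDER;
	page = sketch_alloc_pages(a, SKETCH_FRAG_MAX_ORDER);
	if (page == NULL) {
		*order_out = 0;
		page = sketch_alloc_pages(a, 0);
	}
	return page;
}

/* Per the thread: only pages of order bigger than 0 are freed through
 * the __free_pages_ok path seen in the profile. */
bool sketch_free_hits_high_order_path(unsigned int order)
{
	return order > 0;
}
```

With high_order_available set, the refill reports order 3, so a later free
of that page would take the high-order path; with it cleared, the refill
reports order 0 and that path is avoided.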