On 3/9/21 6:10 PM, Shay Agroskin wrote:
> The page cache holds pages we allocated in the past during the napi cycle,
> and tracks their availability status using the page ref count.
>
> The cache can hold up to 2048 pages. Upon allocating a page, we check
> whether the next entry in the cache contains an unused page, and if so
> fetch it. If the next page is already in use by another entity, or if it
> belongs to a different NUMA node than the napi routine, we allocate a
> page in the regular way (a page from a different NUMA node is replaced
> by the newly allocated page).
>
> This scheme can help us reduce the contention between different cores
> when allocating pages, since every cache is unique to a queue.
For reference, many drivers already use a similar strategy.
> +
> +/* Fetch the cached page (mark the page as used and pass it to the caller).
> + * If the page belongs to a different NUMA node than the current one, free
> + * the cached page and allocate another one instead.
> + */
> +static struct page *ena_fetch_cache_page(struct ena_ring *rx_ring,
> +					 struct ena_page *ena_page,
> +					 dma_addr_t *dma,
> +					 int current_nid)
> +{
> +	/* Remove pages belonging to a node other than current_nid from the cache */
> +	if (unlikely(page_to_nid(ena_page->page) != current_nid)) {
> +		ena_increase_stat(&rx_ring->rx_stats.lpc_wrong_numa, 1,
> +				  &rx_ring->syncp);
> +		ena_replace_cache_page(rx_ring, ena_page);
> +	}
> +
>
And they use dev_page_is_reusable() instead of copy/pasting this logic.
As a bonus, they properly deal with pfmemalloc pages.