When receiving traffic, eth_type_trans() is high up on the perf top list, because it's the first function which accesses the packet data.
Move the DMA unmap a bit higher, and put a prefetch just after it, so we
have more time to load the data into the cache.

The packet rate increase is about 13% with a tc drop test: 1620 => 1830 kpps

Signed-off-by: Matteo Croce <mcr...@redhat.com>
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 111b3b8239e1..17378e0d8da1 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -2966,6 +2966,11 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 			continue;
 		}
 
+		dma_unmap_single(dev->dev.parent, dma_addr,
+				 bm_pool->buf_size, DMA_FROM_DEVICE);
+
+		prefetch(data);
+
 		if (bm_pool->frag_size > PAGE_SIZE)
 			frag_size = 0;
 		else
@@ -2983,9 +2988,6 @@ static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
 			goto err_drop_frame;
 		}
 
-		dma_unmap_single(dev->dev.parent, dma_addr,
-				 bm_pool->buf_size, DMA_FROM_DEVICE);
-
 		rcvd_pkts++;
 		rcvd_bytes += rx_bytes;
 
-- 
2.21.0