On Fri,  5 Jul 2019 13:47:47 +0300
Ilias Apalodimas <ilias.apalodi...@linaro.org> wrote:

> Quoting Arnd,
> We have to do a sync_single_for_device /somewhere/ before the
> buffer is given to the device. On a non-cache-coherent machine with
> a write-back cache, there may be dirty cache lines that get written back
> after the device DMA's data into it (e.g. from a previous memset
> from before the buffer got freed), so you absolutely need to flush any
> dirty cache lines on it first.
> 
> Since the coherency is configurable in this device make sure we cover
> all configurations by explicitly syncing the allocated buffer for the
> device before refilling it's descriptors
> 
> Signed-off-by: Ilias Apalodimas <ilias.apalodi...@linaro.org>
> ---
> Changes since v2:
> - Only sync for the portion of the packet owned by the NIC as suggested by 
>   Jesper

Acked-by: Jesper Dangaard Brouer <bro...@redhat.com>

Some general comments below.

>  drivers/net/ethernet/socionext/netsec.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/ethernet/socionext/netsec.c 
> b/drivers/net/ethernet/socionext/netsec.c
> index 5544a722543f..6b954ad88842 100644
> --- a/drivers/net/ethernet/socionext/netsec.c
> +++ b/drivers/net/ethernet/socionext/netsec.c
> @@ -727,6 +727,7 @@ static void *netsec_alloc_rx_data(struct netsec_priv 
> *priv,
>  {
>  
>       struct netsec_desc_ring *dring = &priv->desc_ring[NETSEC_RING_RX];
> +     enum dma_data_direction dma_dir;
>       struct page *page;
>  
>       page = page_pool_dev_alloc_pages(dring->page_pool);
> @@ -742,6 +743,8 @@ static void *netsec_alloc_rx_data(struct netsec_priv 
> *priv,
>        * cases and reserve enough space for headroom + skb_shared_info
>        */
>       *desc_len = PAGE_SIZE - NETSEC_RX_BUF_NON_DATA;
> +     dma_dir = page_pool_get_dma_dir(dring->page_pool);
> +     dma_sync_single_for_device(priv->dev, *dma_handle, *desc_len, dma_dir);

Following the API this seems to turn into a noop if dev_is_dma_coherent().

Thus, I don't think it is worth optimizing further, as I suggested
earlier, with only sync of previous packet length.   This sync of the
"full" possible payload-data area (without headroom) is likely the best
and simplest option.  I don't think we should extend and complicate
the API for optimizing for non-coherent DMA hardware.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Reply via email to