On Fri, 5 Jul 2019 13:47:47 +0300 Ilias Apalodimas <ilias.apalodi...@linaro.org> wrote:
> Quoting Arnd, > We have to do a sync_single_for_device /somewhere/ before the > buffer is given to the device. On a non-cache-coherent machine with > a write-back cache, there may be dirty cache lines that get written back > after the device DMA's data into it (e.g. from a previous memset > from before the buffer got freed), so you absolutely need to flush any > dirty cache lines on it first. > > Since the coherency is configurable in this device make sure we cover > all configurations by explicitly syncing the allocated buffer for the > device before refilling it's descriptors > > Signed-off-by: Ilias Apalodimas <ilias.apalodi...@linaro.org> > --- > Changes since v2: > - Only sync for the portion of the packet owned by the NIC as suggested by > Jesper Acked-by: Jesper Dangaard Brouer <bro...@redhat.com> Some general comments below. > drivers/net/ethernet/socionext/netsec.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/ethernet/socionext/netsec.c > b/drivers/net/ethernet/socionext/netsec.c > index 5544a722543f..6b954ad88842 100644 > --- a/drivers/net/ethernet/socionext/netsec.c > +++ b/drivers/net/ethernet/socionext/netsec.c > @@ -727,6 +727,7 @@ static void *netsec_alloc_rx_data(struct netsec_priv > *priv, > { > > struct netsec_desc_ring *dring = &priv->desc_ring[NETSEC_RING_RX]; > + enum dma_data_direction dma_dir; > struct page *page; > > page = page_pool_dev_alloc_pages(dring->page_pool); > @@ -742,6 +743,8 @@ static void *netsec_alloc_rx_data(struct netsec_priv > *priv, > * cases and reserve enough space for headroom + skb_shared_info > */ > *desc_len = PAGE_SIZE - NETSEC_RX_BUF_NON_DATA; > + dma_dir = page_pool_get_dma_dir(dring->page_pool); > + dma_sync_single_for_device(priv->dev, *dma_handle, *desc_len, dma_dir); Following the API this seems to turn into a noop if dev_is_dma_coherent(). Thus, I don't think it is worth optimizing further, as I suggested earlier, with only sync of previous packet length. This sync of the "full" possible payload-data area (without headroom) is likely the best and simplest option. I don't think we should extend and complicate the API for optimizing for non-coherent DMA hardware. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer