On Fri, 2018-11-16 at 09:20 -0600, Thor Thayer wrote:
> Hi Dalon,
> 
> Just a few comments/questions.
> 
> On 11/14/18 6:50 PM, Dalon Westergreen wrote:
> > From: Dalon Westergreen <dalon.westergr...@intel.com>
> > 
> > Add support for the mSGDMA prefetcher.  The prefetcher adds support
> > for a linked list of descriptors in system memory.  The prefetcher
> > feeds these to the mSGDMA dispatcher.
> > 
> > The prefetcher is configured to poll for the next descriptor in the
> > list to be owned by hardware, then pass the descriptor to the
> > dispatcher.  It will then poll the next descriptor until it is
> > owned by hardware.
> > 
> > The dispatcher responses are written back to the appropriate
> > descriptor, and the owned by hardware bit is cleared.
> > 
> > The driver sets up a linked list twice the tx and rx ring sizes,
> > with the last descriptor pointing back to the first.  This ensures
> > that the ring of descriptors will always have inactive descriptors
> > preventing the prefetcher from looping over and reusing descriptors
> > inappropriately.  The prefetcher will continuously loop over these
> > descriptors.  The driver modifies descriptors as required to update
> > the skb address and length as well as the owned by hardware bit.
> > 
> > In addition to the above, the mSGDMA prefetcher can be used to
> > handle rx and tx timestamps coming from the ethernet ip.  These
> > can be included in the prefetcher response in the descriptor.
> > 
> > Signed-off-by: Dalon Westergreen <dalon.westergr...@intel.com>
> > ---
> >   drivers/net/ethernet/altera/Makefile          |   2 +-
> >   .../altera/altera_msgdma_prefetcher.c         | 433 ++++++++++++++++++
> >   .../altera/altera_msgdma_prefetcher.h         |  30 ++
> >   .../altera/altera_msgdmahw_prefetcher.h       |  87 ++++
> >   drivers/net/ethernet/altera/altera_tse.h      |  14 +
> >   drivers/net/ethernet/altera/altera_tse_main.c |  51 +++
> >   6 files changed, 616 insertions(+), 1 deletion(-)
> >   create mode 100644 drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> >   create mode 100644 drivers/net/ethernet/altera/altera_msgdma_prefetcher.h
> >   create mode 100644
> > drivers/net/ethernet/altera/altera_msgdmahw_prefetcher.h
> > 
> > diff --git a/drivers/net/ethernet/altera/Makefile
> > b/drivers/net/ethernet/altera/Makefile
> > index ad80be42fa26..73b32876f126 100644
> > --- a/drivers/net/ethernet/altera/Makefile
> > +++ b/drivers/net/ethernet/altera/Makefile
> > @@ -5,4 +5,4 @@
> >   obj-$(CONFIG_ALTERA_TSE) += altera_tse.o
> >   altera_tse-objs := altera_tse_main.o altera_tse_ethtool.o \
> >                altera_msgdma.o altera_sgdma.o altera_utils.o \
> > -              altera_ptp.o
> > +              altera_ptp.o altera_msgdma_prefetcher.o
> > diff --git a/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> > b/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> > new file mode 100644
> > index 000000000000..55b475e9e15b
> > --- /dev/null
> > +++ b/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> > @@ -0,0 +1,433 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* MSGDMA Prefetcher driver for Altera ethernet devices
> > + *
> > + * Copyright (C) 2018 Intel Corporation. All rights reserved.
> > + * Author(s):
> > + *   Dalon Westergreen <dalon.westergr...@intel.com>
> > + */
> > +
> > +#include <linux/list.h>
> > +#include <linux/netdevice.h>
> > +#include <linux/net_tstamp.h>
> > +#include "altera_utils.h"
> > +#include "altera_tse.h"
> > +#include "altera_msgdma.h"
> > +#include "altera_msgdmahw.h"
> > +#include "altera_msgdma_prefetcher.h"
> > +#include "altera_msgdmahw_prefetcher.h"
> 
> These could be alphabetized - tse and utils at the end.

sure thing.

> > +
> > +int msgdma_pref_initialize(struct altera_tse_private *priv)
> > +{
> > +   int i;
> > +   struct msgdma_pref_extended_desc *rx_descs;
> > +   struct msgdma_pref_extended_desc *tx_descs;
> > +   dma_addr_t rx_descsphys;
> > +   dma_addr_t tx_descsphys;
> > +   u32 rx_ring_size;
> > +   u32 tx_ring_size;
> > +
> > +   priv->pref_rxdescphys = (dma_addr_t)0;
> > +   priv->pref_txdescphys = (dma_addr_t)0;
> > +
> > +   /* we need to allocate more pref descriptors than ringsize, for now
> > +    * just double ringsize
> > +    */
> > +   rx_ring_size = priv->rx_ring_size * 2;
> > +   tx_ring_size = priv->tx_ring_size * 2;
> > +
> > +   /* The prefetcher requires the descriptors to be aligned to the
> > +    * descriptor read/write master's data width which worst case is
> > +    * 512 bits.  Currently we DO NOT CHECK THIS and only support 32-bit
> > +    * prefetcher masters.
> > +    */
> > +
> > +   /* allocate memory for rx descriptors */
> > +   priv->pref_rxdesc =
> > +           dma_zalloc_coherent(priv->device,
> > +                               sizeof(struct msgdma_pref_extended_desc)
> > +                               * rx_ring_size,
> > +                               &priv->pref_rxdescphys, GFP_KERNEL);
> > +
> > +   if (!priv->pref_rxdesc)
> > +           goto err_rx;
> > +
> > +   /* allocate memory for tx descriptors */
> > +   priv->pref_txdesc =
> > +           dma_zalloc_coherent(priv->device,
> > +                               sizeof(struct msgdma_pref_extended_desc)
> > +                               * tx_ring_size,
> > +                               &priv->pref_txdescphys, GFP_KERNEL);
> > +
> > +   if (!priv->pref_txdesc)
> > +           goto err_tx;
> > +
> > +   /* setup base descriptor ring for tx & rx */
> > +   rx_descs = (struct msgdma_pref_extended_desc *)priv->pref_rxdesc;
> > +   tx_descs = (struct msgdma_pref_extended_desc *)priv->pref_txdesc;
> > +   tx_descsphys = priv->pref_txdescphys;
> > +   rx_descsphys = priv->pref_rxdescphys;
> > +
> > +   /* setup RX descriptors */
> > +   priv->pref_rx_prod = 0;
> > +   for (i = 0; i < rx_ring_size; i++) {
> > +           rx_descsphys = priv->pref_rxdescphys +
> > +                   (((i + 1) % rx_ring_size) *
> > +                   sizeof(struct msgdma_pref_extended_desc));
> > +           rx_descs[i].next_desc_lo = lower_32_bits(rx_descsphys);
> > +           rx_descs[i].next_desc_hi = upper_32_bits(rx_descsphys);
> > +           rx_descs[i].stride = MSGDMA_DESC_RX_STRIDE;
> > +           /* burst set to 0 so it defaults to max configured */
> > +           /* set seq number to desc number */
> > +           rx_descs[i].burst_seq_num = i;
> > +   }
> > +
> > +   /* setup TX descriptors */
> > +   for (i = 0; i < tx_ring_size; i++) {
> > +           tx_descsphys = priv->pref_txdescphys +
> > +                   (((i + 1) % tx_ring_size) *
> > +                   sizeof(struct msgdma_pref_extended_desc));
> > +           tx_descs[i].next_desc_lo = lower_32_bits(tx_descsphys);
> > +           tx_descs[i].next_desc_hi = upper_32_bits(tx_descsphys);
> > +           tx_descs[i].stride = MSGDMA_DESC_TX_STRIDE;
> > +           /* burst set to 0 so it defaults to max configured */
> > +           /* set seq number to desc number */
> > +           tx_descs[i].burst_seq_num = i;
> > +   }
> > +
> > +   if (netif_msg_ifup(priv))
> > +           netdev_info(priv->dev, "%s: RX Desc mem at 0x%x\n", __func__,
> > +                       priv->pref_rxdescphys);
> > +
> > +   if (netif_msg_ifup(priv))
> > +           netdev_info(priv->dev, "%s: TX Desc mem at 0x%x\n", __func__,
> > +                       priv->pref_txdescphys);
> > +
> > +   return 0;
> > +
> > +err_tx:
> > +   dma_free_coherent(priv->device,
> > +                     sizeof(struct msgdma_pref_extended_desc)
> > +                     * rx_ring_size,
> > +                     priv->pref_rxdesc, priv->pref_rxdescphys);
> > +err_rx:
> > +   return -ENOMEM;
> > +}
> > +
> > +void msgdma_pref_uninitialize(struct altera_tse_private *priv)
> > +{
> > +   if (priv->pref_rxdesc)
> > +           dma_free_coherent(priv->device,
> > +                             sizeof(struct msgdma_pref_extended_desc)
> > +                             * priv->rx_ring_size * 2,
> > +                             priv->pref_rxdesc, priv->pref_rxdescphys);
> > +
> > +   if (priv->pref_txdesc)
> > +           dma_free_coherent(priv->device,
> > +                             sizeof(struct msgdma_pref_extended_desc)
> > +                             * priv->rx_ring_size * 2,
> > +                             priv->pref_txdesc, priv->pref_txdescphys);
> 
> Why does this have the ring_size*2 but the error path in 
> msgdma_pref_initialize() above (err_tx path) doesn't?

The msgdma_pref_initialize uses a local rx_ring_size which is already 2 *
priv->rx_ring_size.  i'll just make it more clear and fix the other error
above where priv->rx_ring_size is being used for the tx_ring_size.

> 
> > +}
> > +
> > +void msgdma_pref_enable_txirq(struct altera_tse_private *priv)
> > +{
> > +   tse_set_bit(priv->tx_pref_csr, msgdma_pref_csroffs(control),
> > +               MSGDMA_PREF_CTL_GLOBAL_INTR);
> > +}
> > +
> 
> <snip>
> 
> > +
> > +void msgdma_pref_reset(struct altera_tse_private *priv)
> > +{
> > +   int counter;
> > +
> > +   /* turn off polling */
> > +   tse_clear_bit(priv->rx_pref_csr, msgdma_pref_csroffs(control),
> > +                 MSGDMA_PREF_CTL_DESC_POLL_EN);
> > +   tse_clear_bit(priv->tx_pref_csr, msgdma_pref_csroffs(control),
> > +                 MSGDMA_PREF_CTL_DESC_POLL_EN);
> > +
> > +   /* Reset the RX Prefetcher */
> > +   csrwr32(MSGDMA_PREF_STAT_IRQ, priv->rx_pref_csr,
> > +           msgdma_pref_csroffs(status));
> > +   csrwr32(MSGDMA_PREF_CTL_RESET, priv->rx_pref_csr,
> > +           msgdma_pref_csroffs(control));
> > +
> > +   counter = 0;
> > +   while (counter++ < ALTERA_TSE_SW_RESET_WATCHDOG_CNTR) {
> > +           if (tse_bit_is_clear(priv->rx_pref_csr,
> > +                                msgdma_pref_csroffs(control),
> > +                                MSGDMA_PREF_CTL_RESET))
> > +                   break;
> > +           udelay(1);
> > +   }
> > +
> > +   if (counter >= ALTERA_TSE_SW_RESET_WATCHDOG_CNTR)
> > +           netif_warn(priv, drv, priv->dev,
> > +                      "TSE Rx Prefetcher reset bit never cleared!\n");
> > +
> I take it there are no negative consequences for the reset bit not 
> clearing? Would it be useful to return an error?

I will look into what i can do with this, it is no different than the
currently implemented msgdma code, not that that is an excuse. 

> 
> > +   /* clear all status bits */
> > +   csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> > +           msgdma_pref_csroffs(status));
> > +
> 
> This looks the same as below. Are they the same or did I miss something?

nope, that's an error. thanks.

> 
> > +   /* Reset the TX Prefetcher */
> > +   csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> > +           msgdma_pref_csroffs(status));
> > +   csrwr32(MSGDMA_PREF_CTL_RESET, priv->tx_pref_csr,
> > +           msgdma_pref_csroffs(control));
> > +
> > +   counter = 0;
> > +   while (counter++ < ALTERA_TSE_SW_RESET_WATCHDOG_CNTR) {
> > +           if (tse_bit_is_clear(priv->tx_pref_csr,
> > +                                msgdma_pref_csroffs(control),
> > +                                MSGDMA_PREF_CTL_RESET))
> > +                   break;
> > +           udelay(1);
> > +   }
> > +
> > +   if (counter >= ALTERA_TSE_SW_RESET_WATCHDOG_CNTR)
> > +           netif_warn(priv, drv, priv->dev,
> > +                      "TSE Tx Prefetcher reset bit never cleared!\n");
> > +
> Same as above.
> 
> > +   /* clear all status bits */
> > +   csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> > +           msgdma_pref_csroffs(status));
> > +
> > +   /* Reset mSGDMA dispatchers*/
> > +   msgdma_reset(priv);
> > +}
> > +
> 
> <snip>
> 
> > +
> > +/* Add MSGDMA Prefetcher Descriptor to descriptor list
> > + *   -> This should never be called when a descriptor isn't available
> > + */
> > +void msgdma_pref_add_rx_desc(struct altera_tse_private *priv,
> > +                        struct tse_buffer *rxbuffer)
> > +{
> > +   struct msgdma_pref_extended_desc *rx_descs = priv->pref_rxdesc;
> > +   u32 desc_entry = priv->pref_rx_prod % (priv->rx_ring_size * 2);
> > +
> > +   /* write descriptor entries */
> > +   rx_descs[desc_entry].len = priv->rx_dma_buf_sz;
> > +   rx_descs[desc_entry].write_addr_lo = lower_32_bits(rxbuffer->dma_addr);
> > +   rx_descs[desc_entry].write_addr_hi = upper_32_bits(rxbuffer->dma_addr);
> > +
> > +   /* set the control bits and set owned by hw */
> > +   rx_descs[desc_entry].desc_control = (MSGDMA_DESC_CTL_END_ON_EOP
> > +                   | MSGDMA_DESC_CTL_END_ON_LEN
> > +                   | MSGDMA_DESC_CTL_TR_COMP_IRQ
> > +                   | MSGDMA_DESC_CTL_EARLY_IRQ
> > +                   | MSGDMA_DESC_CTL_TR_ERR_IRQ
> > +                   | MSGDMA_DESC_CTL_GO
> > +                   | MSGDMA_PREF_DESC_CTL_OWNED_BY_HW);
> > +
> > +   /* we need to keep a separate one for rx as RX_DESCRIPTORS are
> > +    * pre-configured at startup
> > +    */
> > +   priv->pref_rx_prod++;
> 
> Can you explain more in the comment? What is "one"?

i will clean up the comment.  The rx path keeps it's own copy
of pref_rx_prod as the pref ring size is 2 * tse ring size.  I'll
also see how tricky it is to get rid of this copy.

> 
> > +
> > +   if (netif_msg_rx_status(priv)) {
> > +           netdev_info(priv->dev, "%s: desc: %d buf: %d control 0x%x\n",
> > +                       __func__, desc_entry,
> > +                       priv->rx_prod % priv->rx_ring_size,
> > +                       priv->pref_rxdesc[desc_entry].desc_control);
> > +   }
> > +}
> > +
> 
> <snip>
> 
> Thanks for the patches!

Nope, thank you for looking them over.

Reply via email to