On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29...@freescale.com wrote: > From: Forrest Shi <b29...@freescale.com> > > dma status check function fsl_tx_status is heavily called in > a tight loop and the desc lock in fsl_tx_status contended by > the dma status update function. this caused the dma performance > degrades much. > > this patch releases the lock in the fsl_tx_status function. > I believe it has no neglect impact on the following call of > dma_async_is_complete(...). > > we can see below three conditions will be identified as success > a) x < complete < use > b) x < complete+N < use+N > c) x < complete < use+N > here complete is the completed_cookie, use is the last_used > cookie, x is the querying cookie, N is MAX cookie > > when chan->completed_cookie is being read, the last_used may > be incresed. Anyway it has no neglect impact on the dma status > decision. > > Signed-off-by: Forrest Shi <xuelin....@freescale.com> > --- > drivers/dma/fsldma.c | 5 ----- > 1 files changed, 0 insertions(+), 5 deletions(-) > > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c > index 8a78154..1dca56f 100644 > --- a/drivers/dma/fsldma.c > +++ b/drivers/dma/fsldma.c > @@ -986,15 +986,10 @@ static enum dma_status fsl_tx_status(struct dma_chan > *dchan, > struct fsldma_chan *chan = to_fsl_chan(dchan); > dma_cookie_t last_complete; > dma_cookie_t last_used; > - unsigned long flags; > - > - spin_lock_irqsave(&chan->desc_lock, flags); >
This will cause a bug. See below for a detailed explanation. You need this instead: /* * On an SMP system, we must ensure that this CPU has seen the * memory accesses performed by another CPU under the * chan->desc_lock spinlock. */ smp_mb(); > last_complete = chan->completed_cookie; > last_used = dchan->cookie; > > - spin_unlock_irqrestore(&chan->desc_lock, flags); > - > dma_set_tx_state(txstate, last_complete, last_used, 0); > return dma_async_is_complete(cookie, last_complete, last_used); > } Facts: - dchan->cookie is the same member as chan->common.cookie (same memory location) - chan->common.cookie is the "last allocated cookie for a pending transaction" - chan->completed_cookie is the "last completed transaction" I have replaced "dchan->cookie" with "chan->common.cookie" in the below explanation, to keep everything referenced from the same structure. Variable usage before your change. Everything is used locked. - RW chan->common.cookie (fsl_dma_tx_submit) - R chan->common.cookie (fsl_tx_status) - R chan->completed_cookie (fsl_tx_status) - W chan->completed_cookie (dma_do_tasklet) Variable usage after your change: - RW chan->common.cookie LOCKED - R chan->common.cookie NO LOCK - R chan->completed_cookie NO LOCK - W chan->completed_cookie LOCKED What if we assume that you have a 2 CPU system (such as a P2020). After your changes, one possible sequence is: === CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() === spin_lock_irqsave descriptor->cookie = 20 (x in your example) chan->common.cookie = 20 (used in your example) spin_unlock_irqrestore === CPU2 - immediately calls fsl_tx_status() === chan->common.cookie == 19 chan->completed_cookie == 19 descriptor->cookie == 20 Since we don't have locks anymore, CPU2 may not have seen the write to chan->common.cookie yet. Also assume that the DMA hardware has not started processing the transaction yet. Therefore dma_do_tasklet() has not been called, and chan->completed_cookie has not been updated. In this case, dma_async_is_complete() (on CPU2) returns DMA_SUCCESS, even though the DMA operation has not succeeded. The DMA operation has not even started yet! The smp_mb() fixes this, since it forces CPU2 to have seen all memory operations that happened before CPU1 released the spinlock. Spinlocks are implicit SMP memory barriers. Therefore, the above example becomes: smp_mb(); chan->common.cookie == 20 chan->completed_cookie == 19 descriptor->cookie == 20 Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct. Thanks, Ira _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev