On Thu, Jun 4, 2015 at 7:00 PM, Alistair Francis <alistair.fran...@xilinx.com> wrote: > On Thu, Jun 4, 2015 at 8:52 AM, Peter Crosthwaite > <peter.crosthwa...@xilinx.com> wrote: >> On Wed, May 27, 2015 at 12:37 AM, Alistair Francis >> <alistair.fran...@xilinx.com> wrote: >>> Previously the stream_running() function didn't check >>> if the DMA was halted. This caused hangs in recent versions >>> of MicroBlaze u-boot. Correct stream_running() to check >>> DMASR_HALTED as well as DMACR_RUNSTOP. >>> >> >> So I'm stuggling with this one. Partly because I think HALTED might be >> misimplemented in existing code. I did some digging, and AFAICS, >> HALTED is conditional on !DAMCR_RUNSTOP. I think i might have got >> 210914e29975d17e635f9e8c1f7478c0ed7a208f wrong: >> >> @@ -276,7 +276,7 @@ static void stream_process_mem2s(struct Stream *s, >> stream_desc_load(s, s->regs[R_CURDESC]); >> >> if (s->desc.status & SDESC_STATUS_COMPLETE) { >> - s->regs[R_DMASR] |= DMASR_IDLE; >> + s->regs[R_DMASR] |= DMASR_HALTED; >> break; >> } >> >> Stepping back and ignoring the existing implementation of HALTED there >> are 4 states of RS:H (RUNSTOP and HALTED): >> >> !RS && H - this is the off state. doc refers to this as the "halted" state. >> RS && !H - This is the running state. >> !RS && !H - This is the transient state. Software has cleared RS but >> there s still something on AXI bus so cant assert halted yet. >> RS && H - This is an invalid state. >> >> Current code reaches the invalid state on the ring buffer full case >> due to the bug above. My thoery is >> 210914e29975d17e635f9e8c1f7478c0ed7a208f should have just been: >> >> if (s->desc.status & SDESC_STATUS_COMPLETE) { >> - s->regs[R_DMASR] |= DMASR_IDLE; >> break; >> } >> >> Now I think there is yet another bug in that clearing RS doesn't seem >> to be able to reliably set the HALTED bit (only in the unrelated case >> of a ring buffer fill). >> >> I'm starting to question whether the HALTED bit as far as QEMU is >> concerned should just be a straight negation of RS. Depending on what >> the conditions cause a transient and what doesn't, the transient as I >> describe above may evaporate as we can get away with this simple >> shortcut. > > Hey Peter, > > Ok, so you are saying maybe we should only have 'Running' and 'Off' > states? > >> >> This would make this patch obsolete without fixing your bug :). >> >> So running on the assumption that HALTED is misimplemented your patch >> is doing something with that behaviour. The misimplemented HALTED is >> currently holding the state of "we are blocked on a full buffer". If >> you can point me which of the 3 call sites of stream_running was >> giving you problems I might have more clues. > > So the problem was basically because: > - axienet_eth_rx_notify() calls the DMA push functions > > - xilinx_axidma_data_stream_can_push() returned true > - Therefore xilinx_axidma_data_stream_push() could be called > > - This meant that stream_process_s2mem() was called > - But it would break straight away as > 's->desc.status & SDESC_STATUS_COMPLETE' would return true. > - This meant it would cause an infinite loop as the stream could be > pushed, but nothing was ever pushed. As ret was always zero > the code never broke out of the second while loop in > axienet_eth_rx_notify() >
Ok I think I know what is up. It happens by fluke, that the bad implementation of halted is exactly the state you need to correct this issue. You could capture it as a new boolean in the state struct and just check it in can_push. I'm looking at ways to refactor it to avoid the extra calls to _push though. It might be possible to refactor such that all of stream_running, stream_idle and the SDESC_STATUS_COMPLETE happen in can_push. This means can_push has to do the stream_desc_load. But that is ok, as the current solution will already have spurious fetches of the descriptor. Then make can_push itself the iteration condition for _mem2s's loop which will handle descriptor fetches for you in the process. Regards, Peter > Thanks, > > Alistair > >> >> FYI you patch may still be correct but I wondering whether is has >> uncovered a bug that should lead to a rework of this. >> >> Regards, >> Peter >> >>> Signed-off-by: Alistair Francis <alistair.fran...@xilinx.com> >>> Reviewed-by: Sai Pavan Boddu <saip...@xilinx.com> >>> --- >>> hw/dma/xilinx_axidma.c | 3 ++- >>> 1 files changed, 2 insertions(+), 1 deletions(-) >>> >>> diff --git a/hw/dma/xilinx_axidma.c b/hw/dma/xilinx_axidma.c >>> index d06002d..27fba40 100644 >>> --- a/hw/dma/xilinx_axidma.c >>> +++ b/hw/dma/xilinx_axidma.c >>> @@ -154,7 +154,8 @@ static inline int stream_resetting(struct Stream *s) >>> >>> static inline int stream_running(struct Stream *s) >>> { >>> - return s->regs[R_DMACR] & DMACR_RUNSTOP; >>> + return s->regs[R_DMACR] & DMACR_RUNSTOP && >>> + !(s->regs[R_DMASR] & DMASR_HALTED); >>> } >>> >>> static inline int stream_idle(struct Stream *s) >>> -- >>> 1.7.1 >>> >>> >> >