Just a short summary: your suggested fix in stm32_freeframe seems to have 
resolved the issue, at least for the moment. So far it handles our traffic and 
several ping floods well. 

The suggested IOB solution does not address our problem, since we do not use the 
buffered functions.
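
For anyone reading along, here is a minimal, self-contained sketch of why the 
stm32_freeframe change matters. It is a model only, not the driver code: the 
struct, the FD_BIT value and free_frame() are hypothetical stand-ins. It 
assumes a frame that spans several TX descriptors, each holding one buffer, 
and compares releasing only the first-descriptor buffer with releasing all of 
them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ETH_TDES3_RD_FD (first-descriptor flag) */
#define FD_BIT (1u << 29)

/* Simplified TX descriptor: only the fields this model needs */
struct txdesc
{
  uint32_t des3;       /* status/control word carrying the FD flag */
  bool     has_buffer; /* true while a buffer is still attached */
};

/* Release the buffers of the n descriptors that make up one frame.
 * With only_first set, descriptors without the FD flag are skipped,
 * which mimics the check that was removed from the driver.
 * Returns the number of buffers released.
 */
static int free_frame(struct txdesc *desc, size_t n, bool only_first)
{
  int freed = 0;

  assert(desc != NULL);

  for (size_t i = 0; i < n; i++)
    {
      if (only_first && (desc[i].des3 & FD_BIT) == 0)
        {
          continue; /* buffer stays attached and is leaked */
        }

      if (desc[i].has_buffer)
        {
          desc[i].has_buffer = false;
          freed++;
        }
    }

  return freed;
}
```

In this model a three-descriptor frame gives back one buffer with 
only_first set and all three without it, which matches the "run out of 
buffers" symptom described further down in the thread.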

On 2020/02/22 07:19:16, Jukka Laitinen <jukka.laiti...@iki.fi> wrote: 
> Hi!
> 
> On 21.2.2020 16.08, Reto Gähwiler wrote:
> > Hi Jukka, 
> > 
> > First of all, thanks for your time and hints. Just applied your suggested 
> > change to the stm32_freebuffer. 
> 
> Don't mention it, I am sort of responsible for these bugs originally ;)
> 
> > About the second suggested fix in stm32_freesegment, what do you mean by 
> > that? The stm32_get_next_rxdesc just increments the descriptor, this should 
> > be independent from advancing the rx tail pointer, shouldn't it? But the 
> > tail pointer should be advanced by the just modified rx-descriptor!
> 
> In this function the stm32_get_next_rxdesc will return the pointer to the 
> next descriptor to be freed. In case there are no more RX descriptors to be 
> freed, this pointer will point to the START of the FIRST free descriptor, and 
> the tail pointer should point to the END of the LAST free descriptor, which 
> is the same ptr. The DMA fills the descriptors up to this pointer (so also 
> the last descriptor is available).
> 
> So currently the tail pointer lags one desc behind, and actually the last 
> free rx desc can't be filled in by dma (since the tail pointer must point to 
> the END of the last free RX desc, and now it points to the START of it).
> 
> The code, IMHO, should be just like this:
> 
>       /* Get the next RX descriptor in the chain */
> 
>       rxdesc = stm32_get_next_rxdesc(priv, rxdesc);
> 
>       /* Update the tail pointer */
> 
>       stm32_putreg((uintptr_t)rxdesc, STM32_ETH_DMACRXDTPR);
> 
> If you enable the network error debug outputs, it is worthwhile to also clear 
> the RBU flag in the same function later after it has been checked:
> if (..RBU) {
>       /* Clear the RBU flag */
> 
>       stm32_putreg(ETH_DMACSR_RBU, STM32_ETH_DMACSR);
> 
> Since that bit will not clear by itself, and you'll be spammed by the err...
> 
> Another, but independent, source of networking problems is not having enough 
> IOBs or enough ethernet buffers. You can easily check both by enabling the 
> corresponding CONFIG_DEBUGs.
> 
> -Jukka
> 
> > 
> > Will let you guys know on Monday how things are with the little change in 
> > the stm32_freebuffer. Have a nice weekend, 
> > Reto
> > 
> > On 2020/02/21 08:07:41, Jukka Laitinen <jukka.laiti...@iki.fi> wrote: 
> >> Hi,
> >>
> >> Reviewing the ethernet driver, I can see a couple of bugs:
> >>
> >> 1) In stm32_freeframe, it should free all the buffers, and not just the 
> >> first one. So remove the "if ((txdesc->des3 & ETH_TDES3_RD_FD) != 0)"
> >>
> >> That may cause it to run out of buffers.
> >>
> >> 2) In stm32_freesegment, the order of getting the next descriptor, and 
> >> updating the tail pointer is wrong. It should first call the 
> >> stm32_get_next_rxdesc and only after that the 
> >> stm32_putreg((uintptr_t)rxdesc, STM32_ETH_DMACRXDTPR);
> >>
> >> This is not fatal, but leads to the driver not using one of the 
> >> descriptors and buffers at all (there is one less buffer in use and 
> >> wasted memory for that).
> >>
> >> I am very sorry, but I am unable to provide any patches currently.
> >>
> >> Regards,
> >> Jukka
> >>
> >>
> >> On 20.2.2020 9.03, Reto Gähwiler wrote:
> >>> @Gregory: Thanks for your responses and input, will see what I can do. 
> >>>
> >>> On 2020/02/19 22:50:02, Gregory Nutt <spudan...@gmail.com> wrote: 
> >>>>
> >>>>> This sounds a lot like the problem I'm having with the SAMA5D36 Gigabit
> >>>>> ethernet... I'm running into some kind of deadlock on long transfers that
> >>>>> send packets very quickly. NuttX seems to run out of IOBs and then can't
> >>>>> send or respond to network packets.
> >>>>>
> >>>>> I tried increasing the low priority worker threads to 2 (and also 3) but
> >>>>> neither of them solved the problem.
> >>>>>
> >>>>> I'll look at the net_lock() to see if there's a way to release it.
> >>>>>
> >>>>> If you find a solution, I would love to know it! If I find one, I'll post
> >>>>> it here.
> >>>>
> >>>> The first step in debugging a deadlock is to find what is stuck waiting 
> >>>> for what resource.
> >>>>
> >>>> Then find the logic that provides the resource that is being waited on.
> >>>>
> >>>> Then figure out why that logic is not running.  Most likely, it would be 
> >>>> waiting on the low priority work queue.
> >>>>
> >>>> I have had to solve lots of problems like this.  It is not really so 
> >>>> difficult once you understand the above things.
> >>>>
> >>>>
> >>>>
> >>>>
> >>
> 
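
The tail pointer ordering discussed in this thread can be illustrated with a 
small stand-alone model. This is a sketch under assumptions, not the driver 
source: the ring size NRX, the tail_reg variable (standing in for 
STM32_ETH_DMACRXDTPR) and the helpers next_rxdesc(), free_old() and 
free_new() are all hypothetical names. It shows only the order of the two 
steps: advance the descriptor first, then write the tail pointer.

```c
#include <assert.h>
#include <stdint.h>

#define NRX 4 /* hypothetical number of RX descriptors in the ring */

/* Simplified RX descriptor; contents do not matter for this model */
struct rxdesc
{
  uint32_t des3;
};

static struct rxdesc ring[NRX];
static uintptr_t tail_reg; /* stands in for STM32_ETH_DMACRXDTPR */

/* Stand-in for stm32_get_next_rxdesc: step forward, wrapping at the end */
static struct rxdesc *next_rxdesc(struct rxdesc *d)
{
  assert(d >= ring && d < ring + NRX);
  return (d == &ring[NRX - 1]) ? &ring[0] : d + 1;
}

/* Original order: tail is written before advancing, so it points to the
 * START of the descriptor just freed and lags one descriptor behind.
 */
static struct rxdesc *free_old(struct rxdesc *d)
{
  tail_reg = (uintptr_t)d;
  return next_rxdesc(d);
}

/* Suggested order: advance first, then write the tail, so it points to
 * the END of the descriptor just freed (the start of the next one).
 */
static struct rxdesc *free_new(struct rxdesc *d)
{
  d = next_rxdesc(d);
  tail_reg = (uintptr_t)d;
  return d;
}
```

Freeing ring[1] with free_old() leaves tail_reg at the address of ring[1], 
while free_new() leaves it at the address of ring[2]. That one-descriptor 
difference is the descriptor the DMA could not fill in the original order.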
