Just a short summary: your suggested fix in stm32_freeframe seems to have resolved the issue for the moment. So far it handles our traffic and several ping floods well.
The suggested IOB solution does not address our problem, since we do not use the buffered functions.

On 2020/02/22 07:19:16, Jukka Laitinen <jukka.laiti...@iki.fi> wrote:
> Hi!
>
> On 21.2.2020 16.08, Reto Gähwiler wrote:
> > Hi Jukka,
> >
> > First of all, thanks for your time and hints. Just applied your suggested
> > change to the stm32_freebuffer.
>
> Don't mention, I am sort of responsible for these bugs originally ;)
>
> > About the second suggested fix in stm32_freesegment, how do you mean that?
> > The stm32_get_next_rxdesc just increments the descriptor, this should be
> > independent from advancing the rx tail pointer, shouldn't it? But the tail
> > pointer should be advanced by the just modified rx-descriptor!
>
> In this function the stm32_get_next_rxdesc will return the pointer to the
> next descriptor to be freed. In case there are no more RX descriptors to be
> freed, this pointer will point to the START of the FIRST free descriptor, and
> the tail pointer should point to the END of the LAST free descriptor, which
> is the same ptr. The DMA fills the descriptors up to this pointer (so also
> the last descriptor is available).
>
> So currently the tail pointer lags one desc behind, and actually the last
> free rx desc can't be filled in by dma (since the tail pointer must point to
> the END of the last free RX desc, and now it points to the START of it).
>
> The code, IMHO, should be just like this:
>
>     /* Get the next RX descriptor in the chain */
>
>     rxdesc = stm32_get_next_rxdesc(priv, rxdesc);
>
>     /* Update the tail pointer */
>
>     stm32_putreg((uintptr_t)rxdesc, STM32_ETH_DMACRXDTPR);
>
> If you enable the network error debug outputs, it is worthwhile to also clear
> the RBU flag in the same function later after it has been checked:
> if (..RBU) {
>     /* Clear the RBU flag */
>
>     stm32_putreg(ETH_DMACSR_RBU, STM32_ETH_DMACSR);
>
> Since that bit will not clear by itself, and you'll be spammed by the err...
>
> Another, but independent, source of networking problems is then having enough
> IOBs and also enough ethernet buffers. About these you'll easily find out by
> enabling the corresponding CONFIG_DEBUGs.
>
> -Jukka
>
> > Will let you guys know on Monday how things are with the little change in
> > the stm32_freebuffer. Have a nice weekend,
> > Reto
> >
> > On 2020/02/21 08:07:41, Jukka Laitinen <jukka.laiti...@iki.fi> wrote:
> >> Hi,
> >>
> >> Reviewing the ethernet driver, I can see a couple of bugs:
> >>
> >> 1) In stm32_freeframe, it should free all the buffers, and not just the
> >> first one. So remove the "if ((txdesc->des3 & ETH_TDES3_RD_FD) != 0)"
> >>
> >> That may cause it to run out of buffers.
> >>
> >> 2) In stm32_freesegment, the order of getting the next descriptor, and
> >> updating the tail pointer is wrong. It should first call the
> >> stm32_get_next_rxdesc and only after that the
> >> stm32_putreg((uintptr_t)rxdesc, STM32_ETH_DMACRXDTPR);
> >>
> >> This is not fatal, but leads to the driver not using one of the descriptors
> >> and buffers at all (there is one less buffer in use and wasted memory for
> >> that).
> >>
> >> I am very sorry, but I am unable to provide any patches currently.
> >>
> >> Regards,
> >> Jukka
> >>
> >>
> >> On 20.2.2020 9.03, Reto Gähwiler wrote:
> >>> @Gregory: Thanks for your responses and input, will see what I can do.
> >>>
> >>> On 2020/02/19 22:50:02, Gregory Nutt <spudan...@gmail.com> wrote:
> >>>>
> >>>>> This sounds a lot like the problem I'm having with the SAMA5D36 Gigabit
> >>>>> ethernet... I'm running into some kind of deadlock on long transfers
> >>>>> that send packets very quickly. NuttX seems to run out of IOBs and then
> >>>>> can't send or respond to network packets.
> >>>>>
> >>>>> I tried increasing the low priority worker threads to 2 (and also 3) but
> >>>>> neither of them solved the problem.
> >>>>>
> >>>>> I'll look at the net_lock() to see if there's a way to release it.
> >>>>>
> >>>>> If you find a solution, I would love to know it! If I find one, I'll
> >>>>> post it here.
> >>>>
> >>>> The first step in debugging a deadlock is to find what is stuck waiting
> >>>> for what resource.
> >>>>
> >>>> Then find the logic that provides the resource that is being waited on.
> >>>>
> >>>> Then figure out why that logic is not running. Most likely, it would be
> >>>> waiting on the low priority work queue.
> >>>>
> >>>> I have had to solve lots of problems like this. It is not really so
> >>>> difficult once you understand the above things.
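
For anyone reading this thread later: the ordering issue Jukka describes in stm32_freesegment can be tried out in isolation. The following is only a rough sketch under assumptions, not the driver code. The ring size NDESC, struct rx_ring, next_desc(), free_segment_old(), free_segment_new() and dma_fillable() are all hypothetical names made up for this illustration; ring indices stand in for descriptor addresses, and the "DMA fills up to, but not including, the tail" rule is modelled from the description in the thread rather than taken from the hardware manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NDESC 4  /* hypothetical ring size, chosen only for this sketch */

struct rx_ring
{
  bool dma_owned[NDESC]; /* true once a descriptor is handed back to DMA */
  size_t tail;           /* stand-in for the STM32_ETH_DMACRXDTPR register */
};

/* Stand-in for stm32_get_next_rxdesc(): step to the next ring slot */

static size_t next_desc(size_t idx)
{
  return (idx + 1) % NDESC;
}

/* Order criticised in the thread: write the tail first, then advance */

static size_t free_segment_old(struct rx_ring *ring, size_t idx, size_t count)
{
  assert(idx < NDESC);
  while (count-- > 0)
    {
      ring->dma_owned[idx] = true;
      ring->tail = idx;        /* tail = START of the descriptor just freed */
      idx = next_desc(idx);
    }

  return idx;
}

/* Order suggested in the thread: advance first, then write the tail */

static size_t free_segment_new(struct rx_ring *ring, size_t idx, size_t count)
{
  assert(idx < NDESC);
  while (count-- > 0)
    {
      ring->dma_owned[idx] = true;
      idx = next_desc(idx);
      ring->tail = idx;        /* tail = END of the descriptor just freed */
    }

  return idx;
}

/* Count the descriptors DMA may fill in this model: those it owns,
 * starting at 'head', up to but not including the tail.
 */

static size_t dma_fillable(const struct rx_ring *ring, size_t head)
{
  size_t n = 0;
  size_t i;

  for (i = head; i != ring->tail && ring->dma_owned[i]; i = next_desc(i))
    {
      n++;
    }

  return n;
}
```

Freeing three descriptors starting at slot 0 leaves the tail on slot 2 with the old order and on slot 3 with the suggested order, so in this model the old order lets DMA fill one descriptor fewer, which matches the "one less buffer in use" remark above.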