Hi Florian,

Thanks for your input.
On 10-09-2018 20:22, Florian Fainelli wrote:
> On 09/10/2018 02:14 AM, Jose Abreu wrote:
>> This follows David Miller advice and tries to fix coalesce timer in
>> multi-queue scenarios.
>>
>> We are now using per-queue coalesce values and per-queue TX timer.
>>
>> Coalesce timer default values was changed to 1ms and the coalesce frames
>> to 25.
>>
>> Tested in B2B setup between XGMAC2 and GMAC5.
>
> Why not revert the entire features for this merge window and work on
> getting it to work over the next weeks/merge windows?

It was already reverted, but performance drops a little bit (not that
much, though I'm still trying to fix it).

>
> The idea of using a timer to coalesce TX path when there is not a HW
> timer is a good idea and if this is made robust enough, you could even
> promote that as being a network stack library/feature that could be used
> by other drivers. In fact, this could be a great addition to the net DIM
> library (Tal, what do you think?)
>
> Here's a quick drive by review of things that appear wrong in the
> current driver (without your patches):
>
> - in stmmac_xmit(), in case we hit the !is_jumbo branch and we fail the
> DMA mapping, there is no timer cancellation, don't we want to abort the
> whole transmission?

I don't think this is needed: in that case the tx pointer will not
advance, so stmmac_tx_clean() just won't perform any work. Besides, a
pending timer from previous packets may still be running, so cancelling
it can cause problems.

>
> - stmmac_tx_clean() should probably use netif_lock_bh() to guard against
> the timer (soft IRQ context) and the NAPI context (also soft IRQ)
> running in parallel on two different CPUs. This may not explain all
> problems, but these two things are fundamentally exclusive, because the
> timer is meant to emulate the interrupt after N packets, while NAPI
> executes when such a thing did actually occur

OK, and I'm now also using __netif_tx_lock_bh(queue) to lock just the
one queue instead of the whole TX path.
>
> - stmmac_poll() should cancel pending timer(s) if it was able to reclaim
> packets, likewise stmmac_tx_timer() should re-enable TX interrupts if it
> reclaimed packets, since TX interrupts could have been left disabled
> from a prior NAPI run. These could be considered optimizations, since
> you could leave the TX timer running all the time, just adjust the
> deadline (based on line rate, MTU, IPG, number of fragments and their
> respective length), worst case, both NAPI and the timer clean up your TX
> ring, so you should always have room to push more packets

In the next version I'm dropping the direct call to stmmac_tx_clean()
in the timer function and just scheduling NAPI instead.

Thanks and Best Regards,
Jose Miguel Abreu