On 10/16/18 22:37, Heiner Kallweit wrote:
> rtl_rx() and rtl_tx() are called only if the respective bits are set in the interrupt status register. Under high load NAPI may not be able to process all data (work_done == budget) and it will schedule subsequent calls to the poll callback. rtl_ack_events() however resets the bits in the interrupt status register, therefore subsequent calls to rtl8169_poll() won't call rtl_rx() and rtl_tx() - chip interrupts are still disabled.
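
To make the sequence described above concrete, here is a minimal user-space sketch in C. It is a toy model only, not the r8169 code: `sim_dev`, `sim_rx()`, `sim_poll()`, `sim_run()` and the `RX_OK` bit are all hypothetical names invented for illustration, and the "always call the RX handler" variant is just one assumed remedy. The model gates RX work on a status bit, acks (clears) that bit, and lets a poll loop repeat while the budget is exhausted.

```c
#include <assert.h>
#include <stdbool.h>

#define RX_OK 0x01u /* hypothetical "packet received" status bit */

/* Toy stand-in for the device state; not the real driver structures. */
struct sim_dev {
    unsigned int intr_status; /* models the interrupt status register */
    int rx_pending;           /* packets still waiting to be processed */
    bool irq_enabled;         /* chip interrupts stay masked while polling */
};

/* Process at most `budget` packets and return how many were handled. */
static int sim_rx(struct sim_dev *dev, int budget)
{
    int done = dev->rx_pending < budget ? dev->rx_pending : budget;

    dev->rx_pending -= done;
    return done;
}

/*
 * One poll call. With check_status == true the RX handler runs only if
 * the status bit is set (the behaviour described in the quoted text).
 * With check_status == false it always runs (assumed remedy).
 * Returns true if the poll loop should call us again.
 */
static bool sim_poll(struct sim_dev *dev, int budget, bool check_status)
{
    unsigned int status = dev->intr_status;
    int work_done = 0;

    assert(budget > 0);

    if (!check_status || (status & RX_OK))
        work_done = sim_rx(dev, budget);

    dev->intr_status &= ~status; /* ack: clear the bits we just read */

    if (work_done < budget) {
        dev->irq_enabled = true; /* polling finished, unmask interrupts */
        return false;
    }
    return true; /* budget exhausted, poll again */
}

/* One interrupt followed by polls until done; returns packets left over. */
static int sim_run(int packets, int budget, bool check_status)
{
    struct sim_dev dev = {
        .intr_status = RX_OK,
        .rx_pending = packets,
        .irq_enabled = false,
    };

    while (sim_poll(&dev, budget, check_status))
        ;
    return dev.rx_pending;
}
```

In this model, `sim_run(100, 64, true)` leaves packets behind: the first poll uses up the whole budget and acks the status bit, so the second poll sees no bit set and does no work. `sim_run(100, 64, false)` drains everything. The real hardware can set status bits again on its own, so this only illustrates the shape of the problem, not its exact behaviour in the driver.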
Very interesting! Could this be the reason for the mysterious hangs & resets we experienced when enabling BQL for r8169? They happened more often with TSO/GSO enabled, and several people tried unsuccessfully to fix them; BQL support was later reverted and has stayed reverted ever since (#87cda7cb43).

If this bug has been there "forever", it might be tempting to re-apply BQL and see what happens. Any chance you could give that a try? I'll gladly test patches, just as I'll test this one.

cheers
Holger