From: Govindarajulu Varadarajan <gvara...@cisco.com>
Date: Tue, 5 Dec 2017 11:14:41 -0800

> In case of tx clean up, we set '-1' as budget. This means clean up until
> wq is empty or till (1 << 32) pkts are cleaned. Under heavy load this
> will run for long time and cause
> "watchdog: BUG: soft lockup - CPU#25 stuck for 21s!" warning.
>
> This patch sets wq clean up budget to 256.
>
> Signed-off-by: Govindarajulu Varadarajan <gvara...@cisco.com>

This driver with all of its indirection and layers upon layers of
macros for queue processing is so difficult to read, and this can't
be generating nice optimal code either...

Anyways, I was walking over the driver to see if the logic is
contributing to this.

The limit you are proposing sounds unnecessary, nobody else I can
see needs this, and that includes all of the most heavily used
drivers under load.

If I had to guess I'd say that the problem is that the queue
loop keeps sampling the head and tail pointers, whereas it should
just do that _once_ and only process the TX entries found in that
snapshot and return to the poll() routine immediately afterwards.

Thanks.
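
To make the "sample once" idea concrete, here is a rough userspace sketch.
It is not enic code: struct tx_ring, hw_completes_one() and
tx_clean_snapshot() are hypothetical names, the indices are plain
free-running counters, and the "hardware" is simulated by a function that
completes one more entry every time the driver reclaims one. In kernel
code the single read of the hardware index would typically go through
READ_ONCE().

```c
#include <assert.h>

#define RING_SIZE 256u

/* Hypothetical TX ring state; not the enic driver's actual layout. */
struct tx_ring {
	unsigned int to_clean; /* free-running count of entries reclaimed by the driver */
	unsigned int hw_done;  /* free-running count of entries completed by hardware */
};

/* Stand-in for the NIC completing one more send while we are cleaning. */
static void hw_completes_one(struct tx_ring *r)
{
	r->hw_done++;
}

/*
 * Reclaim only the entries that were already complete on entry.
 * hw_done is read exactly once, so completions that arrive while the
 * loop runs are left for the next poll and the loop stays bounded.
 */
static unsigned int tx_clean_snapshot(struct tx_ring *r)
{
	unsigned int done = r->hw_done;            /* single snapshot */
	unsigned int pending = done - r->to_clean; /* wrap-safe difference */
	unsigned int cleaned = 0;

	assert(pending <= RING_SIZE);              /* cannot exceed ring capacity */

	while (cleaned < pending) {
		/* A real driver would unmap the DMA buffer and free the skb here. */
		r->to_clean++;
		cleaned++;
		hw_completes_one(r);               /* simulate load: new work keeps arriving */
	}
	return cleaned;
}
```

If the loop condition re-read r->hw_done on every iteration instead of
using the snapshot, this simulated load would keep it running without
making progress toward an exit, which is the behaviour I am guessing at
above. With the snapshot, each call does a bounded amount of work and
returns to the caller without needing an artificial budget.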