On Mon, 2016-09-12 at 19:02 -0400, Jamal Hadi Salim wrote:
> On 16-09-12 06:26 PM, Eric Dumazet wrote:
> > On Mon, 2016-09-12 at 18:14 -0400, Jamal Hadi Salim wrote:
> >
> >> I noticed some very weird issues when I took that out.
> >> Running sufficiently large amount of traffic (ping -f is sufficient)
> >> I saw that when i did a dump it took anywhere between 6-15 seconds.
> >> With the read_lock in place response was immediate.
> >> I can go back and run things to verify - but it was very odd.
> >
> > This was on uni processor ?
> >
>
> It was a VM.
>
> > Looks like typical starvation caused by aggressive softirq.
> >
>
> Well, then it is strange that in one case a tc dump of the rule
> was immediate and in the other case it was consistent for 5-15
> seconds.
>
This needs investigation ;)

One loop under high stress is possible in __gnet_stats_copy_basic(),
since we might restart the loop if we are really, really unlucky, but
this would have nothing to do with your patches.

diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 508e051304fb62627e61b5065b2325edd1b84f2e..dc9dd8ae7d5405f76c775278dac7689655b21041 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -142,10 +142,14 @@ __gnet_stats_copy_basic(const seqcount_t *running,
 		return;
 	}
 	do {
-		if (running)
+		if (running) {
+			local_bh_disable();
 			seq = read_seqcount_begin(running);
+		}
 		bstats->bytes = b->bytes;
 		bstats->packets = b->packets;
+		if (running)
+			local_bh_enable();
 	} while (running && read_seqcount_retry(running, seq));
 }
 EXPORT_SYMBOL(__gnet_stats_copy_basic);