From: Miroslav Urbanek <m...@miroslavurbanek.com> The threshold for OOM protection is too small for systems with large number of CPUs. Applications report ENOBUFs on connect() every 10 minutes.
The problem is that the variable net->xfrm.flow_cache_gc_count is a global counter while the variable fc->high_watermark is a per-CPU constant. Take the number of CPUs into account as well. Fixes: 6ad3122a08e3 ("flowcache: Avoid OOM condition under preasure") Reported-by: Lukáš Koldrt <l...@excello.cz> Tested-by: Jan Hejl <j...@excello.cz> Signed-off-by: Miroslav Urbanek <m...@miroslavurbanek.com> Signed-off-by: Steffen Klassert <steffen.klass...@secunet.com> --- net/core/flow.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/core/flow.c b/net/core/flow.c index 3937b1b..18e8893 100644 --- a/net/core/flow.c +++ b/net/core/flow.c @@ -95,7 +95,6 @@ static void flow_cache_gc_task(struct work_struct *work) list_for_each_entry_safe(fce, n, &gc_list, u.gc_list) { flow_entry_kill(fce, xfrm); atomic_dec(&xfrm->flow_cache_gc_count); - WARN_ON(atomic_read(&xfrm->flow_cache_gc_count) < 0); } } @@ -236,9 +235,8 @@ flow_cache_lookup(struct net *net, const struct flowi *key, u16 family, u8 dir, if (fcp->hash_count > fc->high_watermark) flow_cache_shrink(fc, fcp); - if (fcp->hash_count > 2 * fc->high_watermark || - atomic_read(&net->xfrm.flow_cache_gc_count) > fc->high_watermark) { - atomic_inc(&net->xfrm.flow_cache_genid); + if (atomic_read(&net->xfrm.flow_cache_gc_count) > + 2 * num_online_cpus() * fc->high_watermark) { flo = ERR_PTR(-ENOBUFS); goto ret_object; } -- 1.9.1