Garrett Wollman <wollman_at_bimajority.org> wrote on
Date: Tue, 09 Sep 2025 16:19:42 UTC :

> On some of our newer large-memory NFS servers, we are seeing services
> killed with "failed to reclaim memory". According to our monitoring,
> the server has >100G of physmem free at the time,

Was that 100G+ figure sampled somewhat before any
reclaiming of memory started, i.e., before the lead-up
to the notice? Is there any likelihood of sudden, rapid,
huge drops in free RAM based on workload behavior?

Other useful figures from the lead-up to the OOM
activity would be snapshots of the likes of top's:

Active, Inact, Laundry, Wired, and Free
(things in Buf also show up in the other categories)
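
As a rough sketch of how such snapshots might be taken
without scraping top, the following reads the page
counters that, as far as I know, roughly correspond to
top's Active, Inact, Laundry, Wired, and Free. The
function names pages_to_mib and vm_summary are made up
for illustration, and the vm.stats.vm.* names assume a
FreeBSD system:

```shell
#!/bin/sh
# Convert a page count ($1) to MiB, given the page size in bytes ($2).
pages_to_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Print one line with the counters behind top's memory summary.
# Assumes FreeBSD, where these vm.stats.vm.* sysctl names exist.
vm_summary() {
    pagesize=$(sysctl -n hw.pagesize)
    for kind in active inactive laundry wire free; do
        pages=$(sysctl -n "vm.stats.vm.v_${kind}_count")
        printf '%s=%sM ' "$kind" "$(pages_to_mib "$pages" "$pagesize")"
    done
    echo
}
```

Calling vm_summary every few seconds during the lead-up
would give a time series to compare against the
monitoring's free-memory figure.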

Is NUMA involved?

> and the only
> solution seems to be rebooting. (There is a small amount of swap
> configured and even less of it in use.)

That swap is in use at all could be of interest. I wonder
what the system was doing when the swap was put to use, or
when laundry was growing in a way that led to swap being
put to use.

> Does this sound familiar to
> anyone? What should we be monitoring that we evidently aren't now?

I'll note that you can delay the "failed to
reclaim memory" OOM activity via the use of
the likes of:

# sysctl vm.pageout_oom_seq=120

FYI:

# sysctl -d vm.pageout_oom_seq
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM

The default is 12. Larger values give more delay by
requiring more attempts to meet the threshold
involved before OOM is used. No value gives an
unbounded delay so far as I know. (I do not know
anything about the "counts wrap" behavior.)

But if the conditions have a bounded duration,
a large enough vm.pageout_oom_seq can fairly
generally avoid OOM activity over that duration.

(Even just one thread can keep the Active memory
so large as to not meet the free RAM threshold(s)
involved, even if swap is unused.)
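
If the larger vm.pageout_oom_seq is to survive a reboot,
the usual place for it is /etc/sysctl.conf. A minimal
sketch, assuming that file is in plain name=value form.
The helper persist_sysctl is hypothetical, not an
existing tool, and 120 is just the example figure from
above:

```shell
#!/bin/sh
# persist_sysctl FILE NAME VALUE
# Drop any existing "NAME=..." line from FILE, then append NAME=VALUE.
# FILE would normally be /etc/sysctl.conf (needs root to write).
persist_sysctl() {
    conf=$1; name=$2; value=$3
    # Escape dots so grep matches the sysctl name literally.
    pat=$(printf '%s' "$name" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Keep every line except earlier settings of this name.
    grep -v "^${pat}=" "$conf" > "$tmp" 2>/dev/null
    printf '%s=%s\n' "$name" "$value" >> "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Usage would be along the lines of:

# sysctl vm.pageout_oom_seq=120
# persist_sysctl /etc/sysctl.conf vm.pageout_oom_seq 120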

Someone might want to see some of the output from
something like:

# sysctl vm | grep -v "^vm\.uma\." | grep -e "\.v_" -e stats -e oom_seq | sort

from the lead-up to a "failed to reclaim memory".
Having a larger vm.pageout_oom_seq can make it
easier to observe the lead-up time frame.
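
To capture that lead-up without someone sitting at the
console, the same filter could be wrapped in a small
script that takes timestamped snapshots. This is only a
sketch: the function names filter_vm and snapshot are
made up for illustration.

```shell
#!/bin/sh
# Reduce `sysctl vm` output (on stdin) to the page counters,
# stats, and oom_seq lines, skipping the large vm.uma.* subtree.
filter_vm() {
    grep -v '^vm\.uma\.' | grep -e '\.v_' -e stats -e oom_seq | sort
}

# Print one timestamped snapshot of the filtered vm sysctls.
snapshot() {
    printf '=== %s ===\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')"
    sysctl vm | filter_vm
}
```

A loop such as the following would then append a
snapshot every 10 seconds (the log path and the interval
are arbitrary choices):

# while :; do snapshot >> /var/log/vm-snapshots.log; sleep 10; done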


===
Mark Millard
marklmi at yahoo.com

