hello -
on my workstation running CURRENT (amd64/32g of ram) i've been running
into a scenario where after 4 or 5 days of daily use I get an OOM event
and both chromium and firefox are killed. then within the next day or so
the system becomes very unresponsive when i unlock my screensaver in
the morning, forcing a manual power cycle.
one thing i've noticed is growing swap usage but plenty of free and
inactive memory, as well as a GB or so of memory in the Laundry state
according to top. my understanding is that seeing swap usage grow over
time is expected and doesn't necessarily indicate a problem. but what
concerns me is the system locking up while seeing quite a bit of disk
i/o (maybe from paging back in?).
in order to help chase this down i've set up
prometheus_sysctl_exporter(8) to send data to a local prometheus
instance. the goal is to examine memory utilization over time to help
detect any issues. so my question is this:
what OIDs would be useful for diagnosing weird memory issues like
this?
i'm currently looking at:
sysctl_vm_domain_0_stats_laundry
sysctl_vm_domain_0_stats_active
sysctl_vm_domain_0_stats_free_count
sysctl_vm_domain_0_stats_inactive_pps
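
for reference, here's a rough python sketch of how i'm relating those
metric names to sysctl OIDs and turning page counts into something
readable. the helper names are made up for illustration, the name
mapping is inferred from the metric names above (prefix "sysctl_",
non-alphanumerics become "_"), and the 4 KiB page size is an assumption
(check hw.pagesize). the conversion only makes sense for values that
really are page counts.

```python
import re

# assumption: 4 KiB pages, which is typical on amd64; verify with hw.pagesize
PAGE_SIZE = 4096


def metric_name(oid: str) -> str:
    """Map a sysctl OID to the metric name form used in the list above.

    Inferred convention: "sysctl_" prefix, every non-alphanumeric
    character replaced by an underscore.
    """
    return "sysctl_" + re.sub(r"[^A-Za-z0-9]", "_", oid)


def pages_to_gib(pages: int, page_size: int = PAGE_SIZE) -> float:
    """Convert a page count to GiB, for values reported in pages."""
    return pages * page_size / 2**30


# the OIDs behind the metrics i'm currently graphing
OIDS = [
    "vm.domain.0.stats.laundry",
    "vm.domain.0.stats.active",
    "vm.domain.0.stats.free_count",
    "vm.domain.0.stats.inactive_pps",
]

if __name__ == "__main__":
    for oid in OIDS:
        print(f"{oid} -> {metric_name(oid)}")
```

so a laundry value of 262144 pages would work out to pages_to_gib(262144)
GiB under the 4 KiB assumption, which is in the same ballpark as the
"GB or so" that top shows.
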
thanks in advance - and i'd be happy to share my data if anyone is
interested :)
-pete
--
Pete Wright
p...@nomadlogic.org
@nomadlogicLA