On 11/09/2020 10:04, Ralph Corderoy wrote:
> Hi Hamish,
>
>> NB: On desktop I seem to have a very high number for "SUnreclaim" in
>> /proc/meminfo:
>>
>> MemTotal: 32812004 kB
>> MemFree: 8619976 kB
>> MemAvailable: 9572924 kB
>> Buffers: 61772 kB
>> Cached: 1061212 kB
>> SwapCached: 1190212 kB
> ...
>> Slab: 16832040 kB
>> SReclaimable: 397472 kB
>> SUnreclaim: 16434568 kB
> I think SUnreclaim is a key number to monitor over time.
>
>> What does SUnreclaimable mean?
> Slab is the memory used by the kernel's memory allocator. Think of it
> as malloc(3) but with knowledge of the type of item that may be
> allocated. Some of the slab allocations could be freed if there were
> other demands of memory: ‘memory pressure’. That's SReclaimable.
> SUnreclaim is the amount allocated to things which must be kept no
> matter how high memory pressure goes.
>
>> This isn't the same on the NAS box, but either way there are two
>> problems to debug here and I guess they could be related.
> The desktop PC will be easier because you've more tools and more
> upstream parties interested in any report. Does the NAS last the day?
> Schedule a nightly reboot, or kexec?
> https://wiki.archlinux.org/index.php/Kexec
>
>> Do I have a kernel memory leak?
> Probably.
>
>>> ‘sudo slabtop -osc’ will give a breakdown.
> ...
>> Okay, that yields: http://ix.io/2x4T
>>
>> The total is much smaller than the number in /proc/meminfo (just
>> verified it hasn't changed drastically). Bizarre.
> That is odd.
>
> Are you using any VM stuff? Any disk filesystems other than ext4,
> e.g. ZFS? Nvidia graphics drivers? Is this the machine where a kernel
> driver keeps dying?
>
> Monitor SUnreclaim at a regular time period, e.g. 30 seconds, so you can
> see it climbing. You said you were doing a large upload. If it's the
> kind which can recover from being stopped and re-started then see if
> your monitoring shows a steady climb during upload which stops if you
> kill the upload only to restart when you resume the upload.
>
>> My swap is meant to be for an emergency, not because some leaky code
>> in a driver/the kernel/whatever is somehow managing to use 20gb of
>> ram.
> Swap isn't reserved as an overflow when RAM runs low. Even when there
> is plenty of RAM free, the kernel might decide to swap out some memory
> which is not backed by another device because it thinks that memory
> would be better used by a cache.
>
> BTW, curl(1) did download something. ;-)
>

I'm gonna have to shut this system down, but in the meantime I'm open to
any more suggestions. I'll try a bunch of different things I guess, and
see if it makes any difference. May also see if it happens on a live disk.
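
For the monitoring Ralph suggests, something along these lines might do.
It is only a rough sketch: the function names parse_meminfo and
log_sunreclaim are made up for illustration, and it assumes a Linux
/proc/meminfo with the usual "Name: value kB" layout. It reads the file
every 30 seconds and prints SUnreclaim together with the change since
the previous reading, so a steady climb during an upload should stand out.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number.

    Most fields are in kB; the unit suffix is simply ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def log_sunreclaim(interval=30, path="/proc/meminfo"):
    """Print a timestamped SUnreclaim reading every `interval` seconds.

    Runs until interrupted (Ctrl-C). The 30 second default is the
    interval suggested in the thread, not a tuned value.
    """
    previous = None
    while True:
        with open(path) as f:
            current = parse_meminfo(f.read())["SUnreclaim"]
        delta = 0 if previous is None else current - previous
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} SUnreclaim={current} kB ({delta:+d} kB)", flush=True)
        previous = current
        time.sleep(interval)
```

Calling log_sunreclaim() and redirecting the output to a file gives a
log that can be lined up against when the upload was stopped and resumed.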
Hamish
--
Next meeting: Online, Jitsi, Tuesday, 2020-10-06 20:00
Check to whom you are replying
Meetings, mailing list, IRC, ...  http://dorset.lug.org.uk
New thread, don't hijack:  mailto:dorset@mailman.lug.org.uk