Hi Hamish,

> NB: On desktop I seem to have a very high number for "SUnreclaim" in
> /proc/meminfo:
>
> MemTotal: 32812004 kB
> MemFree: 8619976 kB
> MemAvailable: 9572924 kB
> Buffers: 61772 kB
> Cached: 1061212 kB
> SwapCached: 1190212 kB
...
> Slab: 16832040 kB
> SReclaimable: 397472 kB
> SUnreclaim: 16434568 kB
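Those figures are at least self-consistent: SReclaimable + SUnreclaim = 397472 + 16434568 = 16832040 kB, which is the Slab total, and SUnreclaim alone is about half of MemTotal. Here is a rough Python sketch for pulling those fields out of /proc/meminfo and logging SUnreclaim at a fixed interval. The function names (parse_meminfo, slab_summary, watch) and the 30-second default are invented for illustration; only the /proc/meminfo field names come from the kernel.

```python
import time


def parse_meminfo(text):
    """Return {field name: integer value} from /proc/meminfo-style text.

    Most values are in kB; a few fields (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_summary(fields):
    """Compare Slab with its two components and relate SUnreclaim to MemTotal."""
    return {
        "slab_kb": fields["Slab"],
        "parts_kb": fields["SReclaimable"] + fields["SUnreclaim"],
        "sunreclaim_share": fields["SUnreclaim"] / fields["MemTotal"],
    }


def watch(interval=30, count=None, path="/proc/meminfo"):
    """Print a timestamped SUnreclaim reading every `interval` seconds.

    `count=None` means run until interrupted (hypothetical convention).
    """
    previous = None
    taken = 0
    while count is None or taken < count:
        with open(path) as f:
            current = parse_meminfo(f.read())["SUnreclaim"]
        delta = 0 if previous is None else current - previous
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} SUnreclaim {current} kB ({delta:+d} kB)")
        previous = current
        taken += 1
        if count is None or taken < count:
            time.sleep(interval)
```

Calling watch() in one terminal while starting and stopping a suspect workload in another should show whether the per-sample delta follows that workload.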
I think SUnreclaim is a key number to monitor over time.

> What does SUnreclaimable mean?

Slab is the memory used by the kernel's slab allocator. Think of it as malloc(3) but with knowledge of the type of item that may be allocated. Some of the slab allocations could be freed if there were other demands for memory: ‘memory pressure’. That's SReclaimable. SUnreclaim is the amount allocated to things which must be kept no matter how high memory pressure goes.

> This isn't the same on the NAS box, but either way there are two
> problems to debug here and I guess they could be related.

The desktop PC will be easier to debug because you've more tools and more upstream parties interested in any report. Does the NAS last the day? If so, you could schedule a nightly reboot, or kexec.

    https://wiki.archlinux.org/index.php/Kexec

> Do I have a kernel memory leak?

Probably.

> > ‘sudo slabtop -osc’ will give a breakdown.
...
> Okay, that yields: http://ix.io/2x4T
>
> The total is much smaller than the number in /proc/meminfo (just
> verified it hasn't changed drastically). Bizarre.

That is odd. Are you using any VM stuff? Any disk filesystems other than ext4, e.g. ZFS? Nvidia graphics drivers? Is this the machine where a kernel driver keeps dying?

Monitor SUnreclaim at a regular interval, e.g. every 30 seconds, so you can see it climbing. You said you were doing a large upload. If it's the kind which can recover from being stopped and restarted, then see whether your monitoring shows a steady climb during the upload which stops when you kill the upload and starts again when you resume it.

> My swap is meant to be for an emergency, not because some leaky code
> in a driver/the kernel/whatever is somehow managing to use 20gb of
> ram.

Swap isn't reserved as an overflow for when RAM runs low. Even when there is plenty of RAM free, the kernel might decide to swap out some memory which is not backed by a file on disk because it thinks that RAM would be better used by a cache.

BTW, curl(1) did download something.
;-)

--
Cheers, Ralph.

--
Next meeting: Online, Jitsi, Tuesday, 2020-10-06 20:00
Check to whom you are replying
Meetings, mailing list, IRC, ... http://dorset.lug.org.uk
New thread, don't hijack: mailto:dorset@mailman.lug.org.uk