On 11/09/2020 10:04, Ralph Corderoy wrote:
> Hi Hamish,
>
>> NB: On desktop I seem to have a very high number for "SUnreclaim" in
>> /proc/meminfo:
>>
>> MemTotal:       32812004 kB
>> MemFree:         8619976 kB
>> MemAvailable:    9572924 kB
>> Buffers:           61772 kB
>> Cached:          1061212 kB
>> SwapCached:      1190212 kB
> ...
>> Slab:           16832040 kB
>> SReclaimable:     397472 kB
>> SUnreclaim:     16434568 kB
> I think SUnreclaim is a key number to monitor over time.
>
>> What does SUnreclaim mean?
> Slab is the memory used by the kernel's memory allocator.  Think of it
> as malloc(3) but with knowledge of the type of item that may be
> allocated.  Some of the slab allocations could be freed if there were
> other demands of memory: ‘memory pressure’.  That's SReclaimable.
> SUnreclaim is the amount allocated to things which must be kept no
> matter how high memory pressure goes.
>
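To make the Slab arithmetic concrete, here is a minimal Python sketch (not part of the original exchange) that parses /proc/meminfo-style text and reports how Slab splits into its two parts. The names parse_meminfo and slab_breakdown are made up for illustration, and the sample text is just the figures quoted above.

```python
# Figures copied from the /proc/meminfo excerpt quoted in this thread.
SAMPLE = """\
MemTotal:       32812004 kB
MemFree:         8619976 kB
Slab:           16832040 kB
SReclaimable:     397472 kB
SUnreclaim:     16434568 kB
"""


def parse_meminfo(text):
    """Return a dict mapping field names to their numeric values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def slab_breakdown(fields):
    """Summarise the reclaimable/unreclaimable split of slab memory."""
    slab = fields["Slab"]
    reclaimable = fields["SReclaimable"]
    unreclaim = fields["SUnreclaim"]
    return {
        "slab_kb": slab,
        "reclaimable_kb": reclaimable,
        "unreclaim_kb": unreclaim,
        # True when the two parts account for all of Slab.
        "sum_matches_slab": reclaimable + unreclaim == slab,
        # Fraction of all RAM pinned by unreclaimable slab.
        "unreclaim_share": unreclaim / fields["MemTotal"],
    }


# Hypothetical usage on the sample; on a live system pass
# open("/proc/meminfo").read() instead of SAMPLE.
print(slab_breakdown(parse_meminfo(SAMPLE)))
```

For the figures quoted above, SReclaimable and SUnreclaim add up exactly to Slab, and SUnreclaim alone is about half of MemTotal.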
>> This isn't the same on the NAS box, but either way there are two
>> problems to debug here and I guess they could be related.
> The desktop PC will be easier because you've more tools and more
> upstream parties interested in any report.  Does the NAS last the day?
> Schedule a nightly reboot, or kexec?
> https://wiki.archlinux.org/index.php/Kexec
>
>> Do I have a kernel memory leak?
> Probably.
>
>>> ‘sudo slabtop -osc’ will give a breakdown.
> ...
>> Okay, that yields: http://ix.io/2x4T
>>
>> The total is much smaller than the number in /proc/meminfo (just
>> verified it hasn't changed drastically). Bizarre.
> That is odd.
>
> Are you using any VM stuff?  Any disk filesystems other than ext4,
> e.g. ZFS?  Nvidia graphics drivers?  Is this the machine where a kernel
> driver keeps dying?
>
> Monitor SUnreclaim at a regular time period, e.g. 30 seconds, so you can
> see it climbing.  You said you were doing a large upload.  If it's the
> kind which can recover from being stopped and re-started then see if
> your monitoring shows a steady climb during upload which stops if you
> kill the upload only to restart when you resume the upload.
>
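The monitoring suggested above could be sketched roughly like this in Python. It is an illustration only, not something from the thread: the function names (sunreclaim_kb, read_meminfo, monitor) are invented, and the 30-second default simply mirrors the example interval mentioned above.

```python
import time


def sunreclaim_kb(meminfo_text):
    """Extract the SUnreclaim value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("SUnreclaim:"):
            return int(line.split()[1])
    raise ValueError("no SUnreclaim line found")


def read_meminfo(path="/proc/meminfo"):
    """Read the whole file; the path is a parameter so it can be swapped out."""
    with open(path) as f:
        return f.read()


def monitor(samples, interval=30, read=read_meminfo, sleep=time.sleep):
    """Sample SUnreclaim `samples` times, `interval` seconds apart.

    Prints each reading with the change since the previous one and
    returns the list of readings in kB.
    """
    readings = []
    for i in range(samples):
        if i:
            sleep(interval)  # wait between samples, not before the first
        value = sunreclaim_kb(read())
        delta = value - readings[-1] if readings else 0
        print(f"{time.strftime('%H:%M:%S')}  SUnreclaim {value} kB  ({delta:+d} kB)")
        readings.append(value)
    return readings
```

Calling monitor(samples=120) would cover about an hour at the default interval. A run of consistently positive deltas while the upload is active, flattening out when it is stopped, would point at whatever the upload exercises.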
>> My swap is meant to be for an emergency, not because some leaky code
>> in a driver/the kernel/whatever is somehow managing to use 20 GB of
>> RAM.
> Swap isn't reserved as an overflow when RAM runs low.  Even when there
> is plenty of RAM free, the kernel might decide to swap out some memory
> which is not backed by another device because it thinks that memory
> would be better used by a cache.
>
> BTW, curl(1) did download something.  ;-)
>
I'm gonna have to shut this system down, but in the meantime I'm open to
any more suggestions. I'll try a bunch of different things I guess, and
see if it makes any difference. May also see if it happens on a live disk.

Hamish


-- 
  Next meeting: Online, Jitsi, Tuesday, 2020-10-06 20:00
  Check to whom you are replying
  Meetings, mailing list, IRC, ...  http://dorset.lug.org.uk
  New thread, don't hijack:  mailto:dorset@mailman.lug.org.uk
