On Fri, Jul 26, 2019 at 08:55:19PM +0200, Matthias Böttcher wrote:
> Reco <recovery...@enotuniq.net>:
> >
> > 	Hi.
> >
> > On Wed, Jul 24, 2019 at 06:54:42PM +0200, Matthias Böttcher wrote:
> > >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > > 307534 304741  99%    0,20K  16186       19     64744K vm_area_struct
> > >  14280  14274  99%    3,69K   1785        8     57120K task_struct
> > > 178048 152224  85%    0,25K  11128       16     44512K filp
> > >   8536   8536 100%    4,00K   1067        8     34144K kmalloc-4096
> > >  14640  14640 100%    2,06K    976       15     31232K sighand_cache
> > >
> > > How can I detect what is eating up my memory in SUnreclaim (slab
> > > unreclaimable)?
> >
> > You did it already.
> > "vm_area_struct" is a kernel structure for anonymous memory allocations.
> > "task_struct" is a kernel structure for maintaining process execution.
> > "filp" is a kernel structure for virtual memory.
> >
> > My guess is - a small number of processes that constantly allocate
> > memory in small numbers by executing brk(2) or its modern equivalents.
> >
> > Or a relatively large number of short-lived processes.
> >
> >
> > I'd start with "pidstat -rl 1 10".
>
> Now with an uptime of 2 days, 9 hours all counters of slabtop are
> growing no more.
>
> $ slabtop --sort c --once | head -n12
>  Active / Total Objects (% used)    : 10336938 / 10484768 (98,6%)
>  Active / Total Slabs (% used)      : 328327 / 328327 (100,0%)
>  Active / Total Caches (% used)     : 98 / 124 (79,0%)
>  Active / Total Size (% used)       : 2443615,58K / 2479644,41K (98,5%)
>  Minimum / Average / Maximum Object : 0,01K / 0,24K / 8,00K
>
>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 308520 307456  99%    1,05K  20568       15    329088K ext4_inode_cache
> 1285388 1269730  98%    0,20K  67652       19    270608K vm_area_struct
>  61712  61692  99%    3,69K   7714        8    246848K task_struct
> 765472 649628  84%    0,25K  47842       16    191368K filp
>  36088  36083  99%    4,00K   4511        8    144352K kmalloc-4096
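
To put numbers on the difference between the two quoted slabtop snapshots, something like the following Python sketch could be used. The names parse_slabtop_rows and slab_growth are made up for illustration, and the code assumes that the CACHE SIZE column always ends in "K", as it does in the output above.

```python
def parse_slabtop_rows(text):
    """Map cache name -> cache size in KiB for every slabtop data row in text."""
    sizes = {}
    for line in text.splitlines():
        # Drop mail quote markers ("> > ") and leading spaces, then split columns.
        fields = line.lstrip("> ").split()
        # A data row has exactly 8 columns and starts with an integer (OBJS).
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        cache_size = fields[6]              # e.g. "64744K"
        if not cache_size.endswith("K"):    # assumption: size is always in K
            continue
        sizes[fields[7]] = int(cache_size[:-1])
    return sizes


def slab_growth(before, after):
    """Return (name, before_kib, after_kib, delta_kib), largest growth first.

    Only caches present in both snapshots are compared.
    """
    common = before.keys() & after.keys()
    rows = [(name, before[name], after[name], after[name] - before[name])
            for name in common]
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names): python3 slabdiff.py old.txt new.txt
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as old, open(sys.argv[2]) as new:
            diff = slab_growth(parse_slabtop_rows(old.read()),
                               parse_slabtop_rows(new.read()))
        for name, b, a, delta in diff:
            print(f"{name:24s} {b:>10d}K {a:>10d}K {delta:>+10d}K")
```

Feeding it two saved "slabtop --sort c --once" outputs lists the caches present in both, ordered by how much they grew.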
"ext4_inode_cache" is the usual top consumer of SLAB. I kind of surprised that "dentry" did not make it to the top, but that can be attributed to the filesystem usage on that server. > What I saw with "pidstat -rl 1 10" was systemd-journald and nmbd, so I did: > > sudo apt purge samba # Samba was not needed Possible, but unlikely. Samba is popular, such abnormal memory allocations would be widely known. > sudo systemctl stop systemd-journald-dev-log.socket \ > systemd-journald-audit.socket systemd-journald.socket It's redundant if you have rsyslog anyway, but again, it's a popular software, such things would be noticed. Unless of course, you have something that's spamming audit records - some Apparmor profile in complain state. > and additionally I stopped the socket for the Check_MK agent: > > sudo systemctl stop check_mk.socket I do not know this one. What's its purpose? Monitoring, backup, something else? Is there a source available. On a side note, you seem to have enough used memory to consider turning on transparent hugepages. It should help with the size of "vm_area_struct" and "filp". Reco