Hi,

Recently I've started moving a fleet of 32-bit Debian 7 machines over to 64-bit Debian 9. The migration is done by creating a fresh Debian 9 image with the necessary services, moving over the user data (some WAR files and the contents of /home), and rebooting into the new OS.

Relevant services (ones we manage and use) are:

- Jetty
- Puppet
- SSH
- AutoSSH
- NewRelic Infrastructure

Through Puppet, we enforce that the system configuration is pretty much identical across machines, save for things like host names and SSH keys.

Now we notice that on some systems RAM usage is way higher than expected, to the point where system memory is exhausted and processes are terminated.

Investigating the cause has left me at a dead end: I can't figure out where the memory is being consumed. Even after stopping all the services we manage (leaving a "clean" installation), RAM usage on the system hovers just over 600MB, half a gig more than the exact same image consumes just after boot. The only fix I have found so far is to reboot the system.

The systems have ~1.8GB of system memory available. We don't use swap.

Enabling swap gives the system some breathing room. On one system I enabled a swap volume of 512MB, which the kernel fills up and leaves filled indefinitely. This suggests to me that unused memory is being allocated by mistake. A memory leak in the kernel, maybe?

Below I've posted the output of some of the things I checked (with all services online). I also searched the internet and found a topic about slab allocation [1]. However, that didn't seem to solve anything.

Can anyone here point me to some more things I can check or try in order to debug this?

Thanks!
Simon

root@mysystem:~# uname -a
Linux mysystem 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u2 (2018-02-21) x86_64 GNU/Linux

root@mysystem:~# free -m
              total        used        free      shared  buff/cache   available
Mem:           1831        1091         238          16         501         520
Swap:           511         511           0

root@mysystem:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0 524268 244480  46044 467248    1    2  2137    36   97   35  9  3 85  3  0
 0  0 524268 244092  46052 467244    0    0     0    16 1125 2406  3  3 94  1  0
 0  0 524268 244092  46060 467284    0    0     4   208 1230 3906  4  3 93  0  0
 0  0 524268 244084  46068 467256    0    0     0    32 1003 1990  1  2 97  0  0
 0  0 524268 244180  46068 467256    0    0     0     0 1099 2121  4  1 95  0  0
 0  0 524268 244204  46076 467272    0    0     0    20 1000 1978  1  2 97  1  0
 0  0 524268 244080  46076 467272    0    0     0     0 1135 2315  2  2 96  0  0
 0  0 524268 244080  46084 467264    0    0     0    16 1079 2103  1  3 96  1  0
 0  0 524268 244080  46092 467272    0    0     0    56 1002 1973  2  2 96  0  0
 1  0 524268 244080  46100 467264    0    0     0    16  997 1979  1  2 97  1  0
 0  0 524268 244144  46100 467268    0    0     0     0  988 1957  1  2 97  0  0
 0  1 524268 244228  46108 467260    0    0     0   980 1292 2700  4  5 81  9  0

root@mysystem:~# smem
  PID User             Command                         Swap      USS      PSS      RSS
  528 root             /sbin/agetty -f /etc/issue.      148        4        4        8
  554 myuser           /usr/lib/autossh/autossh -o       80       24       40      184
  220 root             /lib/systemd/systemd-udevd       552      108      140      688
10665 root             /usr/lib/autossh/autossh -o        0      104      143      648
  432 root             /usr/sbin/cron -f                164      120      147      500
  434 root             /usr/sbin/irqbalance --fore      252      156      172      508
12042 mail             /usr/sbin/nullmailer-send -      120      144      219     1228
  382 systemd-timesync /lib/systemd/systemd-timesy      456      152      301     1060
  439 messagebus       /usr/bin/dbus-daemon --syst      272      280      348     1116
  430 root             /lib/systemd/systemd-logind      380      192      448     1224
  557 myuser           /usr/bin/ssh -o StrictHostK      564      360      515     1472
  216 root             /sbin/lvmetad -f                 188      484      517     1028
10668 root             /usr/bin/ssh -o StrictHostK        0      736      825     1112
14225 ntp              /usr/sbin/ntpd -p /var/run/        0      808      849     1404
  435 root             /usr/sbin/rsyslogd -n            716      896      980     2036
11612 root             /usr/sbin/sshd -D                  0      856      991     1560
  500 root             /sbin/dhclient -4 -v -pf /r      840      928     1010     2068
 1330 root             sudo -i                            0      920     1375     3548
 1234 myuser           -bash                              0      600     1407     3684
 1331 root             -bash                              0      640     1447     3712
 1233 myuser           sshd: myuser@pts/0                 0      300     1697     4568
    1 root             /sbin/init                       524     1568     1935     3524
 1227 root             sshd: posios [priv]                0     1372     2984     6380
  193 root             /lib/systemd/systemd-journa      312     4304     4521     5828
 8885 root             /usr/bin/python /usr/bin/sm        0     9100     9452    11292
14574 root             /usr/bin/newrelic-infra         1440    17348    17378    17828
  520 root             /opt/puppetlabs/puppet/bin/    20636    40140    40168    40592
  566 jetty            /usr/lib/jvm/java-8-openjdk   493896   958124   958381   959804

[1] https://unix.stackexchange.com/questions/244735/why-are-slab-objects-not-reclaimed-automatically
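
Since smem only accounts for memory held by processes, one possible next check is to look at the counters in /proc/meminfo that sit outside per-process accounting (slab, page tables, kernel stacks, shared memory). The following is only a rough sketch of that idea, not something I have run on the affected machines. The helper names (parse_meminfo, interesting_fields, format_report) are made up for illustration, and the choice of fields is an assumption about what is worth looking at first.

```python
#!/usr/bin/env python3
"""Print /proc/meminfo counters that are not attributed to any single process."""
import os

MEMINFO_PATH = "/proc/meminfo"  # only present on Linux

# Assumed starting set; the fields used here are reported by the kernel in kB.
# Shmem covers tmpfs and shared memory, which free counts under buff/cache.
FIELDS_OF_INTEREST = (
    "Slab", "SReclaimable", "SUnreclaim", "PageTables", "KernelStack", "Shmem",
)


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a {field: integer value} dictionary."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not of the form "Name:   12345 [kB]".
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def interesting_fields(meminfo):
    """Keep only the counters listed in FIELDS_OF_INTEREST that are present."""
    return {f: meminfo[f] for f in FIELDS_OF_INTEREST if f in meminfo}


def format_report(fields):
    """Render each counter as one line, converted from kB to MB."""
    return [f"{name:<14}{kb / 1024:10.1f} MB" for name, kb in fields.items()]


if __name__ == "__main__":
    if os.path.exists(MEMINFO_PATH):
        with open(MEMINFO_PATH) as handle:
            info = parse_meminfo(handle.read())
        print("\n".join(format_report(interesting_fields(info))))
```

If SUnreclaim or PageTables turned out to be large compared with the per-process totals above, that would point towards the kernel side rather than any of the listed services.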
