Hi Jonas,

I remembered this kernel bug [1] related to cgroup v1, which caused memory overallocation and OOM kills of OSDs. It may be worth checking your kernel version against this bug.

Regards,
Frédéric.

[1] https://www.spinics.net/lists/ceph-users/msg80421.html

----- On 10 Apr 25, at 7:54, Frédéric Nass frederic.n...@univ-lorraine.fr wrote:

> Hi Jonas,
>
> Is swap enabled on OSD nodes?
>
> I've seen OSDs using way more memory than osd_memory_target and being
> OOM-killed from time to time just because swap was enabled. If that's the
> case, please disable swap in /etc/fstab and reboot the system.
>
> Regards,
> Frédéric.
>
> ________________________________
> From: Jonas Schwab <jonas.sch...@uni-wuerzburg.de>
> Sent: Wednesday, 9 April 2025 13:54
> To: ceph-users@ceph.io
> Subject: [ceph-users] OSDs ignore memory limit
>
> Hello everyone,
>
> I have recently had many problems with OSDs using much more memory than
> they are supposed to (> 10 GB), leading to the node running out of memory
> and killing processes. Does anyone have an idea why the daemons seem to
> completely ignore the set memory limits?
>
> See e.g. the following:
>
> $ ceph orch ps ceph2-03
> NAME                    HOST      PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
> mon.ceph2-03            ceph2-03          running (3h)    1s ago     2y   501M     2048M    19.2.1   f2efb0401a30  d876fc30f741
> node-exporter.ceph2-03  ceph2-03  *:9100  running (3h)    1s ago     17M  46.5M    -        1.7.0    72c9c2088986  d32ec4d266ea
> osd.4                   ceph2-03          running (26m)   1s ago     2y   10.2G    3310M    19.2.1   f2efb0401a30  b712a86dacb2
> osd.11                  ceph2-03          running (5m)    1s ago     2y   3458M    3310M    19.2.1   f2efb0401a30  f3d7705325b4
> osd.13                  ceph2-03          running (3h)    1s ago     6d   2059M    3310M    19.2.1   f2efb0401a30  980ee7e11252
> osd.17                  ceph2-03          running (114s)  1s ago     2y   3431M    3310M    19.2.1   f2efb0401a30  be7319fda00b
> osd.23                  ceph2-03          running (30m)   1s ago     2y   10.4G    3310M    19.2.1   f2efb0401a30  9cfb86c4b34a
> osd.29                  ceph2-03          running (8m)    1s ago     2y   4923M    3310M    19.2.1   f2efb0401a30  d764930bb557
> osd.35                  ceph2-03          running (14m)   1s ago     2y   7029M    3310M    19.2.1   f2efb0401a30  6a4113adca65
> osd.59                  ceph2-03          running (2m)    1s ago     2y   2821M    3310M    19.2.1   f2efb0401a30  8871d6d4f50a
> osd.61                  ceph2-03          running (49s)   1s ago     2y   1090M    3310M    19.2.1   f2efb0401a30  3f7a0ed17ac2
> osd.67                  ceph2-03          running (7m)    1s ago     2y   4541M    3310M    19.2.1   f2efb0401a30  eea0a6bcefec
> osd.75                  ceph2-03          running (3h)    1s ago     2y   1239M    3310M    19.2.1   f2efb0401a30  5a801902340d
>
> Best regards,
> Jonas
>
> --
> Jonas Schwab
>
> Research Data Management, Cluster of Excellence ct.qmat
> https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
> Email: jonas.sch...@uni-wuerzburg.de
> Tel: +49 931 31-84460
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
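
To see at a glance which daemons in the `ceph orch ps` listing quoted in this thread exceed their limit, and by how much, the MEM USE and MEM LIM columns can be compared programmatically. The following is only a rough sketch: the helper names `to_mib` and `over_limit` and the `ROWS` list are hypothetical (they are not part of Ceph or cephadm), the values are copied by hand from the listing, and the `K`/`M`/`G`/`T` suffixes are assumed to be binary multiples.

```python
# Hypothetical helpers for illustration; not part of Ceph or cephadm.
# Assumption: size suffixes are binary multiples (1G = 1024M).
UNITS = {"K": 1 / 1024, "M": 1.0, "G": 1024.0, "T": 1024.0 * 1024.0}


def to_mib(text):
    """Convert a size string such as '3310M' or '10.2G' to MiB; '-' gives None."""
    text = text.strip()
    if text in ("", "-"):
        return None
    unit = text[-1].upper()
    if unit not in UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(text[:-1]) * UNITS[unit]


def over_limit(rows):
    """Return (name, use/limit ratio) for daemons above their limit, largest first."""
    result = []
    for name, use, lim in rows:
        use_mib, lim_mib = to_mib(use), to_mib(lim)
        if use_mib is None or lim_mib is None:
            continue  # no usage or no limit reported for this daemon
        if use_mib > lim_mib:
            result.append((name, round(use_mib / lim_mib, 2)))
    return sorted(result, key=lambda item: item[1], reverse=True)


# (NAME, MEM USE, MEM LIM) copied from the listing in this thread.
ROWS = [
    ("mon.ceph2-03", "501M", "2048M"),
    ("node-exporter.ceph2-03", "46.5M", "-"),
    ("osd.4", "10.2G", "3310M"),
    ("osd.11", "3458M", "3310M"),
    ("osd.13", "2059M", "3310M"),
    ("osd.17", "3431M", "3310M"),
    ("osd.23", "10.4G", "3310M"),
    ("osd.29", "4923M", "3310M"),
    ("osd.35", "7029M", "3310M"),
    ("osd.59", "2821M", "3310M"),
    ("osd.61", "1090M", "3310M"),
    ("osd.67", "4541M", "3310M"),
    ("osd.75", "1239M", "3310M"),
]

if __name__ == "__main__":
    for name, ratio in over_limit(ROWS):
        print(f"{name}: {ratio}x of limit")
```

On the data above this flags seven of the eleven OSDs, with osd.23 and osd.4 at roughly three times their limit, while the most recently restarted OSDs are still below it.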