Hi Jonas,

I remembered this kernel bug [1] related to cgroup v1, which caused memory 
overallocation and OOM kills of OSDs. It may be worth checking your kernel 
version against this bug.
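
If helpful, a quick way to check the running kernel and which cgroup 
version the hosts are on (on cgroup v2, /sys/fs/cgroup is mounted as 
cgroup2fs; on cgroup v1 it shows up as tmpfs):

    $ uname -r
    $ stat -fc %T /sys/fs/cgroup/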

Regards,
Frédéric.

[1] https://www.spinics.net/lists/ceph-users/msg80421.html

----- On 10 Apr 25, at 7:54, Frédéric Nass frederic.n...@univ-lorraine.fr 
wrote:

> Hi Jonas,
> 
> Is swap enabled on OSD nodes?
> 
> I've seen OSDs using way more memory than osd_memory_target and being 
> OOM-killed
> from time to time just because swap was enabled. If that's the case, please
> disable swap in /etc/fstab and reboot the system.
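> 
> A minimal sketch of how to check and turn off swap (assuming the swap 
> entry lives in /etc/fstab; adjust to your setup):
> 
>     $ swapon --show        # lists active swap devices, empty if swap is off
>     $ sudo swapoff -a      # disables swap immediately
>     # then comment out the swap line(s) in /etc/fstab so it stays off after reboot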
> 
> Regards,
> Frédéric.
> 
> ________________________________
> From: Jonas Schwab <jonas.sch...@uni-wuerzburg.de>
> Sent: Wednesday, April 9, 2025 13:54
> To: ceph-users@ceph.io
> Subject: [ceph-users] OSDs ignore memory limit
> 
> Hello everyone,
> 
> I have recently been having problems with OSDs using much more memory than
> they are supposed to (> 10 GB), leading to the node running out of memory
> and killing processes. Does anyone have an idea why the daemons seem to
> completely ignore the configured memory limits?
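> 
> For reference, the osd_memory_target actually in effect for a given daemon
> can be queried with the standard config commands (osd.4 taken as an example):
> 
>     $ ceph config get osd.4 osd_memory_target    # value stored in the config database
>     $ ceph config show osd.4 osd_memory_target   # value the running daemon uses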
> 
> See e.g. the following:
> 
> $ ceph orch ps ceph2-03
> NAME                    HOST      PORTS    STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
> mon.ceph2-03            ceph2-03           running (3h)    1s ago     2y      501M    2048M  19.2.1   f2efb0401a30  d876fc30f741
> node-exporter.ceph2-03  ceph2-03  *:9100   running (3h)    1s ago     17M    46.5M        -  1.7.0    72c9c2088986  d32ec4d266ea
> osd.4                   ceph2-03           running (26m)   1s ago     2y     10.2G    3310M  19.2.1   f2efb0401a30  b712a86dacb2
> osd.11                  ceph2-03           running (5m)    1s ago     2y     3458M    3310M  19.2.1   f2efb0401a30  f3d7705325b4
> osd.13                  ceph2-03           running (3h)    1s ago     6d     2059M    3310M  19.2.1   f2efb0401a30  980ee7e11252
> osd.17                  ceph2-03           running (114s)  1s ago     2y     3431M    3310M  19.2.1   f2efb0401a30  be7319fda00b
> osd.23                  ceph2-03           running (30m)   1s ago     2y     10.4G    3310M  19.2.1   f2efb0401a30  9cfb86c4b34a
> osd.29                  ceph2-03           running (8m)    1s ago     2y     4923M    3310M  19.2.1   f2efb0401a30  d764930bb557
> osd.35                  ceph2-03           running (14m)   1s ago     2y     7029M    3310M  19.2.1   f2efb0401a30  6a4113adca65
> osd.59                  ceph2-03           running (2m)    1s ago     2y     2821M    3310M  19.2.1   f2efb0401a30  8871d6d4f50a
> osd.61                  ceph2-03           running (49s)   1s ago     2y     1090M    3310M  19.2.1   f2efb0401a30  3f7a0ed17ac2
> osd.67                  ceph2-03           running (7m)    1s ago     2y     4541M    3310M  19.2.1   f2efb0401a30  eea0a6bcefec
> osd.75                  ceph2-03           running (3h)    1s ago     2y     1239M    3310M  19.2.1   f2efb0401a30  5a801902340d
> 
> Best regards,
> Jonas
> 
> --
> Jonas Schwab
> 
> Research Data Management, Cluster of Excellence ct.qmat
> https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
> Email: jonas.sch...@uni-wuerzburg.de
> Tel: +49 931 31-84460
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
