Then I suggest doing the usual troubleshooting [0], not necessarily in this order (a few example commands below the list):

- osd logs
- ceph tell osd.X heap stats
- ceph osd df tree (to look for unbalanced PG distribution)
- check tracker.ceph.com for existing issues
- How are the nodes equipped RAM-wise?
- Are the OOM kills happening across all OSDs, only a subset, or even always the same ones?
- Is the cluster healthy? 'ceph -s' output could be useful.
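
For example (untested, just a sketch; osd.4 is simply one of the affected OSDs from your output below, adjust as needed):

$ ceph -s                           # overall cluster health
$ ceph osd df tree                  # per-OSD utilization and PG distribution
$ ceph tell osd.4 heap stats        # tcmalloc heap usage of that daemon
$ ceph tell osd.4 dump_mempools     # per-pool memory breakdown (caches etc.)
$ cephadm logs --name osd.4         # OSD log, run on the host (cephadm deployment)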

Squid has the osd_memory_target_autotune feature enabled by default. Can you check 'ceph config dump' and look for osd memory entries?
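
Something along these lines (again just a sketch, adjust to your setup):

$ ceph config dump | grep -i osd_memory
$ ceph config get osd osd_memory_target
$ ceph config get osd osd_memory_target_autotune

If the autotuner turns out to override the limit you expect, it can be disabled, e.g.:

$ ceph config set osd osd_memory_target_autotune false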

[0] https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-osd/

Quoting Jonas Schwab <jonas.sch...@uni-wuerzburg.de>:

Yes, it's the ceph-osd processes filling up the RAM.

On 2025-04-09 15:13, Eugen Block wrote:
I noticed quite high reported memory stats for OSDs as well on a
recently upgraded customer cluster, now running 18.2.4. But checking
the top output etc. doesn't confirm those values, and I honestly don't
know where they come from.
Can you confirm that those are actually OSD processes filling up the RAM?

Quoting Jonas Schwab <jonas.sch...@uni-wuerzburg.de>:

Hello everyone,

I have recently been having many problems with OSDs using much more memory
than they are supposed to (> 10 GB), leading to the node running out of
memory and killing processes. Does anyone have an idea why the daemons
seem to completely ignore the configured memory limits?

See e.g. the following:

$ ceph orch ps ceph2-03
NAME                    HOST      PORTS   STATUS          REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
mon.ceph2-03            ceph2-03          running (3h)    1s ago     2y   501M     2048M    19.2.1   f2efb0401a30  d876fc30f741
node-exporter.ceph2-03  ceph2-03  *:9100  running (3h)    1s ago     17M  46.5M    -        1.7.0    72c9c2088986  d32ec4d266ea
osd.4                   ceph2-03          running (26m)   1s ago     2y   10.2G    3310M    19.2.1   f2efb0401a30  b712a86dacb2
osd.11                  ceph2-03          running (5m)    1s ago     2y   3458M    3310M    19.2.1   f2efb0401a30  f3d7705325b4
osd.13                  ceph2-03          running (3h)    1s ago     6d   2059M    3310M    19.2.1   f2efb0401a30  980ee7e11252
osd.17                  ceph2-03          running (114s)  1s ago     2y   3431M    3310M    19.2.1   f2efb0401a30  be7319fda00b
osd.23                  ceph2-03          running (30m)   1s ago     2y   10.4G    3310M    19.2.1   f2efb0401a30  9cfb86c4b34a
osd.29                  ceph2-03          running (8m)    1s ago     2y   4923M    3310M    19.2.1   f2efb0401a30  d764930bb557
osd.35                  ceph2-03          running (14m)   1s ago     2y   7029M    3310M    19.2.1   f2efb0401a30  6a4113adca65
osd.59                  ceph2-03          running (2m)    1s ago     2y   2821M    3310M    19.2.1   f2efb0401a30  8871d6d4f50a
osd.61                  ceph2-03          running (49s)   1s ago     2y   1090M    3310M    19.2.1   f2efb0401a30  3f7a0ed17ac2
osd.67                  ceph2-03          running (7m)    1s ago     2y   4541M    3310M    19.2.1   f2efb0401a30  eea0a6bcefec
osd.75                  ceph2-03          running (3h)    1s ago     2y   1239M    3310M    19.2.1   f2efb0401a30  5a801902340d

Best regards,
Jonas

--
Jonas Schwab

Research Data Management, Cluster of Excellence ct.qmat
https://data.ctqmat.de | datamanagement.ct.q...@listserv.dfn.de
Email: jonas.sch...@uni-wuerzburg.de
Tel: +49 931 31-84460
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
