Swap usage is not wrong per se, as long as you have enough available memory. However, it can still lead to performance issues. If you don't want to get rid of swap, check your swappiness setting (cat /proc/sys/vm/swappiness).
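For example, to check and (if desired) lower it; the value 10 is only an illustration and the sysctl.d file name is arbitrary:

cat /proc/sys/vm/swappiness                                      # kernel default is usually 60
sysctl vm.swappiness=10                                          # change it for the running system
echo "vm.swappiness = 10" > /etc/sysctl.d/90-swappiness.conf     # persist across reboots
sysctl --system

A lower value makes the kernel prefer dropping page cache over swapping out anonymous pages; it does not disable swap.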
On Fri, 23 May 2025 at 14:53, Anthony D'Atri <a...@dreamsnake.net> wrote:
>
> > Swap can possibly reduce your cluster's performance, no? OSD processes
> > that swap data will result in supplementary and unwanted disk I/O.
>
> Absolutely. Moreover I've seen modern(ish) Linux systems anomalously using
> swap space when there is available physmem, in excess of vm.min_free_kbytes.
>
> > I've got 10 OSDs per host and my memory consumption of ceph is typically
> > 70GiB per host... Each host has about 40GiB available memory which is
> > sufficient (for my setup) except one time I ran out of memory deleting
> > old snapshots. But 8GiB wouldn't have helped...
>
> Exactly. Filesystem swap can be a useful emergency tool, but in 2025 it
> should not be routine. In 1985 a diskless Sun 2/50 with 3MB of physmem
> (yes, MB) needed swap, and that was SUPER fun over 10Mb/s Ethernet against
> a Fuji Eagle.
>
> In years beginning with a 2, DRAM prices are such that if disabling swap
> causes a problem, then that's a sign that you really, really need more
> physmem.
>
> Swap is 12% of your virtual memory right now. If you run hotter than 84%
> usage then really you need more.
>
> By default the osd_memory_target autotune should be enabled; see what
> values it is setting. By default it will divide 70% of physmem by however
> many OSDs are placed on a host:
>
> # ceph config dump | grep osd_memory_target
> osd  host:x  basic  osd_memory_target  12060218196
> osd  host:y  basic  osd_memory_target  12062644163
> osd  host:z  basic  osd_memory_target  6614520783
>
> Yes, the above cluster has lots of physmem, which is very very fortunate
> because it's based on large HDDs and otherwise would have fallen over (if
> I told you the details, you wouldn't sleep at night). The money would have
> been better spent on QLC, but I digress.
>
> If autotune isn't on, the default osd_memory_target is 4GB. Remember that
> it's a target, not a limit. The docs advise a 20% headroom of available
> physmem; having suffered a few things I like to advise 50% at least, plus
> margin to run mons/mgrs/mds/etc.
>
> > OSD containers consume a reasonable amount of RAM (~2.6GB - ~3.6GB):
>
> Actually that's another sign that you may be starved unless these OSDs
> are rather idle. Are you using cephadm, or something else to manage the
> containers? Is it enforcing an artificial limit on them?
>
> With 64GB of physmem cephadm's autotuning would assign an
> osd_memory_target of 4.5GB.
>
> Memory allocation practice and accounting vary across kernel revisions,
> which may be a factor here.
>
> What model of chassis are these? Adding even 4x8GB super cheap DIMMs to
> each would do you a world of good, with more of course even better. Be
> sure to not mix SKUs within a bank, and populate slots according to your
> motherboard's documentation.
>
> >> -----Original Message-----
> >> From: Dmitrijs Demidovs <dmitrijs.demid...@carminered.eu>
> >> Sent: Friday, 23 May 2025 10:16
> >> To: ceph-users@ceph.io
> >> Subject: [ceph-users] Re: SWAP usage 100% on OSD hosts after migration
> >> to Rocky Linux 9 (Ceph 16.2.15)
> >>
> >> Hi Anthony.
> >>
> >> Yes, we have swap enabled. Old Rocky 8 and new Rocky 9 OSD hosts are
> >> both configured with 8G of swap.
> >>
> >> I will try to disable swap, but I guess we will get a lot of Out Of
> >> Memory messages on OSD hosts.
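Before pulling swap it may be worth confirming what the cephadm autotuner Anthony describes is actually setting on these hosts. A rough check, assuming cephadm manages the cluster (verify the option names against your release):

ceph config get osd osd_memory_target_autotune                  # is autotuning enabled?
ceph config get mgr mgr/cephadm/autotune_memory_target_ratio    # share of physmem used, default 0.7
ceph config dump | grep osd_memory_target                       # per-host targets actually set

As a rough sanity check, ratio * physmem / OSDs-per-host approximates the per-OSD target, e.g. 0.7 * 64GiB / 10 OSDs is about 4.5GiB; the autotuner also reserves memory for other daemons, so the actual value can be somewhat lower.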
> >> = old:
> >> [root@ceph-osd11 ~]# free -h
> >>               total        used        free      shared  buff/cache   available
> >> Mem:           62Gi        30Gi       1.2Gi       2.1Gi        30Gi        29Gi
> >> Swap:         8.0Gi       2.8Gi       5.2Gi
> >>
> >> = new:
> >> [root@ceph-osd17 ~]# free -h
> >>               total        used        free      shared  buff/cache   available
> >> Mem:           62Gi        26Gi       1.0Gi       1.0Gi        36Gi        36Gi
> >> Swap:         8.0Gi       8.0Gi       7.0Mi
> >>
> >> OSD containers consume a reasonable amount of RAM (~2.6GB - ~3.6GB):
> >>
> >> [root@ceph-osd17 ~]# docker stats --no-stream
> >> CONTAINER ID   NAME                                                                 CPU %    MEM USAGE / LIMIT     MEM %   NET I/O   BLOCK I/O         PIDS
> >> 5cc58e4a77b2   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-52                     0.28%    3.576GiB / 62.28GiB   5.74%   0B / 0B   3.9TB / 975GB     62
> >> 3a60fecf648d   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-50                     0.28%    2.912GiB / 62.28GiB   4.68%   0B / 0B   100TB / 45.7TB    62
> >> 9c20407e79eb   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-49                     0.28%    2.905GiB / 62.28GiB   4.66%   0B / 0B   93TB / 35.8TB     62
> >> 9deadafef9dd   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-48                     0.56%    3.624GiB / 62.28GiB   5.82%   0B / 0B   102TB / 39.2TB    62
> >> fcfe62a25fd9   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-55                     0.40%    2.968GiB / 62.28GiB   4.77%   0B / 0B   83.2TB / 34.8TB   62
> >> 38d2d96cc491   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-51                     1.42%    2.666GiB / 62.28GiB   4.28%   0B / 0B   105TB / 38.1TB    62
> >> e29c6bbc1ae7   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-54                     2.01%    3.687GiB / 62.28GiB   5.92%   0B / 0B   106TB / 44.6TB    62
> >> 40346a7a45ea   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-53                     0.69%    2.748GiB / 62.28GiB   4.41%   0B / 0B   103TB / 41.4TB    62
> >> 43c3e3a65531   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-crash-ceph-osd17           0.00%    3.73MiB / 62.28GiB    0.01%   0B / 0B   567MB / 18MB      2
> >> d9e436f9788c   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-node-exporter-ceph-osd17   15.04%   30.25MiB / 62.28GiB   0.05%   0B / 0B   410MB / 14.6MB    61
> >>
> >> But they are also the biggest swap consumers:
> >>
> >> [root@ceph-osd17 ~]# for file in /proc/*/status; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | more
> >> ceph-osd 1553520 kB
> >> ceph-osd 1447728 kB
> >> ceph-osd 1218768 kB
> >> ceph-osd 1117536 kB
> >> ceph-osd 1026548 kB
> >> ceph-osd 641632 kB
> >> ceph-osd 495080 kB
> >> ceph-osd 424392 kB
> >> firewalld 26880 kB
> >> dockerd 20352 kB
> >> containerd 11136 kB
> >> docker 6144 kB
> >> docker 6144 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5760 kB
> >> (sd-pam) 5184 kB
> >> ceph-crash 4416 kB
> >> python3 4224 kB
> >> docker 4032 kB
> >> systemd-udevd 3264 kB
> >>
> >> On 22.05.2025 18:34, Anthony D'Atri wrote:
> >>>
> >>>> Problem:
> >>>>
> >>>> After migration to Rocky 9 (and new version of Docker) we see that our
> >>>> OSD hosts consume 100% of SWAP space! It takes approximately one week
> >>>> to fill SWAP from 0% to 100%.
> >>>
> >>> Why do you have swap configured at all? I suggest disabling swap in
> >>> fstab and rebooting serially.
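If you do end up disabling swap entirely, a rough per-host sequence (only a sketch: the sed pattern assumes a plain swap line in /etc/fstab, so check what it matches first, and skip the noout steps if you prefer to let recovery run):

ceph osd set noout                            # optional: avoid rebalancing while the host reboots
swapoff -a                                    # stop using swap immediately; watch free -h and dmesg for OOM kills
sed -i.bak '/\sswap\s/s/^/#/' /etc/fstab      # comment out the swap entry (backup kept as /etc/fstab.bak)
systemctl reboot
ceph osd unset noout                          # after the host is back and all PGs are active+clean

One host at a time, waiting for HEALTH_OK in between, as Anthony suggests.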
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io