Swap usage is not wrong per se, as long as you have enough available
memory. However, it can still lead to performance issues. If you don't want
to get rid of swap, check your swappiness setting
(cat /proc/sys/vm/swappiness).
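
For reference, a sketch of inspecting and lowering swappiness (run as root; the sysctl.d file name is an arbitrary choice, not a fixed convention):

```shell
# Show the current value (the kernel default is usually 60)
cat /proc/sys/vm/swappiness

# Lower it at runtime so the kernel prefers dropping page cache over swapping
sysctl vm.swappiness=10

# Persist the setting across reboots (file name is arbitrary)
echo 'vm.swappiness = 10' | tee /etc/sysctl.d/90-swappiness.conf
```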

On Fri, 23 May 2025 at 14:53, Anthony D'Atri <a...@dreamsnake.net> wrote:

>
>
> > Swap can possibly reduce your cluster's performance, no? OSD processes
> > that swap data will generate additional and unwanted disk I/O.
>
> Absolutely. Moreover I’ve seen modern(ish) Linux systems anomalously using
> swap space when there is available physmem, in excess of
> vm.min_free_kbytes.
>
> >
> > I've got 10 OSD's per host and my memory consumption of ceph is
> typically 70GiB per host... Each host has about 40GiB available memory
> which is sufficient (for my setup) except one time I ran out of memory
> deleting old snapshots. But 8GiB wouldn't have helped...
>
> Exactly.  Filesystem swap can be a useful emergency tool, but in 2025 it
> should not be routine.  In 1985 a diskless Sun 2/50 with 3MB of physmem
> (yes, MB) needed swap, and that was SUPER fun over 10 Mb/s Ethernet
> against a Fujitsu Eagle.
>
> In years beginning with a 2, DRAM prices are such that if disabling swap
> causes a problem, then that’s a sign that you really, really need more
> physmem.
>
> Swap is roughly 12% of your virtual memory right now.  If you run hotter
> than ~88% usage then you really need more physmem.
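
The arithmetic behind that figure, using the 8 GiB swap and ~62 GiB RAM shown in the `free -h` output later in the thread (the exact host shape is an assumption for illustration):

```shell
# Swap as a fraction of total virtual memory (RAM + swap)
swap_gib=8
ram_gib=62
swap_pct=$(( swap_gib * 100 / (ram_gib + swap_gib) ))
echo "${swap_pct}%"   # about 11%: running past ~89% of VM means swapping
```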
>
> By default the osd_memory_target autotune should be enabled, see what
> values it is setting.  By default it will divide 70% of physmem by however
> many OSDs are placed on a host:
>
> # ceph config dump | grep osd_memory_target
> osd   host:x   basic   osd_memory_target   12060218196
> osd   host:y   basic   osd_memory_target   12062644163
> osd   host:z   basic   osd_memory_target   6614520783
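
As a sanity check, that division can be sketched like this. This is simplified: the real cephadm autotuner also reserves memory for other colocated daemons, and 0.7 is the default `mgr/cephadm/autotune_memory_target_ratio`; the 64 GiB / 10 OSD host shape is an assumption for illustration:

```shell
# Simplified cephadm autotune math: 70% of physmem split evenly
# across the OSDs on the host (other daemons ignored here)
physmem_bytes=$((64 * 1024 * 1024 * 1024))   # a 64 GiB host
num_osds=10

per_osd_target=$(( physmem_bytes * 7 / 10 / num_osds ))
echo "$per_osd_target"   # ~4.8e9 bytes, i.e. roughly 4.5 GiB per OSD
```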
>
> Yes, the above cluster has lots of physmem, which is very very fortunate
> because it’s based on large HDDs and otherwise would have fallen over (if I
> told you the details, you wouldn’t sleep at night).  The money would have
> been better spent on QLC, but I digress.
>
> If autotune isn’t on, the default osd_memory_target is 4GB.  Remember that
> it’s a target, not a limit.  The docs advise 20% headroom of available
> physmem; having suffered a few things, I like to advise at least 50%, plus
> margin to run mons/mgrs/MDSes/etc.
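
Under that 50% rule of thumb, a quick sizing sketch (the host shape of 8 OSDs at the default 4 GiB target is an assumption for illustration):

```shell
# OSD memory targets plus 50% headroom, per the rule of thumb above
num_osds=8
target_gib=4          # the non-autotuned default osd_memory_target
need_gib=$(( num_osds * target_gib * 3 / 2 ))
echo "plan for at least ${need_gib} GiB physmem, plus mon/mgr/MDS margin"
```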
>
> > OSD containers consume a reasonable amount of RAM (~2.6 GB to ~3.6 GB):
>
> Actually that’s another sign that you may be starved unless these OSDs are
> rather idle.  Are you using cephadm, or something else to manage the
> containers?  Is it enforcing an artificial limit on them?
>
> With 64GB of physmem cephadm’s autotuning would assign an
> osd_memory_target of 4.5GB.
>
> Memory allocation practice and accounting vary across kernel revisions,
> which may be a factor here.
>
> What model of chassis are these?  Adding even 4x8GB super cheap DIMMs to
> each would do you a world of good, with more of course even better.  Be
> sure to not mix SKUs within a bank, and populate slots according to your
> motherboard’s documentation.
>
>
> >
> >
> >
> >> -----Original Message-----
> >> From: Dmitrijs Demidovs <dmitrijs.demid...@carminered.eu>
> >> Sent: Friday, 23 May 2025 10:16
> >> To: ceph-users@ceph.io
> >> Subject: [ceph-users] Re: SWAP usage 100% on OSD hosts after
> >> migration to Rocky Linux 9 (Ceph 16.2.15)
> >>
> >> Hi Anthony.
> >>
> >> Yes we have swap enabled. Old Rocky 8 and new Rocky 9 OSD hosts both
> >> configured with 8G of swap.
> >>
> >> I will try to disable swap, but I guess we will get a lot of Out Of
> >> Memory messages on OSD hosts.
> >>
> >>
> >>
> >> = old:
> >> [root@ceph-osd11 ~]# free -h
> >>                total        used        free      shared  buff/cache   available
> >> Mem:           62Gi        30Gi       1.2Gi       2.1Gi        30Gi        29Gi
> >> Swap:         8.0Gi       2.8Gi       5.2Gi
> >>
> >> = new:
> >> [root@ceph-osd17 ~]# free -h
> >>                total        used        free      shared  buff/cache   available
> >> Mem:           62Gi        26Gi       1.0Gi       1.0Gi        36Gi        36Gi
> >> Swap:         8.0Gi       8.0Gi       7.0Mi
> >>
> >>
> >>
> >>
> >>
> >>
> >> OSD containers consume a reasonable amount of RAM (~2.6 GB to ~3.6 GB):
> >>
> >>
> >> [root@ceph-osd17 ~]# docker stats --no-stream
> >> CONTAINER ID   NAME                                                                 CPU %     MEM USAGE / LIMIT     MEM %     NET I/O   BLOCK I/O         PIDS
> >> 5cc58e4a77b2   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-52                     0.28%     3.576GiB / 62.28GiB   5.74%     0B / 0B   3.9TB / 975GB     62
> >> 3a60fecf648d   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-50                     0.28%     2.912GiB / 62.28GiB   4.68%     0B / 0B   100TB / 45.7TB    62
> >> 9c20407e79eb   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-49                     0.28%     2.905GiB / 62.28GiB   4.66%     0B / 0B   93TB / 35.8TB     62
> >> 9deadafef9dd   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-48                     0.56%     3.624GiB / 62.28GiB   5.82%     0B / 0B   102TB / 39.2TB    62
> >> fcfe62a25fd9   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-55                     0.40%     2.968GiB / 62.28GiB   4.77%     0B / 0B   83.2TB / 34.8TB   62
> >> 38d2d96cc491   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-51                     1.42%     2.666GiB / 62.28GiB   4.28%     0B / 0B   105TB / 38.1TB    62
> >> e29c6bbc1ae7   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-54                     2.01%     3.687GiB / 62.28GiB   5.92%     0B / 0B   106TB / 44.6TB    62
> >> 40346a7a45ea   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-osd-53                     0.69%     2.748GiB / 62.28GiB   4.41%     0B / 0B   103TB / 41.4TB    62
> >> 43c3e3a65531   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-crash-ceph-osd17           0.00%     3.73MiB / 62.28GiB    0.01%     0B / 0B   567MB / 18MB      2
> >> d9e436f9788c   ceph-7e8bff5c-2761-11ec-9bb0-000c29ebc936-node-exporter-ceph-osd17   15.04%    30.25MiB / 62.28GiB   0.05%     0B / 0B   410MB / 14.6MB    61
> >>
> >>
> >>
> >>
> >>
> >> But they are also the biggest swap consumers:
> >>
> >> [root@ceph-osd17 ~]# for file in /proc/*/status; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | more
> >> ceph-osd 1553520 kB
> >> ceph-osd 1447728 kB
> >> ceph-osd 1218768 kB
> >> ceph-osd 1117536 kB
> >> ceph-osd 1026548 kB
> >> ceph-osd 641632 kB
> >> ceph-osd 495080 kB
> >> ceph-osd 424392 kB
> >> firewalld 26880 kB
> >> dockerd 20352 kB
> >> containerd 11136 kB
> >> docker 6144 kB
> >> docker 6144 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5952 kB
> >> docker 5760 kB
> >> (sd-pam) 5184 kB
> >> ceph-crash 4416 kB
> >> python3 4224 kB
> >> docker 4032 kB
> >> systemd-udevd 3264 kB
> >>
> >>
> >>
> >>
> >>
> >>
> >> On 22.05.2025 18:34, Anthony D'Atri wrote:
> >>>
> >>>>
> >>>> Problem:
> >>>>
> >>>> After migration to Rocky 9 (and a new version of Docker) we see that
> >>>> our OSD hosts consume 100% of SWAP space! It takes approximately one
> >>>> week to fill SWAP from 0% to 100%.
> >>>
> >>> Why do you have swap configured at all?  I suggest disabling swap in
> fstab
> >> and rebooting serially.
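
A sketch of doing that on one host at a time (verify the host and cluster are healthy before moving on; the `.bak` backup suffix is an arbitrary choice):

```shell
# Release swap now; in-use pages migrate back to RAM, so ensure
# there is enough free physmem first (check with `free -h`)
swapoff -a

# Comment out swap entries in fstab so swap stays off after reboot
sed -i.bak '/\bswap\b/ s/^[^#]/#&/' /etc/fstab

# Reboot, confirm Swap shows 0B in `free -h`, then do the next host
```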
> >>>
> >>>
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
>
