Yes, that is correct.

On Tue, Nov 12, 2024 at 8:51 PM Frédéric Nass
<frederic.n...@univ-lorraine.fr> wrote:
>
> Hello Alexander,
>
> Thank you for clarifying this point. The documentation was not very clear 
> about the 'improvements'.
>
> Does that mean that in the latest releases overspilling no longer occurs 
> between the two thresholds of 30GB and 300GB? Meaning block.db can be 80GB in 
> size without overspilling, for example?
>
> Cheers,
> Frédéric.
>
> ----- On 12 Nov 24, at 13:32, Alexander Patrakov patra...@gmail.com wrote:
>
> > Hello Frédéric,
> >
> > The advice regarding 30/300 GB DB sizes is no longer valid. Since Ceph
> > 15.2.8, due to the new default (bluestore_volume_selection_policy =
> > use_some_extra), BlueStore no longer wastes the extra capacity of the DB
> > device.
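> >
> > A quick way to double-check which policy an OSD is actually running with
> > (just a sketch -- the osd.4 id is only illustrative, taken from the perf
> > dump further down the thread):
> >
> > # ceph config get osd bluestore_volume_selection_policy
> > # ceph daemon osd.4 config get bluestore_volume_selection_policy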
> >
> > On Tue, Nov 12, 2024 at 5:52 PM Frédéric Nass
> > <frederic.n...@univ-lorraine.fr> wrote:
> >>
> >>
> >>
> >> ----- On 12 Nov 24, at 8:51, Roland Giesler rol...@giesler.za.net wrote:
> >>
> >> > On 2024/11/12 04:54, Alwin Antreich wrote:
> >> >> Hi Roland,
> >> >>
> >> >> On Mon, Nov 11, 2024, 20:16 Roland Giesler <rol...@giesler.za.net> 
> >> >> wrote:
> >> >>
> >> >>> I have Ceph 17.2.6 on a Proxmox cluster and want to replace some SSDs
> >> >>> that are end of life.  I have some spinners that have their journals
> >> >>> (block.db) on SSD.  Each spinner has a 50GB SSD LVM partition, and I
> >> >>> want to move each of those to a new corresponding partition.
> >> >>>
> >> >>> The new 4TB SSDs I have split into volumes with:
> >> >>>
> >> >>> # lvcreate -n NodeA-nvme-LV-RocksDB1 -L 47.69g NodeA-nvme0
> >> >>> # lvcreate -n NodeA-nvme-LV-RocksDB2 -L 47.69g NodeA-nvme0
> >> >>> # lvcreate -n NodeA-nvme-LV-RocksDB3 -L 47.69g NodeA-nvme0
> >> >>> # lvcreate -n NodeA-nvme-LV-RocksDB4 -L 47.69g NodeA-nvme0
> >> >>> # lvcreate -n NodeA-nvme-LV-data -l 100%FREE NodeA-nvme1
> >> >>> # lvcreate -n NodeA-nvme-LV-data -l 100%FREE NodeA-nvme0
> >> >>>
> >> >> I'd caution against mixing DB/WAL partitions with other applications on
> >> >> the same device; the performance profile may not be suited for shared
> >> >> use. And depending on the use case, the ~48GB might not be big enough to
> >> >> prevent DB spillover. Check the current size by querying the OSD.
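> >> >>
> >> >> If spillover does happen, it typically shows up as a non-zero
> >> >> "slow_used_bytes" counter in the "bluefs" section of the OSD's perf
> >> >> dump, and as a BLUEFS_SPILLOVER health warning, e.g.:
> >> >>
> >> >> # ceph health detail | grep -i spillover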
> >> >
> >> > I see a relatively small RocksDB and no WAL usage?
> >> >
> >> > ceph daemon osd.4 perf dump
> >> > <snip>
> >> >     "bluefs": {
> >> >         "db_total_bytes": 45025845248,
> >> >         "db_used_bytes": 2131755008,
> >> >         "wal_total_bytes": 0,
> >> >         "wal_used_bytes": 0,
> >> > </snip>
> >> >
> >> > I have been led to understand that 4% is the high end and that it is only
> >> > ever reached, if at all, on very busy systems?
> >>
> >> Hi Roland,
> >>
> >> This is generally true but it depends on what your cluster is used for.
> >>
> >> If your cluster is used for block (RBD) storage, then 1%-2% should be
> >> enough. If your cluster is used for file (CephFS) and S3 (RGW) storage,
> >> then you'd rather stay on the safe side and respect the 4% recommendation,
> >> as these workloads make heavy use of block.db to store metadata.
> >>
> >> Now, percentage is one thing; RocksDB level size is another. To avoid
> >> spillover once block.db usage approaches 30GB, you'd better choose a
> >> block.db size of 300GB+, whatever percentage of the block size that is,
> >> unless you want to play with the RocksDB level sizes and multiplier, which
> >> you probably don't.
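> >>
> >> For reference, those 30/300 figures fall out of the default RocksDB level
> >> sizing (roughly max_bytes_for_level_base = 256MB with a level multiplier
> >> of 10, giving levels of about 0.25GB, 2.5GB, 25GB and 250GB). With the
> >> older behaviour, only whole levels could live on the fast device, hence
> >> the rough arithmetic:
> >>
> >>   L1 + L2 + L3      ~= 0.25 + 2.5 + 25 GB ~= 28GB   -> ~30GB  block.db
> >>   L1 + L2 + L3 + L4 ~= 28 + 250 GB        ~= 280GB  -> ~300GB block.db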
> >>
> >> Regards,
> >> Frédéric.
> >>
> >> [1]
> >> https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#sizing
> >> [2]
> >> https://www.ibm.com/docs/en/storage-ceph/7.1?topic=bluestore-sizing-considerations
> >> [3] https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide
> >>
> >> >
> >> >>> What am I missing to get these changes to be permanent?
> >> >>>
> >> >> Likely just an issue with the order of execution. But there is an easier
> >> >> way to do the move. See:
> >> >> https://docs.ceph.com/en/quincy/ceph-volume/lvm/migrate/
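> >> >>
> >> >> A rough sketch of what the move could look like for a single OSD
> >> >> (illustrative only: osd.4 and the LV name are taken from the lvcreate
> >> >> commands above, <osd-fsid> comes from "ceph-volume lvm list", and the
> >> >> OSD should be stopped while migrating):
> >> >>
> >> >> # systemctl stop ceph-osd@4
> >> >> # ceph-volume lvm migrate --osd-id 4 --osd-fsid <osd-fsid> --from db --target NodeA-nvme0/NodeA-nvme-LV-RocksDB1
> >> >> # systemctl start ceph-osd@4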
> >> >
> >> > Ah, excellent!  I didn't find that in my searches.  Will try that now.
> >> >
> >> > regards
> >> >
> >> > Roland
> >> >
> >> >
> >> >>
> >> >> Cheers,
> >> >> Alwin
> >> >>
> >> >> --
> >> >>
> >> >> Alwin Antreich
> >> >> Head of Training and Proxmox Services
> >> >>
> >> >> croit GmbH, Freseniusstr. 31h, 81247 Munich
> >> >> CEO: Martin Verges, Andy Muthmann - VAT-ID: DE310638492
> >> >> Com. register: Amtsgericht Munich HRB 231263
> >> >> Web: https://croit.io/
> >
> >
> >
> > --
> > Alexander Patrakov



-- 
Alexander Patrakov
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
