[ceph-users] Re: ceph-osd/bluestore using page cache

Frédéric Nass Wed, 19 Mar 2025 13:28:46 -0700

Hi Brian,

TL;DR: bluefs_buffered_io = true and SWAP-enabled OSD nodes do not work well 
together.


Please review these two PRs [1] and [2] to understand the rationale behind 
bluefs_buffered_io and why its default value has changed across Ceph releases 
(from true to false, then back to true again). 
It was changed to false in the past because bluefs_buffered_io = true was 
observed to lead to excessive SWAP usage. 

As Josh mentioned, whether 'true' or 'false' is better for you depends on your 
workload, how your cluster is built (collocated OSDs or not), and whether SWAP 
is enabled on your OSD nodes.

For example, our cluster is used for many different workloads, some of which 
use OMAP extensively. It consists of non-collocated OSDs using SSDs/NVMes for 
RocksDB. We decided to set bluefs_buffered_io back to true (when it defaulted 
to false) and disable SWAP on all nodes because we were experiencing slow 
requests during snap trimming with bluefs_buffered_io = false. 

What I would recommend you try is to disable SWAP on all nodes (swap was good 
in the 80's :-)) and leave bluefs_buffered_io enabled. 
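
To confirm that swap is really off on a node after disabling it, a minimal 
sketch along these lines may help. It is not part of any Ceph tooling: the 
function names are hypothetical, and it only assumes the standard Linux 
/proc/meminfo format, where SwapTotal and SwapFree are reported in kB.

```python
def swap_status(meminfo_text):
    """Return (swap_total_kb, swap_used_kb) parsed from /proc/meminfo content."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, total - free


def read_swap_status(path="/proc/meminfo"):
    """Read the live values on a Linux host (hypothetical helper)."""
    with open(path) as f:
        return swap_status(f.read())
```

A SwapTotal of 0 means no swap device is active. On the Ceph side, the 
effective value can usually be checked with 
`ceph config get osd bluefs_buffered_io`, though the exact output may differ 
between releases.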

Regards,
Frédéric

[1] https://github.com/ceph/ceph/pull/34224
[2] https://github.com/ceph/ceph/pull/38044


----- On 19 Mar 25, at 1:11, Brian Marcotte marco...@panix.com wrote:

>> The setting you're looking for is bluefs_buffered_io. This is very
>> much a YMMV setting, so it's best to test with both modes, but I
>> usually recommend turning it off for all but omap-intensive workloads
>> (e.g. RGW index) ...
> 
> We're not using RGW, only RBD.
> 
> Currently I find it hard to prevent Linux from swapping at least a little
> no matter what vm settings I use.
> 
> Thanks.
> 
> --
> - Brian
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
