Hi Mark,

I raised the osd_memory_target from 4G to 6G and set bluefs_buffered_io back to 
false. Ten minutes later I got the first 'Monitor daemon marked osd.X down, but 
it is still running' event, and after another five minutes the second one. I 
tried raising the osd_memory_target to 10G, but that didn't help, so I switched 
bluefs_buffered_io back to true.
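
For reference, the changes were made via "ceph config set", roughly like this 
(values written out in bytes; osd.X ids not shown since the settings were 
applied cluster-wide to the osd class):

  ceph config set osd osd_memory_target 6442450944    # 6G
  ceph config set osd bluefs_buffered_io false
  # after the flapping started again:
  ceph config set osd osd_memory_target 10737418240   # 10G
  ceph config set osd bluefs_buffered_io true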

We have a heterogeneous cluster with different numbers of OSDs per host. The 
maximum is 37 OSDs on a host with 256GB RAM, so I set the limit to 6G to keep 
that machine safe. All OSDs are spinners using bluestore, with 2GB for the 
(internal) journal; the 2G is a relic of old times. I think for new OSDs we 
should use the new(?) default of 5G, or go even larger given our 10TB+ disks.
We use rbd and rgw, but no rbd snapshots!
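
In case it is useful, this is roughly how I would double-check what the OSDs 
currently report for their bluefs/DB sizing (osd id 0 is just an example, and 
the exact field names may differ a bit between releases):

  ceph osd metadata 0 | grep -i -E 'bluefs|bluestore_bdev'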

Does this help, or do you need some more information?

best regards,
Ingo


----- Original Message -----
From: "Ingo Reimann" <ireim...@dunkel.de>
To: "Mark Nelson" <mnel...@redhat.com>
CC: "ceph-users" <ceph-users@ceph.io>
Sent: Friday, 7 August 2020 15:29:07
Subject: [ceph-users] Re: OSDs flapping since upgrade to 14.2.10

Hi Mark,

I'll check that after the weekend!

Ingo

----- Original Message -----
From: "Mark Nelson" <mnel...@redhat.com>
To: "ceph-users" <ceph-users@ceph.io>
Sent: Friday, 7 August 2020 15:15:08
Subject: [ceph-users] Re: OSDs flapping since upgrade to 14.2.10

Hi Ingo,


If you are able and have lots of available memory, could you also try 
setting bluefs_buffered_io to false while increasing the osd_memory_target?  
I'd like to understand a bit more deeply what's going on here.  Ultimately I 
don't want our only line of defense against slow snap trimming to be 
having page cache available!
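
Something along these lines would be the test I have in mind (just a sketch; 
the exact osd_memory_target value depends on what your hosts can spare, and 
osd.0 below is only an example):

  ceph config set osd bluefs_buffered_io false
  ceph config set osd osd_memory_target 8589934592   # e.g. 8 GiB

and then keeping an eye on where one OSD's memory actually goes via the admin 
socket:

  ceph daemon osd.0 dump_mempools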


Mark


On 8/7/20 6:51 AM, Ingo Reimann wrote:
> Hi Stefan, Hi Manuel,
>
> thanks for your quick advices.
>
> In fact, since I set "ceph config set osd bluefs_buffered_io true", the 
> problems disappeared. We have lots of RAM in our OSD hosts, so buffering is 
> ok. I'll track this issue down further after the weekend!
>
> best regards,
> Ingo
>
> ----- Original Message -----
> From: "Stefan Kooman" <ste...@bit.nl>
> To: "ceph-users" <ceph-users@ceph.io>
> Sent: Friday, 7 August 2020 12:24:08
> Subject: [ceph-users] Re: OSDs flapping since upgrade to 14.2.10
>
> On 2020-08-07 12:07, Ingo Reimann wrote:
>> I'm not sure if we really have a problem, but it does not look healthy.
> It might be related to the change that is mentioned in another thread:
> "block.db/block.wal device performance dropped after upgrade to 14.2.10"
>
> TL;DR:  bluefs_buffered_io has been changed to "false" in 14.2.10. It
> doesn't use the buffer cache in that case, and in certain workloads (e.g.
> snap trimming) this seems to have a big impact, even for environments
> that have a large osd_memory_target.
>
> I would change that back to "true" ("ceph config set osd
> bluefs_buffered_io true" should do the trick). Not sure if the OSDs need
> a restart afterwards, as the config change seems to be effective
> immediately for running daemons.
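>
> A quick way to check whether a running OSD has actually picked up the new
> value (osd.0 here is just an example; run it on that OSD's host) would be
> via the admin socket:
>
>   ceph daemon osd.0 config get bluefs_buffered_io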
>
> Gr. Stefan
>
>
-- 
Ingo Reimann 
Teamleiter Technik 
        [ https://www.dunkel.de/ ] 
Dunkel GmbH 
Philipp-Reis-Straße 2 
65795 Hattersheim 
Fon: +49 6190 889-100 
Fax: +49 6190 889-399 
eMail: supp...@dunkel.de 
https://www.Dunkel.de/  Amtsgericht Frankfurt/Main 
HRB: 37971 
Geschäftsführer: Axel Dunkel 
Ust-ID: DE 811622001
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io