[ceph-users] Upgrading RGW before cluster?

2024-08-13 Thread Thomas Byrne - STFC UKRI
Hi all, The Ceph documentation has always recommended upgrading RGWs last when doing an upgrade. Is there a reason for this? As they're mostly just RADOS clients, you could imagine the order doesn't matter as long as the cluster and RGW major versions are compatible. Our basic testing has shown n
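
One quick way to sanity-check that version compatibility during a rolling upgrade (a sketch, not from the thread; assumes jq is installed) is the standard ceph CLI, which reports the running version of every daemon grouped by type:

    # JSON output, grouped by daemon type (mon, mgr, osd, mds, rgw)
    ceph versions

    # Just the RGW daemons
    ceph versions | jq '.rgw'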

[ceph-users] Slow initial boot of OSDs in large cluster with unclean state

2025-01-07 Thread Thomas Byrne - STFC UKRI
Hi all, On our 6000+ HDD OSD cluster (pacific), we've been noticing it takes significantly longer for brand new OSDs to go from booting to active when the cluster has been in a state of flux for some time. It can take over an hour for a newly created OSD to be marked up in some cases! We've just p
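
For reference (a sketch, not from the thread; osd.0 is a placeholder, and the admin socket must be reachable on the OSD host), the OSD's own status shows how far behind a booting daemon is on osdmaps:

    # Run on the OSD host via the admin socket (osd.0 is a placeholder)
    ceph daemon osd.0 status
    # Compare its "newest_map" against the cluster's current epoch:
    ceph osd dump | head -1
    # the difference is how many maps the booting OSD still has to fetch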

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-08 Thread Thomas Byrne - STFC UKRI
Bluestore, correct? Regards, Frédéric. ----- On 8 Jan 25, at 12:29, Thomas Byrne - STFC UKRI tom.by...@stfc.ac.uk wrote: > Hi Wes, > > It works out at about five new osdmaps a minute, which is about normal for > this cluster's state changes as far as I can tell. It
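
As a rough way to reproduce that figure on your own cluster (a sketch, not from the thread), sample the osdmap epoch twice and divide by the interval:

    # Epoch now, and again ten minutes later
    ceph osd dump | head -1
    sleep 600
    ceph osd dump | head -1
    # (second epoch - first epoch) / 10 = new osdmaps per minute;
    # at ~5/minute that is roughly 7200 maps accumulated per day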

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-08 Thread Thomas Byrne - STFC UKRI
Hi Tom, On Tue, Jan 7, 2025 at 10:15 AM Thomas Byrne - STFC UKRI wrote: > I realise the obvious answer here is don't leave a big cluster in an unclean > state for this long. Currently we'
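
A quick gauge of how unclean a cluster currently is (a sketch, not from the thread):

    # One-line summary of PG states; anything not active+clean
    # is keeping the monitors from trimming old osdmaps
    ceph pg stat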

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-08 Thread Thomas Byrne - STFC UKRI
Hi Anthony, Please see my replies inline. I also just wanted to say I really enjoyed your talk about QLC flash at Cephalocon, there was a lot of useful info in there. >> On our 6000+ HDD OSD cluster (pacific) > >That’s the bleeding edge in a number of respects.  Updating to at least Reef >would
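
For what it's worth, on a cephadm-managed cluster the move off Pacific can be staged with the orchestrator (a sketch, not from the thread; the target version is illustrative):

    # Begin a rolling upgrade to a Reef release and watch progress
    ceph orch upgrade start --ceph-version 18.2.4
    ceph orch upgrade status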

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-13 Thread Thomas Byrne - STFC UKRI
Thanks for the input Josh. I actually started looking into this because we're adding some SSD OSDs to this cluster, and they were basically as slow on their initial boot as HDD OSDs when the cluster hasn't trimmed OSDmaps in a while. I'd be interested to know if other people are seeing this slo
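
To check whether the monitors are sitting on a long tail of untrimmed maps (a sketch, not from the thread; assumes jq, with field names as found in recent releases):

    # Number of osdmap epochs the monitors still hold; a large value
    # means every fresh OSD has that many maps to chew through at boot
    ceph report 2>/dev/null | jq '.osdmap_last_committed - .osdmap_first_committed'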

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-16 Thread Thomas Byrne - STFC UKRI
up time, peering and osdmap updates, and the role it might play regarding flapping, when DB IOs compete with client IOs, even with 100% active+clean PGs. Cheers, Frédéric. [1] https://ceph.io/en/discover/case-studies/ ----- On 8 Jan 25, at 16:10, Thomas Byrne, STFC UKRI tom.by...@stfc.ac.uk

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-08 Thread Thomas Byrne - STFC UKRI
in a 5 day period. Respectfully, Wes Dillingham LinkedIn <http://www.linkedin.com/in/wesleydillingham> w...@wesdillingham.com On Tue, Jan 7, 2025 at 1:18 PM Thomas Byrne - STFC UKRI <tom.by...@stfc.ac.uk> wrote: Hi all, On our 6000+ HDD