[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Dan van der Ster
OK -- here's the tracker for what I mentioned: https://tracker.ceph.com/issues/55303 On Tue, Apr 12, 2022 at 9:50 PM Ray Cunningham wrote: > > Thank you Dan! I will definitely disable autoscaler on the rest of our pools. > I can't get the PG numbers today, but I will try to get them tomorrow. We

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Ray Cunningham
Thank you Dan! I will definitely disable autoscaler on the rest of our pools. I can't get the PG numbers today, but I will try to get them tomorrow. We definitely want to get this under control. Thank you, Ray   -Original Message- From: Dan van der Ster Sent: Tuesday, April 12, 2022

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Ray Cunningham
Thanks Matt! I didn't know about nobackfill and norebalance! That could be a good stopgap, as long as there's no issue having it set for weeks. We estimate our legacy bluestore cleanup to take about 3-4 weeks. You are correct, I don't want to cancel it; we just need to catch up on other mainte

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Dan van der Ster
Hi Ray, Disabling the autoscaler on all pools is probably a good idea. At least until https://tracker.ceph.com/issues/53729 is fixed. (You are likely not susceptible to that -- but better safe than sorry). To pause the ongoing PG merges, you can indeed set the pg_num to the current value. This wi
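
A minimal sketch of the two steps Dan describes, with placeholder pool names (not taken verbatim from the thread):

    # disable the autoscaler on each pool
    ceph osd pool set <pool> pg_autoscale_mode off
    # pause an ongoing merge by pinning pg_num to its current value
    ceph osd pool get <pool> pg_num
    ceph osd pool set <pool> pg_num <current value>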

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Matt Vandermeulen
It sounds like this is from a PG merge, so I'm going to _guess_ that you don't want to cancel the current backfill outright, but rather pause it to catch your breath. You can set `nobackfill` and/or `norebalance`, which should pause the backfill. Alternatively, use `ceph config set osd.* os
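
The flags Matt names are set and cleared cluster-wide; a rough sketch (the truncated `ceph config set` command most likely refers to a backfill throttle such as osd_max_backfills, which is an assumption here):

    ceph osd set nobackfill
    ceph osd set norebalance
    # ...and to resume later:
    ceph osd unset nobackfill
    ceph osd unset norebalance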

[ceph-users] Stop Rebalancing

2022-04-12 Thread Ray Cunningham
Hi Everyone, We just upgraded our 640 OSD cluster to Ceph 16.2.7, and the resulting rebalancing of misplaced objects is overwhelming the cluster and impacting MON DB compaction, deep scrub repairs, and our upgrade of legacy bluestore OSDs. We have to pause the rebalancing of misplaced objects or we
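
For context, the scale of the rebalancing Ray describes can be observed with the standard status commands (a generic sketch, not part of the original message):

    ceph status                      # overall health and recovery/backfill progress
    ceph pg stat                     # PG state summary, including misplaced/degraded counts when present
    ceph osd pool autoscale-status   # the autoscaler's current per-pool targets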

[ceph-users] Removing osd in the Cluster map

2022-04-12 Thread Michel Niyoyita
Hello team, I am testing my Ceph Pacific cluster using VMs, which is integrated with OpenStack. Suddenly one of the hosts turned off and failed. I built another host with the same number of OSDs as the first one and redeployed the cluster again. Unfortunately, the cluster is still up with 2 hosts,
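
A rough sketch of removing the dead OSDs of a failed host from the cluster map (IDs and hostname are placeholders; make sure the data has been fully recovered elsewhere before purging):

    ceph osd out osd.<id>
    ceph osd purge <id> --yes-i-really-mean-it   # drops the OSD from the CRUSH map, auth and OSD map
    ceph osd crush rm <failed-hostname>          # remove the now-empty host bucket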

[ceph-users] Re: Ceph Developer Summit - Reef

2022-04-12 Thread Mike Perez
Hi everyone, The Ceph Developer Summit for Reef is now starting, with discussions on the Orchestrator: https://ceph.io/en/community/events/2022/ceph-developer-summit-reef/ On Fri, Apr 8, 2022 at 1:22 PM Mike Perez wrote: > > Hi everyone, > > If you're a contributor to Ceph, please join us for the ne

[ceph-users] Re: Pool with ghost used space

2022-04-12 Thread Alex Gorbachev
Hi Joao, I have seen something like this in Luminous after increasing the size from 1 to 3; it almost looks like an extra copy is being kept. I was never able to resolve this without recreating the pool. -- Alex Gorbachev On Mon, Apr 11, 2022 at 9:13 PM Joao Victor Rodrigues Soares wrote: > No
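
To inspect the kind of discrepancy discussed here, the per-pool accounting can be compared from a few angles (a generic sketch, with a placeholder pool name):

    ceph df detail                  # per-pool usage as Ceph accounts it
    rados df                        # per-pool object counts and space
    ceph osd pool get <pool> size   # confirm the replication factor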

[ceph-users] Announcing go-ceph v0.15.0

2022-04-12 Thread Sven Anderson
We are happy to announce another release of the go-ceph API library. This is a regular release following our every-two-months release cadence. https://github.com/ceph/go-ceph/releases/tag/v0.15.0 Changes include additions to the rados and rgw packages. More details are available at the link above

[ceph-users] Re: Low performance on format volume

2022-04-12 Thread Mark Nelson
Hi Iban, Most of these options fall under the osd section.  You can get descriptions of what they do here: https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/ The journal settings are for the old filestore backend and aren't relevant unless you are using it.  Still, you can
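
As a small illustration, a running OSD's current value for any of these options can be checked through its admin socket before changing anything (osd.0 and the option name are examples):

    ceph daemon osd.0 config get osd_max_write_size   # run on the host where osd.0 lives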

[ceph-users] Re: osd with unlimited ram growth

2022-04-12 Thread Mark Nelson
Hi Joachim, Thank you very much for the great writeup!  This definitely has been a major source of frustration. Thanks, Mark On 4/12/22 05:23, Joachim Kraftmayer (Clyso GmbH) wrote: Hi all, In the last few weeks we have discovered an error for which there have been several tickets and error

[ceph-users] Re: Low performance on format volume

2022-04-12 Thread Iban Cabrillo
Hi, Following up on the performance issue (Mimic, 144 SATA disks, 10Gbps network): the [OSD] entry has the default conf, there is no tuning yet. I see a lot of parameters that can be set: [OSD] osd journal size = osd max write size = osd client message size cap = osd deep scrub st
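
One quick way to confirm that an OSD really is running defaults is to diff its live config against the built-in values (osd.0 is a placeholder; run this on that OSD's host):

    ceph daemon osd.0 config diff   # lists only options that differ from their defaults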

[ceph-users] Re: Successful Upgrade from 14.2.18 to 15.2.16

2022-04-12 Thread Dan van der Ster
Hi Stefan, Thanks for the report. A 9-hour fsck is the longest I've heard about yet -- and on NVMe, that's quite surprising! Which firmware are you running on those Samsungs? For a different reason, Mark and we have been comparing performance of that drive between what's in his lab vs what we have
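
A quick way to answer the firmware question on the host, assuming the nvme-cli and smartmontools packages are available (device name is an example):

    nvme list                                    # model and firmware revision per NVMe device
    smartctl -a /dev/nvme0n1 | grep -i firmware  # the same via smartmontools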