The replaced OSD is still backfilling, but the MON store size has
decreased to 2.3GB already. I'm going to wait for the recovery to
finish, then I'll reset all the temporary CRUSH weights, cancel the
backfills, and then let the balancer do the rest.
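Roughly, that plan translates to commands like these (the OSD id and the weight are placeholders, and pgremapper's --yes flag is assumed to be what switches it from dry run to apply):
$ ceph osd crush reweight osd.42 3.63869              # restore the original CRUSH weight (example values)
$ /root/go/bin/pgremapper cancel-backfill --verbose   # dry run: show the upmap entries it would create
$ /root/go/bin/pgremapper cancel-backfill --yes       # apply: pin the remaining remapped PGs in place
$ ceph balancer on                                    # let the balancer take over from here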
Thanks all!
On 18/12/2024 03:42, Gregory O
On 18/12/24 02:30, Janek Bevendorff wrote:
> I did increase the pgp_num of a pool a while back and totally
> forgot about that. Due to the ongoing rebalancing it was stuck halfway,
> but it has now suddenly started up again. The current PG number of that pool is
> not quite final yet, but definitely higher
I think it was mentioned elsewhere in this thread that there are
limitations to what upmap can do, especially in situations involving
significant CRUSH map changes. It can't violate CRUSH rules (the mons
enforce that), and if the same OSD shows up multiple times in a backfill,
upmap can't deal with it.
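To illustrate the first limitation (the PG id and OSD ids here are made up):
$ ceph osd pg-upmap-items 100.3d 12 37   # try to remap one of this PG's copies from osd.12 to osd.37
If osd.37 sits in a failure domain that the pool's CRUSH rule does not allow, the mons will simply drop that upmap entry again shortly after it is set instead of honouring it.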
The numb
Creeping b
Something's not quite right yet. I got the remapped PGs down from > 4000
to around 1300, but there it stops. When I restart the process, I can
get it down to around 280, but there it stops and creeps back up afterwards.
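A quick way to keep an eye on both the remapped PG count and the ongoing pgp_num split (the exact output format varies by release, so treat this as a sketch):
$ ceph pg ls remapped | wc -l                          # roughly the number of remapped PGs (includes a header line)
$ ceph osd pool ls detail | grep -E 'pg_num|pgp_num'   # current and target pg_num/pgp_num per pool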
I have a bunch of these messages in the output:
WARNING: pg 100.3d53: conf
Hey Janek,
Ah, yes, we ran into that invalid JSON output in
https://github.com/digitalocean/ceph_exporter as well. I have a patch
I wrote for ceph_exporter that I can port over to pgremapper (it does
something similar to what your patch does).
Josh
On Tue, Dec 17, 2024 at 9:38 AM Janek Bevendorff
wrote
Looks like there is something wrong with the .mgr pool. All others have
proper values. For now I've patched the pgremapper source code to
replace the inf values with 0 before unmarshaling the JSON. That at
least made the tool work. I guess it's safe to just delete that pool and
let the MGRs recr
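For reference, a rough way to confirm which pool the bogus scores belong to without patching anything, assuming jq is available and no healthy pool reports a score of exactly 0 (the sed substitution just turns the bare inf tokens into parseable numbers):
$ ceph osd dump -f json | sed 's/: inf/: 0/g' \
    | jq -r '.pools[] | select(.read_balance.score_acting == 0) | .pool_name'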
I checked the output of ceph osd dump -f json-pretty and validated it with a
little Python script. Turns out, there's this somewhere around line 1200:
"read_balance": {
"score_acting": inf,
"score_stable": inf,
"optimal_score": 0,
Thanks. I tried running the command (dry run for now), but something's
not working as expected. Have you ever seen this?
$ /root/go/bin/pgremapper cancel-backfill --verbose
** executing: ceph osd dump -f json
panic: invalid character 'i' looking for beginning of value
goroutine 1 [running]:
mai
> You can use pg-remapper (https://github.com/digitalocean/pgremapper) or
> similar tools to cancel the remapping; up-map entries will be created
> that reflect the current state of the cluster. After all currently
> running backfills are finished your mons should not be blocked anymore.
> I would
Thanks for your replies!
You can use pg-remapper (https://github.com/digitalocean/pgremapper)
or similar tools to cancel the remapping; up-map entries will be
created that reflect the current state of the cluster. After all
currently running backfills are finished your mons should not be
bloc
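As a sanity check after running such a tool, the created entries show up as pg_upmap_items lines in the OSD map; a quick way to count them (not from the original thread):
$ ceph osd dump | grep -c pg_upmap_items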
I agree with the pg-remapper or upmap-remapped approach. One thing to be aware
of, though, is that the mons will invalidate any upmap which breaks the data
placement rules. So, for instance, if you are moving from a host-based failure
domain to a rack-based failure domain, attempting to upmap the data back to
its
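A quick way to see which failure domain each CRUSH rule currently uses (jq assumed; the "type" in a rule's chooseleaf step is the failure domain, e.g. host or rack):
$ ceph osd crush rule dump | jq '.[] | {rule_name, steps}'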
Hi,
On 17.12.24 14:40, Janek Bevendorff wrote:
Hi all,
We moved our Ceph cluster to a new data centre about three months ago,
which completely changed its physical topology. I changed the CRUSH
map accordingly so that the CRUSH location matches the physical
location again and the cluster has
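For context, adjusting CRUSH locations after a physical move is usually done with bucket moves along these lines (bucket and host names here are purely illustrative):
$ ceph osd crush add-bucket rack12 rack    # create the new rack bucket if it doesn't exist yet
$ ceph osd crush move rack12 root=default  # place it under the CRUSH root
$ ceph osd crush move host42 rack=rack12   # move a host under its new rack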