[ceph-users] Re: OSD failed: still recovering

2025-04-04 Thread Gustavo Garcia Rondina
Hi Alan, Just to share our experience, we have a similar cluster that was in a very similar state after a disk failure: 6 nodes, 168 OSDs, 12 PGs inconsistent, a little over 5% misplaced objects. It took a good while for it to sort itself out, but eventually it did. Repairing the PGs with `cep
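
For reference, a minimal sketch of the inconsistency check and repair workflow being described (pool name and PG ID below are placeholders, not values from the thread):

```
# List the PGs currently flagged inconsistent
ceph health detail | grep -i inconsistent
rados list-inconsistent-pg <pool-name>

# Ask Ceph to scrub and repair a specific PG (placeholder PG ID)
ceph pg repair 2.1a

# Watch misplaced/inconsistent counters drain down over time
watch -n 30 'ceph -s | grep -E "misplaced|inconsistent"'
```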

[ceph-users] Re: Experience with 100G Ceph in Proxmox

2025-04-04 Thread Giovanna Ratini
Hello, Yes, I will test KRBD. I will be on holiday next week, so I don’t want to make any changes before then. Could you wait until 29.3? This is a production environment, and restoring a backup would take time. Or do you think there’s no risk in making the change? Thank yo
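
For context, KRBD is toggled per storage in Proxmox rather than in Ceph itself; a hedged sketch of what such an entry might look like (storage name, pool, and monitor addresses are made-up examples):

```
# /etc/pve/storage.cfg -- example RBD storage entry with KRBD enabled
rbd: ceph-vm
    content images
    pool vm-pool
    krbd 1
    monhost 10.0.0.1 10.0.0.2 10.0.0.3
    username admin
```

Running VMs typically only pick up the change after a stop/start or migration, so deferring it to a maintenance window on a production cluster is reasonable.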

[ceph-users] Cephadm upgrade from 16.2.15 -> 17.2.0

2025-04-04 Thread Jeremy Hansen
I ran into the “Error ENOENT: Module not found” issue with the orchestrator. I see the note in the cephadm upgrade docs, but I don’t quite know what action to take to fix this: ceph versions { "mon": { "ceph version 16.2.15 (618f440892089921c3e944a991122ddc44e60516) pacific (stable)": 3 }, "mgr
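
Not the resolution from this thread, but a hedged sketch of the usual checks when `ceph orch` reports "Error ENOENT: Module not found" mid-upgrade:

```
# Is the cephadm module enabled, and is it the selected orchestrator backend?
ceph mgr module ls | head -40
ceph mgr module enable cephadm
ceph orch set backend cephadm

# If the active mgr is wedged, fail over to a standby and retry
ceph mgr fail
ceph orch status

# Then check where the upgrade stands
ceph orch upgrade status
```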

[ceph-users] Ceph MDS stuck in reconnect -> rejoin -> failover loop

2025-04-04 Thread Kasper Rasmussen
Hi, Ceph Pacific 16.2.15. I have 5 MDS hosts, 4 active (4 FS), and 1 standby. One MDS was restarted today (as part of OS patching), resulting in a failover. This is usually not an issue, but today it got stuck in a reconnect -> rejoin -> failover loop for the specific FS. A ceph fs status shows
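
A hedged sketch of how such a loop is usually inspected (filesystem, rank, and daemon names below are placeholders):

```
# Ranks, states and standbys for all filesystems
ceph fs status
ceph mds stat
ceph health detail

# Follow the affected MDS daemon's log on its host (cephadm deployment;
# daemon name is a placeholder)
cephadm logs --name mds.cephfs.host1.abcdef -- -f

# A very large or stale session list can keep the reconnect phase from
# completing; rough count of sessions the MDS is waiting on
ceph tell mds.cephfs:0 session ls | grep -c '"id"'
```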

[ceph-users] Re: ceph-ansible LARGE OMAP in RGW pool

2025-04-04 Thread Frédéric Nass
Hi Danish, I was wondering if you've sorted it out? Let us know. Regards, Frédéric. - On 26 Mar 25, at 12:44, Frédéric Nass frederic.n...@univ-lorraine.fr wrote: > Hi Danish, > > The "unable to find head object data pool for..." could be an incorrect > warning > since it pops out for '
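
As general background, large OMAP warnings in an RGW pool usually point at over-full bucket index shards; a hedged sketch of the usual triage (bucket name and shard count are placeholders):

```
# Which object/PG triggered the warning?
ceph health detail | grep -i 'large omap'

# Any bucket index over the per-shard object limit?
radosgw-admin bucket limit check

# Inspect the suspect bucket's shard and object counts
radosgw-admin bucket stats --bucket=mybucket | grep -E 'num_objects|num_shards'

# Reshard if the index is under-sharded (dynamic resharding may also
# handle this, depending on configuration and multisite constraints)
radosgw-admin bucket reshard --bucket=mybucket --num-shards=101
```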

[ceph-users] Re: ceph-ansible LARGE OMAP in RGW pool

2025-04-04 Thread Danish Khan
Hi Frédéric, Thank you for checking in. There was some OSD replacement activity on the same Ceph cluster, hence I didn't try this yet. I will try these steps next week. But I was able to overwrite the curson.png file in the master zone and then initiate the sync. Previously 4 shards were in recov
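
A hedged sketch of the sync-state checks implied here (bucket name is a placeholder):

```
# Metadata and data sync state, including shards stuck in "recovering"
radosgw-admin sync status

# Per-bucket sync detail for the bucket that was rewritten in the master zone
radosgw-admin bucket sync status --bucket=mybucket

# Any recorded sync errors
radosgw-admin sync error list
```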

[ceph-users] Recovery across datacenters after host failure

2025-04-04 Thread Eugen Block
Hi all, let's say I have two DCs with replicated pools (size 4) and a tiebreaker MON somewhere else. Is it possible to control the recovery traffic in case of a host failure? Both DCs have enough replicas, so in theory it should be possible to recover within the DC with the failed host, r
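
Not an answer to the question, but a hedged sketch of what one would typically inspect and throttle while working that out (rule name is a placeholder):

```
# How does the rule spread the 4 replicas across the two datacenters?
ceph osd crush rule dump replicated_dc_rule
ceph osd tree | grep -E 'datacenter|host'

# Throttle or pause data movement while deciding how to handle the failed host
ceph osd set norebalance
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1

# Re-enable afterwards
ceph osd unset norebalance
```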

[ceph-users] Re: Lifecycle question

2025-04-04 Thread Anthony D'Atri
>> 3. This also means the overhead of lifecycle is massively increased. >> Lifecycle scans every object in every bucket with policy every . >> This is not usually a problem, because it happens once per day, and has 24 >> hours to complete (but in large systems, it can even take more than 24 h
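
For reference, a hedged sketch of the lifecycle knobs this point touches on (values are illustrative, not recommendations):

```
# Buckets with LC policies and their current processing state
radosgw-admin lc list

# Kick off a lifecycle pass manually, e.g. to test a policy
radosgw-admin lc process

# Widen the daily processing window and add worker threads if a single
# pass cannot finish in time
ceph config set client.rgw rgw_lifecycle_work_time "00:00-23:59"
ceph config set client.rgw rgw_lc_max_worker 5
```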

[ceph-users] Re: Reef: Dashboard bucket edit fails in get_bucket_versioning

2025-04-04 Thread Eugen Block
I can't reproduce it either. I removed the buckets and created a new one, no issue anymore. Then I upgraded a different test cluster from Pacific to Reef with an existing bucket, again no issue. So I guess the tracker can be closed. Thanks for checking. Quoting Afreen Misbah: I am unab
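
A hedged sketch of checking the bucket's versioning state outside the dashboard, which is one way to narrow such an issue down (endpoint and bucket name are placeholders):

```
# Versioning state as RGW reports it over S3; an empty response means
# versioning has never been configured on the bucket
aws --endpoint-url http://rgw.example.com:8080 \
    s3api get-bucket-versioning --bucket mybucket

# Bucket metadata as radosgw-admin sees it
radosgw-admin bucket stats --bucket=mybucket
radosgw-admin metadata get bucket:mybucket
```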

[ceph-users] OSD_UNREACHABLE After Upgrade to 17.2.8 – Issue with Public Network Detection

2025-04-04 Thread Илья Безруков
Hello everyone, After upgrading our Ceph cluster from 17.2.7 to 17.2.8 using `cephadm`, all OSDs are reported as unreachable with the following error: ``` HEALTH_ERR 32 osds(s) are not reachable [ERR] OSD_UNREACHABLE: 32 osds(s) are not reachable osd.0's public address is not in '172.20.180.1
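
The health check compares each OSD's registered public address against the configured `public_network`; a hedged sketch of how to inspect and, if needed, set it (the CIDR below is a placeholder, since the original value is truncated above):

```
# What does the cluster think the public network is?
ceph config get mon public_network
ceph config dump | grep public_network

# If it is missing or wrong in the config database, set it to the subnet
# the OSDs actually use (placeholder CIDR)
ceph config set global public_network 172.20.180.0/24

# Verify the addresses the OSDs have registered
ceph osd dump | grep '^osd\.'
```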

[ceph-users] Re: Ceph Tentacle release - dev freeze timeline

2025-04-04 Thread Yaarit Hatuka
So far there is an agreement to dev freeze in mid-April. If any component requires more time, please share your constraints so we can accommodate. Thanks. On Tue, Mar 18, 2025 at 1:07 AM Yaarit Hatuka wrote: > Hi everyone, > > In previous discussions, the Ceph Steering Committee tentatively agre

[ceph-users] Re: Prometheus anomaly in Reef

2025-04-04 Thread Tim Holloway
it returns nothing. I'd already done the same via "systemctl | grep prometheus". There simply isn't a systemd service, even though there should be. On 3/26/25 11:31, Eugen Block wrote: There’s a service called „prometheus“, which can have multiple daemons, just like any other service (mon, mgr
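
For context, in a cephadm deployment Prometheus runs as an orchestrator-managed container, so there is no plain "prometheus" systemd unit; a hedged sketch of what one might check instead:

```
# Does the orchestrator know about a prometheus service and daemon?
ceph orch ls prometheus
ceph orch ps --daemon-type prometheus

# If the service spec is missing, (re)apply it; if the daemon exists but
# is broken, redeploy it
ceph orch apply prometheus
ceph orch redeploy prometheus

# cephadm-managed units on the host are named ceph-<fsid>@<daemon>.service
systemctl list-units 'ceph-*' | grep -i prometheus
```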