More likely the problem would just migrate.
I suggest `ceph pg repair 32.7ef`. If the situation doesn’t improve within a
few minutes, try `ceph osd down 100`.
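Roughly, the sequence I have in mind (PG 32.7ef and osd.100 as in your report):

# ask the primary OSD to repair the problematic PG
ceph pg repair 32.7ef
# watch whether the PG state improves
ceph health detail | grep 32.7ef
# if nothing changes after a few minutes, mark the suspect OSD down
# so it re-asserts itself and the PG re-peers
ceph osd down 100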
> On Mar 27, 2025, at 6:40 AM, Frédéric Nass wrote:
>
> - A PG that cannot be reactivated with a remap operation that does
Setting up a new cluster on fresh Ubuntu 24.04 hosts using cephadm. The
first 5 hosts all added without issue, but the next N hosts all throw the
same error when adding through the dashboard or through ceph orch add...
All hosts have SSH access, Docker, and all base requirements confirmed.
Command:
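For context, adding hosts through the orchestrator generally looks like the
sketch below (hostnames and addresses are placeholders, not the actual values
used here):

# copy the cluster's public SSH key to the new host (cephadm deployments)
ssh-copy-id -f -i /etc/ceph/ceph.pub root@newhost
# add the host to the orchestrator, optionally pinning its IP
ceph orch host add newhost 10.0.0.21
# confirm it shows up
ceph orch host ls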
My understanding is that anti-affinity will be enforced unless the service spec
explicitly allows more than one instance per host.
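A sketch of a service spec that pins the daemons to specific hosts
(filesystem and host names are placeholders); count_per_host is the field
that would explicitly allow more than one daemon per host:

cat > mds-cephfs1.yaml <<'EOF'
service_type: mds
service_id: cephfs1
placement:
  hosts:
    - mds-host-1
    - mds-host-2
    - mds-host-3
  # count_per_host: 2   # (optional) would explicitly allow two MDS per host
EOF
ceph orch apply -i mds-cephfs1.yaml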
>
> Let’s say I have 2 cephfs, and three hosts I want to use as MDS hosts.
>
> I use ceph orch apply mds to spin up the MDS daemons.
>
> Is there a way to ensure
Look at `ceph osd df`. Is the balancer enabled?
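Concretely, something like:

# per-OSD utilization and PG counts
ceph osd df
# is the balancer module on, and what mode is it in?
ceph balancer status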
> On Mar 27, 2025, at 8:50 AM, Mihai Ciubancan wrote:
>
> Hello,
>
> My name is Mihai, and I have started using Ceph this month for an HPC cluster.
> When it was launched into production the available space shown was 80 TB; now it is
> 16 TB and I didn'
Let’s say I have 2 cephfs, and three hosts I want to use as MDS hosts.
I use ceph orch apply mds to spin up the MDS daemons.
Is there a way to ensure that I don’t get two active MDS running on the same
host?
I mean when using the ceph orch apply mds command, I can specify --placement,
but it on
quincy-x approved:
https://tracker.ceph.com/projects/rados/wiki/REEF#v1825-httpstrackercephcomissues70563note-1-upgradequincy-x
Asking Radek and Neha about pacific-x.
On Thu, Mar 27, 2025 at 9:54 AM Yuri Weinstein wrote:
> Venky, Guillaume pls review and approve fs and orch/cephadm
>
> Still awa
Thanks for your patience.
Host ceph06 isn't referenced in the config database. I think I've
finally purged it. I also reset the dashboard API host address from
ceph08 to dell02. But since prometheus isn't running on dell02 either,
there's no gain there.
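For the record, roughly what that repointing looks like, assuming the
dashboard's Prometheus endpoint is the setting involved (the URL below is a
placeholder):

# where is prometheus actually running?
ceph orch ps | grep prometheus
# point the dashboard at it (placeholder URL, default cephadm port 9095 assumed)
ceph dashboard set-prometheus-api-host 'http://dell02:9095'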
I did clear some of that lint out via
Hello,
It seems we are at the end of our stressful adventure! After the big
rebalancing finished, without errors but without any significant impact
on the pool access problem, we decided to reboot all our OSD servers one
by one. The first good news is that it cleared all the reported issues
(
Venky, Guillaume pls review and approve fs and orch/cephadm
Still awaiting arrivals:
rados - Travis? Nizamudeen? Adam King approved?
rgw - Adam E approved?
fs - Venky approved?
upgrade-clients:client-upgrade-octopus-reef-reef - Ilya please take a
look. There are multiple runs
upgrade/pacific
Michel,
I can't recall any situations like that - maybe someone here does? - but I
would advise that you restart all OSDs to trigger the re-peering of every PG.
This should get your cluster back on track.
Just make sure the crush map / crush rules / bucket weights (including OSDs
weights) hav
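Roughly, the checks and the rolling restart could look like this (assuming
non-containerized systemd units; adapt for cephadm):

# sanity-check crush weights and rules before touching anything
ceph osd tree
ceph osd crush rule dump
# then restart OSDs host by host, waiting for peering to settle in between
systemctl restart ceph-osd.target
ceph -s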
Hello,
My name is Mihai, and I have started using Ceph this month for an HPC
cluster.
When it was launched into production, the available space shown was 80 TB;
now it is 16 TB and I didn't do anything, while I have 12 OSDs (SSDs of 14 TB):
sudo ceph osd tree
ID CLASS WEIGHT TYPE NAME
Frédéric,
When I was writing the last email, my colleague launched a re-peering of
the PG in activating state: the PG became active immediately but
triggered a little bit of rebalancing of other PGs, not necessarily in
the same pool. After this success, we decided to go for your approach,
sel
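For anyone else hitting this, the re-peering step was roughly (the PG id is a
placeholder here):

# list PGs stuck inactive (activating PGs show up here)
ceph pg dump_stuck inactive
# ask the primary to re-peer a specific PG
ceph pg repeer <pgid>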
It gets worse.
It looks like the physical disk backing the 2 failing OSDs is failing. I
destroyed the host for one of them - which causes me to flash back to
the nightmare of having a deleted OSD get permanently stuck deleting
just like in Pacific. Because I cannot restart the OSD, the deletio
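For reference, the usual knobs for a stuck orchestrator-driven removal are
roughly (the OSD id is a placeholder):

# see what the orchestrator thinks it is still draining/removing
ceph orch osd rm status
# if the daemon is gone for good, force the removal
ceph orch osd rm 123 --force
# and/or purge the OSD id from the cluster map
ceph osd purge 123 --yes-i-really-mean-it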
Frédéric,
Thanks for your answer. I checked the number of PG on osd.17: it is 164,
very far from the hard limit (750, the default I think). So it doesn't
seem to be the problem, and maybe the peering is a victim of the more
general problem leading to many pools to be more or less inaccessible.
Hi Michel,
A common reason for PGs being stuck during activation is reaching the hard
limit of PGs per OSD. You might want to compare the number of PGs osd.17 has
(ceph osd df tree | grep -E 'osd.17 |PGS') to the hard limit set in your
cluster (echo "`ceph config get osd.0 mon_max_pg_per_osd`*`
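In other words, roughly this (the hard limit being mon_max_pg_per_osd
multiplied by osd_max_pg_per_osd_hard_ratio, 250 * 3 = 750 by default):

# PGs currently mapped to osd.17
ceph osd df tree | grep -E 'osd\.17 |PGS'
# the two settings whose product is the hard limit
ceph config get osd.0 mon_max_pg_per_osd
ceph config get osd.0 osd_max_pg_per_osd_hard_ratio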
On 27/03/2025 10:10, Torkil Svensgaard wrote:
Hi
19.2.1
"
[root@franky ~]# ceph device ls | grep franky
ATA_HGST_HDN726060ALE614_K1GV9P4B franky:sda osd.579 now
ATA_HGST_HDS724040ALE640_PK1334PBH7PZ5P franky:sdn osd.577
Hi,
I have not seen an answer yet, help would be very much appreciated as
our production cluster seems in worse shape than initially described...
After a deeper analysis, we found that more than half of the pools,
despite reported as ok, are not accessible: the 'rados ls' command is
stuck
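A rough sketch of narrowing down which PGs block a given pool (the pool name
is a placeholder):

# overall health, including inactive/stuck PG warnings
ceph health detail
# PGs of a given pool that are not active+clean
ceph pg ls-by-pool <pool> | grep -v 'active+clean'
# PGs stuck inactive cluster-wide
ceph pg dump_stuck inactive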
Hi
19.2.1
"
[root@franky ~]# ceph device ls | grep franky
ATA_HGST_HDN726060ALE614_K1GV9P4B franky:sda osd.579 now
ATA_HGST_HDS724040ALE640_PK1334PBH7PZ5P franky:sdn osd.577 now
ATA_HGST_HDS72