Hello Eugen
I re-added my node but am facing an auth issue with the OSDs. On the host I can
see some of the OSDs up and running, but they are not showing in the dashboard under OSD.
“…minutes ago - daemon:osd.101
auth get failed: failed to find osd.101 in keyring retval: -2”
# bash unit.run
--> Failed to activate
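A minimal sketch of how the missing key could be checked and restored, assuming a
cephadm deployment where the OSD's keyring file still exists on the host; the
path, the <fsid> placeholder and the caps below are assumptions, not taken from
the thread:

# confirm the key really is missing from the mon auth database
ceph auth get osd.101

# re-import the key from the host's keyring file with standard OSD caps
# (adjust <fsid> and the path to your deployment)
ceph auth add osd.101 mon 'allow profile osd' mgr 'allow profile osd' \
    osd 'allow *' -i /var/lib/ceph/<fsid>/osd.101/keyring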
We really don't know what the cluster state was. I assume by removing
the fifth host you removed the fullest OSD(s). But again, we only get
a few bits of information, so it's still basically guessing.
What exactly do you mean by "so I thought to run pool repair"? What
exactly did you do?
Zitat …
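If it helps: per-OSD and per-host utilization, plus the overall recovery state,
can be checked with the usual commands:

# utilization per OSD, grouped by the CRUSH tree
ceph osd df tree

# overall health, including recovery/backfill progress
ceph -s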
Hi,
But now the issue is that my cluster is showing misplaced objects, even though
I had 5 nodes with a host failure domain, an R3 pool (size 3 and min_size 2),
and EC 3+2.
the math is pretty straightforward: with 5 chunks (k=3, m=2) you need
(at least) 5 hosts. So you should add the host back to be able to …
Try the following command to fix the osd_remove_queue:
ceph config-key set mgr/cephadm/osd_remove_queue []
After that, the orchestrator will be back.
On Sat, Jan 25, 2025, 8:47 PM Devender Singh wrote:
> +Eugen
> Let's follow “No recovery after removing node -
> active+undersized+degraded -- removed osd using purge…” here.
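For reference, a sketch of the full sequence; the backup file name is just an
example, and 'ceph mgr fail' is used to restart the active mgr so cephadm
reloads the key:

# back up the current queue before touching it
ceph config-key get mgr/cephadm/osd_remove_queue > osd_remove_queue.bak.json

# reset the queue and fail over the active mgr
ceph config-key set mgr/cephadm/osd_remove_queue []
ceph mgr fail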
+Eugen
Let's follow “No recovery after removing node - active+undersized+degraded --
removed osd using purge…” here.
Sorry, I missed the Ceph version, which is 18.2.4 (5 nodes with 22 OSDs each,
where I removed one node and caused all this mess).
Regards
Dev
> On Jan 25, 2025, at 11:34 AM, Devender Singh wrote:
Hello Eugen
Thanks for your reply.
ceph osd set nodeep-scrub does not stop repairs that are already running.
The repair started another set of deep-scrub+repair operations, which are not
controlled by this command.
When I started, my cluster utilization was 74%, and now that it has finished my
cluster is showing …
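Just to illustrate the behaviour: the noscrub/nodeep-scrub flags only prevent
new scrubs from being scheduled, they don't cancel in-flight ones. A sketch for
checking what is still running (the grep pattern is an assumption):

# prevent new (deep-)scrubs from being scheduled
ceph osd set noscrub
ceph osd set nodeep-scrub

# list PGs whose state still contains scrubbing or repair
ceph pg dump pgs_brief | grep -Ei 'scrub|repair'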
Hello Frédéric
Thanks for your reply. Yes, I also faced this issue after draining and removing
the node.
So I used the same command, removed “original_weight” from the output of
ceph config-key get mgr/cephadm/osd_remove_queue, and injected the file again,
which resolved the orch issue.
“Error ENOENT: Mod…
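In case it helps others, a sketch of that workaround; the file name is
arbitrary, and the JSON has to be edited by hand to drop the “original_weight”
fields before it is injected again:

# dump the queue, edit out "original_weight", then inject it back
ceph config-key get mgr/cephadm/osd_remove_queue > queue.json
# ... edit queue.json ...
ceph config-key set mgr/cephadm/osd_remove_queue -i queue.json
ceph mgr fail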
Hi Eugen,
my hypothesis is that these recursive counters are uncritical and, in fact,
updated when the dir/file is modified/accessed. Attributes like ceph.dir.rbytes
will show somewhat incorrect values, but these are approximate anyway (updates
are propagated asynchronously).
It would just be …
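For anyone following along, those recursive statistics can be read directly
from the directory xattrs; the mount point and path below are just examples:

# recursive byte and entry counts as maintained by the MDS
getfattr -n ceph.dir.rbytes /mnt/cephfs/some/dir
getfattr -n ceph.dir.rentries /mnt/cephfs/some/dir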
Hi,
I've seen this happening on a test cluster after draining a host that also had
an MGR service. Can you check if Eugen's solution here [1] helps in your case?
And maybe investigate 'ceph config-key ls' for any issues in the config keys?
Regards,
Frédéric.
[1] https://www.spinics.net/lists/cep
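Something along these lines could help narrow it down (the grep filter is just
a suggestion):

# list config keys and look at the cephadm ones
ceph config-key ls | grep cephadm

# show the entry discussed in this thread
ceph config-key get mgr/cephadm/osd_remove_queue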
Why would you want to unpause repairs? Do you suffer from performance
issues during deep-scrubs? You could pause those with:
ceph osd set nodeep-scrub
But they would only reveal inconsistent PGs during deep-scrub. To
enable automatic repairs during deep-scrub, you would need to first
enable …
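Presumably the option meant here is osd_scrub_auto_repair (an assumption, since
the message is cut off); a sketch:

# pause deep-scrubs cluster-wide
ceph osd set nodeep-scrub

# let deep-scrub repair inconsistencies automatically (off by default)
ceph config set osd osd_scrub_auto_repair true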
Hi,
we really don't know anything about your cluster (version, health
status, osd tree, crush rules). So at this point one can only assume
what *could* have happened.
Degraded and misplaced PGs aren't that bad, as long as there's any
recovery going on (you left that part out of your status).
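The output of the usual commands would answer most of those questions:

# version, health, OSD layout, crush rules, recovery progress
ceph versions
ceph health detail
ceph osd tree
ceph osd crush rule dump
ceph -s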
Hi Frank,
in that case I would probably wait a bit as well if no clients complain.
I guess one could try to scrub only a single directory instead of "/";
I assume it should be possible to identify the affected directory from
the log output you provided.
Have a calm weekend! ;-)
Eugen
Zitat
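A sketch of a per-directory scrub via the MDS; the filesystem name, rank and
path are placeholders:

# scrub only the affected subtree instead of "/"
ceph tell mds.<fsname>:0 scrub start /path/to/dir recursive

# check progress
ceph tell mds.<fsname>:0 scrub status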
Encountered this issue recently; restarting the mgrs did the trick.
Cheers
On Sat, Jan 25, 2025, 06:26 Devender Singh wrote:
> Thanks for your reply… but those commands are not working, as it's an always-on
> module. But strangely it is still showing an error:
>
> # ceph mgr module enable orchestrator
> module 'orchestrator' …
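For reference, restarting the mgrs can be done either with a failover or via
systemd on the host; the unit name below is a placeholder:

# fail over to a standby mgr
ceph mgr fail

# or restart the daemon directly on the host running it (cephadm unit name)
systemctl restart ceph-<fsid>@mgr.<name>.service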