[ceph-users] Re: Watcher Issue

2025-01-21 Thread Frédéric Nass
Hi Dev, Can you run the command below to check if this image is still considered mapped by any of the ceph-csi nodeplugins? $ namespace=ceph-csi-rbd $ image=csi-vol-945c6a66-9129 $ for pod in $(kubectl -n $namespace get pods | grep -E 'rbdplugin|nodeplugin' | grep -v provisioner | awk '{print $1}'
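The preview cuts off mid-loop; a minimal sketch of how such a check could be completed (a hypothetical continuation, assuming each nodeplugin pod has a container named csi-rbdplugin with the rbd binary available):

  namespace=ceph-csi-rbd
  image=csi-vol-945c6a66-9129
  for pod in $(kubectl -n $namespace get pods | grep -E 'rbdplugin|nodeplugin' | grep -v provisioner | awk '{print $1}'); do
    echo "== $pod =="
    # list the RBD images mapped on the node this plugin pod runs on and look for ours
    kubectl -n $namespace exec "$pod" -c csi-rbdplugin -- rbd device list 2>/dev/null | grep "$image" \
      && echo "image still appears mapped via $pod"
  done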

[ceph-users] Re: Watcher Issue

2025-01-21 Thread Devender Singh
Similar issue https://github.com/ceph/ceph-csi/discussions/4410

Regards
Dev

On Tue, 21 Jan 2025 at 2:33 PM, Devender Singh wrote:
> Hello Eugen
>
> Thanks for your reply.
> I have the image available and it’s not under trash.
>
> When scaling a pod to different node using statefulset, pod give

[ceph-users] Re: Seeking Participation! Take the new Ceph User Stores Survey!

2025-01-21 Thread Laura Flores
Hi Robin, As fast feedback when I passed the survey on to somebody else - to
> improve responses, if CUC can offer commands to make it easier to grab
> some of the quantitative data: Do you have the pg autoscaler enabled?
> How many OSDs per node are you using?
> How many clients are reading/wri
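For reference, a few stock commands that could capture that sort of quantitative data (a sketch only; the survey may ask for different granularity):

  ceph osd pool autoscale-status   # is the pg autoscaler enabled, per pool?
  ceph osd tree                    # host/OSD layout, to count OSDs per node
  ceph status                      # current client read/write rates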

[ceph-users] Re: Watcher Issue

2025-01-21 Thread Devender Singh
Hello Eugen, Thanks for your reply. The image is available and it’s not in the trash. When scaling a pod to a different node using a StatefulSet, the pod hits a mount issue. I was looking for a command to kill the client.id from Ceph. Ceph must have a command to kill its
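For what it's worth, the usual Ceph-side way to evict such a client is to blocklist the watcher address reported by rbd status; a hedged example reusing the address from the original post (the watch should then time out after roughly 30 seconds):

  rbd status pool/image                              # note the watcher address
  ceph osd blocklist add 10.160.0.245:0/2076588905   # evict that client
  ceph osd blocklist ls                              # confirm the entry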

[ceph-users] Re: Seeking Participation! Take the new Ceph User Stores Survey!

2025-01-21 Thread Robin H. Johnson
On Tue, Jan 21, 2025 at 10:43:13AM -0600, Laura Flores wrote:
> Hi all,
>
> The Ceph User Council is conducting a survey to gather insights from
> community members who actively use production Ceph clusters. We want to
> hear directly from you: *What is the use case of your production Ceph
> clust

[ceph-users] Re: Watcher Issue

2025-01-21 Thread Eugen Block
Hi, have you checked if the image is in the trash?

  rbd -p {pool} trash ls

You can try to restore the image if there is one, then blocklist the client to release the watcher, then delete the image again. I have to do that from time to time on a customer’s OpenStack cluster. Quoting Devend
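A rough sketch of that sequence (pool, image, and the client address are placeholders drawn from elsewhere in the thread; adjust to the actual values):

  rbd -p {pool} trash ls                             # find the image id in the trash
  rbd trash restore {pool}/{image-id}                # restore it if it is there
  ceph osd blocklist add 10.160.0.245:0/2076588905   # blocklist the watcher so it lets go
  rbd status {pool}/{image}                          # wait until no watchers remain
  rbd rm {pool}/{image}                              # delete the image again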

[ceph-users] Watcher Issue

2025-01-21 Thread Devender Singh
Hello, Seeking some help: can I clean up the client mounting my volume?

rbd status pool/image
Watchers:
    watcher=10.160.0.245:0/2076588905 client.12541259 cookie=140446370329088

Issue: the pod is failing in the init state.

Events:
Type  Reason  Age  From  Message

[ceph-users] Re: Seeking Participation! Take the new Ceph User Stores Survey!

2025-01-21 Thread Laura Flores
Correction: "Stories", not "Stores" in the subject. :)

On Tue, Jan 21, 2025 at 10:43 AM Laura Flores wrote:
> Hi all,
>
> The Ceph User Council is conducting a survey to gather insights from
> community members who actively use production Ceph clusters. We want to
> hear directly from you: *What

[ceph-users] Re: Changing crush map result in > 100% objects degraded

2025-01-21 Thread Anthony D'Atri
This is one reason to set nobackfill/norebalance first, so that the cluster doesn’t needlessly react to an intermediate state. Having managed clusters before we had the ability to manipulate the CRUSH topology via the CLI, I would suggest using the CLI whenever possible. It’s all too easy to f
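A minimal sketch of that pattern, with the flags wrapped around a CLI-driven topology change (the rack name is hypothetical):

  ceph osd set nobackfill
  ceph osd set norebalance
  ceph osd crush add-bucket rack1 rack           # create the new bucket
  ceph osd crush move rack1 root=default         # place it under the root
  ceph osd crush move ksr-ceph-osd1 rack=rack1   # re-parent a host
  # ...repeat for the remaining hosts, review `ceph pg stat`, then:
  ceph osd unset nobackfill
  ceph osd unset norebalance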

[ceph-users] Re: Changing crush map result in > 100% objects degraded

2025-01-21 Thread Kasper Rasmussen
Oh, but of course everything smooths out after a while. My main concern is just that if I do this on a large cluster, it will send it spinning...

From: Kasper Rasmussen
Sent: Tuesday, January 21, 2025 18:35
To: Dan van der Ster; Anthony D'Atri
Cc: ceph-users
Subjec

[ceph-users] Re: Changing crush map result in > 100% objects degraded

2025-01-21 Thread Kasper Rasmussen
Hi Dan,

"Also, in the process of moving the hosts one by one, each step creates a new topology which can change the ordering of hosts, incrementally putting things out of whack."

RESPONSE: Would it be better to edit the crushmap as a file and load the new one with ceph osd setcrushmap -i?

Kasper
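That file-based workflow would look roughly like this (filenames are arbitrary; a sketch, not a recommendation for any particular edit):

  ceph osd getcrushmap -o crushmap.bin        # dump the current compiled map
  crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
  # edit crushmap.txt: add the rack buckets and re-parent the hosts in one pass
  crushtool -c crushmap.txt -o crushmap.new   # recompile
  ceph osd setcrushmap -i crushmap.new        # inject the whole change in a single step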

[ceph-users] Re: Changing crush map result in > 100% objects degraded

2025-01-21 Thread Devender Singh
You moved some OSDs, so I believe it’s looking for the peer OSDs’ data too. But as long as you keep nobackfill, norebalance and norecover set, recovery will take longer and keep showing more data to rebalance, and as data keeps being written to the volumes it will accumulate. So unset the flags and wait some time for it to finish. Rega
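i.e. something along these lines (a sketch):

  ceph osd unset nobackfill
  ceph osd unset norebalance
  ceph osd unset norecover
  ceph -s   # watch the degraded/misplaced counts drain as recovery proceeds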

[ceph-users] Re: Changing crush map result in > 100% objects degraded

2025-01-21 Thread Dan van der Ster
On Tue, Jan 21, 2025 at 7:12 AM Anthony D'Atri wrote:
> > On Jan 21, 2025, at 7:59 AM, Kasper Rasmussen
> > wrote:
> >
> > 1 - Why does this result in such a high - objects degraded - percentage?
>
> I suspect that’s a function of the new topology having changed the mappings
> of multiple OSDs fo

[ceph-users] Re: Changing crush map result in > 100% objects degraded

2025-01-21 Thread Anthony D'Atri
> On Jan 21, 2025, at 7:59 AM, Kasper Rasmussen
> wrote:
>
> 1 - Why does this result in such a high - objects degraded - percentage?

I suspect that’s a function of the new topology having changed the mappings of multiple OSDs for given PGs. It’s subtle, but when you move hosts into rack CRU
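One way to see this effect before applying a change is to compare the PG mappings computed from the old and new maps with crushtool (a sketch; the rule id and replica count below are assumptions):

  ceph osd getcrushmap -o crush.old
  crushtool -d crush.old -o crush.txt   # copy and edit crush.txt into the new topology
  crushtool -c crush.txt -o crush.new
  crushtool -i crush.old --test --rule 0 --num-rep 3 --show-mappings > map.old
  crushtool -i crush.new --test --rule 0 --num-rep 3 --show-mappings > map.new
  diff map.old map.new | wc -l          # rough count of PG mappings that change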

[ceph-users] Changing crush map result in > 100% objects degraded

2025-01-21 Thread Kasper Rasmussen
Hi community

Please help me understand what is going on. I have a ceph (Reef) test cluster with the following crushmap:

ceph osd crush tree
ID  CLASS  WEIGHT  TYPE NAME
-1         12.0    root default
-7          3.0        host ksr-ceph-osd1
 0    hdd   1.0            osd.0
 6    h

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-21 Thread Frédéric Nass
Hi Tom, That's great news! The community will definitely benefit from hearing about your experience. During last week's user+dev monthly meeting (everyone can join, btw), we previewed and discussed the upcoming 'Ceph User Stories' survey, which will help feature use case studies on Ceph.io, if