[ceph-users] Re: Replace HDD with cephadm

2022-03-16 Thread Kai Stian Olstad
On 15.03.2022 10:10, Jimmy Spets wrote: > Thanks for your reply. I have two things that I am unsure of: - Is the OSD UUID the same for all OSDs or should it be unique for each? It's unique and generated when you run ceph-volume lvm prepare or add an OSD. You can find OSD UUID/FSID for existi

[ceph-users] Remove orphaned ceph volumes

2022-03-16 Thread Chris Page
Hi, We had to recreate our Ceph cluster and it seems some legacy data was left over. I think this is causing our valid OSDs to hang for 15-20 minutes before starting up on a machine reboot. When checking /var/log/ceph/ceph-volume.log I can see the following - [2022-03-08 09:32:10,581][ceph_volu

[ceph-users] Re: Keycloack with Radosgw

2022-03-16 Thread Pritha Srivastava
Hi Simone, There is a step that I see missing here - have you created a role? For creating a role, you need to attach 'roles' caps to the user that you created. Also, what tool have you used to make the AssumeRoleWithWebIdentity call? An example using boto3 is outlined in the documentation here: h

[ceph-users] Re: Keycloack with Radosgw

2022-03-16 Thread Pritha Srivastava
Please correct the trust policy with the condition element that I pointed out before. Also, can you please try using AWS tools - boto3 or AWS STS APIs - to make the AssumeRoleWithWebIdentity call. You can check the RGW log files to see whether the call reaches RGW with the curl command. Thanks, Prit

[ceph-users] Re: Remove orphaned ceph volumes

2022-03-16 Thread Chris Page
This is now resolved. I simply found the old systemd files inside /etc/systemd/system/multi-user.target.wants and disabled them, which automatically cleaned them up. Thanks! On Wed, 16 Mar 2022 at 09:30, Chris Page wrote: > Hi, > > We had to recreate our Ceph cluster and it seems some legacy dat
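A minimal sketch of how such leftover units might be located before disabling them. The post does not say which files were involved; the `ceph-volume@` name prefix and the helper name `find_ceph_volume_units` are assumptions made for illustration only.

```python
from pathlib import Path


def find_ceph_volume_units(wants_dir="/etc/systemd/system/multi-user.target.wants"):
    """Return the sorted names of entries in wants_dir that start with 'ceph-volume@'.

    Assumption: leftover OSD activation units use the 'ceph-volume@' prefix.
    The function only lists names; it does not disable or delete anything.
    """
    directory = Path(wants_dir)
    if not directory.is_dir():
        return []
    return sorted(
        entry.name for entry in directory.iterdir()
        if entry.name.startswith("ceph-volume@")
    )


if __name__ == "__main__":
    for unit in find_ceph_volume_units():
        print(unit)
```

Each listed unit would still need to be checked by hand against the OSDs that actually exist on the host before running `systemctl disable` on it.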

[ceph-users] Ceph OSD's take 10+ minutes to start on reboot

2022-03-16 Thread Chris Page
Hi, I'm having an issue on one of my nodes where all of its OSDs take a long time to come back online (between 10 and 15 minutes). In the Ceph log, it sits on: bluestore(/var/lib/ceph/osd/ceph-8) _open_db_and_around read-only:0 repair:0 Until eventually something changes which allows the start

[ceph-users] Re: Ceph OSD's take 10+ minutes to start on reboot

2022-03-16 Thread Igor Fedotov
Hi Chris, could you please raise debug-bluestore level to e.g. 5 and share a new startup log? I presume it's fsck which is running at that point but let's double check. And what Ceph version are we talking about? Thanks, Igor On 3/16/2022 4:58 PM, Chris Page wrote: Hi, I'm having an is

[ceph-users] Re: CephFS snaptrim bug?

2022-03-16 Thread Linkriver Technology
Hi, Has anyone figured out whether those "lost" snaps are rediscoverable / trimmable? All pgs in the cluster have been deep scrubbed since my previous email and I'm not seeing any of that wasted space being recovered. Regards, LRT -Original Message- From: Dan van der Ster To: technol...@li

[ceph-users] Re: Ceph OSD's take 10+ minutes to start on reboot

2022-03-16 Thread Chris Page
Thanks Igor, So I stuck the debugging up to 5 and rebooted, and suddenly the OSDs are coming back in no time again. Might this be because they were so recently rebooted? I've added the log with debug below: 2022-03-16T14:31:30.031+ 7f739fd28f00 1 bluestore(/var/lib/ceph/osd/ceph-9) _m

[ceph-users] Re: Ceph OSD's take 10+ minutes to start on reboot

2022-03-16 Thread Igor Fedotov
Chris, hmm... so I can't see any fsck output, hence that's apparently not the root cause. Curious what happened to these OSDs and/or node before the first issue appearance? Ceph upgrade? Unexpected shutdown? Anything else notable? Have you tried a single OSD restart multiple times and saw

[ceph-users] Re: Ceph OSD's take 10+ minutes to start on reboot

2022-03-16 Thread Chris Page
Hi Igor, > Curious what happened to these OSDs and/or node before the first issue appearance? Ceph upgrade? Unexpected shutdown? Anything else notable? We originally had Ceph set up with some drives that we chose not to pursue. On removing all of the drives things went a bit wayward and we ended

[ceph-users] Disable peering of some pool

2022-03-16 Thread Jan Pekař - Imatic
Hi all, we have a problem on our production cluster running Nautilus (14.2.22). The cluster is almost full and a few months ago we noticed issues with slow peering - when we restart any OSD (or host) it takes hours to finish the peering process, instead of minutes. We noticed that some pool contains 90k

[ceph-users] Re: Keycloack with Radosgw

2022-03-16 Thread Pritha Srivastava
The value of the 'aud' field in the token must be set in the Condition element, checking it against 'app_id'. There is no need to add a custom field 'app_id'. The Ceph STS APIs have been tested using standard AWS tools (boto3 and aws), so I'd suggest you use them. Thanks, Pritha On Wed, Mar 1

[ceph-users] Re: Ceph OSD's take 10+ minutes to start on reboot

2022-03-16 Thread Chris Page
Interestingly, this took a really long time (22 minutes overall). However, since that restart, OSD 10 will now restart with no problem. I've put the log below... 2022-03-16T15:26:09.181+ 7f0a2f26bf00 1 bluestore(/var/lib/ceph/osd/ceph-10) _mount path /var/lib/ceph/osd/ceph-10 2022-03-16T15:26:09.1

[ceph-users] Re: Keycloack with Radosgw

2022-03-16 Thread Pritha Srivastava
Hi Simone, The condition element will be: "StringEquals": {"mykeycloak.org.com/auth/realms/myrealm:app_id":"radosgw"} Thanks, Pritha On Wed, Mar 16, 2022 at 9:44 PM wrote: > Hi Pritha, > > > > I will test APIs with suggested tools. > > > > What is not clear to me is the aud and app_id. > > >
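A hedged sketch of how the condition element quoted above could sit inside a complete role trust policy. Only the `StringEquals` entry comes from the message; the surrounding statement layout follows the usual AWS policy shape, and the provider ARN and the helper name `build_trust_policy` are assumptions for illustration.

```python
import json


def build_trust_policy(provider_arn, provider_url, client_id):
    """Return a JSON trust policy whose condition compares '<provider_url>:app_id' to client_id."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": [provider_arn]},
                "Action": ["sts:AssumeRoleWithWebIdentity"],
                # Condition element as given in the message above.
                "Condition": {
                    "StringEquals": {f"{provider_url}:app_id": client_id}
                },
            }
        ],
    }
    return json.dumps(policy)


if __name__ == "__main__":
    # The ARN below is a hypothetical placeholder, not taken from the thread.
    print(build_trust_policy(
        "arn:aws:iam:::oidc-provider/mykeycloak.org.com/auth/realms/myrealm",
        "mykeycloak.org.com/auth/realms/myrealm",
        "radosgw",
    ))
```

The resulting string is the kind of document that would be supplied as the role's assume-role policy when the role is created, for example through boto3's IAM `create_role` call.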

[ceph-users] Managing Multiple Ceph Clusters

2022-03-16 Thread Paul Cuzner
Hi, A few of the devs have been thinking about how we could make managing multiple Ceph clusters easier. At this point we're trying to understand the requirements and problems that a multi-cluster feature needs to fix, and need your help! We've put together a short, 13-question survey: https://fo

[ceph-users] Re: Managing Multiple Ceph Clusters

2022-03-16 Thread Marc
> We've put together a short, 13 question survey; > https://forms.gle/E9cAx4f51Hq2FHQXA > FYI it is behind a sign-in wall ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: Managing Multiple Ceph Clusters

2022-03-16 Thread Paul Cuzner
Apologies - it was set to limit responses to 1 from an individual, which uses a Google account to prevent spamming the form. I've turned that off now. On Thu, Mar 17, 2022 at 11:21 AM Marc wrote: > > > We've put together a short, 13 question survey; > > https://forms.gle/E9cAx4f51Hq2FHQXA > > >

[ceph-users] Re: How often should I scrub the filesystem ?

2022-03-16 Thread Milind Changire
Chris, After you run "scrub repair" followed by a "scrub" without any issues, and if "damage ls" still shows you an error, try running "damage rm" and re-run "scrub" to see if the system still reports any damage. Please update the upstream tracker with your findings if possible. -- Milind On S