Did I say Jewel? I was too hopeful. I meant Hammer. This particular cluster is 
on Hammer :(


-Russ

________________________________
From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Russell 
Holloway <russell.hollo...@hotmail.com>
Sent: Wednesday, August 22, 2018 8:49:19 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph RGW Index Sharding In Jewel


So, I've finally journeyed deeper into the depths of Ceph and discovered a 
grand mistake that is likely the root cause of many woeful nights of blocked 
requests. To start off, I'm running Jewel, and I know that is dated and I need 
to upgrade (if anyone knows whether the upgrade is seamless even from several 
major versions behind, do let me know).


My current issue is due to an RGW bucket index. I have just discovered a 
bucket with about 12M objects in it. Sharding is not enabled on it, and the 
index lives on a spinning disk, not SSD (the journal is SSD though, so it could 
be worse?). A bad combination, as I just learned. From what I understand, in 
Jewel I could maybe update the RGW region to set a maximum shard count for 
buckets, but it also sounds like that may or may not affect my existing bucket. 
Furthermore, somewhere I saw mention that prior to Luminous, resharding needed 
to be done offline, and I haven't found any documentation on that process. 
There is also some mention of putting bucket indexes on SSD for performance and 
latency reasons, which sounds great, but I get the feeling that if I modified 
the CRUSH map to move the index pool onto SSDs, the data movement involving 
this PG would fail the same way the deep scrub does.
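
For concreteness, here is roughly what I'm looking at trying. I'm assuming the 
default index pool name (.rgw.buckets.index); "mybucket", the shard count of 
128, and rule id 2 are just stand-ins, and I'm not even sure my release ships 
the offline reshard subcommand, so treat this as a sketch:

    # How big is the bucket/index really?
    radosgw-admin bucket stats --bucket=mybucket

    # Does this build have the offline reshard subcommand at all?
    radosgw-admin --help | grep reshard

    # If it does, something like this should split the index into 128 shards
    # (offline, so writes to the bucket would need to be quiesced first)
    radosgw-admin bucket reshard --bucket=mybucket --num-shards=128

    # Default shard count, but for newly created buckets only
    # (ceph.conf on the RGW hosts): rgw_override_bucket_index_max_shards = 16

    # And if I try moving the index pool to SSDs, I assume it's just a matter
    # of pointing the pool at an SSD-only CRUSH rule (rule id 2 is hypothetical)
    ceph osd pool get .rgw.buckets.index crush_ruleset
    ceph osd pool set .rgw.buckets.index crush_ruleset 2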


Does anyone have a good reference on how I could begin to clean this bucket up 
or get it sharded while on Jewel? Again, it sounds like Luminous may just start 
resharding it automatically and fix things right up, but I feel going to 
Luminous will require more work and testing (mostly because my original 
deployment tool, Fuel 8 for OpenStack, is bound to Jewel and has no easy 
upgrade path...I'll have to sort out how to transition away from it while 
maintaining my existing nodes).


The core issue was identified when I took finer-grained control over deep 
scrubs and started triggering them manually. I eventually found that I could 
hang my entire Ceph cluster by triggering a deep scrub on a single PG, which 
happens to be the one hosting this index. The OSD hosting it becomes 
unresponsive for a very long time and starts blocking a lot of other requests, 
affecting all sorts of VMs using RBD. I could simply stop deep scrubbing this 
PG (Ceph ends up marking the OSD down, the deep scrub never completes, and 
about 30 minutes after the requests hang the cluster eventually recovers), but 
I know I need to address this bucket sizing issue and then work on upgrading 
Ceph.
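
In case it helps, this is how I've been controlling scrubs while I work around 
it (the pool name and the PG id below are placeholders for my actual ones):

    # Keep the cluster from deep scrubbing on its own for now
    ceph osd set nodeep-scrub

    # Find the PG holding the index and kick off a deep scrub by hand
    ceph pg ls-by-pool .rgw.buckets.index
    ceph pg deep-scrub 7.2a

    # Re-enable automatic deep scrubs afterwards
    ceph osd unset nodeep-scrub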


Is it doable? For what it's worth, I tried to list the index keys with rados 
and that also hung requests. I'm not quite sure how to break the bucket up at 
the application level, especially if I cannot list its contents, so I hope 
there is some route forward within Ceph...
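
Roughly what I was trying with rados was listing the omap keys on the index 
object, something like this, where .dir.<bucket_id> is the unsharded index 
object and the bucket id comes from the bucket metadata (pool and bucket names 
are again placeholders):

    # Get the bucket id from the bucket metadata
    radosgw-admin metadata get bucket:mybucket

    # List (and count) the omap keys on the single unsharded index object
    rados -p .rgw.buckets.index listomapkeys .dir.<bucket_id> | wc -l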


Thanks a bunch in advance for helping a naive Ceph operator.


-Russ
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
