Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-05 Thread Andreas Calminder
Sure thing! I noted the new and old bucket instance ids.
Back up the bucket metadata:
# radosgw-admin --cluster ceph-prod metadata get bucket:1001/large_bucket > large_bucket.metadata.bak.json
# cp large_bucket.metadata.bak.json large_bucket.metadata.patched.json
set bucket_id in large_bucket.metad

Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-05 Thread Maarten De Quick
Hi Orit, We're running Jewel, version 10.2.7. I ran the bi list with the debugging commands and this is the end of the output: 2017-07-05 08:50:19.705673 7ff3bfefe700 1 -- 10.21.4.1:0/3313807338 <== osd.3 10.21.4.

Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-05 Thread Maarten De Quick
Hi Andreas, Interesting as we are also on Jewel 10.2.7. We do care about the data in the bucket so we really need the reshard process to run properly :). Could you maybe share how you linked the bucket to the new index by hand? That would already give me some extra insight. Thanks! Regards, Maart

Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-05 Thread Andreas Calminder
Hi, I had a similar problem while resharding an oversized non-sharded bucket in Jewel (10.2.7). The bi_list exited with "ERROR: bi_list(): (4) Interrupted system call" at what seemed like the very end of the operation. I went ahead and resharded the bucket anyway and the reshard process ended the sa

Re: [ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-04 Thread Orit Wasserman
Hi Maarten,
On Tue, Jul 4, 2017 at 9:46 PM, Maarten De Quick wrote:
> Hi,
>
> Background: We're having issues with our index pool (slow requests / time
> outs causes crashing of an OSD and a recovery -> application issues). We
> know we have very big buckets (eg. bucket of 77 million objects wit

[ceph-users] Bucket resharding: "radosgw-admin bi list" ERROR

2017-07-04 Thread Maarten De Quick
Hi, Background: We're having issues with our index pool (slow requests / timeouts cause an OSD to crash and trigger a recovery -> application issues). We know we have very big buckets (e.g. a bucket of 77 million objects with only 16 shards) that need a reshard, so we were looking at the resharding proce
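For a rough sense of scale, the figures quoted above (77 million objects, 16 shards) can be turned into objects per index shard, plus a ballpark shard count for a chosen target. This is only a sketch: the target of 100,000 objects per shard is an assumption based on the commonly cited guideline, not something stated in the thread, and the function names are made up for illustration.

```python
import math


def objects_per_shard(num_objects: int, num_shards: int) -> float:
    """Average number of index entries each shard has to hold."""
    return num_objects / num_shards


def shards_needed(num_objects: int, target_per_shard: int) -> int:
    """Smallest shard count that keeps the average at or below the target."""
    return math.ceil(num_objects / target_per_shard)


# Figures from the thread: 77 million objects, 16 shards.
NUM_OBJECTS = 77_000_000
CURRENT_SHARDS = 16

# Assumption: ~100,000 objects per shard as a target (not from the thread).
TARGET_PER_SHARD = 100_000

if __name__ == "__main__":
    print(f"current load: {objects_per_shard(NUM_OBJECTS, CURRENT_SHARDS):,.0f} objects/shard")
    print(f"shards for target: {shards_needed(NUM_OBJECTS, TARGET_PER_SHARD)}")
```

With these inputs each of the 16 shards carries roughly 4.8 million index entries, which fits the slow requests described in the post. The right target for a real cluster depends on hardware and workload.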