First off, you can rename the bucket and then create a new one with the original name for the application to use. You can also unlink the bucket so it is no longer owned by the access key/user that created it. That should get your application back on its feet.
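Something along these lines (untested; the user and bucket names are placeholders, and the rename via --bucket-new-name may need a newer radosgw-admin than 12.2.x -- check "radosgw-admin bucket link --help" first):

# detach the broken bucket from the application user
radosgw-admin bucket unlink --bucket=mystery-bucket --uid=appuser

# park it under a scratch user, renaming it out of the way if supported
radosgw-admin bucket link --bucket=mystery-bucket --bucket-new-name=mystery-bucket-old --uid=scratchuser

After that, the application can recreate a fresh bucket under the original name.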
I have had very little success with bypass-gc, although I think it would be a wonderful feature if it worked. After you move the bucket to a different name, you could try a multi-threaded Python script to delete all of the objects in the bucket and then remove the bucket (rough sketch at the bottom of this mail)... Maybe. I had a bucket that still wouldn't remove after doing that, after it had already failed to delete with bypass-gc and such. At that point, though, it took up little enough space that I could just ignore it. Watch your GC queue, though, and make sure it's going down.

On Fri, Dec 8, 2017, 6:00 AM Martin Emrich <martin.emr...@empolis.com> wrote:

> Followup:
>
> I eventually gave up trying to salvage the bucket. The bucket is supposed
> to have ca. 110000 objects; every attempt at "radosgw-admin bucket check
> --fix" increased that number by another 110000, so something is very wrong.
> Also, deleting the bucket with "radosgw-admin bucket rm --purge-objects"
> failed with a "no such file or directory" error.
>
> Even the biggest shovel I found could not remove the bucket:
>
> # radosgw-admin bucket rm --bucket=XXXX --purge-objects --inconsistent-index --yes-i-really-mean-it --bypass-gc
> 2017-12-08 11:56:15.020617 7f799c326c40 -1 ERROR: could not drain handles as aio completion returned with -2
> 2017-12-08 11:56:16.879316 7f799c326c40 -1 ERROR: unable to remove bucket(2) No such file or directory
>
> As the application relies on the bucket name, which is now occupied by
> this mystery bucket, I seem to be stuck. How can I remove this bucket?
>
> Thanks
>
> Martin
>
> On 07.12.17 at 16:05, "ceph-users on behalf of Martin Emrich"
> <ceph-users-boun...@lists.ceph.com on behalf of martin.emr...@empolis.com> wrote:
>
> Hi all!
>
> Apparently, one of my buckets went wonko during automatic resharding;
> the frontend application only gets a timeout after 90s.
> After an attempt to fix the index using "radosgw-admin bucket check
> --fix", I tried to reshard it (6.3 GB of data in ca. 230000 objects).
>
> The resharding command has now been running for over an hour. There is
> no significant load on any of the 18 OSDs, on the host running
> radosgw-admin, or on any of the three radosgw hosts. The OSDs are beefy
> machines with HDDs for the data pools and SSDs for the index pools.
> Running 12.2.2.
> How long should the resharding take? For a few minutes radosgw-admin
> seemed quite busy, but now it just sits there at only a few percent of
> CPU usage.
>
> "radosgw-admin reshard list" reports an empty list. "reshard status"
> reports:
>
> [
>     {
>         "reshard_status": 1,
>         "new_bucket_instance_id": "c2ffcb0f-a9a3-4360-a9be-5edef965449a.6860125.1",
>         "num_shards": 10
>     }
> ]
>
> I have a feeling that the bucket index is still
> damaged/incomplete/inconsistent. What does the message
>
> *** NOTICE: operation will not remove old bucket index objects ***
> *** these will need to be removed manually ***
>
> mean? How can I clean up manually?
>
> Thanks,
>
> Martin
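PS: By "multi-threaded python script" I mean something along these lines -- a rough, untested boto3 sketch, where the endpoint, credentials and bucket name are all placeholders for your setup:

#!/usr/bin/env python
# Rough sketch: delete all objects in a bucket with a thread pool,
# then try to remove the bucket itself.
from concurrent.futures import ThreadPoolExecutor

import boto3

# boto3 clients (unlike resources) are thread-safe, so one client
# can be shared across the pool.
s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:7480',   # placeholder
    aws_access_key_id='ACCESS_KEY',               # placeholder
    aws_secret_access_key='SECRET_KEY',           # placeholder
)

BUCKET = 'mystery-bucket-old'                     # placeholder

def delete_batch(keys):
    # delete_objects removes up to 1000 keys per call, which matches
    # the default page size of the listing below
    s3.delete_objects(Bucket=BUCKET,
                      Delete={'Objects': [{'Key': k} for k in keys]})

paginator = s3.get_paginator('list_objects')
with ThreadPoolExecutor(max_workers=16) as pool:
    for page in paginator.paginate(Bucket=BUCKET):
        keys = [obj['Key'] for obj in page.get('Contents', [])]
        if keys:
            # errors from worker threads are dropped here for brevity;
            # a real script should collect and check the futures
            pool.submit(delete_batch, keys)

s3.delete_bucket(Bucket=BUCKET)  # may still fail if the index is broken

And for watching the GC queue, I just run something like

    radosgw-admin gc list --include-all | grep -c '"oid"'

every few minutes and check that the count keeps going down.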
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com