Everything you describe is expected behavior. I was not aware `reshard` could 
be run after the prior shards had been removed, but apparently it can, and it 
creates new, empty bucket index shards. Normally `reshard` reads the entries 
from the old shards and copies them to the new shards, but since the old shards 
no longer exist, there’s nothing to copy over. I presume the other respondents 
suggested the reshard so that the bucket could then be removed, which you 
verified.

You’re correct that the objects in the data pool still exist. To list them you 
could run `rgw-orphan-list`, which outputs the objects in the data pool that 
are not referenced by any bucket index. Note that on large clusters it can take 
a while to run. If, after reviewing the list, you have verified that the 
objects are not in use, you can then remove them via `rados` commands. 
`rgw-orphan-list` is still considered experimental, but it has successfully 
helped clean up large clusters.
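
As a rough sketch of that review-then-remove step: the helper below only 
prints the `rados rm` commands for a list you have already reviewed, so 
nothing is deleted until you run the printed lines yourself. The function 
name, the pool name `default.rgw.buckets.data`, and the file name 
`reviewed-orphans.txt` are assumptions for illustration, not values from this 
thread. Object names containing a single quote would need extra care.

```shell
#!/bin/sh
# Sketch only: pool and file names are hypothetical placeholders.

# Print a dry-run "rados rm" line for each object name in a reviewed list.
#   $1 = data pool name
#   $2 = file with one RADOS object name per line
print_rm_commands() {
    pool=$1
    list=$2
    while IFS= read -r obj; do
        [ -n "$obj" ] || continue          # skip blank lines
        printf "rados -p %s rm '%s'\n" "$pool" "$obj"
    done < "$list"
}

# Example (commented out, needs a live cluster). As far as I know the tool
# takes the data pool name as its argument and writes its findings to a file:
#   rgw-orphan-list default.rgw.buckets.data
#   print_rm_commands default.rgw.buckets.data reviewed-orphans.txt
```

Printing the commands rather than running them leaves a last chance to check 
the list before anything is removed from the data pool.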

You also asked why there’s no command to scan the data pool and recreate the 
bucket index. I think the concept would work, as all head objects include the 
bucket marker in their names. There might be some corner cases where it would 
partially fail, such as (possibly) transactional changes that were underway 
when the bucket index was purged. There is also metadata in the bucket index 
that’s not stored in the objects, so it would have to be recreated somehow. But 
no one has written such a tool yet.
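
To illustrate just the first step of that idea (finding a bucket’s head 
objects by marker), here is a minimal sketch that works on a saved list of 
RADOS object names, such as the output of `rados ls` on the data pool. It 
assumes head objects are named `<marker>_<key>` and that tail parts carry 
`__shadow_` or `__multipart_` right after the marker. That naming is my 
assumption about the layout, and the function name is hypothetical. It does 
not rebuild any index entries or metadata, and it would mishandle keys that 
themselves begin with `_shadow_` or `_multipart_`.

```shell
#!/bin/sh
# Sketch only: lists candidate head-object keys for one bucket marker.
#   $1 = bucket marker
#   $2 = file with one RADOS object name per line
head_objects_for_marker() {
    marker=$1
    list=$2
    awk -v m="$marker" '
        # Keep only names that start with "<marker>_".
        index($0, m "_") == 1 {
            rest = substr($0, length(m) + 2)   # text after "<marker>_"
            # Skip assumed tail-object prefixes; they are not head objects.
            if (index(rest, "_shadow_") == 1) next
            if (index(rest, "_multipart_") == 1) next
            print rest
        }' "$list"
}

# Example (commented out, needs a live cluster and a known marker):
#   rados -p default.rgw.buckets.data ls > all-objects.txt
#   head_objects_for_marker "<bucket marker>" all-objects.txt
```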

Eric
(he/him)

> On Feb 22, 2023, at 11:04 AM, Robert Sander <r.san...@heinlein-support.de> 
> wrote:
> 
> On 22.02.23 14:42, David Orman wrote:
>> If it's a test cluster, you could try:
>> root@ceph01:/# radosgw-admin bucket check -h |grep -A1 check-objects
>>    --check-objects           bucket check: rebuilds bucket index according to
>>                              actual objects state
> 
> After a "bi purge" a "bucket check" returns an error:
> 
> # radosgw-admin bi purge --bucket=testbucket --yes-i-really-mean-it
> # radosgw-admin bi list --bucket=testbucket
> ERROR: bi_list(): (2) No such file or directory
> # radosgw-admin bucket check --bucket=testbucket --check-objects
> 2023-02-22T16:51:11.970+0100 7fdcc6093e40  0 int 
> RGWRados::cls_bucket_list_ordered(const DoutPrefixProvider*, RGWBucketInfo&, 
> int, const rgw_obj_index_key&, const string&, const string&, uint32_t, bool, 
> uint16_t, RGWRados::ent_map_t&, bool*, bool*, rgw_obj_index_key*, 
> optional_yield, RGWBucketListNameFilter): CLSRGWIssueBucketList for 
> :testbucket[471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1]) failed
> 
> Adding --fix does not change anything.
> 
> I can still download the one S3 object I put in the bucket
> because I know its name, but:
> 
> # s3cmd ls s3://testbucket/
> ERROR: S3 error: 404 (NoSuchKey)
> 
> A "bucket reshard" recreates index objects:
> 
> # radosgw-admin bucket reshard --bucket=testbucket --num-shards=12
> tenant:
> bucket name: testbucket
> old bucket instance id: 471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1
> new bucket instance id: 471f26a3-ff89-4b02-911a-0c89e2e295fa.105128491.1
> total entries: 0
> 2023-02-22T16:58:34.496+0100 7f52360dce40  1 execute INFO: reshard of bucket 
> "testbucket" from 
> "testbucket:471f26a3-ff89-4b02-911a-0c89e2e295fa.104944180.1" to 
> "testbucket:471f26a3-ff89-4b02-911a-0c89e2e295fa.105128491.1" completed 
> successfully
> 
> After that "bucket check" runs without error but cannot
> fix the situation:
> 
> # radosgw-admin bucket check --bucket=testbucket --check-objects --fix
> []
> {}
> {
>    "existing_header": {
>        "usage": {}
>    },
>    "calculated_header": {
>        "usage": {}
>    }
> }
> 
> "s3cmd ls s3://testbucket/" shows nothing.
> 
> "s3cmd rb s3://testbucket/" removes the bucket but the RADOS
> objects of the S3 objects remain in the data pool.
> 
> Regards
> -- 
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
> 
> https://www.heinlein-support.de
> 
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
> 
> Amtsgericht Berlin-Charlottenburg - HRB 220009 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
