Hi Robin,

Sorry for the late reply.

So I went through most entries and searched for errors.

- no unfinished multipart uploads
- no missing files that would belong to multipart uploads
- all objects "seem" to be healthy

Everything was fine from the data structure point of view. Then I ran bucket
check --check-objects --fix, bucket check olh, and bucket check unlinked, none
of which produced any changes or reported any problems.

But when I ran the rgw-orphan-objects check again, the output was
absolutely normal, just as I would expect. No looping, no 1.6 billion entries.
Nothing.

So I'll shrug it off for now and call it fixed :)

Thanks anyway for your input.

- Boris

On Fri, May 16, 2025 at 17:13, Robin H. Johnson <
robb...@gentoo.org> wrote:

> On Thu, May 15, 2025 at 02:06:49PM +0200, Boris wrote:
> > Hi,
> > I am in the process of checking for orphan objects, and the radoslist list
> > grew to over 300GB while the rados ls list was only 50GB.
> >
> > After some investigation I saw that one bucket stood out with a lot of
> > objects in the radoslist part.
> >
> > I did some sorting on the output file and just used the marker part to
> > check if something is wrong:
> > These are the last two lines after sorting:
> > 16145185 ff7a8b0c-07e6-463a-861b-78f0adeba8ad.2297644274.205
> > 1882589930 ff7a8b0c-07e6-463a-861b-78f0adeba8ad.1805769661.3772
> >
> > The top bucket has roughly the same number of objects in the index, but
> > the other bucket only has 17 million objects in the index.
> >
> > I stopped the search for orphan objects and pulled a fresh `radosgw-admin
> > bucket radoslist --bucket BUCKET > file`. Then I sorted that list, ran
> > uniq -c against it, and sorted the result again.
> >
> > The unique list is now 34 million objects long; 32026 entries appear more
> > than once, and there are entries that appear 68849 times.
> >
> >   68849 ff7a8b0c-07e6-463a-861b-78f0adeba8ad.1805769661.3772__shadow_server/download/XXX/YYYY.2~k9UuEdtUkAHkikXz1ONc8UzgwKryYco.6_1
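
The sort and uniq -c step described above can also be sketched in Python. This is only an illustration, assuming the radoslist output holds one RADOS object name per line; the function name `count_duplicates` and the sample names are made up and do not come from the cluster in this thread.

```python
from collections import Counter

def count_duplicates(lines):
    """Return (name, count) pairs for names listed more than once."""
    # Skip blank lines; every other line is treated as one object name.
    counts = Counter(line.strip() for line in lines if line.strip())
    # most_common() orders by count, highest first.
    return [(name, n) for name, n in counts.most_common() if n > 1]

# Hypothetical sample standing in for the radoslist output file.
sample = [
    "marker.1__shadow_key.2~abc.6_1",
    "marker.1__shadow_key.2~abc.6_1",
    "marker.1__shadow_key.2~abc.6_1",
    "marker.1_other",
]
duplicates = count_duplicates(sample)
print(duplicates)
```

With a real file, `count_duplicates(open("file"))` would do the same thing. The counter keeps every unique name in memory, so for a list of this size sort plus uniq -c is probably the gentler choice.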
>
> This is one piece inside an incomplete multipart upload. It should be
> visible with:
> aws s3api list-multipart-uploads --bucket ... --prefix server/download/XXX/YYYY
>
> Specifically, the ID of 2~k9UuEdtUkAHkikXz1ONc8UzgwKryYco should be
> present in the output.
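
Checking for the ID in a long listing by eye is error-prone, so here is a small sketch that does the lookup. It assumes the JSON output of `aws s3api list-multipart-uploads` was saved as text and has the usual top-level `Uploads` array with an `UploadId` field per entry; the function name `upload_id_listed` and the sample JSON are invented for illustration.

```python
import json

def upload_id_listed(cli_json, upload_id):
    """Return True if upload_id appears in list-multipart-uploads JSON."""
    # Treat empty output (no uploads at all) as an empty result.
    data = json.loads(cli_json) if cli_json.strip() else {}
    return any(u.get("UploadId") == upload_id for u in data.get("Uploads", []))

# Hypothetical output with a single in-progress upload.
sample_json = """
{
  "Uploads": [
    {"Key": "server/download/XXX/YYYY",
     "UploadId": "2~k9UuEdtUkAHkikXz1ONc8UzgwKryYco"}
  ]
}
"""
found = upload_id_listed(sample_json, "2~k9UuEdtUkAHkikXz1ONc8UzgwKryYco")
print(found)
```

If this returns False for the full listing, the ID is not a known in-progress upload.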
>
> You'll need to send a command to abort the MPU.
>
> If the ID is *NOT* present, then keep reading.
>
> Caveats:
> 1. s3cmd has a long-standing bug: it does not paginate the listing of
>    multipart uploads. Use AWS-CLI instead.
>
> 2. RGW leaks MPU pieces.
>    If you list all of the RADOS objects with the prefix of
>
> "ff7a8b0c-07e6-463a-861b-78f0adeba8ad.1805769661.3772__shadow_server/download/XXX/YYYY.2~k9UuEdtUkAHkikXz1ONc8UzgwKryYco"
>
> There should be:
> - meta object
> - parts
> - optional: additional stripes of parts
>
> Experience has shown me many cases where the .meta and SOME parts &
> stripes are removed, but there are dangling pieces still.
>
> That's where:
> 1. orphan cleanup comes in.
> 2. Future (public) tooling is needed to do MPU-specific cleanup.
>    At previous employers we wrote specific tooling to do this: do a raw
>    listing of RADOS objects with a prefix, and scan for MPU parts
>    where the .meta was gone.
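
A rough sketch of such a scan, not the tooling mentioned above. It assumes object names follow the pattern seen in this thread, that is `<marker>__shadow_<key>.<upload id>.<part>_<stripe>`, plus `<marker>__multipart_<key>.<upload id>.<part>` for parts and a `.meta` suffix for the meta object. That naming is an assumption and should be checked against a real listing first. As far as I know the .meta object can live in a different pool than the data, so the input would have to cover both pools. The name `find_dangling_uploads` and the sample names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming: <marker>__(multipart|shadow)_<key>.<upload id>.<tail>
MPU_NAME = re.compile(
    r"^(?P<marker>.+?)__(?:multipart|shadow)_(?P<key>.+)"
    r"\.(?P<upload>2~[^.]+)\.(?P<tail>meta|\d+(?:_\d+)?)$"
)

def find_dangling_uploads(names):
    """Map (marker, key, upload id) to leftover pieces that have no .meta."""
    groups = defaultdict(lambda: {"meta": False, "pieces": []})
    for raw in names:
        name = raw.strip()
        m = MPU_NAME.match(name)
        if not m:
            continue  # not an MPU-related object under the assumed naming
        group = groups[(m["marker"], m["key"], m["upload"])]
        if m["tail"] == "meta":
            group["meta"] = True
        else:
            group["pieces"].append(name)
    # Report only uploads whose meta object is missing from the listing.
    return {k: g["pieces"] for k, g in groups.items()
            if not g["meta"] and g["pieces"]}

# Hypothetical listing: upload 2~AAA still has its meta, 2~BBB does not.
listing = [
    "m.1__multipart_a/b.2~AAA.meta",
    "m.1__multipart_a/b.2~AAA.1",
    "m.1__shadow_a/b.2~AAA.1_1",
    "m.1__multipart_c/d.2~BBB.2",
    "m.1__shadow_c/d.2~BBB.2_1",
    "m.1_plainobject",
]
dangling = find_dangling_uploads(listing)
print(dangling)
```

Anything reported this way is only a candidate; it should be cross-checked against list-multipart-uploads before removing any object.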
>
> > How can I clean that up? Or is this just a bug in the radosgw-admin
> > bucket radoslist?
> > The bucket has 401 shards and got "versioned: true" and
> > "versioning_enabled: true"
> Do check for versioned objects as well, but your case above is not one
> of them.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
> E-Mail   : robb...@gentoo.org
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.