Hi Yehuda,

I've run the "radosgw-admin orphans find" command again, and this time I
captured its output.
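
For reference, the run was along these lines - the pool name and job id here
are placeholders for the actual values I used:

  radosgw-admin orphans find --pool=<data pool> --job-id=<job id> 2>&1 | tee orphans-find.log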

Both "shadow" files and "multipart" files are detected as leaked:

leaked: default.34461213.1__multipart_data/d2a14aeb-a384-51b1-8704-fe76a9a6f5f5.-j0vqDrC0wr44bii2ytrtpcrlnspSyE.44
leaked: default.34461213.1__multipart_data/dda8a18a-1d99-50b7-a397-b876811bdf94.Mpjqa2RwKirM9Ae_HTsttGiJEHAvdxc.113
leaked: default.34461213.1__multipart_data/dda8a18a-1d99-50b7-a397-b876811bdf94.bRtExqhbdw_J0gcT-xADwUBQJFOjKmG.111
leaked: default.34461213.1__multipart_data/f080cdc1-0826-5ac9-a9f7-21700ceeebf3.BqR5aBMpJDmO1U5xKSrEe3EvPmEHNq8.96
leaked: default.34461213.1__multipart_data/f793a1ec-5c7d-5b59-845a-9d280325bb25.LcCH4ia_LwWV4MyVzwhTv_PAxrpSPpM.52
leaked: default.34461213.1__multipart_data/f793a1ec-5c7d-5b59-845a-9d280325bb25.gbrMfo0bWww2nEe2x4LL146BwtLMkA6.37
leaked: default.34461213.1__multipart_data/f793a1ec-5c7d-5b59-845a-9d280325bb25.rIZlATigZEwP6FVW66m-YhcmgIiJihM.48
leaked: default.34461213.1__shadow_data/0181bbd5-4202-57a0-a1f3-07007d043660.2~gy8NGkx7YmMzHwv8_ordh7u_TNk7_4c.1_1
leaked: default.34461213.1__shadow_data/0181bbd5-4202-57a0-a1f3-07007d043660.2~gy8NGkx7YmMzHwv8_ordh7u_TNk7_4c.1_2
leaked: default.34461213.1__shadow_data/0181bbd5-4202-57a0-a1f3-07007d043660.2~gy8NGkx7YmMzHwv8_ordh7u_TNk7_4c.1_3
leaked: default.34461213.1__shadow_data/0181bbd5-4202-57a0-a1f3-07007d043660.2~gy8NGkx7YmMzHwv8_ordh7u_TNk7_4c.1_4
leaked: default.34461213.1__shadow_data/0181bbd5-4202-57a0-a1f3-07007d043660.2~gy8NGkx7YmMzHwv8_ordh7u_TNk7_4c.1_5
leaked: default.34461213.1__shadow_data/0181bbd5-4202-57a0-a1f3-07007d043660.2~gy8NGkx7YmMzHwv8_ordh7u_TNk7_4c.1_6
leaked: default.34461213.1__shadow_data/02aca392-6d6b-536c-ae17-fdffe164e05a.2~lXP-3WDlbF5MSPYuE7JHLNK1z1hr1Y4.100_1
leaked: default.34461213.1__shadow_data/02aca392-6d6b-536c-ae17-fdffe164e05a.2~lXP-3WDlbF5MSPYuE7JHLNK1z1hr1Y4.100_10
leaked: default.34461213.1__shadow_data/02aca392-6d6b-536c-ae17-fdffe164e05a.2~lXP-3WDlbF5MSPYuE7JHLNK1z1hr1Y4.100_11

For one of the S3 objects I deleted both the multipart and the shadow files
reported as leaked, and after that the object could no longer be retrieved.

For another S3 object I deleted only the shadow files reported as leaked, and
that object could no longer be retrieved either.
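
By "couldn't be retrieved" I mean that a plain GET through the S3 API now
fails for those objects; I was checking roughly like this (bucket and key are
placeholders):

  s3cmd get s3://<bucket>/<object-key> /tmp/check-object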

So it looks like the "radosgw-admin orphans find" command still doesn't work
as expected. Is there anything else I can do?
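
For the next batch I plan to follow your earlier suggestion and back up each
object to a separate pool before removing it, roughly like this (the backup
pool name and pg count are just examples, and I'm assuming the data pool is
.rgw.buckets; <oid> is one of the names from the list above):

  ceph osd pool create .rgw.buckets.backup 64
  rados -p .rgw.buckets --target-pool=.rgw.buckets.backup cp <oid> <oid>
  rados -p .rgw.buckets rm <oid>

That way, if an object breaks again, I can copy it back:

  rados -p .rgw.buckets.backup --target-pool=.rgw.buckets cp <oid> <oid>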

Thank you,
George



On Fri, Feb 24, 2017 at 1:22 PM, Yehuda Sadeh-Weinraub <yeh...@redhat.com>
wrote:

> oid is the object id. The orphan find command generates a list of objects
> that need to be removed at the end of the run (if it finishes
> successfully). If you didn't catch that, you should still be able to
> run the same scan (using the same scan id) and retrieve that info
> again.
>
> Yehuda
>
> On Fri, Feb 24, 2017 at 9:48 AM, George Mihaiescu <lmihaie...@gmail.com>
> wrote:
> > Hi Yehuda,
> >
> > Thank you for the quick reply.
> >
> > What is the <oid> you're referring to that I should back up and then
> > delete? I extracted the files from the ".log" pool where the "orphan
> > find" tool stored the results, but they are zero-byte files.
> >
> >
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.rados.52
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.rados.58
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 obj_delete_at_hint.0000000122
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 obj_delete_at_hint.0000000057
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.bck1.rados.53
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.buckets.20
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.buckets.25
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.bck1.rados.0
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.rados.2
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.linked.19
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.rados.38
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 obj_delete_at_hint.0000000018
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 obj_delete_at_hint.0000000092
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 obj_delete_at_hint.0000000108
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.bck1.rados.13
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.linked.20
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.rados.18
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.bck1.rados.11
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.rados.50
> > -rw-r--r-- 1 root root 0 Feb 24 12:45 orphan.scan.orphans.buckets.33
> >
> >
> > George
> >
> >
> >
> > On Fri, Feb 24, 2017 at 12:12 PM, Yehuda Sadeh-Weinraub <yeh...@redhat.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> We wanted to have more confidence in the orphans search tool before
> >> providing functionality that actually removes the objects. One thing
> >> you can do is create a new pool, copy these objects to the new pool
> >> as a backup (rados -p <source-pool> --target-pool=<target-pool>
> >> cp <oid> <oid>), and then remove them (rados -p <pool> rm <oid>).
> >> Once you're confident that this didn't break existing objects, you
> >> can remove the backup pool.
> >>
> >> Yehuda
> >>
> >> On Fri, Feb 24, 2017 at 8:23 AM, George Mihaiescu <lmihaie...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > I updated http://tracker.ceph.com/issues/18331 with my own issue, and I
> >> > am hoping Orit or Yehuda could give their opinion on what to do next.
> >> > What is the purpose of the "orphan find" tool, and how do I actually
> >> > clean up these files?
> >> >
> >> > Thank you,
> >> > George
> >> >
> >> >
> >> > On Fri, Jan 13, 2017 at 2:22 PM, Wido den Hollander <w...@42on.com>
> >> > wrote:
> >> >>
> >> >>
> >> >> > On 24 December 2016 at 13:47, Wido den Hollander <w...@42on.com> wrote:
> >> >> >
> >> >> >
> >> >> >
> >> >> > > On 23 December 2016 at 16:05, Wido den Hollander <w...@42on.com> wrote:
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > > > On 22 December 2016 at 19:00, Orit Wasserman <owass...@redhat.com> wrote:
> >> >> > > >
> >> >> > > >
> >> >> > > > Hi Marius,
> >> >> > > >
> >> >> > > > On Thu, Dec 22, 2016 at 12:00 PM, Marius Vaitiekunas
> >> >> > > > <mariusvaitieku...@gmail.com> wrote:
> >> >> > > > > On Thu, Dec 22, 2016 at 11:58 AM, Marius Vaitiekunas
> >> >> > > > > <mariusvaitieku...@gmail.com> wrote:
> >> >> > > > >>
> >> >> > > > >> Hi,
> >> >> > > > >>
> >> >> > > > >> 1) I've written to the mailing list before, but one more time:
> >> >> > > > >> we've had big issues recently with rgw on jewel because of
> >> >> > > > >> leaked data - the rate is about 50GB/hour.
> >> >> > > > >>
> >> >> > > > >> We've hit these bugs:
> >> >> > > > >> rgw: fix put_acls for objects starting and ending with
> >> >> > > > >> underscore (issue#17625, pr#11669, Orit Wasserman)
> >> >> > > > >>
> >> >> > > > >> Upgraded to jewel 10.2.5 - no luck.
> >> >> > > > >>
> >> >> > > > >> We've also hit this one:
> >> >> > > > >> rgw: RGW loses realm/period/zonegroup/zone data: period
> >> >> > > > >> overwritten if somewhere in the cluster is still running
> >> >> > > > >> Hammer (issue#17371, pr#11519, Orit Wasserman)
> >> >> > > > >>
> >> >> > > > >> Fixed zonemaps - also no luck.
> >> >> > > > >>
> >> >> > > > >> We do not use multisite - only the default realm, zonegroup
> >> >> > > > >> and zone.
> >> >> > > > >>
> >> >> > > > >> We have no more ideas how this data leak could happen. GC is
> >> >> > > > >> working - we can see it in the rgw logs.
> >> >> > > > >>
> >> >> > > > >> Maybe someone could give us a hint about this? Where should
> >> >> > > > >> we look?
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> 2) Another story is about removing all the leaked/orphan
> >> >> > > > >> objects. radosgw-admin orphans find enters an endless loop at
> >> >> > > > >> the stage when it starts linking objects.
> >> >> > > > >>
> >> >> > > > >> We've tried changing the number of shards to 16, 64 (the
> >> >> > > > >> default) and 512. At the moment it's running with 1 shard.
> >> >> > > > >>
> >> >> > > > >> Again, any ideas how to make the orphan search finish successfully?
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > > >> I can provide any logs, configs, etc. if someone is ready to
> >> >> > > > >> help with this case.
> >> >> > > > >>
> >> >> > > > >>
> >> >> > > >
> >> >> > > > How many buckets do you have? How many objects in each?
> >> >> > > > Can you provide the output of rados ls -p .rgw.buckets?
> >> >> > >
> >> >> > > Marius asked me to look into this for him, so I did.
> >> >> > >
> >> >> > > What I found is that at *least* three buckets have way more RADOS
> >> >> > > objects than they should.
> >> >> > >
> >> >> > > The .rgw.buckets pool has 35,651,590 objects totaling 76,880 GB.
> >> >> > >
> >> >> > > I listed all objects in the .rgw.buckets pool and summed them per
> >> >> > > bucket, the top 5:
> >> >> > >
> >> >> > >  783844 default.25918901.102486
> >> >> > >  876013 default.25918901.3
> >> >> > > 3325825 default.24201682.7
> >> >> > > 6324217 default.84795862.29891
> >> >> > > 7805208 default.25933378.233873
> >> >> > >
> >> >> > > So I started to rados_stat() (using Python) all the objects in the
> >> >> > > last three buckets. While these stat() calls are still running, I
> >> >> > > have statted about 30% of the objects and their total size is
> >> >> > > already 17511GB/17TB.
> >> >> > >
> >> >> > > size_kb_actual summed over buckets default.24201682.7,
> >> >> > > default.84795862.29891 and default.25933378.233873 comes to 12TB.
> >> >> > >
> >> >> > > So I'm currently at 30% of statting the objects and I'm already
> >> >> > > 5TB over the total size of these buckets.
> >> >> > >
> >> >> >
> >> >> > The stat calls have finished. The grand total is 65TB.
> >> >> >
> >> >> > So while the buckets should consume only 12TB, they seem to occupy
> >> >> > 65TB of storage.
> >> >> >
> >> >> > > What I noticed is that it's mainly *shadow* objects which are all
> >> >> > > 4MB
> >> >> > > in size.
> >> >> > >
> >> >> > > I know that 'radosgw-admin orphans find --pool=.rgw.buckets
> >> >> > > --job-id=xyz' should also do this for me, but as mentioned, this
> >> >> > > keeps
> >> >> > > looping and hangs.
> >> >> > >
> >> >> >
> >> >> > I started this tool about 20 hours ago:
> >> >> >
> >> >> > # radosgw-admin orphans find --pool=.rgw.buckets --job-id=wido1
> >> >> > --debug-rados=10 2>&1|gzip > orphans.find.wido1.log.gz
> >> >> >
> >> >> > It now shows me this in the logs while it is still running:
> >> >> >
> >> >> > 2016-12-24 13:41:00.989876 7ff6844d29c0 10 librados: omap-set-vals oid=orphan.scan.wido1.linked.27 nspace=
> >> >> > 2016-12-24 13:41:00.993271 7ff6844d29c0 10 librados: Objecter returned from omap-set-vals r=0
> >> >> > storing 2 entries at orphan.scan.wido1.linked.28
> >> >> > 2016-12-24 13:41:00.993311 7ff6844d29c0 10 librados: omap-set-vals oid=orphan.scan.wido1.linked.28 nspace=
> >> >> > storing 1 entries at orphan.scan.wido1.linked.31
> >> >> > 2016-12-24 13:41:00.995698 7ff6844d29c0 10 librados: Objecter returned from omap-set-vals r=0
> >> >> > 2016-12-24 13:41:00.995787 7ff6844d29c0 10 librados: omap-set-vals oid=orphan.scan.wido1.linked.31 nspace=
> >> >> > storing 1 entries at orphan.scan.wido1.linked.33
> >> >> > 2016-12-24 13:41:00.997730 7ff6844d29c0 10 librados: Objecter returned from omap-set-vals r=0
> >> >> > 2016-12-24 13:41:00.997776 7ff6844d29c0 10 librados: omap-set-vals oid=orphan.scan.wido1.linked.33 nspace=
> >> >> > 2016-12-24 13:41:01.000161 7ff6844d29c0 10 librados: Objecter returned from omap-set-vals r=0
> >> >> > storing 1 entries at orphan.scan.wido1.linked.35
> >> >> > 2016-12-24 13:41:01.000225 7ff6844d29c0 10 librados: omap-set-vals oid=orphan.scan.wido1.linked.35 nspace=
> >> >> > 2016-12-24 13:41:01.002102 7ff6844d29c0 10 librados: Objecter returned from omap-set-vals r=0
> >> >> > storing 1 entries at orphan.scan.wido1.linked.36
> >> >> > 2016-12-24 13:41:01.002167 7ff6844d29c0 10 librados: omap-set-vals oid=orphan.scan.wido1.linked.36 nspace=
> >> >> > storing 1 entries at orphan.scan.wido1.linked.39
> >> >> > 2016-12-24 13:41:01.004397 7ff6844d29c0 10 librados: Objecter returned from omap-set-vals r=0
> >> >> >
> >> >> > It seems to still be doing something, is that correct?
> >> >> >
> >> >>
> >> >> Giving this thread a gentle bump.
> >> >>
> >> >> There is an issue in the tracker for this:
> >> >> http://tracker.ceph.com/issues/18331
> >> >>
> >> >> In addition there is the issue that the orphan search stays in an
> >> >> endless loop: http://tracker.ceph.com/issues/18258
> >> >>
> >> >> This has been discussed multiple times on the ML but I never saw it
> >> >> getting resolved.
> >> >>
> >> >> Any ideas?
> >> >>
> >> >> Wido
> >> >>
> >> >> > Wido
> >> >> >
> >> >> > > So for now I'll probably resort to figuring out which RADOS
> >> >> > > objects are obsolete by matching against the bucket's index, but
> >> >> > > that's a lot of manual work.
> >> >> > >
> >> >> > > I'd rather fix the orphans find, so I will probably run that with
> >> >> > > high
> >> >> > > logging enabled so we can have some interesting information.
> >> >> > >
> >> >> > > In the meantime, any hints or suggestions?
> >> >> > >
> >> >> > > The cluster is running v10.2.5 btw.
> >> >> > >
> >> >> > > >
> >> >> > > > Orit
> >> >> > > >
> >> >> > > > >
> >> >> > > > > Sorry, I forgot to mention that we've registered two issues in
> >> >> > > > > the tracker:
> >> >> > > > > http://tracker.ceph.com/issues/18331
> >> >> > > > > http://tracker.ceph.com/issues/18258
> >> >> > > > >
> >> >> > > > > --
> >> >> > > > > Marius Vaitiekūnas
> >> >> > > > >
> >> >
> >> >
> >> >
> >> >
> >
> >
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
