I had the exact same error when using --bypass-gc. We too decided to destroy this realm and start fresh. For us, 95% of the data in this realm is backups for other systems, and those teams are fine rebuilding it. So our plan is to migrate the remaining 5% of the data to a temporary S3 location and then rebuild the realm with brand-new pools, a fresh GC, and new settings. I can add this realm to the list of test candidates for figuring out options. It's running Jewel 10.2.7.
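For the migration piece, a rough sketch of what we have in mind, using rclone. The remote and bucket names below are placeholders, not our real ones:

    # Copy the ~5% we need to keep out to a temporary S3 target, then verify
    # it before the realm is destroyed.  "oldrealm" and "temps3" are rclone
    # remotes pointing at the current RGW endpoint and the temporary S3
    # location; "keep-me" is a stand-in bucket name.
    rclone copy oldrealm:keep-me temps3:keep-me --checksum --transfers 16
    rclone check oldrealm:keep-me temps3:keep-me --one-way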
On Fri, Oct 27, 2017 at 11:26 AM Bryan Stillwell <bstillw...@godaddy.com> wrote:
> On Wed, Oct 25, 2017 at 4:02 PM, Yehuda Sadeh-Weinraub <yeh...@redhat.com> wrote:
> >
> > On Wed, Oct 25, 2017 at 2:32 PM, Bryan Stillwell <bstillw...@godaddy.com> wrote:
> > > That helps a little bit, but overall the process would take years at this rate:
> > >
> > > # for i in {1..3600}; do ceph df -f json-pretty |grep -A7 '".rgw.buckets"' |grep objects; sleep 60; done
> > >     "objects": 1660775838
> > >     "objects": 1660775733
> > >     "objects": 1660775548
> > >     "objects": 1660774825
> > >     "objects": 1660774790
> > >     "objects": 1660774735
> > >
> > > This is on a hammer cluster. Would upgrading to Jewel or Luminous speed up this process at all?
> >
> > I'm not sure it's going to help much, although the omap performance
> > might improve there. The big problem is that the omaps are just too
> > big, so that every operation on them takes considerable time. I think
> > the best way forward there is to take a list of all the rados objects
> > that need to be removed from the gc omaps, and then get rid of the gc
> > objects themselves (newer ones will be created, this time using the
> > new configurable). Then remove the objects manually (and concurrently)
> > using the rados command line tool.
> > The one problem I see here is that even just removing objects with
> > large omaps can affect the availability of the osds that hold these
> > objects. I discussed that now with Josh, and we think the best way to
> > deal with that is not to remove the gc objects immediately, but to
> > rename the gc pool and create a new one (with an appropriate number of
> > pgs). This way new gc entries will go into the new gc pool (with a
> > higher number of gc shards), and you don't need to remove the old gc
> > objects (thus no osd availability problem). Then you can start
> > trimming the old gc objects (in the old renamed pool) using the rados
> > command. It'll take a very, very long time, but the process should
> > pick up speed slowly as the objects shrink.
>
> That's fine for us. We'll be tearing down this cluster in a few weeks
> and adding the nodes to the new cluster we created. I just wanted to
> explore other options now that we can use it as a test cluster.
>
> The solution you described with renaming the .rgw.gc pool and creating a
> new one is pretty interesting.
> I'll have to give that a try, but until then I've been trying to remove
> some of the other buckets with the --bypass-gc option and it keeps dying
> with output like this:
>
> # radosgw-admin bucket rm --bucket=sg2pl5000 --purge-objects --bypass-gc
> 2017-10-27 08:00:00.865993 7f2b387228c0 0 RGWObjManifest::operator++(): result: ofs=1488744 stripe_ofs=1488744 part_ofs=0 rule->part_size=0
> 2017-10-27 08:00:04.385875 7f2b387228c0 0 RGWObjManifest::operator++(): result: ofs=673900 stripe_ofs=673900 part_ofs=0 rule->part_size=0
> 2017-10-27 08:00:04.517241 7f2b387228c0 0 RGWObjManifest::operator++(): result: ofs=1179224 stripe_ofs=1179224 part_ofs=0 rule->part_size=0
> 2017-10-27 08:00:05.791876 7f2b387228c0 0 RGWObjManifest::operator++(): result: ofs=566620 stripe_ofs=566620 part_ofs=0 rule->part_size=0
> 2017-10-27 08:00:26.815081 7f2b387228c0 0 RGWObjManifest::operator++(): result: ofs=1090645 stripe_ofs=1090645 part_ofs=0 rule->part_size=0
> 2017-10-27 08:00:46.757556 7f2b387228c0 0 RGWObjManifest::operator++(): result: ofs=1488744 stripe_ofs=1488744 part_ofs=0 rule->part_size=0
> 2017-10-27 08:00:47.093813 7f2b387228c0 -1 ERROR: could not drain handles as aio completion returned with -2
>
> I can typically make further progress by running it again:
>
> # radosgw-admin bucket rm --bucket=sg2pl5000 --purge-objects --bypass-gc
> 2017-10-27 08:20:57.310859 7fae9c3d48c0 0 RGWObjManifest::operator++(): result: ofs=673900 stripe_ofs=673900 part_ofs=0 rule->part_size=0
> 2017-10-27 08:20:57.406684 7fae9c3d48c0 0 RGWObjManifest::operator++(): result: ofs=1179224 stripe_ofs=1179224 part_ofs=0 rule->part_size=0
> 2017-10-27 08:20:57.808050 7fae9c3d48c0 -1 ERROR: could not drain handles as aio completion returned with -2
>
> and again:
>
> # radosgw-admin bucket rm --bucket=sg2pl5000 --purge-objects --bypass-gc
> 2017-10-27 08:22:04.992578 7ff8071038c0 0 RGWObjManifest::operator++(): result: ofs=566620 stripe_ofs=566620 part_ofs=0 rule->part_size=0
> 2017-10-27 08:22:05.726485 7ff8071038c0 -1 ERROR: could not drain handles as aio completion returned with -2
>
> What does this error mean, and is there any way to keep it from dying
> like this? This cluster is running 0.94.10, but I can upgrade it to jewel
> pretty easily if you would like.
>
> Thanks,
> Bryan
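For what it's worth, since each re-run of the --bypass-gc removal gets a bit further before dying (and -2 is ENOENT), one way to keep chipping away at a bucket is to just retry until it completes. A minimal sketch, assuming radosgw-admin exits non-zero when it hits that aio drain error, and using Bryan's bucket name as the example:

    # Re-run the bucket removal until it finishes cleanly; each attempt makes
    # further progress before the aio drain error aborts it.
    until radosgw-admin bucket rm --bucket=sg2pl5000 --purge-objects --bypass-gc; do
        echo "bucket rm aborted, retrying in 10s..." >&2
        sleep 10
    done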
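And here is roughly how I read Yehuda's pool-swap suggestion above in command form. This is an untested sketch: the pool names assume a default Hammer/Jewel zone (.rgw.gc for garbage collection, .rgw.buckets for data), the pg count and parallelism are only examples, I'm guessing "the new configurable" means rgw_gc_max_objs, and the grep/awk extraction of oids is an assumption about the gc list output layout, so check it against your own dump first:

    # 1) Dump the pending gc entries before swapping pools, so we know which
    #    rados objects still need deleting.
    radosgw-admin gc list --include-all > gc-entries.json

    # 2) Move the existing gc pool aside and create a fresh one; new gc
    #    entries land in the new pool.  Raise rgw_gc_max_objs in ceph.conf and
    #    restart the radosgw daemons so the new gc omaps are sharded more widely.
    ceph osd pool rename .rgw.gc .rgw.gc.old
    ceph osd pool create .rgw.gc 64 64

    # 3) Remove the leaked data objects listed in the dump, concurrently,
    #    with the rados command line tool.
    grep '"oid"' gc-entries.json | awk -F'"' '{print $4}' | sort -u > oids.txt
    xargs -P 8 -n 1 rados -p .rgw.buckets rm < oids.txt

    # 4) Slowly shrink the old gc shard objects by trimming their omap keys
    #    in small batches (gc.0 shown; repeat for gc.1, gc.2, ...).
    rados -p .rgw.gc.old listomapkeys gc.0 | head -n 1000 | \
        while read key; do rados -p .rgw.gc.old rmomapkey gc.0 "$key"; done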