Re: [ceph-users] OSD Restart results in "unfound objects"

Diego Castro Wed, 01 Jun 2016 13:49:10 -0700

Hello Samuel, I'm a bit afraid of restarting my OSDs again, so I'll wait
until the weekend to push the config.
BTW, I just unset the sortbitwise flag.
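
For reference, a minimal sketch of the requested debug levels as a ceph.conf fragment. Placing them under the [osd] section is an assumption here (the thread only lists the three settings), chosen so that they apply to every OSD daemon that reads the file.

```ini
# Assumed placement: the [osd] section applies to all OSD daemons on the node.
[osd]
# Debug levels requested in the thread
debug osd = 20
debug ms = 1
debug filestore = 20
```

These levels are very verbose, so the OSD logs will grow quickly while they are set; it is probably best to revert them once an unfound object has been captured.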



---
Diego Castro / The CloudFather
GetupCloud.com - Eliminamos a Gravidade

2016-06-01 13:39 GMT-03:00 Samuel Just <sj...@redhat.com>:

> Can either of you reproduce with logs?  That would make it a lot
> easier to track down if it's a bug.  I'd want
>
> debug osd = 20
> debug ms = 1
> debug filestore = 20
>
> On all of the osds for a particular pg from when it is clean until it
> develops an unfound object.
> -Sam
>
> On Wed, Jun 1, 2016 at 5:36 AM, Diego Castro
> <diego.cas...@getupcloud.com> wrote:
> > Hello Uwe, I also have the sortbitwise flag enabled and I see exactly the
> > same behavior as you.
> > Perhaps this is also the root of my issues. Does anybody know if it is
> > safe to disable it?
> >
> >
> > ---
> > Diego Castro / The CloudFather
> > GetupCloud.com - Eliminamos a Gravidade
> >
> > 2016-06-01 7:17 GMT-03:00 Uwe Mesecke <u...@mesecke.net>:
> >>
> >>
> >> > On 01.06.2016 at 10:25, Diego Castro
> >> > <diego.cas...@getupcloud.com> wrote:
> >> >
> >> > Hello, I have a cluster running Jewel 10.2.0, 25 OSDs + 4 Mons.
> >> > Today my cluster suddenly went unhealthy with lots of stuck PGs due to
> >> > unfound objects. There were no disk failures or node crashes; it just
> >> > went bad.
> >> >
> >> > I managed to get the cluster back to a healthy state by marking the
> >> > lost objects for deletion with "ceph pg <id> mark_unfound_lost delete".
> >> > I still have no idea why the cluster went bad, but I noticed that
> >> > restarting the OSD daemons to unblock stuck clients made the cluster
> >> > unhealthy again, with PGs stuck once more due to unfound objects.
> >> >
> >> > Does anyone have this issue?
> >>
> >> Hi,
> >>
> >> I also ran into that problem after upgrading to jewel. In my case I was
> >> able to somewhat correlate this behavior with setting the sortbitwise
> >> flag after the upgrade. When the flag is set, after some time these
> >> unfound objects are popping up. Restarting osds just makes it worse
> >> and/or makes these problems appear faster. When looking at the missing
> >> objects I can see that sometimes even region or zone configuration
> >> objects for radosgw are missing which I know are there because the
> >> radosgw was using these just before.
> >>
> >> After unsetting the sortbitwise flag, the PGs go back to normal, all
> >> previously unfound objects are found and the cluster becomes healthy
> >> again.
> >>
> >> Of course I’m not sure whether this is the real root of the problem or
> >> just a coincidence but I can reproduce this behavior every time.
> >>
> >> So for now the cluster is running without this flag. :-/
> >>
> >> Regards,
> >> Uwe
> >>
> >> >
> >> > ---
> >> > Diego Castro / The CloudFather
> >> > GetupCloud.com - Eliminamos a Gravidade
> >> > _______________________________________________
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>

