On Tue, Nov 17, 2015 at 7:27 AM, Joao Eduardo Luis <j...@suse.de> wrote:

> On 11/17/2015 03:56 AM, Jose Tavares wrote:
> > The problem is that I think I don't have any good monitor anymore.
> > How do I know if the map I am trying is ok?
> >
> > I also saw in the logs that the primary mon was trying to contact a
> > removed mon at IP .112 .. So, I added .112 again ... and it didn't help.
> >
> > Attached are the logs of what is going on and some monmaps that I
> > captured minutes before the cluster became inaccessible ..
> >
> > Should I try injecting these monmaps into my primary mon to see if it
> > can recover the cluster?
> > Is it possible to check whether these monmaps match my content?
>
> Without access to the actual store.db there's no way to ascertain if the
> store has any problems, and even then figuring out a potential
> corruption from just one monitor store.db would either be impossible or
> impractical.
>

I posted my store.db in my previous answer ..



>
> That said, from the log you attached it seems you only have issues with
> authentication: you have pgmaps from epoch 91923 through to 92589, you
> have an mds map (epoch 38), osdmaps at least through epoch 307, and 40
> versions for the auth keys.
>
> Somehow, though, your monitors are unable to authenticate each other. No
> way to tell if that was corruption or user error.
>
> You should be able to get your monitors back to speaking terms again
> simply by disabling cephx temporarily. Then you can figure out whatever
> you need to figure out in terms of monitor keys.
>
> Just update your ceph.conf with 'auth supported = none' and restart the
> monitors. See how it goes from there.
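>
> A minimal sketch of that ceph.conf change, assuming the option goes in
> the [global] section so that every daemon reading the file picks it up.
> The three commented 'auth ... required' lines are an assumption added
> for illustration: they are the newer per-scope spellings of the same
> setting and were not part of the suggestion above.
>
>     [global]
>     # temporarily disable cephx; revert once the monitor keys are fixed
>     auth supported = none
>     # newer equivalents of 'auth supported':
>     # auth cluster required = none
>     # auth service required = none
>     # auth client required = none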
>

I tried your suggestion and it didn't make any change to the results .. :(

Thanks a lot.
Jose Tavares



> HTH
>
>   -Joao
>
>
>
> >
> > Thanks a lot.
> > Jose Tavares
> >
> >
> >
> >
> >
> > On Mon, Nov 16, 2015 at 10:48 PM, Nathan Harper
> > <nathan.har...@cfms.org.uk <mailto:nathan.har...@cfms.org.uk>> wrote:
> >
> >     I had to go through a similar process when we had a disaster which
> >     destroyed one of our monitors.   I followed the process here:
> >     REMOVING MONITORS FROM AN UNHEALTHY CLUSTER
> >     <http://docs.ceph.com/docs/hammer/rados/operations/add-or-rm-mons/>
> >     to remove all but one monitor, which let me bring the cluster back up.
> >
> >     As you are running an older version of Ceph than hammer, some of the
> >     commands might differ (perhaps this might help:
> >     http://docs.ceph.com/docs/v0.80/rados/operations/add-or-rm-mons/)
> >
> >
> >     --
> >     *Nathan Harper*// IT Systems Architect
> >
> >     *e: * nathan.har...@cfms.org.uk <mailto:nathan.har...@cfms.org.uk>
> >     // *t: * 0117 906 1104 // *m: * 07875 510891 // *w: *
> >     www.cfms.org.uk <http://www.cfms.org.uk>
> >     CFMS Services Ltd// Bristol & Bath Science Park // Dirac Crescent //
> >     Emersons Green // Bristol // BS16 7FR
> >
> >     CFMS Services Ltd is registered in England and Wales No 05742022 - a
> >     subsidiary of CFMS Ltd
> >     CFMS Services Ltd registered office // Victoria House // 51 Victoria
> >     Street // Bristol // BS1 6AD
> >
> >     On 16 November 2015 at 16:50, Jose Tavares <j...@terra.com.br
> >     <mailto:j...@terra.com.br>> wrote:
> >
> >         Hi guys ...
> >         I need some help as my cluster seems to be corrupted.
> >
> >         I saw here ..
> >
> >         https://www.mail-archive.com/ceph-users@lists.ceph.com/msg01919.html
> >         .. a msg from 2013 where Peter had a problem with his monitors.
> >
> >         I had the same problem today when trying to add a new monitor,
> >         and then playing with the monmap as the monitors were not entering
> >         the quorum. I'm using version 0.80.8.
> >
> >         Right now my cluster won't start because of a corrupted monitor.
> >         Is it possible to remove all monitors and create just a new one
> >         without losing data? I have ~260GB of data with work from 2
> >         weeks.
> >
> >         What should I do? Do you recommend any specific procedure?
> >
> >         Thanks a lot.
> >         Jose Tavares
> >
> >         _______________________________________________
> >         ceph-users mailing list
> >         ceph-users@lists.ceph.com <mailto:ceph-users@lists.ceph.com>
> >         http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> >
> >
> >
> >
>
>