Hi Karan,

Thanks for your reply. I have spent some time on this and finally found the problems behind the issue:

1) If I reboot any of the nodes, the OSD service does not start when the node comes back up, because /var/lib/ceph/osd/ceph-0 is not mounted. I manually edited /etc/fstab and added a mount entry for the OSD storage, e.g.

    UUID=142136cd-8325-44a7-ad67-80fe19ed3873 /var/lib/ceph/osd/ceph-0 xfs defaults,noatime

The above fixed the issue. Now the questions: is this a valid approach, and why does Ceph not activate the OSD drive by itself on reboot?
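For reference, a rough way to sanity-check the new fstab entry without another full reboot would be something like this (the device name /dev/sdb1 is only a placeholder for the actual OSD data partition):

    # Confirm the filesystem UUID of the OSD data partition
    # (/dev/sdb1 is a placeholder -- use the real device here)
    sudo blkid /dev/sdb1

    # Mount everything listed in /etc/fstab and check the OSD path
    sudo mount -a
    mount | grep /var/lib/ceph/osd/ceph-0
    ls /var/lib/ceph/osd/ceph-0   # should show the OSD data files (keyring, whoami, current/, ...)

    # Then start the OSD daemon again
    sudo /etc/init.d/ceph start osd.0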
2) After fixing the above issue I rebooted all of the nodes again, and this time there is another warning:

    health HEALTH_WARN clock skew detected on mon.vms2

Here is the output:

    health HEALTH_WARN clock skew detected on mon.vms2
    monmap e1: 2 mons at {vms1=192.168.1.128:6789/0,vms2=192.168.1.129:6789/0}, election epoch 14, quorum 0,1 vms1,vms2
    mdsmap e11: 1/1/1 up {0=vms1=up:active}
    osdmap e36: 3 osds: 3 up, 3 in

My current setup is 3 OSDs, 2 mons and 1 MDS.
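As a first step for the clock skew warning, a rough NTP check on both monitor nodes could look like the following (the service name ntp vs. ntpd depends on the distribution, and pool.ntp.org is only an example time source):

    # Check whether ntpd actually has a synced peer
    # (look for a peer marked with '*' in the first column)
    ntpq -p

    # If the clocks are far apart, force a one-off resync
    # (stop ntpd first so ntpdate can use the NTP port)
    sudo service ntp stop
    sudo ntpdate pool.ntp.org
    sudo service ntp start

    # Re-check the cluster health afterwards
    ceph health detail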
Br.

Umar

On Tue, Dec 17, 2013 at 2:54 PM, Karan Singh <ksi...@csc.fi> wrote:

> Umar
>
> *Ceph is stable for production*; there are a large number of Ceph
> clusters deployed and running smoothly in production, and countless more
> in testing / pre-production.
>
> Since you are facing problems with your Ceph testing, it does not mean
> Ceph is unstable.
>
> I would suggest putting some time into troubleshooting your problem.
>
> What I see from your logs:
>
> 1) You have 2 mons; that's a problem (either have 1, or have 3 to form a
> quorum). Add 1 more monitor node.
> 2) Out of 2 OSDs, only 1 is IN; check where the other one is and try
> bringing both of them UP. Add a few more OSDs to clear the health
> warning; 2 is a very small number of OSDs.
>
> Many Thanks
> Karan Singh
>
> ------------------------------
> From: "Umar Draz" <unix...@gmail.com>
> To: ceph-us...@ceph.com
> Sent: Tuesday, 17 December, 2013 8:51:27 AM
> Subject: [ceph-users] After reboot nothing worked
>
> Hello,
>
> I have a 2-node Ceph cluster. I rebooted both hosts just to test whether
> the cluster keeps working after a reboot, and the result was that the
> cluster was unable to start.
>
> Here is the ceph -s output:
>
>     health HEALTH_WARN 704 pgs stale; 704 pgs stuck stale; mds cluster is degraded; 1/1 in osds are down; clock skew detected on mon.kvm2
>     monmap e2: 2 mons at {kvm1=192.168.214.10:6789/0,kvm2=192.168.214.11:6789/0}, election epoch 16, quorum 0,1 kvm1,kvm2
>     mdsmap e13: 1/1/1 up {0=kvm1=up:replay}
>     osdmap e29: 2 osds: 0 up, 1 in
>     pgmap v68: 704 pgs, 4 pools, 9603 bytes data, 23 objects
>           1062 MB used, 80816 MB / 81879 MB avail
>                704 stale+active+clean
>
> According to this useless documentation:
>
> http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/
>
> I tried ceph osd tree, and the output was:
>
>     # id    weight    type name      up/down  reweight
>     -1      0.16      root default
>     -2      0.07999       host kvm1
>     0       0.07999           osd.0  down     1
>     -3      0.07999       host kvm2
>     1       0.07999           osd.1  down     0
>
> Then I tried
>
>     sudo /etc/init.d/ceph -a start osd.0
>     sudo /etc/init.d/ceph -a start osd.1
>
> to start the OSDs on both hosts; the result was
>
>     /etc/init.d/ceph: osd.0 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
>     /etc/init.d/ceph: osd.1 not found (/etc/ceph/ceph.conf defines , /var/lib/ceph defines )
>
> Now the question is: what is this? Is Ceph really stable? Can we use it
> in a production environment?
>
> Both of my hosts have NTP running and the time is up to date.
>
> Br.
>
> Umar
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Umar Draz
Network Architect
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com