Thanks for the prompt reply. The OSDs are set up on dedicated devices, and the mappings are in /etc/fstab. mount shows:
/dev/rssda on /var/lib/ceph/osd/ceph-0 type xfs (rw)

and similar on all other nodes.

Thx,
dk

On Mon, Mar 31, 2014 at 1:12 PM, Gregory Farnum <g...@inktank.com> wrote:
> Well, you killed them as part of the reboot... they should have
> restarted automatically when the system turned on, but that will
> depend on your configuration and how they were set up. (E.g., if they
> are each getting a dedicated hard drive, make sure the system knows
> the drive is present.)
> What version of the software are you running?
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Mon, Mar 31, 2014 at 1:00 PM, Dan Koren <d...@daterainc.com> wrote:
> > Hi Greg,
> > Thanks for the prompt response.
> > Sure enough, I do see all the OSDs are now down.
> > However, I do not understand the meaning of the
> > sentence about killing the OSDs. This was an OS-level
> > reboot of the entire cluster, without issuing any
> > ceph commands either before or after the restart.
> > Doesn't Ceph recover transparently to the same
> > state it was in before the cluster rebooted?
> > Thx,
> > dk
> >
> > On Mon, Mar 31, 2014 at 12:47 PM, Gregory Farnum <g...@inktank.com> wrote:
> >>
> >> If you wait longer, you should see the remaining OSDs get marked down.
> >> We detect down OSDs in two ways:
> >> 1) OSDs heartbeat each other frequently and issue reports when the
> >> heartbeat responses take too long. (This is the main way.)
> >> 2) OSDs periodically send statistics to the monitors, and if these
> >> statistics do not arrive for a *very* long time (roughly 15 minutes,
> >> by default) the monitor will mark the OSD down.
> >>
> >> It looks like when restarting, you did it so that the first OSD was
> >> marked down by the other OSDs in their timeframe (about 30 seconds),
> >> but you killed the others quickly enough that they were never marked
> >> down by their peers.
> >> -Greg
> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
> >>
> >>
> >> On Mon, Mar 31, 2014 at 12:44 PM, Dan Koren <d...@daterainc.com> wrote:
> >> > On a 4 node cluster (admin + 3 mon/osd nodes) I see the following
> >> > shortly after rebooting the cluster and waiting for a couple of minutes:
> >> >
> >> > root@rts23:~# ps -ef | grep ceph && ceph osd tree
> >> > root      4183     1  0 12:09 ?        00:00:00 /usr/bin/ceph-mon --cluster=ceph -i rts23 -f
> >> > root      5771  5640  0 12:30 pts/0    00:00:00 grep --color=auto ceph
> >> > # id    weight  type name       up/down reweight
> >> > -1      0.94    root default
> >> > -2      0.31            host rts22
> >> > 0       0.31                    osd.0   down    0
> >> > -3      0.31            host rts21
> >> > 1       0.31                    osd.1   up      1
> >> > -4      0.32            host rts23
> >> > 2       0.32                    osd.2   up      1
> >> >
> >> > It seems rather odd that ceph reports 2 OSDs up while ps does not show
> >> > any OSD daemons running (ceph osd tree output is the same on all 4 nodes).
> >> >
> >> > ceph status shows:
> >> >
> >> > root@rts23:~# ceph status
> >> >     cluster 6149cebd-b619-4709-9fec-00fd8bc210a3
> >> >      health HEALTH_WARN 192 pgs degraded; 192 pgs stale; 192 pgs stuck stale;
> >> >             192 pgs stuck unclean; recovery 10242/20484 objects degraded (50.000%);
> >> >             2/2 in osds are down; clock skew detected on mon.rts23
> >> >      monmap e1: 3 mons at {rts21=172.29.0.21:6789/0,rts22=172.29.0.22:6789/0,rts23=172.29.0.23:6789/0},
> >> >             election epoch 12, quorum 0,1,2 rts21,rts22,rts23
> >> >      osdmap e25: 3 osds: 0 up, 2 in
> >> >       pgmap v445: 192 pgs, 3 pools, 40960 MB data, 10242 objects
> >> >             10305 MB used, 641 GB / 651 GB avail
> >> >             10242/20484 objects degraded (50.000%)
> >> >                  192 stale+active+degraded
> >> >
> >> > How can OSDs be "up" when no OSD daemons are running in the cluster?
> >> >
> >> > MTIA,
> >> >
> >> > dk
> >> >
> >> > Dan Koren
> >> > Director of Software
> >> > DATERA | 650.210.7910 | @dateranews
> >> > d...@datera.io
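For reference, if anyone hits the same symptom (OSD daemons gone after a full-cluster reboot, but ceph osd tree still showing OSDs up), the practical fix is simply to start the OSD daemons again on each node. A minimal sketch, assuming sysvinit-style service scripts and the hosts, device, and OSD IDs shown above; on Ubuntu with upstart the equivalent would be "start ceph-osd id=N", and deployments activated via ceph-disk/udev may restart differently:

    # On each OSD node: confirm the data device is mounted, then start that node's OSD.
    root@rts22:~# mount | grep /var/lib/ceph/osd     # expect /dev/rssda on /var/lib/ceph/osd/ceph-0
    root@rts22:~# service ceph start osd.0           # or: /etc/init.d/ceph start osd.0
    root@rts22:~# ps -ef | grep ceph-osd             # verify the daemon actually came up

    # From any node: watch the stale PGs recover.
    root@rts23:~# ceph osd tree
    root@rts23:~# ceph -w

If the dead OSDs are simply left alone, the monitors should eventually mark them down on their own once the reporting timeout Greg mentions (roughly 15 minutes by default) expires; starting the daemons is just the faster path back to a healthy cluster.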