Hello David,

Thanks for the update.
http://tracker.ceph.com/issues/13833#note-7 - According to this tracker, the partition type GUID may differ, which prevents udev from chowning the device to ceph. We are using the procedure below to create the OSDs:

#sgdisk -Z /dev/sdb
#ceph-disk prepare --bluestore --cluster ceph --cluster-uuid <fsid> /dev/vdb
#ceph-disk --verbose activate /dev/vdb1

Here you can see that all the devices have the same GUID:

#for i in b c d ; do /usr/sbin/blkid -o udev -p /dev/vd$i\1 | grep ID_PART_ENTRY_TYPE; done
ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-062c0ceff05d
ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-062c0ceff05d
ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-062c0ceff05d

Currently we are facing an issue with OSD activation at boot, which leaves the OSD journal device mounted like this:

~~~
/dev/sdh1 /var/lib/ceph/tmp/mnt.EayTmL
~~~

At the same time, the OSD log shows that osd.2 cannot find the mounted journal device, so it lands in a failed state:

~~~
May 26 15:40:39 cn1 ceph-osd: 2017-05-26 15:40:39.978072 7f1dc3bc2940 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-2: (2) No such file or directory
May 26 15:40:39 cn1 systemd: ceph-osd@2.service: main process exited, code=exited, status=1/FAILURE
May 26 15:40:39 cn1 systemd: Unit ceph-osd@2.service entered failed state.
May 26 15:40:39 cn1 systemd: ceph-osd@2.service failed.
~~~

To fix this problem, we are applying the workaround below.

Unmount the temporary mount point:

#umount /var/lib/ceph/tmp/mnt.om4Lbq

Mount the device at the directory for its OSD number:

#mount /dev/sdb1 /var/lib/ceph/osd/ceph-2

Then start the OSD:

#systemctl start ceph-osd@2.service

We also notice the services below failing at the same time:

===
systemctl --failed
UNIT                                LOAD      ACTIVE SUB    DESCRIPTION
● var-lib-ceph-tmp-mnt.UiCYFu.mount not-found failed failed var-lib-ceph-tmp-mnt.UiCYFu.mount
● ceph-disk@dev-sdc1.service        loaded    failed failed Ceph disk activation: /dev/sdc1
● ceph-disk@dev-sdd1.service        loaded    failed failed Ceph disk activation: /dev/sdd1
● ceph-disk@dev-sdd2.service        loaded    failed failed Ceph disk activation: /dev/sdd2
===
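Following David's suggestion, this is roughly what we plan to check next. It is only a sketch: for illustration we assume here that osd.2's journal/block partition is /dev/sdh2 -- the device and OSD id have to be adjusted to the actual layout.

~~~
#/usr/sbin/blkid -o udev -p /dev/sdh2 | grep ID_PART_ENTRY_TYPE   # partition type GUID that the ceph udev rules key on
#ls -l /dev/sdh2                                                  # should show ceph:ceph once udev has handled the device
#chown ceph:ceph /dev/sdh2                                        # manual test only (assumes sdh2 belongs to osd.2); does not persist across reboots
#systemctl start ceph-osd@2.service
~~~

If the OSD still fails after the chown, we can also run the daemon in the foreground instead of through systemd to capture its output directly, e.g. ceph-osd -f --cluster ceph --id 2 --setuser ceph --setgroup ceph.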
We need your suggestion on how to proceed further.

Thanks
Jayaram

On Tue, Jun 13, 2017 at 7:30 PM, David Turner <drakonst...@gmail.com> wrote:

> I came across this a few times. My problem was with journals I set up by
> myself. I didn't give them the proper GUID partition type ID so the udev
> rules didn't know how to make sure the partition looked correct. What the
> udev rules were unable to do was chown the journal block device as
> ceph:ceph so that it could be opened by the Ceph user. You can test by
> chowning the journal block device and try to start the OSD again.
>
> Alternatively if you want to see more information, you can start the
> daemon manually as opposed to starting it through systemd and see what its
> output looks like.
>
> On Tue, Jun 13, 2017 at 6:32 AM nokia ceph <nokiacephus...@gmail.com> wrote:
>
>> Hello,
>>
>> Some OSDs are not getting activated after a reboot, which leaves those
>> particular OSDs in a failed state.
>>
>> Here you can see that the mount points were not updated to the osd-num
>> and the devices were mounted at an incorrect mount point, so osd.<num>
>> cannot mount/activate the OSDs.
>>
>> Env: RHEL 7.2 - EC 4+1, v11.2.0 bluestore.
>>
>> #grep mnt proc/mounts
>> /dev/sdh1 /var/lib/ceph/tmp/mnt.om4Lbq xfs rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota 0 0
>> /dev/sdh1 /var/lib/ceph/tmp/mnt.EayTmL xfs rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota 0 0
>>
>> From /var/log/messages:
>>
>> --
>> May 26 15:39:58 cn1 systemd: Starting Ceph disk activation: /dev/sdh2...
>> May 26 15:39:58 cn1 systemd: Starting Ceph disk activation: /dev/sdh1...
>>
>> May 26 15:39:58 cn1 systemd: *start request repeated too quickly for* ceph-disk@dev-sdh2.service => suspecting this could be the root cause.
>> May 26 15:39:58 cn1 systemd: Failed to start Ceph disk activation: /dev/sdh2.
>> May 26 15:39:58 cn1 systemd: Unit ceph-disk@dev-sdh2.service entered failed state.
>> May 26 15:39:58 cn1 systemd: ceph-disk@dev-sdh2.service failed.
>> May 26 15:39:58 cn1 systemd: start request repeated too quickly for ceph-disk@dev-sdh1.service
>> May 26 15:39:58 cn1 systemd: Failed to start Ceph disk activation: /dev/sdh1.
>> May 26 15:39:58 cn1 systemd: Unit ceph-disk@dev-sdh1.service entered failed state.
>> May 26 15:39:58 cn1 systemd: ceph-disk@dev-sdh1.service failed.
>> --
>>
>> This issue occurs only intermittently after a reboot.
>>
>> Note: we did not face this problem on Jewel.
>>
>> Awaiting your comments.
>>
>> Thanks
>> Jayaram
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
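One more note on the "start request repeated too quickly" lines in the quoted log: that message means systemd's start-rate limiter tripped because the ceph-disk@ unit was started too many times in a short window, and the unit then stays failed until the counter is cleared. A minimal sketch of a manual retry, using the unit names from the quoted log (adjust them to your devices):

~~~
#systemctl reset-failed ceph-disk@dev-sdh1.service ceph-disk@dev-sdh2.service   # clear the failed state and the start-rate counter
#systemctl start ceph-disk@dev-sdh1.service
#journalctl -u ceph-disk@dev-sdh1.service                                       # inspect the output of the activation attempt
~~~

Alternatively, running ceph-disk --verbose activate /dev/sdh1 by hand, as in the prepare/activate procedure at the top of this mail, bypasses the unit entirely and shows the activation output without the rate limiter getting in the way.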