Hi,
ceph-deploy 1.5.3 can cause trouble if a reboot is done between the
preparation and the activation of an OSD:
The OSD disk was /dev/sdb at that time; the OSD data was to go to sdb1
(formatted as btrfs, per the --fs-type option below) and the journal to sdb2.
I prepared an OSD:
root@bd-a:/etc/ceph# ceph-deploy -v --overwrite-conf osd --fs-type btrfs
prepare bd-1:/dev/sdb1:/dev/sdb2
[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.3): /usr/bin/ceph-deploy -v
--overwrite-conf osd --fs-type btrfs prepare bd-1:/dev/sdb1:/dev/sdb2
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks
bd-1:/dev/sdb1:/dev/sdb2
[bd-1][DEBUG ] connected to host: bd-1
[bd-1][DEBUG ] detect platform information from remote host
[bd-1][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] Deploying osd to bd-1
[bd-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[bd-1][INFO ] Running command: udevadm trigger --subsystem-match=block
--action=add
[ceph_deploy.osd][DEBUG ] Preparing host bd-1 disk /dev/sdb1 journal
/dev/sdb2 activate False
[bd-1][INFO ] Running command: ceph-disk-prepare --fs-type btrfs
--cluster ceph -- /dev/sdb1 /dev/sdb2
[bd-1][DEBUG ]
[bd-1][DEBUG ] WARNING! - Btrfs v3.12 IS EXPERIMENTAL
[bd-1][DEBUG ] WARNING! - see http://btrfs.wiki.kernel.org before using
[bd-1][DEBUG ]
[bd-1][DEBUG ] fs created label (null) on /dev/sdb1
[bd-1][DEBUG ] nodesize 32768 leafsize 32768 sectorsize 4096 size 19.99TiB
[bd-1][DEBUG ] Btrfs v3.12
[bd-1][WARNIN] WARNING:ceph-disk:OSD will not be hot-swappable if
journal is not the same device as the osd data
[bd-1][WARNIN] Turning ON incompat feature 'extref': increased hardlink
limit per file to 65536
[bd-1][WARNIN] Error: Partition(s) 1 on /dev/sdb1 have been written, but
we have been unable to inform the kernel of the change, probably because
it/they are in use. As a result, the old partition(s) will remain in
use. You should reboot now before making further changes.
[bd-1][INFO ] checking OSD status...
[bd-1][INFO ] Running command: ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host bd-1 is now ready for osd use.
Unhandled exception in thread started by
sys.excepthook is missing
lost sys.stderr
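As far as I know, the kernel can sometimes be told to re-read the partition
table without a reboot, as long as the partitions on the disk are not mounted
or otherwise in use. Something like this on bd-1 might have avoided the
reboot (untested here, just a sketch):

  # ask the kernel to re-read the partition table of /dev/sdb
  partprobe /dev/sdb
  # or, with util-linux:
  partx -u /dev/sdb
  # or:
  blockdev --rereadpt /dev/sdb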
ceph-deploy told me to do a reboot, so I did.
After the reboot the OSD disk had changed from sdb to sda. This is a known
issue on Linux (Ubuntu): /dev/sdX names are not stable across reboots.
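One way to avoid the renaming problem would be to address the disk by a
persistent name instead of /dev/sdX, assuming ceph-deploy/ceph-disk accept
arbitrary block-device paths (the <disk-id> below is just a placeholder for
whatever udev shows on bd-1):

  # list persistent names that stay the same across reboots
  ls -l /dev/disk/by-id/
  # then use those paths instead of /dev/sdb1 and /dev/sdb2, e.g.:
  # ceph-deploy osd prepare bd-1:/dev/disk/by-id/<disk-id>-part1:/dev/disk/by-id/<disk-id>-part2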
root@bd-a:/etc/ceph# ceph-deploy -v osd activate bd-1:/dev/sda1:/dev/sda2
[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.3): /usr/bin/ceph-deploy -v osd
activate bd-1:/dev/sda1:/dev/sda2
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks
bd-1:/dev/sda1:/dev/sda2
[bd-1][DEBUG ] connected to host: bd-1
[bd-1][DEBUG ] detect platform information from remote host
[bd-1][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] activating host bd-1 disk /dev/sda1
[ceph_deploy.osd][DEBUG ] will use init type: upstart
[bd-1][INFO ] Running command: ceph-disk-activate --mark-init upstart
--mount /dev/sda1
[bd-1][WARNIN] got monmap epoch 1
[bd-1][WARNIN] HDIO_DRIVE_CMD(identify) failed: Invalid argument
[bd-1][WARNIN] 2014-06-10 11:45:07.222697 7f5c111af800 -1 journal check:
ondisk fsid c8ce6ee2-f21b-4ba3-a20e-649224244b9a doesn't match expected
fcaaf66f-b7b7-4702-83a4-54832b7131fa, invalid (someone else's?) journal
[bd-1][WARNIN] HDIO_DRIVE_CMD(identify) failed: Invalid argument
[bd-1][WARNIN] HDIO_DRIVE_CMD(identify) failed: Invalid argument
[bd-1][WARNIN] HDIO_DRIVE_CMD(identify) failed: Invalid argument
[bd-1][WARNIN] 2014-06-10 11:45:08.125384 7f5c111af800 -1
filestore(/var/lib/ceph/tmp/mnt.LryOxo) could not find
23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
[bd-1][WARNIN] 2014-06-10 11:45:08.320327 7f5c111af800 -1 created object
store /var/lib/ceph/tmp/mnt.LryOxo journal
/var/lib/ceph/tmp/mnt.LryOxo/journal for osd.4 fsid
08066b4a-3f36-4e3f-bd1e-15c006a09057
[bd-1][WARNIN] 2014-06-10 11:45:08.320367 7f5c111af800 -1 auth: error
reading file: /var/lib/ceph/tmp/mnt.LryOxo/keyring: can't open
/var/lib/ceph/tmp/mnt.LryOxo/keyring: (2) No such file or directory
[bd-1][WARNIN] 2014-06-10 11:45:08.320419 7f5c111af800 -1 created new
key in keyring /var/lib/ceph/tmp/mnt.LryOxo/keyring
[bd-1][WARNIN] added key for osd.4
[bd-1][INFO ] checking OSD status...
[bd-1][INFO ] Running command: ceph --cluster=ceph osd stat --format=json
[bd-1][WARNIN] there are 2 OSDs down
[bd-1][WARNIN] there are 2 OSDs out
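The "invalid (someone else's?) journal" message above suggests that after the
rename the data partition was paired with a journal partition that did not
belong to it. A way to check this, assuming the usual ceph-disk filestore
layout with a journal symlink in the OSD data directory (the paths below are
my assumption, not taken from the logs):

  # on bd-1: see which device the journal symlink of osd.4 points at
  ls -l /var/lib/ceph/osd/ceph-4/journal
  # and compare the UUIDs of the candidate journal partitions
  blkid /dev/sda2 /dev/sdb2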
root@bd-a:/etc/ceph# ceph -s
cluster 08066b4a-3f36-4e3f-bd1e-15c006a09057
health HEALTH_WARN 679 pgs degraded; 992 pgs stuck unclean;
recovery 19/60 objects degraded (31.667%); clock skew detected on mon.bd-1
monmap e1: 3 mons at
{bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0},
election epoch 4034, quorum 0,1,2 bd-0,bd-1,bd-2
mdsmap e2815: 1/1/1 up {0=bd-2=up:active}, 2 up:standby
osdmap e1717: 6 osds: 4 up, 4 in
pgmap v46008: 992 pgs, 11 pools, 544 kB data, 20 objects
10324 MB used, 125 TB / 125 TB avail
19/60 objects degraded (31.667%)
2 active
679 active+degraded
311 active+remapped
root@bd-a:/etc/ceph# ceph osd tree
# id weight type name up/down reweight
-1 189.1 root default
-2 63.63 host bd-0
0 43.64 osd.0 up 1
3 19.99 osd.3 up 1
-3 63.63 host bd-1
1 43.64 osd.1 down 0
4 19.99 osd.4 down 0
-4 61.81 host bd-2
2 43.64 osd.2 up 1
5 18.17 osd.5 up 1
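Before rebooting again, it should also be possible to try starting the down
OSDs by hand and look at their logs; on Ubuntu 14.04 the OSDs are managed by
upstart (the ids below are the ones shown as down in the tree; consider this
a sketch):

  # on bd-1: try to start the down OSDs via upstart
  start ceph-osd id=1
  start ceph-osd id=4
  # and check why they do not come up
  tail -n 50 /var/log/ceph/ceph-osd.1.log /var/log/ceph/ceph-osd.4.log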
At this point I rebooted bd-1 once more, and the OSD disk was /dev/sdb again.
So I tried once more to activate the OSD:
root@bd-a:/etc/ceph# ceph-deploy -v osd activate bd-1:/dev/sdb1:/dev/sdb2
[ceph_deploy.conf][DEBUG ] found configuration file at:
/root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.3): /usr/bin/ceph-deploy -v osd
activate bd-1:/dev/sdb1:/dev/sdb2
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks
bd-1:/dev/sdb1:/dev/sdb2
[bd-1][DEBUG ] connected to host: bd-1
[bd-1][DEBUG ] detect platform information from remote host
[bd-1][DEBUG ] detect machine type
[ceph_deploy.osd][INFO ] Distro info: Ubuntu 14.04 trusty
[ceph_deploy.osd][DEBUG ] activating host bd-1 disk /dev/sdb1
[ceph_deploy.osd][DEBUG ] will use init type: upstart
[bd-1][INFO ] Running command: ceph-disk-activate --mark-init upstart
--mount /dev/sdb1
[bd-1][INFO ] checking OSD status...
[bd-1][INFO ] Running command: ceph --cluster=ceph osd stat --format=json
[bd-1][WARNIN] there are 2 OSDs down
[bd-1][WARNIN] there are 2 OSDs out
root@bd-a:/etc/ceph# ceph osd tree
# id weight type name up/down reweight
-1 189.1 root default
-2 63.63 host bd-0
0 43.64 osd.0 up 1
3 19.99 osd.3 up 1
-3 63.63 host bd-1
1 43.64 osd.1 down 0
4 19.99 osd.4 down 0
-4 61.81 host bd-2
2 43.64 osd.2 up 1
5 18.17 osd.5 up 1
root@bd-a:/etc/ceph# ceph -s
cluster 08066b4a-3f36-4e3f-bd1e-15c006a09057
health HEALTH_WARN 679 pgs degraded; 992 pgs stuck unclean;
recovery 10/60 objects degraded (16.667%); clock skew detected on mon.bd-1
monmap e1: 3 mons at
{bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0},
election epoch 4060, quorum 0,1,2 bd-0,bd-1,bd-2
mdsmap e2823: 1/1/1 up {0=bd-2=up:active}, 2 up:standby
osdmap e1759: 6 osds: 4 up, 4 in
pgmap v46110: 992 pgs, 11 pools, 544 kB data, 20 objects
10320 MB used, 125 TB / 125 TB avail
10/60 objects degraded (16.667%)
679 active+degraded
313 active+remapped
root@bd-a:/etc/ceph#
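At this point it would also have been useful to check how ceph-disk itself
sees the partitions on bd-1 (I believe ceph-disk list is available in this
release; again just a sketch):

  # show which partitions ceph-disk recognizes as OSD data / journal
  ceph-disk list
  # and confirm which OSD directories are actually mounted
  mount | grep /var/lib/ceph/osd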
After another reboot everything was OK:
ceph -s
cluster 08066b4a-3f36-4e3f-bd1e-15c006a09057
health HEALTH_OK
monmap e1: 3 mons at
{bd-0=xxx.xxx.xxx.20:6789/0,bd-1=xxx.xxx.xxx.21:6789/0,bd-2=xxx.xxx.xxx.22:6789/0},
election epoch 4220, quorum 0,1,2 bd-0,bd-1,bd-2
mdsmap e2895: 1/1/1 up {0=bd-2=up:active}, 2 up:standby
osdmap e1939: 6 osds: 6 up, 6 in
pgmap v47099: 992 pgs, 11 pools, 551 kB data, 20 objects
117 MB used, 189 TB / 189 TB avail
992 active+clean
root@bd-a:~#
Would it be possible for the author of ceph-deploy to make the reboot
between these two steps unnecessary?
Then it would also be possible to use create instead of prepare + activate.
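For reference, the single-step form would look roughly like this (assuming
osd create accepts the same options as the prepare call above):

  # prepare + activate in one step
  ceph-deploy -v --overwrite-conf osd create --fs-type btrfs bd-1:/dev/sdb1:/dev/sdb2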
Thank you,
Markus
--
Best regards,
Markus Goldberg
--------------------------------------------------------------------------
Markus Goldberg Universität Hildesheim
Rechenzentrum
Tel +49 5121 88392822 Marienburger Platz 22, D-31141 Hildesheim, Germany
Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
--------------------------------------------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com