The only clue I have run across so far is that the OSD daemons ceph-deploy 
attempts to create on the failing OSD server (osd3) reuse two of the OSD IDs 
that were just created on the last OSD server deployed (osd2). From the osd 
tree listing, osd1 has osd.0, osd.1, osd.2 and osd.3, and the next server, 
osd2, has the next four in the correct order: osd.4, osd.5, osd.6 and osd.7. 
The failing OSD server should have started with osd.8 through osd.11; instead 
it is reusing osd.5 and osd.6. These are also the only log files in 
/var/log/ceph on the osd3 server, and they contain only the following entry 
repeated over and over again:

2018-02-07 08:09:33.077286 7f264e6a8800  0 set uid:gid to 167:167 (ceph:ceph)
2018-02-07 08:09:33.077321 7f264e6a8800  0 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe), process ceph-osd, pid 4923
2018-02-07 08:09:33.077572 7f264e6a8800 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-5: (2) No such file or directory


The outputs from ceph-disk list and ceph osd tree follow:

[osd3][DEBUG ] connected to host: osd3
[osd3][DEBUG ] detect platform information from remote host
[osd3][DEBUG ] detect machine type
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: /usr/sbin/ceph-disk list
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] ceph-5
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] Path           /var/lib/ceph/osd/ceph-5
[osd3][INFO  ] ID             5
[osd3][INFO  ] Name           osd.5
[osd3][INFO  ] Status         up
[osd3][INFO  ] Reweight       1.0
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] ceph-6
[osd3][INFO  ] ----------------------------------------
[osd3][INFO  ] Path           /var/lib/ceph/osd/ceph-6
[osd3][INFO  ] ID             6
[osd3][INFO  ] Name           osd.6
[osd3][INFO  ] Status         up
[osd3][INFO  ] Reweight       1.0
[osd3][INFO  ] ----------------------------------------


[cephuser@groot cephcluster]$ sudo ceph osd tree
ID WEIGHT  TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 1.06311 root default
-2 0.53156     host osd1
 0 0.13289         osd.0      up  1.00000          1.00000
 1 0.13289         osd.1      up  1.00000          1.00000
 2 0.13289         osd.2      up  1.00000          1.00000
 3 0.13289         osd.3      up  1.00000          1.00000
-3 0.53156     host osd2
 4 0.13289         osd.4      up  1.00000          1.00000
 5 0.13289         osd.5      up  1.00000          1.00000
 6 0.13289         osd.6      up  1.00000          1.00000
 7 0.13289         osd.7      up  1.00000          1.00000
[cephuser@groot cephcluster]$
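
Since ceph-disk list on osd3 reports ceph-5 and ceph-6 even though those IDs 
belong to osd2, my next step (just a guess at a check, not something 
ceph-deploy told me to do) is to look for leftover mounts or data directories 
from an earlier prepare/activate attempt on osd3, roughly:

# On osd3 - look for stale OSD mounts, directories and systemd units
mount | grep /var/lib/ceph/osd
ls -l /var/lib/ceph/osd/
systemctl list-units 'ceph-osd@*'

# If ceph-5/ceph-6 on osd3 turn out to be empty leftovers (NOT the real
# osd.5/osd.6 that live on osd2), stop the units and clear them before
# retrying prepare/activate:
sudo systemctl stop ceph-osd@5 ceph-osd@6
sudo rm -rf /var/lib/ceph/osd/ceph-5 /var/lib/ceph/osd/ceph-6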

________________________________
From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Андрей 
<andrey_...@mail.ru>
Sent: Thursday, February 8, 2018 6:40:16 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Unable to activate OSD's


I have the same problem.
Configuration:
4 HW servers running Debian GNU/Linux 9.3 (stretch)
Ceph Luminous 12.2.2

I have since installed Ceph version 10.2.10 on these servers instead, and the 
OSDs now activate fine.
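
If it helps, the release that ceph-deploy installs can be pinned explicitly; 
this is roughly how a 10.2.x install looks (a sketch - the hostnames are 
placeholders for your OSD hosts):

# Install a specific release with ceph-deploy (jewel = 10.2.x)
ceph-deploy install --release jewel osd1 osd2 osd3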



Wednesday, February 7, 2018, 19:54 +03:00, from "Cranage, Steve" 
<scran...@deepspacestorage.com>:


Greetings ceph-users. I have been trying to build a test cluster in a KVM 
environment - something I have done successfully before, but this time I'm 
running into an issue I can't seem to get past. My Internet searches have 
turned up instances of this from other users that involved either ownership 
problems with the OSD devices or partition UIDs needing to be set. Neither of 
these problems seems to be in play here.
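
For reference, the fixes those other threads describe boil down to something 
like the following (a sketch using my device names; the type codes are, as 
far as I know, the standard Ceph data and journal partition GUIDs), and I 
have already tried the equivalent here:

# Ownership of the data and journal partitions
sudo chown ceph:ceph /dev/vdb1 /dev/vdf1

# Partition type GUIDs so the ceph udev rules recognize the partitions
# (4fbd7e29-... = Ceph OSD data, 45b0969e-... = Ceph journal)
sudo sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/vdb
sudo sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/vdf
sudo partprobe /dev/vdb /dev/vdf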


The cluster is on CentOS 7, running Ceph 10.2.10. I have configured one mon 
and 3 OSD servers with 4 disks each, and each disk is set to journal on a 
separate partition of an SSD, one SSD per VM. I have built this VM 
environment several times now, and recently I always hit the same issue on at 
least one of my VM OSD servers, and I cannot get any hints of where the 
problem lies from the sparse information printed to the console during the 
failure.
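
In case the journal layout matters, the SSD in each VM is simply carved into 
four partitions, something along these lines (a sketch, not the exact 
commands I ran; the sizes are illustrative):

# Split the journal SSD (/dev/vdf) into four journal partitions
sudo sgdisk --zap-all /dev/vdf
for n in 1 2 3 4; do
    sudo sgdisk --new=${n}:0:+5G --change-name=${n}:"ceph journal" /dev/vdf
done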


In addition to setting partition ownerships to ceph:ceph and UIDs to one of 
the values "set_data_partition" says it expects, I also zeroed out the entire 
contents of both drives and re-partitioned, but I still get the same results. 
The problem at present only occurs on one virtual server; the other 8 drives 
split between the other 2 VM OSD servers had no issue with prepare or 
activate. I see no difference between this server or drive configuration and 
the other two that run fine.
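
The wipe itself was nothing exotic - essentially clearing all old signatures 
and partition tables before re-partitioning, give or take the exact commands:

# Clear old filesystem signatures and GPT data on both drives
sudo wipefs -a /dev/vdb1 /dev/vdf1
sudo sgdisk --zap-all /dev/vdb
sudo sgdisk --zap-all /dev/vdf

# ceph-deploy can do much the same from the admin node
ceph-deploy disk zap osd3:/dev/vdb osd3:/dev/vdf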


Hopefully someone can at least point me to some more fruitful log 
information; "Failed to activate" isn't very helpful by itself. There is 
nothing in /var/log/messages other than clean mount/unmount messages for the 
OSD data device being processed (in this case /dev/vdb1). BTW, I have also 
tried to repeat the same process without a separate journal device (just 
using prepare/activate osd3:/dev/vdb1) and I got the same "Failed to 
activate" result.
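
In case someone can point me at a better log from these, here is where I have 
looked so far, plus the failing step re-run by hand with ceph-disk's verbose 
flag (the ceph-disk command line is the same one ceph-deploy runs):

# On osd3 - logs checked so far
sudo journalctl -u 'ceph-osd@*' --no-pager
sudo ls -l /var/log/ceph/

# Re-run the failing activate step directly with verbose output
sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/vdb1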



[cephuser@groot cephcluster]$ ceph-deploy osd prepare osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/cephuser/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /bin/ceph-deploy osd prepare 
osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  block_db                      : None
[ceph_deploy.cli][INFO  ]  disk                          : [('osd3', 
'/dev/vdb1', '/dev/vdf1')]
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  block_wal                     : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : prepare
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : 
/etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
<ceph_deploy.conf.cephdeploy.Conf instance at 0x2a7bdd0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  filestore                     : None
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 
0x2a6f1b8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks osd3:/dev/vdb1:/dev/vdf1
[osd3][DEBUG ] connection detected need for sudo
[osd3][DEBUG ] connected to host: osd3
[osd3][DEBUG ] detect platform information from remote host
[osd3][DEBUG ] detect machine type
[osd3][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to osd3
[osd3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host osd3 disk /dev/vdb1 journal /dev/vdf1 
activate False
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster 
ceph --fs-type xfs -- /dev/vdb1 /dev/vdf1
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph 
--show-config-value=fsid
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd 
--check-allows-journal -i 0 --log-file $run_dir/$cluster-osd-check.log 
--cluster ceph --setuser ceph --setgroup ceph
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd 
--check-wants-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster 
ceph --setuser ceph --setgroup ceph
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd 
--check-needs-journal -i 0 --log-file $run_dir/$cluster-osd-check.log --cluster 
ceph --setuser ceph --setgroup ceph
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph 
--show-config-value=osd_journal_size
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
--name=osd. --lookup osd_mkfs_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
--name=osd. --lookup osd_fs_mkfs_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
--name=osd. --lookup osd_mount_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
--name=osd. --lookup osd_fs_mount_options_xfs
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdf1 uuid path is 
/sys/dev/block/252:81/dm/uuid
[osd3][WARNIN] prepare_device: Journal /dev/vdf1 is a partition
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdf1 uuid path is 
/sys/dev/block/252:81/dm/uuid
[osd3][WARNIN] prepare_device: OSD will not be hot-swappable if journal is not 
the same device as the osd data
[osd3][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/vdf1
[osd3][WARNIN] prepare_device: Journal /dev/vdf1 was not prepared with 
ceph-disk. Symlinking directly.
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] set_data_partition: OSD data device /dev/vdb1 is a partition
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/vdb1
[osd3][WARNIN] populate_data_path_device: Creating xfs fs on /dev/vdb1
[osd3][WARNIN] command_check_call: Running command: /sbin/mkfs -t xfs -f -i 
size=2048 -- /dev/vdb1
[osd3][DEBUG ] meta-data=/dev/vdb1              isize=2048   agcount=4, 
agsize=8920960 blks
[osd3][DEBUG ]          =                       sectsz=512   attr=2, 
projid32bit=1
[osd3][DEBUG ]          =                       crc=1        finobt=0, sparse=0
[osd3][DEBUG ] data     =                       bsize=4096   blocks=35683840, 
imaxpct=25
[osd3][DEBUG ]          =                       sunit=0      swidth=0 blks
[osd3][DEBUG ] naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
[osd3][DEBUG ] log      =internal log           bsize=4096   blocks=17423, 
version=2
[osd3][DEBUG ]          =                       sectsz=512   sunit=0 blks, 
lazy-count=1
[osd3][DEBUG ] realtime =none                   extsz=4096   blocks=0, 
rtextents=0
[osd3][WARNIN] mount: Mounting /dev/vdb1 on /var/lib/ceph/tmp/mnt.EWuVuW with 
options noatime,inode64
[osd3][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o 
noatime,inode64 -- /dev/vdb1 /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command: Running command: /sbin/restorecon 
/var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] populate_data_path: Preparing osd data dir 
/var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command: Running command: /sbin/restorecon -R 
/var/lib/ceph/tmp/mnt.EWuVuW/ceph_fsid.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/tmp/mnt.EWuVuW/ceph_fsid.7378.tmp
[osd3][WARNIN] command: Running command: /sbin/restorecon -R 
/var/lib/ceph/tmp/mnt.EWuVuW/fsid.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/tmp/mnt.EWuVuW/fsid.7378.tmp
[osd3][WARNIN] command: Running command: /sbin/restorecon -R 
/var/lib/ceph/tmp/mnt.EWuVuW/magic.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/tmp/mnt.EWuVuW/magic.7378.tmp
[osd3][WARNIN] command: Running command: /sbin/restorecon -R 
/var/lib/ceph/tmp/mnt.EWuVuW/journal_uuid.7378.tmp
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/tmp/mnt.EWuVuW/journal_uuid.7378.tmp
[osd3][WARNIN] adjust_symlink: Creating symlink 
/var/lib/ceph/tmp/mnt.EWuVuW/journal -> /dev/vdf1
[osd3][WARNIN] command: Running command: /sbin/restorecon -R 
/var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] command_check_call: Running command: /bin/umount -- 
/var/lib/ceph/tmp/mnt.EWuVuW
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][INFO  ] checking OSD status...
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
--format=json
[ceph_deploy.osd][DEBUG ] Host osd3 is now ready for osd use.


[cephuser@groot cephcluster]$ ceph-deploy osd activate osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/cephuser/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.39): /bin/ceph-deploy osd activate 
osd3:/dev/vdb1:/dev/vdf1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : activate
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : 
<ceph_deploy.conf.cephdeploy.Conf instance at 0x20f9dd0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 
0x20ed1b8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : [('osd3', 
'/dev/vdb1', '/dev/vdf1')]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks osd3:/dev/vdb1:/dev/vdf1
[osd3][DEBUG ] connection detected need for sudo
[osd3][DEBUG ] connected to host: osd3
[osd3][DEBUG ] detect platform information from remote host
[osd3][DEBUG ] detect machine type
[osd3][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.4.1708 Core
[ceph_deploy.osd][DEBUG ] activating host osd3 disk /dev/vdb1
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[osd3][DEBUG ] find the location of an executable
[osd3][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
--mark-init systemd --mount /dev/vdb1
[osd3][WARNIN] main_activate: path = /dev/vdb1
[osd3][WARNIN] get_dm_uuid: get_dm_uuid /dev/vdb1 uuid path is 
/sys/dev/block/252:17/dm/uuid
[osd3][WARNIN] command: Running command: /sbin/blkid -o udev -p /dev/vdb1
[osd3][WARNIN] command: Running command: /sbin/blkid -p -s TYPE -o value -- 
/dev/vdb1
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
--name=osd. --lookup osd_mount_options_xfs
[osd3][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
--name=osd. --lookup osd_fs_mount_options_xfs
[osd3][WARNIN] mount: Mounting /dev/vdb1 on /var/lib/ceph/tmp/mnt.G7uifc with 
options noatime,inode64
[osd3][WARNIN] command_check_call: Running command: /usr/bin/mount -t xfs -o 
noatime,inode64 -- /dev/vdb1 /var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] command: Running command: /sbin/restorecon 
/var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] activate: Cluster uuid is 83d61520-5a38-4f50-9b54-bef4f6bef08c
[osd3][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph 
--show-config-value=fsid
[osd3][WARNIN] activate: Cluster name is ceph
[osd3][WARNIN] activate: OSD uuid is 4627c861-71b7-485e-a402-30bff54a963c
[osd3][WARNIN] allocate_osd_id: Allocating OSD id...
[osd3][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd 
create --concise 4627c861-71b7-485e-a402-30bff54a963c
[osd3][WARNIN] mount_activate: Failed to activate
[osd3][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] command_check_call: Running command: /bin/umount -- 
/var/lib/ceph/tmp/mnt.G7uifc
[osd3][WARNIN] Traceback (most recent call last):
[osd3][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in <module>
[osd3][WARNIN]     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 
'ceph-disk')()
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", 
line 5371, in run
[osd3][WARNIN]     main(sys.argv[1:])
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", 
line 5322, in main
[osd3][WARNIN]     args.func(args)
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", 
line 3445, in main_activate
[osd3][WARNIN]     reactivate=args.reactivate,
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", 
line 3202, in mount_activate
[osd3][WARNIN]     (osd_id, cluster) = activate(path, activate_key_template, 
init)
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", 
line 3365, in activate
[osd3][WARNIN]     keyring=keyring,
[osd3][WARNIN]   File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", 
line 1013, in allocate_osd_id
[osd3][WARNIN]     raise Error('ceph osd create failed', e, e.output)
[osd3][WARNIN] ceph_disk.main.Error: Error: ceph osd create failed: Command 
'/usr/bin/ceph' returned non-zero exit status 1: 2018-02-07 09:38:40.104098 
7fa479cf2700  0 librados: client.bootstrap-osd authentication error (1) 
Operation not permitted
[osd3][WARNIN] Error connecting to cluster: PermissionError
[osd3][WARNIN]
[osd3][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: 
/usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/vdb1

[cephuser@groot cephcluster]$
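
One more observation on the traceback above: the final error is 
"client.bootstrap-osd authentication error (1) Operation not permitted", 
which to me suggests the bootstrap-osd keyring on osd3 may not match what the 
monitor has. This is how I would compare the two (MON_HOST is a placeholder 
for the monitor's hostname):

# On the admin node - the key the cluster expects
sudo ceph auth get client.bootstrap-osd

# On osd3 - the key ceph-disk is actually using
sudo cat /var/lib/ceph/bootstrap-osd/ceph.keyring

# If they differ, re-gather keys on the admin node and put the matching
# keyring in /var/lib/ceph/bootstrap-osd/ceph.keyring on osd3 before retrying
ceph-deploy gatherkeys MON_HOST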








_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
