Trying to add a batch of OSDs to my cluster (Jewel 10.2.6, Ubuntu 16.04).

2 new nodes (ceph01, ceph02), 10 OSDs per node.

I am trying to steer the OSDs into a different CRUSH root by setting 
crush_location in ceph.conf:
> [osd.34]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.35]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.36]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.37]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.38]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.39]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.40]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.41]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.42]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.43]
> crush_location = "host=ceph01 rack=ssd.rack2 root=ssd"
> 
> [osd.44]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.45]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.46]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.47]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.48]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.49]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.50]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.51]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.52]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
> 
> [osd.53]
> crush_location = "host=ceph02 rack=ssd.rack2 root=ssd"
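
Since every OSD on a node shares the same location, the per-OSD sections above 
could, as a sketch, be collapsed into a single [osd] section using the $host 
metavariable (assuming the hostnames resolve to the same names as the intended 
CRUSH host buckets):

```ini
# Hypothetical consolidated form: one section covers all OSDs,
# with $host expanding to each daemon's hostname (ceph01, ceph02, ...)
# instead of hard-coding a [osd.NN] section per OSD.
[osd]
crush_location = "host=$host rack=ssd.rack2 root=ssd"
```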

Adding ceph01 and its OSDs went without a hitch.
However, ceph02 is getting completely lost: its OSDs end up zero-weighted at 
the root level, at the bottom of the OSD tree.

> $ ceph osd tree
> ID  WEIGHT    TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -13  34.91394 root ssd
> -11  34.91394     rack ssd.rack2
> -14  17.45697         host ceph00
>  24   1.74570             osd.24                 up  1.00000          1.00000
>  25   1.74570             osd.25                 up  1.00000          1.00000
>  26   1.74570             osd.26                 up  1.00000          1.00000
>  27   1.74570             osd.27                 up  1.00000          1.00000
>  28   1.74570             osd.28                 up  1.00000          1.00000
>  29   1.74570             osd.29                 up  1.00000          1.00000
>  30   1.74570             osd.30                 up  1.00000          1.00000
>  31   1.74570             osd.31                 up  1.00000          1.00000
>  32   1.74570             osd.32                 up  1.00000          1.00000
>  33   1.74570             osd.33                 up  1.00000          1.00000
> -15  17.45697         host ceph01
>  34   1.74570             osd.34                 up  1.00000          1.00000
>  35   1.74570             osd.35                 up  1.00000          1.00000
>  36   1.74570             osd.36                 up  1.00000          1.00000
>  37   1.74570             osd.37                 up  1.00000          1.00000
>  38   1.74570             osd.38                 up  1.00000          1.00000
>  39   1.74570             osd.39                 up  1.00000          1.00000
>  40   1.74570             osd.40                 up  1.00000          1.00000
>  41   1.74570             osd.41                 up  1.00000          1.00000
>  42   1.74570             osd.42                 up  1.00000          1.00000
>  43   1.74570             osd.43                 up  1.00000          1.00000
> -16         0         host ceph02
> -10         0 rack default.rack2
> -12         0     chassis default.rack2.U16
>  -1 174.51584 root default
>  -2  21.81029     host node24
>   0   7.27010         osd.0                      up  1.00000          1.00000
>   8   7.27010         osd.8                      up  1.00000          1.00000
>  16   7.27010         osd.16                     up  1.00000          1.00000
>  -3  21.81029     host node25
>   1   7.27010         osd.1                      up  1.00000          1.00000
>   9   7.27010         osd.9                      up  1.00000          1.00000
>  17   7.27010         osd.17                     up  1.00000          1.00000
>  -4  21.81987     host node26
>  10   7.27010         osd.10                     up  1.00000          1.00000
>  18   7.27489         osd.18                     up  1.00000          1.00000
>   2   7.27489         osd.2                      up  1.00000          1.00000
>  -5  21.81508     host node27
>   3   7.27010         osd.3                      up  1.00000          1.00000
>  11   7.27010         osd.11                     up  1.00000          1.00000
>  19   7.27489         osd.19                     up  1.00000          1.00000
>  -6  21.81508     host node28
>   4   7.27010         osd.4                      up  1.00000          1.00000
>  12   7.27010         osd.12                     up  1.00000          1.00000
>  20   7.27489         osd.20                     up  1.00000          1.00000
>  -7  21.81508     host node29
>   5   7.27010         osd.5                      up  1.00000          1.00000
>  13   7.27010         osd.13                     up  1.00000          1.00000
>  21   7.27489         osd.21                     up  1.00000          1.00000
>  -8  21.81508     host node30
>   6   7.27010         osd.6                      up  1.00000          1.00000
>  14   7.27010         osd.14                     up  1.00000          1.00000
>  22   7.27489         osd.22                     up  1.00000          1.00000
>  -9  21.81508     host node31
>   7   7.27010         osd.7                      up  1.00000          1.00000
>  15   7.27010         osd.15                     up  1.00000          1.00000
>  23   7.27489         osd.23                     up  1.00000          1.00000
>  44         0 osd.44                           down  1.00000          1.00000


I manually added ceph02 to the CRUSH map with the CLI:
> $ ceph osd crush add-bucket ceph02 host
> $ ceph osd crush move ceph02 root=ssd
> $ ceph osd crush move ceph02 rack=ssd.rack2

That still didn't make a difference (before this, host ceph02 wasn't being 
added to the CRUSH map at all).
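As a possible workaround (a sketch only; the 1.74570 weight and bucket names 
are copied from the tree output above and would need checking against the 
actual disks), the ceph02 OSDs could be forced into place with `ceph osd crush 
create-or-move` once they exist. The loop below only echoes the commands as a 
dry run; drop the `echo` to actually apply them against the cluster:

```shell
# Dry run: print one create-or-move per ceph02 OSD (osd.44 through osd.53).
# 1.74570 matches the per-OSD weight shown for the ceph00/ceph01 hosts above.
for id in $(seq 44 53); do
    echo ceph osd crush create-or-move "osd.$id" 1.74570 \
        root=ssd rack=ssd.rack2 host=ceph02
done
```

create-or-move is idempotent for an OSD already in the right place, so 
re-running it after each ceph-deploy prepare/activate should be safe.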

This is the output of ceph-deploy (1.5.37) when trying to prepare osd.44:
> $ ceph-deploy --username root osd prepare ceph02:sda:/dev/nvme0n1p4
> [ceph_deploy.conf][DEBUG ] found configuration file at: 
> /home/maint/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (1.5.37): /usr/bin/ceph-deploy --username 
> root osd prepare ceph02:sda:/dev/nvme0n1p4
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  username                      : root
> [ceph_deploy.cli][INFO  ]  disk                          : [('ceph02', 
> '/dev/sda', '/dev/nvme0n1p4')]
> [ceph_deploy.cli][INFO  ]  dmcrypt                       : False
> [ceph_deploy.cli][INFO  ]  verbose                       : False
> [ceph_deploy.cli][INFO  ]  bluestore                     : None
> [ceph_deploy.cli][INFO  ]  overwrite_conf                : False
> [ceph_deploy.cli][INFO  ]  subcommand                    : prepare
> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : 
> /etc/ceph/dmcrypt-keys
> [ceph_deploy.cli][INFO  ]  quiet                         : False
> [ceph_deploy.cli][INFO  ]  cd_conf                       : 
> <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f0834a09248>
> [ceph_deploy.cli][INFO  ]  cluster                       : ceph
> [ceph_deploy.cli][INFO  ]  fs_type                       : xfs
> [ceph_deploy.cli][INFO  ]  func                          : <function osd at 
> 0x7f0834e6d398>
> [ceph_deploy.cli][INFO  ]  ceph_conf                     : None
> [ceph_deploy.cli][INFO  ]  default_release               : False
> [ceph_deploy.cli][INFO  ]  zap_disk                      : False
> [ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks 
> ceph02:/dev/sda:/dev/nvme0n1p4
> [ceph02][DEBUG ] connected to host: root@ceph02
> [ceph02][DEBUG ] detect platform information from remote host
> [ceph02][DEBUG ] detect machine type
> [ceph02][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: Ubuntu 16.04 xenial
> [ceph_deploy.osd][DEBUG ] Deploying osd to ceph02
> [ceph02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [ceph_deploy.osd][DEBUG ] Preparing host ceph02 disk /dev/sda journal 
> /dev/nvme0n1p4 activate False
> [ceph02][DEBUG ] find the location of an executable
> [ceph02][INFO  ] Running command: /usr/sbin/ceph-disk -v prepare --cluster 
> ceph --fs-type xfs -- /dev/sda /dev/nvme0n1p4
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph 
> --show-config-value=fsid
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-osd 
> --check-allows-journal -i 0 --cluster ceph --setuser ceph --setgroup ceph
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-osd 
> --check-wants-journal -i 0 --cluster ceph --setuser ceph --setgroup ceph
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-osd 
> --check-needs-journal -i 0 --cluster ceph --setuser ceph --setgroup ceph
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph 
> --show-config-value=osd_journal_size
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
> --name=osd. --lookup osd_mkfs_options_xfs
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
> --name=osd. --lookup osd_fs_mkfs_options_xfs
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
> --name=osd. --lookup osd_mount_options_xfs
> [ceph02][WARNIN] command: Running command: /usr/bin/ceph-conf --cluster=ceph 
> --name=osd. --lookup osd_fs_mount_options_xfs
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/nvme0n1p4 uuid path is 
> /sys/dev/block/259:13/dm/uuid
> [ceph02][WARNIN] prepare_device: Journal /dev/nvme0n1p4 is a partition
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/nvme0n1p4 uuid path is 
> /sys/dev/block/259:13/dm/uuid
> [ceph02][WARNIN] prepare_device: OSD will not be hot-swappable if journal is 
> not the same device as the osd data
> [ceph02][WARNIN] command: Running command: /sbin/blkid -o udev -p 
> /dev/nvme0n1p4
> [ceph02][WARNIN] prepare_device: Journal /dev/nvme0n1p4 was not prepared with 
> ceph-disk. Symlinking directly.
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] set_data_partition: Creating osd partition on /dev/sda
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] ptype_tobe_for_name: name = data
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] create_partition: Creating data partition num 1 size 0 on 
> /dev/sda
> [ceph02][WARNIN] command_check_call: Running command: /sbin/sgdisk 
> --largest-new=1 --change-name=1:ceph data 
> --partition-guid=1:9e26d63f-cc60-4c41-93ef-c936a657b643 
> --typecode=1:89c57f98-2fe5-4dc0-89c1-f3ad0ceff2be --mbrtogpt -- /dev/sda
> [ceph02][DEBUG ] Setting name!
> [ceph02][DEBUG ] partNum is 0
> [ceph02][WARNIN] update_partition: Calling partprobe on created device 
> /dev/sda
> [ceph02][DEBUG ] REALLY setting name!
> [ceph02][WARNIN] command_check_call: Running command: /sbin/udevadm settle 
> --timeout=600
> [ceph02][DEBUG ] The operation has completed successfully.
> [ceph02][WARNIN] command: Running command: /usr/bin/flock -s /dev/sda 
> /sbin/partprobe /dev/sda
> [ceph02][WARNIN] command_check_call: Running command: /sbin/udevadm settle 
> --timeout=600
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda1 uuid path is 
> /sys/dev/block/8:1/dm/uuid
> [ceph02][WARNIN] populate_data_path_device: Creating xfs fs on /dev/sda1
> [ceph02][WARNIN] command_check_call: Running command: /sbin/mkfs -t xfs -f -i 
> size=2048 -- /dev/sda1
> [ceph02][DEBUG ] meta-data=/dev/sda1              isize=2048   agcount=4, 
> agsize=117210837 blks
> [ceph02][DEBUG ]          =                       sectsz=512   attr=2, 
> projid32bit=1
> [ceph02][DEBUG ]          =                       crc=1        finobt=1, 
> sparse=0
> [ceph02][DEBUG ] data     =                       bsize=4096   
> blocks=468843345, imaxpct=5
> [ceph02][DEBUG ]          =                       sunit=0      swidth=0 blks
> [ceph02][DEBUG ] naming   =version 2              bsize=4096   ascii-ci=0 
> ftype=1
> [ceph02][DEBUG ] log      =internal log           bsize=4096   blocks=228927, 
> version=2
> [ceph02][DEBUG ]          =                       sectsz=512   sunit=0 blks, 
> lazy-count=1
> [ceph02][DEBUG ] realtime =none                   extsz=4096   blocks=0, 
> rtextents=0
> [ceph02][WARNIN] mount: Mounting /dev/sda1 on /var/lib/ceph/tmp/mnt.9Zpt6h 
> with options noatime,inode64
> [ceph02][WARNIN] command_check_call: Running command: /bin/mount -t xfs -o 
> noatime,inode64 -- /dev/sda1 /var/lib/ceph/tmp/mnt.9Zpt6h
> [ceph02][WARNIN] populate_data_path: Preparing osd data dir 
> /var/lib/ceph/tmp/mnt.9Zpt6h
> [ceph02][WARNIN] command: Running command: /bin/chown -R ceph:ceph 
> /var/lib/ceph/tmp/mnt.9Zpt6h/ceph_fsid.28789.tmp
> [ceph02][WARNIN] command: Running command: /bin/chown -R ceph:ceph 
> /var/lib/ceph/tmp/mnt.9Zpt6h/fsid.28789.tmp
> [ceph02][WARNIN] command: Running command: /bin/chown -R ceph:ceph 
> /var/lib/ceph/tmp/mnt.9Zpt6h/magic.28789.tmp
> [ceph02][WARNIN] command: Running command: /bin/chown -R ceph:ceph 
> /var/lib/ceph/tmp/mnt.9Zpt6h/journal_uuid.28789.tmp
> [ceph02][WARNIN] adjust_symlink: Creating symlink 
> /var/lib/ceph/tmp/mnt.9Zpt6h/journal -> /dev/nvme0n1p4
> [ceph02][WARNIN] command: Running command: /bin/chown -R ceph:ceph 
> /var/lib/ceph/tmp/mnt.9Zpt6h
> [ceph02][WARNIN] unmount: Unmounting /var/lib/ceph/tmp/mnt.9Zpt6h
> [ceph02][WARNIN] command_check_call: Running command: /bin/umount -- 
> /var/lib/ceph/tmp/mnt.9Zpt6h
> [ceph02][WARNIN] get_dm_uuid: get_dm_uuid /dev/sda uuid path is 
> /sys/dev/block/8:0/dm/uuid
> [ceph02][WARNIN] command_check_call: Running command: /sbin/sgdisk 
> --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sda
> [ceph02][DEBUG ] Warning: The kernel is still using the old partition table.
> [ceph02][DEBUG ] The new table will be used at the next reboot or after you
> [ceph02][DEBUG ] run partprobe(8) or kpartx(8)
> [ceph02][DEBUG ] The operation has completed successfully.
> [ceph02][WARNIN] update_partition: Calling partprobe on prepared device 
> /dev/sda
> [ceph02][WARNIN] command_check_call: Running command: /sbin/udevadm settle 
> --timeout=600
> [ceph02][WARNIN] command: Running command: /usr/bin/flock -s /dev/sda 
> /sbin/partprobe /dev/sda
> [ceph02][WARNIN] command_check_call: Running command: /sbin/udevadm settle 
> --timeout=600
> [ceph02][WARNIN] command_check_call: Running command: /sbin/udevadm trigger 
> --action=add --sysname-match sda1
> [ceph02][INFO  ] checking OSD status...
> [ceph02][DEBUG ] find the location of an executable
> [ceph02][INFO  ] Running command: /usr/bin/ceph --cluster=ceph osd stat 
> --format=json
> [ceph02][WARNIN] there is 1 OSD down
> [ceph_deploy.osd][DEBUG ] Host ceph02 is now ready for osd use.


Hoping someone checking their email over the weekend can easily spot something 
I have overlooked.
It is just very odd to see this work without issue on the first node and fail 
on the second, when both were configured and deployed identically.

Appreciate any help.

Reed
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com