Thanks for testing. That should rule out udev as the cause of the race. A couple of observations from the log:
* There is a loop for each OSD that calls 'ceph-volume lvm trigger', retrying up to 30 times until the OSD is activated. For example, for OSD 4:

    [2019-05-31 01:27:29,235][ceph_volume.process][INFO ] Running command: ceph-volume lvm trigger 4-7478edfc-f321-40a2-a105-8e8a2c8ca3f6
    [2019-05-31 01:27:35,435][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.4 with fsid 7478edfc-f321-40a2-a105-8e8a2c8ca3f6
    [2019-05-31 01:27:35,530][systemd][WARNING] command returned non-zero exit status: 1
    [2019-05-31 01:27:35,531][systemd][WARNING] failed activating OSD, retries left: 30
    [2019-05-31 01:27:44,122][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.4 with fsid 7478edfc-f321-40a2-a105-8e8a2c8ca3f6
    [2019-05-31 01:27:44,174][systemd][WARNING] command returned non-zero exit status: 1
    [2019-05-31 01:27:44,175][systemd][WARNING] failed activating OSD, retries left: 29
    ...

  I wonder if we could have similar 'ceph-volume lvm trigger' calls for the WAL and DB devices of each OSD. Does that even make sense? Or perhaps a different call with a similar goal. We should be able to determine whether an OSD has a DB or WAL device from the LVM tags.

* The first 3 OSDs to be activated are 18, 4, and 11, and they are exactly the 3 that are missing their block.db/block.wal symlinks. That is further confirmation that this is a race:

    [2019-05-31 01:28:03,370][systemd][INFO ] successfully trggered activation for: 18-eb5270dc-1110-420f-947e-aab7fae299c9
    [2019-05-31 01:28:12,354][systemd][INFO ] successfully trggered activation for: 4-7478edfc-f321-40a2-a105-8e8a2c8ca3f6
    [2019-05-31 01:28:12,530][systemd][INFO ] successfully trggered activation for: 11-33de740d-bd8c-4b47-a601-3e6e634e489a
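To make the idea from the first observation a bit more concrete, here is a rough sketch of what "wait for the DB/WAL devices too" might look like. It is not ceph-volume code. The helper names (parse_lv_tags, required_devices, wait_for_devices) are hypothetical, and it assumes the LVM tags carry ceph.block_device, ceph.db_device and ceph.wal_device entries in a comma-separated key=value string. The 30 tries mirror the retry count in the log; the 5 second interval is an arbitrary placeholder.

```python
import os
import time


def parse_lv_tags(lv_tags):
    """Turn a string like 'ceph.osd_id=4,ceph.type=block' into a dict."""
    tags = {}
    for item in lv_tags.split(","):
        key, sep, value = item.partition("=")
        if sep:  # skip anything that is not key=value
            tags[key.strip()] = value.strip()
    return tags


def required_devices(tags):
    """Return {role: path} for block plus any DB/WAL device named in the tags.

    Assumption: tag names follow the pattern ceph.<role>_device.
    """
    devices = {}
    for role in ("block", "db", "wal"):
        path = tags.get("ceph.%s_device" % role)
        if path:
            devices[role] = path
    return devices


def wait_for_devices(devices, tries=30, interval=5.0,
                     exists=os.path.exists, sleep=time.sleep):
    """Poll until every device path exists.

    Returns the sorted list of roles still missing after the last try
    (an empty list means everything showed up).
    """
    missing = sorted(devices)
    for attempt in range(tries):
        missing = sorted(r for r, p in devices.items() if not exists(p))
        if not missing:
            break
        if attempt < tries - 1:
            sleep(interval)
    return missing
```

With something like this, activation for an OSD would only proceed once wait_for_devices() returns an empty list, so an OSD whose DB or WAL LV has not appeared yet would keep retrying instead of coming up without its block.db/block.wal symlinks. The tag string itself could presumably be read with something like 'lvs --noheadings -o lv_tags', but I have not checked how that fits with the existing trigger path.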
https://bugs.launchpad.net/bugs/1828617
Title: Hosts randomly 'losing' disks, breaking ceph-osd service enumeration