Thank you,

in the meantime I solved it with a bit of tinkering: multipathd was not installed and running, so there were multiple /dev entries for the same disk.
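
For anyone hitting the same thing, the fix boils down to enabling multipathing on the OSD hosts; the exact commands are distro-dependent, the ones below assume an EL-family host:

systemctl status multipathd              # check whether multipathd is installed and running
dnf install device-mapper-multipath      # install the multipath tools
mpathconf --enable --with_multipathd y   # write a default /etc/multipath.conf and start the daemon
multipath -ll                            # verify the duplicate paths are now grouped under mpath devices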

--
Francesco Di Nucci
System Administrator
Compute & Networking Service, INFN Naples

Email: [email protected]

On 11/8/25 09:49, Eugen Block wrote:
Hi,

I haven't seen this message yet, so maybe you could add some more details about your setup (storage arrays).

Related to this: when monitoring the logs with "ceph -W cephadm", the operations are retried continuously. Is "ceph mgr module disable cephadm" enough to disable them while debugging? How can I list the operations the orchestrator is trying to apply and, if needed, stop/delete them?

I wouldn't disable the entire module; you could first try to set your OSD spec to unmanaged:

ceph orch set-unmanaged osd.osd02
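
The same can also be expressed in the spec itself by adding "unmanaged: true" to the osd.osd02 YAML and re-applying it (the file name here is just an example):

ceph orch apply -i osd02-spec.yaml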

Or you can pause the orchestrator:

ceph orch pause

And if you want to continue, unpause it:

ceph orch resume

Maybe one of those things can help you with debugging.
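
To see which specs the orchestrator keeps retrying, you can export them and check the recent cephadm log entries, for example:

ceph orch ls osd --export       # dump the OSD service specs cephadm is applying
ceph log last 50 info cephadm   # recent cephadm channel messages; the retries show up here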

Regards,
Eugen

Quoting Francesco Di Nucci <[email protected]>:

Hi all,

I'm deploying a Ceph cluster with cephadm and I'm having some issues with the OSD deployment: the OSDs fail with errors like "/bin/podman: stderr  stderr: WARNING: adding device /dev/sde with idname naa.5000c500940453db which is already used for /dev/sds". How can I prevent this? (Might it be related to the use of storage arrays?) A sample log is below.
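
To check whether two /dev nodes are actually the same physical disk, comparing the reported WWNs with something like lsblk should show it:

lsblk -o NAME,SIZE,TYPE,WWN   # duplicate paths to one disk report the same WWN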

Related to this: when monitoring the logs with "ceph -W cephadm", the operations are retried continuously. Is "ceph mgr module disable cephadm" enough to disable them while debugging? How can I list the operations the orchestrator is trying to apply and, if needed, stop/delete them?



SAMPLE LOG

2025-11-04T15:11:13.632450+0000 mgr.my-monmgr01.dospbf [ERR] Failed to apply osd.osd02 spec DriveGroupSpec.from_json(yaml.safe_load('''service_type: osd
service_id: osd02
service_name: osd.osd02
placement:
  hosts:
  - my-osd02.example.com
spec:
  data_devices:
    size: '4TB:'
  filter_logic: AND
  objectstore: bluestore
''')): cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec -e NODE_NAME=my-osd02.example.com -e CEPH_VOLUME_OSDSPEC_AFFINITY=osd02 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/run/ceph:z -v /var/log/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/log/ceph:z -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/selinux:/sys/fs/selinux:ro -v /:/rootfs:rslave -v /etc/hosts:/etc/hosts:ro -v /tmp/ceph-tmp5hr75pjk:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpt9txcizn:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec lvm batch --no-auto /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdn /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy /dev/sdz --yes --no-systemd
/bin/podman: stderr --> passed data devices: 20 physical, 0 LVM
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6065211e-ba14-4c47-b786-0c4ab3dd6f12
/bin/podman: stderr Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-e90f3c54-857f-41ef-9a2e-5635c64c4365 /dev/sde
/bin/podman: stderr  stderr: WARNING: adding device /dev/sde with idname naa.5000c500866fdcdb which is already used for /dev/sdq.
/bin/podman: stderr  stdout: Physical volume "/dev/sde" successfully created.
/bin/podman: stderr  stdout: Volume group "ceph-e90f3c54-857f-41ef-9a2e-5635c64c4365" successfully created
/bin/podman: stderr --> Was unable to complete a new OSD, will rollback changes
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.17 --yes-i-really-mean-it
/bin/podman: stderr  stderr: purged osd.17
/bin/podman: stderr --> No OSD identified by "17" was found among LVM-based OSDs.
/bin/podman: stderr --> Proceeding to check RAW-based OSDs.
/bin/podman: stderr No OSD were found.
Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 5581, in <module>   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 5569, in main   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 409, in _infer_config   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 324, in _infer_fsid   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 437, in _infer_image   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 311, in _validate_fsid   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 3314, in command_ceph_volume   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/cephadmlib/call_wrappers.py", line 310, in call_throws RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec -e NODE_NAME=my-osd02.example.com -e CEPH_VOLUME_OSDSPEC_AFFINITY=osd02 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/run/ceph:z -v /var/log/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/log/ceph:z -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/selinux:/sys/fs/selinux:ro -v /:/rootfs:rslave -v /etc/hosts:/etc/hosts:ro -v /tmp/ceph-tmp5hr75pjk:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpt9txcizn:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec lvm batch --no-auto /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdn /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy /dev/sdz --yes --no-systemd
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 602, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 721, in _apply_service
    self.mgr.osd_service.create_from_spec(cast(DriveGroupSpec, spec))
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 79, in create_from_spec
    ret = self.mgr.wait_async(all_hosts())
  File "/usr/share/ceph/mgr/cephadm/module.py", line 815, in wait_async
    return self.event_loop.get_result(coro, timeout)
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 136, in get_result
    return future.result(timeout)
  File "/lib64/python3.9/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/lib64/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 76, in all_hosts
    return await gather(*futures)
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 63, in create_from_spec_one
    ret_msg = await self.create_single_host(
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 95, in create_single_host
    raise RuntimeError(
RuntimeError: cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec -e NODE_NAME=my-osd02.example.com -e CEPH_VOLUME_OSDSPEC_AFFINITY=osd02 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/run/ceph:z -v /var/log/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/log/ceph:z -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/selinux:/sys/fs/selinux:ro -v /:/rootfs:rslave -v /etc/hosts:/etc/hosts:ro -v /tmp/ceph-tmp5hr75pjk:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpt9txcizn:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec lvm batch --no-auto /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdn /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy /dev/sdz --yes --no-systemd
/bin/podman: stderr --> passed data devices: 20 physical, 0 LVM
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6065211e-ba14-4c47-b786-0c4ab3dd6f12
/bin/podman: stderr Running command: nsenter --mount=/rootfs/proc/1/ns/mnt --ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net --uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes ceph-e90f3c54-857f-41ef-9a2e-5635c64c4365 /dev/sde
/bin/podman: stderr  stderr: WARNING: adding device /dev/sde with idname naa.5000c500866fdcdb which is already used for /dev/sdq.
/bin/podman: stderr  stdout: Physical volume "/dev/sde" successfully created.
/bin/podman: stderr  stdout: Volume group "ceph-e90f3c54-857f-41ef-9a2e-5635c64c4365" successfully created
/bin/podman: stderr --> Was unable to complete a new OSD, will rollback changes
/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.17 --yes-i-really-mean-it
/bin/podman: stderr  stderr: purged osd.17
/bin/podman: stderr --> No OSD identified by "17" was found among LVM-based OSDs.
/bin/podman: stderr --> Proceeding to check RAW-based OSDs.
/bin/podman: stderr No OSD were found.
Traceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 5581, in <module>   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 5569, in main   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 409, in _infer_config   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 324, in _infer_fsid   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 437, in _infer_image   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 311, in _validate_fsid   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/__main__.py", line 3314, in command_ceph_volume   File "/var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/cephadm.1a8853661a9c1798390b8e8d13c27688c1b1327a075745af2ee40ac466f0ac36/cephadmlib/call_wrappers.py", line 310, in call_throws RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec -e NODE_NAME=my-osd02.example.com -e CEPH_VOLUME_OSDSPEC_AFFINITY=osd02 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/run/ceph:z -v /var/log/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142:/var/log/ceph:z -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/d3209bec-b987-11f0-aaa9-ac1f6b0de142/selinux:/sys/fs/selinux:ro -v /:/rootfs:rslave -v /etc/hosts:/etc/hosts:ro -v /tmp/ceph-tmp5hr75pjk:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpt9txcizn:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:7c69e59beaeea61ca714e71cb84ff6d5e533db7f1fd84143dd9ba6649a5fd2ec lvm batch --no-auto /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdn /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy /dev/sdz --yes --no-systemd 2025-11-04T15:11:30.613025+0000 mgr.my-monmgr01.dospbf [INF] Detected new or changed devices on my-osd02.example.com


Thanks in advance

--
Francesco Di Nucci
System Administrator
Compute & Networking Service, INFN Naples

Email: [email protected]

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
