Hi,

In case anyone runs into the same issue:
the culprit was running two instances of Docker (snap docker and the regular docker package).
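
For anyone checking their own hosts, the duplicate daemons are easy to spot
with something like this (the unit names below are the usual ones; verify
with "systemctl list-units" first):

   ps -C dockerd -o pid,cmd                      # two dockerd processes
   systemctl list-units --all 'docker*' 'snap.docker*'
   sudo ss -xlpn | grep docker.sock              # which pid owns the socket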

Masking snap docker resolved the issue. (My guess is that cephadm's docker
client was hitting the snap daemon, which runs confined and sees most of the
host filesystem, /var/lib/ceph included, as read-only, hence the mkdir failure.)
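
Roughly what I ran ("snap.docker.dockerd.service" is the usual unit name for
the snap daemon; adjust if yours differs):

   sudo systemctl stop snap.docker.dockerd.service
   sudo systemctl mask snap.docker.dockerd.service
   # or remove the snap entirely: sudo snap remove docker
   ps -C dockerd -o pid,cmd    # should now show a single dockerd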

I am still puzzled that I hit this now, since the cluster has been in
production for a few months, but I am happy to have it fixed.

Thanks
Steven

On Wed, 29 Oct 2025 at 13:18, Steven Vacaroaia <[email protected]> wrote:

> Just to add to the mystery,
> all of the below is on "ceph-host-1":
>
>  "docker ps -a"                         shows nothing
>
> "ls /var/lib/docker/containers"  shows lots of entries
>
> and
>
> "systemctl status docker"       shows active (running)
>
> Any help will be greatly appreciated
>
> Steven
>
> On Wed, 29 Oct 2025 at 11:37, Steven Vacaroaia <[email protected]> wrote:
>
>> Hi,
>>
>> I just noticed this error
>>    "failed to probe daemons or devices"
>> which seems to be related to /var/lib/ceph being read-only.
>>
>> I checked the host (ceph-host-1) and it seems fine, so it must be a
>> docker instance.
>>
>> How do I identify which one, and how can I fix it, please?
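>>
>> To rule out the host I assume I can check the mount options directly,
>> e.g. (the pid below is whatever "ps -C dockerd" reports, so just an
>> example):
>>
>>    findmnt -T /var/lib/ceph -o TARGET,SOURCE,OPTIONS
>>    sudo nsenter -t <dockerd-pid> -m -- findmnt -T /var/lib/ceph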
>>
>> Many thanks
>> Steven
>>
>>
>>
>> root@ceph-host-1:~/utilities# ceph -s
>>   cluster:
>>     id:     0cfa836d-68b5-11f0-90bf-7cc2558e5ce8
>>     health: HEALTH_WARN
>>             failed to probe daemons or devices
>>
>>   services:
>>     mon: 5 daemons, quorum
>> ceph-host-1,ceph-host-2,ceph-host-3,ceph-host-7,ceph-host-6 (age 8h)
>>     mgr: ceph-host-1.lqlece(active, since 5w), standbys:
>> ceph-host-2.suiuxi
>>     mds: 19/19 daemons up, 7 standby
>>     osd: 185 osds: 185 up (since 4w), 185 in (since 4w)
>>          flags noautoscale
>>     rgw: 2 daemons active (2 hosts, 1 zones)
>>
>>   data:
>>     volumes: 4/4 healthy
>>     pools:   16 pools, 7137 pgs
>>     objects: 359.48M objects, 811 TiB
>>     usage:   1.2 PiB used, 1.1 PiB / 2.3 PiB avail
>>     pgs:     7034 active+clean
>>              72   active+clean+scrubbing
>>              31   active+clean+scrubbing+deep
>>
>>   io:
>>     client:   30 MiB/s rd, 34 KiB/s wr, 11 op/s rd, 1 op/s wr
>>
>>
>>
>>
>>
>> cephadm shell -- ceph-volume lvm list --format json
>> Inferring fsid 0cfa836d-68b5-11f0-90bf-7cc2558e5ce8
>> Inferring config
>> /var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/mon.ceph-host-1/config
>> docker: Error response from daemon: error while creating mount source
>> path '/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/crash': mkdir
>> /var/lib/ceph: read-only file system.
>>
>>
>>
>> root@ceph-host-1:~/utilities# ceph health detail
>> HEALTH_WARN failed to probe daemons or devices
>> [WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
>>     host ceph-host-1 `cephadm ceph-volume` failed: cephadm exited with an
>> error code: 1, stderr: Inferring config
>> /var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/mon.ceph-host-1/config
>> Non-zero exit code 125 from /usr/bin/docker run --rm --ipc=host
>> --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
>> /usr/sbin/ceph-volume --privileged --group-add=disk --init -e
>> CONTAINER_IMAGE=
>> quay.io/ceph/ceph@sha256:8214ebff6133ac27d20659038df6962dbf9d77da21c9438a296b2e2059a56af6
>> -e NODE_NAME=ceph-host-1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e
>> CEPH_VOLUME_DEBUG=1 -v
>> /var/run/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8:/var/run/ceph:z -v
>> /var/log/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8:/var/log/ceph:z -v
>> /var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/crash:/var/lib/ceph/crash:z
>> -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
>> /run/lock/lvm:/run/lock/lvm -v /:/rootfs:rslave -v
>> /tmp/ceph-tmptx7tjpjz:/etc/ceph/ceph.conf:z
>> quay.io/ceph/ceph@sha256:8214ebff6133ac27d20659038df6962dbf9d77da21c9438a296b2e2059a56af6
>> inventory --format=json-pretty --filter-for-batch
>> /usr/bin/docker: stderr docker: Error response from daemon: error while
>> creating mount source path
>> '/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/crash': mkdir
>> /var/lib/ceph: read-only file system.
>> Traceback (most recent call last):
>>   File "<frozen runpy>", line 198, in _run_module_as_main
>>   File "<frozen runpy>", line 88, in _run_code
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 5581, in <module>
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 5569, in main
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 409, in _infer_config
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 324, in _infer_fsid
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 437, in _infer_image
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 311, in _validate_fsid
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/__main__.py",
>> line 3314, in command_ceph_volume
>>   File
>> "/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/cephadm.e2b78d38c8b7ec4ef00612ed046678feceb37baccd3990bc21df1538095d27c9/cephadmlib/call_wrappers.py",
>> line 307, in call_throws
>> RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
>> --stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint
>> /usr/sbin/ceph-volume --privileged --group-add=disk --init -e
>> CONTAINER_IMAGE=
>> quay.io/ceph/ceph@sha256:8214ebff6133ac27d20659038df6962dbf9d77da21c9438a296b2e2059a56af6
>> -e NODE_NAME=ceph-host-1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e
>> CEPH_VOLUME_DEBUG=1 -v
>> /var/run/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8:/var/run/ceph:z -v
>> /var/log/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8:/var/log/ceph:z -v
>> /var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/crash:/var/lib/ceph/crash:z
>> -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
>> /run/lock/lvm:/run/lock/lvm -v /:/rootfs:rslave -v
>> /tmp/ceph-tmptx7tjpjz:/etc/ceph/ceph.conf:z
>> quay.io/ceph/ceph@sha256:8214ebff6133ac27d20659038df6962dbf9d77da21c9438a296b2e2059a56af6
>> inventory --format=json-pretty --filter-for-batch: docker: Error response
>> from daemon: error while creating mount source path
>> '/var/lib/ceph/0cfa836d-68b5-11f0-90bf-7cc2558e5ce8/crash': mkdir
>> /var/lib/ceph: read-only file system.
>>
>>
>>
>>
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
