Hi Dev,
I am not sure why these service deployments timed out. The log says that
one OSD failed, which may explain why the upgrade is no longer progressing.
The BlueStore slow ops warning (a new warning, so not necessarily a new
problem) on so many OSDs suggests that something is not optimal. As
suggested recently in another thread, it may be an indication that you need
to compact the OSDs.
I am not sure what you adjusted, but as long as the cluster works, I would
not have changed any parameters and would instead try to fix the problems
mentioned above.
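
A minimal sketch of how the compaction suggested above could be tried. The OSD ids are copied from the BLUESTORE_SLOW_OP_ALERT list in the health output quoted further down the thread. The `run` and `plan` helpers and the `RUN` variable are hypothetical names used only for illustration. By default the script only prints the commands so they can be reviewed first; this is an assumption about a cautious workflow, not a verified fix for this cluster.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set RUN=1 to execute them.
# OSD ids copied from the BLUESTORE_SLOW_OP_ALERT section of the quoted output.
SLOW_OSDS="5 9 18 36 59 66 106 110"

# Hypothetical helper: echo the command, and run it only when RUN=1.
run() {
  echo "$*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

plan() {
  # List the OSD daemons first to see which one cephadm reports as failed.
  run ceph orch ps --daemon_type osd
  # Online compaction, one OSD after another to limit the extra load.
  for id in $SLOW_OSDS; do
    run ceph tell "osd.$id" compact
  done
  # Check afterwards whether the upgrade has started moving again.
  run ceph orch upgrade status
}

plan
```

Running the OSDs one after another is a deliberate choice here: compaction adds I/O load, and the cluster is already backfilling.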
Good luck,
Michel
On 20 June 2025 at 05:13:38, Devender Singh <deven...@netskrt.io> wrote:
Here is the status
# ceph orch upgrade status
{
    "target_image": "quay.io/ceph/ceph@sha256:8214ebff6133ac27d20659038df6962dbf9d77da21c9438a296b2e2059a56af6",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [
        "crash",
        "mgr",
        "mon"
    ],
    "progress": "74/113 daemons upgraded",
    "message": "",
    "is_paused": false
}
Regards
Dev
On Jun 19, 2025, at 8:06 PM, Devender Singh <deven...@netskrt.io> wrote:
Hello all
I have a cluster that is in a hung state. Some backfills are running; I
reduced them to 1, but the upgrade is still not progressing…
Please help…
# ceph health detail
HEALTH_WARN 8 OSD(s) experiencing slow operations in BlueStore; Failed to
apply 2 service(s): osd.all-available-devices,osd.iops_optimized; 1 failed
cephadm daemon(s); failed to probe daemons or devices; noscrub,nodeep-scrub
flag(s) set; Degraded data redundancy: 1600150/39365198 objects degraded
(4.065%), 93 pgs degraded, 103 pgs undersized; 127 pgs not deep-scrubbed in
time
[WRN] BLUESTORE_SLOW_OP_ALERT: 8 OSD(s) experiencing slow operations in
BlueStore
osd.5 observed slow operation indications in BlueStore
osd.9 observed slow operation indications in BlueStore
osd.18 observed slow operation indications in BlueStore
osd.36 observed slow operation indications in BlueStore
osd.59 observed slow operation indications in BlueStore
osd.66 observed slow operation indications in BlueStore
osd.106 observed slow operation indications in BlueStore
osd.110 observed slow operation indications in BlueStore
[WRN] CEPHADM_APPLY_SPEC_FAIL: Failed to apply 2 service(s):
osd.all-available-devices,osd.iops_optimized
osd.all-available-devices: Command timed out on host cephadm deploy (osd
daemon) (default 900 second timeout)
osd.iops_optimized: Command timed out on host cephadm deploy (osd daemon)
(default 900 second timeout)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
daemon osd.19 on phl-prod-host04n.example.com is in error state
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
Command "cephadm ceph-volume -- inventory" timed out on host
phl-prod-converged03n.phl.netskrt.org (default 900 second timeout)
[WRN] OSDMAP_FLAGS: noscrub,nodeep-scrub flag(s) set
[WRN] PG_DEGRADED: Degraded data redundancy: 1600150/39365198 objects
degraded (4.065%), 93 pgs degraded, 103 pgs undersized
pg 25.0 is stuck undersized for 21h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[90,77,NONE,38,3]
pg 25.1 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[131,1,66,NONE,72]
pg 25.c is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[1,135,110,28,NONE]
pg 25.e is active+undersized+degraded+remapped+backfill_wait, acting
[18,20,108,101,NONE]
pg 25.14 is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[32,13,NONE,65,97]
pg 25.17 is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[32,65,110,23,NONE]
pg 25.1a is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[97,66,NONE,112,139]
pg 25.1c is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[102,136,97,NONE,45]
pg 25.1f is stuck undersized for 8h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,66,7,39,109]
pg 26.45 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [32,47]
pg 26.48 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [113,55]
pg 26.4b is stuck undersized for 8h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [105,7]
pg 26.59 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [35,40]
pg 26.6a is stuck undersized for 7h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [25,45]
pg 26.74 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [112,131]
pg 26.77 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [17,105]
pg 26.9c is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [47,76]
pg 26.bf is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting [32,26]
pg 26.c2 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [97,135]
pg 26.ec is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting [26,80]
pg 26.f7 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting [49,105]
pg 31.12 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[47,110,131,37,NONE]
pg 31.19 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,87,3,137,39]
pg 31.1b is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[108,NONE,42,102,97]
pg 31.1c is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[113,66,NONE,45,72]
pg 31.41 is stuck undersized for 2h, current state
active+undersized+remapped+backfill_wait, last acting [14,101,NONE,67,18]
pg 31.42 is stuck undersized for 7h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[37,134,NONE,82,8]
pg 31.44 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[49,NONE,101,3,62]
pg 31.46 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfilling, last acting
[95,38,NONE,25,102]
pg 31.47 is stuck undersized for 18h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[60,NONE,130,72,110]
pg 31.4a is stuck undersized for 4h, current state
active+undersized+degraded+remapped+backfilling, last acting
[66,NONE,135,82,14]
pg 31.4c is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[34,NONE,1,82,18]
pg 31.4d is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[101,13,65,30,NONE]
pg 31.52 is stuck undersized for 4m, current state
active+undersized+remapped+backfill_wait, last acting [26,112,66,NONE,135]
pg 31.53 is stuck undersized for 17h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,24,11,42,110]
pg 31.55 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[116,NONE,27,4,117]
pg 31.57 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[17,15,110,NONE,1]
pg 31.5a is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[13,24,124,26,NONE]
pg 31.5b is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,31,8,27,42]
pg 31.5c is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,60,28,18,119]
pg 31.5d is stuck undersized for 17h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[124,NONE,85,6,11]
pg 31.5e is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,4,1,23,138]
pg 31.60 is stuck undersized for 18h, current state
active+undersized+degraded+remapped+backfilling, last acting
[37,11,102,NONE,133]
pg 31.64 is stuck undersized for 4m, current state
active+undersized+remapped+backfill_wait, last acting [26,106,45,34,NONE]
pg 31.65 is stuck undersized for 4h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,14,127,62,3]
pg 31.66 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[34,NONE,35,23,59]
pg 31.67 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[67,24,127,8,NONE]
pg 31.68 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[32,41,18,17,NONE]
pg 31.70 is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfilling, last acting [106,38,1,NONE,97]
pg 31.75 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[26,31,34,8,NONE]
pg 31.7f is stuck undersized for 2h, current state
active+undersized+remapped+backfilling, last acting [NONE,101,110,10,45]
[WRN] PG_NOT_DEEP_SCRUBBED: 127 pgs not deep-scrubbed in time
pg 26.e5 not deep-scrubbed since 2025-06-07T05:59:22.612553+0000
pg 26.ca not deep-scrubbed since 2025-06-06T18:21:07.435823+0000
pg 26.bf not deep-scrubbed since 2025-06-03T07:29:02.791095+0000
pg 26.be not deep-scrubbed since 2025-06-04T07:55:54.389714+0000
pg 26.a5 not deep-scrubbed since 2025-06-07T07:28:56.878096+0000
pg 26.85 not deep-scrubbed since 2025-06-03T15:18:59.395929+0000
pg 26.7f not deep-scrubbed since 2025-06-04T21:29:28.412637+0000
pg 26.7e not deep-scrubbed since 2025-06-03T09:14:19.585388+0000
pg 26.7d not deep-scrubbed since 2025-06-03T18:37:30.931020+0000
pg 26.7c not deep-scrubbed since 2025-06-04T13:38:00.061488+0000
pg 26.73 not deep-scrubbed since 2025-06-03T06:20:15.111819+0000
pg 26.6f not deep-scrubbed since 2025-06-03T13:45:24.880397+0000
pg 26.6e not deep-scrubbed since 2025-05-26T23:15:32.099862+0000
pg 26.6d not deep-scrubbed since 2025-06-04T14:04:10.449101+0000
pg 31.62 not deep-scrubbed since 2025-06-03T13:34:49.518456+0000
pg 26.65 not deep-scrubbed since 2025-06-04T07:56:25.353411+0000
pg 31.66 not deep-scrubbed since 2025-06-03T10:32:05.364424+0000
pg 26.62 not deep-scrubbed since 2025-06-04T09:35:58.267976+0000
pg 31.65 not deep-scrubbed since 2025-06-03T16:04:40.003140+0000
pg 31.5b not deep-scrubbed since 2025-06-03T14:18:18.835477+0000
pg 26.5d not deep-scrubbed since 2025-06-04T15:14:30.870252+0000
pg 31.58 not deep-scrubbed since 2025-06-03T03:09:27.568605+0000
pg 26.5c not deep-scrubbed since 2025-06-03T01:57:27.644129+0000
pg 31.5f not deep-scrubbed since 2025-06-03T05:53:20.860393+0000
pg 31.52 not deep-scrubbed since 2025-05-27T00:01:27.040861+0000
pg 31.53 not deep-scrubbed since 2025-05-24T09:37:58.964829+0000
pg 26.55 not deep-scrubbed since 2025-06-04T21:25:34.135356+0000
pg 26.54 not deep-scrubbed since 2025-06-04T06:07:12.978734+0000
pg 31.56 not deep-scrubbed since 2025-06-04T12:58:17.599712+0000
pg 31.57 not deep-scrubbed since 2025-06-03T07:02:16.859990+0000
pg 26.51 not deep-scrubbed since 2025-06-03T05:42:22.435483+0000
pg 26.4f not deep-scrubbed since 2025-06-03T09:10:22.617328+0000
pg 31.4a not deep-scrubbed since 2025-05-28T00:54:55.246532+0000
pg 26.4e not deep-scrubbed since 2025-06-03T11:16:49.278513+0000
pg 31.4b not deep-scrubbed since 2025-06-03T10:24:24.123351+0000
pg 26.4d not deep-scrubbed since 2025-06-04T19:01:44.614410+0000
pg 31.49 not deep-scrubbed since 2025-05-28T04:56:29.368285+0000
pg 31.42 not deep-scrubbed since 2025-05-28T08:38:57.151865+0000
pg 26.41 not deep-scrubbed since 2025-06-03T05:47:35.443867+0000
pg 26.40 not deep-scrubbed since 2025-06-03T05:43:13.283668+0000
pg 25.17 not deep-scrubbed since 2025-06-03T09:26:52.253625+0000
pg 32.2e not deep-scrubbed since 2025-06-05T13:01:06.175389+0000
pg 22.1a not deep-scrubbed since 2025-06-07T01:40:45.063268+0000
pg 31.13 not deep-scrubbed since 2025-06-03T12:21:17.965218+0000
pg 22.1b not deep-scrubbed since 2025-06-04T12:22:44.947751+0000
pg 25.14 not deep-scrubbed since 2025-05-28T06:26:32.552200+0000
pg 26.17 not deep-scrubbed since 2025-06-03T18:37:26.617483+0000
pg 31.12 not deep-scrubbed since 2025-05-28T08:39:23.271194+0000
pg 31.1c not deep-scrubbed since 2025-06-03T08:17:51.230187+0000
pg 25.1f not deep-scrubbed since 2025-05-28T00:19:23.653883+0000
77 more pgs…
Regards
Dev
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io