Hi Dev,
I am not sure why you formatted osd.19; maybe I missed something. Ceph is
clearly very sensitive to network issues on any of the networks it uses.
Your priority should be to make sure the network configuration is OK
between your Ceph servers. This is an OS-level configuration issue that
you need to troubleshoot with the usual OS tools. The problem may be a
change in your network infrastructure (switch configuration, for example),
or an MTU-size problem if you're using jumbo frames...
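For example, to check that jumbo frames actually pass between two Ceph
hosts (a rough sketch; the interface name and peer IP are placeholders for
your own):
$ ip link show ens1f1 | grep mtu
$ ping -M do -s 8972 -c 3 10.104.5.125
8972 bytes of payload plus the 20-byte IP and 8-byte ICMP headers exactly
fill a 9000-byte MTU; if this ping fails while a small one succeeds,
something on the path (a switch port, a bond, a NIC) is dropping jumbo
frames.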
If your cluster was more or less OK before the upgrade, in my view there
is no reason to reformat OSDs or change anything in the cluster config.
You need to find the root cause, which may be something outside Ceph, and
fix it before trying to restart the upgrade.
Good luck.
Michel
Sent from my mobile
On 21 June 2025 at 23:50:01, Devender Singh <deven...@netskrt.io> wrote:
Hello Fred
I formatted osd.19 but am facing a similar issue on osd.9, so I have
paused the upgrade.
Below are the logs. Another issue I found is that my OSDs are not using
the cluster network…. How do I deal with it?
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
ceph config get mon public_network
10.104.1.0/24
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
ceph config get mon cluster_network
10.104.5.0/24
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
ceph osd find 9
{
    "osd": 9,
    "addrs": {
        "addrvec": [
            {
                "type": "v2",
                "addr": "10.104.1.124:6802",
                "nonce": 2257868117
            },
            {
                "type": "v1",
                "addr": "10.104.1.124:6803",
                "nonce": 2257868117
            }
        ]
    },
    "osd_fsid": "4db6e332-9031-4c81-8de0-00fdd6b860f6",
    "host": "pl-host04n.phl.example.com",
    "crush_location": {
        "host": "pl-host04n",
        "root": "default"
    }
}
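(Side check, since ceph osd find seems to report only the public/front
address, the cluster-network binding should be visible in the OSD
metadata:
$ ceph osd metadata 9 | grep -E 'back_addr|front_addr' )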
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
grep -E '\bosd\.9\b' /var/log/syslog-ceph |tail -20
Jun 21 20:10:00 pl-host04n bash[3965687]: debug
2025-06-21T20:09:59.996+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:10:00 pl-host04n bash[3965687]: cluster
2025-06-21T20:10:00.000344+0000 mon.pl-host04n (mon.0) 208269 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:20:00 pl-host04n bash[3965687]: debug
2025-06-21T20:19:59.994+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:20:00 pl-host04n bash[3965687]: cluster
2025-06-21T20:20:00.000282+0000 mon.pl-host04n (mon.0) 208543 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:30:00 pl-host04n bash[3965687]: debug
2025-06-21T20:29:59.993+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:30:00 pl-host04n bash[3965687]: cluster
2025-06-21T20:30:00.000292+0000 mon.pl-host04n (mon.0) 208802 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:40:00 pl-host04n bash[3965687]: debug
2025-06-21T20:39:59.996+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:40:00 pl-host04n bash[3965687]: cluster
2025-06-21T20:40:00.000322+0000 mon.pl-host04n (mon.0) 209104 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:50:00 pl-host04n bash[3965687]: debug
2025-06-21T20:49:59.994+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 20:50:00 pl-host04n bash[3965687]: cluster
2025-06-21T20:50:00.000342+0000 mon.pl-host04n (mon.0) 209394 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:00:00 pl-host04n bash[3965687]: debug
2025-06-21T20:59:59.993+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:00:00 pl-host04n bash[3965687]: cluster
2025-06-21T21:00:00.000331+0000 mon.pl-host04n (mon.0) 209681 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:10:00 pl-host04n bash[3965687]: debug
2025-06-21T21:09:59.996+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:10:00 pl-host04n bash[3965687]: cluster
2025-06-21T21:10:00.000308+0000 mon.pl-host04n (mon.0) 209967 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:20:00 pl-host04n bash[3965687]: debug
2025-06-21T21:19:59.994+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:20:00 pl-host04n bash[3965687]: cluster
2025-06-21T21:20:00.000268+0000 mon.pl-host04n (mon.0) 210298 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:30:00 pl-host04n bash[3965687]: debug
2025-06-21T21:29:59.993+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:30:00 pl-host04n bash[3965687]: cluster
2025-06-21T21:30:00.000316+0000 mon.pl-host04n (mon.0) 210560 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:40:00 pl-host04n bash[3965687]: debug
2025-06-21T21:39:59.996+0000 7fd05b6b4640 0 log_channel(cluster) log [WRN]
: daemon osd.9 on pl-host04n.phl.example.com is in error state
Jun 21 21:40:00 pl-host04n bash[3965687]: cluster
2025-06-21T21:40:00.000231+0000 mon.pl-host04n (mon.0) 210817 : cluster
[WRN] daemon osd.9 on pl-host04n.phl.example.com is in error state
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
systemctl list-units |grep -i osd.9
● ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service
loaded failed
failed Ceph osd.9 for a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac
● ceph-osd@9.service
loaded failed
failed Ceph object storage daemon osd.9
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
systemctl status ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service
× ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service - Ceph osd.9 for
a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac
Loaded: loaded
(/etc/systemd/system/ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@.service;
enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2025-06-21 18:46:01 UTC; 3h
2min ago
Process: 3156903 ExecStart=/bin/bash
/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9/unit.run
(code=exited, status=1/FAI>
Process: 3158431 ExecStopPost=/bin/bash
/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9/unit.poststop
(code=exited, stat>
Main PID: 3156903 (code=exited, status=1/FAILURE)
CPU: 571ms
Jun 21 18:46:01 pl-host04n.phl.example.com systemd[1]:
ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service: Scheduled >
Jun 21 18:46:01 pl-host04n.phl.example.com systemd[1]: Stopped Ceph osd.9
for a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac.
Jun 21 18:46:01 pl-host04n.phl.example.com systemd[1]:
ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service: Start requ>
Jun 21 18:46:01 pl-host04n.phl.example.com systemd[1]:
ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service: Failed wit>
Jun 21 18:46:01 pl-host04n.phl.example.com systemd[1]: Failed to start Ceph
osd.9 for a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac.
root@pl-host04n:/var/lib/ceph/a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac/osd.9#
journalctl -u ceph-a0bd51e8-4dfc-11ee-b5a9-3b06e501a0ac@osd.9.service
Jun 17 04:44:20 pl-host04n.phl.example.com bash[2152282]: debug
2025-06-17T04:44:20.588+0000 7fded3075640 -1 osd.9 pg_epoc>
Jun 17 04:44:24 pl-host04n.phl.example.com bash[2152282]: debug
2025-06-17T04:44:24.076+0000 7fdede08b640 4 rocksdb: [db/>
Jun 17 04:45:38 pl-host04n.phl.example.com bash[2152282]: debug
2025-06-17T04:45:38.812+0000 7fdede08b640 4 rocksdb: [db/>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: debug
2025-06-17T04:45:51.776+0000 7fdedd089640 4 rocksdb: [db/>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: debug
2025-06-17T04:45:51.776+0000 7fdedd089640 4 rocksdb: [db/>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: ** DB Stats **
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Uptime(secs):
88200.5 total, 600.0 interval
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Cumulative
writes: 4721K writes, 20M keys, 4721K commit groups, >
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Cumulative WAL:
4721K writes, 1745K syncs, 2.71 writes per sync,>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Cumulative stall:
00:00:0.000 H:M:S, 0.0 percent
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Interval writes:
21K writes, 229K keys, 21K commit groups, 1.0 w>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Interval WAL: 21K
writes, 7176 syncs, 2.98 writes per sync, writ>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Interval stall:
00:00:0.000 H:M:S, 0.0 percent
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: ** Compaction
Stats [O-1] **
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Level Files
Size Score Read(GB) Rn(GB) Rnp1(GB) Write(>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]:
---------------------------------------------------------------->
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: L0 0/0
0.00 KB 0.0 0.0 0.0 0.0 0>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: L1 7/0
393.07 MB 0.4 4.3 0.4 3.9 >
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Sum 7/0
393.07 MB 0.0 4.3 0.4 3.9 >
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Int 0/0
0.00 KB 0.0 0.0 0.0 0.0 0>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: ** Compaction
Stats [O-1] **
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Priority Files
Size Score Read(GB) Rn(GB) Rnp1(GB) Wri>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]:
---------------------------------------------------------------->
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Low 0/0
0.00 KB 0.0 4.3 0.4 3.9 4>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: High 0/0
0.00 KB 0.0 0.0 0.0 0.0 0>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Blob file count:
0, total size: 0.0 GB, garbage size: 0.0 GB, sp>
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Uptime(secs):
88200.5 total, 4800.1 interval
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Flush(GB):
cumulative 0.407, interval 0.000
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: AddFile(GB):
cumulative 0.000, interval 0.000
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: AddFile(Total
Files): cumulative 0, interval 0
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: AddFile(L0
Files): cumulative 0, interval 0
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: AddFile(Keys):
cumulative 0, interval 0
Jun 17 04:45:51 pl-host04n.phl.example.com bash[2152282]: Cumulative
compaction: 4.64 GB write, 0.05 MB/s write, 4.26 GB r>
Regards
Dev
On Jun 20, 2025, at 9:57 PM, Frédéric Nass <frederic.n...@univ-lorraine.fr>
wrote:
Hi Dev,
Since the MGRs and MONs were already upgraded successfully, you should be
safe stopping the upgrade and restarting it at a later time.
But before that, you could investigate why osd.19 is not coming up and why
ceph-volume inventory times out. Can you ssh from the MGR host to osd.19's
host phl-prod-host04n.example.com?
I would look into ceph-osd.19.log and /var/log/messages for any hints on
why osd.19 didn't start, then start it manually and see if the upgrade
resumes.
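(To start it manually, something like
$ ceph orch daemon restart osd.19
should do it, or systemctl restart ceph-<fsid>@osd.19.service directly on
the host, with <fsid> being your cluster fsid.)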
If the upgrade doesn't resume, I would increase the cephadm command
timeout to 1800 seconds (the default is 900):
$ ceph config set global mgr/cephadm/default_cephadm_command_timeout 1800
(This might need a ceph mgr fail, but ceph mgr fail will also interrupt
the upgrade, IIRC.)
then run
$ ceph orch device ls --hostname=phl-prod-host04n.example.com --refresh
and see if the upgrade resumes. If it doesn't, check
$ ceph log last 1000 debug cephadm
and run
$ ceph orch upgrade pause
$ ceph orch upgrade resume
again, and see if the upgrade resumes. If it still doesn't, then
$ ceph orch upgrade stop
$ ceph mgr fail
$ ceph orch upgrade start --image quay.io/ceph/ceph:v19.2.2
All those commands should be safe to run in the current state of your cluster.
Regards,
Frédéric.
From: Devender Singh <deven...@netskrt.io>
Sent: Friday, 20 June 2025 23:35
To: Anthony D'Atri
Cc: Michel Jouvin; ceph-users
Subject: [ceph-users] Re: CEPH upgrade from 18.2.7 to 19.2.2 -- Hung from
last 24h at 66%
Thanks all. If I stop the upgrade, what is the worst that can happen?
Regards
Dev
On Jun 20, 2025, at 6:41 AM, Anthony D'Atri <a...@dreamsnake.net> wrote:
Or, depending on the release in force when the OSDs were created, perhaps
shard the RocksDB column families?
https://www.ibm.com/docs/en/storage-ceph/8.0.0?topic=bluestore-resharding-rocksdb-database
(Playbook from cephadm-ansible)
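For reference, a resharding pass looks roughly like this on a cephadm
cluster (a sketch only: the OSD has to be stopped first, and the sharding
spec below is the documented default, so verify it against the docs for
your release):
$ ceph orch daemon stop osd.19
$ cephadm shell --name osd.19 -- ceph-bluestore-tool --path
/var/lib/ceph/osd/ceph-19 --sharding="m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" reshard
$ ceph orch daemon start osd.19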
On Jun 20, 2025, at 1:57 AM, Michel Jouvin <michel.jou...@ijclab.in2p3.fr>
wrote:
Hi Dev,
I'm not sure I understand why there were these service deployment
timeouts. The log says that one OSD failed, which may explain why the
upgrade is not progressing anymore. The BlueStore slow-ops warnings (a new
warning type, so not necessarily a new problem) on so many OSDs suggest
that something is not optimal. As suggested in another thread recently, it
may be an indication that you need to compact the OSDs.
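(If you go that route, online compaction can be triggered per OSD, for
example:
$ ceph tell osd.9 compact
or done offline with ceph-kvstore-tool if a daemon won't start.)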
I am not sure what you adjusted, but as long as the cluster works I would
not change parameters; I would try to fix the mentioned problems first.
Good luck,
Michel
Sent from my mobile
On 20 June 2025 at 05:13:38, Devender Singh <deven...@netskrt.io> wrote:
Here is the status
# ceph orch upgrade status
{
    "target_image": "quay.io/ceph/ceph@sha256:8214ebff6133ac27d20659038df6962dbf9d77da21c9438a296b2e2059a56af6",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [
        "crash",
        "mgr",
        "mon"
    ],
    "progress": "74/113 daemons upgraded",
    "message": "",
    "is_paused": false
}
Regards
Dev
On Jun 19, 2025, at 8:06 PM, Devender Singh <deven...@netskrt.io> wrote:
Hello all
I have a cluster that is in a hung state. Some backfills are in progress
and I reduced backfill to 1, but the upgrade is still not progressing…
Please help…
# ceph health detail
HEALTH_WARN 8 OSD(s) experiencing slow operations in BlueStore; Failed to
apply 2 service(s): osd.all-available-devices,osd.iops_optimized; 1 failed
cephadm daemon(s); failed to probe daemons or devices; noscrub,nodeep-scrub
flag(s) set; Degraded data redundancy: 1600150/39365198 objects degraded
(4.065%), 93 pgs degraded, 103 pgs undersized; 127 pgs not deep-scrubbed in
time
[WRN] BLUESTORE_SLOW_OP_ALERT: 8 OSD(s) experiencing slow operations in
BlueStore
osd.5 observed slow operation indications in BlueStore
osd.9 observed slow operation indications in BlueStore
osd.18 observed slow operation indications in BlueStore
osd.36 observed slow operation indications in BlueStore
osd.59 observed slow operation indications in BlueStore
osd.66 observed slow operation indications in BlueStore
osd.106 observed slow operation indications in BlueStore
osd.110 observed slow operation indications in BlueStore
[WRN] CEPHADM_APPLY_SPEC_FAIL: Failed to apply 2 service(s):
osd.all-available-devices,osd.iops_optimized
osd.all-available-devices: Command timed out on host cephadm deploy (osd
daemon) (default 900 second timeout)
osd.iops_optimized: Command timed out on host cephadm deploy (osd daemon)
(default 900 second timeout)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
daemon osd.19 on phl-prod-host04n.example.com is in error state
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
Command "cephadm ceph-volume -- inventory" timed out on host
phl-prod-converged03n.phl.netskrt.org (default 900 second timeout)
[WRN] OSDMAP_FLAGS: noscrub,nodeep-scrub flag(s) set
[WRN] PG_DEGRADED: Degraded data redundancy: 1600150/39365198 objects
degraded (4.065%), 93 pgs degraded, 103 pgs undersized
pg 25.0 is stuck undersized for 21h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[90,77,NONE,38,3]
pg 25.1 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[131,1,66,NONE,72]
pg 25.c is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[1,135,110,28,NONE]
pg 25.e is active+undersized+degraded+remapped+backfill_wait, acting
[18,20,108,101,NONE]
pg 25.14 is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[32,13,NONE,65,97]
pg 25.17 is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[32,65,110,23,NONE]
pg 25.1a is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[97,66,NONE,112,139]
pg 25.1c is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[102,136,97,NONE,45]
pg 25.1f is stuck undersized for 8h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,66,7,39,109]
pg 26.45 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [32,47]
pg 26.48 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [113,55]
pg 26.4b is stuck undersized for 8h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [105,7]
pg 26.59 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [35,40]
pg 26.6a is stuck undersized for 7h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [25,45]
pg 26.74 is stuck undersized for 23h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [112,131]
pg 26.77 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [17,105]
pg 26.9c is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [47,76]
pg 26.bf is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting [32,26]
pg 26.c2 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting [97,135]
pg 26.ec is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting [26,80]
pg 26.f7 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting [49,105]
pg 31.12 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[47,110,131,37,NONE]
pg 31.19 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,87,3,137,39]
pg 31.1b is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[108,NONE,42,102,97]
pg 31.1c is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[113,66,NONE,45,72]
pg 31.41 is stuck undersized for 2h, current state
active+undersized+remapped+backfill_wait, last acting [14,101,NONE,67,18]
pg 31.42 is stuck undersized for 7h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[37,134,NONE,82,8]
pg 31.44 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[49,NONE,101,3,62]
pg 31.46 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfilling, last acting
[95,38,NONE,25,102]
pg 31.47 is stuck undersized for 18h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[60,NONE,130,72,110]
pg 31.4a is stuck undersized for 4h, current state
active+undersized+degraded+remapped+backfilling, last acting
[66,NONE,135,82,14]
pg 31.4c is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[34,NONE,1,82,18]
pg 31.4d is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[101,13,65,30,NONE]
pg 31.52 is stuck undersized for 4m, current state
active+undersized+remapped+backfill_wait, last acting [26,112,66,NONE,135]
pg 31.53 is stuck undersized for 17h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,24,11,42,110]
pg 31.55 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[116,NONE,27,4,117]
pg 31.57 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[17,15,110,NONE,1]
pg 31.5a is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[13,24,124,26,NONE]
pg 31.5b is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,31,8,27,42]
pg 31.5c is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,60,28,18,119]
pg 31.5d is stuck undersized for 17h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[124,NONE,85,6,11]
pg 31.5e is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,4,1,23,138]
pg 31.60 is stuck undersized for 18h, current state
active+undersized+degraded+remapped+backfilling, last acting
[37,11,102,NONE,133]
pg 31.64 is stuck undersized for 4m, current state
active+undersized+remapped+backfill_wait, last acting [26,106,45,34,NONE]
pg 31.65 is stuck undersized for 4h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[NONE,14,127,62,3]
pg 31.66 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[34,NONE,35,23,59]
pg 31.67 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[67,24,127,8,NONE]
pg 31.68 is stuck undersized for 4m, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[32,41,18,17,NONE]
pg 31.70 is stuck undersized for 19h, current state
active+undersized+degraded+remapped+backfilling, last acting [106,38,1,NONE,97]
pg 31.75 is stuck undersized for 2h, current state
active+undersized+degraded+remapped+backfill_wait, last acting
[26,31,34,8,NONE]
pg 31.7f is stuck undersized for 2h, current state
active+undersized+remapped+backfilling, last acting [NONE,101,110,10,45]
[WRN] PG_NOT_DEEP_SCRUBBED: 127 pgs not deep-scrubbed in time
pg 26.e5 not deep-scrubbed since 2025-06-07T05:59:22.612553+0000
pg 26.ca not deep-scrubbed since 2025-06-06T18:21:07.435823+0000
pg 26.bf not deep-scrubbed since 2025-06-03T07:29:02.791095+0000
pg 26.be not deep-scrubbed since 2025-06-04T07:55:54.389714+0000
pg 26.a5 not deep-scrubbed since 2025-06-07T07:28:56.878096+0000
pg 26.85 not deep-scrubbed since 2025-06-03T15:18:59.395929+0000
pg 26.7f not deep-scrubbed since 2025-06-04T21:29:28.412637+0000
pg 26.7e not deep-scrubbed since 2025-06-03T09:14:19.585388+0000
pg 26.7d not deep-scrubbed since 2025-06-03T18:37:30.931020+0000
pg 26.7c not deep-scrubbed since 2025-06-04T13:38:00.061488+0000
pg 26.73 not deep-scrubbed since 2025-06-03T06:20:15.111819+0000
pg 26.6f not deep-scrubbed since 2025-06-03T13:45:24.880397+0000
pg 26.6e not deep-scrubbed since 2025-05-26T23:15:32.099862+0000
pg 26.6d not deep-scrubbed since 2025-06-04T14:04:10.449101+0000
pg 31.62 not deep-scrubbed since 2025-06-03T13:34:49.518456+0000
pg 26.65 not deep-scrubbed since 2025-06-04T07:56:25.353411+0000
pg 31.66 not deep-scrubbed since 2025-06-03T10:32:05.364424+0000
pg 26.62 not deep-scrubbed since 2025-06-04T09:35:58.267976+0000
pg 31.65 not deep-scrubbed since 2025-06-03T16:04:40.003140+0000
pg 31.5b not deep-scrubbed since 2025-06-03T14:18:18.835477+0000
pg 26.5d not deep-scrubbed since 2025-06-04T15:14:30.870252+0000
pg 31.58 not deep-scrubbed since 2025-06-03T03:09:27.568605+0000
pg 26.5c not deep-scrubbed since 2025-06-03T01:57:27.644129+0000
pg 31.5f not deep-scrubbed since 2025-06-03T05:53:20.860393+0000
pg 31.52 not deep-scrubbed since 2025-05-27T00:01:27.040861+0000
pg 31.53 not deep-scrubbed since 2025-05-24T09:37:58.964829+0000
pg 26.55 not deep-scrubbed since 2025-06-04T21:25:34.135356+0000
pg 26.54 not deep-scrubbed since 2025-06-04T06:07:12.978734+0000
pg 31.56 not deep-scrubbed since 2025-06-04T12:58:17.599712+0000
pg 31.57 not deep-scrubbed since 2025-06-03T07:02:16.859990+0000
pg 26.51 not deep-scrubbed since 2025-06-03T05:42:22.435483+0000
pg 26.4f not deep-scrubbed since 2025-06-03T09:10:22.617328+0000
pg 31.4a not deep-scrubbed since 2025-05-28T00:54:55.246532+0000
pg 26.4e not deep-scrubbed since 2025-06-03T11:16:49.278513+0000
pg 31.4b not deep-scrubbed since 2025-06-03T10:24:24.123351+0000
pg 26.4d not deep-scrubbed since 2025-06-04T19:01:44.614410+0000
pg 31.49 not deep-scrubbed since 2025-05-28T04:56:29.368285+0000
pg 31.42 not deep-scrubbed since 2025-05-28T08:38:57.151865+0000
pg 26.41 not deep-scrubbed since 2025-06-03T05:47:35.443867+0000
pg 26.40 not deep-scrubbed since 2025-06-03T05:43:13.283668+0000
pg 25.17 not deep-scrubbed since 2025-06-03T09:26:52.253625+0000
pg 32.2e not deep-scrubbed since 2025-06-05T13:01:06.175389+0000
pg 22.1a not deep-scrubbed since 2025-06-07T01:40:45.063268+0000
pg 31.13 not deep-scrubbed since 2025-06-03T12:21:17.965218+0000
pg 22.1b not deep-scrubbed since 2025-06-04T12:22:44.947751+0000
pg 25.14 not deep-scrubbed since 2025-05-28T06:26:32.552200+0000
pg 26.17 not deep-scrubbed since 2025-06-03T18:37:26.617483+0000
pg 31.12 not deep-scrubbed since 2025-05-28T08:39:23.271194+0000
pg 31.1c not deep-scrubbed since 2025-06-03T08:17:51.230187+0000
pg 25.1f not deep-scrubbed since 2025-05-28T00:19:23.653883+0000
77 more pgs…
Regards
Dev
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io