Hi,

After a disaster and restarting the cluster for automatic recovery, I found the following ceph status. Some OSDs cannot be restarted due to file system corruption (it seems that XFS is fragile).
[root@management-b ~]# ceph status
    cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
     health HEALTH_WARN
            209 pgs degraded
            209 pgs stuck degraded
            334 pgs stuck unclean
            209 pgs stuck undersized
            209 pgs undersized
            recovery 5354/77810 objects degraded (6.881%)
            recovery 1105/77810 objects misplaced (1.420%)
     monmap e1: 3 mons at {management-a=10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0}
            election epoch 2308, quorum 0,1,2 management-a,management-b,management-c
     osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
            flags sortbitwise
      pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905 objects
            678 GB used, 90444 GB / 91123 GB avail
            5354/77810 objects degraded (6.881%)
            1105/77810 objects misplaced (1.420%)
                2226 active+clean
                 209 active+undersized+degraded
                 125 active+remapped
  client io 0 B/s rd, 282 kB/s wr, 10 op/s

Since the total number of active PGs equals the total PG count, and the number of degraded PGs equals the number of undersized PGs, does that mean every PG has at least one good replica? If so, can I simply mark the down OSDs as lost (or remove them), reformat their disks, and restart them, assuming there is no hardware problem with the HDDs?

Which PG state should I pay more attention to regarding the possibility of lost objects: degraded or undersized?

Best regards,
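For what it's worth, the arithmetic behind my reasoning can be checked directly from the numbers in the status output above (this is just a sanity check on the quoted figures, not a query against the live cluster):

```python
# PG state counts copied from the `ceph status` output above.
pg_total = 2560
active_clean = 2226
active_undersized_degraded = 209
active_remapped = 125

# Every PG is accounted for by an active+* state, i.e. each PG is
# serving I/O from at least one live replica.
assert active_clean + active_undersized_degraded + active_remapped == pg_total

# Object-level recovery figures, also copied from the output.
objects_total = 77810
degraded = 5354
misplaced = 1105
print(f"degraded:  {100 * degraded / objects_total:.3f}%")   # matches the reported 6.881%
print(f"misplaced: {100 * misplaced / objects_total:.3f}%")  # matches the reported 1.420%
```

The fact that the degraded and undersized PG counts coincide (209 each) is what suggests the degraded PGs are degraded only because they have fewer replicas than the pool size, not because any objects are unfound.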
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com