Hello Ceph users,

I have an issue with my Ceph cluster: after a serious failure of four SSDs (electrically dead) I lost PGs (and their replicas), and I now have 14 PGs stuck.
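(For reference, the stuck PGs can be listed with the commands below; both exist on Jewel 10.2.x, and the output is trimmed here.)

~# ceph health detail            # HEALTH_ERR summary plus the IDs of the stuck PGs
~# ceph pg dump_stuck inactive   # dumps only the PGs that are stuck inactive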
To correct this I tried to force-create those PGs (with the same IDs), but now the PGs are stuck in the creating state -_-" :

~# ceph -s
    health HEALTH_ERR
           14 pgs are stuck inactive for more than 300 seconds
           ....

~# ceph pg dump | grep creating
dumped all in format plain
9.3  0 0 0 0 0 0 0 0 creating 2019-02-25 09:32:12.333979 0'0 0:0 [20,26] 20 [20,11] 20 0'0 2019-02-25 09:32:12.333979 0'0 2019-02-25 09:32:12.333979
3.9  0 0 0 0 0 0 0 0 creating 2019-02-25 09:32:11.295451 0'0 0:0 [16,39] 16 [17,6] 17 0'0 2019-02-25 09:32:11.295451 0'0 2019-02-25 09:32:11.295451
...

I tried creating a new PG that did not exist before, and that works, but these PGs stay stuck in the creating state.

In my monitor logs I see this message:

2019-02-25 11:02:46.904897 7f5a371ed700 0 mon.controller1@1(peon) e7 handle_command mon_command({"prefix": "pg force_create_pg", "pgid": "4.20e"} v 0) v1
2019-02-25 11:02:46.904938 7f5a371ed700 0 log_channel(audit) log [INF] : from='client.? 172.31.101.107:0/3101034432' entity='client.admin' cmd=[{"prefix": "pg force_create_pg", "pgid": "4.20e"}]: dispatch

When I check the map I get:

~# ceph pg map 4.20e
osdmap e428069 pg 4.20e (4.20e) -> up [27,37,36] acting [13,17]

I restarted OSDs 27, 37, 36, 13 and 17 (one by one) but with no effect.

I have seen this issue, http://tracker.ceph.com/issues/18298, but I run Ceph 10.2.11.

So could you help me please?

Many thanks in advance,
Sfalicon.
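PS: to summarise, the steps I ran for each stuck PG were roughly the following (the OSD restart line assumes a systemd deployment; adjust it to your init system):

~# ceph pg force_create_pg 4.20e   # re-issue the force-create for the lost PG (Jewel syntax)
~# ceph pg map 4.20e               # check where the PG is now mapped
~# systemctl restart ceph-osd@27   # assumed restart command; repeated for OSDs 37, 36, 13 and 17, one by one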