Dear all,

I have 4 servers with 4 OSDs / drives each, so in total I have 16 OSDs. For
some reason, the last server is over-utilised compared to the first 3
servers, causing all the OSDs on the fourth server (osd.12, osd.13, osd.14
and osd.15) to be near full (above 85%):

/dev/sda1      458140932 393494412  64646520  86% /var/lib/ceph/osd/ceph-12
/dev/sdb1      458140932 391394376  66746556  86% /var/lib/ceph/osd/ceph-13
/dev/sdc1      458140932 408390636  49750296  90% /var/lib/ceph/osd/ceph-14
/dev/sdd1      458140932 405402916  52738016  89% /var/lib/ceph/osd/ceph-15
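
(For reference, the listing above is just plain df on the OSD data
partitions; the same numbers can be reproduced on that host with something
like the following, assuming the default mount points under
/var/lib/ceph/osd:)

$ df /var/lib/ceph/osd/ceph-*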

The cluster's health is now degraded because a lot of PGs are stuck in
active+remapped+backfill_toofull status, with 5 near-full OSDs (all 4 on
the fourth server and one more on another server).

$ ceph health
HEALTH_WARN 21 pgs backfill_toofull; 21 pgs stuck unclean; recovery
30972/1624868 degraded (1.906%); 5 near full osd(s)
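
(The per-OSD and per-PG breakdown behind this summary can be pulled with
ceph health detail, which lists each backfill_toofull PG and each near-full
OSD individually:)

$ ceph health detail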

When I tried to adjust the weights, it seems that the PGs are being
reassigned to different OSDs on the same server (e.g. from osd.15 to
osd.14, osd.13, etc.) and not to OSDs on the other servers, which still
have more space (utilisation below 80%).

-5      1.72            host ceph-osd-04
12      0.43                    osd.12  up      1
13      0.43                    osd.13  up      0.9
14      0.43                    osd.14  up      0.85
15      0.43                    osd.15  up      0.85
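
(For context, the override weights in the last column of the tree above
were set with ceph osd reweight, roughly like this; as I understand it,
this changes the temporary override weight rather than the CRUSH weight
shown in the second column:)

$ ceph osd reweight 13 0.9
$ ceph osd reweight 14 0.85
$ ceph osd reweight 15 0.85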


From the table below, we can see that Ceph tries to remap PGs to another
OSD on the same server, which is also near full.

$ grep backfill pg_dump.8 | awk '{print $1,$9,$14,$15}'
4.1fe active+remapped+backfill_toofull [0,15] [0,15,12]
4.1f1 active+remapped+backfill_toofull [5,14] [5,14,15]
4.1db active+remapped+backfill_toofull [15,8] [8,15,12]
4.1d3 active+remapped+backfill_toofull [14,3] [3,14,15]
4.1d1 active+remapped+backfill_toofull [7,15] [7,15,14]
4.1b1 active+remapped+backfill_toofull [13,8] [8,13,15]
4.1a0 active+remapped+backfill_toofull [15,1] [1,15,14]
4.18a active+remapped+backfill_toofull [13,2] [2,13,15]
4.18d active+remapped+backfill_toofull [10,14] [10,14,15]
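
(As a quick check on how concentrated the remap targets are, the OSD ids
in that last bracketed column can be counted with a small pipeline like
this, using the same pg_dump.8 file:)

$ grep backfill pg_dump.8 | awk '{print $15}' | tr -d '[]' | tr ',' '\n' | sort -n | uniq -c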

This causes never-ending degradation, since these OSDs will always stay
near full unless the PGs are remapped to OSDs on the other servers.

Is it possible to "force" certain PGs to be reassigned to a particular OSD
on a different server?

Looking forward to your reply, thank you.

Cheers.