Hi Brett,

This issue was present long before the upgrade to 14.2.1. The upgrade just made the corresponding alert visible.

You can turn the alert off by setting bluestore_warn_on_bluefs_spillover=false.
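
A minimal sketch of one way to apply that setting, assuming the cluster uses the centralized config database available in Nautilus and that the option should cover every OSD (the `osd` target is my assumption; a single daemon such as `osd.0` could be named instead). Whether running OSDs pick the change up without a restart is worth verifying on your version.

```shell
# Silence the BLUEFS_SPILLOVER health warning for all OSDs.
# This only hides the alert; the spilled-over data stays on the slow device.
sudo ceph config set osd bluestore_warn_on_bluefs_spillover false

# Re-check cluster health to confirm the warning is gone.
sudo ceph health detail
```

The same option can presumably also be set in the [osd] section of ceph.conf if you manage settings there instead.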

But generally this warning indicates an inefficient DB data layout: some data is kept on the slow device, which might have a negative performance impact.

Unfortunately, that's a known issue with the current RocksDB/BlueStore interaction: spillover to the slow device can take place even when there is plenty of free space on the fast one.


Thanks,

Igor



On 6/18/2019 8:46 PM, Brett Chancellor wrote:
Does anybody have a fix for BlueFS spillover detected? This started happening 2 days after an upgrade to 14.2.1 and has increased from 3 OSDs to 118 in the last 4 days. I read you could fix it by rebuilding the OSDs, but rebuilding the 264 OSDs on this cluster will take months of rebalancing.

$ sudo ceph health detail
HEALTH_WARN BlueFS spillover detected on 118 OSD(s)
BLUEFS_SPILLOVER BlueFS spillover detected on 118 OSD(s)
     osd.0 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.1 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.5 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.6 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.11 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.13 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.21 spilled over 102 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.22 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.23 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.24 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.25 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.26 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.27 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.30 spilled over 65 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.32 spilled over 21 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.34 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.42 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.45 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.46 spilled over 24 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.47 spilled over 63 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.48 spilled over 63 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.49 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.50 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.52 spilled over 140 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.53 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.54 spilled over 59 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.55 spilled over 134 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.56 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.57 spilled over 61 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.58 spilled over 66 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.59 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.61 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.62 spilled over 59 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.65 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.67 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.69 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.71 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.73 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.74 spilled over 17 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.75 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.76 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.78 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.80 spilled over 100 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.81 spilled over 63 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.82 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.83 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.84 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.85 spilled over 19 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.87 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.89 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.93 spilled over 102 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.95 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.98 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.101 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.103 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.108 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.110 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.112 spilled over 65 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.113 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.115 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.117 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.118 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.119 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.120 spilled over 65 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.121 spilled over 101 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.122 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.126 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.127 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.128 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.129 spilled over 67 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.132 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.133 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.137 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.138 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.139 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.142 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.143 spilled over 27 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.144 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.147 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.148 spilled over 96 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.157 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.158 spilled over 64 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.160 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.161 spilled over 61 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.167 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.177 spilled over 23 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.180 spilled over 140 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.185 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.189 spilled over 62 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.190 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.192 spilled over 19 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.193 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.202 spilled over 21 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.207 spilled over 27 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.216 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.219 spilled over 59 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.220 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.221 spilled over 176 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.223 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.225 spilled over 22 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.226 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.228 spilled over 59 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.236 spilled over 65 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.237 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.238 spilled over 23 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.239 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.240 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.241 spilled over 26 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.242 spilled over 60 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.243 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.244 spilled over 103 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.245 spilled over 144 GiB metadata from 'db' device (29 GiB used of 148 GiB) to slow device
     osd.246 spilled over 25 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.247 spilled over 24 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.251 spilled over 106 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.252 spilled over 105 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.261 spilled over 20 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device
     osd.262 spilled over 22 GiB metadata from 'db' device (28 GiB used of 148 GiB) to slow device

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com