I don't hear many people discuss using xfs_fsr on OSDs, and going over the mailing list history, it seems to have been brought up very infrequently, and never as a suggestion for regular maintenance. Perhaps it's not needed.
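If you want to gauge whether fragmentation is even significant on that OSD before defragmenting, something along these lines should work; the device /dev/sdX1 is a placeholder for whatever backs your osd.10, and the mount point assumes the default Ceph OSD path:

  # Report file fragmentation for the filesystem backing osd.10
  # (-r opens the device read-only, so it is safe on a mounted fs):
  xfs_db -r -c frag /dev/sdX1

  # If the fragmentation factor is actually high, defragment just
  # that one mount point, capped at 10 minutes, rather than the
  # blanket nightly run:
  xfs_fsr -v -t 600 /var/lib/ceph/osd/ceph-10

If xfs_db reports a low fragmentation factor, that would support the idea that the nightly xfs_fsr run is not buying you much.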
One thing to consider trying, to rule out something funky with the XFS filesystem on that particular OSD/drive, would be to remove the OSD entirely from the cluster, reformat the disk, and then rebuild the OSD, putting a brand new XFS on it (a rough command sketch follows below the quoted message).

On Mon, Jan 15, 2018 at 7:36 AM, lists <li...@merit.unu.edu> wrote:
> Hi,
>
> On our three-node, 24-OSD Ceph 10.2.10 cluster, we have started seeing
> slow requests on a specific OSD during the two-hour nightly xfs_fsr run
> from 05:00 - 07:00. This started after we applied the Meltdown patches.
>
> The specific osd.10 also has the highest space utilization of all OSDs
> cluster-wide, at 45%, while the others are mostly around 40%. All OSDs
> are the same 4TB platters with journal on SSD, all with weight 1.
>
> SMART info for osd.10 shows nothing interesting, I think:
>
>> Current Drive Temperature: 27 C
>> Drive Trip Temperature: 60 C
>>
>> Manufactured in week 04 of year 2016
>> Specified cycle count over device lifetime: 10000
>> Accumulated start-stop cycles: 53
>> Specified load-unload count over device lifetime: 300000
>> Accumulated load-unload cycles: 697
>> Elements in grown defect list: 0
>>
>> Vendor (Seagate) cache information
>>   Blocks sent to initiator = 1933129649
>>   Blocks received from initiator = 869206640
>>   Blocks read from cache and sent to initiator = 2149311508
>>   Number of read and write commands whose size <= segment size = 676356809
>>   Number of read and write commands whose size > segment size = 12734900
>>
>> Vendor (Seagate/Hitachi) factory information
>>   number of hours powered up = 13625.88
>>   number of minutes until next internal SMART test = 8
>
> Now my question:
> Could it be that osd.10 just happens to contain some data chunks that
> are heavily needed by the VMs around that time, and that the added load
> of an xfs_fsr run is simply too much for it to handle?
>
> In that case, how about reweighting osd.10 to "0", waiting until all
> data has moved off it, and then setting it back to "1"? Would this
> result in *exactly* the same situation as before, or would it at least
> cause the data to be spread better across the other OSDs?
>
> (The idea being that a better data spread across OSDs also brings a
> better distribution of load between the OSDs.)
>
> Or other ideas to check out?
>
> MJ

--
Respectfully,

Wes Dillingham
wes_dilling...@harvard.edu
Research Computing | Senior CyberInfrastructure Storage Engineer
Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 204
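For reference, the remove-and-rebuild procedure suggested above might look roughly like this on a jewel (10.2.x) cluster using ceph-disk; /dev/sdX and the SSD journal partition /dev/sdY1 are placeholder device names, and you would want the cluster healthy before and between steps:

  # Drain osd.10 so its data is re-replicated elsewhere first:
  ceph osd out 10
  # ... wait for recovery to finish ("ceph -s" shows HEALTH_OK) ...

  # Remove the OSD from the cluster:
  systemctl stop ceph-osd@10
  ceph osd crush remove osd.10
  ceph auth del osd.10
  ceph osd rm 10

  # Wipe the disk and rebuild the OSD with a brand new XFS,
  # reusing the existing SSD journal partition:
  ceph-disk zap /dev/sdX
  ceph-disk prepare --fs-type xfs /dev/sdX /dev/sdY1
  ceph-disk activate /dev/sdX1

This also answers the reweight question indirectly: since the freed ID is reused, the rebuilt OSD ends up with the same CRUSH position and weight, so the long-term data placement should be essentially the same as before, just on a fresh filesystem.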