You want to look into the setting osd_max_scrubs, which controls how many scrub operations an OSD can be involved in at once (the well-chosen default is 1), as well as osd_scrub_max_interval and osd_deep_scrub_interval. One of the differences in your cluster between then and now is time.
It might be that your PGs are running into their max interval and being forced to scrub now, which interferes with your script. We also manage our deep scrubs with a cron job, and we have set our osd_deep_scrub_interval to longer than it takes our cron job to go through all of the PGs, so that they never deep scrub automatically. For all of the OSD scrub settings available to you, please refer to the OSD config reference in the Ceph documentation: http://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/

David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation

________________________________
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Richard Arends [cephmailingl...@mosibi.nl]
Sent: Tuesday, January 17, 2017 8:03 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Manual deep scrub

Hi,

When I start a deep scrub on a PG by hand with 'ceph pg deep-scrub 1.18d5', the deep scrub is sometimes executed directly after the command is entered, but often it is not, and a lot of time passes between issuing the command and the scrub actually starting. For example:

2017-01-17 05:25:31.786 session 01162017 :: Starting deep-scrub on pg 1.1a39 for pool openstack_volumes
2017-01-17 06:37:48.135325 osd.1120 <ip>:6814/4237 263 : cluster [INF] 1.1a39 deep-scrub starts
2017-01-17 06:58:07.651926 osd.1120 <ip>:6814/4237 264 : cluster [INF] 1.1a39 deep-scrub ok

The first log line is from our 'deep scrub cron script'.
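To put numbers on the example above, the three timestamps can simply be subtracted from each other. This is a minimal Python sketch that uses only the timestamps copied from the log lines; the variable names are illustrative and not part of any Ceph tooling.

```python
from datetime import datetime

# Timestamp layout used in the log lines; %f accepts 1 to 6 fractional digits.
FMT = "%Y-%m-%d %H:%M:%S.%f"

# Copied from the three log lines in the example.
requested = datetime.strptime("2017-01-17 05:25:31.786", FMT)     # cron script issues the command
started = datetime.strptime("2017-01-17 06:37:48.135325", FMT)    # "deep-scrub starts"
finished = datetime.strptime("2017-01-17 06:58:07.651926", FMT)   # "deep-scrub ok"

queue_delay = started - requested      # time the request waited before the OSD began
scrub_duration = finished - started    # time the deep scrub itself took

print("queued for:   ", queue_delay)
print("scrub ran for:", scrub_duration)
```

For this PG the request waited a little over 1 hour 12 minutes, while the deep scrub itself took about 20 minutes.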
My question is: what determines how, when, and for how long a deep scrub is queued, and is there a way to force the deep scrub to run 'now'? I am looking into this because in the past we could deep scrub a specific set in 24 hours, and now that is no longer possible.

--
Regards,
Richard.
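The suggestion in the reply at the top of this thread, keeping osd_deep_scrub_interval longer than one full pass of the cron job, can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the function names, the PG count, and the per-run batch size are all hypothetical, and the one-week value is assumed to be the default interval, so check it against your own configuration.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def cron_cycle_seconds(pg_count, pgs_per_run, seconds_between_runs):
    """Estimate how long the cron job needs to deep scrub every PG once."""
    runs_needed = math.ceil(pg_count / pgs_per_run)
    return runs_needed * seconds_between_runs


def interval_is_safe(osd_deep_scrub_interval, cycle_seconds):
    """True if the OSDs will not schedule deep scrubs before the cron job gets there."""
    return osd_deep_scrub_interval > cycle_seconds


# Hypothetical cluster: 8192 PGs, 400 PGs deep scrubbed per nightly cron run.
cycle = cron_cycle_seconds(pg_count=8192, pgs_per_run=400,
                           seconds_between_runs=SECONDS_PER_DAY)

# Assumption: a one-week interval (604800 seconds); verify the value on your cluster.
one_week_interval = 7 * SECONDS_PER_DAY

print("full cron cycle in days:", cycle / SECONDS_PER_DAY)
print("one-week interval is long enough:", interval_is_safe(one_week_interval, cycle))
```

With these made-up numbers the cron job needs 21 nightly runs, so a one-week interval would let the OSDs schedule their own deep scrubs alongside the script; the interval would need to be longer than 21 days to avoid that.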
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com