The original concept of *osd_max_backfills* is to prevent the following
situation:
"*situationIf all of these backfills happen simultaneously, it would put excessive load on the osd.*" the value of "osd_max_backfills" could be important in some situation. So we might not be able to say how it's important. >From my experience, big cluster easily could become complicted. Because I know some automobile manufacturers which faced performance issues. Actually their ceph cluster are not quite big so -; *"dropping down max backfills is more important than reducing max recovery (gathering recovery metadata happens largely in memory)"* As Jan said, "*increasing the number of PGs helped with this as the “blocks” of work are much smaller than before.*" A number of PGs is also one of factors that improve performance, and needs to be considered. >From messages of Huang and Jan, we might need to think that a total number of PGs are not always equal to the following formular. "*Total PGs = (OSDs * 100) / pool size*" So what I like and would like to try are: " *What I would be happy to see is more of a QOS style tunable along the lines of networking traffic shaping.*" - Milosz Tanski " *Another idea would be to have a better way to prioritize recovery traffic to an* *even lower priority level by setting the ionice value to 'idle' in the CFQ scheduler*" - Bryan Stillwell Shinobu On Fri, Jun 5, 2015 at 8:24 AM, Scottix <scot...@gmail.com> wrote: > From a ease of use standpoint and depending on the situation you are > setting up your environment, the idea is as follow; > > It seems like it would be nice to have some easy on demand control where > you don't have to think a whole lot other than knowing how it is going to > affect your cluster in a general sense. > > The two extremes and a general limitation would be: > 1. Priority data recover > 2. Priority client usability > 3rd might be hardware related like 1Gb connection > > With predefined settings you can setup different levels that have sensible > settings and maybe 1 that is custom for the advanced user. > Example command (Caveat: I don't fully know how your configs work): > ceph osd set priority <low|medium|high|custom> > *With priority set it would lock certain attributes > **With priority unset it would unlock certain attributes > > In our use case basically after 8pm the activity goes way down. Here I can > up the priority to medium or high, then at 6 am I can adjust it back to low. > > With cron I can easily schedule that or depending on the current situation > I can schedule maintenance and change the priority to fit my needs. > > > > On Thu, Jun 4, 2015 at 2:01 PM Mike Dawson <mike.daw...@cloudapt.com> > wrote: > >> With a write-heavy RBD workload, I add the following to ceph.conf: >> >> osd_max_backfills = 2 >> osd_recovery_max_active = 2 >> >> If things are going well during recovery (i.e. guests happy and no slow >> requests), I will often bump both up to three: >> >> # ceph tell osd.* injectargs '--osd-max-backfills 3 >> --osd-recovery-max-active 3' >> >> If I see slow requests, I drop them down. 
>>
>> The biggest downside to setting either to 1 seems to be the long tail
>> issue detailed in:
>>
>> http://tracker.ceph.com/issues/9566
>>
>> Thanks,
>> Mike Dawson
>>
>>
>> On 6/3/2015 6:44 PM, Sage Weil wrote:
>> > On Mon, 1 Jun 2015, Gregory Farnum wrote:
>> >> On Mon, Jun 1, 2015 at 6:39 PM, Paul Von-Stamwitz
>> >> <pvonstamw...@us.fujitsu.com> wrote:
>> >>> On Fri, May 29, 2015 at 4:18 PM, Gregory Farnum <g...@gregs42.com> wrote:
>> >>>> On Fri, May 29, 2015 at 2:47 PM, Samuel Just <sj...@redhat.com> wrote:
>> >>>>> Many people have reported that they need to lower the osd recovery
>> >>>>> config options to minimize the impact of recovery on client io. We
>> >>>>> are talking about changing the defaults as follows:
>> >>>>>
>> >>>>> osd_max_backfills to 1 (from 10)
>> >>>>> osd_recovery_max_active to 3 (from 15)
>> >>>>> osd_recovery_op_priority to 1 (from 10)
>> >>>>> osd_recovery_max_single_start to 1 (from 5)
>> >>>>
>> >>>> I'm under the (possibly erroneous) impression that reducing the
>> >>>> number of max backfills doesn't actually reduce recovery speed much
>> >>>> (but will reduce memory use), but that dropping the op priority can.
>> >>>> I'd rather we make users manually adjust values which can have a
>> >>>> material impact on their data safety, even if most of them choose to
>> >>>> do so.
>> >>>>
>> >>>> After all, even under our worst behavior we're still doing a lot
>> >>>> better than a resilvering RAID array. ;) -Greg
>> >>>> --
>> >>>
>> >>>
>> >>> Greg,
>> >>> When we set...
>> >>>
>> >>> osd recovery max active = 1
>> >>> osd max backfills = 1
>> >>>
>> >>> We see rebalance times go down by more than half and client write
>> >>> performance increase significantly while rebalancing. We initially
>> >>> played with these settings to improve client IO expecting recovery
>> >>> time to get worse, but we got a 2-for-1.
>> >>> This was with firefly using replication, downing an entire node with
>> >>> lots of SAS drives. We left osd_recovery_threads,
>> >>> osd_recovery_op_priority, and osd_recovery_max_single_start default.
>> >>>
>> >>> We dropped osd_recovery_max_active and osd_max_backfills together.
>> >>> If you're right, do you think osd_recovery_max_active=1 is the
>> >>> primary reason for the improvement? (Higher osd_max_backfills helps
>> >>> recovery time with erasure coding.)
>> >>
>> >> Well, recovery max active and max backfills are similar in many ways.
>> >> Both are about moving data into a new or outdated copy of the PG; the
>> >> difference is that recovery refers to our log-based recovery (where we
>> >> compare the PG logs and move over the objects which have changed)
>> >> whereas backfill requires us to incrementally move through the entire
>> >> PG's hash space and compare.
>> >> I suspect dropping down max backfills is more important than reducing
>> >> max recovery (gathering recovery metadata happens largely in memory),
>> >> but I don't really know either way.
>> >>
>> >> My comment was meant to convey that I'd prefer we not reduce the
>> >> recovery op priority levels. :)
>> >
>> > We could make a less extreme move than to 1, but IMO we have to reduce
>> > it one way or another. Every major operator I've talked to does this,
>> > our PS folks have been recommending it for years, and I've yet to see
>> > a single complaint about recovery times... meanwhile we're drowning in
>> > a sea of complaints about the impact on clients.
>> >
>> > How about
>> >
>> > osd_max_backfills to 1 (from 10)
>> > osd_recovery_max_active to 3 (from 15)
>> > osd_recovery_op_priority to 3 (from 10)
>> > osd_recovery_max_single_start to 1 (from 5)
>> >
>> > (same as above, but 1/3rd the recovery op prio instead of 1/10th)
>> > ?
>> >
>> > sage
>> > _______________________________________________
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
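For anyone who wants to pin values along the lines Sage proposes above
rather than wait for the defaults to change, the ceph.conf equivalent would
look something like this (the numbers are only the ones under discussion,
not a recommendation):

  [osd]
  osd_max_backfills = 1
  osd_recovery_max_active = 3
  osd_recovery_op_priority = 3
  osd_recovery_max_single_start = 1

Values in ceph.conf only take effect when the OSDs are (re)started; the
injectargs command Mike shows above changes them on a running cluster.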
--
Email: shin...@linux.com
ski...@redhat.com
Life w/ Linux <http://i-shinobu.hatenablog.com/>

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
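As a worked example of the rule of thumb quoted near the top (with pool size
meaning the replica count): a cluster with 60 OSDs and 3x replication gives

  Total PGs = (60 * 100) / 3 = 2000

which would usually be rounded up to the nearest power of two, i.e. pg_num =
2048, and then divided across the pools on the cluster. The point from Huang
and Jan is that this is a starting point rather than an exact answer.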