On 06/04/2014 07:22 PM, Sage Weil wrote:
> On Wed, 4 Jun 2014, Andrey Korolyov wrote:
>> On 06/04/2014 06:06 PM, Sage Weil wrote:
>>> On Wed, 4 Jun 2014, Dan Van Der Ster wrote:
>>>> Hi Sage, all,
>>>>
>>>> On 21 May 2014, at 22:02, Sage Weil <s...@inktank.com> wrote:
>>>>
>>>>> * osd: allow snap trim throttling with simple delay (#6278, Sage Weil)
>>>>
>>>> Do you have some advice about how to use the snap trim throttle? I saw
>>>> osd_snap_trim_sleep, which is still 0 by default. But I didn't manage to
>>>> follow the original ticket, since it started out as a question about
>>>> deep scrub contending with client IOs, but then at some point you
>>>> renamed the ticket to throttling snap trim. What exactly does snap trim
>>>> do in the context of RBD client? And can you suggest a good starting
>>>> point for osd_snap_trim_sleep = ? ?
>>>
>>> This is a coarse hack to make the snap trimming slow down and let client
>>> IO run by simply sleeping between work. I would start with something
>>> smallish (.01 = 10ms) after deleting some snapshots and see what effect it
>>> has on request latency. Unfortunately it's not a very intuitive knob to
>>> adjust, but it is an interim solution until we figure out how to better
>>> prioritize this (and other) background work.
>>>
>>> In short, if you do see a performance degradation after removing snaps,
>>> adjust this up or down and see how it changes that. If you don't see a
>>> degradation, then you're lucky and don't need to do anything. :)
>>>
>>> You can adjust this on running OSDs with something like 'ceph daemon
>>> osd.NN config set osd_snap_trim_sleep .01' or with 'ceph tell osd.*
>>> injectargs -- --osd-snap-trim-sleep .01'.
>>>
>>> sage
>>>
>>
>> Hi,
>>
>> we have had the same mechanism for almost half a year and it has been
>> working nicely, except in cases when multiple background snap deletions
>> are hitting their ends - latencies may spike regardless of a very large
>> sleep gap for snap operations. Do you have any thoughts on reducing this
>> particular impact?
>
> This isn't ringing any bells. If this is something you can reproduce with
> osd logging enabled we should be able to tell what is causing the spike,
> though...
>
> sage
>
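
For reference, the runtime adjustment quoted above can be wrapped in a small shell function. This is only a sketch: osd_config_set and the CEPH variable are hypothetical names introduced here for illustration, and the function does nothing more than call the 'ceph daemon osd.NN config set ...' form from Sage's mail.

```shell
# Sketch only: osd_config_set is a hypothetical helper, not part of Ceph.
# It wraps the 'ceph daemon osd.NN config set <option> <value>' form
# quoted earlier in this thread.
CEPH="${CEPH:-ceph}"   # set CEPH=echo to print the arguments instead of running ceph

osd_config_set() {
    # $1 = OSD id, $2 = option name, $3 = value
    "$CEPH" daemon "osd.$1" config set "$2" "$3"
}
```

For example, 'osd_config_set NN osd_snap_trim_sleep .01' applies the starting value suggested above. Something like 'osd_config_set NN debug_osd 10' would raise OSD logging the same way, assuming debug_osd is the option meant by "osd logging" here.
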
OK, would a logging level of 10 be enough there? At 20, all timings are
most likely to be distorted by the logging operations, even on tmpfs.

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com