On 06/04/2014 07:22 PM, Sage Weil wrote:
> On Wed, 4 Jun 2014, Andrey Korolyov wrote:
>> On 06/04/2014 06:06 PM, Sage Weil wrote:
>>> On Wed, 4 Jun 2014, Dan Van Der Ster wrote:
>>>> Hi Sage, all,
>>>>
>>>> On 21 May 2014, at 22:02, Sage Weil <s...@inktank.com> wrote:
>>>>
>>>>> * osd: allow snap trim throttling with simple delay (#6278, Sage Weil)
>>>>
>>>> Do you have some advice about how to use the snap trim throttle? I saw 
>>>> osd_snap_trim_sleep, which is still 0 by default. But I didn't manage to 
>>>> follow the original ticket, since it started out as a question about 
>>>> deep scrub contending with client IOs, but then at some point you 
>>>> renamed the ticket to throttling snap trim. What exactly does snap trim 
>>>> do in the context of RBD client? And can you suggest a good starting 
>>>> point for osd_snap_trim_sleep = ? ?
>>>
>>> This is a coarse hack to make the snap trimming slow down and let client 
>>> IO run by simply sleeping between work.  I would start with something 
>>> smallish (.01 = 10ms) after deleting some snapshots and see what effect it 
>>> has on request latency.  Unfortunately it's not a very intuitive knob to 
>>> adjust, but it is an interim solution until we figure out how to better 
>>> prioritize this (and other) background work.
>>>
>>> In short, if you do see a performance degradation after removing snaps, 
>>> adjust this up or down and see how it changes that.  If you don't see a 
>>> degradation, then you're lucky and don't need to do anything.  :)
>>>
>>> You can adjust this on running OSDs with something like 'ceph daemon 
>>> osd.NN config set osd_snap_trim_sleep .01' or with 'ceph tell osd.* 
>>> injectargs -- --osd-snap-trim-sleep .01'.
>>>
>>> sage
>>>
>>
>> Hi,
>>
>> we have had the same mechanism for almost half a year and it works nicely,
>> except when multiple background snap deletions reach their ends at the
>> same time: latencies may spike regardless of how large the sleep gap for
>> snap operations is. Do you have any thoughts on reducing this particular impact?
> 
> This isn't ringing any bells.  If this is something you can reproduce with 
> osd logging enabled we should be able to tell what is causing the spike, 
> though...
> 
> sage
> 

Ok, would a log level of 10 be enough there? At 20, all timings are most
likely to be distorted by the logging operations, even on tmpfs.
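
For reference, a minimal shell sketch of the two runtime adjustments discussed
in this thread: setting the snap trim sleep and raising the OSD log level
before reproducing the spike. The injectargs form for osd-snap-trim-sleep is
the one quoted above; using --debug-osd for "osd logging" is an assumption, and
the helper names (run, set_snap_trim_sleep, set_osd_debug) and the DRY_RUN
variable are hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the ceph commands unless DRY_RUN=0 is set.
# Helper names and DRY_RUN are hypothetical, not part of ceph.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Apply a snap trim sleep (in seconds) to all running OSDs.
set_snap_trim_sleep() {
    run ceph tell 'osd.*' injectargs -- --osd-snap-trim-sleep "$1"
}

# Set the OSD debug log level on all running OSDs
# (assumption: debug-osd is the logging subsystem of interest here).
set_osd_debug() {
    run ceph tell 'osd.*' injectargs -- --debug-osd "$1"
}

# Example: start with a 10ms sleep and moderate logging.
set_snap_trim_sleep .01
set_osd_debug 10
```

Lowering the log level again after the test (for example set_osd_debug 0)
keeps the logging overhead from affecting normal operation.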
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
