On Mon, May 16, 2016 at 11:58 AM, Nick Fisk <n...@fisk.me.uk> wrote:

> > -----Original Message-----
> > From: Peter Kerdisle [mailto:peter.kerdi...@gmail.com]
> > Sent: 16 May 2016 10:39
> > To: n...@fisk.me.uk
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Erasure pool performance expectations
> >
> > I'm forcing a flush by lowering cache_target_dirty_ratio to a lower
> > value. This forces writes to the EC pool, and these are the operations
> > I'm trying to throttle a bit. Am I understanding you correctly that the
> > throttling only works the other way around, i.e. when promoting cold
> > objects into the hot cache?
>
> Yes, that's correct. You want to throttle the flushes, which is done by
> another setting(s).
>
> Firstly, set something like this in your ceph.conf:
>
> osd_agent_max_low_ops = 1
> osd_agent_max_ops = 4

I did not know about this, that's great. I will play around with these.
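For anyone searching the archives later, my understanding is that these are OSD-level options, so they would go in the [osd] section of ceph.conf on the OSD nodes. A sketch using Nick's starting values (the section placement and the comments are my interpretation, so please verify against the Ceph config reference):

```ini
[osd]
# Cache-tiering agent throttles (Nick's suggested starting points).
# Max parallel flush ops while between the low and high dirty ratios:
osd_agent_max_low_ops = 1
# Max parallel flush ops once the high dirty ratio has been crossed:
osd_agent_max_ops = 4
```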
> This controls how many parallel threads the tiering agent will use. You
> can bump them up later if needed.
>
> Next, set these two settings on your cache pools. Try to keep them about
> .2 apart, so something like .4 and .6 is good to start with:
>
> cache_target_dirty_ratio
> cache_target_dirty_high_ratio

Here is actually the heart of the matter. Ideally I would love to run it
at 0.0, if that makes sense: I want no dirty objects in my hot cache at
all. Has anybody ever tried this? Right now I'm just pushing
cache_target_dirty_ratio down during low-activity moments by setting it
to 0.2, then bringing it back up to 0.6 when it's done or activity starts
up again.

> And let me know if that helps.
>
> > The measurement is a problem for me at the moment. I'm trying to get
> > the perf dumps into collectd/graphite, but it seems I need to hand-roll
> > a solution since the plugins I found are not working anymore. What I'm
> > doing now is just summing the bandwidth statistics from my nodes to get
> > an approximated number. I hope to make some time this week to write a
> > collectd plugin to fetch the actual stats from the perf dumps.
>
> I've used Diamond to collect the stats and it worked really well. I can
> share my graphite query to sum the promote/flush rates as well if it
> helps?

I will check out Diamond. Are you using this collector specifically?
https://github.com/BrightcoveOS/Diamond/wiki/collectors-CephCollector

It would be great if you could share your graphite queries :)

> > I confirmed the settings are indeed correctly picked up across the
> > nodes in the cluster.
>
> Good, glad we got that sorted.
>
> > I tried switching my pool to readforward since for my needs the EC
> > pool is fast enough for reads, but I got scared when I got the warning
> > about data corruption. How safe is readforward really at this point?
> > I noticed the option was removed from the latest docs while still
> > living on the Google-cached version:
> > http://webcache.googleusercontent.com/search?q=cache:http://docs.ceph.com/docs/master/rados/operations/cache-tiering/
>
> Not too sure about the safety, but I'm of the view that those extra
> modes probably aren't needed. I'm pretty sure the same effect can be
> achieved via the recency settings (someone correct me, please). The
> higher the recency settings, the less likely an object is to be chosen
> for promotion into the cache tier. If you set the min_recency for reads
> to be higher than the max hit_set count, then in theory no reads will
> ever cause an object to be promoted.

You are right, your earlier help made me do exactly that and things have
been working better since. Thanks!
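For what it's worth, on the measurement side, the hand-rolled summing approach I mentioned might look roughly like this in Python. The counter names (tier_promote, tier_flush, tier_evict) and the sample dumps below are my guesses at the shape of the data, so verify the real key names against `ceph daemon osd.<id> perf dump` output on your own nodes before relying on it:

```python
import json

# Hypothetical perf dump fragments; real input would come from running
# `ceph daemon osd.<id> perf dump` on each OSD node and parsing the JSON.
# The "osd" section and counter names here are assumptions -- check them
# against your own cluster's output.
SAMPLE_DUMPS = [
    {"osd": {"tier_promote": 120, "tier_flush": 340, "tier_evict": 55}},
    {"osd": {"tier_promote": 80,  "tier_flush": 210, "tier_evict": 30}},
]

def sum_tier_counters(dumps, keys=("tier_promote", "tier_flush", "tier_evict")):
    """Sum the chosen tiering counters across a list of OSD perf dumps."""
    totals = {k: 0 for k in keys}
    for dump in dumps:
        osd_stats = dump.get("osd", {})
        for k in keys:
            totals[k] += osd_stats.get(k, 0)
    return totals

if __name__ == "__main__":
    # Print cluster-wide totals as JSON, ready to push to graphite/collectd.
    print(json.dumps(sum_tier_counters(SAMPLE_DUMPS)))
```

To get rates rather than raw counters you would sample this periodically and take deltas, which is essentially what a Diamond collector would do for you.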
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com