Re: [ceph-users] Ceph Blog Articles

2016-12-06 Thread Sascha Vogt
Hi Nick, m( of course, you're right. Yes, we have rbd_cache enabled for KVM / QEMU. That probably also explains the large diff between avg and stdev. Thanks for the pointer. Unfortunately I have not yet gotten fio to work with the rbd engine. It always fails with > rbd engine: RBD version: 0.1.9 >
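A quick way to double-check what rbd_cache a running librbd client actually uses is its admin socket, assuming one is configured for the QEMU client in ceph.conf; the socket path below is only a placeholder:

    # [client] section of ceph.conf (assumption: admin sockets enabled for librbd clients)
    #   admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok config get rbd_cache
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok config get rbd_cache_writethrough_until_flush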

Re: [ceph-users] Ceph Blog Articles

2016-12-06 Thread Sascha Vogt
time_based=1 > runtime=360 > numjobs=1 > > [rbd_iodepth1] > iodepth=1
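For reference, a complete fio job file along the lines of the snippet above might look like this; pool and image name are placeholders, the image has to exist beforehand (e.g. created with rbd create), and fio needs to be built with rbd support for the engine to be available:

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=fio_test
    rw=randwrite
    bs=4k
    time_based=1
    runtime=360
    numjobs=1

    [rbd_iodepth1]
    iodepth=1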

Re: [ceph-users] Ceph Blog Articles

2016-12-05 Thread Sascha Vogt
Hi Nick, thanks for sharing your results. Would you be able to share the fio args you used for benchmarking (especially the ones for the screenshot you shared in the write latency post)? What I found is that when I do some 4k write benchmarks my lat stdev is much higher than the average (also wid

Re: [ceph-users] Blog post about Ceph cache tiers - feedback welcome

2016-10-04 Thread Sascha Vogt
Hi Lindsay, on 03.10.2016 at 23:57 Lindsay Mathieson wrote: > Thanks, that clarified things a lot - much easier to follow than the > official docs :) Thank you for the kind words, very much appreciated! > Do cache tiers help with writes as well? Basically there are two cache modes (you specify i
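As an illustration only (pool names are made up), a cache tier and its mode are attached to a backing pool roughly like this:

    ceph osd tier add hdd-pool nvme-cache
    ceph osd tier cache-mode nvme-cache writeback   # the other mode being readonly
    ceph osd tier set-overlay hdd-pool nvme-cache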

Re: [ceph-users] Blog post about Ceph cache tiers - feedback welcome

2016-10-03 Thread Sascha Vogt
Hi Nick, On 02/10/16 22:56, Nick Fisk wrote: [...] osd_agent_max_high_ops osd_agent_max_ops They control how many concurrent flushes happen at the high/low thresholds. I.e. you can set the low one to 1 to minimise the impact on client IO. Also the target_max_bytes is calculated on a per PG bas
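A sketch of turning those knobs at runtime; the values are only examples taken from this discussion, not recommendations:

    # lower the flush concurrency at the low threshold to minimise impact on client IO
    ceph tell osd.* injectargs '--osd_agent_max_ops 1'
    # and, if needed, the one used above the high threshold
    ceph tell osd.* injectargs '--osd_agent_max_high_ops 2'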

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup | writeup

2016-10-02 Thread Sascha Vogt
Hi all, just a quick writeup. Over the last two days I was able to evict a lot of those 0-byte files by setting "target_max_objects" to 2 million. After we hit that limit I set it to 10 million for now. So a target_dirty_ratio of 0.6 would mean evicting should start at around 6 million objec
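For the record, these are plain pool options; with a hypothetical cache pool called "cache" (and assuming the pool-level name of the dirty ratio is cache_target_dirty_ratio) that would be roughly:

    ceph osd pool set cache target_max_objects 10000000
    ceph osd pool set cache cache_target_dirty_ratio 0.6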

[ceph-users] Blog post about Ceph cache tiers - feedback welcome

2016-10-02 Thread Sascha Vogt
Hi all, as it took quite a while until we got our Ceph cache working (and we're still hit by some unexpected things, see the thread "Ceph with cache pool - disk usage / cleanup"), I thought it might be good to write a summary of what I believe to know up to this point. Any feedback, especia

Re: [ceph-users] New Cluster OSD Issues

2016-10-02 Thread Sascha Vogt
Hi Pankaj, On 30/09/16 17:31, Garg, Pankaj wrote: I just created a new cluster with 0.94.8 and I’m getting this message: 2016-09-29 21:36:47.065642 mon.0 [INF] disallowing boot of OSD osd.35 10.22.21.49:6844/9544 because the osdmap requires CEPH_FEATURE_SERVER_JEWEL but the osd lacks CEPH_FEATU
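As a hedged starting point for this kind of feature mismatch, it can help to compare what the map currently demands with what the OSD in question actually runs, e.g.:

    ceph osd crush show-tunables
    ceph osd dump | grep flags
    # on the OSD host, if the daemon process is up:
    ceph daemon osd.35 version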

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-30 Thread Sascha Vogt
Hi, on 30.09.2016 at 09:45 Christian Balzer wrote: > [...] > Gotta love having (only a few years late) a test and staging cluster that > is actually usable and comparable to my real ones. > > So I did create a 500GB image and filled it up. > The cache pool is set to 500GB as well and will flu

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-30 Thread Sascha Vogt
On 30.09.2016 at 05:18 Christian Balzer wrote: > On Thu, 29 Sep 2016 20:15:12 +0200 Sascha Vogt wrote: >> On 29/09/16 15:08, Burkhard Linke wrote: >>> AFAIK evicting an object also flushes it to the backing storage, so >>> evicting a live object should be ok. It wi

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Sascha Vogt
Hi Burkhard, On 29/09/16 15:08, Burkhard Linke wrote: AFAIK evicting an object also flushes it to the backing storage, so evicting a live object should be ok. It will be promoted again at the next access (or whatever triggers promotion in the caching mechanism). For the dead 0-byte files: Shou
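For completeness, flushing and evicting are exposed through the rados tool; pool and object names below are made up:

    # flush and evict everything that can go
    rados -p cache cache-flush-evict-all
    # or per object
    rados -p cache cache-flush rbd_data.123456789abc.0000000000000000
    rados -p cache cache-evict rbd_data.123456789abc.0000000000000000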

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Sascha Vogt
Hi, on 29.09.2016 at 13:45 Burkhard Linke wrote: > On 09/29/2016 01:34 PM, Sascha Vogt wrote: >> We have a huge amount of short-lived VMs which are deleted before they >> are even flushed to the backing pool. Might this be the reason, that >> ceph doesn't handle that

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Sascha Vogt
Hi, on 29.09.2016 at 14:00 Burkhard Linke wrote: > On 09/29/2016 01:46 PM, Sascha Vogt wrote: >>>> Can you check/verify that the deleted objects are actually gone on the >>>> backing pool? >> How do I check that? Aka how to find out on which OSD a particular

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Sascha Vogt
A quick follow-up question: on 29.09.2016 at 13:34 Sascha Vogt wrote: >> Can you check/verify that the deleted objects are actually gone on the >> backing pool? How do I check that? Aka how to find out on which OSD a particular object in the cache pool ends up in the backing pool?
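In case it helps later readers: ceph osd map answers exactly that; pool and object names below are placeholders:

    # where does the object live in the cache pool?
    ceph osd map cache rbd_data.123456789abc.0000000000000000
    # and where would/does it live in the backing pool?
    ceph osd map hdd-pool rbd_data.123456789abc.0000000000000000

The output names the PG and the up/acting OSD set, which is where to look on disk.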

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-29 Thread Sascha Vogt
Hi, on 29.09.2016 at 02:44 Christian Balzer wrote: > I don't think the LOG is keeping the 0-byte files alive, though. Yeah, don't think so either. The difference did stay at around the same level. > In general these are objects that have been evicted from the cache and if > it's very busy you w

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Sascha Vogt
Hi Christian, on 28.09.2016 at 16:56 Christian Balzer wrote: > 0.94.5 has a well known and documented bug, it doesn't rotate the omap log > of the OSDs. > > Look into "/var/lib/ceph/osd/ceph-xx/current/omap/" of the cache tier and > most likely discover a huge "LOG" file. You're right, it was
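A quick way to see whether that LOG growth is what is eating the space, using the path from above:

    du -sh /var/lib/ceph/osd/ceph-*/current/omap/
    ls -lh /var/lib/ceph/osd/ceph-*/current/omap/LOG*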

Re: [ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Sascha Vogt
Hi Burkhard, thanks a lot for the quick response. On 28.09.2016 at 14:15 Burkhard Linke wrote: > someone correct me if I'm wrong, but removing objects in a cache tier > setup results in empty objects which act as markers for deleting the > object on the backing store. I've seen the same patter
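A crude way to spot such 0-byte objects, assuming the cache pool is called "cache" and that rados stat prints the size as its last field (this is slow for pools with millions of objects):

    rados -p cache ls | while read -r obj; do
        size=$(rados -p cache stat "$obj" | awk '{print $NF}')
        [ "$size" = "0" ] && echo "$obj"
    done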

[ceph-users] Ceph with Cache pool - disk usage / cleanup

2016-09-28 Thread Sascha Vogt
Hi all, we currently experience a few "strange" things on our Ceph cluster and I wanted to ask if anyone has recommendations for further tracking them down (or maybe even an explanation already ;) ) Ceph version is 0.94.5 and we have an HDD-based pool with a cache pool on NVMe SSDs in front of it.
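To put numbers next to the "strange" usage, the pool-level and OSD-level views can be compared roughly like this:

    # what the cluster thinks is used per pool
    ceph df detail
    rados df
    # versus what the OSD filesystems report on a cache-tier host
    df -h /var/lib/ceph/osd/ceph-*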

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Sascha Vogt
Hi, on 04.02.2016 at 12:59 Wade Holler wrote: > You referenced parallel writes for journal and data, which is the default > for btrfs but not XFS. Now you are mentioning multiple parallel writes > to the drive, which of course yes will occur. Ah, that is good to know. So if I want to create more "p

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Sascha Vogt
Hi Robert, on 04.02.2016 at 00:45 Robert LeBlanc wrote: > Once we put in our cache tier the I/O on the spindles was so low, we > just moved the journals off the SSDs onto the spindles and left the > SSD space for cache. There has been testing showing that better > performance can be achieved by

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-04 Thread Sascha Vogt
On 03.02.2016 at 17:24 Wade Holler wrote: > AFAIK when using XFS, parallel write as you described is not enabled. Not sure I'm getting this. If I have multiple OSDs on the same NVMe (separated by different data partitions) I have multiple parallel writes (one "stream" per OSD), or am I mistaken?
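A rough sketch of what "multiple OSDs on one NVMe, separated by partitions" could look like with the hammer-era tooling; device name, sizes and partition numbers are made up, and journal placement would still have to be chosen explicitly (ceph-disk takes a journal device as an optional second argument):

    # GPT assumed; carve two data partitions out of the spare NVMe space
    parted -s /dev/nvme0n1 mkpart osd-data-1 400GiB 700GiB
    parted -s /dev/nvme0n1 mkpart osd-data-2 700GiB 1000GiB
    # one OSD per partition
    ceph-disk prepare /dev/nvme0n1p6
    ceph-disk activate /dev/nvme0n1p6
    ceph-disk prepare /dev/nvme0n1p7
    ceph-disk activate /dev/nvme0n1p7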

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Sascha Vogt
Hi Wade, on 03.02.2016 at 13:26 Wade Holler wrote: > What is your file system type, XFS or Btrfs? We're using XFS, though for the new cache tier we could also switch to btrfs if that suggests a significant performance improvement... Greetings -Sascha-

[ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Sascha Vogt
Hi all, we recently tried adding a cache tier to our ceph cluster. We had 5 spinning disks per hosts with a single journal NVMe disk, hosting the 5 journals (1 OSD per spinning disk). We have 4 hosts up to now, so overall 4 NVMes hosting 20 journals for 20 spinning disks. As we had some space lef