Thanks. Wouldn't it be amazing to put a 2TB NVMe card in each compute node, make one config change and presto! Users would see a 10-fold increase in performance :) with 95% of reads served from cache and every write acknowledged as soon as it lands on cache. For writes you might want dual NVMe in a RAID 1 so you're fully covered.
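Roughly what I am picturing, as an untested sketch (the device names /dev/nvme0n1, /dev/nvme1n1 and /dev/rbd0 and the vg_guest/origin naming are placeholders, and it assumes the rbd image is already mapped on the compute node):

  # Mirror the two NVMe cards so losing one card cannot lose writes that were
  # acknowledged from cache but not yet destaged to the cluster.
  mdadm --create /dev/md/nvmecache --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1

  # Put the mapped rbd image and the NVMe mirror into one volume group.
  pvcreate /dev/rbd0 /dev/md/nvmecache
  vgcreate vg_guest /dev/rbd0 /dev/md/nvmecache

  # Origin LV on the rbd image, cache pool on the NVMe mirror (leave a little
  # headroom on the NVMe side for the cache-pool metadata).
  lvcreate -n origin -l 100%PVS vg_guest /dev/rbd0
  lvcreate --type cache-pool -n nvmepool -l 95%PVS vg_guest /dev/md/nvmecache

  # Attach the pool in writeback mode so writes are acknowledged from local flash.
  lvconvert --type cache --cachepool vg_guest/nvmepool --cachemode writeback vg_guest/origin

The guest would then be pointed at /dev/vg_guest/origin instead of the raw rbd device.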
-----Original Message-----
From: Ric Wheeler [mailto:rwhee...@redhat.com]
Sent: 27 March 2016 09:27
To: Daniel Niasoff <dan...@redactus.co.uk>; Van Leeuwen, Robert <rovanleeu...@ebay.com>; Jason Dillaman <dilla...@redhat.com>
Cc: ceph-users@lists.ceph.com; Mike Snitzer <snit...@redhat.com>; Joe Thornber <thorn...@redhat.com>
Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.

On 03/27/2016 11:13 AM, Daniel Niasoff wrote:
> Hi Ric,
>
> But you would still have to set up a dm-cache per rbd volume, which makes it difficult to manage.
>
> There needs to be a global setting, either within kvm or ceph, that caches reads/writes before they hit the rbd device.
>
> Thanks
>
> Daniel

Correct, it is per block device - effectively it is a layer on top of the rbd device if you want to set up a caching layer like this. As you mention, you can cache at other layers of the system as well.

How difficult that is to manage and assemble depends on tooling. I don't see doing it in kvm as really any easier than doing it under kvm, but I am a big believer in the need for much better tools to help manage things like this so that users don't see the complexity.

Ric

>
> -----Original Message-----
> From: Ric Wheeler [mailto:rwhee...@redhat.com]
> Sent: 27 March 2016 09:00
> To: Van Leeuwen, Robert <rovanleeu...@ebay.com>; Daniel Niasoff <dan...@redactus.co.uk>; Jason Dillaman <dilla...@redhat.com>
> Cc: ceph-users@lists.ceph.com; Mike Snitzer <snit...@redhat.com>; Joe Thornber <thorn...@redhat.com>
> Subject: Re: [ceph-users] Local SSD cache for ceph on each compute node.
>
> On 03/16/2016 12:15 PM, Van Leeuwen, Robert wrote:
>>> My understanding of how a writeback cache should work is that it should only take a few seconds for writes to be streamed onto the network, and that it is focused on resolving the speed issue of small sync writes. The writes would be bundled into larger writes that are not time sensitive.
>>>
>>> So there is potential for a few seconds of data loss, but compared to the current trend of using ephemeral storage to solve this issue, it's a major improvement.
>> I think it is a bit worse than just a few seconds of data:
>> As mentioned in the blueprint for ceph, you would need some kind of ordered write-back cache that maintains checkpoints internally.
>>
>> I am not that familiar with the internals of dm-cache, but I do not think it guarantees any write order.
>> E.g. by default it will bypass the cache for sequential IO.
>>
>> So I think it is very likely that the "few seconds of data loss" in this case means the filesystem is corrupt and you could lose the whole thing.
>> At the very least you will need to run fsck on it and hope it can sort out all of the errors with minimal data loss.
>>
>> So, for me, it seems contradictory to use persistent storage and then hope your volumes survive a power outage.
>>
>> If you can survive missing that data, you are probably better off running fully from ephemeral storage in the first place.
>>
>> Cheers,
>> Robert van Leeuwen
> Hi Robert,
>
> I might be misunderstanding your point above, but dm-cache provides persistent storage. It will be there when you reboot and look for the data on that same box.
> dm-cache is also power failure safe and tested to survive this kind of outage.
>
> If you try to look at the rbd device under dm-cache from another host, of course any data that was cached in the dm-cache layer will be missing, since the dm-cache device itself is local to the host you wrote the data from originally.
>
> In a similar way, using dm-cache for write caching (or any write cache local to a client) also means that your data has a single point of failure, since that data is not replicated out to the backing store until it is destaged from the cache.
>
> I would note that this is exactly the kind of write cache that is popular these days in front of enterprise storage arrays on clients, so this is not really uncommon.
>
> Regards,
>
> Ric
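To make the destaging point above concrete, and assuming the lvmcache layout sketched at the top of this mail (vg_guest/origin is a made-up name), the dirty blocks have to be flushed back to the cluster before the rbd image can safely be looked at from another host. The sequential-IO bypass Robert mentions is a tunable of the older mq policy (sequential_threshold), adjustable via lvchange --cachesettings; the newer smq policy tunes itself.

  # Inspect the assembled cache device; for a dm-cache target this reports,
  # among other things, the policy in use and the number of dirty blocks.
  dmsetup status vg_guest-origin

  # Before unmapping the rbd image or migrating the guest, destage the cache:
  # splitting off the cache pool flushes dirty blocks to the rbd-backed origin.
  lvconvert --splitcache vg_guest/origin

  # Alternatively, switch to the cleaner policy and wait for the dirty block
  # count reported by dmsetup status to drop to zero before moving the volume.
  lvchange --cachepolicy cleaner vg_guest/origin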