On Mon, May 18, 2015 at 9:34 AM, Brian Rak <b...@gameservers.com> wrote:
> We just enabled a small cache pool on one of our clusters (v 0.94.1) and
> have run into some issues:
>
> 1) Cache population appears to happen via the public network (not the
> cluster network). We're seeing basically no traffic on the cluster network,
> and multiple gigabits inbound to our cache OSDs. Normal rebuild/recovery
> happens via the cluster network, so I don't believe this is just a
> configuration issue.

Yep.

> 2) Similar to #1, I was expecting to see cache traffic show up as repair
> traffic in 'ceph status'. Instead, it seems to appear as client traffic.

Correct, it's client traffic.

> 3) We're using a readonly pool (we only really write to our pools once). I
> noticed that if all the OSDs hosting the cache pool go down, all reads stop
> until they're restored. I would have expected that reads would fall back to
> the backing pool if the cache pool is unavailable. Is this how it's
> supposed to work?

Well, you've got a nice thought, but this is expected behavior.

> Any thoughts on these? Are my expectations just wrong here? The
> documentation is fairly sparse, so I'm not quite sure what to expect.

Yeah... In order to maintain a whole bunch of guarantees and cluster
safety stuff, cache tiers are currently served by having the cache
tier act as a client to the base tier. Right now that means the
traffic needs to flow over the client (public) network; I'm not sure
how tough it would be to send it over the cluster network instead,
although I can see your interest in that.

Likewise, since it's client traffic, that's how it's reported in the
ceph central log summaries (ceph -s reports). I don't know that
marking it as repair traffic would make much sense either.

And finally, caches are definitely expected to be durable, even when
readonly. Setting things up to bypass the caches would be...hard. Not
unworkable and not an unreasonable thing to have them do, but hard. I
think you're the first person I know of who's using the read-only
tiering, and for all the use cases we've discussed in the past it
would not have been appropriate to skip to the base pool if the cache
was inaccessible.

All that said, tracker feature requests and especially patches are welcome! :)
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
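
For readers of the archive, the behavior described in the reply can be sketched as a toy model. This is not Ceph code: the class names, the `up` flag, and the `base_reads` counter are all hypothetical, chosen only for illustration. The model assumes the cache tier fetches missing objects by reading from the base pool as an ordinary client, and that nothing falls back to the base pool when the cache tier is down. Real Ceph blocks the read until the cache OSDs return; the sketch raises an exception instead so the effect is visible.

```python
class CacheTierUnavailable(Exception):
    """Raised by this toy model when the cache tier is down."""


class BasePool:
    """Hypothetical stand-in for the backing (base) pool."""

    def __init__(self, objects):
        self.objects = dict(objects)

    def read(self, name):
        return self.objects[name]


class ReadOnlyCacheTier:
    """Hypothetical read-only cache tier sitting in front of a base pool."""

    def __init__(self, base):
        self.base = base
        self.cached = {}
        self.up = True        # assumption: one flag for "all cache OSDs up"
        self.base_reads = 0   # reads issued to the base pool as a client

    def read(self, name):
        # Every client read goes through the cache tier; there is no
        # bypass path to the base pool when the tier is unavailable.
        if not self.up:
            raise CacheTierUnavailable(name)
        if name not in self.cached:
            # Cache population: the tier reads from the base pool the way
            # any client would, which is why it counts as client traffic.
            self.base_reads += 1
            self.cached[name] = self.base.read(name)
        return self.cached[name]


if __name__ == "__main__":
    tier = ReadOnlyCacheTier(BasePool({"obj1": b"payload"}))
    tier.read("obj1")   # miss: fetched from the base pool, then cached
    tier.read("obj1")   # hit: served from the cache
    tier.up = False
    try:
        tier.read("obj1")
    except CacheTierUnavailable:
        print("read not served while the cache tier is down")
```

In this model the first read of an object is the only one that touches the base pool, and the object stays in the base pool the whole time yet cannot be read once `up` is false, which matches points 1 to 3 above.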