On Mon, May 18, 2015 at 9:34 AM, Brian Rak <b...@gameservers.com> wrote:
> We just enabled a small cache pool on one of our clusters (v 0.94.1) and
> have run into some issues:
>
> 1) Cache population appears to happen via the public network (not the
> cluster network).  We're seeing basically no traffic on the cluster network,
> and multiple gigabits inbound to our cache OSDs. Normal rebuild/recovery
> happens via the cluster network, so I don't believe this is just a
> configuration issue.

Yep.

>
> 2) Similar to #1, I was expecting to see cache traffic show up as repair
> traffic in 'ceph status'.  Instead, it seems to appear as client traffic.

Correct, it's client traffic.

>
> 3) We're using a readonly pool (we only really write to our pools once).  I
> noticed that if all the OSDs hosting the cache pool go down, all reads stop
> until they're restored.  I would have expected that reads would fall back to
> the backing pool if the cache pool is unavailable.  Is this how it's
> supposed to work?

Well, you've got a nice thought, but this is expected behavior.

> Any thoughts on these?  Are my expectations just wrong here?  The
> documentation is fairly sparse, so I'm not quite sure what to expect.

Yeah...
In order to maintain a whole bunch of guarantees and cluster safety
stuff, cache tiers are currently served by having the cache tier act
as a client to the base tier. Right now that means the traffic needs
to flow over the client (public) network; I'm not sure how tough it
would be to send it over the cluster network instead, although I can
see your interest in that.

Likewise, since it's client traffic, that's how it's reported in the
ceph central log summaries (ceph -s reports). I don't know that
marking it as repair traffic would make much sense either.
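
To make the accounting concrete, here is a rough toy model in Python. It is
not Ceph code: the class and method names (ToyCluster, promote_into_cache,
and so on) are made up for illustration, and the only thing it encodes is the
assumption described above, that a cache promotion is an ordinary client read
issued by the cache tier against the base tier.

```python
from collections import Counter


class ToyCluster:
    """Toy accounting model (hypothetical, not Ceph internals).

    Tallies bytes by the network they cross and by the traffic class
    they would be reported under.
    """

    def __init__(self):
        self.bytes_by_network = Counter()
        self.bytes_by_class = Counter()

    def client_io(self, nbytes):
        # Client operations travel over the public network and are
        # counted as client I/O.
        self.bytes_by_network["public"] += nbytes
        self.bytes_by_class["client"] += nbytes

    def recovery_io(self, nbytes):
        # Rebuild/recovery between OSDs uses the cluster network.
        self.bytes_by_network["cluster"] += nbytes
        self.bytes_by_class["recovery"] += nbytes

    def promote_into_cache(self, nbytes):
        # Assumption for this sketch: the cache tier acts as a client
        # of the base tier, so a promotion is just client I/O.
        self.client_io(nbytes)


cluster = ToyCluster()
cluster.promote_into_cache(4 * 1024 * 1024)  # one 4 MiB object
print(dict(cluster.bytes_by_network))
print(dict(cluster.bytes_by_class))
```

In this model the promotion never touches the "cluster" or "recovery"
counters, which matches what you are seeing on the wire and in 'ceph status'.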

And finally, caches are definitely expected to be durable, even when
readonly. Setting things up to bypass the caches would be...hard. Not
unworkable and not an unreasonable thing to have them do, but hard. I
think you're the first person I know of who's using the read-only
tiering, and for all the use cases we've discussed in the past it
would not have been appropriate to skip to the base pool if the cache
was inaccessible.
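
As a sketch of the read path as described here (again a toy model with
made-up names like ToyPool and ToyReadOnlyTier, not how the OSD code is
actually structured): reads are directed at the cache pool, a miss is filled
by the cache reading from the base pool, and there is no bypass when the
cache is down.

```python
class PoolUnavailable(Exception):
    """Raised when a read is sent to a pool whose OSDs are all down."""


class ToyPool:
    # Hypothetical stand-in for a pool: a dict of objects plus an up flag.
    def __init__(self, name, objects=None):
        self.name = name
        self.objects = dict(objects or {})
        self.up = True

    def read(self, key):
        if not self.up:
            raise PoolUnavailable(self.name)
        return self.objects[key]


class ToyReadOnlyTier:
    # Hypothetical model of a read-only cache pool layered over a base pool.
    def __init__(self, cache, base):
        self.cache = cache
        self.base = base

    def read(self, key):
        # Every read goes to the cache pool first. There is no
        # fallback to the base pool if the cache is unavailable.
        if not self.cache.up:
            raise PoolUnavailable(self.cache.name)
        if key not in self.cache.objects:
            # Cache miss: the cache tier reads from the base tier as a client.
            self.cache.objects[key] = self.base.read(key)
        return self.cache.read(key)


base = ToyPool("base", {"obj1": b"payload"})
cache = ToyPool("cache")
tier = ToyReadOnlyTier(cache, base)

first_read = tier.read("obj1")   # miss, filled from the base pool
cache.up = False                 # all cache OSDs go down
try:
    tier.read("obj1")
    blocked = False
except PoolUnavailable:
    blocked = True               # the base pool still holds obj1, but reads stop
```

A real client would block until the cache OSDs come back rather than get an
exception; the exception is only a convenient way to show it in a toy.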

All that said, tracker feature requests and especially patches are welcome! :)
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
