I've been noticing something strange with my RGW federation. I added
some statistics to radosgw-agent to try and get some insight
(https://github.com/ceph/radosgw-agent/pull/7), but that just showed me
that I don't understand how replication works.
When PUT traffic was relatively slow to the master zone, replication had
no issues keeping up. Now I'm trying to cause replication to fall
behind, by deliberately exceeding the amount of bandwidth between the
two zones (they're in different datacenters). Instead of falling
behind, both the radosgw-agent logs and the stats I added say that the
slave zone is keeping up.
But some of the numbers don't add up. The bandwidth I'm using between
the two facilities is lower than the PUT traffic would require, and the
slave zone is using less disk space than it should. The disk usage in
the slave zone continues to fall further and further behind the master.
Despite this, I'm always able to download objects from both zones.
How does radosgw-agent actually replicate metadata and data? Does
radosgw-agent actually copy all the bytes, or does it create
placeholders in the slave zone? If radosgw-agent is creating
placeholders in the slave zone, and radosgw populates the placeholder in
the background, then that would explain the behavior I'm seeing. If
this is true, how can I tell if replication is keeping up?
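One check I could run independently of the agent's logs is to list the same bucket in both zones and diff the results. Below is a minimal sketch of that comparison. It assumes the listings have already been fetched with any S3 client into plain dicts of key to size; the function name, the dict layout, and the sample keys are all hypothetical. It also only helps if the sizes a zone lists reflect what is actually stored there, which is exactly what I don't know.

```python
def replication_lag(master, slave):
    """Compare two listings of the same bucket.

    master, slave: dict mapping object key -> size in bytes
    (hypothetical layout; fill these from whatever S3 client you use).
    """
    # Keys present in the master zone but absent from the slave zone.
    missing = sorted(k for k in master if k not in slave)
    # Keys present in both zones whose listed sizes disagree.
    size_mismatch = sorted(
        k for k in master if k in slave and master[k] != slave[k]
    )
    # Bytes the slave still lacks: whole missing objects plus any shortfall.
    bytes_behind = sum(master[k] for k in missing) + sum(
        max(master[k] - slave[k], 0) for k in size_mismatch
    )
    return {
        "missing": missing,
        "size_mismatch": size_mismatch,
        "bytes_behind": bytes_behind,
    }


# Made-up listings, only to show the shape of the report.
master = {"a.bin": 100, "b.bin": 200, "c.bin": 300}
slave = {"a.bin": 100, "b.bin": 0}
report = replication_lag(master, slave)
print(report)
```

If the slave zone lists full sizes for objects it has not actually copied yet, this report would come back clean even while disk usage lags, which would itself point toward placeholders.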
--
*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com <mailto:cle...@centraldesktop.com>