The proxies and the storage nodes all have a copy of the ring structure(s), e.g.:

$ ls -l /etc/swift/*.ring.gz
-rw-r--r-- 1 root  nagios 1316 Apr 20 00:31 account.ring.gz
-rw-r--r-- 1 root  nagios 1299 Apr 20 00:31 container.ring.gz
-rw-r--r-- 1 root  nagios 1287 Apr 20 00:31 object.ring.gz

But yeah - suppose you make changes to the ring on (say) one of the proxies, go make a coffee, then distribute the new rings to the various machines. There is a period of time when the rings on some machines differ from those on others.
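For what it's worth, pushing the rings around is usually just plain scp (or rsync) plus an md5 check. A minimal sketch - the hostnames here are made up, and md5sum is the stock coreutils tool:

```shell
# Hypothetical host list - substitute your own proxies and storage nodes.
HOSTS="proxy1 proxy2 store1 store2"

# Push only the compiled .ring.gz files (keep the .builder files
# safely on the machine where you manage the rings).
for h in $HOSTS; do
    scp /etc/swift/*.ring.gz "$h":/etc/swift/
done

# Quick consistency check: every host should print identical md5s.
for h in $HOSTS; do
    ssh "$h" md5sum /etc/swift/*.ring.gz
done
```

The faster you do this across all nodes, the shorter the mismatch window described above.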

So it is possible that a request initiated via the proxy where you modified the rings will look for an object on the newly added device (which does not have anything on it yet) - in that case the request will be served from a handoff or replica instead (you might get a "not found" if you have num replicas = 1... I haven't tried that out, though).
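You can see exactly which devices a given node's ring maps an object to (primaries first, then handoff candidates) with swift-get-nodes; the account/container/object names below are made up:

```shell
# Where does this object live according to the ring on *this* node?
# Prints the primary devices, then the handoff candidates.
swift-get-nodes /etc/swift/object.ring.gz AUTH_test mycontainer myobject

# Running this against a saved copy of the old object.ring.gz and
# comparing the output shows which requests could land on the
# not-yet-populated device during the window.
```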

I *think* that if you modify the ring on a proxy, it won't be able to force the storage nodes to move an object somewhere it can't be found (they consult their own copy of the ring). In any case, once all the new rings are distributed, subsequent replication runs (where the storage servers chatter with their next and previous neighbours) will reorganise anything that did get moved incorrectly.

John can hopefully give you fuller details (I haven't read up on or tried out all the various scenarios you clearly dream up). However I did do some pretty horrific things (on purpose):

- changing the number of partitions and installing this everywhere (ahem - do not do this in a cluster you care about)
- checking that it utterly breaks everything :-(
- copying back the old rings (do back these up!)
- checking that the cluster is working again :-)
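In that spirit, backing up before you touch anything is cheap. A hedged sketch using the usual default paths - adjust to taste:

```shell
# Snapshot the current rings and builders before a rebalance, so the
# "copy back the old rings" step above is actually possible.
BACKUP="/etc/swift/ring-backup-$(date +%Y%m%d%H%M%S)"
mkdir -p "$BACKUP"
cp /etc/swift/*.ring.gz /etc/swift/*.builder "$BACKUP"/

# Rolling back is then just:
#   cp "$BACKUP"/*.ring.gz /etc/swift/
# followed by redistributing those rings to every node.
```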

So in general, seems pretty robust!

Also, our friend swift-recon can alert you to any non-matching rings:

markir@proxy1:~$ swift-recon --md5
===============================================================================
--> Starting reconnaissance on 4 hosts
===============================================================================
[2016-04-20 16:35:10] Checking ring md5sums
4/4 hosts matched, 0 error[s] while checking hosts.
===============================================================================
[2016-04-20 16:35:10] Checking swift.conf md5sum
4/4 hosts matched, 0 error[s] while checking hosts.
===============================================================================

Cheers

Mark



On 20/04/16 02:37, Peter Brouwer wrote:

Hello All

Followup question.
Assume a Swift cluster with a number of proxy nodes; each node needs to
hold a copy of the ring structure, right?
What happens when a disk is added to the ring? After the change is made
on the first proxy node, the ring files need to be copied to the other
proxy nodes, right?
Is there a risk, during the period the new ring files are being copied,
that a file stored using the new structure via one proxy node is
retrieved via another node that still holds the old structure, returning
"object not found"? Or the odd chance that an object is already being
moved by the rebalance process while being accessed by a proxy that
still has the old ring structure?


