Hello All
Follow-up question.
Assume a Swift cluster with a number of Swift proxy nodes; each proxy
node needs to hold a copy of the ring structure, right?
What happens when a disk is added to the ring? After the change is made
on the first proxy node, the ring files need to be copied to the
other proxy nodes, right?
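For reference, the workflow I have in mind is roughly the following (the
device string and weight below are just placeholder values):

 $ swift-ring-builder object.builder add r1z1-10.0.0.10:6000/sdb 100
 $ swift-ring-builder object.builder rebalance
 # then copy the resulting object.ring.gz (not the .builder file) to
 # every proxy and storage node, e.g. with scp or rsync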
Is there a risk, during the period in which the new ring files are being
copied, that a file stored using the new structure via one proxy node is
then retrieved via another node that still holds the old structure, and
the request returns object not found? Or the odd chance that an object
has already been moved by the rebalance process while being accessed by
a proxy that still has the old ring structure?
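(As an aside, I gather swift-recon can at least report whether all nodes
are running the same ring, by comparing ring md5sums across the cluster:

 $ swift-recon --md5

so a mismatch during the copy window should at least be visible.)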
Regards
Peter
On 16/03/2016 00:23, Mark Kirkwood wrote:
On 16/03/16 00:51, Peter Brouwer wrote:
Ah, good info. Follow-up question: assume the worst case (just to
emphasise the situation), one copy (replication = 1), and a disk
approaching its maximum capacity.
How can you monitor this situation, i.e. to avoid the disk-full
scenario, and
if the disk is full, what type of error is returned?
Let's do an example: 4 storage nodes (obj1...obj4), each with 1 disk
(vdb) added to the ring. Replication set to 1.
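(The client-side steps aren't shown here; roughly, with the
python-swiftclient CLI, something like:

 $ truncate -s 1G obj0
 $ swift upload con0 obj0

with obj0 a 1G local file and con0 the container from the URL further
down.)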
Firstly, write a 1G object (to see where it is going to go)... it ends
up on host obj1, disk vdb, partition 1003:
obj1 $ ls -l
/srv/node/vdb/objects/1003/d31/fae796287c852f0833316a3dadfb3d31/
total 1048580
-rw------- 1 swift swift 1073741824 Mar 16 10:15 1458079557.01198.data
Then remove it (the zero-length .ts file left behind below is the
tombstone marker):
obj1 $ ls -l
/srv/node/vdb/objects/1003/d31/fae796287c852f0833316a3dadfb3d31/
total 4
-rw------- 1 swift swift 0 Mar 16 10:47 1458078463.80396.ts
...and use up space on obj1/vdb (dd a 29G file into /srv/node/vdb
somewhere)
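Something along these lines, with the path and count just placeholders
for "a 29G file somewhere on that filesystem":

 $ dd if=/dev/zero of=/srv/node/vdb/filler bs=1M count=29696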
obj1 $ df -m|grep vdb
/dev/vdb 30705 29729 977 97% /srv/node/vdb
Add the object again (it ends up on obj4 instead... a handoff node):
obj4 $ ls -l
/srv/node/vdb/objects/1003/d31/fae796287c852f0833316a3dadfb3d31/
total 1048580
-rw------- 1 swift swift 1073741824 Mar 16 11:06 1458079557.01198.data
So Swift is coping with the obj1/vdb disk being too full. Remove the
object again and exhaust the space on all disks (dd again):
@obj[1-4] $ df -h|grep vdb
/dev/vdb 30G 30G 977M 97% /srv/node/vdb
Now attempt to write the 1G object again:
swiftclient.exceptions.ClientException:
Object PUT failed:
http://192.168.122.61:8080/v1/AUTH_9a428d5a6f134f829b2a5e4420f512e7/con0/obj0
503 Service Unavailable
So we get an HTTP 503 to show that the PUT has failed.
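For what it's worth, a quick way to see the raw status code is a curl
PUT ($TOKEN below being an already-obtained auth token; in this scenario
it should print 503):

 $ curl -s -o /dev/null -w '%{http_code}\n' -X PUT \
     -H "X-Auth-Token: $TOKEN" --data-binary @obj0 \
     http://192.168.122.61:8080/v1/AUTH_9a428d5a6f134f829b2a5e4420f512e7/con0/obj0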
Now, regarding monitoring: out of the box, swift-recon covers this:
proxy1 $ swift-recon -dv
===============================================================================
--> Starting reconnaissance on 4 hosts
===============================================================================
[2016-03-16 13:16:54] Checking disk usage now
-> http://192.168.122.63:6000/recon/diskusage: [{u'device': u'vdc',
u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size':
32196526080}, {u'device': u'vdb', u'avail': 1024225280, u'mounted':
True, u'used': 31172300800, u'size': 32196526080}]
-> http://192.168.122.64:6000/recon/diskusage: [{u'device': u'vdc',
u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size':
32196526080}, {u'device': u'vdb', u'avail': 1024274432, u'mounted':
True, u'used': 31172251648, u'size': 32196526080}]
-> http://192.168.122.62:6000/recon/diskusage: [{u'device': u'vdc',
u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size':
32196526080}, {u'device': u'vdb', u'avail': 1024237568, u'mounted':
True, u'used': 31172288512, u'size': 32196526080}]
-> http://192.168.122.65:6000/recon/diskusage: [{u'device': u'vdc',
u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size':
32196526080}, {u'device': u'vdb', u'avail': 1024221184, u'mounted':
True, u'used': 31172304896, u'size': 32196526080}]
Distribution Graph:
  0%    4 *********************************************************************
 96%    4 *********************************************************************
Disk usage: space used: 124824018944 of 257572208640
Disk usage: space free: 132748189696 of 257572208640
Disk usage: lowest: 0.1%, highest: 96.82%, avg: 48.4617574245%
===============================================================================
So integrating swift-recon into regular monitoring/alerting
(collectd/nagios or whatever) is one approach (mind you, most folk
already monitor disk usage data... and there is nothing overly special
about ensuring you don't run out of space)!
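As a rough sketch of wiring that in, each storage node's recon
middleware can also be polled directly (jq assumed available; host/port
taken from the output above):

 $ curl -s http://192.168.122.62:6000/recon/diskusage \
     | jq -r '.[] | select(.mounted) | "\(.device) \(100 * .used / .size)% used"'

A check script could then alert once any device crosses, say, 90%.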
BTW, thanks for your patience in sticking with me on this.
No worries - a good question (once I finally understood it).
regards
Mark
--
Regards,
Peter Brouwer, Principal Software Engineer,
Oracle Application Integration Engineering.
Phone: +44 1506 672767, Mobile +44 7720 598 226
E-Mail: peter.brou...@oracle.com
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack