Hi All,
I have a cluster with 16 OSDs on 4 nodes and a standard RGW install
with the standard RGW pools. Replication on those pools is set to 2
(size 2, min_size 1).
We have previously had one node drop out completely (so 4 OSDs). The
cluster health went to HEALTH_WARN, and RGW as well as the other pools
kept working fine.
This time we had added a test pool with replication 1 (size 1,
min_size 1). The node died again and 4 OSDs dropped out, but now the
cluster went to HEALTH_ERR and RGW stopped responding entirely. I'm not
sure why that would be the case.
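As a rough illustration of the difference between the two situations,
the sketch below models what happens to a single placement group (PG)
when one node is lost. It assumes that the replicas of a PG sit on
different nodes (the default CRUSH host failure domain; the actual CRUSH
rules of this cluster are not shown here). The Pool class, the pg_state
helper, the pool names and the ok/degraded/blocked labels are made up
for illustration and are not Ceph APIs or real PG state names.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str      # hypothetical pool name, for illustration only
    size: int      # number of replicas the pool keeps per PG
    min_size: int  # replicas that must be available for the PG to serve I/O


def pg_state(pool: Pool, replicas_lost: int) -> str:
    """Simplified availability of one PG after losing some of its replicas.

    Returns an illustrative label, not a real Ceph PG state string.
    """
    surviving = max(pool.size - replicas_lost, 0)
    if surviving == pool.size:
        return "ok"        # all replicas present
    if surviving > 0 and surviving >= pool.min_size:
        return "degraded"  # still serves I/O, re-replicates in the background
    return "blocked"       # too few copies left, I/O to this PG waits


if __name__ == "__main__":
    # Assumption: one node fails and the PG had exactly one replica on it.
    for pool in (Pool("rgw-pool", size=2, min_size=1),
                 Pool("test-pool", size=1, min_size=1)):
        print(pool.name, pg_state(pool, replicas_lost=1))
```

Under these assumptions the size 2 pools only become degraded, while any
PG of the size 1 pool that lived on the failed node has no copy left.
This does not by itself explain why RGW, which uses the size 2 pools,
stopped responding.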
I understand that with a pool that uses size 1, an OSD dropping out
unrecoverably means you lose pretty much all of that pool's data. The
pool was only there for some benchmarking, but I didn't know it would
affect the entire cluster. Restarting the radosgw service itself
succeeded, but the gateway would not serve requests and showed errors
like these in the logs:
2016-11-18 11:13:47.231827 7f0aaadb2a00 10 cannot find current period zonegroup using local zonegroup
2016-11-18 11:13:47.231860 7f0aaadb2a00 20 get_system_obj_state: rctx=0x7fffb14242c0 obj=.rgw.root:default.realm state=0x564c3fa99858 s->prefetch_data=0
2016-11-18 11:13:47.232754 7f0aaadb2a00 10 could not read realm id: (2) No such file or directory
2016-11-18 11:13:47.232772 7f0aaadb2a00 10 Creating default zonegroup
2016-11-18 11:13:47.233376 7f0aaadb2a00 10 couldn't find old data placement pools config, setting up new ones for the zone
...
2016-11-18 11:13:47.251629 7f0aaadb2a00 10 ERROR: name default already in use for obj id 712c74f9-baf4-4d74-956b-022c67e4a5bb
2016-11-18 11:13:47.251631 7f0aaadb2a00 10 create_default() returned -EEXIST, we raced with another zonegroup creation
Full log here: http://pastebin.com/iYpiF9wP
Once we removed the pool with size = 1 via 'rados rmpool', the cluster
started recovering and RGW served requests again!
Any ideas why a single pool with size 1 would stop RGW from responding?
Cheers,
Thomas
--
Thomas Gross
TGMEDIA Ltd.
p. +64 211 569080 | i...@tgmedia.co.nz