[ceph-users] OSDs going down when we bring down some OSD nodes Or cut-off the cluster network link between OSD nodes

Venkata Manojawa Paritala Sat, 06 Aug 2016 07:54:43 -0700

Hi,

We have configured single Ceph cluster in a lab with the below
specification.


1. Divided the cluster into 3 logical sites (SiteA, SiteB & SiteC). This is
to simulate that nodes are part of different Data Centers and having
network connectivity between them for DR.
2. Each site operates in a different subnet and each subnet is part of one
VLAN. We have configured routing so that OSD nodes in one site can
communicate to OSD nodes in the other 2 sites.
3. Each site will have one monitor  node, 2  OSD nodes (to which we have
disks attached) and IO generating clients.
4. We have configured 2 networks.
4.1. Public network - To which all the clients, monitors and OSD nodes are
connected
4.2. Cluster network - To which only the OSD nodes are connected for -
Replication/recovery/hearbeat traffic.

5. We have 2 issues here.
5.1. We are unable sustain IO for clients from individual sites when we
isolate the OSD nodes by bringing down ONLY the cluster network between
sites. Logically this will make the individual sites to be in isolation
with respect to the cluster network. Please note that the public network is
still connected between the sites.
5.2. In a fully functional cluster, when we bring down 2 sites (shutdown
the OSD services of 2 sites - say Site A OSDs and Site B OSDs) then, OSDs
in the third site (Site C) are going down (OSD Flapping).

We need workarounds/solutions to  fix the above 2 issues.

Below are some of the parameters we have already mentioned in the Cenf.conf
to sustain the cluster for a longer time, when we cut-off the links between
sites. But, they were not successful.

--------------
[global]
public_network = 10.10.0.0/16
cluster_network = 192.168.100.0/16,192.168.150.0/16,192.168.200.0/16
osd hearbeat address = 172.16.0.0/16

[monitor]
mon osd report timeout = 1800

[OSD}
osd heartbeat interval = 12
osd hearbeat grace = 60
osd mon heartbeat interval = 60
osd mon report interval max = 300
osd mon report interval min = 10
osd mon act timeout = 60
.
.
----------------

We also confiured the parameter "osd_heartbeat_addr" and tried with the
values - 1) Ceph public network (assuming that when we bring down the
cluster network hearbeat should happen via public network). 2) Provided a
different network range altogether and had physical connections. But both
the options did not work.

We have a total of 49 OSDs (14 in Site A, 14 in SiteB, 21 in SiteC) in the
cluster. One Monitor in each Site.

We need to try the below two options.

A) Increase the "mon osd min down reporters" value. Question is how much.
Say, if I give this value to 49, then will the client IO sustain when we
cut-off the cluster network links between sites. In this case one issue
would be that if the OSD is really down we wouldn't know.

B) Add 2 monitors to each site. This would make each site with 3 monitors
and the overall cluster will have 9 monitors. The reason we wanted to try
this is, we think that the OSDs are going down as the the quorum is unable
to find the minimum number nodes (may be monitors) to sustain.

Thanks & Regards,
Manoj

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] OSDs going down when we bring down some OSD nodes Or cut-off the cluster network link between OSD nodes

Reply via email to