> Hi Janne,
> Thanks for your advice.
>
> So, you mean with K=4, M=2 EC, we need 8 OSD nodes to have better
> protection
As always, it is a tradeoff in cost, speed, availability, storage size
and so on. What you need, if the data is important, is for the cluster
to be able to heal itself.

Running X hosts in a Replication=X setup, or N+M hosts in an EC N+M
setup, works fine while everything is fine. Unfortunately, drives die,
OSes crash, hosts get failed PSUs, and sometimes a simple maintenance
task goes slightly wrong and downtime becomes far longer than expected.
In any such case, where one host is out for a long time, a cluster with
exactly the minimum number of hosts cannot repair itself while that
host is missing, because it was already at the minimum and is now one
step under it. This means that from being fully functional you are one
crash away from being degraded and at risk of data loss, or at least
from a cluster that goes read-only to protect the data against any new
unexpected surprises. If you have N+M+1 hosts or more, the cluster can
recover onto the "excess" hosts' drives and, at some point afterwards,
become fully functional again without your intervention.

One more thing to consider: you should never fill a cluster to 85% or
more, and this needs to take crashes into account as well. If you have
EC 4+2 on 7 hosts and they are 83% full, and one OSD host goes down,
the cluster will not be able to recreate that host's data on the
remaining 6 OSD hosts without pushing them over 85% full. So not only
should you have more hosts than the EC N+M says, you should also keep
spare drive capacity and expand early, to avoid getting into a
situation where you can't repair because the drives are almost full
everywhere.

This is easy to see if you compare the impact of losing one OSD host
when you have 6: 16.6% of the total data needs to spread out over the
remaining 5 hosts, which is a noticeable amount per host. If you had
100 OSD hosts and one crashes, only 1% of the total data needs to be
spread out over the remaining 99, and even if that is the same absolute
amount of space to be rewritten/recreated, the extra data for each host
becomes very small. You can skip going to the datacenter, let it
recover by itself, and if another host dies the week after, it's still
only ~1.1% to be spread out over 98 hosts, again something that is very
manageable without panicking. (The PS below has a quick sketch of this
arithmetic.) If you have EC 4+2 on 6 hosts and one dies in the middle
of the night, it's time to get in the car as soon as possible.

>> Still, if you have EC 4+2 and only 6 OSD hosts, this means if a host
>> dies, the cluster cannot recreate data anywhere without violating the
>> "one copy per host" default placement, so the cluster will be
>> degraded until this host comes back or another one replaces it. For
>> an N+M EC cluster, I would suggest having N+M+1 or even +2 hosts, so
>> that you can do maintenance on a host or lose a host and still be
>> able to recover without visiting the server room.

--
May the most significant bit of your life be positive.
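
PS: since the capacity math is easy to get wrong, here is a quick
back-of-the-envelope sketch in Python. The 85% nearfull threshold and
the host counts are the ones from this thread; the helper name and the
even-spread assumption are mine (real placement is per-OSD and per-PG,
so this is only the averaged approximation used above).

    # Rough model: when one host dies, its share of the data (1/hosts)
    # must be re-created across the survivors. If that pushes the
    # survivors past the ~85% nearfull mark, the cluster cannot fully
    # heal until capacity is added.

    def fill_after_host_loss(hosts: int, fill_before: float) -> float:
        # Average fill of the survivors, assuming data spreads evenly.
        return fill_before * hosts / (hosts - 1)

    for hosts, fill in [(6, 0.83), (7, 0.83), (100, 0.83)]:
        share = 1 / hosts                      # data that must move
        after = fill_after_host_loss(hosts, fill)
        verdict = "room to heal" if after < 0.85 else "NO room to heal"
        print(f"{hosts:3d} hosts at {fill:.0%}: lost host held "
              f"{share:.1%}, survivors would be ~{after:.1%} full "
              f"-> {verdict}")

    # Note: with EC 4+2 on exactly 6 hosts the heal is blocked anyway,
    # since the default one-shard-per-host placement has nowhere to put
    # the sixth shard, regardless of free space.

Running it shows the point of the thread: at 83% full, 6 or 7 hosts end
up well over 85% after a single host loss, while 100 hosts barely move
(~83.8%).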