Hi Janne / Anthony,

Thanks for the great explanation. I am going ahead with 4+2 EC on 8 OSD nodes, in the hope that it will provide the best protection and performance.

I will have 3 more servers and will run a Monitor, Manager and MDS daemon on each of them. As advised by Anthony, I will also run 2 more monitor daemons on OSD nodes, so that 5 monitors are running in total.

Since I am a newbie to Ceph, can you guys please share docs / commands on how to set up 4+2 EC? I have put my rough attempt just below, and a sanity check of the capacity math from Janne's reply at the very bottom of this mail; please correct me where I am wrong.
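From the erasure-code section of the docs, I think the profile and pool creation would look something like this (the names "ec42" and "cephfs_data_ec" are placeholders I made up, and I have left out pg_num on the assumption that the autoscaler will handle it):

    # Create an EC profile with k=4 data chunks + m=2 coding chunks,
    # one chunk per host ("ec42" is just a placeholder name I picked):
    ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
    ceph osd erasure-code-profile get ec42

    # Create the data pool with that profile, and allow overwrites
    # (needed when an EC pool holds CephFS / RBD data):
    ceph osd pool create cephfs_data_ec erasure ec42
    ceph osd pool set cephfs_data_ec allow_ec_overwrites true

    # Add it as a data pool to the filesystem ("cephfs" standing in for
    # whatever my filesystem ends up being called); the metadata pool
    # stays on a replicated pool:
    ceph fs add_data_pool cephfs cephfs_data_ec

Does crush-failure-domain=host in the profile give the one-chunk-per-host placement you described, or do I need a separate CRUSH rule on top of that?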
Thanks,
Gagan

On Wed, Apr 23, 2025 at 12:09 PM Janne Johansson <icepic...@gmail.com> wrote:
>
> > Hi Janne,
> >
> > Thanks for your advice.
> >
> > So, you mean with K=4 M=2 EC, we need 8 OSD nodes to have better
> > protection
>
> As always, it is a tradeoff in cost, speed, availability, storage size
> and so on. What you need, if the data is important, is for the cluster
> to be able to heal itself.
>
> Running X hosts in a Replication=X setup, or N+M hosts in an EC N+M
> setup, works fine when everything is fine. Unfortunately, drives die,
> OSes crash, hosts get failed PSUs, and sometimes a simple maintenance
> job goes slightly wrong and downtime becomes far longer than expected.
> In any such case, where one host is out for a long time, a cluster
> that has exactly the minimum number of hosts will not be able to
> repair itself while that host is missing, because it was already at
> the minimum and is now one step under it.
>
> This means that from being fully functional you are one crash away
> from being degraded, and at risk of data loss, or at least of a
> cluster that goes read-only to protect the data against any new
> unexpected surprises.
>
> If you have N+M+1 hosts or more, the cluster can recover onto the
> drives of the "excess" hosts and, some time later, become fully
> functional again without your intervention.
>
> One more thing to consider: you should never fill a cluster to 85% or
> more, and that limit has to take crashes into account as well. If you
> have EC 4+2 on 7 hosts that are 83% full and one OSD host goes down,
> the cluster cannot recreate the missing chunks on the remaining 6
> hosts without going over 85% full. So not only should you have more
> hosts than the EC N+M says, you should also keep spare drive capacity
> and expand early, to avoid ending up unable to repair because of
> almost-full drives everywhere.
>
> This is easy to see if you compare the impact of losing one OSD host
> when you have 6: 16.6% of the total data has to spread out over the
> remaining 5 hosts, which is a noticeable amount. If you had 100 OSD
> hosts and one crashes, only 1% of the total data needs to be spread
> out over the remaining 99, and even if that is the same amount of
> space to be rewritten/recreated, the extra data per host is very
> small. You can skip going to the datacenter and let it recover by
> itself, and if another host dies the week after, it's still only
> ~1.1% to spread out over 98 hosts, again something that is very much
> manageable without panicking. If you have EC 4+2 on 6 hosts and one
> dies in the middle of the night, it's time to get in the car as soon
> as possible.
>
> >> Still, if you have EC 4+2 and only 6 OSD hosts, this means if a host
> >> dies, the cluster can not recreate data anywhere without violating
> >> the "one copy per host" default placement, so the cluster will be
> >> degraded until this host comes back or another one replaces it. For
> >> an N+M EC cluster, I would suggest having N+M+1 or even +2 hosts, so
> >> that you can do maintenance on a host, or lose a host, and still be
> >> able to recover without visiting the server room.
>
> --
> May the most significant bit of your life be positive.
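PS, just to check that I am following the capacity math in the quoted mail: with 7 hosts that are 83% full, losing one host means the data that was spread over 7 hosts now has to fit on 6, which works out to about 0.83 * 7/6 = ~97% full on the survivors, well past the 85% line, so recovery cannot complete. With my 8 hosts, to still be able to heal after losing one, I should keep usage under roughly 0.85 * 7/8 = ~74% and expand well before getting near that. Is that the right way to think about it?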