Hi Janne / Anthony,
                         Thanks for the great explanation. I am going ahead
with 4+2 EC (K=4, M=2) on 8 OSD nodes, in the hope that this will provide
the best protection and performance.

I will have 3 more servers, each running a Monitor, a Manager, and an MDS
daemon. As advised by Anthony, I will run 2 more monitor daemons on OSD
nodes so that 5 monitors are running in total.
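
If it helps to sanity-check my plan: assuming a cephadm-managed cluster, I
believe the monitor / manager / MDS placement could be pinned to those
hosts with something like the commands below (the hostnames are just
placeholders for my nodes). Please correct me if this is not the right
approach.

    # pin the 5 monitors to the three dedicated hosts plus two OSD hosts
    ceph orch apply mon --placement="mon1 mon2 mon3 osd1 osd2"

    # managers and MDS daemons on the three dedicated hosts
    ceph orch apply mgr --placement="mon1 mon2 mon3"
    ceph orch apply mds cephfs --placement="mon1 mon2 mon3"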

Since I am a newbie to Ceph, could you please share docs / commands on how
to set up 4+2 EC, or confirm my attempt below?
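
From what I could find in the docs, I think the EC part would look roughly
like the following, assuming a CephFS data pool (the profile / pool names
and PG counts are just examples I made up); I would appreciate confirmation
that this is right:

    # EC profile with k=4, m=2 and one shard per host
    ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host

    # data pool using that profile
    ceph osd pool create cephfs_data_ec 128 128 erasure ec-4-2

    # required if the pool is used by CephFS or RBD
    ceph osd pool set cephfs_data_ec allow_ec_overwrites true

    # attach it to an existing CephFS file system as an extra data pool
    ceph fs add_data_pool cephfs cephfs_data_ec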

Thanks,
Gagan

On Wed, Apr 23, 2025 at 12:09 PM Janne Johansson <icepic...@gmail.com>
wrote:

> > Hi Janne,
> >                      Thanks for your advice.
> >
> > So, you mean with K=4, M=2 EC, we need 8 OSD nodes to have better
> > protection?
>
> As always, it is a tradeoff in cost, speed, availability, storage size
> and so on.
> What you need if the data is important is for the cluster to be able
> to heal itself.
>
> Running exactly X hosts in a replication=X setup, or exactly N+M hosts
> in an EC N+M setup, works fine as long as everything is fine.
> Unfortunately, drives die, OSes crash, hosts get failed PSUs and
> sometimes, just a simple maintenance thing goes slightly wrong and
> downtime becomes far longer than expected. In any such case, where one
> host is out for a long time, a cluster that has exactly the minimum
> number of hosts will not be able to repair itself while a host is
> missing, because it is already at the minimum and is now one step
> below it.
>
> This means that from being fully functional you are one crash away
> from being degraded and at risk of data loss, or at least from a
> cluster that goes read-only to protect the data against any further
> unexpected surprises.
>
> If you have N+M+1 hosts or more, the cluster can recover onto one of
> the "excess" hosts' drives and, at some point afterwards, become fully
> functional again without your intervention. Another thing to consider
> is that you should never fill a cluster to 85% or more, and that limit
> needs to take crashes into account as well. If you have EC 4+2 on 7
> hosts that are 83% full and one OSD host goes down, the cluster still
> cannot make the extra copies on the remaining 6 hosts without pushing
> them over 85% full (seven hosts at 83% hold roughly 5.8 hosts' worth
> of data, which spread over six hosts is about 97% each). So not only
> should you have more hosts than the EC N+M says, you should also keep
> spare drive capacity and expand early, to avoid a situation where you
> cannot repair because every drive is almost full.
>
> This is easy to see if you compare the impact of losing one OSD host
> when you have 6: in that case 16.6% of the total data has to be spread
> out over the remaining 5 hosts, which is a noticeable amount. If you
> had 100 OSD hosts and one crashes, only 1% of the total data has to be
> spread out over the remaining 99, and even if that is the same amount
> of space to be rewritten/recreated, the extra data per host becomes
> very small. You can skip going to the datacenter and let it recover by
> itself, and if another host dies the week after, it is still only
> ~1.1% to be spread out over 98 hosts, again something that is very
> much manageable without panicking. If you have EC 4+2 on 6 hosts and
> one dies in the middle of the night, it's time to get in the car as
> soon as possible.
>
> >> Still, if you have EC 4+2 and only 6 OSD hosts, this means that if a
> >> host dies, the cluster cannot recreate the data anywhere without
> >> violating the "one copy per host" default placement, so the cluster
> >> will stay degraded until that host comes back or another one replaces
> >> it. For an N+M EC cluster, I would suggest having N+M+1 or even N+M+2
> >> hosts, so that you can do maintenance on a host, or lose a host, and
> >> still be able to recover without visiting the server room.
>
> --
> May the most significant bit of your life be positive.
>
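
PS: to keep an eye on the headroom you describe, my plan is to watch the
cluster fill level and the nearfull/full thresholds with the commands
below (0.85 nearfull is the default as far as I understand); please let me
know if there is a better way.

    # per-pool and raw usage
    ceph df

    # current nearfull / backfillfull / full ratios
    ceph osd dump | grep ratio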