Paul Emmerich wrote:

> +1 on adding them all at the same time.
> 
> All these methods that gradually increase the weight aren't really
> necessary in newer releases of Ceph.

Because the default backfill/recovery values are lower than they were in, say, 
Dumpling?

Doubling (or more) the size of a cluster in one swoop still means a lot of 
peering and a lot of recovery I/O.  I’ve seen a cluster’s data rate go to or 
near 0 for a brief but nonzero length of time.  If something goes wrong with 
the network (cough cough subtle jumbo frame lossage cough), or if one has 
fat-fingered something along the way, going in increments means that a ^C 
lets the cluster stabilize before very long.  Then you get to troubleshoot with 
HEALTH_OK instead of HEALTH_WARN or HEALTH_ERR.
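
To make the incremental approach concrete, here is a minimal sketch, not 
anyone's production tooling.  The function names, the step size, and the 
choice to wait for HEALTH_OK between steps are all assumptions made for 
illustration; the only Ceph commands it relies on are `ceph osd crush reweight` 
and `ceph health`.  Interrupting it with ^C leaves the OSD at the last weight 
applied, so the cluster only has one small step left to settle.

```python
import subprocess
import time


def weight_steps(target, step):
    """Return increasing CRUSH weights that end exactly at target (hypothetical helper)."""
    if target <= 0 or step <= 0:
        raise ValueError("target and step must be positive")
    steps = []
    i = 1
    # Rounding keeps float noise (e.g. 0.3 * 3) out of the generated weights.
    while round(step * i, 4) < target:
        steps.append(round(step * i, 4))
        i += 1
    steps.append(target)
    return steps


def ramp_osd(osd_id, target, step, run=subprocess.run,
             poll_seconds=30, sleep=time.sleep):
    """Raise one OSD's CRUSH weight in increments, pausing between steps.

    Assumption: the cluster is otherwise healthy, so HEALTH_OK is a usable
    signal that the previous step has finished backfilling.
    """
    for weight in weight_steps(target, step):
        run(["ceph", "osd", "crush", "reweight", f"osd.{osd_id}", str(weight)],
            check=True)
        while True:
            result = run(["ceph", "health"], check=True,
                         capture_output=True, text=True)
            if result.stdout.startswith("HEALTH_OK"):
                break
            sleep(poll_seconds)


# Example (illustrative values): ramp_osd(12, target=3.64, step=0.5)
```

The `run` and `sleep` parameters exist only so the loop can be dry-run 
without touching a cluster.  On a cluster that carries unrelated warnings, the 
HEALTH_OK check would never pass and would need replacing with something 
narrower.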

Having experienced a cluster be DoS’d for hours when its size was tripled in 
one go, I’m once bitten, twice shy.  Yes, that was Dumpling, but even with SSDs 
on Jewel and Luminous I’ve seen significant client performance impact from 
en masse topology changes.

— aad

