I don't know of a non-impactful way to change this. If any host, rack, etc.
IDs change, it will cause movement. If any CRUSH rule changes what it
chooses from as the failure domain, it will cause movement.
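
The part of the rule that controls this is the chooseleaf step. As a rough
illustration, here is a decompiled default replicated rule from a jewel-era
map (stock names, not from your cluster):

    rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
    }

Changing "type host" to "type rack" in that one line is enough for CRUSH to
recompute placements across the new failure domains and remap most PGs.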

I once ran a test cluster where I put every host into its own "rack" just
so I could change the rule to choose from racks instead of hosts, and it
moved all of the data even though the actual size of the failure domains
didn't change.
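
For what it's worth, that reorg was basically just repeating something like
the following for each host (bucket names here are made up for the example):

    # create a one-host "rack" and hang it off the default root
    ceph osd crush add-bucket rack-node01 rack
    ceph osd crush move rack-node01 root=default
    # move the host under its new rack
    ceph osd crush move node01 rack=rack-node01

and then pointing the rule's chooseleaf step at type rack instead of type
host.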

In Luminous, if you only have HDDs and you change the CRUSH rule to choose
from the default root restricted to the hdd device class, the majority of
your data will move even though the OSDs all stayed where they were and
nothing else changed in the map.
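
In Luminous that change is roughly the following (rule and pool names are
just placeholders):

    # new replicated rule: default root, host failure domain, hdd class only
    ceph osd crush rule create-replicated replicated_hdd default host hdd
    # switch a pool over to it
    ceph osd pool set rbd crush_rule replicated_hdd

The reason the data moves even though the OSD tree looks identical is that
the device class gets its own shadow hierarchy with different bucket IDs,
so CRUSH effectively computes placements against a new tree.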

Afaik, all changes to the crush map like this will move all affected data
around.
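
You can at least estimate the blast radius offline before committing, e.g.
by compiling the edited map and comparing test mappings (the rule id and
replica count below are just examples):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt (e.g. chooseleaf ... type host -> type rack), then:
    crushtool -c crush.txt -o crush.new
    crushtool -i crush.bin --test --show-mappings --rule 0 --num-rep 3 > before
    crushtool -i crush.new --test --show-mappings --rule 0 --num-rep 3 > after
    diff before after | grep -c '^>'   # rough count of inputs that remap

which for a failure-domain change will generally confirm that most of them
do.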

On Wed, Jan 31, 2018, 12:57 PM Bryan Stillwell <bstillw...@godaddy.com>
wrote:

> We're looking into switching the failure domains on several of our
> clusters from host-level to rack-level and I'm trying to figure out the
> least impactful way to accomplish this.
>
> First off, I've made this change before on a couple large (500+ OSDs)
> OpenStack clusters where the volumes, images, and vms pools were all
> about 33% of the cluster.  The way I did it then was to create a new
> rule which had a switch-based failure domain and then did one pool at a
> time.
>
> That worked pretty well, but now I've inherited several large RGW
> clusters (500-1000+ OSDs) where 99% of the data is in the .rgw.buckets
> pool with slower and bigger disks (7200 RPM 4TB SATA HDDs vs. the 10k
> RPM 1.2TB SAS HDDs I was using previously).  This makes the change take
> longer and early testing has shown it being fairly impactful.
>
> I'm wondering if there is a way to more gradually switch to a rack-based
> failure domain?
>
> One of the ideas we had was to create new hosts that are actually the
> racks and gradually move all the OSDs to those hosts.  Once that is
> complete we should be able to turn those hosts into racks and switch the
> failure domain at the same time.
>
> Does anyone see a problem with that approach?
>
> I was also wondering if we could take advantage of RGW in any way to
> gradually move the data to a new pool with the proper failure domain set
> on it?
>
> BTW, these clusters will all be running jewel (10.2.10).  When I made
> this switch previously it was on hammer.
>
> Thanks,
> Bryan
>