Hi Robert,

Relocating the older hardware to the new racks is also an interesting option. Thanks for the suggestion!
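For reference, a rough sketch of what that relocation could look like on the Ceph side (untested, so treat it as a sketch rather than a procedure; the host and rack names are placeholders taken from the CRUSH map quoted below, and the backfill value is just an example):

    # optionally throttle backfill before touching the map
    ceph tell osd.* injectargs '--osd-max-backfills 1'

    # reparent an older 4T host bucket under one of the new 6T racks
    ceph osd crush move r01-cn01 rack=r02

    # watch the resulting data movement
    ceph -w

Ceph will then backfill the affected PGs to satisfy the new map; the OSDs themselves keep their IDs and data, so nothing needs to be recreated for the move.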
Rogier Dikkes
System Programmer Hadoop & HPC Cloud
SURFsara | Science Park 140 | 1098 XG Amsterdam

> On Apr 23, 2015, at 5:50 PM, Robert LeBlanc <rob...@leblancnet.us> wrote:
>
> If you force CRUSH to put copies in each rack, then you will be limited by
> the smallest rack. You can run into some severe limitations if you try to
> keep your copies in two racks (see the thread titled "CRUSH rule for 3
> replicas across 2 hosts" for some of my explanation about this).
>
> If I were you, I would install almost all the new hardware and hold out a
> few pieces. Get the new hardware up and running, then take down some of the
> original hardware and relocate it in the other cabinets so that you even out
> the older lower-capacity nodes and newer higher-capacity nodes in each
> cabinet. That would give you the best of redundancy and performance (not all
> PGs would have to have a replica on the potentially slower hardware). This
> would allow you to have replication level three and be able to lose a rack.
>
> Another option, if you have the racks, is to spread the new hardware over 3
> racks instead of 2, so that your cluster spans 4 racks. CRUSH will give
> preference to the newer hardware (assuming the CRUSH weights reflect the
> size of the disks) and you would no longer be limited by the older, smaller
> rack.
>
> On Thu, Apr 23, 2015 at 3:20 AM, Rogier Dikkes <rogier.dik...@surfsara.nl> wrote:
> Hello all,
>
> At this moment we have a scenario on which I would like your opinion.
>
> Scenario:
> Currently we have a Ceph environment with 1 rack of hardware; this rack
> contains a couple of OSD nodes with 4T disks. In a few months' time we will
> deploy 2 more racks with OSD nodes; these nodes have 6T disks and 1 more
> node per rack.
>
> Short overview:
> rack1: 4T OSD
> rack2: 6T OSD
> rack3: 6T OSD
>
> At this moment we are playing around with the idea of using the CRUSH map
> to make Ceph 'rack aware' and ensure data is replicated between racks.
> However, from the documentation I gathered that when you enforce data
> replication between buckets, your maximum storage size is limited by the
> smallest bucket. My understanding: if we enforce the objects (size=3) to be
> replicated to 3 racks, then the moment the rack with 4T OSDs is full we
> cannot store data anymore.
>
> Is this assumption correct?
>
> The current idea we are playing with:
>
> - Create 2 rack buckets
> - Create a ruleset that places 2 object replicas across the 2x 6T buckets
> - Create a ruleset that places 1 object replica over all the hosts
>
> This would result in 3 replicas of the object, where we are sure that at
> least 2 replicas are in different racks. In the unlikely event of a rack
> failure we would have at least 1 or 2 replicas left.
>
> Our idea is to have a CRUSH rule and config that look like:
>
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
>
> host r01-cn01 {
>     id -1
>     alg straw
>     hash 0
>     item osd.0 weight 4.00
> }
>
> host r01-cn02 {
>     id -2
>     alg straw
>     hash 0
>     item osd.1 weight 4.00
> }
>
> host r01-cn03 {
>     id -3
>     alg straw
>     hash 0
>     item osd.3 weight 4.00
> }
>
> host r02-cn04 {
>     id -4
>     alg straw
>     hash 0
>     item osd.4 weight 6.00
> }
>
> host r02-cn05 {
>     id -5
>     alg straw
>     hash 0
>     item osd.5 weight 6.00
> }
>
> host r02-cn06 {
>     id -6
>     alg straw
>     hash 0
>     item osd.6 weight 6.00
> }
>
> host r03-cn07 {
>     id -7
>     alg straw
>     hash 0
>     item osd.7 weight 6.00
> }
>
> host r03-cn08 {
>     id -8
>     alg straw
>     hash 0
>     item osd.8 weight 6.00
> }
>
> host r03-cn09 {
>     id -9
>     alg straw
>     hash 0
>     item osd.9 weight 6.00
> }
>
> rack r02 {
>     id -10
>     alg straw
>     hash 0
>     item r02-cn04 weight 6.00
>     item r02-cn05 weight 6.00
>     item r02-cn06 weight 6.00
> }
>
> rack r03 {
>     id -11
>     alg straw
>     hash 0
>     item r03-cn07 weight 6.00
>     item r03-cn08 weight 6.00
>     item r03-cn09 weight 6.00
> }
>
> root 6t {
>     id -12
>     alg straw
>     hash 0
>     item r02 weight 18.00
>     item r03 weight 18.00
> }
>
> rule one {
>     ruleset 1
>     type replicated
>     min_size 1
>     max_size 10
>     step take 6t
>     step chooseleaf firstn 2 type rack
>     step chooseleaf firstn 1 type host
>     step emit
> }
>
> Is this the right approach, and would it cause limitations with regard to
> performance or usability? Do you have suggestions?
>
> Another interesting situation: we are going to move the hardware to a new
> location next year; the rack layout will change and thus the CRUSH map will
> be altered. When changing the CRUSH map in a way that would turn the 2x 6T
> racks into 4 racks, would we need to take any special actions into
> consideration?
>
> Thank you for your answers, they are much appreciated!
>
> Rogier Dikkes
> System Programmer Hadoop & HPC Cloud
> SURFsara | Science Park 140 | 1098 XG Amsterdam
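One way to sanity-check the proposed map and rule before injecting them into the cluster is to compile and test-map them offline with crushtool (the file names below are placeholders):

    # compile the edited text map into a binary CRUSH map
    crushtool -c crushmap.txt -o crushmap.bin

    # simulate placements for ruleset 1 with 3 replicas and print the mappings
    crushtool --test -i crushmap.bin --rule 1 --num-rep 3 --show-mappings

    # summarize how many of the simulated placements land on each OSD
    crushtool --test -i crushmap.bin --rule 1 --num-rep 3 --show-utilization

The --show-mappings output shows directly whether each input gets the full 3 OSDs and in which racks they land, which should answer whether the two chooseleaf steps in "rule one" behave as intended.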
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com