On Thu, Jul 6, 2017 at 9:18 AM, Gregory Farnum <gfar...@redhat.com> wrote:
> On Thu, Jul 6, 2017 at 7:04 AM <bruno.cann...@stfc.ac.uk> wrote:
>
>> Hi Ceph Users,
>>
>> We plan to add 20 storage nodes to our existing cluster of 40 nodes;
>> each node has 36 x 5.458 TiB drives. We plan to add the storage such
>> that all new OSDs are prepared, activated and ready to take data, but
>> take none until we start slowly increasing their weightings. We also
>> expect this not to cause any backfilling before we adjust the
>> weightings.
>>
>> When testing the deployment on our development cluster, adding a new
>> OSD to the host bucket with a crush weight of 5.458 and an OSD
>> reweight of 0 (we have set "noin") causes the acting sets of a few
>> pools to change, thus triggering backfilling. Interestingly, none of
>> the pools that are backfilling have the new OSD in their acting set.
>>
>> This was not what we expected, so I have to ask: is what we are
>> trying to achieve possible, and how best should we go about it?
>
> Yeah, there's an understandable but unfortunate bit here: when you add
> a new CRUSH device/bucket to a CRUSH bucket (so, a new disk to a host,
> or a new host to a rack) you change the overall weight of that bucket
> (the host or rack). So even though the new OSD might be added with a
> *reweight* of zero, it has a "real" weight of 5.458, and so a little
> more data is mapped into the host/rack, even though none gets directed
> to the new disk until you set its reweight value up.
>
> As you note below, if you add the disks with a weight of zero that
> doesn't happen, so you can try doing that and weighting them up
> gradually.
> -Greg

This works well for us: adding OSDs with a crush weight of 0 ("osd crush
initial weight = 0") and slowly crush-weighting them in while the
reweight stays at 1. This should also result in less overall data
movement, if that is a concern. (Rough command sketches follow at the
end of this mail, below the quoted thread.)

>> Commands run:
>>
>> ceph osd crush add osd.43 0 host=ceph-sn833     - causes no backfilling
>> ceph osd crush add osd.44 5.458 host=ceph-sn833 - does cause backfilling
>>
>> For multiple hosts and OSDs, we plan to prepare a new crushmap and
>> inject that into the cluster.
>>
>> Best wishes,
>> Bruno
>>
>> Bruno Canning
>> LHC Data Store System Administrator
>> Scientific Computing Department
>> STFC Rutherford Appleton Laboratory
>> Harwell Oxford
>> Didcot
>> OX11 0QX
>> Tel. +44 ((0)1235) 446621

--
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
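To put numbers on Greg's point: with 36 x 5.458 TiB drives, a host
bucket weighs roughly 196.5; adding one more drive at crush weight 5.458
pushes it to about 201.9, so CRUSH gives that host a slightly larger
share of the data and remaps some PGs onto the *existing* OSDs in it,
even though the new OSD itself takes nothing until it is weighted up.

Below is a rough sketch of the zero-weight workflow. The OSD id, host
name and weight steps are only illustrative (borrowed from the thread),
the pacing is up to you, and syntax can vary a little between releases,
so please try it on a dev cluster first:

    # In ceph.conf on the new nodes, before creating the OSDs, so they
    # join the crush map with a weight of 0:
    [osd]
    osd crush initial weight = 0

    # Or add an already-prepared OSD by hand with a crush weight of 0
    # (no data moves at this point):
    ceph osd crush add osd.43 0 host=ceph-sn833

    # If "noin" is set, mark the OSD in by hand; with a crush weight of
    # 0 it still receives no data:
    ceph osd in osd.43

    # Raise the crush weight in small steps, letting backfill settle
    # (watch "ceph -s") before each increase:
    ceph osd crush reweight osd.43 1.0
    ceph osd crush reweight osd.43 2.5
    ceph osd crush reweight osd.43 4.0
    ceph osd crush reweight osd.43 5.458   # full weight for a 5.458 TiB drive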
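For the bulk change Bruno mentions (preparing a new crushmap and
injecting it for many hosts and OSDs at once), the usual
decompile/edit/recompile round trip looks roughly like the following.
Filenames are arbitrary and this is only a sketch; crushtool lets you
check the result offline before injecting it:

    # Dump and decompile the current map:
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # Edit crushmap.txt: add the new host buckets and OSDs with a
    # weight of 0.000 so nothing moves when the map is injected.

    # Recompile and (optionally) test the mappings offline:
    crushtool -c crushmap.txt -o crushmap.new
    crushtool -i crushmap.new --test --show-statistics

    # Inject the new map:
    ceph osd setcrushmap -i crushmap.new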
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com