Thank you Robert - I'm wondering, when I remove the total of 7 OSDs from the crush map, whether that will cause more than 37% of the data to be moved (80% or whatever).
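Just to be explicit about what I mean by removing them - for each of the 7 OSDs I'd be running roughly the usual manual removal sequence below (osd.7 is only a placeholder ID here, and the service command depends on the init system):

ceph osd out 7                 # stop sending new data to the OSD and start draining it
service ceph stop osd.7        # stop the daemon on its host
ceph osd crush remove osd.7    # remove it from the crush map (the step that triggers the big rebalance)
ceph auth del osd.7            # delete its authentication key
ceph osd rm 7                  # remove the OSD entry from the cluster

It's the "ceph osd crush remove" step that caused the 37% of misplaced objects last time, which is why I'm asking.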
I'm also wondering if the throttling that I applied is fine or not - I will introduce the osd_recovery_delay_start of 10 sec as Irek said. I'm just wondering how much the performance impact will be, because:
- when stopping an OSD, the impact while backfilling was more or less fine - I can live with this
- when I removed an OSD from the crush map, the impact was tremendous for the first 1h or so, and later on during the recovery process the impact was much less but still noticeable...

Thanks for the tip, of course!
Andrija

On 3 March 2015 at 18:34, Robert LeBlanc <rob...@leblancnet.us> wrote:
> I would be inclined to shut down both OSDs in a node, let the cluster recover. Once it is recovered, shut down the next two, let it recover. Repeat until all the OSDs are taken out of the cluster. Then I would set nobackfill and norecover. Then remove the hosts/disks from the CRUSH map, then unset nobackfill and norecover.
>
> That should give you a few small changes (when you shut down OSDs) and then one big one to get everything into its final place. If you are still adding new nodes, you can add them in while nobackfill and norecover are set, so that the one big relocation fills the new drives too.
>
> On Tue, Mar 3, 2015 at 5:58 AM, Andrija Panic <andrija.pa...@gmail.com> wrote:
> > Thx Irek. The number of replicas is 3.
> >
> > I have 3 servers with 2 OSDs each on a 1G switch (1 OSD already decommissioned), which is further connected to a new 10G switch/network with 3 servers on it with 12 OSDs each. I'm decommissioning the old 3 nodes on the 1G network...
> >
> > So you suggest removing the whole node with its 2 OSDs manually from the crush map? To my knowledge, ceph never places 2 replicas on 1 node; all 3 replicas were originally distributed over all 3 nodes. So anyway it could be safe to remove 2 OSDs at once together with the node itself... since the replica count is 3...?
> >
> > Thx again for your time
> >
> > On Mar 3, 2015 1:35 PM, "Irek Fasikhov" <malm...@gmail.com> wrote:
> >>
> >> Since you have only three nodes in the cluster, I recommend you add the new nodes to the cluster first, and then delete the old ones.
> >>
> >> 2015-03-03 15:28 GMT+03:00 Irek Fasikhov <malm...@gmail.com>:
> >>>
> >>> What replication factor do you have?
> >>>
> >>> 2015-03-03 15:14 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >>>>
> >>>> Hi Irek,
> >>>>
> >>>> yes, stopping an OSD (or setting it OUT) resulted in only 3% of data degraded and moved/recovered. When I afterwards removed it from the crush map with "ceph osd crush rm id", that's when the stuff with 37% happened.
> >>>>
> >>>> And thanks Irek for the help - could you kindly just let me know the preferred steps when removing a whole node? Do you mean I should first stop all its OSDs again, or just remove each OSD from the crush map, or perhaps just decompile the crush map, delete the node completely, compile it back in, and let it heal/recover?
> >>>>
> >>>> Do you think this would result in less data being misplaced and moved around?
> >>>>
> >>>> Sorry for bugging you, I really appreciate your help.
> >>>>
> >>>> Thanks
> >>>>
> >>>> On 3 March 2015 at 12:58, Irek Fasikhov <malm...@gmail.com> wrote:
> >>>>>
> >>>>> The large percentage comes from the rebuild of the cluster map (but the degradation percentage is low). If you had not done "ceph osd crush rm id", the percentage would be low.
> >>>>> In your case, the correct option is to remove the entire node, rather than each disk individually.
> >>>>>
> >>>>> 2015-03-03 14:27 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >>>>>>
> >>>>>> Another question - I mentioned here 37% of objects being moved around - these were MISPLACED objects (degraded objects were 0.001%), after I removed 1 OSD from the crush map (out of 44 OSDs or so).
> >>>>>>
> >>>>>> Can anybody confirm this is normal behaviour - and are there any workarounds?
> >>>>>>
> >>>>>> I understand this is because of CEPH's object placement algorithm, but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the crush map makes me wonder why the percentage is so large?
> >>>>>>
> >>>>>> This doesn't seem good to me, and I have to remove another 7 OSDs (we are demoting some old hardware nodes). This means I could potentially end up with 7 x the same number of misplaced objects...?
> >>>>>>
> >>>>>> Any thoughts?
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> On 3 March 2015 at 12:14, Andrija Panic <andrija.pa...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Thanks Irek.
> >>>>>>>
> >>>>>>> Does this mean that after peering for each PG there will be a delay of 10 sec, meaning that every once in a while I will have 10 sec of the cluster NOT being stressed/overloaded, then the recovery takes place for that PG, then for another 10 sec the cluster is fine, and then it is stressed again?
> >>>>>>>
> >>>>>>> I'm trying to understand the process before actually doing stuff (the config reference is there on ceph.com but I don't fully understand the process).
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Andrija
> >>>>>>>
> >>>>>>> On 3 March 2015 at 11:32, Irek Fasikhov <malm...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi.
> >>>>>>>>
> >>>>>>>> Use the value "osd_recovery_delay_start", for example:
> >>>>>>>> [root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok config show | grep osd_recovery_delay_start
> >>>>>>>>   "osd_recovery_delay_start": "10"
> >>>>>>>>
> >>>>>>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >>>>>>>>>
> >>>>>>>>> HI Guys,
> >>>>>>>>>
> >>>>>>>>> Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused over 37% of the data to rebalance - let's say this is fine (this is when I removed it from the crush map).
> >>>>>>>>>
> >>>>>>>>> I'm wondering - I had previously set some throttling mechanisms, but during the first 1h of rebalancing my recovery rate went up to 1500 MB/s - and VMs were completely unusable - and then for the last 4h of the recovery this rate went down to, say, 100-200 MB/s, and during this time VM performance was still pretty impacted, but at least I could work more or less.
> >>>>>>>>>
> >>>>>>>>> So my question: is this behaviour expected, and is the throttling here working as expected? During the first 1h almost no throttling seemed to be applied, judging by the 1500 MB/s recovery rate and the impact on VMs.
> >>>>>>>>> The last 4h seemed pretty fine (although there was still a lot of impact in general).
> >>>>>>>>>
> >>>>>>>>> I changed these throttling settings on the fly with:
> >>>>>>>>>
> >>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
> >>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
> >>>>>>>>> ceph tell osd.* injectargs '--osd_max_backfills 1'
> >>>>>>>>>
> >>>>>>>>> My journals are on SSDs (12 OSDs per server, with 6 journals on one SSD and 6 journals on the other) - I have 3 of these hosts.
> >>>>>>>>>
> >>>>>>>>> Any thoughts are welcome.
> >>>>>>>>> --
> >>>>>>>>> Andrija Panić
> >>>>>>>>>
> >>>>>>>>> _______________________________________________
> >>>>>>>>> ceph-users mailing list
> >>>>>>>>> ceph-users@lists.ceph.com
> >>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best regards, Fasikhov Irek Nurgayazovich
> >>>>>>>> Mob.: +79229045757
> >>>>>>>
> >>>>>>> --
> >>>>>>> Andrija Panić
> >>>>>>
> >>>>>> --
> >>>>>> Andrija Panić
> >>>>>
> >>>>> --
> >>>>> Best regards, Fasikhov Irek Nurgayazovich
> >>>>> Mob.: +79229045757
> >>>>
> >>>> --
> >>>> Andrija Panić
> >>>
> >>> --
> >>> Best regards, Fasikhov Irek Nurgayazovich
> >>> Mob.: +79229045757
> >>
> >> --
> >> Best regards, Fasikhov Irek Nurgayazovich
> >> Mob.: +79229045757
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Andrija Panić
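P.S. For the record, the throttling I already have in place, plus the delay Irek suggested, would look roughly like this (assuming the same injectargs mechanism also applies to osd_recovery_delay_start):

ceph tell osd.* injectargs '--osd_recovery_max_active 1'
ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
ceph tell osd.* injectargs '--osd_max_backfills 1'
# the new addition - delay, in seconds, before recovery starts on each OSD
ceph tell osd.* injectargs '--osd_recovery_delay_start 10'

And then verify on one of the OSDs, as in Irek's example:
ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok config show | grep osd_recovery_delay_start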
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com