Thank you Robert - I'm wondering, when I remove the total of 7 OSDs from the crush map, whether that will cause more than 37% of the data to be moved (80% or whatever).
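Just to be explicit about what I mean by removing them - for each of the 7 OSDs I'd be running roughly the usual manual removal sequence below (osd.7 is only a placeholder ID here, and the service command depends on the init system):

ceph osd out 7                 # stop sending new data to the OSD and start draining it
service ceph stop osd.7        # stop the daemon on its host
ceph osd crush remove osd.7    # remove it from the crush map (the step that triggers the big rebalance)
ceph auth del osd.7            # delete its authentication key
ceph osd rm 7                  # remove the OSD entry from the cluster

It's the "ceph osd crush remove" step that caused the 37% of misplaced objects last time, which is why I'm asking.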
I'm also wondering if the throttling that I applied is fine or not - I will introduce the osd_recovery_delay_start of 10 sec as Irek said. I'm just wondering how much the performance impact will be, because:
- when stopping an OSD, the impact while backfilling was more or less fine - I can live with this
- when I removed an OSD from the crush map, the impact was tremendous for the first 1h or so, and later on during the recovery process the impact was much less but still noticeable...

Thanks for the tip, of course!
Andrija

On 3 March 2015 at 18:34, Robert LeBlanc <rob...@leblancnet.us> wrote:
> I would be inclined to shut down both OSDs in a node, let the cluster recover. Once it is recovered, shut down the next two, let it recover. Repeat until all the OSDs are taken out of the cluster. Then I would set nobackfill and norecover. Then remove the hosts/disks from the CRUSH map, then unset nobackfill and norecover.
>
> That should give you a few small changes (when you shut down OSDs) and then one big one to get everything into its final place. If you are still adding new nodes, you can add them in while nobackfill and norecover are set, so that the one big relocation fills the new drives too.
>
> On Tue, Mar 3, 2015 at 5:58 AM, Andrija Panic <andrija.pa...@gmail.com> wrote:
> > Thx Irek. The number of replicas is 3.
> >
> > I have 3 servers with 2 OSDs each on a 1G switch (1 OSD already decommissioned), which is further connected to a new 10G switch/network with 3 servers on it with 12 OSDs each. I'm decommissioning the old 3 nodes on the 1G network...
> >
> > So you suggest removing the whole node with its 2 OSDs manually from the crush map? To my knowledge, ceph never places 2 replicas on 1 node; all 3 replicas were originally distributed over all 3 nodes. So anyway it could be safe to remove 2 OSDs at once together with the node itself... since the replica count is 3...?
> >
> > Thx again for your time
> >
> > On Mar 3, 2015 1:35 PM, "Irek Fasikhov" <malm...@gmail.com> wrote:
> >>
> >> Since you have only three nodes in the cluster, I recommend you add the new nodes to the cluster first, and then delete the old ones.
> >>
> >> 2015-03-03 15:28 GMT+03:00 Irek Fasikhov <malm...@gmail.com>:
> >>>
> >>> What replication factor do you have?
> >>>
> >>> 2015-03-03 15:14 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >>>>
> >>>> Hi Irek,
> >>>>
> >>>> yes, stopping an OSD (or setting it OUT) resulted in only 3% of data degraded and moved/recovered. When I afterwards removed it from the crush map with "ceph osd crush rm id", that's when the stuff with 37% happened.
> >>>>
> >>>> And thanks Irek for the help - could you kindly just let me know the preferred steps when removing a whole node? Do you mean I should first stop all its OSDs again, or just remove each OSD from the crush map, or perhaps just decompile the crush map, delete the node completely, compile it back in, and let it heal/recover?
> >>>>
> >>>> Do you think this would result in less data being misplaced and moved around?
> >>>>
> >>>> Sorry for bugging you, I really appreciate your help.
> >>>>
> >>>> Thanks
> >>>>
> >>>> On 3 March 2015 at 12:58, Irek Fasikhov <malm...@gmail.com> wrote:
> >>>>>
> >>>>> The large percentage comes from the rebuild of the cluster map (but the degradation percentage is low). If you had not done "ceph osd crush rm id", the percentage would be low.
> >>>>> In your case, the correct option is to remove the entire node, rather than each disk individually.
> >>>>>
> >>>>> 2015-03-03 14:27 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >>>>>>
> >>>>>> Another question - I mentioned here 37% of objects being moved around - these were MISPLACED objects (degraded objects were 0.001%), after I removed 1 OSD from the crush map (out of 44 OSDs or so).
> >>>>>>
> >>>>>> Can anybody confirm this is normal behaviour - and are there any workarounds?
> >>>>>>
> >>>>>> I understand this is because of CEPH's object placement algorithm, but still, 37% of objects misplaced just by removing 1 OSD out of 44 from the crush map makes me wonder why the percentage is so large?
> >>>>>>
> >>>>>> This doesn't seem good to me, and I have to remove another 7 OSDs (we are demoting some old hardware nodes). This means I could potentially end up with 7 x the same number of misplaced objects...?
> >>>>>>
> >>>>>> Any thoughts?
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> On 3 March 2015 at 12:14, Andrija Panic <andrija.pa...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Thanks Irek.
> >>>>>>>
> >>>>>>> Does this mean that after peering for each PG there will be a delay of 10 sec, meaning that every once in a while I will have 10 sec of the cluster NOT being stressed/overloaded, then the recovery takes place for that PG, then for another 10 sec the cluster is fine, and then it is stressed again?
> >>>>>>>
> >>>>>>> I'm trying to understand the process before actually doing stuff (the config reference is there on ceph.com but I don't fully understand the process).
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Andrija
> >>>>>>>
> >>>>>>> On 3 March 2015 at 11:32, Irek Fasikhov <malm...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi.
> >>>>>>>>
> >>>>>>>> Use the value "osd_recovery_delay_start", for example:
> >>>>>>>> [root@ceph08 ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok config show | grep osd_recovery_delay_start
> >>>>>>>>   "osd_recovery_delay_start": "10"
> >>>>>>>>
> >>>>>>>> 2015-03-03 13:13 GMT+03:00 Andrija Panic <andrija.pa...@gmail.com>:
> >>>>>>>>>
> >>>>>>>>> HI Guys,
> >>>>>>>>>
> >>>>>>>>> Yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused over 37% of the data to rebalance - let's say this is fine (this is when I removed it from the crush map).
> >>>>>>>>>
> >>>>>>>>> I'm wondering - I had previously set some throttling mechanisms, but during the first 1h of rebalancing my recovery rate went up to 1500 MB/s - and VMs were completely unusable - and then for the last 4h of the recovery this rate went down to, say, 100-200 MB/s, and during this time VM performance was still pretty impacted, but at least I could work more or less.
> >>>>>>>>>
> >>>>>>>>> So my question: is this behaviour expected, and is the throttling here working as expected? During the first 1h almost no throttling seemed to be applied, judging by the 1500 MB/s recovery rate and the impact on VMs.
> >>>>>>>>> The last 4h seemed pretty fine (although there was still a lot of impact in general).
> >>>>>>>>>
> >>>>>>>>> I changed these throttling settings on the fly with:
> >>>>>>>>>
> >>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_max_active 1'
> >>>>>>>>> ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
> >>>>>>>>> ceph tell osd.* injectargs '--osd_max_backfills 1'
> >>>>>>>>>
> >>>>>>>>> My journals are on SSDs (12 OSDs per server, with 6 journals on one SSD and 6 journals on the other) - I have 3 of these hosts.
> >>>>>>>>>
> >>>>>>>>> Any thoughts are welcome.
> >>>>>>>>> --
> >>>>>>>>> Andrija Panić
> >>>>>>>>>
> >>>>>>>>> _______________________________________________
> >>>>>>>>> ceph-users mailing list
> >>>>>>>>> ceph-users@lists.ceph.com
> >>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best regards, Fasikhov Irek Nurgayazovich
> >>>>>>>> Mob.: +79229045757
> >>>>>>>
> >>>>>>> --
> >>>>>>> Andrija Panić
> >>>>>>
> >>>>>> --
> >>>>>> Andrija Panić
> >>>>>
> >>>>> --
> >>>>> Best regards, Fasikhov Irek Nurgayazovich
> >>>>> Mob.: +79229045757
> >>>>
> >>>> --
> >>>> Andrija Panić
> >>>
> >>> --
> >>> Best regards, Fasikhov Irek Nurgayazovich
> >>> Mob.: +79229045757
> >>
> >> --
> >> Best regards, Fasikhov Irek Nurgayazovich
> >> Mob.: +79229045757
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Andrija Panić
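P.S. For the record, the throttling I already have in place, plus the delay Irek suggested, would look roughly like this (assuming the same injectargs mechanism also applies to osd_recovery_delay_start):

ceph tell osd.* injectargs '--osd_recovery_max_active 1'
ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
ceph tell osd.* injectargs '--osd_max_backfills 1'
# the new addition - delay, in seconds, before recovery starts on each OSD
ceph tell osd.* injectargs '--osd_recovery_delay_start 10'

And then verify on one of the OSDs, as in Irek's example:
ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok config show | grep osd_recovery_delay_start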
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com