On Sat, Apr 27, 2019, 3:49 PM Nikhil R <nikh.ravin...@gmail.com> wrote:

> We have bare-metal nodes with 256GB RAM and 36-core CPUs.
> We are on Ceph Jewel 10.2.9 with LevelDB.
> The OSDs and journals are on the same HDD.
> We have 1 backfill_max_active, 1 recovery_max_active and 1
> recovery_op_priority.
> The OSD crashes and restarts once a PG has backfilled and the next PG
> tries to backfill. When this happens, iostat shows the disk utilised at
> up to 100%.
>
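
Assuming those refer to the osd_max_backfills, osd_recovery_max_active and
osd_recovery_op_priority options, they can be adjusted at runtime without a
restart. A minimal sketch (confirm the option names against your Jewel build
first):

    # push the throttles to every running OSD via injectargs
    ceph tell osd.* injectargs \
        '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'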

I would set the noout flag to prevent excess data movement in the event of
OSD flapping, and disable scrubbing and deep scrubbing until your
backfilling has completed. I would also bring the new OSDs online a few at a
time rather than all 21 at once if you add more servers.
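
A minimal sketch of those flags with the standard CLI (remember to unset
them once the backfilling has finished):

    # stop OSDs being marked out while they flap
    ceph osd set noout
    # pause scrubbing until the backfill completes
    ceph osd set noscrub
    ceph osd set nodeep-scrub

    # afterwards, re-enable normal behaviour
    ceph osd unset noout
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub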


> Appreciate your help, David.
>
> On Sun, 28 Apr 2019 at 00:46, David C <dcsysengin...@gmail.com> wrote:
>
>>
>>
>> On Sat, 27 Apr 2019, 18:50 Nikhil R, <nikh.ravin...@gmail.com> wrote:
>>
>>> Guys,
>>> We now have a total of 105 OSDs across 5 bare-metal nodes, each hosting
>>> 21 OSDs on 7TB HDDs, with the journals on the same HDDs. Each journal is
>>> about 5GB.
>>>
>>
>> This would imply you've got a separate HDD partition for the journals. I
>> don't think there's any value in that, and it would probably be
>> detrimental to performance.
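>>
>> To confirm how the journals are actually laid out, check where each
>> OSD's journal symlink points (a sketch assuming the default FileStore
>> data path):
>>
>>     ls -l /var/lib/ceph/osd/ceph-*/journal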
>>
>>>
>>> We expanded our cluster last week and added 1 more node with 21 HDDs,
>>> again with the journals on the same disks.
>>> Our client I/O is too heavy and we are not able to backfill even 1
>>> thread during peak hours - if we backfill during peak hours, OSDs crash,
>>> causing undersized PGs, and if we have another OSD crash we won't be
>>> able to use our cluster due to undersized and recovering PGs. During
>>> off-peak hours we can backfill only 8-10 PGs.
>>> Due to this our MAX AVAIL is draining out very fast.
>>>
>>
>> How much RAM have you got in your nodes? In my experience, running out of
>> memory is a common reason for OSDs crashing during recovery ops.
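>>
>> If the OSDs are being OOM-killed, the kernel log should show it; a quick
>> check (standard tooling, nothing Ceph-specific):
>>
>>     dmesg -T | grep -i -e 'out of memory' -e 'killed process'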
>>
>> What does your recovery and backfill tuning look like?
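>>
>> A quick way to dump the current values is the admin socket on any OSD
>> node (a sketch; substitute a local OSD id for osd.0):
>>
>>     ceph daemon osd.0 config show | \
>>         grep -E 'osd_max_backfills|osd_recovery_max_active|osd_recovery_op_priority'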
>>
>>
>>
>>> We are thinking of adding 2 more bare-metal nodes, each with 21 x 7TB
>>> OSDs on HDD, and adding 50GB SSD journals for these.
>>> We aim to backfill from the 105 OSDs a bit faster and expect the
>>> backfill writes coming to these OSDs to be faster.
>>>
>>
>> SSD journals would certainly help; just be sure it's a model that
>> performs well with Ceph.
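>>
>> The journal workload is small synchronous direct writes, so that's what
>> to benchmark; a sketch with fio (note this writes to the raw device, so
>> only run it against an empty disk, and replace /dev/sdX yourself):
>>
>>     fio --name=journal-test --filename=/dev/sdX \
>>         --direct=1 --sync=1 --rw=write --bs=4k \
>>         --numjobs=1 --iodepth=1 --runtime=60 --time_based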
>>
>>>
>>> Is this a viable idea?
>>> Thoughts, please?
>>>
>>
>> I'd recommend sharing more detail, e.g. the full spec of the nodes, the
>> Ceph version, etc.
>>
>>>
>>> -Nikhil
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
