We have bare-metal nodes with 256 GB RAM and 36-core CPUs.
We are on Ceph Jewel 10.2.9 with LevelDB.
The OSDs and journals are on the same HDDs.
We have osd_max_backfills = 1, osd_recovery_max_active = 1 and
osd_recovery_op_priority = 1.
An OSD crashes and restarts once a PG has finished backfilling and the next PG
starts to backfill. At that point iostat shows the disk utilised at up to 100%.
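
For reference, the equivalent runtime commands look roughly like this (a sketch;
osd.0 is just an example id, and the values can also be set persistently in
ceph.conf):

  ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'
  # verify on a running OSD via its admin socket
  ceph daemon osd.0 config show | egrep 'osd_max_backfills|osd_recovery_max_active|osd_recovery_op_priority'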

Appreciate your help, David

On Sun, 28 Apr 2019 at 00:46, David C <dcsysengin...@gmail.com> wrote:

>
>
> On Sat, 27 Apr 2019, 18:50 Nikhil R, <nikh.ravin...@gmail.com> wrote:
>
>> Guys,
>> We now have a total of 105 OSDs across 5 bare-metal nodes, each hosting 21
>> OSDs on 7 TB HDDs, with the journals on HDD too. Each journal is about
>> 5 GB.
>>
>
> This would imply you've got a separate HDD partition for the journals. I don't
> think there's any value in that, and it would probably be detrimental to
> performance.
>
>>
>> We expanded our cluster last week and added 1 more node with 21 HDDs and
>> journals on the same disks.
>> Our client I/O is too heavy and we are not able to run even 1 backfill thread
>> during peak hours - if we backfill during peak hours, OSDs crash, causing
>> undersized PGs, and if another OSD then crashes we won't be able to use our
>> cluster due to undersized and recovering PGs. During off-peak hours we can
>> only backfill 8-10 PGs.
>> Because of this, our MAX AVAIL is draining very fast.
>>
>
> How much RAM have you got in your nodes? In my experience, running short of
> memory is a common reason for OSDs crashing during recovery ops.
>
> What does your recovery and backfill tuning look like?
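>
> Something like the following should dump the relevant throttles from a running
> OSD (osd.0 is just an example id):
>
>   ceph daemon osd.0 config show | egrep 'osd_max_backfills|osd_recovery'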
>
>
>
>> We are thinking of adding 2 more bare-metal nodes, each with 21 x 7 TB OSDs
>> on HDD, and adding 50 GB SSD journals for these.
>> We aim to backfill from the existing 105 OSDs a bit faster and expect the
>> backfill writes coming to these new OSDs to land faster.
>>
>
> SSD journals would certainly help; just be sure it's a model that performs
> well with Ceph.
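>
> A quick way to sanity-check a candidate SSD for journal use is a small
> synchronous 4k write test with fio, something along these lines (this writes
> directly to the device, so only run it on a blank/spare disk; /dev/sdX is a
> placeholder):
>
>   fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
>       --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test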
>
>>
>> Is this a viable idea?
>> Thoughts please?
>>
>
> I'd recommend sharing more detail, e.g. the full spec of the nodes, Ceph
> version, etc.
>
>>
>> -Nikhil
> --
Sent from my iPhone
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
