We have bare-metal nodes with 256GB RAM and 36-core CPUs, running Ceph Jewel 10.2.9 with leveldb. The OSDs and journals are on the same HDDs. We have osd_max_backfills, osd_recovery_max_active and osd_recovery_op_priority all set to 1. An OSD crashes and restarts once one PG has finished backfilling and the next PG tries to backfill; that is when iostat shows the disk utilised at up to 100%.
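For context, one way to confirm the live throttle values and to pause backfill entirely during peak hours is roughly the following (a rough sketch; osd.0 is just an example ID, and the config check has to run on the node hosting that OSD):

    # check the live throttle values on one OSD via its admin socket
    ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery_max_active|osd_recovery_op_priority'

    # pause backfill/recovery cluster-wide during peak client load...
    ceph osd set nobackfill
    ceph osd set norecover

    # ...and resume once client I/O drops off
    ceph osd unset nobackfill
    ceph osd unset norecover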
Appreciate your help, David.

On Sun, 28 Apr 2019 at 00:46, David C <dcsysengin...@gmail.com> wrote:
>
> On Sat, 27 Apr 2019, 18:50 Nikhil R, <nikh.ravin...@gmail.com> wrote:
>>
>> Guys,
>> We now have a total of 105 OSDs across 5 bare-metal nodes, each hosting
>> 21 OSDs on 7TB HDDs, with journals on HDD too. Each journal is about 5GB.
>
> This would imply you've got a separate HDD partition for journals; I don't
> think there's any value in that and it would probably be detrimental to
> performance.
>
>> We expanded our cluster last week and added 1 more node with 21 HDDs and
>> journals on the same disks.
>> Our client I/O is too heavy and we are not able to backfill even 1 thread
>> during peak hours - if we backfill during peak hours, OSDs crash, causing
>> undersized PGs, and if another OSD crashes we won't be able to use our
>> cluster due to undersized and recovering PGs. During non-peak hours we can
>> backfill only 8-10 PGs.
>> Because of this, our MAX AVAIL is draining very fast.
>
> How much RAM have you got in your nodes? In my experience that's a common
> reason for crashing OSDs during recovery ops.
>
> What does your recovery and backfill tuning look like?
>
>> We are thinking of adding 2 more bare-metal nodes with 21 x 7TB OSDs on
>> HDD and adding 50GB SSD journals for these.
>> We aim to backfill from the 105 OSDs a bit faster and expect the backfill
>> writes to land on these OSDs faster.
>
> SSD journals would certainly help, just be sure it's a model that performs
> well with Ceph.
>
>> Is this a good, viable idea?
>> Thoughts please?
>
> I'd recommend sharing more detail, e.g. the full spec of the nodes, Ceph
> version etc.
>
>> -Nikhil
>
> --
> Sent from my iPhone
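On the SSD journal plan discussed above, the provisioning for the new nodes would be something along these lines with the Jewel-era ceph-disk tool (a rough sketch only; /dev/sdb and /dev/sdx are placeholder device names, and the 50GB journal size is set in ceph.conf before preparing the OSDs):

    # ceph.conf on the new nodes, set before preparing the OSDs
    [osd]
    osd journal size = 51200    ; 50GB journals, value is in MB

    # prepare each 7TB data HDD with its journal partition carved out of the SSD
    ceph-disk prepare /dev/sdb /dev/sdx

    # activate the newly prepared data partition
    ceph-disk activate /dev/sdb1

Worth double-checking the journal-to-SSD ratio and the SSD's sync write performance first, per David's note above about picking a model that performs well with Ceph.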