On Sat, Apr 27, 2019, 3:49 PM Nikhil R <nikh.ravin...@gmail.com> wrote:
> We have baremetal nodes with 256GB RAM and 36-core CPUs.
> We are on ceph jewel 10.2.9 with leveldb.
> The OSDs and journals are on the same HDDs.
> We have backfill_max_active, recovery_max_active and
> recovery_op_priority all set to 1.
> The OSDs crash and restart once a PG is backfilled and the next PG
> tries to backfill. This is when we see in iostat that the disk is
> utilised up to 100%.

I would set noout to prevent excess movement in the event of OSD
flapping, and disable scrubbing and deep scrubbing until your
backfilling has completed. I would also bring the new OSDs online a few
at a time rather than all 25 at once if you add more servers. (Rough
command sketches for all of this are at the end of this mail.)

> Appreciate your help, David.
>
> On Sun, 28 Apr 2019 at 00:46, David C <dcsysengin...@gmail.com> wrote:
>
>> On Sat, 27 Apr 2019, 18:50 Nikhil R, <nikh.ravin...@gmail.com> wrote:
>>
>>> Guys,
>>> We now have a total of 105 OSDs on 5 baremetal nodes, each hosting
>>> 21 OSDs on 7TB HDDs, with journals on HDD too. Each journal is about
>>> 5GB.
>>
>> This would imply you've got a separate HDD partition for journals. I
>> don't think there's any value in that, and it would probably be
>> detrimental to performance.
>>
>>> We expanded our cluster last week and added 1 more node with 21 HDDs
>>> and journals on the same disks.
>>> Our client I/O is too heavy and we are not able to backfill even 1
>>> thread during peak hours - if we backfill during peak hours, OSDs
>>> crash, causing undersized PGs, and if we then have another OSD crash
>>> we won't be able to use our cluster due to undersized and recovering
>>> PGs. During non-peak hours we can backfill just 8-10 PGs.
>>> Due to this, our MAX AVAIL is draining very fast.
>>
>> How much RAM have you got in your nodes? In my experience that's a
>> common reason for crashing OSDs during recovery ops.
>>
>> What does your recovery and backfill tuning look like?
>>
>>> We are thinking of adding 2 more baremetal nodes with 21 x 7TB OSDs
>>> on HDD and adding 50GB SSD journals for these.
>>> We aim to backfill from the 105 OSDs a bit faster and expect the
>>> writes from backfills to land on these OSDs faster.
>>
>> SSD journals would certainly help, just be sure it's a model that
>> performs well with Ceph.
>>
>>> Is this a good, viable idea?
>>> Thoughts please?
>>
>> I'd recommend sharing more detail, e.g. full spec of the nodes, Ceph
>> version, etc.
>>
>>> -Nikhil
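The cluster flags I mentioned are set and cleared with the standard
ceph CLI (these flags exist in Jewel); a minimal sketch:

    # quiesce OSD flapping and scrubbing while backfilling is in progress
    ceph osd set noout          # don't mark briefly-down OSDs out
    ceph osd set noscrub        # pause normal scrubbing
    ceph osd set nodeep-scrub   # pause deep scrubbing

    # once backfilling has completed, clear the flags again
    ceph osd unset noout
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub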
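The backfill/recovery throttles you quoted can be checked and changed
at runtime via injectargs (option names as spelled in Jewel; the values
here just mirror the ones you already run with):

    # one backfill and one recovery op per OSD, at the lowest priority
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'

    # verify what a given OSD is actually running with
    # (run on the host of osd.0, via its admin socket)
    ceph daemon osd.0 config show | grep -E 'backfill|recovery'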
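For bringing new OSDs in a few at a time, one common approach is to let
them join with zero CRUSH weight and then raise each weight in small
steps, waiting for backfill to settle in between. A sketch - osd.105
and the weight steps below are placeholders:

    # in ceph.conf on the new nodes, before creating the OSDs:
    # [osd]
    # osd_crush_initial_weight = 0

    # then ramp each OSD up gradually:
    ceph osd crush reweight osd.105 1.0
    # wait for backfill to finish and the cluster to settle, then:
    ceph osd crush reweight osd.105 3.0
    # ...and so on up to the disk's full weight (roughly its size in
    # TiB, so about 6.4 for a 7TB drive)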
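And if you do go with 50GB SSD journals on the new nodes, Jewel-era
deployment is typically ceph-disk with the journal pointed at the SSD.
A sketch under those assumptions - the device names are placeholders,
and osd_journal_size is in MB:

    # in ceph.conf before preparing the disks:
    # [osd]
    # osd_journal_size = 51200   # 50GB journal partitions

    # data device first, journal device second; ceph-disk carves a
    # journal partition out of the SSD:
    ceph-disk prepare /dev/sdd /dev/sdb
    ceph-disk activate /dev/sdd1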
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com