Re: [ceph-users] PG stuck peering - OSD cephx: verify_authorizer key problem

2019-04-27 Thread Jan Pekař - Imatic
On 26/04/2019 21.50, Gregory Farnum wrote: On Fri, Apr 26, 2019 at 10:55 AM Jan Pekař - Imatic wrote: Hi, yesterday my cluster reported slow requests for minutes, and after restarting the OSDs (the ones reporting slow requests) it got stuck with peering PGs. The whole cluster was not responding and IO stopped. I
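For this kind of situation (PGs stuck peering after OSD restarts, cephx verify_authorizer errors in the OSD logs), a rough diagnostic sketch, not taken from the thread itself, is to list the inactive PGs, query one of them, and compare the cephx key the monitors hold for a suspect OSD with the key in its local keyring (osd.12 below is just a placeholder):

  ceph pg dump_stuck inactive             # PGs stuck peering show up here
  ceph pg <pgid> query                    # substitute a real pgid; "peering_blocked_by" names the blocking OSD
  ceph health detail
  ceph auth get osd.12                    # key as known to the monitors
  cat /var/lib/ceph/osd/ceph-12/keyring   # key the OSD daemon actually uses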

[ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread Nikhil R
Guys, we now have a total of 105 OSDs on 5 baremetal nodes, each hosting 21 OSDs on 7TB HDDs, with the journals on HDD too. Each journal is about 5GB. We expanded our cluster last week and added 1 more node with 21 HDDs and journals on the same disks. Our client I/O is too heavy and we are not able
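Not part of the message, but for a "MAX AVAIL draining fast" symptom the usual first look (a sketch, assuming nothing beyond stock Jewel tooling) is at per-pool and per-OSD utilisation, since MAX AVAIL is driven by the fullest OSD under the pool's CRUSH root:

  ceph df            # per-pool USED and MAX AVAIL
  ceph osd df tree   # per-OSD utilisation and variance, to spot the OSDs pulling MAX AVAIL down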

Re: [ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread David C
On Sat, 27 Apr 2019, 18:50 Nikhil R, wrote: > Guys, > We now have a total of 105 osd's on 5 baremetal nodes each hosting 21 > osd's on HDD which are 7Tb with journals on HDD too. Each journal is about > 5GB > This would imply you've got a separate HDD partition for journals; I don't think there'
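To confirm where the journals actually live on a Jewel/filestore OSD (a sketch, not something quoted from the thread), each OSD's data directory carries a journal symlink, and ceph-disk can list the data and journal partitions per device:

  ls -l /var/lib/ceph/osd/ceph-*/journal   # symlink to a journal partition, or a plain file co-located with the data
  ceph-disk list                           # shows "ceph data" and "ceph journal" partitions per disk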

Re: [ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread Nikhil R
We have baremetal nodes with 256GB RAM and 36-core CPUs. We are on Ceph Jewel 10.2.9 with leveldb. The OSDs and journals are on the same HDD. We have backfill_max_active = 1, recovery_max_active = 1 and recovery_op_priority = 1. The OSD crashes and restarts once a PG is backfilled and the next PG tries to backfill. Th
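The settings named above presumably correspond to the osd_max_backfills, osd_recovery_max_active and osd_recovery_op_priority options. A sketch of how they are normally inspected and changed at runtime on Jewel, not something taken from the thread (osd.0 is just an example, and the "ceph daemon" command has to run on the host carrying that OSD):

  ceph daemon osd.0 config show | egrep 'osd_max_backfills|osd_recovery_max_active|osd_recovery_op_priority'
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'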

Re: [ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread Erik McCormick
On Sat, Apr 27, 2019, 3:49 PM Nikhil R wrote: > We have baremetal nodes 256GB RAM, 36core CPU > We are on ceph jewel 10.2.9 with leveldb > The osd’s and journals are on the same hdd. > We have 1 backfill_max_active, 1 recovery_max_active and 1 > recovery_op_priority > The osd crashes and starts o

Re: [ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread Nikhil R
Hi, I have set noout, noscrub and nodeep-scrub, and the last time we added OSDs we added a few at a time. The main issue here is with IOPS: the existing OSDs are not able to backfill at a higher rate - not even 1 thread during peak hours and a max of 2 threads during off-peak. We are getting m
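For reference (not from the message itself), a minimal sketch of how those cluster flags are set, verified and later cleared:

  ceph osd set noout
  ceph osd set noscrub
  ceph osd set nodeep-scrub
  ceph osd dump | grep flags   # confirms which flags are currently active
  ceph osd unset noout         # likewise for the scrub flags once backfill has finished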

Re: [ceph-users] clock skew

2019-04-27 Thread mj
An update. We noticed contradictory output from chrony. "chronyc sources" showed that chrony was synced. However, we also noted this output:

root@ceph2:/etc/chrony# chronyc activity
200 OK
0 sources online
4 sources offline
0 sources doing burst (return to online)
0 sources doing burst (retur
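The "0 sources online / 4 sources offline" state usually means chronyd is treating all of its configured servers as offline (for example because they were marked offline after a connectivity change), so it keeps reporting the last sync state while no longer steering the clock. A sketch of the usual follow-up, not something stated in the message:

  chronyc online       # mark all configured sources online again
  chronyc activity     # should now report the sources as online
  chronyc sources -v   # per-source reachability and offset
  chronyc tracking     # current offset, stratum and leap status of the local clock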