On 26/04/2019 21.50, Gregory Farnum wrote:
On Fri, Apr 26, 2019 at 10:55 AM Jan Pekař - Imatic wrote:
Hi,
yesterday my cluster reported slow requests for minutes, and after I restarted
the OSDs that were reporting slow requests it got stuck with peering PGs. The
whole cluster was not responding and IO stopped.
I
Guys,
We now have a total of 105 OSDs on 5 bare-metal nodes, each hosting 21 OSDs
on 7 TB HDDs, with journals on HDD too. Each journal is about 5 GB.
We expanded our cluster last week and added 1 more node with 21 HDDs and
journals on the same disks.
Our client i/o is too heavy and we are not able
On Sat, 27 Apr 2019, 18:50 Nikhil R, wrote:
> Guys,
> We now have a total of 105 OSDs on 5 bare-metal nodes, each hosting 21
> OSDs on 7 TB HDDs, with journals on HDD too. Each journal is about
> 5 GB
>
This would imply you've got a separate hdd partition for journals, I don't
think there'
We have bare-metal nodes with 256 GB RAM and 36-core CPUs.
We are on Ceph Jewel 10.2.9 with leveldb.
The OSDs and journals are on the same HDD.
We have 1 backfill_max_active, 1 recovery_max_active and 1
recovery_op_priority
The OSD crashes and restarts once a PG is backfilled and the next PG tries to
backfill. Th
On Sat, Apr 27, 2019, 3:49 PM Nikhil R wrote:
> We have bare-metal nodes with 256 GB RAM and 36-core CPUs.
> We are on Ceph Jewel 10.2.9 with leveldb.
> The OSDs and journals are on the same HDD.
> We have 1 backfill_max_active, 1 recovery_max_active and 1
> recovery_op_priority
> The osd crashes and starts o
Hi,
I have set noout, noscrub and nodeep-scrub, and the last time we added OSDs
we added a few at a time.
The main issue here is with IOPS: the existing OSDs are not able to
backfill at a higher rate, not even 1 thread during peak hours and a max
of 2 threads during off-peak. We are getting m
An update.
We noticed contradictory output from chrony. "chronyc sources" showed
that chrony was synced. However, we also noted this output:
root@ceph2:/etc/chrony# chronyc activity
200 OK
0 sources online
4 sources offline
0 sources doing burst (return to online)
0 sources doing burst (retur
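
The mismatch described above ("chronyc sources" looking synced while
"chronyc activity" reports 0 sources online) could be caught by a small check.
This is only a minimal sketch in Python, assuming the output has the line
format shown above. The function name parse_chronyc_activity is hypothetical
and not part of chrony, and the sample text is just the complete lines pasted
above rather than a fresh capture.

```python
import re


def parse_chronyc_activity(text):
    """Map each source state (e.g. 'online') to its count.

    Assumes lines of the form '<N> sources <state>', as in the
    output quoted above. Other lines (such as '200 OK') are skipped.
    """
    counts = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+) sources? (.+)$", line)
        if match:
            counts[match.group(2).strip()] = int(match.group(1))
    return counts


# Sample taken from the complete lines of the output quoted above.
sample = """200 OK
0 sources online
4 sources offline
0 sources doing burst (return to online)
"""

counts = parse_chronyc_activity(sample)

# Zero online sources means chrony is not actually polling any server,
# regardless of what a cached 'sources' listing suggests.
if counts.get("online", 0) == 0:
    print("warning: no NTP sources online")
```

On a live node the text could come from running "chronyc activity" through
subprocess instead of the hard-coded sample.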