Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-02-05 Thread Jake Grimmett
Dear Nick & Wido, Many thanks for your helpful advice; our cluster has returned to HEALTH_OK. One caveat is that a small number of PGs remained at "activating". By increasing mon_max_pg_per_osd from 500 to 1000 these few PGs activated, allowing the cluster to rebalance fully, i.e. this was n…
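The preview doesn't show how the new value was applied; a minimal sketch of one way to do it on a Luminous cluster is to persist it in ceph.conf and inject it into the running daemons (injectargs may report that an option is not observed at runtime, in which case a daemon restart is needed; the values below simply mirror those quoted in the message):

    # /etc/ceph/ceph.conf -- persist across restarts
    [global]
        mon_max_pg_per_osd = 1000

    # push the new value into the running mons and OSDs
    ceph tell mon.* injectargs '--mon_max_pg_per_osd 1000'
    ceph tell osd.* injectargs '--mon_max_pg_per_osd 1000'

    # confirm the previously "activating" PGs come up
    ceph health detail | grep -i activating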

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Wido den Hollander
On 01/29/2018 04:21 PM, Jake Grimmett wrote: Hi Nick, many thanks for the tip, I've set "osd_max_pg_per_osd_hard_ratio = 3" and restarted the OSDs. So far it's looking promising; I now have 56% of objects misplaced rather than 3021 PGs inactive. The cluster is now working hard to rebalance. PGs s…

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Jake Grimmett
Hi Nick, many thanks for the tip, I've set "osd_max_pg_per_osd_hard_ratio = 3" and restarted the OSDs. So far it's looking promising; I now have 56% of objects misplaced rather than 3021 PGs inactive. The cluster is now working hard to rebalance. I will report back after things stabilise... many, many thanks…
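A minimal sketch of the change described above, assuming ceph.conf-based configuration and systemd-managed OSDs (the exact steps Jake used aren't shown in the preview):

    # /etc/ceph/ceph.conf
    [osd]
        osd_max_pg_per_osd_hard_ratio = 3

    # on each OSD host, restart the local OSD daemons
    systemctl restart ceph-osd.target

    # watch rebalance/recovery progress
    ceph -s
    ceph osd df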

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Wido den Hollander
On 01/29/2018 02:07 PM, Nick Fisk wrote: Hi Jake, I suspect you have hit an issue that I and a few others have hit in Luminous. By increasing the number of PGs before all the data has re-balanced, you have probably exceeded the hard PG-per-OSD limit. See this thread: https://www.spinics.net/lists/ceph-users/msg41231.html …

Re: [ceph-users] pgs down after adding 260 OSDs & increasing PGs

2018-01-29 Thread Nick Fisk
Hi Jake, I suspect you have hit an issue that I and a few others have hit in Luminous. By increasing the number of PGs before all the data has re-balanced, you have probably exceeded the hard PG-per-OSD limit. See this thread: https://www.spinics.net/lists/ceph-users/msg41231.html Nick
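For anyone hitting the same symptom, the effective cap is mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio (roughly 200 * 2 = 400 with Luminous-era defaults), and during a large rebalance an OSD can temporarily carry more PGs than its steady-state share, leaving PGs stuck inactive or "activating" once the cap is crossed. A generic diagnostic sketch, not taken from the thread (mon.a and osd.0 are placeholder daemon names; ceph daemon must be run on the host where that daemon lives):

    # current values of the two options involved
    ceph daemon mon.a config get mon_max_pg_per_osd
    ceph daemon osd.0 config get osd_max_pg_per_osd_hard_ratio

    # per-OSD PG counts -- look at the PGS column for outliers near the cap
    ceph osd df tree

    # PGs stuck inactive (e.g. activating) while the cap is exceeded
    ceph pg dump_stuck inactive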