Re: [ceph-users] New OSD Nodes, pgs haven't changed state

Chris Taylor Tue, 11 Oct 2016 14:44:02 -0700

 

I often see on this list that peering issues turn out to be related to
networking and MTU sizes. Perhaps the HP 5400s or the ISP-managed switches
did not have jumbo frames enabled?
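
One way to test this is to send non-fragmentable pings sized to the MTU
across the link. A minimal sketch, assuming Linux iputils `ping` (where
`-M do` forbids fragmentation) and a 9000-byte jumbo MTU; the function
names are made up for illustration and the MTU value is an assumption,
not something taken from this cluster:

```python
import subprocess

IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def df_ping_command(host, mtu=9000, count=3):
    """Build a Linux iputils ping command with the don't-fragment bit set."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def path_carries_mtu(host, mtu=9000):
    """Return True if full-size, non-fragmentable pings reach the host."""
    result = subprocess.run(df_ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

If `path_carries_mtu(peer, 1500)` succeeds between two OSD hosts but
`path_carries_mtu(peer, 9000)` fails, something on the path is probably
dropping jumbo frames. That would be consistent with small packets getting
through while bulk data stalls.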


Hope that helps you determine the issue in case you want to move the
nodes back to the other location. 

Chris 

On 2016-10-11 2:30 pm, Mike Jacobacci wrote: 

> Hi Goncalo, 
> 
> Thanks for your reply! I finally figured out that our issue was with the
> physical setup of the nodes. We had one OSD and MON node in our office and
> the others co-located at our ISP. The two buildings are connected via HP
> 5400s over what is almost dark fiber, but not quite, since there are some
> ISP-managed switches in between doing VLAN rewriting.
> 
> Even though all the interfaces were communicating without issue, no data 
> would move across the nodes. I ended up moving all nodes into the same rack 
> and data immediately started moving and the cluster is now working! So it 
> seems the storage traffic was being dropped/blocked by something on our ISP 
> side. 
> 
> Cheers, 
> Mike 
> 
> On Mon, Oct 10, 2016 at 5:22 PM, Goncalo Borges 
> <goncalo.bor...@sydney.edu.au> wrote:
> 
>> Hi Mike...
>> 
>> I was hoping that someone with a bit more experience would answer you, since
>> I have never had a similar situation. Still, I'll try to step in and help.
>> 
>> Peering is the process by which the OSDs agree on the state of the objects
>> in the PGs they share. It can take some time and is an expensive operation
>> from Ceph's point of view, especially if a lot of peering happens at the
>> same time. This is also one of the reasons why the number of PGs should be
>> increased in very small steps (normally increments of 256 PGs).
>> 
>> Is the number of PGs in peering slowly decreasing, and the number of active
>> PGs increasing? If you see no progress at all after this much time, you may
>> have a problem.
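
To judge whether peering is making progress, the per-state PG counts from
two status snapshots can be compared. A minimal sketch that parses lines of
the form `1761 remapped+peering`, as they appear in the `ceph -w` output
quoted further down (with the quote prefixes stripped); the function names
are made up for illustration:

```python
import re

# Matches lines such as "1761 remapped+peering" or "35 active+clean".
PG_STATE_LINE = re.compile(r"^\s*(\d+)\s+([a-z_]+(?:\+[a-z_]+)*)\s*$")


def pg_state_counts(status_text):
    """Map each PG state string to its count in a block of status output."""
    counts = {}
    for line in status_text.splitlines():
        match = PG_STATE_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def peering_total(counts):
    """Number of PGs whose state includes 'peering'."""
    return sum(n for state, n in counts.items()
               if "peering" in state.split("+"))


def is_progressing(before, after):
    """True if fewer PGs are peering in the later snapshot."""
    return peering_total(after) < peering_total(before)
```

Taking one snapshot, waiting a few minutes, and taking another is enough:
if `is_progressing` stays false across several intervals, the cluster is
stuck, not just slow.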
>> 
>> PGs may fail to leave the peering state because of:
>> - incorrect crush map
>> - issues in osds
>> - issues with the network
>> 
>> Check that your network is working as expected and that you do not have 
>> firewalls blocking traffic and so on.
>> 
>> A pg query for one of those peering pgs may provide some further information 
>> about what could be wrong.
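
`ceph pg <pgid> query` prints JSON. A minimal sketch of pulling out the
parts that tend to matter for a stuck PG follows. The field names used here
(`state`, `up`, `acting`, `recovery_state`, `blocked`,
`down_osds_we_would_probe`) are assumptions about the output layout and may
differ between releases, so check them against your own output; the
function name is made up for illustration:

```python
import json


def summarize_pg_query(query_json):
    """Summarize the JSON text printed by `ceph pg <pgid> query`.

    All field names below are assumptions about the output layout.
    """
    data = json.loads(query_json)
    summary = {
        "state": data.get("state"),
        "up": data.get("up", []),
        "acting": data.get("acting", []),
        "blocked": [],
        "down_osds": [],
    }
    # Each recovery_state entry describes one stage the PG has entered.
    for stage in data.get("recovery_state", []):
        if "blocked" in stage:
            summary["blocked"].append(stage["blocked"])
        summary["down_osds"].extend(stage.get("down_osds_we_would_probe", []))
    return summary
```

A non-empty `blocked` or `down_osds` list points at specific OSDs to look
at; an empty one suggests looking at the network between the OSDs in `up`
and `acting` instead.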
>> 
>> Looking at the OSD logs may also shed some light.
>> 
>> Cheers
>> Goncalo
>> 
>> ________________________________________
>> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Mike 
>> Jacobacci [mi...@flowjo.com]
>> Sent: 10 October 2016 01:55
>> To: ceph-us...@ceph.com
>> Subject: [ceph-users] New OSD Nodes, pgs haven't changed state
>> 
>> Hi,
>> 
>> Yesterday morning I added two more OSD nodes and changed the crushmap from 
>> disk to node. It looked to me like everything went ok besides some disks 
>> missing that I can re-add later, but the cluster status hasn't changed since 
>> then. Here is the output of ceph -w:
>> 
>> cluster 395fb046-0062-4252-914c-013258c5575c
>> health HEALTH_ERR
>> 1761 pgs are stuck inactive for more than 300 seconds
>> 1761 pgs peering
>> 1761 pgs stuck inactive
>> 8 requests are blocked > 32 sec
>> crush map has legacy tunables (require bobtail, min is firefly)
>> monmap e2: 3 mons at {birkeland=192.168.10.190:6789/0,immanuel=192.168.10.125:6789/0,peratt=192.168.10.187:6789/0}
>> election epoch 14, quorum 0,1,2 immanuel,peratt,birkeland
>> osdmap e186: 26 osds: 26 up, 26 in; 1796 remapped pgs
>> flags sortbitwise
>> pgmap v6599413: 1796 pgs, 4 pools, 1343 GB data, 336 kobjects
>> 4049 GB used, 92779 GB / 96829 GB avail
>> 1761 remapped+peering
>> 35 active+clean
>> 2016-10-09 07:00:00.000776 mon.0 [INF] HEALTH_ERR; 1761 pgs are stuck
>> inactive for more than 300 seconds; 1761 pgs peering; 1761 pgs stuck
>> inactive; 8 requests are blocked > 32 sec; crush map has legacy tunables
>> (require bobtail, min is firefly)
>> 
>> I have legacy tunables on since Ceph is only backing our XenServer
>> infrastructure. The numbers of remapped and clean PGs haven't changed, and
>> there doesn't seem to be that much data... Is this normal behavior?
>> 
>> Here is my crushmap:
>> 
>> # begin crush map
>> tunable choose_local_tries 0
>> tunable choose_local_fallback_tries 0
>> tunable choose_total_tries 50
>> tunable chooseleaf_descend_once 1
>> tunable straw_calc_version 1
>> # devices
>> device 0 osd.0
>> device 1 osd.1
>> device 2 osd.2
>> device 3 osd.3
>> device 4 osd.4
>> device 5 osd.5
>> device 6 osd.6
>> device 7 osd.7
>> device 8 osd.8
>> device 9 osd.9
>> device 10 osd.10
>> device 11 osd.11
>> device 12 osd.12
>> device 13 osd.13
>> device 14 osd.14
>> device 15 osd.15
>> device 16 osd.16
>> device 17 osd.17
>> device 18 osd.18
>> device 19 osd.19
>> device 20 osd.20
>> device 21 osd.21
>> device 22 osd.22
>> device 23 osd.23
>> device 24 osd.24
>> device 25 osd.25
>> # types
>> type 0 osd
>> type 1 host
>> type 2 chassis
>> type 3 rack
>> type 4 row
>> type 5 pdu
>> type 6 pod
>> type 7 room
>> type 8 datacenter
>> type 9 region
>> type 10 root
>> # buckets
>> host tesla {
>> id -2 # do not change unnecessarily
>> # weight 36.369
>> alg straw
>> hash 0 # rjenkins1
>> item osd.5 weight 3.637
>> item osd.0 weight 3.637
>> item osd.2 weight 3.637
>> item osd.4 weight 3.637
>> item osd.8 weight 3.637
>> item osd.3 weight 3.637
>> item osd.6 weight 3.637
>> item osd.1 weight 3.637
>> item osd.9 weight 3.637
>> item osd.7 weight 3.637
>> }
>> host faraday {
>> id -3 # do not change unnecessarily
>> # weight 32.732
>> alg straw
>> hash 0 # rjenkins1
>> item osd.23 weight 3.637
>> item osd.18 weight 3.637
>> item osd.17 weight 3.637
>> item osd.25 weight 3.637
>> item osd.20 weight 3.637
>> item osd.22 weight 3.637
>> item osd.21 weight 3.637
>> item osd.19 weight 3.637
>> item osd.24 weight 3.637
>> }
>> host hertz {
>> id -4 # do not change unnecessarily
>> # weight 25.458
>> alg straw
>> hash 0 # rjenkins1
>> item osd.15 weight 3.637
>> item osd.12 weight 3.637
>> item osd.13 weight 3.637
>> item osd.14 weight 3.637
>> item osd.16 weight 3.637
>> item osd.10 weight 3.637
>> item osd.11 weight 3.637
>> }
>> root default {
>> id -1 # do not change unnecessarily
>> # weight 94.559
>> alg straw
>> hash 0 # rjenkins1
>> item tesla weight 36.369
>> item faraday weight 32.732
>> item hertz weight 25.458
>> }
>> # rules
>> rule replicated_ruleset {
>> ruleset 0
>> type replicated
>> min_size 1
>> max_size 10
>> step take default
>> step chooseleaf firstn 0 type host
>> step emit
>> }
>> # end crush map
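
As a sanity check on a decompiled map like the one above, each bucket's
commented total should equal the sum of its item weights (for example,
tesla has ten items at 3.637, about 36.37 against the listed 36.369). A
minimal sketch that parses the text form with the `>> ` quote prefixes
stripped; it only handles the simple layout shown here, and its names and
the 0.01 tolerance are illustrative choices:

```python
import re

BUCKET_START = re.compile(r"^(\w+)\s+(\S+)\s*\{$")
TOTAL_WEIGHT = re.compile(r"^#\s*weight\s+([\d.]+)$")
ITEM_WEIGHT = re.compile(r"^item\s+(\S+)\s+weight\s+([\d.]+)$")


def bucket_weights(crush_text):
    """Return {bucket: (declared_total, summed_item_weights)}."""
    result = {}
    name = declared = None
    total = 0.0
    for raw in crush_text.splitlines():
        line = raw.strip()
        start = BUCKET_START.match(line)
        if start and start.group(1) != "rule":
            # Entering a bucket such as "host tesla {"; rules are skipped.
            name, declared, total = start.group(2), None, 0.0
        elif name and TOTAL_WEIGHT.match(line):
            declared = float(TOTAL_WEIGHT.match(line).group(1))
        elif name and ITEM_WEIGHT.match(line):
            total += float(ITEM_WEIGHT.match(line).group(2))
        elif name and line == "}":
            result[name] = (declared, total)
            name = None
    return result


def mismatched(crush_text, tolerance=0.01):
    """Names of buckets whose items do not add up to the declared weight."""
    return [b for b, (declared, total) in bucket_weights(crush_text).items()
            if declared is None or abs(declared - total) > tolerance]
```

A bucket returned by `mismatched` would be worth a second look before
blaming the network, since an inconsistent CRUSH map is one of the possible
causes listed earlier in the thread.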
>> 
>> Cheers,
>> Mike
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [3]
 

Links:
------
[1] http://192.168.10.190:6789/0,immanuel=192.168.10.1
[2] http://192.168.10.187:6789/0
[3] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
