Hi Chris,

That's an interesting point; I bet the managed switches don't have jumbo
frames enabled.
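
If we ever move the nodes back, a quick way to verify jumbo frames end to end
would be to ping with the don't-fragment bit set and a payload just under
9000 bytes (8972 bytes of payload plus 28 bytes of IP/ICMP headers). Just a
sketch: the interface name is a placeholder, and the target is one of the node
IPs from the status output below, used as an example:

    # check the configured MTU on the storage interface (placeholder name)
    ip link show dev eth0 | grep mtu

    # ping another node with fragmentation prohibited; if any switch in the
    # path drops jumbo frames, this will fail even though small pings work
    ping -M do -s 8972 -c 3 192.168.10.190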

I think I am going to leave everything at our colo for now.

Cheers,
Mike

On Tue, Oct 11, 2016 at 2:42 PM, Chris Taylor <ctay...@eyonic.com> wrote:

>
>
> I often see on this list that peering issues are related to networking and
> MTU sizes. Perhaps the HP 5400s or the managed switches did not have jumbo
> frames enabled?
>
> Hope that helps you determine the issue in case you want to move the nodes
> back to the other location.
>
>
>
> Chris
>
>
>
> On 2016-10-11 2:30 pm, Mike Jacobacci wrote:
>
> Hi Goncalo,
>
> Thanks for your reply!  I finally figured out that our issue was with the
> physical setup of the nodes.  We had one OSD and MON node in our office and
> the others were co-located at our ISP.  We have what is almost a dark fiber
> run between our two buildings, connected via HP 5400s, but it isn't truly
> dark since there are some ISP-managed switches in between doing VLAN
> rewriting.
>
> Even though all the interfaces were communicating without issue, no data
> would move between the nodes.  I ended up moving all the nodes into the same
> rack, data immediately started moving, and the cluster is now working!  So
> it seems the storage traffic was being dropped or blocked by something on
> the ISP side.
>
> Cheers,
> Mike
>
> On Mon, Oct 10, 2016 at 5:22 PM, Goncalo Borges <
> goncalo.bor...@sydney.edu.au> wrote:
>
>> Hi Mike...
>>
>> I was hoping that someone with a bit more experience would answer you,
>> since I have never had a similar situation. So I'll try to step in and help.
>>
>> The peering process means that the OSDs are agreeing on the state of the
>> objects in the PGs they share. Peering can take some time and is a heavy
>> operation from Ceph's point of view, especially if a lot of peering happens
>> at the same time. This is one of the reasons why pg increases should also
>> be done in very small steps (normally increases of 256 pgs).
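>>
>> As a rough sketch (the pool name below is just a placeholder), each step
>> would look something like this, waiting for the cluster to settle back to
>> HEALTH_OK before the next increase:
>>
>> # grow pg_num in small increments rather than one big jump
>> ceph osd pool set rbd pg_num 512
>> # pgp_num has to follow pg_num before the next step
>> ceph osd pool set rbd pgp_num 512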
>>
>> Is the number of pgs in peering slowly decreasing, and the number of active
>> pgs increasing? If you see no evolution at all after this time, you may
>> have a problem.
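>>
>> Something as simple as the following (just a sketch) lets you watch the
>> counts evolve over time:
>>
>> # summary of pg states, refreshed every 10 seconds
>> watch -n 10 'ceph pg stat'
>> # or compare successive cluster summaries
>> ceph -s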
>>
>> pgs which do not leave the peering state may be stuck because of:
>> - incorrect crush map
>> - issues in osds
>> - issues with the network
>>
>> Check that your network is working as expected and that you do not have
>> firewalls blocking traffic and so on.
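>>
>> For example (the IP below is just a placeholder; OSDs listen on ports
>> 6800-7300 by default, and the mons on 6789):
>>
>> # check that an OSD port on another node is actually reachable
>> nc -zv 192.168.10.190 6800
>> # and that no firewall rules are silently dropping the traffic
>> iptables -L -n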
>>
>> A pg query for one of those peering pgs may provide some further
>> information about what could be wrong.
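>>
>> For example (the pg id below is made up; pick a real one from dump_stuck):
>>
>> # list the stuck pgs, then query one of them
>> ceph pg dump_stuck inactive | head
>> ceph pg 1.2f query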
>>
>> Looking at the osd logs may also shed some light.
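>>
>> For example (osd.0 is just an example id; the log path assumes a default
>> install):
>>
>> tail -f /var/log/ceph/ceph-osd.0.log
>> # or, on systemd-based installs:
>> journalctl -u ceph-osd@0 -f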
>>
>> Cheers
>> Goncalo
>>
>>
>>
>> ________________________________________
>> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of Mike
>> Jacobacci [mi...@flowjo.com]
>> Sent: 10 October 2016 01:55
>> To: ceph-us...@ceph.com
>> Subject: [ceph-users] New OSD Nodes, pgs haven't changed state
>>
>> Hi,
>>
>> Yesterday morning I added two more OSD nodes and changed the crushmap
>> failure domain from disk to node. It looked to me like everything went OK,
>> apart from some missing disks that I can re-add later, but the cluster
>> status hasn't changed since then.  Here is the output of ceph -w:
>>
>>     cluster 395fb046-0062-4252-914c-013258c5575c
>>      health HEALTH_ERR
>>             1761 pgs are stuck inactive for more than 300 seconds
>>             1761 pgs peering
>>             1761 pgs stuck inactive
>>             8 requests are blocked > 32 sec
>>             crush map has legacy tunables (require bobtail, min is
>> firefly)
>>      monmap e2: 3 mons at {birkeland=192.168.10.190:6789/0,
>> immanuel=192.168.10.125:6789/0,peratt=192.168.10.187:6789/0}
>>             election epoch 14, quorum 0,1,2 immanuel,peratt,birkeland
>>      osdmap e186: 26 osds: 26 up, 26 in; 1796 remapped pgs
>>             flags sortbitwise
>>       pgmap v6599413: 1796 pgs, 4 pools, 1343 GB data, 336 kobjects
>>             4049 GB used, 92779 GB / 96829 GB avail
>>                 1761 remapped+peering
>>                   35 active+clean
>> 2016-10-09 07:00:00.000776 mon.0 [INF] HEALTH_ERR; 1761 pgs are stuck
>> inactive for more than 300 seconds; 1761 pgs peering; 1761 pgs stuck
>> inactive; 8 requests are blocked > 32 sec; crush map has legacy tunables
>> (require bobtail, min is firefly)
>>
>>
>> I have legacy tunables on since Ceph is only backing our Xenserver
>> infrastructure.  The numbers of remapped and clean pgs haven't changed, and
>> there doesn't seem to be that much data... Is this normal behavior?
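>>
>> For reference, the current tunables profile can be checked like this;
>> switching to a newer profile (shown commented out) would only be safe once
>> all our Xenserver clients support it:
>>
>> ceph osd crush show-tunables
>> # only once every client supports the newer profile:
>> # ceph osd crush tunables firefly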
>>
>> Here is my crushmap:
>>
>> # begin crush map
>> tunable choose_local_tries 0
>> tunable choose_local_fallback_tries 0
>> tunable choose_total_tries 50
>> tunable chooseleaf_descend_once 1
>> tunable straw_calc_version 1
>> # devices
>> device 0 osd.0
>> device 1 osd.1
>> device 2 osd.2
>> device 3 osd.3
>> device 4 osd.4
>> device 5 osd.5
>> device 6 osd.6
>> device 7 osd.7
>> device 8 osd.8
>> device 9 osd.9
>> device 10 osd.10
>> device 11 osd.11
>> device 12 osd.12
>> device 13 osd.13
>> device 14 osd.14
>> device 15 osd.15
>> device 16 osd.16
>> device 17 osd.17
>> device 18 osd.18
>> device 19 osd.19
>> device 20 osd.20
>> device 21 osd.21
>> device 22 osd.22
>> device 23 osd.23
>> device 24 osd.24
>> device 25 osd.25
>> # types
>> type 0 osd
>> type 1 host
>> type 2 chassis
>> type 3 rack
>> type 4 row
>> type 5 pdu
>> type 6 pod
>> type 7 room
>> type 8 datacenter
>> type 9 region
>> type 10 root
>> # buckets
>> host tesla {
>>         id -2           # do not change unnecessarily
>>         # weight 36.369
>>         alg straw
>>         hash 0  # rjenkins1
>>         item osd.5 weight 3.637
>>         item osd.0 weight 3.637
>>         item osd.2 weight 3.637
>>         item osd.4 weight 3.637
>>         item osd.8 weight 3.637
>>         item osd.3 weight 3.637
>>         item osd.6 weight 3.637
>>         item osd.1 weight 3.637
>>         item osd.9 weight 3.637
>>         item osd.7 weight 3.637
>> }
>> host faraday {
>>         id -3           # do not change unnecessarily
>>         # weight 32.732
>>         alg straw
>>         hash 0  # rjenkins1
>>         item osd.23 weight 3.637
>>         item osd.18 weight 3.637
>>         item osd.17 weight 3.637
>>         item osd.25 weight 3.637
>>         item osd.20 weight 3.637
>>         item osd.22 weight 3.637
>>         item osd.21 weight 3.637
>>         item osd.19 weight 3.637
>>         item osd.24 weight 3.637
>> }
>> host hertz {
>>         id -4           # do not change unnecessarily
>>         # weight 25.458
>>         alg straw
>>         hash 0  # rjenkins1
>>         item osd.15 weight 3.637
>>         item osd.12 weight 3.637
>>         item osd.13 weight 3.637
>>         item osd.14 weight 3.637
>>         item osd.16 weight 3.637
>>         item osd.10 weight 3.637
>>         item osd.11 weight 3.637
>> }
>> root default {
>>         id -1           # do not change unnecessarily
>>         # weight 94.559
>>         alg straw
>>         hash 0  # rjenkins1
>>         item tesla weight 36.369
>>         item faraday weight 32.732
>>         item hertz weight 25.458
>> }
>> # rules
>> rule replicated_ruleset {
>>         ruleset 0
>>         type replicated
>>         min_size 1
>>         max_size 10
>>         step take default
>>         step chooseleaf firstn 0 type host
>>         step emit
>> }
>> # end crush map
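>>
>> (For completeness, a map like this can be pulled, edited, and pushed back
>> with something like the following; the file names are arbitrary:)
>>
>> ceph osd getcrushmap -o crush.bin
>> crushtool -d crush.bin -o crush.txt
>> # edit crush.txt, e.g. "step chooseleaf firstn 0 type host" in the rule
>> crushtool -c crush.txt -o crush.new
>> ceph osd setcrushmap -i crush.new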
>>
>>
>> Cheers,
>> Mike
>>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
