[c-nsp] N3K: "VPC peer keep-alive receive has failed"

Manuel Guesdon Thu, 27 Dec 2018 02:59:16 -0800

Hi,

I have a strange problem with Nexus N3K switches and a QinQ tunnel.



I've configured 2 Nexus 3064 with VPC. It worked well for months.

Recently I added a port-channel in dot1q-tunnel mode (the 1st one in this
mode).
Since then I get this message:
"%$ VDC-1 %$ %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 1, VPC peer
keep-alive receive has failed" multiple times a day on both switches.

Details:
  BIOS: version 4.1.0
  NXOS: version 7.0(3)I6(1)

  new interface & port-channel:

        interface Ethernet1/35
          switchport mode dot1q-tunnel
          switchport access vlan 72
          spanning-tree port type edge
          speed 10000
          channel-group 1035

        interface port-channel1035
          switchport mode dot1q-tunnel
          switchport access vlan 72
          speed 10000
          vpc 1035

   A "sh vlan id 72" only reports the peer-link ports/port-channels and
   eth1/35 / po1035.

   There's no other end for this tunnel at the moment.

   The message appears at different times on each switch (i.e. not at the
   same time on both switches) and not the same number of times per day. For
   example today: 3 times on one switch, 6 on the other.
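
   A minimal sketch of how the per-switch, per-day counts above could be
   tallied from collected syslog output. The log line layout (date, time,
   hostname, then the message), the hostnames sw1/sw2 and the sample lines
   are assumptions made up for illustration, not taken from the real logs.

```python
import re
from collections import Counter

# Assumed line layout: "YYYY Mon DD HH:MM:SS hostname ... %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: ..."
LINE_RE = re.compile(
    r"^(?P<date>\d{4} \w{3} +\d{1,2}) \d{2}:\d{2}:\d{2} (?P<host>\S+) .*"
    r"%VPC-2-PEER_KEEP_ALIVE_RECV_FAIL"
)

def count_recv_fail(lines):
    """Return a Counter keyed by (hostname, date) of keep-alive receive failures."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            counts[(m.group("host"), m.group("date"))] += 1
    return counts

# Hypothetical sample lines; hostnames and timestamps are invented.
MSG = "%$ VDC-1 %$ %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 1, VPC peer keep-alive receive has failed"
sample = [
    "2018 Dec 27 01:12:03 sw1 " + MSG,
    "2018 Dec 27 04:40:51 sw2 " + MSG,
    "2018 Dec 27 04:41:10 sw2 %ETHPORT-5-IF_UP: Interface Ethernet1/35 is up",
    "2018 Dec 27 09:02:17 sw2 " + MSG,
]

counts = count_recv_fail(sample)
for (host, date), n in sorted(counts.items()):
    print(host, date, n)
```

   Keying on (hostname, date) makes it easy to see that the two switches do
   not log the message the same number of times on a given day.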

   The load on the switches seems the same as before this new port-channel,
   and there's no load peak around the message date/time (cacti 5-minute
   measurements).

   When I shut the port, the messages no longer appear. When I re-enable it
   they come back.

   I've tried changing the keep-alive parameters:
        --Keepalive interval            : 500 msec
        --Keepalive timeout             : 10 seconds
        --Keepalive hold timeout        : 6 seconds
   but the result is the same.

   The keep-alive link is on a dedicated 2-port port-channel; the IPs are set
   directly on the port-channel, in a VRF.

   1st switch:
        vpc domain 1
          role priority 1
          peer-keepalive destination 10.0.6.3 source 10.0.6.2 vrf pkal \
             interval 500 timeout 10 hold-timeout 6
          peer-gateway
          auto-recovery
          ipv6 nd synchronize
          ip arp synchronize

   2nd switch:
        vpc domain 1
          role priority 2
          peer-keepalive destination 10.0.6.2 source 10.0.6.3 vrf pkal \
             interval 500 timeout 10 hold-timeout 6
          peer-gateway
          auto-recovery
          ipv6 nd synchronize
          ip arp synchronize
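
   A small sanity check, as a sketch only, that the two peer-keepalive lines
   above mirror each other (source and destination swapped, same VRF and
   timers). It assumes each wrapped command is joined back onto one line and
   that the timeout keyword is written as the single word "timeout"; the
   helper names parse_pka and is_mirrored are hypothetical.

```python
import re

# Matches the peer-keepalive command once the wrapped line is joined back together.
PKA_RE = re.compile(
    r"peer-keepalive destination (?P<dst>\S+) source (?P<src>\S+) vrf (?P<vrf>\S+)"
    r" interval (?P<interval>\d+) timeout (?P<timeout>\d+) hold-timeout (?P<hold>\d+)"
)

def parse_pka(line):
    """Extract addresses, VRF and timers from one peer-keepalive config line."""
    m = PKA_RE.search(line)
    if m is None:
        raise ValueError("not a peer-keepalive line: " + line)
    return m.groupdict()

def is_mirrored(a, b):
    """True when source/destination are swapped and VRF and timers are identical."""
    same = all(a[k] == b[k] for k in ("vrf", "interval", "timeout", "hold"))
    return same and a["src"] == b["dst"] and a["dst"] == b["src"]

sw1 = parse_pka(
    "peer-keepalive destination 10.0.6.3 source 10.0.6.2 vrf pkal "
    "interval 500 timeout 10 hold-timeout 6"
)
sw2 = parse_pka(
    "peer-keepalive destination 10.0.6.2 source 10.0.6.3 vrf pkal "
    "interval 500 timeout 10 hold-timeout 6"
)
print(is_mirrored(sw1, sw2))
```

   For the two lines quoted above the check passes, so a mismatch between the
   peers' keep-alive settings does not look like the cause.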


   There's nothing in the logs except the "receive has failed" message.

   There are no errors on the keep-alive interfaces.

   On cacti, I just notice a small drop in outgoing traffic on the keep-alive
   ports around the time of each message, so it seems to be a transmit
   problem rather than a receive problem.

   If I configure 2 other N3Ks with the same configuration (back-to-back
   configuration) for the other end of the tunnel and propagate vlan 72 toward
   them, I start getting the same message on those switches, even if the
   QinQ port on them is down. If I stop propagating the vlan toward them, the
   messages stop on these 2 switches (but continue on the first 2 switches).
   
   Any ideas?



Manuel 

--
______________________________________________________________________
Manuel Guesdon - OXYMIUM
_______________________________________________
cisco-nsp mailing list  [email protected]
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/