Re: [OSL | CCIE_SP] Carrier Supporting Carrier (CSC) - LDP peering between SPs drops every 3 minutes

Jo Knight Fri, 04 Sep 2009 10:49:39 -0700

I am seeing the same thing on this IOS:

Cisco IOS Software, 7200 Software (C7200-K91P-M), Version 12.2(25)S9,
RELEASE SOFTWARE (fc1)


00:27:36: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:27:43: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:30:43: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:30:50: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:33:50: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:34:00: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:37:00: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:37:11: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:40:11: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
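
A quick way to confirm the pattern: a small Python sketch that parses the log above and measures how long each session stayed up. The helper name `up_to_down_gaps` is my own invention, not an IOS tool. Every gap comes out at 180 seconds, which would fit a 180-second LDP session hold time, assuming that is what these routers are using.

```python
import re

LOG = """\
00:27:36: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:27:43: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:30:43: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:30:50: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:33:50: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:34:00: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:37:00: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
00:37:11: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is UP
00:40:11: %LDP-5-NBRCHG: LDP Neighbor 150.1.1.1:0 is DOWN
"""

# Uptime-style timestamp (hh:mm:ss) followed by the neighbor change message.
PATTERN = re.compile(r"^(\d+):(\d+):(\d+): %LDP-5-NBRCHG: .* is (UP|DOWN)$")


def up_to_down_gaps(log_text):
    """Return the seconds between each UP event and the following DOWN event."""
    gaps = []
    last_up = None
    for line in log_text.splitlines():
        match = PATTERN.match(line.strip())
        if not match:
            continue
        hours, minutes, seconds, state = match.groups()
        t = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        if state == "UP":
            last_up = t
        elif last_up is not None:
            # A DOWN with no earlier UP in the log is skipped.
            gaps.append(t - last_up)
            last_up = None
    return gaps


print(up_to_down_gaps(LOG))  # [180, 180, 180, 180]
```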


I implemented the workaround and it seems stable now.

Is it happening due to ldp being configured on an interface within a VRF in
this case?

Good spot though - thanks.

Jo

2009/9/4 Bryan Bartik <[email protected]>

> Well I think Antoine was on to something...and I found another workaround
> :-)
>
> I did a packet capture in Dynamips and just before the session drops, the
> TCP/LDP keepalives between R1 and R9 loop back and forth between the two
> routers. I get about 254 packets in under a second and I can see the TTL
> decrement on each one! This happens every minute, and eventually the timeout
> expires because these keepalives are never received and processed locally by
> R9 (thus it drops every 3 minutes). R1 is sending the keepalive as a labeled
> packet even though R9 is directly connected; I think it should be using an
> imp-null label. According to the capture, R1 is putting label 20 on the
> keepalive. Here I verified it:
>
> R1#sho mpls ldp bindings
>   lib entry: 123.1.9.0/24, rev 2
>         local binding:  label: imp-null
>         remote binding: lsr: 123.1.9.9:0, label: 20
>
> R9#sho mpls forwarding-table labels 20
> Local  Outgoing      Prefix            Bytes Label   Outgoing   Next Hop
> Label  Label or VC   or Tunnel Id      Switched      interface
> 20     Pop Label     123.1.9.0/24[V]   3358266       Se1/2      point2point
> R9#
>
> R1 is learning the label 20 and using that to send packets onto the
> directly connected network. R9 receives it, pops it and sends it back out of
> the interface instead of receiving it locally. (I am using the interface as
> ldp transport address for this example). It's hard to tell who is at fault
> here!
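
To put rough numbers on the loop described above, here is a minimal Python sketch. It assumes the keepalive leaves R1 with a TTL of 255 and that each router decrements the TTL by one before bouncing the packet back; both are assumptions, not something read from the capture. `link_traversals` and `session_expiry` are hypothetical helper names, and the 180-second hold time is simply taken from the 3-minute drop interval.

```python
def link_traversals(initial_ttl):
    """Count how many times a looping packet crosses the R1-R9 link.

    Assumes the receiving router decrements the TTL by one and sends the
    packet back while the TTL is still above zero, otherwise drops it.
    """
    ttl = initial_ttl
    crossings = 0
    while ttl > 0:
        crossings += 1  # packet is on the wire with this TTL
        ttl -= 1        # receiver decrements before bouncing it back
    return crossings


def session_expiry(last_keepalive_rx, hold_time=180):
    """Time (in seconds) at which the session times out if nothing else arrives."""
    return last_keepalive_rx + hold_time


# One keepalive burns through its whole TTL on the link...
print(link_traversals(255))
# ...and if the last keepalive R9 processed was at t=0, the session dies at t=180.
print(session_expiry(0))
```

With those assumptions a single keepalive crosses the link a couple of hundred times before expiring, which is in the same ballpark as the roughly 254 packets in the capture; the exact count depends on the starting TTL.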
>
> In order for the session to actually come up again, all the learned labels
> have to get flushed so the TCP exchange can happen. Once it does, TCP
> handshake and label exchange occurs without any label encapsulation. R9 then
> advertises the vrf subnet (connected to R1) in a label mapping message with
> a label of 20 (verified with capture). This label is now used by R1 and any
> further TCP exchange uses that label and packets loop. It's funny to see it
> happen, because as soon as R9 sends that label mapping, R1 uses the label in
> the ACK! However, this does not cause any problems at this point, because R1
> also sends an unlabeled ACK, completing the exchange (weird).
>
> To fix this, I have configured the following on R1:
>
> access-list 1 deny   123.1.9.0 0.0.0.255
> access-list 1 permit any
> mpls ldp neighbor 123.1.9.9 labels accept 1
>
> On R3 I did something similar:
>
> access-list 1 deny   123.3.8.0 0.0.0.255
> access-list 1 permit any
> mpls ldp neighbor 123.3.8.8 labels accept 1
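
For anyone unsure why that ACL only filters the one binding, here is a rough Python model of how a standard ACL with a wildcard mask is evaluated: first match wins, with an implicit deny at the end. This is only an illustration, not how IOS implements it; `acl_action` is a hypothetical helper and 123.123.123.9/32 is a made-up example prefix.

```python
import ipaddress

# (action, base address, wildcard mask); "any" is 0.0.0.0 255.255.255.255.
ACL_1 = [
    ("deny", "123.1.9.0", "0.0.0.255"),
    ("permit", "0.0.0.0", "255.255.255.255"),
]


def acl_action(acl, prefix):
    """Return the action of the first entry that matches the prefix's network address."""
    addr = int(ipaddress.ip_network(prefix).network_address)
    for action, base, wildcard in acl:
        # Wildcard bits set to 1 are "don't care", so invert to get the bits to compare.
        care = ~int(ipaddress.ip_address(wildcard)) & 0xFFFFFFFF
        if addr & care == int(ipaddress.ip_address(base)) & care:
            return action
    return "deny"  # implicit deny when nothing matches


print(acl_action(ACL_1, "123.1.9.0/24"))      # binding for the connected subnet
print(acl_action(ACL_1, "123.123.123.9/32"))  # hypothetical other prefix
```

So the label for the directly connected subnet is rejected while every other binding from the neighbor is still accepted.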
>
> It's been up for more than 3 minutes:
>
> R1#sho mpls ldp neighbor
>     Peer LDP Ident: 123.1.9.9:0; Local LDP Ident 123.123.123.1:0
>         TCP connection: 123.1.9.9.11032 - 123.1.9.1.646
>         State: Oper; Msgs sent/rcvd: 13/13; Downstream
>         Up time: 00:04:55
>         LDP discovery sources:
>           Serial1/0, Src IP addr: 123.1.9.9
>         Addresses bound to peer LDP Ident:
>           123.1.9.9
>
> R3#sho mpls ldp neighbor
>     Peer LDP Ident: 123.3.8.8:0; Local LDP Ident 123.123.123.3:0
>         TCP connection: 123.3.8.8.11036 - 123.3.8.3.646
>         State: Oper; Msgs sent/rcvd: 15/15; Downstream
>         Up time: 00:06:13
>         LDP discovery sources:
>           Serial1/0, Src IP addr: 123.3.8.8
>         Addresses bound to peer LDP Ident:
>           123.3.8.8
>
> Can't believe that is the way it should be designed (I still think it's a
> bug), but for now it is stable :-)
>
> thanks all,
>
>
> On Fri, Sep 4, 2009 at 8:47 AM, Rick Mur <[email protected]> wrote:
>
>> True, the CSC code can be buggy sometimes, and this 3-minute marker is
>> definitely a bug. I recall I also had some issues where it worked only for
>> some period of time, so maybe I ran into that same issue. Still, it's only
>> buggy sometimes and I don't think that will happen in your lab.
>>
>>     --
>> Regards,
>>
>> Rick Mur
>> CCIE2 #21946 (R&S / Service Provider)
>> Sr. Support Engineer – IPexpert, Inc.
>> URL: http://www.IPexpert.com
>>
>> On 4 sep 2009, at 14:11, Bryan Bartik wrote:
>>
>> Thanks Guys, I am using 7200s in dynamips with 12.2S code. In fact, this
>> is a mock set up of vol 2 lab 4, I just took everything else out of the
>> equation. I started from scratch just doing the MPLS VPN scenarios. Even R2
>> is not doing anything (ospf or ldp) right now. I am also just using physical
>> interfaces on my frame relay cloud :) I don't see the client tagging the
>> packets but I could look deeper into this. It's just amazing that it is every 3
>> minutes pretty much on the dot! I wonder if it's some type of CSC bug having
>> to do with MPLS on the VRF interface...I will do some more testing and let
>> you know.
>>
>> On Fri, Sep 4, 2009 at 3:22 AM, Antonie Henning - MWEB <[email protected]> wrote:
>>
>>> Had a similar issue. I narrowed it down to point to point frame-relay
>>> subinterface with csc igp (ospf) and ldp enabled.
>>>
>>> What I saw was a loop. The client would tag packets to the carrier for
>>> the directly connected subnet. I was expecting to see it pop the label:
>>>
>>> R2(config-subif)#do sh ip cef 123.2.6.6
>>> 123.2.6.0/24
>>>  attached to Serial4/0.206 label 614
>>>
>>> The 614 label then sends the packet back to the client and a loop forms:
>>>
>>> R2(config)#do trace 123.2.6.6
>>>
>>> Type escape sequence to abort.
>>> Tracing the route to 123.2.6.6
>>>
>>>  1 123.2.6.6 [MPLS: Label 614 Exp 0] 8 msec 32 msec 12 msec
>>>  2 123.2.6.2 20 msec 40 msec 40 msec
>>>  3 123.2.6.6 [MPLS: Label 614 Exp 0] 44 msec 24 msec 44 msec
>>>  4 123.2.6.2 20 msec 48 msec 64 msec
>>>  5 123.2.6.6 [MPLS: Label 614 Exp 0] 40 msec 24 msec 40 msec
>>>  6 123.2.6.2 48 msec 72 msec 60 msec
>>>  7 123.2.6.6 [MPLS: Label 614 Exp 0] 60 msec 36 msec 64 msec
>>>  8 123.2.6.2 56 msec 60 msec 88 msec
>>>  9 123.2.6.6 [MPLS: Label 614 Exp 0] 60 msec 108 msec 60 msec
>>>
>>> Changing the frame-relay config to use the main interface solved the
>>> problem.
>>>
>>> Hth
>>> 21500.net
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of
>>> Francisco Baena
>>> Sent: 04 September 2009 08:11 AM
>>> To: 'Bryan Bartik'; [email protected]; [email protected]
>>> Subject: RE: Carrier Supporting Carrier (CSC) - LDP peering between SPs
>>> drops every 3 minutes
>>>
>>> Make it two. I had the same problem with vol II - Lab 4 (I think), but from
>>> R2 to R6.
>>>
>>> At that point I blamed it on dynamips being buggy, as all the routing/MPLS
>>> tables seemed fine.
>>>
>>> I look forward to a resolution on this. The interesting thing is that when I
>>> shut down the connection from R1 to R2, the problem went away, so it sounds
>>> like an IGP issue. However, even making a sham link between R6 and R9 and
>>> increasing the OSPF cost from R1 to R2 (to ensure AS200 was the exit point)
>>> made no difference.
>>>
>>> I would say that a possible workaround could be to make the R1-R9 LDP
>>> sessions targeted, but obviously what we all want to know is what the heck
>>> happened there in the first place.
>>>
>>> During my testing I disabled MPLS TE too in AS200, just in case, but no
>>> cigar....
>>>
>>> Cheers,
>>> Francisco
>>> http://www.linkedin.com/in/fbaena
>>>
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of
>>> Bryan Bartik
>>> Sent: 04 September 2009 05:14
>>> To: [email protected]; [email protected]
>>> Subject: Carrier Supporting Carrier (CSC) - LDP peering between SPs drops
>>> every 3 minutes
>>>
>>> Hello,
>>>
>>> I am running an inter-as + CSC scenario with OSPF and MPLS between the SPs.
>>> The LDP peering is dropping exactly every 3 minutes and coming back up.
>>> Connectivity is fine throughout the VPN when the session is up. I don't see
>>> any route flapping in the debugs.
>>>
>>> AS65123[R3]---------[R8]AS100---------AS200[R9]---------[R1]AS65123
>>>
>>> AS100 and AS200 have an inter-as VPN supporting the 2nd level carrier
>>> AS65123.
>>> R9 and R8 have vrf interfaces connected to R1 and R3 respectively.
>>> R1 has an LDP/OSPF peering with R9 (in the vrf) in AS200
>>> R3 has an LDP/OSPF peering with R8 (in the vrf) in AS100
>>>
>>> Both LDP sessions are bouncing on cue every 3 minutes! Here is R8 for
>>> example:
>>>
>>> R8#
>>> *Sep  3 21:50:49.575: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is DOWN
>>> *Sep  3 21:50:59.475: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is UP
>>> *Sep  3 21:53:59.491: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is DOWN
>>> *Sep  3 21:54:09.291: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is UP
>>> *Sep  3 21:57:09.311: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is DOWN
>>> *Sep  3 21:57:19.331: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is UP
>>> *Sep  3 22:00:19.351: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is DOWN
>>> *Sep  3 22:00:29.239: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.3:0 is UP
>>>
>>> Now just before the session dies I get the following from "debug mpls ldp
>>> transport events interface x/x". This is from R9:
>>>
>>> R9#
>>> *Sep  3 22:09:27.295: ldp: Send ldp hello; Serial1/2, src/dst 123.1.9.9/224.0.0.2, inst_id 0
>>> *Sep  3 22:09:28.787: ldp: Rcvd ldp hello; Serial1/2, from 123.1.9.1 (123.123.123.1:0), intf_id 0, opt 0xC
>>> *Sep  3 22:09:32.127: ldp: Send ldp hello; Serial1/2, src/dst 123.1.9.9/224.0.0.2, inst_id 0
>>> *Sep  3 22:09:33.239: ldp: Rcvd ldp hello; Serial1/2, from 123.1.9.1 (123.123.123.1:0), intf_id 0, opt 0xC
>>> *Sep  3 22:09:36.111: ldp: Send ldp hello; Serial1/2, src/dst 123.1.9.9/224.0.0.2, inst_id 0
>>> *Sep  3 22:09:36.907: tagcon: Session KeepAlive timer expired, peer 123.123.123.1:0 (pp 0x64027BD8)
>>> *Sep  3 22:09:36.911: ldp: Close LDP transport conn for adj 0x63486648
>>> *Sep  3 22:09:36.915: ldp: Closing ldp conn 123.1.9.9:646 <-> 123.123.123.1:11015, adj 0x63486648
>>> *Sep  3 22:09:36.919: ldp: Adj 0x63486648; state set to closed
>>> *Sep  3 22:09:36.923: %LDP-5-NBRCHG: LDP Neighbor 123.123.123.1:0 is DOWN
>>>
>>> Even though I am sending and receiving hellos, I am getting the session
>>> keepalive timer expired. Any ideas?
>>>
>>> Thanks!
>>> --
>>> Bryan Bartik
>>> CCIE #23707 (R&S), CCNP
>>> Sr. Support Engineer - IPexpert, Inc.
>>> URL: http://www.IPexpert.com
>>>
>>> _____________________________________________________________________
>>> Subscription information: http://www.groupstudy.com/list/comserv.html
>>>
>>>
>>
>>
>>
>> --
>> Bryan Bartik
>> CCIE #23707 (R&S), CCNP
>> Sr. Support Engineer - IPexpert, Inc.
>> URL: http://www.IPexpert.com
>>  _______________________________________________
>> For more information regarding industry leading CCIE Lab training, please
>> visit www.ipexpert.com
>>
>>
>>
>
>
> --
> Bryan Bartik
> CCIE #23707 (R&S), CCNP
> Sr. Support Engineer - IPexpert, Inc.
> URL: http://www.IPexpert.com
>
