On 11/12/07, Claudio Jeker <[EMAIL PROTECTED]> wrote:
>
> On Tue, Nov 06, 2007 at 06:26:47PM +0100, Tony Sarendal wrote:
> > New version. Less duplication and a nice feature as bonus.
> > With softreconfig in enabled the looped prefixes are accepted
> > into the Adj-RIB-In.
> >
> > This means that I can tell if my neighbor AS is using
> > a path via myself. Either I'm tired or that is cool.
> >
> > router-02# bgpctl show rib 192.168.0.0
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination         gateway          lpref   med aspath origin
> > *>    192.168.0.0/16      192.168.100.5      100     0 65100 i
> > *     192.168.0.0/16      172.17.1.1         100     0 65200 65100 i
> > *     192.168.0.0/16      172.17.1.5         100     0 65200 65200 65200 65200 65100 i
> > router-02#
> >
> > I now kill the peering that 65200 has to 65100, removing their
> > direct path to 192.168.0.0/16.
> >
> > router-02# bgpctl show rib 192.168.0.0
> > flags: * = Valid, > = Selected, I = via IBGP, A = Announced
> > origin: i = IGP, e = EGP, ? = Incomplete
> >
> > flags destination         gateway          lpref   med aspath origin
> > *>    192.168.0.0/16      192.168.100.5      100     0 65100 i
> > router-02#
> >
> > Sweet, the looping issue is gone.
> > Here is the bonus:
> >
> > router-02# bgpctl show rib neigh 172.17.1.5 in | grep 65300
> > *     172.17.0.2/32       172.17.1.5         100     0 65200 65300 i
> > *     192.168.0.0/16      172.17.1.5         100     0 65200 65300 65100 i
> > *     192.168.100.4/30    172.17.1.5         100     0 65200 65300 i
> > router-02#
> >
> > I now see the paths that the peer uses my network to access.
> > Note that this depends a bit on remote implementation.
> > I think this works against a Cisco router.
> >
> > /Tony
> >
> >
> > Index: rde.c
> > ===================================================================
> > RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> > retrieving revision 1.228
> > diff -u -r1.228 rde.c
> > --- rde.c	16 Sep 2007 15:20:50 -0000	1.228
> > +++ rde.c	6 Nov 2007 17:08:50 -0000
> > @@ -919,12 +919,6 @@
> >  	/* shift to NLRI information */
> >  	p += 2 + attrpath_len;
> >  
> > -	/* aspath needs to be loop free nota bene this is not a hard error */
> > -	if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
> > -		error = 0;
> > -		goto done;
> > -	}
> > -
> >  	/* parse nlri prefix */
> >  	while (nlri_len > 0) {
> >  		if ((pos = rde_update_get_prefix(p, nlri_len, &prefix,
> > @@ -977,10 +971,18 @@
> >  		if (fasp == NULL)
> >  			fasp = asp;
> >  
> > -		rde_update_log("update", peer, &fasp->nexthop->exit_nexthop,
> > -		    &prefix, prefixlen);
> > -		path_update(peer, fasp, &prefix, prefixlen, F_LOCAL);
> > -
> > +		rde_update_log("update", peer,
> > +			&fasp->nexthop->exit_nexthop,&prefix,
> > +			prefixlen);
> > +		/* handle an update with loop as a withdraw */
> > +		if (peer->conf.ebgp && !aspath_loopfree(asp->aspath,
> > +				conf->as))
> > +			prefix_remove(peer, &prefix, prefixlen,
> > +				F_LOCAL);
> > +		else
> > +			path_update(peer, fasp, &prefix, prefixlen,
> > +				F_LOCAL);
> > +
> >  		/* free modified aspath */
> >  		if (fasp != asp)
> >  			path_put(fasp);
> > @@ -1075,9 +1077,15 @@
> >  
> >  				rde_update_log("update", peer,
> >  				    &asp->nexthop->exit_nexthop,
> > -				    &prefix, prefixlen);
> > -				path_update(peer, fasp, &prefix, prefixlen,
> > -				    F_LOCAL);
> > +					&prefix, prefixlen);
> > +				/* handle an update with loop as a withdraw */
> > +				if (peer->conf.ebgp &&
> > +					!aspath_loopfree(asp->aspath,conf->as))
> > +					prefix_remove(peer, &prefix,
> > +						prefixlen,F_LOCAL);
> > +				else
> > +					path_update(peer, fasp, &prefix,
> > +						prefixlen,F_LOCAL);
> >  
> >  				/* free modified aspath */
> >  				if (fasp != asp)
>
> I looked a bit closer at this problem and the RFC mentions that paths
> with loops need to be inserted into the RIB and will be ignored in phase 2
> of the decision process.
>
> So this diff does just about that. It does not remove any prefix if there
> is a loop but instead ignores them during the route decision process.
> This seems to work for me but I'm currently unable to do larger tests.
>
> --
> :wq Claudio
>
> Index: rde.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> retrieving revision 1.228
> diff -u -p -r1.228 rde.c
> --- rde.c	16 Sep 2007 15:20:50 -0000	1.228
> +++ rde.c	6 Nov 2007 18:27:42 -0000
> @@ -920,10 +920,8 @@ rde_update_dispatch(struct imsg *imsg)
>  	p += 2 + attrpath_len;
>  
>  	/* aspath needs to be loop free nota bene this is not a hard error */
> -	if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as)) {
> -		error = 0;
> -		goto done;
> -	}
> +	if (peer->conf.ebgp && !aspath_loopfree(asp->aspath, conf->as))
> +		asp->flags |= F_ATTR_ASLOOP;
>  
>  	/* parse nlri prefix */
>  	while (nlri_len > 0) {
> Index: rde.h
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde.h,v
> retrieving revision 1.100
> diff -u -p -r1.100 rde.h
> --- rde.h	1 Jun 2007 04:17:30 -0000	1.100
> +++ rde.h	6 Nov 2007 19:17:56 -0000
> @@ -154,6 +154,7 @@ LIST_HEAD(prefix_head, prefix);
>  #define	F_ATTR_MP_REACH		0x00040
>  #define	F_ATTR_MP_UNREACH	0x00080
>  #define	F_ATTR_AS4BYTE_NEW	0x00100	/* NEW_ASPATH or NEW_AGGREGATOR */
> +#define	F_ATTR_ASLOOP		0x00200
>  #define	F_PREFIX_ANNOUNCED	0x01000
>  #define	F_NEXTHOP_REJECT	0x02000
>  #define	F_NEXTHOP_BLACKHOLE	0x04000
> Index: rde_decide.c
> ===================================================================
> RCS file: /cvs/src/usr.sbin/bgpd/rde_decide.c,v
> retrieving revision 1.48
> diff -u -p -r1.48 rde_decide.c
> --- rde_decide.c	11 May 2007 11:27:59 -0000	1.48
> +++ rde_decide.c	12 Nov 2007 05:43:20 -0000
> @@ -120,6 +120,12 @@ prefix_cmp(struct prefix *p1, struct pre
>  		return (-1);
>  	if (!(p2->flags & F_LOCAL))
>  		return (1);
> +
> +	/* only loop free pathes are eligible */
> +	if (p1->flags & F_ATTR_ASLOOP)
> +		return (-1);
> +	if (p2->flags & F_ATTR_ASLOOP)
> +		return (1);
>  
>  	asp1 = p1->aspath;
>  	asp2 = p2->aspath;
> @@ -239,8 +245,8 @@ prefix_evaluate(struct prefix *p, struct
>  
>  	xp = LIST_FIRST(&pte->prefix_h);
>  	if (xp == NULL || !(xp->flags & F_LOCAL) ||
> -	    (xp->aspath->nexthop != NULL && xp->aspath->nexthop->state !=
> -	    NEXTHOP_REACH))
> +	    (xp->flags & F_ATTR_ASLOOP) || (xp->aspath->nexthop != NULL &&
> +	    xp->aspath->nexthop->state != NEXTHOP_REACH))
>  		/* xp is ineligible */
>  		xp = NULL;
>  

as4 advertises 172.19.0.0/16 to as2. as1, as2 and as3 are configured
in a triangle, with a primary/standby peering between as2 and as3.
See below how router as2 is out of sync with as3.

as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:51:59 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
*>    172.19.0.0/16       172.17.1.10        100     0 4 i
*     172.19.0.0/16       172.17.1.6         100     0 3 3 3 2 4 i
as2#

as3# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:52:13 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
*>    172.19.0.0/16       172.17.1.1         100     0 2 4 i
*     172.19.0.0/16       192.168.1.5        100     0 1 2 4 i
*     172.19.0.0/16       172.17.1.5         100     0 2 2 2 4 i
as3#

I shut down the as2-as4 peering:

as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:53:07 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
*>    172.19.0.0/16       172.17.1.6         100     0 3 3 3 2 1 3 2 2 2 1 3 2 1 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 1 2 3 1 2 3 3 3 1 2 3 1 2 3 3 3 1 2 3 2 2 2 3 1 2 3 3 3 2 3 3 3 1 2 3 1 2 3 3 3 1 2 3 2 2 2 1 3 2 1 3 2 1 3 2 1 3 2 2 2 1 3 2 3 3 3 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 1 3 2 2 2 3 2 2 2 3 2 2 2 3 2 2 2 1 3 2 1 3 2 2 2 1 3 2 1 3 2 1 3 2 1 3 2 3 3 3 1 2 3 1 2 4 i

as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:53:09 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
*>    172.19.0.0/16       192.168.1.1        100     0 1 3 2 2 2 1 3 2 2 2 1 3 2 3 3 3 2 3 3 3 1 2 3 1 2 3 1 2 3 2 2 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 2 2 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 3 3 3 2 1 3 2 2 2 1 3 2 1 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 2 2 2 3 1 2 3 1 2 3 1 2 3 3 3 1 2 3 1 2 3 3 3 1 2 3 2 2 2 3 1 2 3 3 3 2 3 3 3 1 2 3 1 2 3 3 3 1 2 3 2 2 2 1 3 2 1 3 2 1 3 2 1 3 2 2 2 1 3 2 3 3 3 2 1 3 2 1 3 2 3 3 3 2 3 3 3 2 1 3 2 2 2 3 2 2 2 3 2 2 2 3 2 2 2 1 3 2 1 3 2 2 2 1 3 2 1 3 2 1 3 2 1 3 2 3 3 3 1 2 3 1 2 4 i

as2# date ; bgpctl show rib 172.19.0.0/16
Mon Nov 12 08:53:09 GMT 2007
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
as2#

bgpd now crashes:

Nov 12 08:53:13 as3 bgpd[24367]: fatal in RDE: aspath_count: would overflow
Nov 12 08:53:13 as3 bgpd[27761]: Lost child: route decision engine exited
Nov 12 08:53:13 as3 bgpd[14219]: fatal in SE: session_dispatch_imsg: pipe closed: Connection refused
Nov 12 08:53:13 as3 bgpd[27761]: can't remove connected route from interface with index 0: not found

The crash does not happen every time. Sometimes the network handles
this OK after the initial bursts of updates and loops.
The last test I did, flapping as2-as4, crashed every bgpd in the
network, except for the stranded as4 of course.

I will look closer at this later, hopefully later today.
I do have a more detailed report with timestamps and matching tcpdumps
if you want it; otherwise I'll dig more on my side and get back to you.

/Tony
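
For reference while debugging, here is a minimal stand-alone sketch of the behaviour the quoted diff is aiming for: a path containing the local AS is kept but flagged, and flagged paths are never eligible for selection. This is not bgpd code. The names struct path, F_ASLOOP, path_loopfree, path_mark, path_cmp and path_select are hypothetical, the AS path is assumed to be a flat array of AS numbers, and "shorter path wins" stands in for the full decision process.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value for this sketch, not bgpd's F_ATTR_ASLOOP. */
#define F_ASLOOP	0x01

/* Simplified stand-in for a RIB entry: flat AS path plus flags. */
struct path {
	const uint32_t	*as;	/* AS numbers, nearest neighbor first */
	size_t		 len;	/* number of entries in as[] */
	unsigned int	 flags;
};

/* Return 1 if own_as does not appear anywhere in the path, else 0. */
int
path_loopfree(const uint32_t *as, size_t len, uint32_t own_as)
{
	for (size_t i = 0; i < len; i++)
		if (as[i] == own_as)
			return (0);
	return (1);
}

/* Keep a looped path, but remember that it contains our own AS. */
void
path_mark(struct path *p, uint32_t own_as)
{
	if (!path_loopfree(p->as, p->len, own_as))
		p->flags |= F_ASLOOP;
}

/*
 * Return > 0 if a is preferred over b, < 0 if b is preferred, 0 on a tie.
 * Looped paths always lose; otherwise the shorter AS path wins.
 */
int
path_cmp(const struct path *a, const struct path *b)
{
	if (a->flags & F_ASLOOP)
		return (-1);
	if (b->flags & F_ASLOOP)
		return (1);
	if (a->len != b->len)
		return (a->len < b->len ? 1 : -1);
	return (0);
}

/* Return the best eligible path, or NULL if every path is looped. */
const struct path *
path_select(const struct path *paths, size_t n)
{
	const struct path *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (paths[i].flags & F_ASLOOP)
			continue;	/* ineligible, but still stored */
		if (best == NULL || path_cmp(&paths[i], best) > 0)
			best = &paths[i];
	}
	return (best);
}
```

With as2's two paths from the first output above ("4" and "3 3 3 2 4", local AS 2), the second path would be flagged and only the first could be selected. If only the looped path remained, path_select would return NULL, which corresponds to the empty RIB output rather than to a selected path with a growing aspath.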