To expand on what others have said here, I find it helpful to think of BGP as a 
policy enforcement protocol, rather than as a distance vector routing protocol. 
 

To that end, there’s a generally expected hierarchy of routes, and then a lot 
of individuality between networks.  Having done traffic engineering for some 
global CDNs, I’ve found there’s a bunch of inbound traffic control that you can 
do by letting an understanding of how most other providers think about this 
guide your transit and peering policies.  The remaining portion generally needs 
to be solved through discussions, negotiations, or commercial arrangements with 
the sending party or their upstreams.

For the general rules, local-preference trumps everything else.  The number of 
AS path hops comes after local-preference.  Other things being equal, networks 
usually like to hand off traffic to a short AS path, and at the closest point 
to its origination (there are valid performance reasons for this), but 
local-preference policies will override both of those.
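
To make that ordering concrete, here is a minimal sketch of just the first two 
steps of route selection: local-preference first, AS path length only as a 
tie-breaker.  The Route class, the neighbor names, and the local-preference 
values are made up for illustration, and the AS numbers are from the range 
reserved for documentation.  Real implementations have many more tie-breakers 
after these two.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    neighbor: str      # hypothetical label for where the route was learned
    local_pref: int    # set by the receiving network's own policy
    as_path: tuple     # AS numbers, prepends included

def best_route(routes):
    # Higher local-preference wins outright; a shorter AS path only
    # matters between routes whose local-preference is equal.
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))

# Short path learned from a transit provider (low local-preference).
short = Route("transit-nearby", 100, (64500, 64496))
# Long, five-times-prepended path learned from a customer (high local-preference).
prepended = Route("customer-far", 300,
                  (64501, 64502, 64496, 64496, 64496, 64496, 64496, 64496))

print(best_route([short, prepended]).neighbor)  # customer-far
```

The prepends never get looked at here, because the comparison is already 
decided by local-preference.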

Local-preferences usually have three default tiers — customer, peering, and 
transit.  In other words, get paid, hand off for free, and pay.  There are 
often some additional peers that can be selected for traffic engineering 
reasons, either internally or by customers using BGP communities.  But those 
BGP communities don’t transit to other ASes, so even if you manage to signal 
one hop upstream, you may still find your upstream provider announcing your 
routes to those who have different ideas.
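
A rough sketch of how a provider might map those tiers to local-preference, 
with a customer-settable community that lowers it.  The numeric values and the 
community string "64500:80" are hypothetical; every network picks and documents 
its own.

```python
# Hypothetical default tiers: get paid, hand off for free, pay.
DEFAULT_LOCAL_PREF = {"customer": 300, "peer": 200, "transit": 100}

# Hypothetical traffic-engineering community a customer may attach
# to ask this one network to lower the preference of its route.
COMMUNITY_OVERRIDES = {"64500:80": 80}

def assign_local_pref(relationship, communities=()):
    pref = DEFAULT_LOCAL_PREF[relationship]
    for community in communities:
        # Unknown communities are ignored; known ones can only lower the value.
        if community in COMMUNITY_OVERRIDES:
            pref = min(pref, COMMUNITY_OVERRIDES[community])
    return pref

# The network you signalled honors the community...
print(assign_local_pref("customer", ["64500:80"]))
# ...but a network one hop further up never sees it and applies its default.
print(assign_local_pref("customer"))
```

The second call is the problem case: the next AS up still treats the route as 
a customer route, at full customer preference.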

One example of this from the early days of anycasted DNS root servers involved 
k.root-servers.net installing a node in Delhi, 
which pulled 60% of its traffic from North America.  This was clearly 
non-optimal.  They had attempted to get routing diversity by getting transit 
from different providers in different parts of the world, but their Delhi node 
was, if I recall correctly, a customer of a customer of a customer of Level3.  
Oops.

So, what do you do about this?

If you’re a global network operator, you probably attempt to maintain 
consistent peering/transit relationships across sites.  That way, AS paths and 
local-preferences should be fairly even, and you can let nearest exit routing 
do its thing.

If you have a smaller network, but have multiple interconnection locations that 
are far enough apart to make a performance difference, make the same transit 
and peering relationships at each one.  Make exceptions only for peers (not 
transit providers) whose customers or services only exist in one of the areas, 
and make sure they don’t announce your routes to their upstreams.  That way you 
won’t trombone traffic.
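
The "don’t announce your routes to their upstreams" part is the conventional 
export rule between networks, sketched below.  The function name and the 
relationship labels are mine, and individual networks can and do deviate from 
this, which is why it is worth checking.

```python
def should_export(learned_from, send_to):
    """Conventional export rule, from the announcing network's point of view.

    learned_from / send_to are one of "customer", "peer", "transit".
    """
    if learned_from == "customer":
        # Customer routes are announced to everyone: that is what they pay for.
        return True
    # Routes learned from peers or transit go only to customers.
    return send_to == "customer"

# A regional peer learned your route over a peering session.
print(should_export("peer", "customer"))  # True: its customers can reach you
print(should_export("peer", "transit"))   # False: it must not leak you upstream
```

If the peer instead has you configured as a customer, the first branch applies 
and your regional-only route gets announced to the rest of the world.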

If you’ve done all that, and traffic is still coming in the wrong place, then 
you start talking to people.  “Hey, I’m buying transit from you in both Asia 
and the Western US, and all my traffic from asian-country-x is coming into San 
Jose.  Why?”  “Well, they only have a 100 Mb/s interconnection to us in Asia.  
We have to traffic engineer around it.”  And then you have to figure out how to 
convince some national telco to want to talk to you more than they want to talk 
to your transit provider.

I think in your case, I would be asking why you have a 5,000 mile, five-prepend 
loop to get to a provider ten miles away.  It suggests that your network is 
doing things 5,000 miles away that are inconsistent with what you're doing 
locally, or that you have upstreams who aren’t interconnecting locally or 
aren’t maintaining sufficient capacity or sufficient political relationships on 
those paths.  All of those would predictably have this result.  The solution is 
likely to take a look at your transit relationships, ask your transit providers 
about their transit relationships, and either supplement or switch to a set of 
transit providers who can provide the routing you want.

-Steve



> On Jan 22, 2024, at 4:49 AM, William Herrin <b...@herrin.us> wrote:
> 
> Howdy,
> 
> Does anyone have suggestions for dealing with networks who ignore my
> BGP route prepends?
> 
> I have a primary ingress with no prepends and then several distant
> backups with multiple prepends of my own AS number. My intention, of
> course, is that folks take the short path to me whenever it's
> reachable.
> 
> A few years ago, Comcast decided it would prefer the 5000 mile,
> five-prepend loop to the short 10 mile path. I was able to cure that
> with a community telling my ISP along that path to not advertise my
> route to Comcast. Today it's Centurylink. Same story; they'd rather
> send the packets 5000 miles to the other coast and back than 10 miles
> across town. I know they have the correct route because when I
> withdraw the distant ones entirely, they see and use it. But this time
> it's not just one path; they prefer any other path except the one I
> want them to use. And Centurylink is not a peer of those ISPs, so
> there doesn't appear to be any community I can use to tell them not to
> use the route.
> 
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm at a loss as to what
> else to do.
> 
> Advice would be most welcome.
> 
> Regards,
> Bill Herrin
> 
> -- 
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/


