RE: Increase bandwidth usage in partial-mesh network?

2021-10-14 Thread Brian Turnbow via NANOG

Has anyone come across any product or technology that can handle the 
multi-path-ness and the private-network-ness like a regular router, but also 
provides the intelligent per-flow path steering based on e.g. latency, like an 
SD-WAN device (and/or some firewalls)?

Maybe add a little bit of linear optimization on top of 
faucet/openvswitch/openflow to calculate best paths based upon bandwidth, 
paths, and fill-factors.  There is a presentation where Google uses that 
technique to obtain high utilization on their links (not necessarily those 
tools though).

Raymond Burkholder
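
As a rough illustration of the "calculate best paths based upon bandwidth
and fill-factors" idea, here is a minimal Python sketch. It is not the
Google technique and not anything faucet or Open vSwitch provides. It
assumes the candidate paths are disjoint and that each has a known capacity
and current load, and it splits a new demand so that the highest link
utilization is as low as possible (the closed-form "water-filling" answer
to that small linear program). The function name, path names, and numbers
are all hypothetical.

```python
def split_demand(paths, demand):
    """Split `demand` across disjoint paths to minimize the peak utilization.

    paths:  list of (name, capacity, current_load), all in the same unit
            (for example Mbit/s). Hypothetical input format.
    Returns a dict mapping each path name to the extra traffic to place on it.
    """
    if demand < 0:
        raise ValueError("demand must be non-negative")
    total_capacity = sum(cap for _, cap, _ in paths)
    total_load = sum(load for _, _, load in paths)
    if total_load + demand > total_capacity:
        raise ValueError("demand exceeds the remaining capacity")

    # Fill the least-utilized paths first, raising a common utilization
    # "water level" until the whole demand is placed.
    ordered = sorted(paths, key=lambda p: p[2] / p[1])
    cap_sum = 0.0
    load_sum = 0.0
    level = 0.0
    for i, (_, cap, load) in enumerate(ordered):
        cap_sum += cap
        load_sum += load
        level = (load_sum + demand) / cap_sum
        is_last = i + 1 == len(ordered)
        # Stop once the next path is already at or above the level.
        if is_last or ordered[i + 1][2] / ordered[i + 1][1] >= level:
            break

    # Paths already above the level receive nothing.
    return {name: max(0.0, level * cap - load) for name, cap, load in paths}


if __name__ == "__main__":
    example = [
        ("radio-a", 100.0, 20.0),   # capacity, current load (made-up values)
        ("radio-b", 50.0, 40.0),
        ("fiber", 1000.0, 100.0),
    ]
    for name, extra in split_demand(example, 300.0).items():
        print(f"{name}: +{extra:.1f}")
```

A real network with shared links between paths would need a proper LP or
multi-commodity-flow solver; this only covers the simple parallel-path case.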

This is what a large Italian WISP has done; here are a couple of presentations 
they gave at our ITNOG sessions.
I'm not sure whether they have open-sourced anything, though.
https://www.itnog.it/itnog4/files/14-Traffic%20Engineering%20-%20the%20EOLO%20way%20of%20life.pdf
https://www.itnog.it/itnog3/files/ITNOG3-EOLO.pdf
Brian


Re: Increase bandwidth usage in partial-mesh network?

2021-10-14 Thread Arie Vayner
Maybe something like this (if you can break it into different bgp ASNs by
network area):

"draft-mohanty-bess-ebgp-dmz-03"
https://datatracker.ietf.org/doc/html/draft-mohanty-bess-ebgp-dmz-03

On Wed, Oct 13, 2021, 10:30 Adam Thompson wrote:

> Looking for recommendations or suggestions...
>
> I've got a downstream customer asking for help;  they have a private
> internal network that I've taken to calling the "partial-mesh network from
> hell": it's got two partially-overlapping radio networks, mixed with
> islands of isolated fiber connectivity.
> Dynamic routing protocols (IS-IS, OSPF, EIGRP, etc.) generally will only
> select the _best_ path; they won't spread the load unless all paths are
> equal. The paths in this network are very unequal, so ECMP would likely
> fail horribly.
> The network is becoming bandwidth-limited, so they're wanting to make use
> of all available paths, not just the single "best" path.  It's also remote
> and spread out, so adding new links or upgrading existing links is
> difficult and expensive.
> Oh, and their routers are overdue for a refresh, so acquiring replacement
> h/w is now possible.
>
> Has anyone come across any product or technology that can handle the
> multi-path-ness and the private-network-ness like a regular router, but
> also provides the intelligent per-flow path steering based on e.g. latency,
> like an SD-WAN device (and/or some firewalls)?
>
> Here's hoping,
> -Adam
>
> *Adam Thompson*
> Consultant, Infrastructure Services
> 100 - 135 Innovation Drive
> Winnipeg, MB, R3T 6A8
> (204) 977-6824 or 1-800-430-6404 (MB only)
> athomp...@merlin.mb.ca
> www.merlin.mb.ca
>


Anyone from Level3/CenturyLink/Lumen, possibly Comcast around?

2021-10-14 Thread Brie

Hi all,

So, having a...  frustrating issue going on.  Long wall of text ahead as 
I explain.


1 x CenturyLink/Lumen fiber in Boise
1 x CenturyLink/Lumen fiber in Cheyenne
1 x Comcast biz fiber in Denver

IPsec VPN tunnels between all three sites, w/ OSPF for routing failover 
(which unfortunately doesn't help in this situation).


Two days ago, Cheyenne to Denver (.196) traffic (both TCP and UDP) was 
the first to have issues.  We failed over to routing the Cheyenne VPN 
through Boise while we opened a ticket with CL.


Yesterday, Boise to Denver (.196) traffic started having the exact same issue.

Tests from another CL fiber in Boise (my own circuit, with legacy IP 
space and BGP) to Denver (.196) did not show same issues.  Path appeared 
clean.


Traceroutes from Office Boise to Denver (.196) had a noticeable 
difference from Personal Boise to Denver (.196):


Office Boise -> Denver (.196)
--
3: sea-edge-15.inet.qwest.net
4: lag-4.ear3.Seattle1.Level3.net
5: ae-2-52.ear2.seattle1.level3.net   <--  This hop
6: be-203-pe01.seattle.wa.ibone.comcast.net


Personal Boise -> Denver (.196)
--
4: sea-edge-15.inet.qwest.net
5: lag-25.ear2.Seattle1.Level3.net
6: be-203-pe01.seattle.wa.ibone.comcast.net

On a whim, I ran a traceroute from Office Boise to another Denver router 
IP address (.199, an IP alias on the same interface), and it matched the 
traceroute from Personal Boise to Denver (.196).


No packet loss.


Swapped the VPN tunnels over to another IP on the same router (.199), same 
physical and logical interface, in Denver, and the VPN was working normally 
again.


This morning though, Cheyenne to Denver (.199) is having problems, while 
Boise to Denver (.199) isn't (for now).


Already spent most of last night working with my partner in Denver 
replacing nearly everything on the Denver side with no change.


Tests from the router above the main Denver VPN endpoint (.196) do not 
show any kind of issues or packet loss to it, so it doesn't appear the 
problem is inside of our network.


I'm not inclined to think this is a Comcast issue, but I can't be sure.

This sounds almost like a load balancing hashing issue, with one link in 
the bond group having issues, somewhere in one of our upstream's networks.
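
To illustrate why a hashing problem would fit these symptoms, here is a 
minimal Python sketch of per-flow load balancing across a bundle. It is an 
assumption-laden toy: real routers use vendor-specific hash functions and 
inputs, and CRC32 over a text key is only a stand-in. The addresses are 
documentation ranges, not the real endpoints. The point it shows is that an 
ESP flow (IP protocol 50, no ports) between two fixed addresses always lands 
on the same member link, while changing only the destination address (.196 
to .199) can move it to a different one.

```python
import zlib


def pick_member(src_ip, dst_ip, protocol, src_port=0, dst_port=0, members=4):
    """Pick a bundle member for a flow (hypothetical hash, for illustration).

    The same inputs always produce the same member, which is what keeps a
    flow's packets in order, and also what pins a flow to a bad link.
    """
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % members


if __name__ == "__main__":
    ESP = 50  # IPsec ESP has no port numbers, so only addresses and protocol vary
    office = "192.0.2.10"  # stand-in source address (documentation range)
    for dst in ("198.51.100.196", "198.51.100.199"):
        member = pick_member(office, dst, ESP)
        print(f"{office} -> {dst} (ESP): member link {member}")
```

If one member link is dropping or corrupting packets, only the flows that 
hash onto it are affected, which would explain why one source/destination 
pair breaks while another pair over the same sites stays clean.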


We'll be opening a ticket in a bit through normal channels with 
CenturyLink/Lumen, but we're worried they're just going to close the 
ticket as not being their issue.


Anyone know of an engineer at CenturyLink/Lumen/Level3 or even Comcast 
that might want to take a stab at this?  I can provide a lot more detail.


--
Brielle Bruns
The Summit Open Source Development Group
http://www.sosdg.org/ http://www.ahbl.org


RE: Anyone from Level3/CenturyLink/Lumen, possibly Comcast around?

2021-10-14 Thread Mike Lewinski via NANOG
I can confirm this same IPsec issue exists at several sites in the Denver 
area, all routing between Level3/Lumen and Comcast.

I was told by one customer that it resolved late yesterday afternoon but I 
haven't been able to confirm that.


Mike
