Hi Jon,

Are you dual stack? Wouldn't v6 solve some of these issues?

On Tue, Oct 8, 2024 at 12:20 PM Jon Lewis <jle...@lewis.org> wrote:
> We started rolling out CGNAT about 6 months ago. It was smooth sailing
> for the first few months, but we eventually did run into a number of
> issues.
>
> Our customer base is primarily FTTH with "dynamic" IP assignment via DHCP.
> Since connections are always-on, customer ONTs/routers get an IP assigned,
> and then when the lease is renewed, they request a new lease for the
> existing IP, and, in general, that request is granted. This gives
> customers the mistaken impression they have a static IP. So, my
> impression, from working with some customers who've needed to be moved
> from CGNAT back to public IP is that customers who are doing
> port-forwarding don't even bother with dynamic DNS. They just know they
> can connect to their IP as they've never seen it change. We do offer/sell
> static IP, but pre-CGNAT, it was strictly for business customers. i.e.
> A residential customer could only get static IP service by converting
> their account to a business account. That may change in the near future.
>
> One issue we didn't foresee has been IP Geo issues. i.e. We all knew
> that streaming services like Netflix use IP Geo to determine what content
> should be made available, but that's, AFAIK, limited by country or region.
> What we didn't anticipate is services like Hulu Live TV doing IP Geo down
> to the city level to determine which local channels are a subscriber's
> local channels. We're using Juniper MX gear and SPC3 cards for our CGNAT
> routers, each one having a single large external pool. Since we serve
> most of FL, one external pool can't IP Geo correctly for customers as far
> apart as Miami and Jacksonville hitting the same CGNAT router. We don't
> currently have an acceptable solution to this other than moving impacted
> customers off CGNAT.
>
> One of the great unknowns (at least for us) with CGNAT was what our PBA
> settings should be. i.e.
> How large each port-block should be, and how
> many port-blocks to allow per customer. We started with 256x4. It seemed
> to work. We eventually noticed that we were logging port-block exceeded
> errors. This is one aspect where Juniper's CGNAT support is lacking.
> There's a counter for these errors, and it's available via SNMP, but
> there's no way to attribute the errors to subscriber IPs. We're polling
> the MIB and graphing it, so we know it's a continuing issue and can see
> when it's incrementing faster/slower, but Junos provides no means for
> determining if "PBEs" are all being caused by a single customer, a handful
> of customers, etc. We have a JTAC case open on this. As a quick &
> hopeful fix, we increased both the port-block size and the block limit. That
> helped, but didn't stop the errors. It also cut our CGNAT ratio by more
> than half (64:1 -> 28:1). If we stay at this ratio, we'll need much larger
> external pools than originally anticipated. Tuning these settings is kind
> of painful as JTAC strongly recommends bouncing the CGNAT service anytime
> CGNAT related config changes are made. This means briefly breaking
> Internet access for all CGNAT'd customers. For the PBEs, JTAC's
> suggestions so far have been to shorten some of the timeouts in the config
> and to keep doing what we're doing, which is a cron job that essentially
> does a "show services nat source port-block", parses the output looking
> for subscriber IPs that have used up the ports in several of their
> port-blocks, then does a "show services sessions source-prefix ..." and
> logs all of this. This at least gives us snapshots of "who's a heavy user
> right now" and lets us look at how they were using all their ports. i.e.
> was it BitTorrent, are they compromised and scanning the internet for more
> systems to compromise, is it legit looking traffic - just lots of it,
> etc.?
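
For what it's worth, the PBA trade-off described above can be put into rough numbers. The sketch below is only back-of-the-envelope arithmetic, not anything Junos computes: it assumes all 65536 ports of each external IP are available to the pool (a real pool may reserve the low ports, which lowers the result) and that every subscriber may consume their full block allowance. The function names and the 10,000-subscriber figure are hypothetical. Under those assumptions 256x4 gives the 64:1 you started with, and a 28:1 ratio implies a budget of roughly 2340 ports per subscriber.

```python
PORTS_PER_IP = 65536  # assumption: the whole port range of an external IP is usable


def subscribers_per_ip(block_size: int, max_blocks: int,
                       ports_per_ip: int = PORTS_PER_IP) -> int:
    """Worst-case CGNAT ratio: subscribers per external IP when each
    subscriber is allowed max_blocks port-blocks of block_size ports."""
    if block_size <= 0 or max_blocks <= 0:
        raise ValueError("block_size and max_blocks must be positive")
    return ports_per_ip // (block_size * max_blocks)


def ports_per_subscriber(ratio: int, ports_per_ip: int = PORTS_PER_IP) -> int:
    """Port budget per subscriber implied by a given N:1 ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ports_per_ip // ratio


def external_ips_needed(subscribers: int, ratio: int) -> int:
    """External pool size (in IPs) for a subscriber count, rounded up."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return -(-subscribers // ratio)  # ceiling division


if __name__ == "__main__":
    print(subscribers_per_ip(256, 4))       # ratio for the original 256x4 setting
    print(ports_per_subscriber(28))         # budget implied by 28:1
    # Hypothetical 10,000 subscribers behind one CGNAT router:
    print(external_ips_needed(10_000, 64))
    print(external_ips_needed(10_000, 28))
```

For that hypothetical 10,000-subscriber router, the pool grows from 157 to 358 external IPs when the ratio drops from 64:1 to 28:1, which is the "much larger external pools" effect in concrete terms.
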
>
> The latest CGNAT issue is a customer with a Palo Alto Networks firewall
> connected to our network and several of their employees are our FTTH
> customers. On their PANW firewall, they're doing IP Geo based filtering,
> limiting access to internal servers to "US IPs". Since we only CGNAT
> traffic to the external Internet, their on-net employees hit the firewall
> from their 100.64/10 IPs and get blocked. I suggested they whitelist
> 100.64/10, saying we block traffic from 100.64/10 from entering our
> network via peering and transit, so they can be assured anything from
> 100.64/10 came from inside our network / our customers. They say the
> firewall won't let them whitelist 100.64.0.0/10, giving an error that
> it's invalid IP space.
>
> I know we're not the first to implement CGNAT, so I'm curious if others
> have run into these sorts of issues, or others we haven't run into yet,
> and if so, how you solved them.
>
>
> ----------------------------------------------------------------------
>  Jon Lewis, MCP :)              |  I route
>  Blue Stream Fiber, Sr. Neteng  |  therefore you are
> _________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
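
On the 100.64/10 whitelist: the prefix itself is well-formed. It is the Shared Address Space from RFC 6598, covering 100.64.0.0 through 100.127.255.255. As a quick sanity check outside the firewall, here is a minimal sketch using Python's standard `ipaddress` module. The function name is hypothetical, and this only illustrates the membership test an allow rule would express. It says nothing about how PAN-OS validates input.

```python
import ipaddress

# RFC 6598 Shared Address Space, the range CGNAT subscribers are numbered from.
SHARED_ADDRESS_SPACE = ipaddress.ip_network("100.64.0.0/10")


def is_cgnat_source(addr: str) -> bool:
    """Return True if addr falls inside 100.64.0.0/10.

    Hypothetical helper: mirrors the check an allow rule for on-net
    CGNAT subscribers would perform on the source address.
    """
    return ipaddress.ip_address(addr) in SHARED_ADDRESS_SPACE


if __name__ == "__main__":
    for candidate in ("100.64.0.1", "100.127.255.255", "100.128.0.0", "8.8.8.8"):
        print(candidate, is_cgnat_source(candidate))
```

The first two addresses fall inside the range and the last two do not, so a rule keyed on this prefix would match only on-net CGNAT sources, provided the range really is filtered at peering and transit edges as described.
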