Hi Jon,

Are you dual stack?  v6 might solve some of these issues.



On Tue, Oct 8, 2024 at 12:20 PM Jon Lewis <jle...@lewis.org> wrote:

> We started rolling out CGNAT about 6 months ago.  It was smooth sailing
> for the first few months, but we eventually did run into a number of
> issues.
>
> Our customer base is primarily FTTH with "dynamic" IP assignment via DHCP.
> Since connections are always-on, customer ONTs/routers get an IP assigned,
> and then when the lease is renewed, they request a new lease for the
> existing IP, and, in general, that request is granted.  This gives
> customers the mistaken impression they have a static IP.  So, my
> impression, from working with some customers who've needed to be moved
> from CGNAT back to public IP is that customers who are doing
> port-forwarding don't even bother with dynamic DNS.  They just know they
> can connect to their IP as they've never seen it change.  We do offer/sell
> static IP, but pre-CGNAT, it was strictly for business customers.  i.e.
> A residential customer could only get static IP service by converting
> their account to a business account. That may change in the near future.
>
> One issue we didn't foresee has been IP Geo issues.  i.e.  We all knew
> that streaming services like Netflix use IP Geo to determine what content
> should be made available, but that's, AFAIK, limited by country or region.
> What we didn't anticipate is services like Hulu Live TV doing IP Geo down
> to the city level to determine which local channels are a subscriber's
> local channels.  We're using Juniper MX gear and SPC3 cards for our CGNAT
> routers, each one having a single large external pool.  Since we serve
> most of FL, one external pool can't IP Geo correctly for customers as far
> apart as Miami and Jacksonville hitting the same CGNAT router.  We don't
> currently have an acceptable solution to this other than moving impacted
> customers off CGNAT.
>
> One of the great unknowns (at least for us) with CGNAT was what our PBA
> settings should be.  i.e.  How large each port-block should be, and how
> many port-blocks to allow per customer.  We started with 256x4.  It seemed
> to work.  We eventually noticed that we were logging port-block exceeded
> errors.  This is one aspect where Juniper's CGNAT support is lacking.
> There's a counter for these errors, and it's available via SNMP, but
> there's no way to attribute the errors to subscriber IPs.  We're polling
> the MIB and graphing it, so we know it's a continuing issue and can see
> when it's incrementing faster/slower, but Junos provides no means for
> determining if "PBEs" are all being caused by a single customer, a handful
> of customers, etc.  We have a JTAC case open on this.  As a quick &
> hopeful fix, we increased both the port-block size and the block limit.
> That helped, but didn't stop the errors.  It also cut our CGNAT ratio by
> more than half (64:1 -> 28:1).  If we stay at this ratio, we'll need
> much larger external pools than originally anticipated.  Tuning these
> settings is kind
> of painful as JTAC strongly recommends bouncing the CGNAT service anytime
> CGNAT related config changes are made.  This means briefly breaking
> Internet access for all CGNAT'd customers.  For the PBEs, JTAC's
> suggestions so far have been to shorten some of the timeouts in the config
> and to keep doing what we're doing, which is a cron job that essentially
> does a "show services nat source port-block", parses the output looking
> for subscriber IPs that have used up the ports in several of their
> port-blocks, then does a "show services sessions source-prefix ..." and
> logs all of this.  This at least gives us snapshots of "who's a heavy user
> right now" and lets us look at how they were using all their ports.  i.e.
> was it BitTorrent, are they compromised and scanning the internet for more
> systems to compromise, is it legit looking traffic - just lots of it,
> etc.?
>
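
For a rough sense of how the PBA settings drive the ratio and the pool size, here is a back-of-the-envelope sketch in Python. It assumes all 65536 ports of each external IP are handed out in port-blocks and that the ratio is computed as if every subscriber could claim their full block allowance; both are my assumptions, not Juniper's documented behavior. The function names and the 10,000-subscriber figure are made up for illustration.

```python
import math

PORTS_PER_IP = 65536  # assumption: the whole port range of an external IP is usable


def cgnat_ratio(block_size: int, max_blocks: int) -> int:
    """Subscribers per external IP when each may claim max_blocks port-blocks."""
    ports_per_subscriber = block_size * max_blocks
    return PORTS_PER_IP // ports_per_subscriber


def external_ips_needed(subscribers: int, ratio: int) -> int:
    """External pool size (in addresses) needed at a given subscriber:IP ratio."""
    return math.ceil(subscribers / ratio)


if __name__ == "__main__":
    # 256 ports per block, 4 blocks per subscriber -> 1024 ports each
    print(cgnat_ratio(256, 4))             # 64
    # Hypothetical 10,000 subscribers at the two ratios mentioned above
    print(external_ips_needed(10000, 64))  # 157
    print(external_ips_needed(10000, 28))  # 358
```

Under these assumptions 256x4 works out to the 64:1 you started with, and dropping to 28:1 more than doubles the external addresses needed for the same subscriber count.
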
> The latest CGNAT issue is a customer with a Palo Alto Networks firewall
> connected to our network and several of their employees are our FTTH
> customers.  On their PANW firewall, they're doing IP Geo based filtering,
> limiting access to internal servers to "US IPs".  Since we only CGNAT
> traffic to the external Internet, their on-net employees hit the firewall
> from their 100.64/10 IPs and get blocked.  I suggested they whitelist
> 100.64/10, saying we block traffic from 100.64/10 from entering our
> network via peering and transit, so they can be assured anything from
> 100.64/10 came from inside our network / our customers.  They say the
> firewall won't let them whitelist 100.64.0.0/10, giving an error that
> it's invalid IP space.
>
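
As a sanity check on the prefix itself: 100.64.0.0/10 is the shared address space set aside in RFC 6598, and it is a perfectly well-formed CIDR block. A minimal sketch with Python's standard `ipaddress` module shows its bounds and how a membership test against it could look. The helper name `is_cgnat_source` is hypothetical, and this says nothing about how the PANW firewall validates its own address objects.

```python
import ipaddress

# Shared address space reserved for CGNAT (RFC 6598)
SHARED_SPACE = ipaddress.ip_network("100.64.0.0/10")


def is_cgnat_source(addr: str) -> bool:
    """True if addr falls inside 100.64.0.0/10."""
    return ipaddress.ip_address(addr) in SHARED_SPACE


if __name__ == "__main__":
    print(SHARED_SPACE.network_address)    # 100.64.0.0
    print(SHARED_SPACE.broadcast_address)  # 100.127.255.255
    print(SHARED_SPACE.num_addresses)      # 4194304
    print(is_cgnat_source("100.64.0.1"))   # True
    print(is_cgnat_source("100.128.0.1"))  # False
```
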
> I know we're not the first to implement CGNAT, so I'm curious if others
> have run into these sorts of issues, or others we haven't run into yet,
> and if so, how you solved them.
>
>
> ----------------------------------------------------------------------
>   Jon Lewis, MCP :)              |  I route
>   Blue Stream Fiber, Sr. Neteng  |  therefore you are
> _________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
>
