On Jul 10, 2020, at 8:08 AM, Massameno, Dan <[email protected]> wrote:
> Thank you for your thorough review.  I am completely new to the IETF process 
> (I don't even know what a "nit" is) and I'm looking forward to the process.

  "Nit" is short for "nitpick", i.e. a minor point which should be fixed, but 
isn't very important.

  The IETF process is long, but generally worth it.

> I did want to dive into one of the last comments you made.  
> 
>        Even if the various nits and issues above were fixed, this 
> proposal would have serious security issues.  Even 
>        if those security issues were addressed, I believe that load balancing 
> is simply not appropriate for 
>        the NAS.  Even if it was appropriate for the NAS, the vendors have 
> spoken: NAS implementations 
>        are simple, and server implementations are complex.
>        As such, load balancing more properly belongs in the server.
> 
> If I have N number of RADIUS servers, how are the servers supposed to load 
> balance if the load balancing function "belongs in the server"?

  You run a load-balancer.  This is either a hardware device which does UDP 
load balancing, or a dedicated RADIUS server which does nothing more than load 
balancing.

>  Won't the NAS need to know about all the RADIUS servers?  If yes, how does 
> it decide which one to use for any one particular session?

  The NAS knows about one server: the load balancing one.  Typically the load 
balancer can be set up in an HA pair, using VRRP to share an IP address.
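
  As a rough illustration of what the load-balancing server does with each 
request, here is a minimal sketch.  It assumes the balancer hashes a stable 
per-session key (for example the User-Name or Calling-Station-Id attribute) so 
that every packet of one session reaches the same home server.  The function 
name, the choice of key, and the addresses are hypothetical, and real servers 
offer other policies too.

```python
import hashlib


def pick_home_server(key: str, servers: list[str]) -> str:
    """Map a stable per-session key to one of the configured home servers.

    Hypothetical helper: the same key always selects the same server, so
    multi-packet exchanges for one session stay on one back end.
    """
    if not servers:
        raise ValueError("no home servers configured")
    # Use a real hash rather than Python's hash(), which is salted per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Example addresses only (documentation range).
home_servers = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
chosen = pick_home_server("alice@example.org", home_servers)
```

  The NAS never sees this list.  It sends everything to the one balancer 
address, and the pool can be changed without touching the NAS configuration.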

  My experience is that this scenario is robust, stable, and gives the admin a 
great amount of control over the system.  My experience also has been that you 
just can't rely on the NAS to do anything sane with RADIUS.  The 
implementations are terrible, naive, simplistic, etc.  Just give up on fixing 
them, and patch over the problem with a RADIUS server that's under your control.

  It's easier for me to say, of course, having written a RADIUS server.  That 
does bias me a bit, I think.  But practical experience validates this approach.

  A different explanation for the NAS behaviour is that the NAS vendors are 
incentivized to make the core functionality work well.  e.g. switching, WiFi, 
etc.  Features such as RADIUS are an after-thought.  Issues with RADIUS are 
only fixed if a sufficient number of customers demand changes.

  In contrast, a RADIUS server vendor is strongly incentivized to implement 
RADIUS correctly, and to do load balancing correctly, with many parameters 
that can be tweaked for your exact situation.

  So independent of anything else, NAS vendors simply aren't motivated to 
implement complex RADIUS load balancing.  Whereas RADIUS server vendors have 
been shipping it for decades.

> My experience with Cisco devices indicates that if multiple RADIUS servers 
> are listed it simply uses the first one exclusively until it fails.  So, 
> there is failover, but no load balancing.  The list of RADIUS servers is 
> also statically configured via CLI, which is cumbersome when there is a large 
> fleet of devices to configure.  These two overly sophomoric features were the 
> ones I was endeavoring to fix.

  Most RADIUS clients behave the same way.  My recommendation for a long time 
has been to just punt on the problem of fixing the NAS.  Instead, run a RADIUS 
server that you control.

  One benefit is that you can upgrade the NAS, or change vendors without 
changing the way that load balancing works.

  And to re-iterate... RFC 5080 Section 2.1 defines retransmission behaviour 
for RADIUS clients.  Implementing this would make customers' lives easier.  It 
would make RADIUS systems better and more stable.  But the incremental benefit 
is simply not high enough for NAS vendors to implement it.  So instead, they 
use fixed-count and fixed-time retransmission behaviours which were first 
implemented in 1993.
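
  For comparison with fixed-count, fixed-time retries, here is a minimal 
sketch of the randomized exponential back-off style of retransmission that 
RFC 5080 describes.  The parameter names (IRT, MRT, MRC, MRD) follow that 
style of algorithm.  The default values are assumed to match the RFC's 
recommended defaults and should be checked against the text, and the function 
itself is hypothetical rather than taken from any implementation.

```python
import random


def retransmit_timeouts(irt=2.0, mrt=16.0, mrc=5, mrd=30.0, jitter=None):
    """Return the wait (in seconds) after each transmission of one request.

    irt: initial retransmission time    mrt: cap on a single wait (0 = none)
    mrc: max transmissions (0 = none)   mrd: max total duration (0 = none)
    jitter: callable returning a factor in [-0.1, 0.1] (assumed range)
    """
    if mrc == 0 and mrd == 0:
        raise ValueError("set mrc or mrd, otherwise the schedule never ends")
    if jitter is None:
        jitter = lambda: random.uniform(-0.1, 0.1)

    timeouts = []
    elapsed = 0.0
    rt = irt + jitter() * irt            # first wait: IRT plus jitter
    while mrc == 0 or len(timeouts) < mrc:
        if mrd and elapsed + rt >= mrd:
            timeouts.append(mrd - elapsed)   # never wait past MRD
            break
        timeouts.append(rt)
        elapsed += rt
        rt = 2 * rt + jitter() * rt      # double the previous wait, plus jitter
        if mrt and rt > mrt:
            rt = mrt + jitter() * mrt    # cap at MRT, plus jitter
    return timeouts


no_jitter = retransmit_timeouts(jitter=lambda: 0.0)
```

  With jitter disabled and the defaults above, the waits are 2, 4, 8, and 16 
seconds, at which point the 30 second total ends the exchange.  A 
fixed-interval client, by contrast, waits the same time after every attempt 
no matter how the server is responding.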

  In order for your proposal to gain traction, the NAS vendors MUST have a strong 
incentive to implement it.  And I'm not sure what that incentive is.  Right 
now, they can just say "buy a load balancer", or "download FreeRADIUS and run 
it in a VM".  Problem solved.

  Alan DeKok.

_______________________________________________
OPSAWG mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/opsawg
