The approach we took in Apache Kudu is that, if Kerberos hasn't been
enabled, we default to a whitelist of subnets. The default whitelist is
127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16, which
matches the IANA "non-routable IP" ranges (loopback, RFC 1918 private, and
link-local).
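
For the curious, the check is conceptually something like the Java sketch
below. This is illustrative only -- Kudu's actual implementation is in C++
and differs in detail, and the class name and constants here are made up:

    import java.net.InetAddress;
    import java.util.Arrays;
    import java.util.List;

    /** Illustrative only: match a client address against a CIDR whitelist. */
    public class TrustedSubnets {
      // Hypothetical default, mirroring the list described above.
      static final List<String> DEFAULT_WHITELIST = Arrays.asList(
          "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12",
          "192.168.0.0/16", "169.254.0.0/16");

      static boolean isTrusted(InetAddress client, List<String> cidrs)
          throws Exception {
        for (String cidr : cidrs) {
          String[] parts = cidr.split("/");
          byte[] net = InetAddress.getByName(parts[0]).getAddress();
          byte[] addr = client.getAddress();
          int prefixBits = Integer.parseInt(parts[1]);
          if (net.length != addr.length) continue; // e.g. IPv6 client vs IPv4 subnet
          boolean match = true;
          for (int i = 0; i < prefixBits; i++) {
            int mask = 0x80 >> (i % 8);
            if ((net[i / 8] & mask) != (addr[i / 8] & mask)) {
              match = false;
              break;
            }
          }
          if (match) return true;
        }
        // Address is outside every whitelisted subnet.
        return false;
      }
    }

In this sketch, an unauthenticated connection whose peer address fails the
check would simply be rejected rather than treated as a trusted local client.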

In other words, out-of-the-box, you get a deployment that works fine within
a typical LAN environment, but won't allow some remote hacker to locate
your cluster and access your data. We thought this was a nice balance
between "works out of the box without lots of configuration" and "decent
security". In my opinion a "localhost-only by default" would be be overly
restrictive since I'd usually be deploying on some datacenter or EC2
machine and then trying to access it from a client on my laptop.

We first released this a bit over a year ago, if memory serves, and we've
had relatively few complaints or questions about it. We also made sure
that the error message that comes back to clients is reasonably clear,
indicating the specific configuration that is disallowing access, so if
people hit the issue on upgrade they have a clear idea of what is going on.

Of course it's not foolproof: as Eric says, you're still likely open to
the entirety of your corporation, and you may not want that. But as he
also pointed out, that might be true even if you enable Kerberos
authentication.

-Todd

On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang <ey...@hortonworks.com> wrote:

> Hadoop's default configuration aims for user friendliness to increase
> adoption, and security features can be enabled one by one.  This approach
> is problematic for security because the system can be compromised before
> all security features are turned on.
> Larry's proposal would add some safety by reminding system admins when
> security is disabled.  However, reducing the number of security
> configuration knobs is likely required for the banner idea to work without
> writing too much guessing logic to determine whether a UI is secured.
> Penetration testing can provide better insight into what hasn't been
> secured, to improve the next release.  Thankfully most Hadoop vendors have
> done this work periodically to help the community secure Hadoop.
>
> Plenty of companies advertise: if you want security, use Kerberos.  This
> statement is not entirely true.  Kerberos makes the system more difficult
> for external parties to crack, but it shouldn't be the only method of
> securing Hadoop.  When the Kerberos environment is larger than the Hadoop
> cluster, anyone within that Kerberos environment can access the Hadoop
> cluster freely, without restriction.  In large-scale enterprises, or for
> cloud vendors that sublet their resources, this might not be acceptable.
>
> From my point of view, a secure Hadoop release must default all settings
> to localhost only and allow users to add more hosts through an authorized
> whitelist of servers.  This will keep the security perimeter in check.
> All wildcard ACLs will need to be removed or defaulted to the current
> user/current host only.  Proxy user/host ACL lists must be enforced on
> HTTP channels.  This basically realigns the default configuration to a
> single-node cluster or firewalled configuration.
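>
> For illustration, a startup check along these lines might look like the
> following plain-Java sketch (the property names and class are
> hypothetical, not actual Hadoop code):
>
>     import java.util.Map;
>
>     /** Illustrative sketch of "secure by default" startup validation. */
>     public class SecureDefaults {
>       /** Refuse wildcard ACLs unless the admin explicitly opted out. */
>       static void validateAcls(Map<String, String> conf) {
>         boolean insecureOk = Boolean.parseBoolean(
>             conf.getOrDefault("insecure.acls.permitted", "false"));
>         for (Map.Entry<String, String> e : conf.entrySet()) {
>           if (e.getKey().endsWith(".acl") && "*".equals(e.getValue())
>               && !insecureOk) {
>             throw new IllegalStateException("Wildcard ACL for " + e.getKey()
>                 + " is not allowed by default");
>           }
>         }
>       }
>
>       /** Bind to loopback unless a whitelist of hosts was configured. */
>       static String bindAddress(Map<String, String> conf) {
>         String whitelist = conf.getOrDefault("authorized.hosts", "");
>         return whitelist.isEmpty() ? "127.0.0.1" : "0.0.0.0";
>       }
>     }
>
> The point is that the out-of-the-box values fail closed (localhost, no
> wildcards), and opening anything up requires an explicit, auditable
> configuration change.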
>
> Regards,
> Eric
>
> On 7/5/18, 8:24 AM, "larry mccay" <larry.mc...@gmail.com> wrote:
>
>     Hi Steve -
>
>     This is a long overdue DISCUSS thread!
>
>     Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED UI
>     ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the
>     warning to get to the page, like SSL exceptions in the browser do?
>     Similar tactic for UI access without SSL?
>     A new AuthenticationFilter can be added to the filter chains that
>     blocks API calls unless explicitly configured to be open, and logs a
>     similarly obvious message?
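>
>     A rough sketch of such a filter (illustrative only; the class name and
>     init parameter are hypothetical, not an existing Hadoop filter):
>
>         import java.io.IOException;
>         import java.util.logging.Logger;
>         import javax.servlet.*;
>         import javax.servlet.http.HttpServletResponse;
>
>         /** Illustrative: block API calls unless explicitly opened up. */
>         public class InsecureAccessFilter implements Filter {
>           private static final Logger LOG =
>               Logger.getLogger(InsecureAccessFilter.class.getName());
>           private boolean explicitlyOpen;
>
>           public void init(FilterConfig config) {
>             // Hypothetical init parameter; real wiring would consult the
>             // cluster's security configuration instead.
>             explicitlyOpen = Boolean.parseBoolean(
>                 config.getInitParameter("insecure.access.permitted"));
>           }
>
>           public void doFilter(ServletRequest req, ServletResponse resp,
>               FilterChain chain) throws IOException, ServletException {
>             if (!explicitlyOpen) {
>               // Log an obvious message and reject the call outright.
>               LOG.warning("UNSECURED ACCESS attempt blocked; not explicitly"
>                   + " configured to be open");
>               ((HttpServletResponse) resp).sendError(
>                   HttpServletResponse.SC_FORBIDDEN,
>                   "WARNING: UNSECURED ACCESS - OPEN TO COMPROMISE");
>               return;
>             }
>             chain.doFilter(req, resp);
>           }
>
>           public void destroy() {}
>         }
>
>     Wiring something like this into the existing filter chains would keep
>     the behavior consistent across the web UIs and the REST endpoints they
>     serve.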
>
>     thanks,
>
>     --larry
>
>
>
>
>     On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <ste...@hortonworks.com>
>     wrote:
>
>     > Bitcoins are profitable enough to justify writing malware to run on
>     > Hadoop clusters & schedule mining jobs: there have been a couple of
>     > incidents of this in the wild, generally going in through no security,
>     > well known passwords, open ports.
>     >
>     > Vendors of Hadoop-related products get to deal with their lockdown
>     > themselves, which they often do by installing Kerberos from the
>     > outset, making users make up their own passwords for admin accounts,
>     > etc.
>     >
>     > The ASF releases, though: we just provide something insecure out of
>     > the box and some docs saying "use Kerberos if you want security".
>     >
>     > What can we do here?
>     >
>     > Some things to think about
>     >
>     > * docs explaining IN CAPITAL LETTERS why you need to lock down your
>     > cluster to a private subnet or use Kerberos
>     > * Anything which can be done to make Kerberos easier (?). I see there
>     > are some outstanding patches for HADOOP-12649 which need review, but
>     > what else?
>     >
>     > Could we have Hadoop determine when it's coming up on an open
>     > network and start warning? And how?
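>     >
>     > (One possible "how", sketched in Java -- purely illustrative, not an
>     > existing Hadoop check: inspect the address a daemon is about to bind
>     > to and warn if it looks publicly routable.)
>     >
>     >     import java.net.InetAddress;
>     >
>     >     public class OpenNetworkCheck {
>     >       /** True if the bind address looks publicly routable. */
>     >       static boolean looksPublic(InetAddress bindAddr) {
>     >         return !(bindAddr.isLoopbackAddress()
>     >             || bindAddr.isSiteLocalAddress()   // RFC 1918 ranges
>     >             || bindAddr.isLinkLocalAddress()
>     >             || bindAddr.isAnyLocalAddress());  // 0.0.0.0 needs its own policy
>     >       }
>     >
>     >       public static void main(String[] args) throws Exception {
>     >         InetAddress addr = InetAddress.getByName(args[0]);
>     >         if (looksPublic(addr)) {
>     >           System.err.println("WARNING: binding to publicly routable "
>     >               + addr + " without security enabled");
>     >         }
>     >       }
>     >     }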
>     >
>     > At the very least, single-node Hadoop should be locked down. You
>     > shouldn't have to bring up Kerberos to run it like that. And for more
>     > sophisticated multinode deployments, should the scripts refuse to work
>     > without Kerberos unless you pass in some argument like
>     > "--Dinsecure-clusters-permitted"?
>     >
>     > Any other ideas?
>     >
>     >
>     > ---------------------------------------------------------------------
>     > To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>     > For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>     >
>     >
>
>
>


-- 
Todd Lipcon
Software Engineer, Cloudera
