On Mon, Sep 13, 2010 at 9:31 AM, Owen O'Malley <omal...@apache.org> wrote:

> Moving the discussion over to the more appropriate mapreduce-dev.
>

This is not MR-specific, since the strangely named hadoop.job.ugi determines
HDFS permissions as well. +CC hdfs-dev... though I actually think this is an
issue that users will have interest in, which is why I posted to general
initially rather than a dev list.


> On Mon, Sep 13, 2010 at 9:08 AM, Todd Lipcon <t...@cloudera.com> wrote:
>
> > 1) Groups resolution happens on the server side, where it used to happen on
> > the client. Thus, all Hadoop users must exist on the NN/JT machines in order
> > for group mapping to succeed (or the user must write a custom group mapper).
>
> There is a plugin that performs the group lookup. See HADOOP-4656.
> There is no requirement for having the user accounts on the NN/JT
> although that is the easiest approach. It is not recommended that the
> users be allowed to login.
>

"or the user must write a custom group mapper" above refers to this plugin
capability. But I think most users do not want to spend the time to write
(or even set up) such a plugin beyond the default shell-based mapping
service.
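
For a sense of what "writing a custom group mapper" involves, here is a
minimal sketch. It assumes the plugin boils down to a single "user name in,
list of group names out" method. GroupMapper below is a hypothetical stand-in
for the real plugin interface from HADOOP-4656 (which I believe is
GroupMappingServiceProvider), and StaticGroupMapper and its in-memory table
are invented for illustration, so the example compiles without Hadoop on the
classpath.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaticGroupMapperSketch {

    /** Hypothetical stand-in for Hadoop's group-mapping plugin interface. */
    interface GroupMapper {
        List<String> getGroups(String user) throws IOException;
    }

    /**
     * Resolves groups from an in-memory table, so the users do not need
     * OS accounts on the NN/JT machines.
     */
    static class StaticGroupMapper implements GroupMapper {
        private final Map<String, List<String>> table =
            new HashMap<String, List<String>>();

        void addUser(String user, String... groups) {
            table.put(user, Arrays.asList(groups));
        }

        @Override
        public List<String> getGroups(String user) {
            List<String> groups = table.get(user);
            // Unknown users get no groups rather than an error.
            return groups != null ? groups : Collections.<String>emptyList();
        }
    }

    public static void main(String[] args) throws IOException {
        StaticGroupMapper mapper = new StaticGroupMapper();
        mapper.addUser("foo", "analysts", "etl");
        System.out.println(mapper.getGroups("foo"));
        System.out.println(mapper.getGroups("nobody"));
    }
}
```

Even a mapper this small has to be packaged, deployed to the NN/JT, and kept
in sync with the real user list, which is the setup cost referred to above.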


> I think it is important that turning security on and off doesn't
> drastically change the semantics or protocols. That will become much
> much harder to support downstream.
>
>
As someone who spends an awful lot of time doing downstream support of lots
of different clusters, I actually disagree. I believe the majority of users
do *not* plan on turning on security, so keeping things simpler for them is
worth a lot. In many of these clusters the users and the ops team and the
developers are all one and the same - it's not the multitenant "internal
service" model that we see at the larger installations like Yahoo or
Facebook.


> > 2) The hadoop.job.ugi parameter is ignored - instead the user has to use the
> > new UGI.createRemoteUser("foo").doAs() API, even in simple security.
>
> User code that counts on hadoop.job.ugi working will be horribly
> broken once you turn on security. Turning on and off security should
> not involve testing all of your applications. It is unfortunate that
> we ever used the configuration value as the user, but continuing to
> support it will make our users' code much much more brittle.
>

The assumption above is "once you turn on security" - but many users will
not and probably never will turn on security. Providing a transition plan
for one version is our usual policy here - I agree that long term we would
like to do away with this hack of a configuration parameter. Since it's not
hard to provide a backwards compatibility path with a deprecation warning
for one version, are you against it? Or just saying that on your particular
clusters you will choose not to take advantage of it?
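
To make the proposal concrete, here is a rough sketch of what such a
compatibility path could look like. It is not actual Hadoop code: the class,
the effectiveUser method, and the use of a plain Map in place of a Hadoop
Configuration are all assumptions made for illustration. It also assumes the
old value has the form "user,group1,group2" and keeps only the user, since
groups are now resolved on the server side.

```java
import java.util.HashMap;
import java.util.Map;

public class JobUgiCompatSketch {
    static final String DEPRECATED_KEY = "hadoop.job.ugi";

    /**
     * Picks the effective user name. conf is a plain map standing in for a
     * Hadoop Configuration; securityEnabled and loginUser are supplied by
     * the caller. All names here are hypothetical.
     */
    static String effectiveUser(Map<String, String> conf,
                                boolean securityEnabled,
                                String loginUser) {
        String ugi = conf.get(DEPRECATED_KEY);
        if (ugi == null || ugi.trim().isEmpty()) {
            return loginUser;
        }
        if (securityEnabled) {
            // With security on, the configured value is never honored.
            System.err.println("WARN: " + DEPRECATED_KEY
                + " is ignored when security is enabled");
            return loginUser;
        }
        System.err.println("WARN: " + DEPRECATED_KEY
            + " is deprecated; use UGI.createRemoteUser(...).doAs(...) instead");
        String[] parts = ugi.split(",");
        if (parts.length == 0 || parts[0].trim().isEmpty()) {
            return loginUser; // malformed value, fall back to the login user
        }
        return parts[0].trim();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        conf.put(DEPRECATED_KEY, "foo,supergroup");
        System.out.println(effectiveUser(conf, false, "todd"));
        System.out.println(effectiveUser(conf, true, "todd"));
    }
}
```

With simple security the deprecated parameter still works for one release but
logs a warning on every use; with security enabled it is ignored, so nobody
can come to depend on it there.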

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera
