[ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923642#action_12923642
 ] 

He Yongqiang commented on HIVE-78:
----------------------------------

@dhruba
HDFS has its own authorization. So if we allow an access in the Hive layer and 
pass it on to HDFS (by setting the correct HDFS username and groups), the job 
can still fail with an HDFS permission error. 
So we need to solve the problem of two independent authorization layers.
One way is to allow all accesses at the HDFS level and let Hive do the 
authorization, so Hive runs as root in terms of HDFS.
The other way is to plug HDFS authorization into the Hive layer and accept an 
access only if both Hive and HDFS say YES. A user can belong to several unix 
groups, and HDFS permissions would be set based on the unix group. [I am not 
sure how many groups a user can have in terms of HDFS, that is, how many group 
settings you can put on an HDFS file. Let's simply say I want these 2 groups to 
be able to read the file.] Another problem is column-level privileges.
This is very open for discussion, please comment on it.
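
The second option (accept only if both Hive and HDFS say YES) could look 
roughly like the sketch below. This is only an illustration, not Hive or HDFS 
code: the function names, the in-memory tables, the warehouse path, and the 
one-group-per-file simplification are all assumptions made for the example.

```python
# Hypothetical in-memory stand-ins for the two authorization layers.
HIVE_GRANTS = {
    ("alice", "db_name.T", "SELECT"),
    ("carol", "db_name.T", "SELECT"),
}
TABLE_PATH = {"db_name.T": "/warehouse/db_name/T"}       # assumed layout
HDFS_FILE_GROUP = {"/warehouse/db_name/T": "analysts"}   # assumed: one group per file
USER_GROUPS = {"alice": {"analysts"}, "carol": {"interns"}}


def hive_allows(user, table, priv):
    """Hive-layer check: is there a matching grant?"""
    return (user, table, priv) in HIVE_GRANTS


def hdfs_allows(user, path):
    """HDFS-layer check: is the user in the file's group?"""
    return HDFS_FILE_GROUP.get(path) in USER_GROUPS.get(user, set())


def authorize(user, table, priv):
    """Accept only if both layers say YES."""
    return hive_allows(user, table, priv) and hdfs_allows(user, TABLE_PATH[table])
```

Here 'carol' has a Hive grant but is not in the file's group, so the combined 
check rejects her up front instead of letting the job fail later in HDFS.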


About the proposal, there is one authorization rule that we are not sure about. 
It's the simple rule "one deny then deny": a single matching DENY overrides 
any GRANT.

Take this example:
5.3.1 I want to grant everyone (new people may join at any time) access to 
db_name.*, and then later I want to protect one table, db_name.T, from ALL 
users but a few:
1) Add all users to a group 'users' (assumption: new users will automatically 
join this group), and grant 'users' ALL privileges on db_name.*
2) Add those few users to a new group 'users2', AND REMOVE them from 'users'
3) DENY 'users' access to db_name.T
4) Grant ALL on db_name.T to 'users2'

The main problem with this approach is that "REMOVE them from 'users'" is not 
practical. 
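
To make the issue concrete, here is a small sketch of "one deny then deny" 
applied to the example. The rule table, the user names, and the helper 
functions are hypothetical; this is not the proposed Hive implementation.

```python
# Hypothetical group membership and rules for example 5.3.1.
GROUPS = {
    "users": {"alice", "bob", "carol"},
    "users2": {"carol"},
}
RULES = [
    ("users", "db_name.*", "GRANT"),
    ("users", "db_name.T", "DENY"),
    ("users2", "db_name.T", "GRANT"),
]


def matches(pattern, obj):
    """'db_name.*' covers every table in db_name; otherwise exact match."""
    if pattern.endswith(".*"):
        return obj.startswith(pattern[:-1])
    return pattern == obj


def one_deny_then_deny(user, obj):
    """Any matching DENY wins; otherwise at least one GRANT is required."""
    effects = [effect for group, pattern, effect in RULES
               if user in GROUPS[group] and matches(pattern, obj)]
    if "DENY" in effects:
        return False
    return "GRANT" in effects
```

As long as 'carol' is still a member of 'users', the DENY on db_name.T applies 
to her even though 'users2' has a grant. She only gets access after she is 
removed from 'users', which is exactly the step that is hard to do in practice.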


The other option we have thought about is a different rule.

First, try the user name.

Try to deny the access by looking up the deny tables by user name:

1. If there is an entry in 'user' that denies this access, return DENY
2. If there is an entry in 'db' that denies this access, return DENY
3. If there is an entry in 'table' that denies this access, return DENY
4. If there is an entry in 'column' that denies this access, return DENY

If we get one deny, we return DENY for this attempt.

If no deny is found, go through all privilege levels with the user name:

5. If there is an entry in 'user' that accepts this access, return ACCEPT
6. If there is an entry in 'db' that accepts this access, return ACCEPT
7. If there is an entry in 'table' that accepts this access, return ACCEPT
8. If there is an entry in 'column' that accepts this access, return ACCEPT


Second, try the user's group/role names one by one until we get an ACCEPT. If 
we get an ACCEPT from one group/role, we ACCEPT this access. Otherwise, deny.

For each role/group, we follow the same routine as for the user name.
The problem with this approach is that it's a little complex and we did not 
find any system that uses it. MySQL has no deny. In SQL Server, it's one deny 
then deny.
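
The routine above could be sketched as follows. This is one possible reading, 
not a definitive implementation: the entry format is hypothetical, an entry is 
assumed to already cover the access being checked, and a DENY or ACCEPT found 
for the user name is assumed to be final, while a DENY found for one 
group/role does not stop us from trying the next one.

```python
LEVELS = ("user", "db", "table", "column")


def check_principal(entries, principal, access):
    """Steps 1-8 for one user name or group/role.

    entries: set of (principal, level, access, effect) tuples (hypothetical).
    Returns 'DENY', 'ACCEPT', or None when no entry applies.
    """
    for effect in ("DENY", "ACCEPT"):      # all deny lookups first, then accepts
        for level in LEVELS:
            if (principal, level, access, effect) in entries:
                return effect
    return None


def authorize(entries, user, groups, access):
    # First, the user name: an explicit answer here is taken as final.
    result = check_principal(entries, user, access)
    if result is not None:
        return result == "ACCEPT"
    # Second, the groups/roles one by one until one of them accepts.
    for group in groups:
        if check_principal(entries, group, access) == "ACCEPT":
            return True
    return False                           # no ACCEPT anywhere: deny
```

With this rule, a user who is in both 'users' (denied on db_name.T) and 
'users2' (accepted on db_name.T) gets access through 'users2' without having 
to be removed from 'users'.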


> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: He Yongqiang
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, 
> hive-78-syntax-v1.patch, hive-78.diff
>
>
> Allow Hive to integrate with existing user repositories for authentication 
> and authorization information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
