[ https://issues.apache.org/jira/browse/HIVE-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Rao updated HIVE-6489:
--------------------------
    Component/s:     (was: Authorization)
                     (was: Clients)

> Data loaded with LOAD DATA LOCAL INPATH has incorrect group ownership
> ---------------------------------------------------------------------
>
>                 Key: HIVE-6489
>                 URL: https://issues.apache.org/jira/browse/HIVE-6489
>             Project: Hive
>          Issue Type: Bug
>          Components: Import/Export
>    Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0
>         Environment: OS and hardware are irrelevant. Tested and reproduced
> on multiple configurations, including SLES, RHEL, VM, Teradata Hadoop
> Appliance, HDP 1.1, HDP 1.3.2, HDP 2.0.
>            Reporter: Joe Rao
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Data uploaded by <user> via the Hive client with the "LOAD DATA LOCAL INPATH"
> method will have the group ownership of hdfs://tmp/hive-<user> instead of the
> primary group that <user> belongs to. The group ownership of
> hdfs://tmp/hive-<user> is, by default, the group of the user that runs the
> Hadoop daemons. This means that, on a Hadoop system with default
> file permissions of 770, any data loaded into Hive via the LOAD DATA LOCAL
> INPATH method by one user cannot be seen by another user in the same group
> until the group ownership is manually changed in Hive's internal directory,
> or the group ownership is manually changed on hdfs://tmp/hive-<user>. This
> problem is not present with the LOAD DATA INPATH method, or with regular
> HDFS loads.
>
> Steps to reproduce the problem on a pseudo-distributed Hadoop cluster:
> - In hdfs-site.xml, modify the umask to 007 (meaning that default permissions
>   on files are 770). The property changes names in Hadoop 2.0 but used to be
>   called "dfs.umaskmode".
> - Restart HDFS.
> - Create a group called "testgroup".
> - Create two users that have testgroup as their primary group. Call them
>   "testuser1" and "testuser2".
> - Create a test file containing "Hello World" and call it "test.txt".
>   It should be stored on the local filesystem.
> - Create a table called "testtable" in Hive using testuser1. Give it a
>   single string column, textfile format, comma delimited fields.
> - Have testuser1 use the LOAD DATA LOCAL INPATH command to load "test.txt"
>   into testtable.
> - Attempt to read testtable using testuser2. The read will fail on a
>   permissions error, when it should not.
> - Examine the contents of the hdfs://apps/hive/warehouse/testtable directory.
>   The file will belong to the "hadoop" or "users" or analogous group, instead
>   of the correct group "testgroup". It will have correct permissions of 770.
> - Change the group ownership of the folder "hdfs://tmp/hive-testuser1" to
>   "testgroup".
> - Repeat the data load. testuser2 will now be able to correctly read the
>   data, and the file will have the correct group ownership.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
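
The permission failure in the reproduction steps above can be modeled with a small sketch. It is not Hive or HDFS code, only a simplified illustration of the owner/group/other check that HDFS applies. The names FileStatus and can_read are hypothetical, the superuser case is ignored, and "hadoop" stands in for whichever daemon group the cluster uses.

```python
from dataclasses import dataclass

READ = 4  # read bit within a single rwx triplet


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for the owner, group and mode of an HDFS file."""
    owner: str
    group: str
    mode: int  # octal permission bits, e.g. 0o770


def can_read(user: str, user_groups: set, f: FileStatus) -> bool:
    """Simplified owner/group/other check; superuser handling is omitted."""
    if user == f.owner:
        bits = (f.mode >> 6) & 7   # owner triplet
    elif f.group in user_groups:
        bits = (f.mode >> 3) & 7   # group triplet
    else:
        bits = f.mode & 7          # other triplet
    return bool(bits & READ)


# File as described in the report: owned by testuser1, but carrying the
# daemon group ("hadoop" here is an assumption) instead of "testgroup".
loaded_with_bug = FileStatus(owner="testuser1", group="hadoop", mode=0o770)

# Same file after the group ownership has been corrected.
loaded_after_chgrp = FileStatus(owner="testuser1", group="testgroup", mode=0o770)

testuser2_groups = {"testgroup"}

print(can_read("testuser2", testuser2_groups, loaded_with_bug))     # False
print(can_read("testuser2", testuser2_groups, loaded_after_chgrp))  # True
```

With mode 770 the "other" triplet grants nothing, so a group mismatch alone is enough to lock out testuser2 even though both users share a primary group.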