[ https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569875#comment-16569875 ]
Puneet Jain commented on HIVE-18858:
------------------------------------

Hi,

This seems to have broken working scenarios with Hive MR. We now see that hadoop.tmp.dir is always set to /tmp/hadoop-hive (in job.xml). This creates problems on a multi-tenant Hadoop cluster, since the tmp folder is owned by whichever user executes a job first, and other users then fail to write to it.

E.g. User1 runs a job and /tmp/hadoop-hive is created on the worker node with ownership set to user1. Subsequently user2 tries to run a job, and the job fails because user2 has no write permission on /tmp/hadoop-hive/.

The old behavior allowed multiple tenants to write to their respective tmp folders, which was secure and contention free: User1 - /tmp/hadoop-user1, User2 - /tmp/hadoop-user2.

Thanks
Puneet

> System properties in job configuration not resolved when submitting MR job
> --------------------------------------------------------------------------
>
>                 Key: HIVE-18858
>                 URL: https://issues.apache.org/jira/browse/HIVE-18858
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>         Environment: Hadoop 3.0.0
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch, HIVE-18858.3.patch
>
>
> Since [this hadoop
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1]
> that was first released in 3.0.0, Configuration has a restricted mode, that
> disables the resolution of system properties (that happens when retrieving a
> configuration option).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of
> 3.0.0-beta1), since we're relying on the [substitution of
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
> during the [maven
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
> See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to
> disable the restricted mode, since we go through some Hadoop MR calls first,
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
> 	  at org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
> 	  - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
> 	  at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
> 	  at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
> 	  at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
> 	  at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
> 	  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
> 	  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
> 	  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
> 	  at java.security.AccessController.doPrivileged(AccessController.java:-1)
> 	  at javax.security.auth.Subject.doAs(Subject.java:422)
> 	  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
> 	  at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
> 	  at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
> 	  at java.security.AccessController.doPrivileged(AccessController.java:-1)
> 	  at javax.security.auth.Subject.doAs(Subject.java:422)
> 	  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
> 	  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
> 	  at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
> 	  at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
> 	  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
> 	  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> 	  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
> 	  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
> 	  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
> 	  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
> 	  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
> 	  at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> 	  at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
> 	  at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
> 	  at java.security.AccessController.doPrivileged(AccessController.java:-1)
> 	  at javax.security.auth.Subject.doAs(Subject.java:422)
> 	  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
> 	  at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
> 	  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	  at java.lang.Thread.run(Thread.java:745)
> {code}
> I suggest resolving all variables before passing the configuration to Hadoop
> in ExecDriver.
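
For readers following the thread, here is a minimal sketch of what "resolve all variables before passing the configuration to Hadoop" could look like. It is not the HIVE-18858 patch and does not use Hadoop's Configuration class: the class EagerPropertyResolver, its methods, and the plain Map standing in for the job configuration are all hypothetical. The lookup order (system property first, then another configuration key) is only an approximation of Hadoop's substitution. The sketch also suggests how the regression reported above can arise: Hadoop's stock default for hadoop.tmp.dir is /tmp/hadoop-${user.name}, so resolving it eagerly bakes in the user of whichever JVM does the resolving.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Hive or Hadoop.
final class EagerPropertyResolver {
    // Matches the innermost ${name} placeholder.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$]+)\\}");
    // Upper bound on substitution passes so self-referencing values terminate.
    private static final int MAX_PASSES = 20;

    private EagerPropertyResolver() {}

    // Resolves placeholders in one value: JVM system properties are tried
    // first, then other keys of the same configuration map.
    static String resolve(String value, Map<String, String> conf) {
        String current = value;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            Matcher m = VAR.matcher(current);
            StringBuffer out = new StringBuffer();
            boolean changed = false;
            while (m.find()) {
                String name = m.group(1);
                String replacement = System.getProperty(name);
                if (replacement == null) {
                    replacement = conf.get(name);
                }
                if (replacement == null) {
                    replacement = m.group(); // unknown variable: leave untouched
                } else {
                    changed = true;
                }
                m.appendReplacement(out, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(out);
            current = out.toString();
            if (!changed) {
                break;
            }
        }
        return current;
    }

    // Returns a copy of the configuration with every value resolved.
    static Map<String, String> resolveAll(Map<String, String> conf) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            resolved.put(e.getKey(), resolve(e.getValue(), conf));
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hadoop.tmp.dir", "/tmp/hadoop-${user.name}");
        // Prints /tmp/hadoop-<user running this JVM>, which is the point:
        // the placeholder is fixed at resolution time, not per job submitter.
        System.out.println(resolveAll(conf).get("hadoop.tmp.dir"));
    }
}
```

If something like resolveAll ran inside a HiveServer2 process started as the hive user, the value written to job.xml would be /tmp/hadoop-hive for every tenant, which matches the symptom described in the comment. One possible design choice, stated here only as an assumption, would be to resolve just the keys that need it (for example test.tmp.dir) and leave per-user placeholders such as ${user.name} unresolved.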