[jira] [Commented] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

Puneet Jain (JIRA) Mon, 06 Aug 2018 01:35:21 -0700


    [ https://issues.apache.org/jira/browse/HIVE-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569875#comment-16569875 ]


Puneet Jain commented on HIVE-18858:
------------------------------------

Hi,

This change seems to have broken previously working scenarios with Hive on MR. 
We now see hadoop.tmp.dir always set to /tmp/hadoop-hive in job.xml. This 
causes problems on a multi-tenant Hadoop cluster: the tmp folder is owned by 
whichever user runs a job first, and other users then fail to write to it.

E.g. user1 runs a job and /tmp/hadoop-hive is created on the worker node, 
owned by user1. Subsequently user2 tries to run a job, and it fails because 
user2 has no write permission on /tmp/hadoop-hive/.

The old behavior let each tenant write to its own tmp folder, which was 
secure and contention-free: user1 used /tmp/hadoop-user1 and user2 used 
/tmp/hadoop-user2.
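
To illustrate what I suspect is happening, here is a minimal sketch. It 
assumes hadoop.tmp.dir is left at the stock default of 
/tmp/hadoop-${user.name} and that tasks run as the submitting user. The names 
TmpDirSketch, resolve and propsFor are hypothetical and only mimic variable 
substitution; this is not Hadoop's or Hive's actual code, and I have not 
confirmed that the patch behaves exactly this way.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not Hadoop's Configuration or Hive's ExecDriver.
final class TmpDirSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replace each ${name} with its value from props; unknown names are left untouched.
    static String resolve(String value, Map<String, String> props) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = props.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Stand-in for the system properties of a process running as the given OS user.
    static Map<String, String> propsFor(String user) {
        Map<String, String> props = new HashMap<>();
        props.put("user.name", user);
        return props;
    }

    public static void main(String[] args) {
        String template = "/tmp/hadoop-${user.name}";

        // Old behavior: the unresolved template travels in job.xml and is
        // resolved on the worker, in a process running as the end user.
        System.out.println("late, user1:  " + resolve(template, propsFor("user1")));
        System.out.println("late, user2:  " + resolve(template, propsFor("user2")));

        // New behavior (as I understand it): the value is resolved once in the
        // submitting process, which runs as "hive", so the literal path is
        // shipped and nothing is left to substitute on the worker.
        String shipped = resolve(template, propsFor("hive"));
        System.out.println("early, user1: " + resolve(shipped, propsFor("user1")));
        System.out.println("early, user2: " + resolve(shipped, propsFor("user2")));
    }
}
```

With late resolution each user gets a separate directory; with early 
resolution both users end up on /tmp/hadoop-hive, which matches what we see 
in job.xml.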

 

Thanks

Puneet

> System properties in job configuration not resolved when submitting MR job
> --------------------------------------------------------------------------
>
>                 Key: HIVE-18858
>                 URL: https://issues.apache.org/jira/browse/HIVE-18858
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>         Environment: Hadoop 3.0.0
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HIVE-18858.1.patch, HIVE-18858.2.patch, 
> HIVE-18858.3.patch
>
>
> Since [this hadoop 
> commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1]
>  that was first released in 3.0.0, Configuration has a restricted mode that 
> disables the resolution of system properties (which otherwise happens when a 
> configuration option is retrieved).
> This leads to test failures when switching to Hadoop 3.0.0 (instead of 
> 3.0.0-beta1), since we're relying on the [substitution of 
> test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37]
>  during the [maven 
> build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83].
>  See test results on HIVE-18327.
> When we're passing job configurations to Hadoop, I believe there's no way to 
> disable the restricted mode, since we go through some Hadoop MR calls first, 
> see here:
> {code}
> "HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
>   java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
>         - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
>         at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
>         at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
>         at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
>         at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
>         at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
>         at java.security.AccessController.doPrivileged(AccessController.java:-1)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
>         at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
>         at java.security.AccessController.doPrivileged(AccessController.java:-1)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
>         at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
>         at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
>         at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>         at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
>         at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
>         at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
>         at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
>         at java.security.AccessController.doPrivileged(AccessController.java:-1)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>         at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> I suggest resolving all variables in ExecDriver before passing the 
> configuration to Hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
