Hello,

I would like to force all MapReduce jobs launched from the Hive shell to run as 
the HDFS user who started them rather than as the "hive" user. For instance, 
user testuser1 is logged into the edge node under the unix account of the same 
name, starts a Hive shell, and kicks off a query. The resulting job always runs 
as the "hive" user in HDFS, so all temporary files end up in the hive user's 
HDFS directory. This is a problem.

I'm using the following for the temp file location: hive.exec.scratchdir = 
/user/${user.name}/.hive-temp/ . The intent is that every user writes their 
own temp files and intermediate job files into their own user directory so 
that we can track per-user disk usage correctly. But since Hive jobs run as 
the "hive" user, the temp files always end up in /user/hive/.hive-temp/.
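For context, here is a sketch of the relevant hive-site.xml fragments. The 
scratchdir entry is what I have now; the doAs property is an assumption on my 
part about impersonation, and I believe it only applies when queries are 
submitted through HiveServer2 rather than the local CLI:

```xml
<!-- Current setting: per-user scratch dirs, as described above -->
<property>
  <name>hive.exec.scratchdir</name>
  <value>/user/${user.name}/.hive-temp/</value>
</property>

<!-- Assumption: when queries go through HiveServer2, this property controls
     impersonation. With it set to true, jobs should run as the connecting
     user rather than as "hive", so ${user.name} would resolve per user. -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
```

If impersonation is enabled, I understand the hive service account also needs 
matching hadoop.proxyuser.hive.hosts / hadoop.proxyuser.hive.groups entries in 
core-site.xml, but I have not verified that on our cluster.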

Is it possible to have Hive place its temp files under the directory of the 
user who runs the job?

Thanks,
-Shawn

Shawn Higgins
Systems Engineer
Thomson Reuters

shawn.higg...@thomsonreuters.com
thomsonreuters.com
