I solved the problem by using a fully qualified path for
hive.exec.scratchdir, after which the umask trick worked. It turns out
that Hive was creating a different directory (on HDFS) than the one
MapReduce was trying to write into, which is why the umask had no
effect. This remains a nasty workaround, and I wish someone would say
how to do this right!
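For anyone hitting the same thing, here is a rough sketch of why the qualified path matters. It is not Hadoop's actual code, only an illustration of the idea that a path without a scheme is resolved against whatever default filesystem the reading process has configured. The function name qualify and the address hdfs://namenode:8020 are made up for the example.

```python
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Illustrative only: mimic resolving a path against a default filesystem.

    A path that already carries a scheme (hdfs://, file://) is returned
    unchanged; a bare path is prefixed with default_fs.
    """
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical values: the same bare scratch dir seen by two processes
# whose default filesystems differ.
bare = "/tmp/hive-scratch"
print(qualify(bare, "hdfs://namenode:8020"))  # resolved on HDFS
print(qualify(bare, "file://"))               # resolved on the local disk

# A fully qualified path resolves to the same place for both.
full = "hdfs://namenode:8020/tmp/hive-scratch"
print(qualify(full, "hdfs://namenode:8020") == qualify(full, "file://"))
```

With a bare path the two processes can end up pointing at two different directories, so opening up permissions on one of them does nothing for the other.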
Quoting yabou...@uwaterloo.ca:
Thanks for the reply, Tim. It is writable by all (permission 777). As
a side note, I have now discovered that the MapReduce task spawned
by the RCFileOutputDriver sets mapred.output.dir to a folder
under file:// regardless of fs.default.name. This might be
expected behaviour, but I just wanted to note it.
Quoting Tim Havens <timhav...@gmail.com>:
Make sure /home/yaboulnaga/tmp/hive-scratch/ is writable by your
processes.
On Mon, Nov 26, 2012 at 10:07 AM, <yabou...@uwaterloo.ca> wrote:
Hello,
I'm using Cloudera's CDH4 with Hive 0.9 and Hive Server 2. I am trying to
load data into Hive using the JDBC driver (the one distributed with
Cloudera CDH4, "org.apache.hive.jdbc.HiveDriver"). I can create the
staging table and LOAD LOCAL into it. However, when I try to insert data
into a table with Columnar SerDe stored as RCFILE, I get an error caused by
file permissions. I don't think that the SerDe or the STORED AS parameters
have anything to do with the problem, but I mention them for completeness.
The problem is that Hive creates a temporary file in its scratch folder
(local) owned by hive:hive with permissions 755, then passes it as input
to a mapper running as the user mapred:mapred. The mapper then tries to
create something inside the input folder (it could probably do this elsewhere),
and the following exception is thrown:
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Mkdirs failed to create file:/home/yaboulnaga/tmp/hive-scratch/hive_2012-11-26_10-46-44_887_2004468370569495405/_task_tmp.-ext-10002
    at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:709)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
As you might have noticed, I moved the scratch folder to a directory under
my home dir so that I can give this directory 777 permissions. The idea was
to use hive.files.umask.value of 0000 to cause subdirectories to inherit
the same open permission (not the best workaround, but wouldn't hurt on my
local machine). Unfortunately this didn't work even when I added a umask
0000 to /etc/init.d/hiveserver2. Can someone please tell me what's the
right way to do this? I mean create a table and then insert values into it!
The Hive QL statements I use are very similar to the ones in the tutorials
about loading data.
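To make the umask idea concrete, here is a small local-filesystem sketch, not anything Hive itself runs. It shows how the umask of the creating process decides whether a new directory ends up 755 or 777. The helper name mode_after_mkdir is made up for the example, and it assumes a POSIX system.

```python
import os
import stat
import tempfile


def mode_after_mkdir(umask_value: int) -> int:
    """Create a directory requesting mode 777 under the given umask and
    return the permission bits it actually received."""
    with tempfile.TemporaryDirectory() as parent:
        previous = os.umask(umask_value)  # os.umask returns the old value
        try:
            child = os.path.join(parent, "scratch")
            os.mkdir(child, 0o777)
            return stat.S_IMODE(os.stat(child).st_mode) & 0o777
        finally:
            os.umask(previous)  # always restore the process umask


print(oct(mode_after_mkdir(0o022)))  # 0o755 on a typical POSIX filesystem
print(oct(mode_after_mkdir(0o000)))  # 0o777 on a typical POSIX filesystem
```

Note that the umask belongs to the process doing the creating, so changing it for one daemon only affects the directories that daemon itself creates.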
Cheers!
-- Younos
--
"The whole world is you. Yet you keep thinking there is something else." -
Xuefeng Yicun 822-902 A.D.
Tim R. Havens
Google Phone: 573.454.1232
ICQ: 495992798
ICBM: 37°51'34.79"N 90°35'24.35"W
ham radio callsign: NW0W
Best regards,
Younos Aboulnaga
Masters candidate
David Cheriton school of computer science
University of Waterloo
http://cs.uwaterloo.ca
E-Mail: younos.abouln...@uwaterloo.ca
Mobile: +1 (519) 497-5669