Amit,

Are you executing your SELECT for the conversion to ORC via beeline, or the
Hive CLI? From your logs, it appears that you do not have permission in HDFS
to write the resulting ORC data. Check permissions in HDFS to make sure your
user can write to the Hive warehouse directory.
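
For example, something like the following (a rough sketch; it assumes the
default warehouse location, so adjust the path to whatever
hive.metastore.warehouse.dir is set to in your hive-site.xml):

$ hdfs dfs -ls /user/hive/warehouse            # check who owns the warehouse and table dirs
$ hdfs dfs -ls /tmp                            # the hive-<user> scratch dir must be writable too
$ hdfs dfs -chmod -R 775 /user/hive/warehouse  # or chown to your user, if ownership is wrong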

I forwarded you a previous thread regarding Hive 0.12 protobuf issues.
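
In short, the Hive 0.12 release was built against protobuf 2.4.x while
Hadoop 2.x ships protobuf 2.5.0, and classes generated by protobuf 2.4 throw
exactly that "This is supposed to be overridden by subclasses" error on a
2.5 runtime. You should be able to spot the clashing jars with something
like this (paths are illustrative; adjust to your install):

$ find $HIVE_HOME/lib $HADOOP_HOME/share -name 'protobuf-java-*.jar'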

Regards,

Bryan Jeffrey
On Apr 4, 2014 8:14 PM, "Amit Tewari" <amittew...@gmail.com> wrote:

 I checked out and built Hive 0.13 and tried it, with the same result, i.e.
File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.000000_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
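
A note on that message: it generally means the lone datanode could not
accept the block at all, for example because it is out of disk space or was
never registered with the namenode. A quick way to check, assuming a
standard single-node install (the data dir path below is only an example):

$ hdfs dfsadmin -report   # live datanodes, configured vs. remaining capacity
$ df -h /hadoop/dfs/data  # free space under dfs.datanode.data.dir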



I also tried it with the release version of Hive 0.12, and that gave me a
different error, related to a protobuf incompatibility (pasted below).

So at this point I can't run even the basic use case with ORC storage.

Any pointers would be very helpful.

Amit

Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
    at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
    at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
    at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
    at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
    at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
    at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
    at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)



On 4/4/14 2:28 PM, Amit Tewari wrote:

Hi All,

I am just trying to run some simple tests to see the speedup in Hive queries
with Hive 0.14 (this morning's trunk). I started with the sample test case,
since I first wanted to see how much speedup the ORC format gives.

However, for some reason I can't insert data into a table stored as ORC.
It fails with the exception "File <filename> could only be replicated to 0
nodes instead of minReplication (=1). There are 1 datanode(s) running and
no node(s) are excluded in this operation".

I can, however, insert data into a text table without any issue.

I have included the steps below.

Any pointers would be appreciated.

Amit



I have a single-node setup with minimal settings. jps output is as follows:
$ jps
9823 NameNode
12172 JobHistoryServer
9903 DataNode
14895 Jps
11796 ResourceManager
12034 NodeManager
*Running Hadoop 2.2.0 with YARN.*



Step 1

CREATE TABLE pokes (foo INT, bar STRING);

Step 2

LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE
pokes;

Step 3

CREATE TABLE pokes_1 (foo INT, bar STRING);

Step 4

INSERT INTO TABLE pokes_1 SELECT * FROM pokes;

Step 5

CREATE TABLE pokes_orc (foo INT, bar STRING) STORED AS ORC;

Step 6

INSERT INTO pokes_orc SELECT * FROM pokes; <FAILED with the exception below>

File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.000000_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:168)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:843)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
    ... 8 more


Step 7

INSERT OVERWRITE TABLE pokes_1 SELECT * FROM pokes; <Success>
