[jira] [Commented] (HIVE-7186) Unable to perform join on table

Alex Nastetsky (JIRA) Fri, 20 Jun 2014 07:18:11 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038830#comment-14038830
 ]


Alex Nastetsky commented on HIVE-7186:
--------------------------------------

I just saw a similar problem with a different stack trace. This time, the 
join got to the very end of the job and failed as it finished:
{code}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.EOFException: Premature EOF: no length prefix available
        at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:514)
        at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.serviceStop(JobHistoryEventHandler.java:332)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        at org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52)
        at org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80)
        at org.apache.hadoop.service.CompositeService.stop(CompositeService.java:159)
        at org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:132)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:548)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:599)
Caused by: java.io.EOFException: Premature EOF: no length prefix available
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:962)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:930)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1031)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
{code}

> Unable to perform join on table
> -------------------------------
>
>                 Key: HIVE-7186
>                 URL: https://issues.apache.org/jira/browse/HIVE-7186
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>         Environment: Hortonworks Data Platform 2.0.6.0
>            Reporter: Alex Nastetsky
>
> Occasionally, a table will start exhibiting behavior that will prevent it 
> from being used in a JOIN. 
> When doing a map join, it will just stall at "Starting to launch local task 
> to process map join; ".
> When doing a regular join, it will make progress but then error out with an 
> IndexOutOfBoundsException:
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IndexOutOfBoundsException
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:365)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:842)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
>         ... 9 more
> Caused by: java.lang.IndexOutOfBoundsException
>         at java.nio.Buffer.checkIndex(Buffer.java:532)
>         at java.nio.ByteBufferAsIntBufferL.put(ByteBufferAsIntBufferL.java:131)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1153)
>         at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:586)
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:372)
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:334)
>         ... 15 more
>         
> Doing simple selects against this table works fine and does not show any 
> apparent problems with the data.
> Assume that the table in question is called tableA and was created by queryA.
> Doing either of the following has helped resolve the issue in the past.
> 1) create table tableB as select * from tableA;
>   Then just use tableB instead in the JOIN.
> 2) regenerate tableA using queryA
>   Then use tableA in the JOIN again. It usually works the second time.
>   
> When doing a "describe formatted" on the tables, the totalSize will be 
> different between the original tableA and tableB, and sometimes (but not 
> always) between the original tableA and the regenerated tableA. The numRows 
> will be the same across all versions of the tables.
> This problem cannot be reproduced consistently, but the issue always happens 
> when we try to use an affected table in a JOIN.
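
The manual "describe formatted" comparison described above could be scripted along these lines. This is only a rough sketch, not a confirmed diagnostic: it assumes the output has already been captured as text and that each statistic appears on its own line as the key followed by whitespace and an integer. The function names parse_table_stats and looks_affected are made up for illustration.

```python
import re

# Statistics named in the issue description.
STAT_KEYS = ("numRows", "totalSize")


def parse_table_stats(describe_output):
    """Pull numRows and totalSize out of captured `describe formatted` text.

    Assumption: each statistic is printed as the key, then whitespace, then
    an integer. Adjust the pattern if your output is laid out differently.
    """
    stats = {}
    for key in STAT_KEYS:
        match = re.search(r"\b" + key + r"\s+(\d+)", describe_output)
        if match:
            stats[key] = int(match.group(1))
    return stats


def looks_affected(original, copy):
    """Return True when both tables report the same numRows but a different
    totalSize, which is the pattern reported for tableA versus tableB."""
    for stats in (original, copy):
        if any(key not in stats for key in STAT_KEYS):
            return False  # not enough information to compare
    return (original["numRows"] == copy["numRows"]
            and original["totalSize"] != copy["totalSize"])


if __name__ == "__main__":
    # Hypothetical captured output; the values are invented for illustration.
    table_a = "numRows             \t1000\ntotalSize           \t52341\n"
    table_b = "numRows             \t1000\ntotalSize           \t52298\n"
    print(looks_affected(parse_table_stats(table_a), parse_table_stats(table_b)))
```

A size mismatch with identical row counts only mirrors the observation in the description. It does not by itself prove that a table will fail in a JOIN.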



--
This message was sent by Atlassian JIRA
(v6.2#6252)
