[ https://issues.apache.org/jira/browse/HIVE-10538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523591#comment-14523591 ]

Peter Slawski commented on HIVE-10538:
--------------------------------------

The NullPointerException occurs because of a mismatch between the hashcodes 
computed for a row in ReduceSinkOperator and in FileSinkOperator. 
ReduceSinkOperator's hashcode is used by the partitioner to distribute rows to 
reducers, while FileSinkOperator's hashcode is used at the reducer to compute 
the row's bucket number. When the two hashcodes disagree, the computed bucket 
number is not one of the bucket numbers expected by that reducer, and this 
causes the NullPointerException.
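To see why an unexpected bucket number surfaces as an NPE rather than a cleaner error, here is a minimal, hypothetical sketch of the writer lookup that FileSinkOperator.findWriterOffset performs with multiFileSpray (the class, map, and bucket numbers below are illustrative, not Hive's actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified sketch (not the actual Hive code): with
// multiFileSpray, each reducer keeps a map from the bucket numbers it
// expects to offsets into its array of open file writers.
public class BucketLookupSketch {
    static final Map<Integer, Integer> bucketMap = new HashMap<>();
    static {
        // Suppose this reducer was assigned buckets 5 and 21.
        bucketMap.put(5, 0);
        bucketMap.put(21, 1);
    }

    // If a row's recomputed bucket number is not in the map, get() returns
    // null, and auto-unboxing null to int throws NullPointerException --
    // the same shape of failure seen at findWriterOffset in the stack trace.
    static int findWriterOffset(int bucketNum) {
        return bucketMap.get(bucketNum);
    }

    public static void main(String[] args) {
        System.out.println(findWriterOffset(5));   // an expected bucket
        try {
            findWriterOffset(7);                   // an unexpected bucket
        } catch (NullPointerException e) {
            System.out.println("NPE on unexpected bucket number");
        }
    }
}
```

A row is only unexpected here when the hashcode used to route it (in ReduceSinkOperator) disagrees with the hashcode used to bucket it (in FileSinkOperator).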

ReduceSinkOperator computed a different hashcode because its bucketNumber 
field was initialized to a valid bucket number, 0. The attached patch 
initializes the field to an invalid value, -1, so that ReduceSinkOperator 
computes the same hashcode as FileSinkOperator.
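Why the sentinel value matters can be illustrated with a small, hypothetical sketch (the hash mixing below is simplified and is not Hive's actual hashcode computation): if the field defaults to a valid bucket number, 0, the sending side folds a bucket number into the hash even when none was set, while the receiving side hashes the bare key, so the two sides disagree.

```java
// Hypothetical, simplified sketch of the sentinel bug (not Hive's code).
public class BucketSentinelSketch {
    // Mix the bucket number into the key hash only when it was actually set;
    // a negative value means "no bucket number".
    static int computeHash(int keyHash, int bucketNumber) {
        return bucketNumber >= 0 ? keyHash * 31 + bucketNumber : keyHash;
    }

    public static void main(String[] args) {
        int keyHash = 12345;

        // Buggy default of 0 looks like a real bucket number and perturbs
        // the sender's hash, while the receiver hashes the bare key:
        int senderBuggy = computeHash(keyHash, 0);
        int receiver = computeHash(keyHash, -1);
        System.out.println(senderBuggy == receiver);  // false

        // Defaulting to the invalid value -1 keeps both sides in agreement:
        int senderFixed = computeHash(keyHash, -1);
        System.out.println(senderFixed == receiver);  // true
    }
}
```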

The NPE can be reproduced with a simpler query, which is included as a qtest 
in the patch.
{code}
set hive.enforce.bucketing = true;
set mapred.reduce.tasks = 16;

create table bucket_many(key int, value string) clustered by (key) into 256 buckets;

insert overwrite table bucket_many
select * from src;
{code}

> Fix NPE in FileSinkOperator from hashcode mismatch
> --------------------------------------------------
>
>                 Key: HIVE-10538
>                 URL: https://issues.apache.org/jira/browse/HIVE-10538
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.0.0, 1.2.0
>            Reporter: Peter Slawski
>             Fix For: 1.3.0
>
>         Attachments: HIVE-10538.1.patch
>
>
> A NullPointerException occurs in FileSinkOperator when using bucketed 
> tables and distribute by with multiFileSpray enabled. The following query 
> snippet reproduces the issue:
> {code}
> set hive.enforce.bucketing = true;
> set hive.exec.reducers.max = 20;
> create table bucket_a(key int, value_a string) clustered by (key) into 256 buckets;
> create table bucket_b(key int, value_b string) clustered by (key) into 256 buckets;
> create table bucket_ab(key int, value_a string, value_b string) clustered by (key) into 256 buckets;
> -- Insert data into bucket_a and bucket_b
> insert overwrite table bucket_ab
> select a.key, a.value_a, b.value_b from bucket_a a join bucket_b b on (a.key = b.key) distribute by key;
> {code}
> The following stack trace is logged.
> {code}
> 2015-04-29 12:54:12,841 FATAL [pool-110-thread-1]: ExecReducer (ExecReducer.java:reduce(255)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":"113","_col1":"val_113"}}
>       at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>       at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>       at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>       at org.apache.hadoop.hive.ql.exec.FileSinkOperator.findWriterOffset(FileSinkOperator.java:819)
>       at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:747)
>       at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>       at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>       at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
>       ... 8 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
