[ 
https://issues.apache.org/jira/browse/HIVE-12202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967376#comment-14967376
 ] 

Elliot West commented on HIVE-12202:
------------------------------------

I've checked to see that {{AcidUtils.serializeDeltas}} is being used correctly 
in conjunction with {{AcidUtils.deserializeDeltas}}. It appears that 
{{serializeDeltas}} does indeed create {{DeltaMetaData}} instances with an 
empty list for the statement IDs for delta paths containing only 
{{$startTxnId}} and {{$endTxnId}}. However, the deserialization process in 
{{AcidInputFormat.DeltaMetaData.readFields(DataInput)}} incorrectly sets 
{{stmtIds}} to {{null}} at line 152 if no statement count was serialized. As a 
result, {{AcidUtils.deserializeDeltas}} gets tripped up by an NPE at line 371.
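
To illustrate one possible shape of a fix, here is a minimal, self-contained sketch. It is not the Hive source: {{DeltaMetaDataSketch}}, its field layout, its wire format (two longs, a count, then the IDs) and the {{roundTrip}} helper are all assumptions made for illustration. The only point it shows is that {{readFields}} can always allocate a list, so a delta with no statement IDs deserializes to an empty list rather than {{null}}.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for AcidInputFormat.DeltaMetaData.
// The field names and wire format are assumptions, not copied from Hive.
class DeltaMetaDataSketch {
  private long minTxnId;
  private long maxTxnId;
  // Initialised to an empty list so the getter never returns null.
  private List<Integer> stmtIds = new ArrayList<>();

  DeltaMetaDataSketch() {}

  DeltaMetaDataSketch(long minTxnId, long maxTxnId, List<Integer> stmtIds) {
    this.minTxnId = minTxnId;
    this.maxTxnId = maxTxnId;
    this.stmtIds = new ArrayList<>(stmtIds);
  }

  List<Integer> getStmtIds() {
    return stmtIds;
  }

  void write(DataOutput out) throws IOException {
    out.writeLong(minTxnId);
    out.writeLong(maxTxnId);
    out.writeInt(stmtIds.size());
    for (int id : stmtIds) {
      out.writeInt(id);
    }
  }

  void readFields(DataInput in) throws IOException {
    minTxnId = in.readLong();
    maxTxnId = in.readLong();
    int numStatements = in.readInt();
    // Allocate the list unconditionally: a count of zero (a legacy
    // delta_$startTxnId_$endTxnId path) yields an empty list, not null.
    stmtIds = new ArrayList<>(Math.max(numStatements, 0));
    for (int i = 0; i < numStatements; i++) {
      stmtIds.add(in.readInt());
    }
  }

  // Illustrative helper: serialize to bytes and read back into a new instance.
  static DeltaMetaDataSketch roundTrip(DeltaMetaDataSketch original) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    original.write(new DataOutputStream(bytes));
    DeltaMetaDataSketch copy = new DeltaMetaDataSketch();
    copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    return copy;
  }
}
```

With this shape, a caller that checks {{getStmtIds().isEmpty()}} would not need an additional null check, although adding one on the caller side as well would be a reasonable defensive measure.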

> NPE thrown when reading legacy ACID delta files
> -----------------------------------------------
>
>                 Key: HIVE-12202
>                 URL: https://issues.apache.org/jira/browse/HIVE-12202
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0
>            Reporter: Elliot West
>            Assignee: Elliot West
>              Labels: transactions
>
> When reading legacy ACID deltas of the form {{delta_$startTxnId_$endTxnId}} a 
> {{NullPointerException}} is thrown on:
> {code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
> if(dmd.getStmtIds().isEmpty()) {
> {code}
> The older ACID data format (pre-Hive 1.3.0), which does not include the 
> statement ID, and code written for that format should still be supported. 
> Therefore the above condition should also include a null check or, 
> alternatively, {{AcidInputFormat.DeltaMetaData}} should never return null and 
> should instead return an empty list in this specific scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
