[ https://issues.apache.org/jira/browse/HIVE-12202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967376#comment-14967376 ]
Elliot West commented on HIVE-12202:
------------------------------------

I've checked that {{AcidUtils.serializeDeltas}} is being used correctly in conjunction with {{AcidUtils.deserializeDeltas}}. It appears that {{serializeDeltas}} does indeed create {{DeltaMetaData}} instances with an empty list for the statement IDs for delta paths containing only {{$startTxnId}} and {{$endTxnId}}. However, the deserialization process in {{AcidInputFormat.DeltaMetaData.readFields(DataInput)}} incorrectly sets {{stmtIds}} to {{null}} at line 152 if no statement count was serialized. Hence {{AcidUtils.deserializeDeltas}} is then tripped up by an NPE at line 371.

> NPE thrown when reading legacy ACID delta files
> -----------------------------------------------
>
>                 Key: HIVE-12202
>                 URL: https://issues.apache.org/jira/browse/HIVE-12202
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0
>            Reporter: Elliot West
>            Assignee: Elliot West
>              Labels: transactions
>
> When reading legacy ACID deltas of the form {{delta_$startTxnId_$endTxnId}} a
> {{NullPointerException}} is thrown on:
> {code:title=org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas#371}
> if(dmd.getStmtIds().isEmpty()) {
> {code}
> The older ACID data format (pre-Hive 1.3.0) does not include the
> statement ID, and code written for that format should still be supported.
> Therefore the above condition should also include a null check, or
> alternatively {{AcidInputFormat.DeltaMetaData}} should never return null and
> should instead return an empty list in this specific scenario.
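For illustration, here is a minimal sketch of the second option from the description (never returning null from the metadata object). It is not the Hive source: {{DeltaMetaDataSketch}}, its fields and the {{roundTrip}} helper are hypothetical stand-ins, and the wire layout (two longs, a count, then the IDs) is an assumption made for the example. The point is only that {{readFields}} resets the list to empty instead of assigning null, so a legacy delta without statement IDs deserializes to an empty list.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AcidInputFormat.DeltaMetaData; not the Hive class.
class DeltaMetaDataSketch {
  long minTxnId;
  long maxTxnId;
  // Never null: an empty list represents a legacy delta with no statement IDs.
  private final List<Integer> stmtIds = new ArrayList<>();

  List<Integer> getStmtIds() {
    return stmtIds;
  }

  // Assumed layout: minTxnId, maxTxnId, statement count, then each statement ID.
  void write(DataOutput out) throws IOException {
    out.writeLong(minTxnId);
    out.writeLong(maxTxnId);
    out.writeInt(stmtIds.size());
    for (int id : stmtIds) {
      out.writeInt(id);
    }
  }

  void readFields(DataInput in) throws IOException {
    minTxnId = in.readLong();
    maxTxnId = in.readLong();
    stmtIds.clear(); // reset to empty rather than assigning null
    int numStatements = in.readInt();
    for (int i = 0; i < numStatements; i++) {
      stmtIds.add(in.readInt());
    }
  }

  // Serializes the given instance to bytes and reads it back into a fresh one.
  static DeltaMetaDataSketch roundTrip(DeltaMetaDataSketch original) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      original.write(new DataOutputStream(bytes));
      DeltaMetaDataSketch copy = new DeltaMetaDataSketch();
      copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    DeltaMetaDataSketch legacy = new DeltaMetaDataSketch();
    legacy.minTxnId = 5L;
    legacy.maxTxnId = 7L;
    DeltaMetaDataSketch copy = roundTrip(legacy);
    // A check of the form dmd.getStmtIds().isEmpty() is now safe for legacy deltas.
    System.out.println(copy.getStmtIds().isEmpty()); // prints true
  }
}
```

With this shape, a caller's {{getStmtIds().isEmpty()}} check needs no additional null guard. Adding a null check at the call site is the other option mentioned in the description.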