[ https://issues.apache.org/jira/browse/HIVE-22551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-22551:
--------------------------------
    Attachment: HIVE-22551.01.patch

> BytesColumnVector initBuffer should clean vector and length consistently
> -------------------------------------------------------------------------
>
>                 Key: HIVE-22551
>                 URL: https://issues.apache.org/jira/browse/HIVE-22551
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: HIVE-22551.01.patch, HIVE-22551.01.patch,
>                      HIVE-22551.01.patch, HIVE-22551.01.patch, HIVE-22551.01.patch,
>                      HIVE-22551.01.patch
>
>
> VectorExtractRow relies on vector[i] and length[i] being consistent
> within the BytesColumnVector; otherwise it throws an exception:
> https://github.com/apache/hive/blob/edc53cc0d95e983c371a224943dd866210f0c65c/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExtractRow.java#L275
> There is a scenario in which only vector[i] is cleaned when the column
> vector is reused, and then this kind of exception can be thrown.
> The issue was reproduced with
> [LlapDump|https://github.com/apache/hive/blob/master/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapDump.java]
> using String columns (longer than 16 chars):
> {code}
> 19/10/17 15:55:49 ERROR llap.LlapArrowRowRecordReader: Failed to fetch Arrow batch
> java.lang.RuntimeException: STRING entry: batchIndex 45
>       at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.BytesReadError(VectorExtractRow.java:488)
>       at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:294)
>       at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:193)
>       at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:483)
>       at org.apache.hadoop.hive.ql.io.arrow.Deserializer.deserialize(Deserializer.java:125)
>       at org.apache.hadoop.hive.ql.io.arrow.ArrowColumnarBatchSerDe.deserialize(ArrowColumnarBatchSerDe.java:284)
>       at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:75)
>       at org.apache.hadoop.hive.llap.LlapArrowRowRecordReader.next(LlapArrowRowRecordReader.java:41)
>       at datareader.LlapDump.main(LlapDump.java:124)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
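
The inconsistency described in the issue can be illustrated with a small, self-contained sketch. Everything here is hypothetical: SimpleBytesColumn is a simplified stand-in, not Hive's BytesColumnVector, and the isConsistent check is only an assumed approximation of the kind of validation a reader such as VectorExtractRow might perform. It shows how a reset that clears vector[i] but leaves length[i] untouched produces an entry a reader would reject, while clearing both arrays together does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified model of a bytes column; NOT Hive's BytesColumnVector.
final class SimpleBytesColumn {
    final byte[][] vector; // per-row byte buffers
    final int[] start;     // per-row offset into the buffer
    final int[] length;    // per-row value length

    SimpleBytesColumn(int size) {
        vector = new byte[size][];
        start = new int[size];
        length = new int[size];
    }

    // Store a copy of the value for row i and record its length.
    void setVal(int i, byte[] src) {
        vector[i] = Arrays.copyOf(src, src.length);
        start[i] = 0;
        length[i] = src.length;
    }

    // Inconsistent reset: drops the buffers but keeps stale lengths.
    void resetVectorOnly() {
        Arrays.fill(vector, null);
    }

    // Consistent reset: buffers, offsets and lengths are cleared together.
    void resetConsistently() {
        Arrays.fill(vector, null);
        Arrays.fill(start, 0);
        Arrays.fill(length, 0);
    }

    // Assumed reader-side check: a null buffer with a non-zero length is invalid.
    boolean isConsistent(int i) {
        return vector[i] != null || length[i] == 0;
    }

    public static void main(String[] args) {
        SimpleBytesColumn col = new SimpleBytesColumn(4);
        col.setVal(0, "a string longer than 16 chars".getBytes(StandardCharsets.UTF_8));

        col.resetVectorOnly();
        System.out.println("after vector-only reset: " + col.isConsistent(0));

        col.resetConsistently();
        System.out.println("after consistent reset:  " + col.isConsistent(0));
    }
}
```

In this model the vector-only reset leaves row 0 with a null buffer and a stale non-zero length, which is the state the issue describes; the consistent reset avoids it.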