[ https://issues.apache.org/jira/browse/HIVE-25893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Soumyakanti Das reassigned HIVE-25893:
--------------------------------------

> NPE when reading Parquet data because ColumnVector isNull[] is not updated
> --------------------------------------------------------------------------
>
>                 Key: HIVE-25893
>                 URL: https://issues.apache.org/jira/browse/HIVE-25893
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Soumyakanti Das
>            Assignee: Soumyakanti Das
>            Priority: Major
>
> In
> [VectorizedListColumnReader.java|https://github.com/apache/hive/blob/595f3bc9d612f02581bd3377ee0107efd6553ae6/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java],
> {{isNull[]}} is used in the comparison methods (e.g.
> [columnVectorsDifferNullForSameIndex|https://github.com/apache/hive/blob/595f3bc9d612f02581bd3377ee0107efd6553ae6/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java#L524]).
> However, {{isNull}} is always {{false}} because it is never updated in
> [getChildData|https://github.com/apache/hive/blob/595f3bc9d612f02581bd3377ee0107efd6553ae6/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java#L401].
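
The failure mode described in the issue can be illustrated with a minimal, self-contained sketch. It does not use Hive's classes: `BytesVector`, `fill`, and `sameAt` are hypothetical stand-ins, assumed here only to mimic a column vector that stores values next to an `isNull[]` flag array and a comparison helper that trusts those flags. It is not the actual code in `VectorizedListColumnReader`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class IsNullSketch {
    // Hypothetical stand-in for a bytes column vector: values plus per-row null flags.
    public static final class BytesVector {
        public final byte[][] vector;
        public final boolean[] isNull;

        public BytesVector(int size) {
            vector = new byte[size][];
            isNull = new boolean[size]; // Java initializes every flag to false
        }
    }

    // Copies values into a vector. With markNulls == false the flags are never
    // updated, which mirrors the situation the issue describes.
    public static BytesVector fill(String[] values, boolean markNulls) {
        BytesVector v = new BytesVector(values.length);
        for (int i = 0; i < values.length; i++) {
            if (values[i] == null) {
                if (markNulls) {
                    v.isNull[i] = true;
                }
            } else {
                v.vector[i] = values[i].getBytes(StandardCharsets.UTF_8);
            }
        }
        return v;
    }

    // Null-aware comparison that relies on isNull[] before touching the data.
    public static boolean sameAt(BytesVector a, BytesVector b, int i) {
        if (a.isNull[i] != b.isNull[i]) {
            return false;
        }
        if (a.isNull[i]) {
            return true; // both entries are flagged null
        }
        // Dereferences the entries: throws NullPointerException when an entry
        // is null but its flag still says "not null".
        return a.vector[i].length == b.vector[i].length
            && Arrays.equals(a.vector[i], b.vector[i]);
    }

    public static void main(String[] args) {
        String[] row = {"a", null};

        BytesVector stale = fill(row, false);
        try {
            sameAt(stale, stale, 1);
        } catch (NullPointerException e) {
            System.out.println("stale isNull[]: NullPointerException");
        }

        BytesVector updated = fill(row, true);
        System.out.println("updated isNull[]: " + sameAt(updated, updated, 1));
    }
}
```

In this sketch the only difference between the failing and the working path is whether the null flag is set while the data is copied, which is the kind of update the issue reports as missing.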
> The stale {{isNull}} values could result in a NullPointerException such as:
> {code}
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.compareBytesColumnVector(VectorizedListColumnReader.java:506)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.compareColumnVector(VectorizedListColumnReader.java:432)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.setIsRepeating(VectorizedListColumnReader.java:367)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.convertValueListToListColumnVector(VectorizedListColumnReader.java:360)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:83)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:438)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:377)
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:100)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:375)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)