[ https://issues.apache.org/jira/browse/HIVE-25443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Syed Shameerur Rahman reassigned HIVE-25443:
--------------------------------------------

> Arrow SerDe Cannot serialize/deserialize complex data types When there are more than 1024 values
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-25443
>                 URL: https://issues.apache.org/jira/browse/HIVE-25443
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 3.1.2, 3.1.1, 3.0.0, 3.1.0
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>             Fix For: 4.0.0
>
>
> Complex data types such as MAP and STRUCT cannot be serialized or
> deserialized with the Arrow SerDe when there are more than 1024 values.
> This happens because the ColumnVector is always initialized with a size
> of 1024.
>
> Issue #1:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L213
>
> Issue #2:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L215
>
> Sample unit test to reproduce the case in TestArrowColumnarBatchSerDe:
> {code:java}
> @Test
> public void testListBooleanWithMoreThan1024Values() throws SerDeException {
>   String[][] schema = {
>       {"boolean_list", "array<boolean>"},
>   };
>
>   // 1025 rows: one more than the ColumnVector size of 1024
>   Object[][] rows = new Object[1025][1];
>   for (int i = 0; i < 1025; i++) {
>     rows[i][0] = new BooleanWritable(true);
>   }
>
>   initAndSerializeAndDeserialize(schema, toList(rows));
> }
> {code}
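
The failure mode described in the issue (a vector fixed at 1024 slots receiving more than 1024 values) can be illustrated with a small standalone sketch. This is not Hive code: ToyVector, fillFixed and fillSized are hypothetical names made up for illustration, and the issue text does not say how the actual fix sizes the vector. The sketch only assumes that sizing the backing storage from the real value count avoids the overflow.

```java
import java.util.Arrays;

public class VectorSizingSketch {
    // Mirrors the fixed size of 1024 mentioned in the issue.
    static final int DEFAULT_SIZE = 1024;

    /** Hypothetical stand-in for a column vector: a plain backing array. */
    static final class ToyVector {
        boolean[] values;

        ToyVector(int size) {
            values = new boolean[size];
        }

        /** Grows the backing array if needed, preserving existing values. */
        void ensureSize(int size) {
            if (size > values.length) {
                values = Arrays.copyOf(values, size);
            }
        }
    }

    /** Writes valueCount values into a vector fixed at DEFAULT_SIZE.
     *  Returns false if the write runs past the end of the vector. */
    public static boolean fillFixed(int valueCount) {
        ToyVector v = new ToyVector(DEFAULT_SIZE);
        try {
            for (int i = 0; i < valueCount; i++) {
                v.values[i] = true;
            }
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    /** Sizes the vector from the value count before writing.
     *  Returns the resulting capacity. */
    public static int fillSized(int valueCount) {
        ToyVector v = new ToyVector(DEFAULT_SIZE);
        v.ensureSize(valueCount);
        for (int i = 0; i < valueCount; i++) {
            v.values[i] = true;
        }
        return v.values.length;
    }

    public static void main(String[] args) {
        System.out.println(fillFixed(1024)); // fits exactly
        System.out.println(fillFixed(1025)); // one value too many
        System.out.println(fillSized(1025)); // capacity after resizing
    }
}
```

With 1024 values the fixed-size vector is filled exactly; the 1025th value is the first one that no longer fits, which matches the row count used in the sample unit test.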