[ https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14218286#comment-14218286 ]
Ryan Blue commented on HIVE-8909:
---------------------------------

Yes. It implements the rules for reading lists in existing data (see the sketches below):

1. If the repeated field is not a group, then its type is the element type and elements are required.
2. If the repeated field is a group with multiple fields, then its type is the element type and elements are required.
3. If the repeated field is a group with one field and is named either "array" or uses the LIST-annotated group's name with "_tuple" appended, then the repeated type is the element type and elements are required.
4. Otherwise, the repeated field's type is the element type with the repeated field's repetition.

It also structures the converters to match the other projects. LIST and MAP will use ElementConverter and KeyValueConverter, and the list version supports these rules while matching the ArrayWritable structure expected by the SerDe (confirmed by tests that pass in both trunk and this patch).

Repeated groups that aren't annotated are deserialized into lists as before, but I changed this to put less work on the DataWritableGroupConverter, which is now called StructConverter. Struct needs to support repeated inner groups, but rather than keeping a second array of objects, it passes its start() and end() calls to the repeated children converters, which use them to add the correct object to the struct. It's an easier-to-follow method that produces the same result. (By all means, please verify this!)

> Hive doesn't correctly read Parquet nested types
> ------------------------------------------------
>
>         Key: HIVE-8909
>         URL: https://issues.apache.org/jira/browse/HIVE-8909
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Ryan Blue
>    Assignee: Ryan Blue
> Attachments: HIVE-8909-1.patch
>
>
> Parquet's Avro and Thrift object models don't produce the same parquet type
> representation for lists and maps that Hive does. In the Parquet community,
> we've defined what should be written and backward-compatibility rules for
> existing data written by parquet-avro and parquet-thrift in PARQUET-113. We
> need to implement those rules in the Hive Converter classes.
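For readers less familiar with the PARQUET-113 rules quoted above, here is a minimal, self-contained illustration of the rule 1-3 decision (whether the repeated field inside a LIST-annotated group is itself the element type, as opposed to the synthetic wrapper of rule 4). It uses the parquet-mr schema API with the pre-Apache package names of the parquet 1.5/1.6 era; the class, method, and schema names ("ListCompatSketch", "repeatedTypeIsElement", "ids", "bag", "array_element") are made up for the example and are not taken from HIVE-8909-1.patch.

{code:java}
import parquet.schema.GroupType;
import parquet.schema.MessageType;
import parquet.schema.MessageTypeParser;
import parquet.schema.Type;

public class ListCompatSketch {

  // Rules 1-3: is the repeated field itself the element type? If not (rule 4),
  // the repeated group is a synthetic wrapper and the element is its single child.
  static boolean repeatedTypeIsElement(Type repeatedType, String listName) {
    if (repeatedType.isPrimitive()) {
      return true;                                                     // rule 1
    }
    GroupType group = repeatedType.asGroupType();
    if (group.getFieldCount() > 1) {
      return true;                                                     // rule 2
    }
    String name = group.getName();
    return name.equals("array") || name.equals(listName + "_tuple");   // rule 3
  }

  public static void main(String[] args) {
    // Old parquet-avro style: the repeated primitive is the element (rule 1).
    MessageType avroStyle = MessageTypeParser.parseMessageType(
        "message m { optional group ids (LIST) { repeated int32 array; } }");
    // Old parquet-hive style: "bag" is a wrapper; the element is its single child (rule 4).
    MessageType hiveStyle = MessageTypeParser.parseMessageType(
        "message m { optional group ids (LIST) { " +
        "repeated group bag { optional int32 array_element; } } }");

    Type avroRepeated = avroStyle.getType(0).asGroupType().getType(0);
    Type hiveRepeated = hiveStyle.getType(0).asGroupType().getType(0);

    System.out.println(repeatedTypeIsElement(avroRepeated, "ids")); // true
    System.out.println(repeatedTypeIsElement(hiveRepeated, "ids")); // false
  }
}
{code}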
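And a stripped-down sketch of the start()/end() forwarding idea described for the struct converter, assuming a single unannotated repeated int32 field and using plain Java lists in place of Hive's Writable types. The class and method names here ("StructConverterSketch", "RepeatedIntConverterSketch", "parentStart", "parentEnd") are hypothetical and are not the converters in the patch; the point is only that the struct forwards its start() and end() calls so the repeated child can reset and then hand back its list, instead of the struct keeping a second array of objects.

{code:java}
import java.util.ArrayList;
import java.util.List;

import parquet.io.api.Converter;
import parquet.io.api.GroupConverter;
import parquet.io.api.PrimitiveConverter;

class StructConverterSketch extends GroupConverter {
  final Object[] currentStruct;                  // one slot per struct field
  private final Converter[] fieldConverters;
  private final List<RepeatedIntConverterSketch> repeatedFields =
      new ArrayList<RepeatedIntConverterSketch>();

  StructConverterSketch(int fieldCount, int repeatedIntFieldIndex) {
    currentStruct = new Object[fieldCount];
    fieldConverters = new Converter[fieldCount];
    RepeatedIntConverterSketch repeated =
        new RepeatedIntConverterSketch(this, repeatedIntFieldIndex);
    fieldConverters[repeatedIntFieldIndex] = repeated;
    repeatedFields.add(repeated);
    // Converters for the non-repeated fields are omitted for brevity.
  }

  @Override
  public Converter getConverter(int fieldIndex) {
    return fieldConverters[fieldIndex];
  }

  @Override
  public void start() {
    // Forward start() so repeated children can reset their element buffers.
    for (RepeatedIntConverterSketch child : repeatedFields) {
      child.parentStart();
    }
  }

  @Override
  public void end() {
    // Forward end() so repeated children can place their finished lists into the struct.
    for (RepeatedIntConverterSketch child : repeatedFields) {
      child.parentEnd();
    }
  }
}

// Accumulates values of an unannotated repeated int32 field into a list.
class RepeatedIntConverterSketch extends PrimitiveConverter {
  private final StructConverterSketch parent;
  private final int fieldIndex;
  private List<Integer> elements;

  RepeatedIntConverterSketch(StructConverterSketch parent, int fieldIndex) {
    this.parent = parent;
    this.fieldIndex = fieldIndex;
  }

  void parentStart() {
    elements = new ArrayList<Integer>();          // fresh list for each record
  }

  void parentEnd() {
    parent.currentStruct[fieldIndex] = elements;  // hand the finished list to the struct
  }

  @Override
  public void addInt(int value) {
    elements.add(value);                          // one call per repeated value
  }
}
{code}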