[ https://issues.apache.org/jira/browse/HIVE-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navis updated HIVE-5298:
------------------------
    Fix Version/s:     (was: 0.13.0)
                   0.14.0

> AvroSerde performance problem caused by HIVE-3833
> -------------------------------------------------
>
>                 Key: HIVE-5298
>                 URL: https://issues.apache.org/jira/browse/HIVE-5298
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.11.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.14.0
>
>         Attachments: HIVE-5298.1.patch, HIVE-5298.patch
>
>
> HIVE-3833 fixed the targeted problem and made Hive use partition-level
> metadata to initialize the object inspector. In doing that, however, it goes
> through every file under the table to access the partition metadata, which is
> very inefficient, especially when a partition contains multiple files. This
> causes more problems for AvroSerde because AvroSerde initialization accesses
> the schema, which is located on the file system. As a result, before Hive can
> process any data, it needs to access every file for a table, which can take
> long enough to cause job failure because of lack of job progress.
> An improvement can be made so that partition metadata is only accessed once
> per partition.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
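
The "once per partition" idea in the description can be illustrated with a small memoization sketch. This is not the code from the attached patches and does not use Hive's APIs; the class PartitionMetadataCache, its methods, and the file paths are hypothetical names chosen for illustration. It assumes a file's partition can be identified by its parent directory and that the expensive step (for AvroSerde, reading the schema from the file system) can be modeled as a loader function.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration only; not Hive code. */
final class PartitionMetadataCache<M> {
    // Metadata already loaded, keyed by partition directory.
    private final Map<String, M> byPartition = new HashMap<>();
    // Stand-in for the expensive lookup (e.g. reading a schema from the file system).
    private final Function<String, M> loader;
    private int loadCount = 0;

    PartitionMetadataCache(Function<String, M> loader) {
        this.loader = loader;
    }

    /** Assumption: the partition of a file is everything before the last '/'. */
    static String partitionOf(String filePath) {
        int slash = filePath.lastIndexOf('/');
        return slash < 0 ? "" : filePath.substring(0, slash);
    }

    /** Returns the partition's metadata, invoking the loader only on first use. */
    M metadataForFile(String filePath) {
        return byPartition.computeIfAbsent(partitionOf(filePath), dir -> {
            loadCount++;
            return loader.apply(dir);
        });
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        PartitionMetadataCache<String> cache =
            new PartitionMetadataCache<>(dir -> "schema-for:" + dir);
        List<String> files = List.of(
            "/warehouse/t/ds=2013-09-01/part-00000.avro",
            "/warehouse/t/ds=2013-09-01/part-00001.avro",
            "/warehouse/t/ds=2013-09-02/part-00000.avro");
        for (String f : files) {
            cache.metadataForFile(f);
        }
        // Three files in two partitions trigger two loads, not three.
        System.out.println("files=" + files.size() + " loads=" + cache.loadCount());
        // prints "files=3 loads=2"
    }
}
```

With many files per partition, the number of expensive lookups in this sketch grows with the partition count rather than the file count, which is the behavior the description asks for.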