[ https://issues.apache.org/jira/browse/HIVE-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-9692 started by Sergio Peña.
-----------------------------------------

> Allocate only parquet selected columns in HiveStructConverter class
> -------------------------------------------------------------------
>
>                 Key: HIVE-9692
>                 URL: https://issues.apache.org/jira/browse/HIVE-9692
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>
> The HiveStructConverter class is where Hive converts Parquet objects into Hive writable objects that are later parsed by object inspectors. This class allocates as many writable objects as there are columns in the file schema:
> {noformat}
> public HiveStructConverter(final GroupType requestedSchema, final GroupType tableSchema, Map<String, String> metadata) {
>   ...
>   this.writables = new Writable[fileSchema.getFieldCount()];
>   ...
> }
> {noformat}
> This array is always allocated at full size, even if only a few columns are selected. Say we select 2 columns from a table of 50 columns: 50 objects are allocated, only 2 are used, and 48 are wasted.
> We should allocate only the requested number of columns in order to reduce memory usage.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
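
As a rough illustration of the direction described in the issue, here is a minimal, self-contained sketch. It is not the actual HiveStructConverter code; the class, field, and method names below are hypothetical. It shows the general idea of allocating value slots only for the requested columns and mapping file-schema positions onto those slots. In HiveStructConverter itself the analogous change would presumably be to size the writables array from the requested schema rather than the full file schema.

{noformat}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified illustration of the proposed allocation strategy:
// size the value array by the projected (requested) columns instead of the
// full file schema, and keep a mapping from file-column index to projected slot.
public class ProjectedRowHolder {

  private final Object[] values;                    // one slot per requested column only
  private final Map<Integer, Integer> fileToSlot;   // file-schema index -> slot index

  public ProjectedRowHolder(List<String> fileSchemaColumns, List<String> requestedColumns) {
    this.values = new Object[requestedColumns.size()];
    this.fileToSlot = new HashMap<>();
    for (int slot = 0; slot < requestedColumns.size(); slot++) {
      int fileIndex = fileSchemaColumns.indexOf(requestedColumns.get(slot));
      if (fileIndex >= 0) {
        fileToSlot.put(fileIndex, slot);
      }
    }
  }

  // Store a converted value only if its column was requested; otherwise drop it.
  public void set(int fileColumnIndex, Object value) {
    Integer slot = fileToSlot.get(fileColumnIndex);
    if (slot != null) {
      values[slot] = value;
    }
  }

  public static void main(String[] args) {
    // 50-column file, 2 requested columns: only 2 slots are allocated.
    List<String> fileSchema = new ArrayList<>();
    for (int i = 0; i < 50; i++) {
      fileSchema.add("col" + i);
    }
    ProjectedRowHolder row =
        new ProjectedRowHolder(fileSchema, Arrays.asList("col3", "col27"));
    row.set(3, "value for col3");
    row.set(10, "dropped, not requested");
    System.out.println(Arrays.toString(row.values)); // prints: [value for col3, null]
  }
}
{noformat}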