[ https://issues.apache.org/jira/browse/HIVE-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-9692 started by Sergio Peña.
-----------------------------------------
> Allocate only parquet selected columns in HiveStructConverter class
> -------------------------------------------------------------------
>
>                 Key: HIVE-9692
>                 URL: https://issues.apache.org/jira/browse/HIVE-9692
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>
> The HiveStructConverter class is where Hive converts Parquet objects into Hive 
> writable objects that are later parsed by object inspectors. This class 
> allocates as many writable objects as there are columns in the file schema.
> {noformat}
> public HiveStructConverter(final GroupType requestedSchema,
>     final GroupType tableSchema, Map<String, String> metadata) {
>   ...
>   this.writables = new Writable[fileSchema.getFieldCount()];
>   ...
> }
> {noformat}
> The full array is allocated even when only a subset of the columns is selected. 
> For example, selecting 2 columns from a table of 50 columns allocates 50 
> writable objects, of which only 2 are used and 48 are wasted.
> We should allocate only the requested number of columns in order to reduce 
> memory usage.
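> A possible direction (a minimal sketch only, not the actual HiveStructConverter 
> code; the class name, package imports, and the idea of sizing the array from 
> the requested schema are assumptions) would be to allocate one writable slot 
> per requested column instead of one per file-schema column:
> {noformat}
> import org.apache.hadoop.io.Writable;
> import org.apache.parquet.schema.GroupType;
>
> // Hypothetical sketch: size the writables array from the requested (projected)
> // schema so that unselected columns are never allocated.
> class ProjectedStructConverterSketch {
>   private final Writable[] writables;
>
>   ProjectedStructConverterSketch(final GroupType requestedSchema) {
>     // requestedSchema holds only the selected columns (e.g. 2 instead of 50),
>     // so only that many Writable slots are created here.
>     this.writables = new Writable[requestedSchema.getFieldCount()];
>   }
> }
> {noformat}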



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
