[ https://issues.apache.org/jira/browse/HIVE-9692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-9692 started by Sergio Peña.
-----------------------------------------

> Allocate only parquet selected columns in HiveStructConverter class
> -------------------------------------------------------------------
>
>                 Key: HIVE-9692
>                 URL: https://issues.apache.org/jira/browse/HIVE-9692
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>
> The HiveStructConverter class is where Hive converts Parquet objects into Hive writable objects that are later parsed by object inspectors. This class allocates as many writable objects as there are columns in the file schema:
> {noformat}
> public HiveStructConverter(final GroupType requestedSchema, final GroupType tableSchema, Map<String, String> metadata) {
>   ...
>   this.writables = new Writable[fileSchema.getFieldCount()];
>   ...
> }
> {noformat}
> This array is always allocated at full size, even if only a few columns are selected. Say we select 2 columns from a table of 50 columns: 50 objects are allocated, only 2 are used, and 48 are wasted.
> We should allocate only the requested number of columns in order to reduce memory usage.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
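
As a rough illustration of the direction described in the issue, here is a minimal, self-contained sketch. It is not the actual HiveStructConverter code; the class, field, and method names below are hypothetical. It shows the general idea of allocating value slots only for the requested columns and mapping file-schema positions onto those slots. In HiveStructConverter itself the analogous change would presumably be to size the writables array from the requested schema rather than the full file schema.

{noformat}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified illustration of the proposed allocation strategy:
// size the value array by the projected (requested) columns instead of the
// full file schema, and keep a mapping from file-column index to projected slot.
public class ProjectedRowHolder {

  private final Object[] values;                    // one slot per requested column only
  private final Map<Integer, Integer> fileToSlot;   // file-schema index -> slot index

  public ProjectedRowHolder(List<String> fileSchemaColumns, List<String> requestedColumns) {
    this.values = new Object[requestedColumns.size()];
    this.fileToSlot = new HashMap<>();
    for (int slot = 0; slot < requestedColumns.size(); slot++) {
      int fileIndex = fileSchemaColumns.indexOf(requestedColumns.get(slot));
      if (fileIndex >= 0) {
        fileToSlot.put(fileIndex, slot);
      }
    }
  }

  // Store a converted value only if its column was requested; otherwise drop it.
  public void set(int fileColumnIndex, Object value) {
    Integer slot = fileToSlot.get(fileColumnIndex);
    if (slot != null) {
      values[slot] = value;
    }
  }

  public static void main(String[] args) {
    // 50-column file, 2 requested columns: only 2 slots are allocated.
    List<String> fileSchema = new ArrayList<>();
    for (int i = 0; i < 50; i++) {
      fileSchema.add("col" + i);
    }
    ProjectedRowHolder row =
        new ProjectedRowHolder(fileSchema, Arrays.asList("col3", "col27"));
    row.set(3, "value for col3");
    row.set(10, "dropped, not requested");
    System.out.println(Arrays.toString(row.values)); // prints: [value for col3, null]
  }
}
{noformat}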