[ 
https://issues.apache.org/jira/browse/HIVE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-4421:
--------------------------------

    Fix Version/s: 0.11.0
           Status: Patch Available  (was: Open)

This patch does three things:
* Reduces memory usage while writing ORC dictionaries by removing the
counts and storing only offsets instead of both offsets and lengths.
* Improves the tracking of how much memory the dictionaries use by
measuring the memory allocated rather than the memory in use.
* Reduces the allocation sizes of some of the integer arrays.
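
As a rough illustration of the first two points, here is a minimal sketch of an append-only byte store that keeps only start offsets and derives each length from the next offset. It also reports allocated bytes separately from bytes in use. The class name OffsetOnlyStore, its methods, and the initial array sizes are hypothetical and chosen for illustration; this is not the code in the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not the classes touched by the patch.
class OffsetOnlyStore {
    private byte[] data = new byte[16];   // concatenated entry bytes
    private int[] offsets = new int[4];   // offsets[i] = start of entry i
    private int dataLength = 0;           // bytes of data actually in use
    private int count = 0;                // number of entries stored

    /** Appends a value and returns its index. */
    int add(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (count == offsets.length) {
            offsets = Arrays.copyOf(offsets, offsets.length * 2);
        }
        if (dataLength + bytes.length > data.length) {
            int needed = dataLength + bytes.length;
            data = Arrays.copyOf(data, Math.max(data.length * 2, needed));
        }
        offsets[count] = dataLength;
        System.arraycopy(bytes, 0, data, dataLength, bytes.length);
        dataLength += bytes.length;
        return count++;
    }

    /** Reads an entry; its length is implied by the next entry's offset. */
    String get(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int start = offsets[index];
        int end = (index + 1 == count) ? dataLength : offsets[index + 1];
        return new String(data, start, end - start, StandardCharsets.UTF_8);
    }

    /** Bytes occupied by live entries (the "usage" view). */
    long usedBytes() {
        return dataLength + 4L * count;
    }

    /** Bytes held by the backing arrays (the "allocation" view). */
    long allocatedBytes() {
        return data.length + 4L * offsets.length;
    }
}
```

Because the backing arrays grow in steps, allocatedBytes() is never smaller than usedBytes(), so a memory estimate based on usage alone can undercount what the writer is actually holding.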
                
> Improve memory usage by ORC dictionaries
> ----------------------------------------
>
>                 Key: HIVE-4421
>                 URL: https://issues.apache.org/jira/browse/HIVE-4421
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.11.0
>
>         Attachments: HIVE-4421.D10545.1.patch
>
>
> Currently, for tables with many string columns, it is possible to 
> significantly underestimate the memory used by the ORC dictionaries and cause 
> the query to run out of memory in the task. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
