[ https://issues.apache.org/jira/browse/HIVE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley updated HIVE-4421:
--------------------------------

    Fix Version/s: 0.11.0
           Status: Patch Available  (was: Open)

This patch does three things:
* Improves the memory usage while writing ORC dictionaries by removing the counts and storing only offsets instead of offsets and lengths.
* Improves the tracking of how much memory is used by the dictionaries by tracking the allocation rather than the usage.
* Reduces some of the allocation sizes of the integer arrays.

> Improve memory usage by ORC dictionaries
> ----------------------------------------
>
>                 Key: HIVE-4421
>                 URL: https://issues.apache.org/jira/browse/HIVE-4421
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.11.0
>
>         Attachments: HIVE-4421.D10545.1.patch
>
>
> Currently, for tables with many string columns, it is possible to
> significantly underestimate the memory used by the ORC dictionaries and cause
> the query to run out of memory in the task.
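
The first two points of the patch description can be illustrated with a minimal sketch. This is not the code from the attached patch: the class name `OffsetOnlyStringStore` and all of its methods are hypothetical, and it is an append-only store with no de-duplication, so it only shows the storage layout. Each entry's length is derived from two neighbouring offsets rather than kept in a separate array, and the memory estimate is based on the allocated array capacity rather than on the bytes actually in use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not the actual Hive/ORC dictionary class. */
class OffsetOnlyStringStore {
    private byte[] bytes = new byte[16];  // concatenated UTF-8 bytes of all entries
    private int byteCount = 0;            // bytes actually in use
    // offsets[i] is the start of entry i; offsets[size] marks the end of the last entry.
    private int[] offsets = new int[4];
    private int size = 0;

    /** Appends a value and returns its index. */
    int add(String value) {
        byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
        if (byteCount + encoded.length > bytes.length) {
            int needed = byteCount + encoded.length;
            bytes = Arrays.copyOf(bytes, Math.max(bytes.length * 2, needed));
        }
        if (size + 2 > offsets.length) {
            offsets = Arrays.copyOf(offsets, offsets.length * 2);
        }
        System.arraycopy(encoded, 0, bytes, byteCount, encoded.length);
        offsets[size] = byteCount;
        byteCount += encoded.length;
        offsets[size + 1] = byteCount;    // end marker, overwritten by the next add
        return size++;
    }

    /** Length is computed from neighbouring offsets; no lengths array is stored. */
    int length(int index) {
        return offsets[index + 1] - offsets[index];
    }

    String get(int index) {
        return new String(bytes, offsets[index], length(index), StandardCharsets.UTF_8);
    }

    /** Estimate based on allocated capacity (array headers are ignored). */
    long allocatedBytes() {
        return bytes.length + 4L * offsets.length;
    }

    /** Estimate based on occupied slots only; never larger than allocatedBytes(). */
    long usedBytes() {
        return byteCount + 4L * (size + 1);
    }
}
```

In this sketch, `allocatedBytes()` is the figure a memory manager would need in order to avoid the underestimate described in the issue, because the arrays occupy their full capacity regardless of how much of it holds data.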