[ https://issues.apache.org/jira/browse/HIVE-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1758:
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.7.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed. Thanks, Siying.

> optimize group by hash map memory
> ---------------------------------
>
>                 Key: HIVE-1758
>                 URL: https://issues.apache.org/jira/browse/HIVE-1758
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Siying Dong
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1758.1.patch
>
>
> The map-side Group By hash map consumes a lot of memory, which reduces its
> effectiveness.
> We can use some of the optimizations from map-join to reduce the memory
> footprint:
> class KeyWrapper {
>   int hashcode;
>   ArrayList<Object> keys;
>   // decide whether this is already in hashmap (keys in hashmap are deepcopied
>   // version, and we need to use 'currentKeyObjectInspector').
>   boolean copy = false;
> 1. Change keys to an array
> 2. Optimize the case where the number of keys is small (1, 2, etc.)
> Let us start by profiling it and take it from there.
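
For illustration only, here is a minimal Java sketch of point 1 in the description: a key wrapper that stores its keys in a plain Object[] rather than an ArrayList<Object> and caches the hash code. It is not the code from HIVE-1758.1.patch. The names ArrayKeyWrapper, setKeys, copyKey and GroupCounter are hypothetical, and the object-inspector handling mentioned in the quoted comment is left out. Point 2 (special-casing one or two keys) is not covered.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the KeyWrapper class from Hive itself.
final class ArrayKeyWrapper {
    private Object[] keys = new Object[0];        // plain array instead of ArrayList<Object>
    private int hashcode = Arrays.hashCode(keys); // cached, recomputed only in setKeys

    // Point the wrapper at the current row's key values and refresh the cached hash.
    void setKeys(Object[] rowKeys) {
        this.keys = rowKeys;
        this.hashcode = Arrays.hashCode(rowKeys);
    }

    // Copy taken just before insertion, so the map never aliases a buffer
    // that the caller may overwrite for the next row.
    ArrayKeyWrapper copyKey() {
        ArrayKeyWrapper c = new ArrayKeyWrapper();
        c.keys = keys.clone();
        c.hashcode = hashcode;
        return c;
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ArrayKeyWrapper)) return false;
        ArrayKeyWrapper other = (ArrayKeyWrapper) o;
        // Cheap hash comparison first, element-wise comparison only on a match.
        return hashcode == other.hashcode && Arrays.equals(keys, other.keys);
    }
}

// Hypothetical driver: counts rows per group with one reusable probe wrapper.
final class GroupCounter {
    static Map<ArrayKeyWrapper, int[]> countGroups(Object[][] rows) {
        Map<ArrayKeyWrapper, int[]> counts = new HashMap<>();
        ArrayKeyWrapper probe = new ArrayKeyWrapper();
        for (Object[] row : rows) {
            probe.setKeys(row);
            int[] count = counts.get(probe);
            if (count == null) {
                // First time this key is seen: store a copy, not the probe.
                counts.put(probe.copyKey(), new int[] {1});
            } else {
                count[0]++;
            }
        }
        return counts;
    }
}
```

In this sketch a single probe object is reused for every row, and an allocation happens only when a new group is first seen, which is the kind of saving the description is after. Whether it helps in practice is what the proposed profiling would need to show.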