[ https://issues.apache.org/jira/browse/HIVE-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-1758:
------------------------------

    Status: Patch Available  (was: Open)

> optimize group by hash map memory
> ---------------------------------
>
>                 Key: HIVE-1758
>                 URL: https://issues.apache.org/jira/browse/HIVE-1758
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Siying Dong
>         Attachments: HIVE-1758.1.patch
>
>
> The map-side Group By hash map consumes a lot of memory, which reduces its
> effectiveness.
> We can use some of the optimizations from map-join to reduce the memory
> footprint:
> class KeyWrapper {
>     int hashcode;
>     ArrayList<Object> keys;
>     // decide whether this is already in hashmap (keys in hashmap are
>     // deep-copied versions, and we need to use 'currentKeyObjectInspector').
>     boolean copy = false;
> 1. Change keys from an ArrayList to an array
> 2. Optimize the case where the number of keys is small (1 or 2)
> Let us start by profiling it and take it from there.
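
For illustration only, here is a minimal sketch of what point 1 in the
description could look like: a key wrapper that holds its columns in a plain
Object[] instead of an ArrayList and caches the hash code. This is not taken
from HIVE-1758.1.patch. The names ArrayKeyWrapper, GroupByCountSketch, set,
copyKey and countByKey are hypothetical, and the shallow clone() stands in for
the ObjectInspector-based deep copy that the real operator would need. The
small-key case from point 2 is not covered.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the code from the attached patch.
// Keys live in a fixed-size Object[] and the hash is cached.
final class ArrayKeyWrapper {
    private final Object[] keys;
    private int hashcode;

    // Creates a reusable probe wrapper for rows with numKeys key columns.
    ArrayKeyWrapper(int numKeys) {
        this.keys = new Object[numKeys];
    }

    private ArrayKeyWrapper(Object[] keys, int hashcode) {
        this.keys = keys;
        this.hashcode = hashcode;
    }

    // Overwrites the key columns in place and recomputes the cached hash.
    void set(Object... values) {
        System.arraycopy(values, 0, keys, 0, keys.length);
        hashcode = Arrays.hashCode(keys);
    }

    // Returns an independent wrapper that is safe to store in the map.
    // Assumption: a shallow clone is enough here; Hive would deep-copy.
    ArrayKeyWrapper copyKey() {
        return new ArrayKeyWrapper(keys.clone(), hashcode);
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ArrayKeyWrapper)) return false;
        ArrayKeyWrapper other = (ArrayKeyWrapper) o;
        // Compare the cached hash first to skip most array comparisons.
        return hashcode == other.hashcode && Arrays.equals(keys, other.keys);
    }
}

// Hypothetical driver: counts rows per key while reusing one probe wrapper.
final class GroupByCountSketch {
    static Map<ArrayKeyWrapper, long[]> countByKey(Object[][] rows) {
        Map<ArrayKeyWrapper, long[]> counts = new HashMap<>();
        if (rows.length == 0) return counts;
        ArrayKeyWrapper probe = new ArrayKeyWrapper(rows[0].length);
        for (Object[] row : rows) {
            probe.set(row);
            long[] agg = counts.get(probe);
            if (agg == null) {
                // Copy only when the key is new, so the probe is never stored.
                agg = new long[1];
                counts.put(probe.copyKey(), agg);
            }
            agg[0]++;
        }
        return counts;
    }
}
```

The probe wrapper is allocated once and overwritten for every row, so a new
array is allocated only when a previously unseen key is inserted. Whether this
actually reduces memory compared to the ArrayList version is something the
profiling mentioned in the description would have to confirm.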