[ https://issues.apache.org/jira/browse/HIVE-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Liyin Tang updated HIVE-1754:
-----------------------------

    Status: Patch Available  (was: Open)

This patch makes the following changes:
1) Removes JDBM from Hive.
2) Stores all the data from the small table in an in-memory hash table.
3) Adds a lightweight RowContainer: MapJoinRowContainer.
4) Optimizes MapJoinObjectKey. If there are only one or two join keys, MapJoinSingleKey or MapJoinDoubleKeys is used instead of MapJoinObjectKey.

> Remove JDBM component from Map Join
> -----------------------------------
>
>                 Key: HIVE-1754
>                 URL: https://issues.apache.org/jira/browse/HIVE-1754
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.6.0, 0.7.0
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>             Fix For: 0.7.0
>
>         Attachments: Hive-1754.patch
>
>
> Right now, JDBM is the major performance bottleneck in Map Join.
> As the small table grows, the PUT and GET operations take most of the
> execution time.
> Map Join is designed to load the data of the small table into memory.
> If the data is too large to hold in memory, then there is no need to use the
> map join strategy.
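
To illustrate items 2 and 4 of the patch summary, here is a minimal sketch of an in-memory hash table whose key type is chosen by the number of join keys. It is not the code from Hive-1754.patch: the class names (MapJoinKeySketch, SingleKey, DoubleKeys, GeneralKey, SmallTable) and the makeKey helper are hypothetical, and the assumed benefit of the one-key and two-key cases is only that they avoid allocating and walking an array on every hash and equality check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not the classes from the actual patch.
final class MapJoinKeySketch {

    // One join key: hold the value directly, no array.
    static final class SingleKey {
        private final Object key;
        SingleKey(Object key) { this.key = key; }
        @Override public boolean equals(Object o) {
            return o instanceof SingleKey && Objects.equals(key, ((SingleKey) o).key);
        }
        @Override public int hashCode() { return Objects.hashCode(key); }
    }

    // Two join keys: two fields, hash combined without a temporary array.
    static final class DoubleKeys {
        private final Object first;
        private final Object second;
        DoubleKeys(Object first, Object second) { this.first = first; this.second = second; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof DoubleKeys)) return false;
            DoubleKeys other = (DoubleKeys) o;
            return Objects.equals(first, other.first) && Objects.equals(second, other.second);
        }
        @Override public int hashCode() {
            return 31 * Objects.hashCode(first) + Objects.hashCode(second);
        }
    }

    // General fallback for any other number of join keys.
    static final class GeneralKey {
        private final Object[] keys;
        GeneralKey(Object[] keys) { this.keys = keys; }
        @Override public boolean equals(Object o) {
            return o instanceof GeneralKey && Arrays.equals(keys, ((GeneralKey) o).keys);
        }
        @Override public int hashCode() { return Arrays.hashCode(keys); }
    }

    // Pick the cheapest key representation for the given join keys.
    static Object makeKey(Object... joinKeys) {
        switch (joinKeys.length) {
            case 1:  return new SingleKey(joinKeys[0]);
            case 2:  return new DoubleKeys(joinKeys[0], joinKeys[1]);
            default: return new GeneralKey(joinKeys.clone());
        }
    }

    // The small table held entirely in memory: join key -> matching rows.
    static final class SmallTable {
        private final Map<Object, List<Object[]>> rows = new HashMap<>();
        void put(Object key, Object... row) {
            rows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        List<Object[]> get(Object key) {
            return rows.getOrDefault(key, Collections.emptyList());
        }
    }

    public static void main(String[] args) {
        SmallTable table = new SmallTable();
        table.put(makeKey(42), "alice");
        table.put(makeKey(42), "bob");
        table.put(makeKey("us", 7), "carol");
        // Probe side of the join: look up each big-table row's key.
        System.out.println(table.get(makeKey(42)).size());
        System.out.println(table.get(makeKey("us", 7)).size());
        System.out.println(table.get(makeKey("missing")).size());
    }
}
```

In this sketch both PUT and GET are plain HashMap operations, so their cost does not depend on a disk-backed store; the trade-off, as the issue description notes, is that the small table must fit in memory.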