[ https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Szehon Ho updated HIVE-8943:
----------------------------
    Attachment: HIVE-8943.patch

Attaching a patch that adds the size of connected mapjoin operators in the same work (spark-stage) to the calculation of whether to convert the current join to a mapjoin.

I added two unit tests to stress this case, but would rather wait for HIVE-8946, as the tests won't be using mapjoin until then.

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-8943
>                 URL: https://issues.apache.org/jira/browse/HIVE-8943
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8943.patch
>
>
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer actually does combine nested mapjoins into one work due
> to removal of the RS (ReduceSink) for the big table. So we need to enhance the
> check to calculate whether all the MapJoins in that work (spark-stage) will fit
> into memory; otherwise they might overwhelm the memory of that particular Spark
> executor.
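
The check described in this issue can be illustrated with a minimal sketch. This is not the code in HIVE-8943.patch: the class name, method name, and parameters below are hypothetical, and the threshold value is only an example (in Hive it would presumably come from a setting such as hive.auto.convert.join.noconditionaltask.size, but that is an assumption here). The idea is simply that the small-table size of the join under consideration is added to the sizes of the mapjoins already connected in the same work, and the sum is compared against the limit.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from the Hive code base.
public class MapJoinMemoryCheck {

    /**
     * Returns true if converting the candidate join to a mapjoin keeps the
     * combined small-table size of all mapjoins in the same work (spark-stage)
     * within the given threshold.
     *
     * @param candidateSmallTableBytes estimated small-table size of the join being considered
     * @param connectedMapJoinBytes    estimated small-table sizes of mapjoins already in the same work
     * @param thresholdBytes           memory limit for the whole work (assumed value)
     */
    static boolean canConvert(long candidateSmallTableBytes,
                              List<Long> connectedMapJoinBytes,
                              long thresholdBytes) {
        long total = candidateSmallTableBytes;
        for (long size : connectedMapJoinBytes) {
            total += size; // every connected mapjoin loads its hash table into the same executor
        }
        return total <= thresholdBytes;
    }

    public static void main(String[] args) {
        long threshold = 10_000_000L; // example limit, not taken from the patch

        // Join considered in isolation: 4 MB fits under the limit.
        System.out.println(canConvert(4_000_000L, Collections.<Long>emptyList(), threshold)); // true

        // Same join, but a 7 MB mapjoin is already in the same work: 11 MB does not fit.
        System.out.println(canConvert(4_000_000L, Arrays.asList(7_000_000L), threshold)); // false
    }
}
```

The second call shows the case the issue is about: each join passes the limit on its own, but once they are combined into one work their hash tables share the memory of a single executor.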