[ https://issues.apache.org/jira/browse/HIVE-27375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-27375:
----------------------------------
    Labels: pull-request-available  (was: )

> SharedWorkOptimizer assigns a common cache key to MapJoin operators that should not share MapJoin tables
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27375
>                 URL: https://issues.apache.org/jira/browse/HIVE-27375
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sungwoo Park
>            Priority: Major
>              Labels: pull-request-available
>
> When hive.optimize.shared.work.mapjoin.cache.reuse is set to true,
> SharedWorkOptimizer sometimes assigns a common cache key to MapJoin operators
> that should not share MapJoin tables. This bug occurs only for MapJoin
> operators with 3 or more parent operators.
>
> Example:
>
>   MAPJOIN[575] (RS_83, GBY_66, RS_85)
>   MAPJOIN[585] (RS_212, RS_213, GBY_210)
>
> In this example, both MAPJOIN[575] and MAPJOIN[585] have three parent
> operators. The current implementation assigns a common cache key to
> MAPJOIN[575] and MAPJOIN[585] because RS_83 and RS_212 are equivalent.
> However, MAPJOIN[575] uses GBY_66 for its big table whereas MAPJOIN[585] uses
> GBY_210 for its big table. As a result, the MapJoin table loaded by one
> operator cannot be used by the other.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
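
The collision described in the issue can be illustrated with a small standalone model. This is only a sketch and not Hive's code: the MapJoin class, the signature strings, and both key functions are hypothetical. Equal signature strings stand for "equivalent" parent operators; the issue only states that RS_83 and RS_212 are equivalent, so the signatures given to RS_85 and RS_213 are assumptions. The sketch shows how a key derived from a single parent cannot tell the two operators apart, while a key that also records the big-table position and every small-table parent can.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MapJoinCacheKeySketch {

    /** Hypothetical stand-in for a MapJoin operator; not Hive's MapJoinOperator. */
    static final class MapJoin {
        final String name;
        final int bigTablePos;               // index of the parent that supplies the big table
        final List<String> parentSignatures; // equal strings model "equivalent" parents

        MapJoin(String name, int bigTablePos, String... parentSignatures) {
            this.name = name;
            this.bigTablePos = bigTablePos;
            this.parentSignatures = Arrays.asList(parentSignatures);
        }
    }

    /** Key built from the first parent only (an assumed way the collision could arise). */
    static String firstParentKey(MapJoin mj) {
        return mj.parentSignatures.get(0);
    }

    /** Key built from the big-table position and every small-table parent, in order. */
    static String fullKey(MapJoin mj) {
        List<String> parts = new ArrayList<>();
        parts.add("big=" + mj.bigTablePos);
        for (int i = 0; i < mj.parentSignatures.size(); i++) {
            if (i != mj.bigTablePos) {
                parts.add(i + ":" + mj.parentSignatures.get(i));
            }
        }
        return String.join("|", parts);
    }

    public static void main(String[] args) {
        // RS_83 and RS_212 are equivalent, so both get the signature "A".
        // MAPJOIN[575] (RS_83, GBY_66, RS_85): big table is GBY_66 at position 1.
        MapJoin mj575 = new MapJoin("MAPJOIN[575]", 1, "A", "gby66", "B");
        // MAPJOIN[585] (RS_212, RS_213, GBY_210): big table is GBY_210 at position 2.
        MapJoin mj585 = new MapJoin("MAPJOIN[585]", 2, "A", "C", "gby210");

        // Same key from the first parent alone: the two operators would share a cache entry.
        System.out.println(firstParentKey(mj575).equals(firstParentKey(mj585)));
        // Different keys once the big-table position and all small tables are included.
        System.out.println(fullKey(mj575).equals(fullKey(mj585)));
    }
}
```

In this model the two operators collide under firstParentKey but not under fullKey, because their big tables sit at different positions and their small-table sets differ. Two operators with the same big-table position and pairwise-equivalent small-table parents would still receive the same fullKey, which is the case where sharing a loaded MapJoin table is meaningful.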