[ https://issues.apache.org/jira/browse/HIVE-16022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879999#comment-15879999 ]

Jason Dere commented on HIVE-16022:
-----------------------------------

Noticed a couple of problems when I run the semijoin optimization on a MERGE statement:
- DynamicPartitionPruningOptimization.generateSemiJoinOperator(): parentOfRS does not necessarily have to be a SelectOperator - in this case it is a TS. As a result we are missing some important checking on whether this table is appropriate for semijoin opt.
- grandParent.getChildren().add(bloomFilterNode) - This wrongly assumes grandParent is AND: In this case, there was no previous filterExpr so grandParent is BETWEEN. Adding the child here incorrectly adds a new parameter to BETWEEN, which is probably getting ignored. This is why in_bloom_filter() is not in the EXPLAIN.

> BloomFilter check not showing up in MERGE statement queries
> -----------------------------------------------------------
>
>                 Key: HIVE-16022
>                 URL: https://issues.apache.org/jira/browse/HIVE-16022
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-16022.1.patch
>
>
> Running explain on a MERGE statement with runtime filtering enabled, I see the min/max being applied on the large table, but not the bloom filter check:
> {noformat}
> explain merge into acidTbl as t using nonAcidOrcTbl s ON t.a = s.a
> WHEN MATCHED AND s.a > 8 THEN DELETE
> WHEN MATCHED THEN UPDATE SET b = 7
> WHEN NOT MATCHED THEN INSERT VALUES(s.a, s.b)
> ...
>   Map 1
>     Map Operator Tree:
>       TableScan
>         alias: t
>         Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
>         Filter Operator
>           predicate: a BETWEEN DynamicValue(RS_3_s_a_min) AND DynamicValue(RS_3_s_a_max) (type: boolean)
>           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
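
To illustrate the second problem in the comment above (appending the bloom filter node to a parent that turns out to be BETWEEN rather than AND), here is a minimal sketch of one way the attachment could be guarded. It does not use Hive's classes: `Expr` and `attachBloomFilter` are hypothetical stand-ins invented for this illustration, the argument layout of the BETWEEN node is simplified, and this is not necessarily what HIVE-16022.1.patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BloomFilterAttachSketch {

    /** Hypothetical stand-in for a predicate expression node; NOT Hive's ExprNodeDesc. */
    static final class Expr {
        final String name;
        final List<Expr> children;

        Expr(String name, Expr... children) {
            this.name = name;
            this.children = new ArrayList<>(Arrays.asList(children));
        }

        @Override
        public String toString() {
            if (children.isEmpty()) {
                return name;
            }
            StringBuilder sb = new StringBuilder(name).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /**
     * Hypothetical helper: combine an existing predicate with the bloom filter check.
     * Only an AND node may take the check as an extra child. Any other node
     * (such as BETWEEN) is wrapped in a new AND, so the check never becomes
     * a stray extra argument of that node.
     */
    static Expr attachBloomFilter(Expr predicate, Expr bloomFilterNode) {
        if ("and".equals(predicate.name)) {
            predicate.children.add(bloomFilterNode);
            return predicate;
        }
        return new Expr("and", predicate, bloomFilterNode);
    }

    public static void main(String[] args) {
        // Simplified BETWEEN with three arguments: column, min, max (an assumption of this sketch).
        Expr between = new Expr("between", new Expr("a"), new Expr("min"), new Expr("max"));
        Expr bloom = new Expr("in_bloom_filter", new Expr("a"), new Expr("bf"));
        System.out.println(attachBloomFilter(between, bloom));
    }
}
```

In this sketch the BETWEEN node keeps its original arguments and the bloom filter check sits next to it under a new AND, which is the shape one would expect to see in the EXPLAIN predicate.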