[ https://issues.apache.org/jira/browse/HIVE-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161477#comment-14161477 ]
Mostafa Mokhtar commented on HIVE-8280: --------------------------------------- The issue is not yet fixed as PK side always returns selectivity of 1 due to the following code : {code} if (pkSide == 1) { FKSideInfo fkInfo = new FKSideInfo(leftRowCount, leftNDV); PKSideInfo pkInfo = new PKSideInfo(rightRowCount, rightNDV, joinRel.getJoinType().generatesNullsOnLeft() ? 1.0 : isPKSideSimpleTree ? RelMetadataQuery.getSelectivity(right, rightPred) : 1.0); {code} When a filter is applied on the PK side isPKSideSimpleTree is false. Logs below show that the PK side has a row count of 19.27 but selectivity is 1. {code} HiveProjectRel(ss_customer_sk=[$1], ss_item_sk=[$0], ss_ticket_number=[$3]): rowcount = 5.50076554E8, cumulative cost = {5.500765732727273E8 rows, 0.0 cpu, 0.0 io}, id = 170 HiveProjectRel(ss_item_sk=[$0], ss_customer_sk=[$1], ss_store_sk=[$2], ss_ticket_number=[$3], s_store_sk=[$4], s_market_id=[$5]): rowcount = 5.50076554E8, cumulative cost = {5.500765732727273E8 rows, 0.0 cpu, 0.0 io}, id = 212 HiveJoinRel(condition=[=($4, $2)], joinType=[inner]): rowcount = 5.50076554E8, cumulative cost = {5.500765732727273E8 rows, 0.0 cpu, 0.0 io}, id = 207 HiveProjectRel(ss_item_sk=[$1], ss_customer_sk=[$2], ss_store_sk=[$6], ss_ticket_number=[$8]): rowcount = 5.50076554E8, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 162 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200_orig.store_sales]]): rowcount = 5.50076554E8, cumulative cost = {0}, id = 6 HiveProjectRel(s_store_sk=[$0], s_market_id=[$10]): rowcount = 19.272727272727273, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 166 HiveFilterRel(condition=[=($10, 4)]): rowcount = 19.272727272727273, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 164 HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200_orig.store]]): rowcount = 212.0, cumulative cost = {0}, id = 5 {code} > CBO : When filter is applied on dimension table PK/FK code path is not in > effect. > --------------------------------------------------------------------------------- > > Key: HIVE-8280 > URL: https://issues.apache.org/jira/browse/HIVE-8280 > Project: Hive > Issue Type: Bug > Components: CBO > Affects Versions: 0.14.0 > Reporter: Mostafa Mokhtar > Assignee: Harish Butani > Fix For: 0.14.0 > > Attachments: HIVE-8280.1.patch, HIVE-8280.2.patch > > > When a filter is applied on PK side joins don't qualify as PK/FK join. > In getUniqueKeys when a filter is applied on the table the child is no > longer a table scan. > {code} > public Set<BitSet> getUniqueKeys(ProjectRelBase rel, boolean ignoreNulls) { > RelNode child = rel.getChild(); > if (!(child instanceof HiveTableScanRel)) { > Function<RelNode, Metadata> fn = RelMdUniqueKeys.SOURCE.apply( > rel.getClass(), BuiltInMetadata.UniqueKeys.class); > return ((BuiltInMetadata.UniqueKeys) fn.apply(rel)) > .getUniqueKeys(ignoreNulls); > } > {code} > Repro > {code} > with ss as > (select > ss_customer_sk, ss_item_sk, ss_ticket_number > from > store_sales, > store > where > s_store_sk = ss_store_sk > and s_market_id = 4), > sr as > (select sr_customer_sk,sr_item_sk ,sr_ticket_number from store_returns, store > where s_store_sk = sr_store_sk and s_market_id=4) > select > count(*) > from > ss, > sr > where > ss_customer_sk = sr_customer_sk > and ss_item_sk = sr_item_sk > and ss_ticket_number = sr_ticket_number; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)