[ https://issues.apache.org/jira/browse/HIVE-7905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harish Butani updated HIVE-7905:
--------------------------------
    Attachment: HIVE-7905.2.patch

> CBO: more cost model changes
> ----------------------------
>
>                 Key: HIVE-7905
>                 URL: https://issues.apache.org/jira/browse/HIVE-7905
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Harish Butani
>            Assignee: Harish Butani
>         Attachments: HIVE-7905.2.patch, exp-backoff-vs-log-smoothing
>
>
> 1. For composite predicates smoothen the Selectivity calculation using
> +exponential backoff+. Thanks to [~ mmokhtar] for this formula.
> {quote}
> Can you change the algorithm to use exponential back-off :
> ndv(pe0) * ndv(pe1) ^(1/2) * ndv(pe2) ^(1/4) * ndv(pe3) ^(1/8)
> Opposed to :
> ndv(pex)*log(ndv(pe1))*log(ndv(pe2))
> If we assume selectivity of 0.7 for each store_sales join then join
> selectivity can end up being 6.24285E-05 which is too low and eventually
> results in an un-optimal plan.
> {quote}
> See attached picture.
> 2. In case of Fact - Dim joins on the Dim primary key we infer the Join
> cardinality as a filter on the Fact table:
> {code}
> join card = rowCount(Fact table) * selectivity(dim table)
> {code}
> Whether a Column is a Key is inferred based on either:
> * table rowCount = column ndv
> * (tbd shortly) table rowCount = (maxVal - minVal)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
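For illustration only, the two cost model changes described in this issue can be sketched in a few lines of Python. This is not the code in HIVE-7905.2.patch. The function names (`exponential_backoff`, `naive_product`, `looks_like_key`, `fact_dim_join_cardinality`) are hypothetical, the backoff is generalized from the four terms in the quoted formula to any number of terms, and the ordering of the terms is assumed to be decided by the caller.

```python
import math
from typing import Sequence


def exponential_backoff(values: Sequence[float]) -> float:
    """Combine values as v0 * v1^(1/2) * v2^(1/4) * v3^(1/8) * ...

    The order of `values` is taken as given (an assumption of this
    sketch); each further term is raised to half the previous exponent.
    """
    result = 1.0
    exponent = 1.0
    for v in values:
        result *= math.pow(v, exponent)
        exponent /= 2.0  # each further term counts half as much
    return result


def naive_product(values: Sequence[float]) -> float:
    """Plain product of all terms, i.e. no smoothing at all."""
    return math.prod(values)


def looks_like_key(table_row_count: int, column_ndv: int) -> bool:
    """First rule from the issue: a column is treated as a key when the
    table row count equals the column's number of distinct values."""
    return table_row_count == column_ndv


def fact_dim_join_cardinality(fact_row_count: float,
                              dim_selectivity: float) -> float:
    """join card = rowCount(Fact table) * selectivity(dim table)."""
    return fact_row_count * dim_selectivity


if __name__ == "__main__":
    # Four predicates, each with an assumed selectivity of 0.7.
    sels = [0.7, 0.7, 0.7, 0.7]
    print("naive product:       ", naive_product(sels))
    print("exponential backoff: ", exponential_backoff(sels))

    # Hypothetical fact/dim sizes: the dim key column has ndv == rowCount,
    # and filters on the dim table keep 10% of its rows.
    if looks_like_key(table_row_count=1000, column_ndv=1000):
        print("join cardinality:    ",
              fact_dim_join_cardinality(1_000_000, 0.1))
```

With four terms of 0.7 the backoff yields 0.7^1.875 (about 0.51) where the plain product yields 0.7^4 = 0.2401, which shows how the smoothing keeps a chain of predicates from driving the estimate toward zero.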