[ https://issues.apache.org/jira/browse/HIVE-12017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14997797#comment-14997797 ]
Ashutosh Chauhan edited comment on HIVE-12017 at 11/10/15 1:34 AM:
-------------------------------------------------------------------

I went through the golden file plan changes and found the following categories of plan diffs:
* 1) extra select operator : Many plans now have an extra select operator, e.g., auto_sortmerge_join_*.q
* 2) agg expr lost : In some tests, it seems we dropped the count(*) aggregation altogether, e.g., auto_smb_mapjoin_14.q, auto_sortmerge_join_10.q
* 3) Shuffle join warning : Some tests now generate a shuffle join warning, e.g., multiMapJoin2.q, orc_llap.q, parquet_join.q, pcr.q, pointlookup2.q
* 4) extra columns : Seems like a column pruning issue, e.g., auto_join1.q, auto_join10.q, auto_join11.q
* 5) PTF op missing : It seems the PTF operator got dropped altogether, e.g., ptfgroupbyjoin.q
* 6) Non-skew-join plan : Seems like the skew join optimization is broken and we drop that optimization, e.g., skewjoin_mapjoin*.q

Among these, 1) & 4) are not a big concern. However, 2) & 5) could be correctness issues and 3) & 6) could be substantial perf losses.


was (Author: ashutoshc):
I went through the golden file plan changes and found the following categories of plan diffs:
* 1) extra select operator : Many plans now have an extra select operator, e.g., auto_sortmerge_join_*.q
* 2) agg expr lost : In some tests, it seems we dropped the count(*) aggregation altogether, e.g., auto_smb_mapjoin_14.q, auto_sortmerge_join_10.q
* 3) Shuffle join warning : Some tests now generate a shuffle join warning, e.g., multiMapJoin2.q, orc_llap.q, parquet_join.q, pcr.q, pointlookup2.q
* 4) extra columns : Seems like a column pruning issue, e.g., auto_join1.q, auto_join10.q, auto_join11.q
* 5) PTF op missing : It seems the PTF operator got dropped altogether, e.g., ptfgroupbyjoin.q
* 6) Non-skew-join plan : Seems like the skew join optimization is broken and we drop that optimization, e.g., skewjoin_mapjoin*.q

Among these, 1) & 4) are not a big concern.
However, 2) & 5) could be correctness issues and 3) & 7) could be substantial perf losses.

> Do not disable CBO by default when number of joins in a query is equal or less than 1
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-12017
>                 URL: https://issues.apache.org/jira/browse/HIVE-12017
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>   Affects Versions: 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-12017.01.patch, HIVE-12017.02.patch, HIVE-12017.03.patch, HIVE-12017.04.patch, HIVE-12017.05.patch, HIVE-12017.06.patch, HIVE-12017.07.patch, HIVE-12017.08.patch
>
>
> Instead, we could disable some parts of CBO that are not relevant if the query contains 1 or 0 joins. The implementation should make it easy to define other query patterns for which we might disable some parts of CBO (in case we want to do so in the future).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)