[ https://issues.apache.org/jira/browse/HIVE-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harish Butani updated HIVE-6955:
--------------------------------
    Status: Patch Available  (was: Open)

> ExprNodeColDesc isSame doesn't account for tabAlias: this affects trait Propagation in Joins
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6955
>                 URL: https://issues.apache.org/jira/browse/HIVE-6955
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Harish Butani
>            Assignee: Harish Butani
>         Attachments: HIVE-6955.1.patch
>
>
> For tpcds Q15:
> {code}
> explain
> select ca_zip, sum(cs_sales_price)
> from catalog_sales, customer, customer_address, date_dim
> where catalog_sales.cs_bill_customer_sk = customer.c_customer_sk
>   and customer.c_current_addr_sk = customer_address.ca_address_sk
>   and (substr(ca_zip,1,5) in ('85669', '86197','88274','83405','86475',
>                               '85392', '85460', '80348', '81792')
>        or ca_state in ('CA','WA','GA')
>        or cs_sales_price > 500)
>   and catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
>   and d_qoy = 2 and d_year = 2001
> group by ca_zip
> order by ca_zip
> limit 100;
> {code}
> The traits set up for the operators are:
> {code}
> FIL[23]: bucketCols=[[]],numBuckets=-1
> RS[11]: bucketCols=[[VALUE._col0]],numBuckets=-1
> JOIN[12]: bucketCols=[[_col71], [_col71]],numBuckets=-1
> FIL[13]: bucketCols=[[_col71], [_col71]],numBuckets=-1
> SEL[14]: bucketCols=[[_col71], [_col71]],numBuckets=-1
> GBY[15]: bucketCols=[[_col0]],numBuckets=-1
> RS[16]: bucketCols=[[KEY._col0]],numBuckets=-1
> GBY[17]: bucketCols=[[_col0]],numBuckets=-1
> SEL[18]: bucketCols=[[_col0]],numBuckets=-1
> LIM[21]: bucketCols=[[_col0]],numBuckets=-1
> FS[22]: bucketCols=[[_col0]],numBuckets=-1
> TS[3]: bucketCols=[[]],numBuckets=-1
> RS[5]: bucketCols=[[VALUE._col0]],numBuckets=-1
> JOIN[6]: bucketCols=[[_col3], [_col36]],numBuckets=-1
> RS[7]: bucketCols=[[VALUE._col40]],numBuckets=-1
> JOIN[9]: bucketCols=[[_col40], [_col0]],numBuckets=-1
> RS[10]: bucketCols=[[VALUE._col0]],numBuckets=-1
> TS[1]: bucketCols=[[]],numBuckets=-1
> RS[8]: bucketCols=[[VALUE._col0]],numBuckets=-1
> TS[0]: bucketCols=[[]],numBuckets=-1
> RS[4]: bucketCols=[[VALUE._col3]],numBuckets=-1
> {code}
> This is incorrect:
> Join[9] joins ca join (cs join cust). In this case both sides of the join have a
> '_col0' column. The reverse mapping of trait propagation relies on
> ExprNodeColumnDesc.isSame; since this doesn't account for the tabAlias, we end
> up with Join[9] being bucketed on cs_sold_date_sk. Join[12] has the same
> issue, which only compounds the error.
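
To illustrate the failure mode described in the issue, here is a minimal, self-contained sketch. It is not Hive's code and not the attached patch: `ColRef`, `isSameIgnoringAlias`, and `isSameWithAlias` are hypothetical names, and the aliases "ca" and "cs" are only illustrative labels for the two join inputs. It shows how a comparison that looks at the column name alone treats the two '_col0' columns as one, while a comparison that also checks the table alias keeps them apart.

```java
import java.util.Objects;

public class IsSameSketch {

    // Hypothetical stand-in for a column expression descriptor.
    // This is NOT Hive's ExprNodeColumnDesc; it only models the two
    // fields that matter for the issue: column name and table alias.
    static final class ColRef {
        final String column;
        final String tabAlias;

        ColRef(String column, String tabAlias) {
            this.column = column;
            this.tabAlias = tabAlias;
        }

        // Compares the column name only, so same-named columns coming
        // from different join inputs are treated as identical.
        boolean isSameIgnoringAlias(ColRef other) {
            return column.equals(other.column);
        }

        // Also compares the table alias, so same-named columns from
        // different join inputs are kept distinct.
        boolean isSameWithAlias(ColRef other) {
            return column.equals(other.column)
                && Objects.equals(tabAlias, other.tabAlias);
        }
    }

    public static void main(String[] args) {
        // Both join inputs expose a column called "_col0".
        ColRef left = new ColRef("_col0", "ca");   // illustrative alias
        ColRef right = new ColRef("_col0", "cs");  // illustrative alias

        System.out.println(left.isSameIgnoringAlias(right)); // prints "true"
        System.out.println(left.isSameWithAlias(right));     // prints "false"
    }
}
```

With the name-only comparison, a reverse lookup of '_col0' can match the column from the wrong side of the join, which is consistent with Join[9] ending up bucketed on the wrong column.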