[ https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15680437#comment-15680437 ]
Vineet Garg commented on HIVE-15160: ------------------------------------ TPC-DS query 35 has the same issue {noformat} select ca_state, cd_gender, cd_marital_status, count(*) cnt1, avg(cd_dep_count), max(cd_dep_count), sum(cd_dep_count), cd_dep_employed_count, count(*) cnt2, avg(cd_dep_employed_count), max(cd_dep_employed_count), sum(cd_dep_employed_count), cd_dep_college_count, count(*) cnt3, avg(cd_dep_college_count), max(cd_dep_college_count), sum(cd_dep_college_count) from customer c,customer_address ca,customer_demographics where c.c_current_addr_sk = ca.ca_address_sk and cd_demo_sk = c.c_current_cdemo_sk and exists (select * from store_sales,date_dim where c.c_customer_sk = ss_customer_sk and ss_sold_date_sk = d_date_sk and d_year = 1999 and d_qoy < 4) and (exists (select * from web_sales,date_dim where c.c_customer_sk = ws_bill_customer_sk and ws_sold_date_sk = d_date_sk and d_year = 1999 and d_qoy < 4) or exists (select * from catalog_sales,date_dim where c.c_customer_sk = cs_ship_customer_sk and cs_sold_date_sk = d_date_sk and d_year = 1999 and d_qoy < 4)) group by ca_state, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count order by ca_state, cd_gender, cd_marital_status, cd_dep_count, cd_dep_employed_count, cd_dep_college_count limit 100; {noformat} cd_dep_count, cd_dep_employed_count are not selected but are part of order by. Not sure if this would a simple fix. [~pxiong] Have you looked into this ? Any idea what would it take to fix this? Quick look at error stack suggests that it is giving up during operator tree generation while trying to resolve column names (probably for oder by ) > Can't order by an unselected column > ----------------------------------- > > Key: HIVE-15160 > URL: https://issues.apache.org/jira/browse/HIVE-15160 > Project: Hive > Issue Type: Bug > Reporter: Pengcheng Xiong > Assignee: Pengcheng Xiong > > If a grouping key hasn't been selected, Hive complains. For comparison, > Postgres does not. > Example. Notice i_item_id is not selected: > {code} > select i_item_desc > ,i_category > ,i_class > ,i_current_price > ,sum(cs_ext_sales_price) as itemrevenue > ,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over > (partition by i_class) as revenueratio > from catalog_sales > ,item > ,date_dim > where cs_item_sk = i_item_sk > and i_category in ('Jewelry', 'Sports', 'Books') > and cs_sold_date_sk = d_date_sk > and d_date between cast('2001-01-12' as date) > and (cast('2001-01-12' as date) + 30 days) > group by i_item_id > ,i_item_desc > ,i_category > ,i_class > ,i_current_price > order by i_category > ,i_class > ,i_item_id > ,i_item_desc > ,revenueratio > limit 100; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)