[ https://issues.apache.org/jira/browse/HIVE-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977003#comment-14977003 ]
Bolke de Bruin commented on HIVE-12216: --------------------------------------- Update: we have noticed this mainly happens with large datasets and smaller datasets do not seem to affected. Also we see a similar issue that we will report separately. > WHERE on the FROM table not (always) working when JOIN are present > ------------------------------------------------------------------ > > Key: HIVE-12216 > URL: https://issues.apache.org/jira/browse/HIVE-12216 > Project: Hive > Issue Type: Bug > Components: hpl/sql, Parser, Query Processor, SQL > Affects Versions: 1.1.0, 1.2.1 > Environment: CDH 5.4.7 HDP2.3.2 MR TEZ > Reporter: Bolke de Bruin > Priority: Blocker > > In case we use a where clause in a state where also joins are present, the > clauses are not (always) respected. We have been able to reproduce this issue > consistently with Hive 1.1.0 on MR, Hive 1.2.1 on Tez (MR Fails here). > So fo the below query we *do* get results back like: > 'gs.i_s_c = 23' (and this goes for all clauses!) > CREATE TABLE tmp_hub_and_sats AS > SELECT > f.dt, > f.t_c, > sum(f.transaction_amount) as amount, > sum(f.amount_euro) amount_euro, > IF(f.org_grid is null, f.org_cust, f.org_grid) as org, > IF(f.org_grid is null, 0, 1) as is_org_grid, > IF(f.org_up is null, if(f.org_grid is null, f.org_cust, f.org_grid), > f.org_up) as org_up, > IF(f.to_grid is null, f.to_cust, f.to_grid) to, > IF(f.to_grid is null, 0, 1) as is_to_grid, > IF(f.to_up is null, if(f.to_grid is null, f.to_cust, f.to_grid), f.to_up) > as to_up, > gh.i_g_c as customer_code_hub, > gs.i_g_c as customer_code_satellite > from x_grid_orders f > LEFT OUTER JOIN > grid.grid gh > ON f.org_grid = gh.hashed_gridid > LEFT OUTER JOIN > grid.grid gs > ON f.to_grid = gs.hashed_gridid > where > IF(f.org_up is null, f.org_cust, f.org_up) <> IF(f.to_up is null, f.to_cust, > f.to_up) > AND > (substring(gh.i_g_c, 1, 2) <> "06" or gh.i_g_c is null) > AND > (substring(gs.i_g_c, 1, 2) <> "06" or gs.i_g_c is null) > AND > (gh.i_s_c <> "23" or gh.i_s_c is null) > AND > (gs.i_s_c <> "23" or gs.i_s_c is null) > group by > f.dt, > f.t_c, > IF(f.org_grid is null, f.org_cust, f.org_grid), > IF(f.org_grid is null, 0, 1), > IF(f.org_up is null, if(f.org_grid is null, f.org_cust, f.org_grid), > f.org_up), > IF(f.to_grid is null, f.to_cust, f.to_grid), > IF(f.to_grid is null, 0, 1), > IF(f.to_up is null, if(f.to_grid is null, f.to_cust, f.to_grid), f.to_up), > gh.i_g_c, > gs.i_g_c -- This message was sent by Atlassian JIRA (v6.3.4#6332)