[ https://issues.apache.org/jira/browse/HIVE-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13840799#comment-13840799 ]
Navis commented on HIVE-5964: ----------------------------- I didn't checked exactly, but HIVE-4293 is seemingly fixing this issue. trunk {noformat} STAGE PLANS: Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: t2:t1:test_tbl TableScan alias: test_tbl Select Operator expressions: expr: id type: string expr: name type: string outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string {noformat} HIVE-4293 {noformat} STAGE PLANS: Stage: Stage-2 Map Reduce Alias -> Map Operator Tree: t2:t1:test_tbl TableScan alias: test_tbl Filter Operator predicate: expr: ((name = 'c') and (id = 'a')) type: boolean Select Operator expressions: expr: id type: string expr: name type: string outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + tag: -1 value expressions: expr: _col0 type: string expr: _col1 type: string {noformat} > Hive missing a filter predicate causing wrong results joining tables after > sort by > ---------------------------------------------------------------------------------- > > Key: HIVE-5964 > URL: https://issues.apache.org/jira/browse/HIVE-5964 > Project: Hive > Issue Type: Bug > Components: Query Processor > Affects Versions: 0.10.0, 0.11.0 > Reporter: dima machlin > Priority: Blocker > > It seems like the optimization of predicate pushdown is failing under certain > conditions causing wrong results as a filter predicate appears to be > completely disregarded by the query processor for some reason. > Here is the scenario (assuming "dual" table exists) : > set hive.optimize.ppd=true; > drop table if exists test_tbl ; > create table test_tbl (id string,name string); > insert into table test_tbl > select 'a','b' from dual; > test_tbl now contains : > a b > the following query : > select t2.* > from > (select id,name from (select id,name from test_tbl) t1 sort by id) t2 > join test_tbl t3 on (t2.id=t3.id ) > where t2.name='c' and t3.id='a'; > returns : > a b > The filter :" t2.name='c' " is missing from the execution plan and obviously > doesn't apply. > The filter "t3.id='a' " does appear in the plan and is being applied before > the join. > If the query changes a little bit like removing the sort by, removing the t1 > sub-query or disabling hive.optimize.ppd then the predicate appears. > I'm able to reproduce the problem both in Hive 0.10 and Hive 0.11 although It > seems to work fine in Hive 0.7 -- This message was sent by Atlassian JIRA (v6.1#6144)