[ 
https://issues.apache.org/jira/browse/HIVE-14652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14652:
------------------------------------
    Attachment: HIVE-14652.patch

The fix (and also a refactor of the class so that it no longer has a 
million-line method).
I have a vague feeling that most of the logic in this method is bogus, but I 
may just be missing something, because it apparently works. The main question 
is: why do we evaluate UDFs on partition values from the pruned set for the 
filters that we purport to remove, when we have just used the same filters to 
prune the partitions? One of two things should be true: either we cannot 
eliminate the filter, or the final result of all the expressions is known to 
be true (or does not matter). So we could insta-bail as soon as we see any 
disagreement after evaluation, or keep a walk state that indicates the value 
doesn't matter.
I don't really know if that's the case or if I'm missing something here. 

So for now the fix is to change the new IN logic introduced by HIVE-11424 to 
follow the same twisted logic. 
Let's see what that breaks.

The problem is that HIVE-11424 changes IN to true if there's a column on the 
left side. But, as described above, the whole filter was already used to prune 
the partitions, so in the NOT IN case, IN is guaranteed to be false on every 
remaining partition. So, while the "regular" logic would have confirmed that 
and then applied NOT to the false constant, the current code results in NOT 
being applied to the true constant, which makes the filter always false.
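
To make the difference concrete, here is a minimal Python sketch of the two 
behaviours on the repro from the issue. It is a toy model and not the Hive 
code: the helper names (prune, fold_in_shortcut, fold_in_evaluated) are 
hypothetical, and it only assumes what is described above, namely that the 
shortcut replaces IN with true while the "regular" logic evaluates the 
expression on the pruned partition values.

```python
# Toy model of folding a partition-column filter after pruning.
# Not Hive code; every name here is hypothetical.

def prune(partitions, predicate):
    """Keep only the partition values for which the filter holds."""
    return [p for p in partitions if predicate(p)]

def fold_in_shortcut(pruned, in_list):
    """Shortcut as described above: IN with a column on the left becomes true."""
    return True

def fold_in_evaluated(pruned, in_list):
    """Evaluate IN on every pruned partition value; fold only if all agree."""
    results = {p in in_list for p in pruned}
    return results.pop() if len(results) == 1 else None  # None: cannot fold

partitions = ["foo", "bar"]
in_list = ["bar"]

# WHERE s NOT IN ('bar') prunes the partitions down to ['foo'].
pruned = prune(partitions, lambda s: s not in in_list)

# Shortcut: NOT applied to the true constant, so the filter rejects every row.
shortcut_filter = not fold_in_shortcut(pruned, in_list)

# Evaluation: IN is false on 'foo', so NOT is applied to the false constant.
evaluated_in = fold_in_evaluated(pruned, in_list)
evaluated_filter = not evaluated_in
```

With the shortcut the remaining filter is constant false, which matches the 
"No results" symptom in the issue. With evaluation on the pruned set it is 
constant true and can be dropped.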

cc [~jcamachorodriguez] [~ashutoshc]

> incorrect results for not in on partition columns
> -------------------------------------------------
>
>                 Key: HIVE-14652
>                 URL: https://issues.apache.org/jira/browse/HIVE-14652
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: stephen sprague
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14652.patch
>
>
> {noformat}
> create table foo (i int) partitioned by (s string);
> insert overwrite table foo partition(s='foo') select cint from alltypesorc 
> limit 10;
> insert overwrite table foo partition(s='bar') select cint from alltypesorc 
> limit 10;
> select * from foo where s not in ('bar');
> {noformat}
> No results. IN ... works correctly



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
