Jesus Camacho Rodriguez created HIVE-10770:
----------------------------------------------

             Summary: Recognize additional common factors in Filter predicates
                 Key: HIVE-10770
                 URL: https://issues.apache.org/jira/browse/HIVE-10770
             Project: Hive
          Issue Type: Bug
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez


Currently, we canonicalize predicates at the term level (e.g. "a or b or a" 
becomes "a or b"), but we do not attempt to recognize terms that are equivalent. 
Further, we do not exploit e.g. the symmetry of '=' (i.e. a = b iff b = a).

- A first extension would be to normalize comparisons so that, between two field 
references, the lower-numbered reference is always on the left, and, between a 
field reference and a literal, the field reference is on the left. So, "$6 = $3" 
becomes "$3 = $6"; "$6 > $3" becomes "$3 < $6"; and "literal <= $5" becomes 
"$5 >= literal". This would not damage performance, and would improve a few 
plans.
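
A minimal sketch of this normalization, in Python for brevity rather than in 
Hive's own code. The tuple representation of operands, the FLIP table and the 
function name normalize_comparison are all hypothetical and only illustrate the 
rule described above:

```python
# Hypothetical representation (not Hive's): an operand is ("ref", index)
# for a field reference such as $3, or ("lit", value) for a literal.

# Operator to use once the two operands have been swapped.
FLIP = {"=": "=", "<>": "<>", "<": ">", ">": "<", "<=": ">=", ">=": "<="}


def normalize_comparison(op, left, right):
    """Return (op, left, right) in a canonical operand order."""
    lkind, lval = left
    rkind, rval = right
    # Swap when a literal is on the left of a field reference, or when
    # both sides are field references and the higher index is on the left.
    swap = (lkind == "lit" and rkind == "ref") or (
        lkind == "ref" and rkind == "ref" and lval > rval
    )
    if swap:
        return (FLIP[op], right, left)
    return (op, left, right)


# "$6 > $3" becomes "$3 < $6"
print(normalize_comparison(">", ("ref", 6), ("ref", 3)))
# "literal <= $5" becomes "$5 >= literal"
print(normalize_comparison("<=", ("lit", 10), ("ref", 5)))
```

After this step "$6 = $3" and "$3 = $6" have the same form, so the existing 
term-level deduplication can treat them as the same term.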

- A second possible extension would be to remove redundant factors. Given the 
predicate "(a or b) and ((x and a) or (y and b))", the first factor is implied 
by the second, so it can be removed and the expression consists only of 
"(x and a) or (y and b)".
One possible way to recognize such cases is to transform the second factor to 
CNF, i.e. "(x or y) and (x or b) and (a or y) and (a or b)"; as the CNF contains 
"(a or b)", we would know that we can discard the first factor. Once we have 
done the check, we could just use the original expression, i.e. 
"(x and a) or (y and b)", in the predicate.
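
The CNF check can be sketched as follows, again in Python and with hypothetical 
names (cnf_clauses, is_redundant). Atoms are plain strings, and the second 
factor is assumed to be given as a disjunction of conjunctions. The sketch tests 
whether any CNF clause is a subset of the candidate factor, which covers the 
exact-containment case from the description; note that this expansion can grow 
exponentially with the number of disjuncts:

```python
from itertools import product


def cnf_clauses(dnf):
    """Expand an OR of ANDs into its CNF clauses.

    dnf is a list of conjunctions, each an iterable of atom names.
    Each clause takes one atom from every conjunction.
    """
    return {frozenset(choice) for choice in product(*dnf)}


def is_redundant(factor, dnf):
    """True if the disjunction `factor` is implied by `dnf`.

    This holds when some CNF clause of `dnf` uses only atoms of `factor`.
    """
    factor = frozenset(factor)
    return any(clause <= factor for clause in cnf_clauses(dnf))


# Second factor of the example: (x and a) or (y and b)
second = [("x", "a"), ("y", "b")]

# Its CNF contains (a or b), so the first factor can be dropped.
print(is_redundant(("a", "b"), second))
# (a or c) is not implied, so it would have to be kept.
print(is_redundant(("a", "c"), second))
```

The CNF is only used for the check; the predicate itself keeps the original, 
more compact form of the second factor.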



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
