[ 
https://issues.apache.org/jira/browse/HIVE-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598030#comment-16598030
 ] 

Vineet Garg commented on HIVE-18873:
------------------------------------

I looked into the test failure (TestAccumuloCliDriver). Query {{SELECT u.key, 
u.country, c.name, c.key FROM users u JOIN countries c ON (u.country = c.key)}} 
throws following exception:
{noformat}
java.lang.RuntimeException: Unexpected residual predicate: key is not null
        at 
org.apache.hadoop.hive.accumulo.predicate.AccumuloPredicateHandler.getSearchConditions(AccumuloPredicateHandler.java:396)
        at 
org.apache.hadoop.hive.accumulo.predicate.AccumuloPredicateHandler.getIterators(AccumuloPredicateHandler.java:315)
        at 
org.apache.hadoop.hive.accumulo.mr.HiveAccumuloTableInputFormat.getSplits(HiveAccumuloTableInputFormat.java:146)
        at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:506)
        at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:752)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:341)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:332)
        at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:203)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
        at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
        at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
        at 
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
        at 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210)
        at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2727)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2379)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2049)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1749)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1743)
        at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
        at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
        at 
org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1347)
        at 
org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1321)
        at 
org.apache.hadoop.hive.cli.control.CoreAccumuloCliDriver.runTest(CoreAccumuloCliDriver.java:113)
        at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
        at 
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver(TestAccumuloCliDriver.java:59)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
        at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at org.junit.runners.Suite.runChild(Suite.java:127)
        at org.junit.runners.Suite.runChild(Suite.java:26)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
        at 
org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
        at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
        at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
        at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
        at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
        at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
{noformat}
This is due to Hive pushing {{is not null}} predicate down to storage handler 
and then {{AccumuloPredicateHandler}} throwing up because it can only handle 
binary comparison predicate (>=, <=).

This is reproducible without the patch with following simple query:
{code:sql}
SELECT u.key, u.country FROM users u where u.key is not null;
{code}

> Skipping predicate pushdown for MR silently at HiveInputFormat can cause 
> storage handlers to produce erroneous result
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18873
>                 URL: https://issues.apache.org/jira/browse/HIVE-18873
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>         Attachments: HIVE-18873.2.patch, HIVE-18873.2_reattach.patch, 
> HIVE-18873.patch
>
>
> {code:java}
> // disable filter pushdown for mapreduce when there are more than one table 
> aliases,
>     // since we don't clone jobConf per alias
>     if (mrwork != null && mrwork.getAliases() != null && 
> mrwork.getAliases().size() > 1 &&
>       jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname).equals("mr")) {
>       return;
>     }
> {code}
> I believe this needs to be handled at OpProcFactory so that hive doesn't 
> believe that predicate is handled by storage handler.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to