Syed Atif Akhtar created HIVE-16470:
---------------------------------------

             Summary: Union all on a table where one of the partitions is empty gives a NullPointerException when using Hive on Spark
                 Key: HIVE-16470
                 URL: https://issues.apache.org/jira/browse/HIVE-16470
             Project: Hive
          Issue Type: Bug
          Components: Spark
    Affects Versions: 1.1.0
         Environment: Cloudera Hive version 1.1.0-cdh5.8.2
                      Cloudera Spark 1.6.0
            Reporter: Syed Atif Akhtar


With hive.execution.engine=spark; when you do a Union All on a table where at least one of the partitions does not exist or is empty, the query will fail with a NullPointerException.

{quote}
set hive.execution.engine=spark;
CREATE TABLE inputtbl(a string) PARTITIONED BY (somepartition STRING);
INSERT OVERWRITE TABLE inputtbl PARTITION (somepartition='somevalue') SELECT '1' FROM DUAL;
SELECT * FROM inputtbl WHERE somepartition='somevalue'
UNION ALL
SELECT * FROM inputtbl WHERE somepartition='someothervalue';
{quote}

The last statement fails with the below exception

{quote}
2017-04-18 17:21:18,126 WARN org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-27259]: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:374)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:136)
	at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:206)
	at org.apache.hive.service.cli.operation.Operation.run(Operation.java:316)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:424)
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:401)
	at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
	at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
	at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
	at com.sun.proxy.$Proxy26.executeStatementAsync(Unknown Source)
	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:258)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:503)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.setInputFormat(SparkCompiler.java:275)
	at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.setInputFormat(SparkCompiler.java:254)
	at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:222)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10091)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9884)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:223)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:446)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1201)
	at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1188)
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:134)
	... 26 more
{quote}

It works however when you do the following

set hive.execution.engine=mr;



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)