[ https://issues.apache.org/jira/browse/HIVE-7953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164597#comment-14164597 ]
Thomas Friedrich commented on HIVE-7953:
----------------------------------------

The 4 tests bucketsortoptimize_insert_2, bucketsortoptimize_insert_4, bucketsortoptimize_insert_7, and bucketsortoptimize_insert_8 all fail with the same NPE related to SMB joins: the order object is null in SMBMapJoinOperator:

{code}
// fetch the first group for all small table aliases
for (byte pos = 0; pos < order.length; pos++) {
  if (pos != posBigTable) {
    fetchNextGroup(pos);
  }
}
{code}

{noformat}
Daemon Thread [Executor task launch worker-3] (Suspended (exception NullPointerException))
  SMBMapJoinOperator.processOp(Object, int) line: 258
  FilterOperator(Operator<T>).forward(Object, ObjectInspector) line: 799
  FilterOperator.processOp(Object, int) line: 137
  TableScanOperator(Operator<T>).forward(Object, ObjectInspector) line: 799
  TableScanOperator.processOp(Object, int) line: 95
  MapOperator(Operator<T>).forward(Object, ObjectInspector) line: 799
  MapOperator.process(Writable) line: 536
  SparkMapRecordHandler.processRow(Object, Object) line: 139
  HiveMapFunctionResultList.processNextRecord(Tuple2<BytesWritable,BytesWritable>) line: 47
  HiveMapFunctionResultList.processNextRecord(Object) line: 28
  HiveBaseFunctionResultList$ResultIterator.hasNext() line: 108
  Wrappers$JIteratorWrapper<A>.hasNext() line: 41
  Iterator$class.foreach(Iterator, Function1) line: 727
  Wrappers$JIteratorWrapper<A>(AbstractIterator<A>).foreach(Function1<A,U>) line: 1157
  RDD$$anonfun$foreach$1.apply(Iterator<T>) line: 760
  RDD$$anonfun$foreach$1.apply(Object) line: 760
  SparkContext$$anonfun$runJob$3.apply(TaskContext, Iterator<T>) line: 1118
  SparkContext$$anonfun$runJob$3.apply(Object, Object) line: 1118
  ResultTask<T,U>.runTask(TaskContext) line: 61
  ResultTask<T,U>(Task<T>).run(long) line: 56
  Executor$TaskRunner.run() line: 182
  ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145
  ThreadPoolExecutor$Worker.run() line: 615
  Thread.run() line: 745
{noformat}

There is also an NPE in the FileSinkOperator: the FileSystem object fs is null:

{code}
// in recent hadoop versions, use deleteOnExit to clean tmp files.
if (isNativeTable) {
  autoDelete = fs.deleteOnExit(fsp.outPaths[0]);
}
{code}

{noformat}
Daemon Thread [Executor task launch worker-1] (Suspended (exception NullPointerException))
  FileSinkOperator.createBucketFiles(FileSinkOperator$FSPaths) line: 495
  FileSinkOperator.closeOp(boolean) line: 925
  FileSinkOperator(Operator<T>).close(boolean) line: 582
  SelectOperator(Operator<T>).close(boolean) line: 594
  SMBMapJoinOperator(Operator<T>).close(boolean) line: 594
  DummyStoreOperator(Operator<T>).close(boolean) line: 594
  FilterOperator(Operator<T>).close(boolean) line: 594
  TableScanOperator(Operator<T>).close(boolean) line: 594
  MapOperator(Operator<T>).close(boolean) line: 594
  SparkMapRecordHandler.close() line: 175
  HiveMapFunctionResultList.closeRecordProcessor() line: 57
  HiveBaseFunctionResultList$ResultIterator.hasNext() line: 122
  Wrappers$JIteratorWrapper<A>.hasNext() line: 41
  Iterator$class.foreach(Iterator, Function1) line: 727
  Wrappers$JIteratorWrapper<A>(AbstractIterator<A>).foreach(Function1<A,U>) line: 1157
  RDD$$anonfun$foreach$1.apply(Iterator<T>) line: 760
  RDD$$anonfun$foreach$1.apply(Object) line: 760
  SparkContext$$anonfun$runJob$3.apply(TaskContext, Iterator<T>) line: 1118
  SparkContext$$anonfun$runJob$3.apply(Object, Object) line: 1118
  ResultTask<T,U>.runTask(TaskContext) line: 61
  ResultTask<T,U>(Task<T>).run(long) line: 56
  Executor$TaskRunner.run() line: 182
  ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) line: 1145
  ThreadPoolExecutor$Worker.run() line: 615
  Thread.run() line: 745
{noformat}

> Investigate query failures (2)
> ------------------------------
>
> Key: HIVE-7953
> URL: https://issues.apache.org/jira/browse/HIVE-7953
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Brock Noland
> Assignee: Thomas Friedrich
>
> I ran all q-file tests and the following failed with an exception:
> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-SPARK-ALL-TESTS-Build/lastCompletedBuild/testReport/
> We
don't necessarily want to run all these tests as part of the Spark tests,
> but we should understand why they failed with an exception. This JIRA is to
> look into these failures and document them with one of:
> * New JIRA
> * Covered under existing JIRA
> * More investigation required
>
> Tests:
> {noformat}
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_temp_table_external 0.33 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_num_reducers 4.3 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2 11 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_hdfs_file_with_space_in_the_name 0.65 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_4 4.7 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_7 2.8 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_2 5.5 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_position 1.5 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_18_part_external 2.4 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_6 11 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_11 5.1 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_8 10 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join 5.4 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_empty_dyn_part 0.81 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_compact1 0.31 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_ddl1 0.26 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_query2 0.73 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_3 8.5 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_query5 0.34 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_rcfile_bigdata 0.93 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer 6.3 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_compact3 2.4 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_dbtxnmgr_compact2 0.56 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats_partscan_1_23 3.1 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_list_bucket_dml_2 4.3 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_15_external_part 3.2 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_16_part_external 2.8 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_17_part_managed 3.4 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_20_part_managed_location 3.3 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_exim_19_00_part_external_location 6.9 sec 2
> org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_external_table_with_space_in_location_path
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
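Both NPEs in the comment above have the same shape: an operator field that gets populated during operator setup (order in SMBMapJoinOperator, fs in FileSinkOperator) is still null when the operator runs inside a Spark executor thread. The sketch below is a hypothetical illustration of that failure pattern, not Hive code; the class OperatorInitSketch and its members are invented for the example. It shows how a guard turns a bare NPE into an error that names the missing setup step.

{code}
// Hypothetical sketch (not Hive code): a field one execution path
// initializes and another path forgets -- the shape of both NPEs above.
public class OperatorInitSketch {
    private byte[] order; // stays null unless initialize() runs first

    // Setup that one code path performs; the failing path never calls it.
    public void initialize(int numAliases) {
        order = new byte[numAliases];
        for (byte pos = 0; pos < numAliases; pos++) {
            order[pos] = pos;
        }
    }

    // processOp-style method: the guard replaces a NullPointerException
    // with an error message naming the uninitialized state.
    public int process() {
        if (order == null) {
            throw new IllegalStateException(
                "order not initialized: operator setup did not run on this path");
        }
        return order.length;
    }

    public static void main(String[] args) {
        OperatorInitSketch op = new OperatorInitSketch();
        try {
            op.process(); // throws: initialize() was never called
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
        op.initialize(3);
        System.out.println("aliases: " + op.process());
    }
}
{code}

Whether the right fix is a guard like this or making the Spark path run the same operator initialization as MapReduce is exactly what this JIRA needs to determine.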