[ https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346247#comment-16346247 ]
Hive QA commented on HIVE-17396:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908438/HIVE-17396.8.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 12862 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_recursive_mapjoin] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.common.TestHiveClientCache.testCloseAllClients (batchId=200)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropDatabase (batchId=242)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8938/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8938/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8938/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908438 - PreCommit-HIVE-Build

> Support DPP with map joins where the source and target belong in the same stage
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-17396
>                 URL: https://issues.apache.org/jira/browse/HIVE-17396
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Janaki Lahorani
>            Assignee: Janaki Lahorani
>            Priority: Major
>         Attachments: HIVE-17396.1.patch, HIVE-17396.2.patch,
>                      HIVE-17396.3.patch, HIVE-17396.4.patch, HIVE-17396.5.patch,
>                      HIVE-17396.6.patch, HIVE-17396.7.patch, HIVE-17396.8.patch
>
>
> When the target of a partition pruning sink operator is not the same as
> the target of the hash table sink operator, both source and target get
> scheduled within the same Spark job, which can result in a File Not Found
> Exception. HIVE-17225 has a fix to disable DPP in that scenario. This JIRA
> is to support DPP for such cases.
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE TABLE reg_table (col int);
> ALTER TABLE part_table1 ADD PARTITION (part1_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 2);
> INSERT INTO TABLE part_table1 PARTITION (part1_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 2) VALUES (2);
> INSERT INTO table reg_table VALUES (1), (2), (3), (4), (5), (6);
> EXPLAIN SELECT *
>         FROM   part_table1 pt1,
>                part_table2 pt2,
>                reg_table rt
>         WHERE  rt.col = pt1.part1_col
>         AND    pt2.part2_col = pt1.part1_col;
> Plan:
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-1 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
>     Spark
> #### A masked pattern was here ####
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: pt1
>                   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: col (type: int), part1_col (type: int)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       keys:
>                         0 _col1 (type: int)
>                         1 _col1 (type: int)
>                         2 _col0 (type: int)
>                     Select Operator
>                       expressions: _col1 (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                       Group By Operator
>                         keys: _col0 (type: int)
>                         mode: hash
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                         Spark Partition Pruning Sink Operator
>                           Target column: part2_col (int)
>                           partition key expr: part2_col
>                           Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
>                           target work: Map 2
>             Local Work:
>               Map Reduce Local Work
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: pt2
>                   Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: col (type: int), part2_col (type: int)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
>                     Spark HashTable Sink Operator
>                       keys:
>                         0 _col1 (type: int)
>                         1 _col1 (type: int)
>                         2 _col0 (type: int)
>             Local Work:
>               Map Reduce Local Work
>   Stage: Stage-1
>     Spark
> #### A masked pattern was here ####
>       Vertices:
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: rt
>                   Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                   Filter Operator
>                     predicate: col is not null (type: boolean)
>                     Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: col (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE Column stats: NONE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                              Inner Join 0 to 2
>                         keys:
>                           0 _col1 (type: int)
>                           1 _col1 (type: int)
>                           2 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col2, _col3, _col4
>                         input vertices:
>                           0 Map 1
>                           1 Map 2
>                         Statistics: Num rows: 13 Data size: 13 Basic stats: COMPLETE Column stats: NONE
>                         File Output Operator
>                           compressed: false
>                           Statistics: Num rows: 13 Data size: 13 Basic stats: COMPLETE Column stats: NONE
>                           table:
>                               input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                               output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                               serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>             Local Work:
>               Map Reduce Local Work
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink

--
This message was sent by Atlassian JIRA (v7.6.3#76005)