[ https://issues.apache.org/jira/browse/HIVE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285468#comment-16285468 ]
liyunzhang commented on HIVE-18148: ----------------------------------- [~lirui]: I cannot reproduce this because the full script to reproduce it was not provided. But I guess the following query matches the case you mentioned in [spark_dynamic_partition_pruning.q|https://github.com/kellyzly/hive/blob/master/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q#L29]:
{code}
EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr) where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11;
{code}
Here srcpart plays the role of src in your example, srcpart_date the role of part1, and srcpart_hour the role of part2. But the operator tree looks like:
{code}
TS[0]-SEL[2]-MAPJOIN[36]-MAPJOIN[35]-GBY[16]-RS[17]-GBY[18]-FS[20]
TS[3]-FIL[27]-SEL[5]-RS[10]-MAPJOIN[36]
                    -SEL[29]-GBY[30]-SPARKPRUNINGSINK[31]
TS[6]-FIL[28]-SEL[8]-RS[13]-MAPJOIN[35]
                    -SEL[32]-GBY[33]-SPARKPRUNINGSINK[34]
{code}
Here TS\[3\] is srcpart_date, TS\[6\] is srcpart_hour, and TS\[0\] is srcpart. But there is no nested DPP problem here. So where am I wrong?
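For completeness, the dimension tables in that test are (as I recall from the .q file; please verify against the actual test before relying on this) derived from srcpart itself:

```sql
-- Setup sketch for the repro query above (paraphrased from the DPP test file,
-- not copied verbatim -- double-check spark_dynamic_partition_pruning.q):
create table srcpart_date as select ds as ds, ds as `date` from srcpart group by ds;
create table srcpart_hour as select hr as hr, hr as hour from srcpart group by hr;
```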
> NPE in SparkDynamicPartitionPruningResolver
> -------------------------------------------
>
>                 Key: HIVE-18148
>                 URL: https://issues.apache.org/jira/browse/HIVE-18148
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>
> The stack trace is:
> {noformat}
> 2017-11-27T10:32:38,752 ERROR [e6c8aab5-ddd2-461d-b185-a7597c3e7519 main] ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver$SparkDynamicPartitionPruningDispatcher.dispatch(SparkDynamicPartitionPruningResolver.java:100)
>     at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>     at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>     at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>     at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver.resolve(SparkDynamicPartitionPruningResolver.java:74)
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeTaskPlan(SparkCompiler.java:568)
> {noformat}
> At this stage, there shouldn't be a DPP sink whose target map work is null. The root cause seems to be a malformed operator tree generated by SplitOpTreeForDPP.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
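The failure mode in the stack trace above can be illustrated with a toy model. Note this is a deliberately simplified sketch: the class and method names below are invented for illustration and are not Hive's actual internal APIs. It shows why a pruning sink with a null target map work makes an unguarded dispatcher throw an NPE, and what a defensive variant looks like.

```java
import java.util.Arrays;
import java.util.List;

public class DppSinkCheck {
    // Hypothetical stand-in for a DPP sink's reference to its target map work.
    // A null targetWorkName models the malformed operator tree from the report.
    static class PruningSink {
        final String targetWorkName;
        PruningSink(String targetWorkName) { this.targetWorkName = targetWorkName; }
    }

    // Unguarded dispatch: dereferences the target directly, so a malformed
    // sink (null target) triggers a NullPointerException, mirroring the
    // crash seen in the resolver's dispatch method.
    static int dispatchUnsafe(List<PruningSink> sinks) {
        int resolved = 0;
        for (PruningSink s : sinks) {
            if (s.targetWorkName.startsWith("Map")) { // NPE when target is null
                resolved++;
            }
        }
        return resolved;
    }

    // Defensive variant: detect the malformed sink and skip it (a real fix
    // would likely repair or reject the tree rather than silently skip).
    static int dispatchGuarded(List<PruningSink> sinks) {
        int resolved = 0;
        for (PruningSink s : sinks) {
            if (s.targetWorkName == null) {
                continue; // malformed sink: no target map work
            }
            if (s.targetWorkName.startsWith("Map")) {
                resolved++;
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // One well-formed sink and one malformed sink with no target work.
        List<PruningSink> sinks =
            Arrays.asList(new PruningSink("Map 1"), new PruningSink(null));

        boolean threwNpe = false;
        try {
            dispatchUnsafe(sinks);
        } catch (NullPointerException e) {
            threwNpe = true;
        }
        System.out.println("unsafe dispatch threw NPE: " + threwNpe);
        System.out.println("guarded dispatch resolved: " + dispatchGuarded(sinks));
    }
}
```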