Hi Stan,

It looks like this is the same issue we are working to solve. Related PRs:
https://github.com/apache/spark/pull/16998
https://github.com/apache/spark/pull/16785

You can take a look at those PRs and help review too. Thanks.

StanZhai wrote:
> Thanks for Cheng's help.
>
> It must be something wrong with InferFiltersFromConstraints. I just
> removed InferFiltersFromConstraints from
> org/apache/spark/sql/catalyst/optimizer/Optimizer.scala to avoid this
> issue. I will analyze this issue with the method you provided.
>
> ------------------ Original ------------------
> From: "Cheng Lian [via Apache Spark Developers List]" <ml-node+s1001551n21069...@n3.nabble.com>
> Sent: Friday, Feb 24, 2017 2:28 AM
> To: "Stan Zhai" <m...@zhaishidan.cn>
> Subject: Re: The driver hangs at DataFrame.rdd in Spark 2.1.0
>
> This one seems to be relevant, but it's already fixed in 2.1.0.
>
> One way to debug is to turn on trace logging and check how the
> analyzer/optimizer behaves.
>
> On 2/22/17 11:11 PM, StanZhai wrote:
>
> Could this be related to
> https://issues.apache.org/jira/browse/SPARK-17733 ?
>
> ------------------ Original ------------------
> From: "Cheng Lian-3 [via Apache Spark Developers List]" <[hidden email]>
> Sent: Thursday, Feb 23, 2017 9:43 AM
> To: "Stan Zhai" <[hidden email]>
> Subject: Re: The driver hangs at DataFrame.rdd in Spark 2.1.0
>
> Just from the thread dump you provided, it seems that this particular
> query plan jams our optimizer. However, it's also possible that the
> driver just happened to be running optimizer rules at that particular
> point in time.
>
> Since query planning doesn't touch any actual data, could you please
> try to minimize this query by replacing the actual relations with
> temporary views derived from Scala local collections? In this way, it
> would be much easier for others to reproduce the issue.
>
> Cheng
>
> On 2/22/17 5:16 PM, Stan Zhai wrote:
>
> Thanks for Lian's reply.
>
> Here is the QueryPlan generated by Spark 1.6.2 (I can't get it in
> Spark 2.1.0):
> ...
>
> ------------------ Original ------------------
> Subject: Re: The driver hangs at DataFrame.rdd in Spark 2.1.0
>
> What is the query plan? We had once observed query plans that grow
> exponentially in iterative ML workloads, where the query planner hangs
> forever. For example, each iteration combines 4 plan trees of the last
> iteration and forms a larger plan tree. The size of the plan tree can
> easily reach billions of nodes after 15 iterations.
>
> On 2/22/17 9:29 AM, Stan Zhai wrote:
>
> Hi all,
>
> The driver hangs at DataFrame.rdd in Spark 2.1.0 when the
> DataFrame (SQL) is complex. The following is the thread dump of my
> driver:
> ...
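By the way, a note on the workaround of deleting InferFiltersFromConstraints from Optimizer.scala: that requires rebuilding Spark. As a rough sketch of a lighter-weight alternative, assuming a later release (the spark.sql.optimizer.excludedRules configuration only exists from Spark 2.4 on, not in the 2.1.0 release discussed here), the rule can be excluded per session:

    import org.apache.spark.sql.SparkSession

    // Sketch only: spark.sql.optimizer.excludedRules is a Spark 2.4+ setting;
    // it does not exist in 2.1.0.
    val spark = SparkSession.builder()
      .appName("disable-infer-filters")
      .master("local[*]")
      .config("spark.sql.optimizer.excludedRules",
        "org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints")
      .getOrCreate()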
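For Cheng's trace-log suggestion, here is a minimal sketch that can be pasted into spark-shell, assuming the log4j 1.x setup Spark ships with; the logger name is my guess based on where RuleExecutor lives in catalyst:

    import org.apache.log4j.{Level, Logger}

    // Raise the catalyst rules package to TRACE so that each rule application
    // is logged; the logger name is an assumption, not confirmed in this thread.
    Logger.getLogger("org.apache.spark.sql.catalyst.rules").setLevel(Level.TRACE)

With that in place, the driver log should show which analyzer/optimizer rule it is spending its time in.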
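And a minimal sketch of the minimization Cheng asked for, runnable in spark-shell; the table names and schemas below are made up for illustration. The point is that DataFrame.rdd exercises query planning without touching any real data:

    // In spark-shell; t1/t2 and their schemas are placeholders for the real tables.
    import spark.implicits._

    Seq((1, "a"), (2, "b")).toDF("id", "name").createOrReplaceTempView("t1")
    Seq((1, 10L), (2, 20L)).toDF("id", "amount").createOrReplaceTempView("t2")

    // Substitute the original complex SQL here; if planning still hangs, this
    // small, data-free script is easy for others to run and reproduce.
    val df = spark.sql("SELECT t1.id, t2.amount FROM t1 JOIN t2 ON t1.id = t2.id")
    df.rdd  // triggers analysis/optimization without reading any real data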
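Finally, in case it helps anyone check whether their workload matches the exponential plan-growth pattern Cheng describes, a toy reproduction (not taken from this thread) that combines 4 copies of the previous plan per iteration; 15 iterations gives on the order of 4^15, about a billion, leaf nodes:

    import spark.implicits._

    var df = Seq(1, 2, 3).toDF("v")
    for (_ <- 1 to 15) {
      // Each iteration embeds 4 copies of the previous logical plan.
      df = df.union(df).union(df.union(df))
    }
    // df.rdd would appear to hang here: the optimizer has to walk a plan tree
    // with roughly 4^15 leaf nodes.

If that pattern matches, breaking the lineage each iteration (for example by checkpointing) should keep the plan small.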
-----
Liang-Chi Hsieh | @viirya
Spark Technology Center
http://www.spark.tc/