Hi,

We had a Spark-a-thon in Warsaw, Poland [1] where we set out to learn
the QueryPlan API. My initial idea was to start with the Analyzer and
register a custom Rule[LogicalPlan] using extendedResolutionRules [2].
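
For anyone following along, such a rule is just a Rule[LogicalPlan]
with an apply method. A minimal do-nothing sketch (the name
MyResolutionRule is made up for illustration; assumes Spark 2.x
Catalyst on the classpath):

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// A do-nothing rule, just to show the shape of Rule[LogicalPlan]
object MyResolutionRule extends Rule[LogicalPlan] {
  override def apply(plan: LogicalPlan): LogicalPlan =
    plan transformUp {
      // match and rewrite the plan nodes of interest here;
      // this catch-all leaves every node unchanged
      case p => p
    }
}
```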

We were glad to find the scaladoc:

"Override to provide additional rules for the "Resolution" batch."

after we had got stuck on how to register the custom rule. It turned
out to be very different from what you can do with ExperimentalMethods
[3], which offers two extension points for the query planner (i.e.
SparkPlanner and SparkOptimizer).
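
For comparison, those two public hooks are just mutable fields on
spark.experimental, so registering is a one-liner. A sketch (the
MyStrategy and MyOptimizerRule names are placeholders I made up):

```scala
import org.apache.spark.sql.{SparkSession, Strategy}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.execution.SparkPlan

val spark = SparkSession.builder.master("local[*]").getOrCreate()

// Placeholder planning strategy: returning Nil defers to the built-in strategies
object MyStrategy extends Strategy {
  def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
}

// Placeholder optimizer rule: a no-op pass over the logical plan
object MyOptimizerRule extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan
}

// The two extension points ExperimentalMethods gives you:
spark.experimental.extraStrategies = MyStrategy :: Nil       // SparkPlanner
spark.experimental.extraOptimizations = MyOptimizerRule :: Nil // SparkOptimizer
```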

We ended up overriding SparkSession, SessionState, and Analyzer, which
was an excellent coding exercise for the sparkathon, but might be too
heavyweight given how simple ExperimentalMethods makes the same thing
for the query planner.

See a solution: https://gist.github.com/loostro/5a14dbca9b52b841cb97d17ba952943f
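
Condensed, the core of our approach looks roughly like the sketch
below. Caveat: SessionState and Analyzer are internal APIs whose
constructors differ between Spark versions, and MyRule is a
placeholder for the custom Rule[LogicalPlan], so treat this as a
sketch rather than copy-paste material:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.analysis.Analyzer
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.internal.SessionState

// Subclass SessionState so we can swap in an Analyzer that carries
// our custom rule in its extendedResolutionRules
class MySessionState(sparkSession: SparkSession)
    extends SessionState(sparkSession) {

  override lazy val analyzer: Analyzer =
    new Analyzer(catalog, conf) {
      override val extendedResolutionRules: Seq[Rule[LogicalPlan]] =
        MyRule :: Nil  // MyRule = your custom Rule[LogicalPlan]
    }
}
```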

Can we do better? Could we have done it using other extension points
to the Analyzer? Are there any plans to open up the Analyzer (akin to
SparkPlanner)? If so, why and when? That could be another excellent
coding exercise for a sparkathon, couldn't it?

Please help.

[1] https://www.meetup.com/WarsawScala/events/238558617/
[2] 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L107
[3] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/ExperimentalMethods.scala

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
