My apologies in advance if this is a dev mailing list topic. I am working on
a small project to provide a web interface to the Spark REPL. The interface
will let people use the Spark REPL to perform exploratory analysis on their
data. I already have a Play application running that provides a web interface
to the standard Scala REPL, and I am looking to extend it to optionally
support the Spark REPL. My initial idea was to include the Spark dependencies
in the project, create a new SparkContext instance, and bind it to a variable
(let's say 'sc') using imain.bind("sc", sparkContext). While this may work in
theory, I am trying to understand why the Spark REPL takes a different path
by creating its own SparkILoop, SparkIMain, etc. Can anyone help me
understand why custom versions of IMain, ILoop, etc. were needed instead of
embedding the standard Scala REPL and binding a SparkContext instance?
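To make the idea concrete, here is a rough, untested sketch of what I have in
mind (the configuration and names below are illustrative, not taken from the
Spark sources):

    import scala.tools.nsc.Settings
    import scala.tools.nsc.interpreter.IMain
    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: embed the standard Scala REPL and hand it a SparkContext.
    val settings = new Settings
    settings.usejavacp.value = true  // reuse the application's classpath

    val imain = new IMain(settings)
    val sparkContext = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("web-repl"))

    // Expose the context to interpreted code as 'sc'.
    imain.bind("sc", "org.apache.spark.SparkContext", sparkContext)
    imain.interpret("sc.parallelize(1 to 10).sum")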

Here is my analysis so far:
1. ExecutorClassLoader - I understand this is needed to load classes from
HDFS. Perhaps it could have been plugged into the standard Scala REPL via
settings.embeddedDefaults(classLoaderInstance). Also, it's not clear what
ConstructorCleaner does.
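For what it's worth, a minimal (untested) sketch of what I mean, where the
loader is just a stand-in for whatever would actually fetch the
REPL-generated classes:

    // Untested sketch: point the standard REPL at a custom class loader.
    val classLoaderInstance: ClassLoader = getClass.getClassLoader  // stand-in
    val settings = new scala.tools.nsc.Settings
    settings.usejavacp.value = true
    settings.embeddedDefaults(classLoaderInstance)
    val imain = new scala.tools.nsc.interpreter.IMain(settings)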

2. SparkCommandLine & SparkRunnerSettings - These allow an extra "-i <file>"
argument to be passed to the REPL. Wouldn't the standard sourcepath setting
have sufficed?
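If the -i handling were the only concern, I imagine an embedded interpreter
could simply preload a file itself, roughly like this (sketch; "init.scala"
is a made-up path and imain is the interpreter from the sketch above):

    // Sketch: emulate "-i init.scala" against an embedded IMain.
    val initSource = scala.io.Source.fromFile("init.scala").mkString
    imain.beQuietDuring { imain.interpret(initSource) }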

3. SparkExprTyper - The only difference between the standard ExprTyper and
SparkExprTyper is that repldbg is replaced with logDebug. I am not sure
whether this was intentional or needed.

4. SparkILoop - Has a few deviations from the standard ILoop class, but these
could have been handled by extending or wrapping ILoop, or through settings.
I am not sure what triggered the need to copy the source code and edit it
(see the sketch after point 5 below for the extension approach).

5. SparkILoopInit - Changes the welcome message and binds the Spark context
in the interpreter. The welcome message could have been changed by extending
ILoopInit.
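Covering this point and the previous one, here is the kind of extension I had
in mind (a rough sketch against the 2.10/2.11 REPL internals; member names
like printWelcome, createInterpreter and intp may differ across Scala
versions):

    import scala.tools.nsc.interpreter.ILoop
    import org.apache.spark.SparkContext

    // Sketch: reuse the standard ILoop, customising only the banner and
    // the 'sc' binding.
    class WebSparkILoop(sparkContext: SparkContext) extends ILoop {
      override def printWelcome(): Unit = {
        out.println("Welcome to the embedded Spark shell")
        out.flush()
      }

      override def createInterpreter(): Unit = {
        super.createInterpreter()
        intp.beQuietDuring {
          intp.bind("sc", "org.apache.spark.SparkContext", sparkContext)
        }
      }
    }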

6. SparkIMain - Contains quite a few changes around class loading, logging,
etc., but I found it very hard to figure out whether extending IMain was an
option and what exactly doesn't or wouldn't work with IMain.

The rest of the classes seem very similar to their standard counterparts. I
have a feeling the Spark REPL could be refactored to embed the standard
Scala REPL. I know such a refactoring would not help the Spark project as
such, but it would let people embed the Spark REPL in much the same way as
the standard Scala REPL. Thoughts?


