[DISCUSS] PR #208 - R Interpreter for Zeppelin

Eric Charles Wed, 30 Dec 2015 06:05:13 -0800

Hi,

I had a look at https://github.com/apache/incubator-zeppelin/pull/208 (and the related GitHub repo https://github.com/elbamos/Zeppelin-With-R [1]).

Here are a few topics for discussion based on my experience developing https://github.com/datalayer/zeppelin-R [2].


1. rscala jar not in Maven Repository

[1] copies the source code (Scala and R) from the rscala repo and changes/extends/repackages it a bit. [2] declares the jar as a system-scoped library. I recently had incompatibility issues between 1.0.8 (the version you get since 2015-12-10 when you install rscala in your R environment) and the 1.0.6 jar I am using as part of the zeppelin-R build. To avoid such issues, why not let the user choose the version via a property at build time, to match the version they run on their host? This would also let us benefit from the next rscala releases, which fix bugs, bring new features... It also means we don't have to copy the rscala code into the Zeppelin tree.
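
To illustrate the kind of mismatch a build-time property is meant to avoid, here is a minimal sketch in Python (chosen only for brevity). It fails fast when the rscala version selected at build time differs from the one installed in R. The function name is hypothetical and how the two version strings are obtained is left out; nothing here is part of rscala or Zeppelin.

```python
def check_rscala_version(build_version: str, installed_version: str) -> None:
    """Raise if the jar version chosen at build time differs from the R package version.

    Both arguments are assumed to be plain dotted strings such as "1.0.6".
    Requiring an exact match is an assumption made for this sketch.
    """
    if build_version.strip() != installed_version.strip():
        raise RuntimeError(
            "rscala version mismatch: built against %s, R has %s installed"
            % (build_version, installed_version)
        )


# Matching versions pass silently.
check_rscala_version("1.0.8", "1.0.8")
```

With the versions from my case, check_rscala_version("1.0.6", "1.0.8") would raise instead of failing later in a less obvious way.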


2. Interpreters

[1] proposes 2 interpreters, %sparkr.r and %sparkr.knitr, which are implemented in their own module apart from the Spark one. To be aligned with the existing pyspark implementation, why not integrate the R code into the Spark module? Is there any reason to keep 2 versions which do basically the same thing? The single magic keyword would then be %spark.r


3. Rendering TABLE plot when interpreter result is a dataframe

This may be confusing. What if I display a plot and simply want to print the first 10 rows at the end of my code? To keep the same behavior as the other interpreters, we could make this feature optional (disabled by default, enabled via property).
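
A minimal sketch of the opt-in behavior, in Python for brevity. The property name and the function are hypothetical and not existing Zeppelin settings. The only thing taken from Zeppelin is the %table display prefix with tab-separated columns and newline-separated rows.

```python
# Hypothetical property name; not an existing Zeppelin setting.
RENDER_TABLE_PROPERTY = "zeppelin.R.render.dataframe.as.table"


def render_dataframe(header, rows, properties):
    """Format a data frame result as a Zeppelin table or as plain text."""
    lines = ["\t".join(str(c) for c in header)]
    lines += ["\t".join(str(c) for c in row) for row in rows]
    body = "\n".join(lines)
    # Disabled by default: only an explicit "true" turns table rendering on.
    enabled = properties.get(RENDER_TABLE_PROPERTY, "false").lower() == "true"
    return "%table " + body if enabled else body
```

With an empty property map the rows come back as plain text, so a paragraph that draws a plot and then prints a few rows behaves as in the other interpreters.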



Thx, Eric
