Vipul Modi created ZEPPELIN-2199:
------------------------------------

             Summary: spark.lapply not working but SparkR:::lapply works.
                 Key: ZEPPELIN-2199
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2199
             Project: Zeppelin
          Issue Type: Bug
    Affects Versions: 0.7.0
            Reporter: Vipul Modi


To run lapply in distributed mode via Zeppelin, one has to call SparkR:::lapply
instead of spark.lapply.
According to the Spark documentation, spark.lapply should work.

Steps to reproduce:

Build Zeppelin with the R profiles enabled:
mvn clean install -DskipTests -Drat.skip=true -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11

Failing paragraph:
%spark.r
families <- c("gaussian", "poisson")
df <- createDataFrame(iris)
train <- function(family) {
   model <- glm(Sepal.Length ~ Sepal.Width + Species, iris, family = family)
   summary(model)
}
model.summaries <- spark.lapply(families, train)

Passing paragraph:
%spark.r
families <- c("gaussian", "poisson")
df <- createDataFrame(iris)
train <- function(family) {
   model <- glm(Sepal.Length ~ Sepal.Width + Species, iris, family = family)
   summary(model)
}
model.summaries <- SparkR:::lapply(families, train)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
