Vipul Modi created ZEPPELIN-2199:
------------------------------------

             Summary: spark.lapply not working but SparkR:::lapply works.
                 Key: ZEPPELIN-2199
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2199
             Project: Zeppelin
          Issue Type: Bug
    Affects Versions: 0.7.0
            Reporter: Vipul Modi

To run lapply in distributed mode via Zeppelin, one needs to call SparkR:::lapply instead of spark.lapply. According to the Spark documentation, spark.lapply should work.

Steps to reproduce:

Build Zeppelin with the R profiles enabled:

    mvn clean install -DskipTests -Drat.skip=true -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11

Failing paragraph:

    %spark.r
    families <- c("gaussian", "poisson")
    df <- createDataFrame(iris)
    train <- function(family) {
      model <- glm(Sepal.Length ~ Sepal.Width + Species, iris, family = family)
      summary(model)
    }
    model.summaries <- spark.lapply(families, train)

Passing paragraph:

    %spark.r
    families <- c("gaussian", "poisson")
    df <- createDataFrame(iris)
    train <- function(family) {
      model <- glm(Sepal.Length ~ Sepal.Width + Species, iris, family = family)
      summary(model)
    }
    model.summaries <- SparkR:::lapply(families, train)

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)