Thanks Jeff.
Problem solved by installing the R packages into /usr/lib64/R/library (the
default lib path) on each data node. Your clue helped!
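(A follow-up note for anyone hitting the same thing: one rough way to confirm the packages resolve from the default lib path on the worker nodes, without ssh-ing to each one, is to probe the executors from the notebook. A minimal sketch, assuming the same livy2.pyspark interpreter used later in this thread; the partition count is arbitrary and only meant to touch several nodes.)
%livy2.pyspark
import subprocess

def probe(_):
    # Ask R on whichever node this partition lands on whether the package
    # loads from the default lib path.
    yield subprocess.getoutput("Rscript -e 'library(changepoint); cat(.libPaths())'")

print(sc.parallelize(range(8), 8).mapPartitions(probe).collect())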
On Wed, Aug 29, 2018 at 7:40 PM Jeff Zhang wrote:
I am not sure what's wrong. Maybe you can ssh to that machine and run the
R script manually first to verify what's wrong.
Lian Jiang wrote on Thu, Aug 30, 2018 at 10:34 AM:
Jeff,
R is installed on the namenode and all data nodes, and the R packages have
been copied to all of them too. I am not sure whether an R script launched by
pyspark's subprocess can access the Spark context. If it cannot, using
addFile to add the R packages to the Spark context will not help test.r
install the packages.
You need to make sure the Spark driver machine has this package installed.
And since you are using yarn-cluster mode via livy, you have to install the
packages on all nodes, because the Spark driver could be launched on any
node of the cluster.
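(To make the yarn-cluster point concrete: the driver's host can be printed from the notebook, and the package load tested right there. A minimal sketch, illustrative only.)
%livy2.pyspark
import socket, subprocess
# In yarn-cluster mode this prints whichever node YARN chose for the driver.
print("driver host:", socket.gethostname())
print(subprocess.getoutput("Rscript -e 'library(changepoint)'"))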
Lian Jiang wrote on Thu, Aug 30, 2018 at 1:46 AM:
After calling a sample R script successfully, we hit another issue when
running a real R script: it failed to load the changepoint library.
I tried:
%livy2.sparkr
install.packages("changepoint", repos="file:///mnt/data/tmp/r")
library(changepoint)  # I see "Successfully loaded changepoint package ver
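(One quick check when library() fails: ask R, as a subprocess, whether and where the package is installed on the node running the paragraph. A sketch via the pyspark interpreter; system.file() prints the package's install path, or nothing if it is missing.)
%livy2.pyspark
import subprocess
print(subprocess.getoutput(
    "Rscript -e 'cat(system.file(package=\"changepoint\"))'"))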
Thanks Jeff.
This worked:
%livy2.pyspark
from pyspark import SparkFiles
import subprocess
sc.addFile("hdfs:///user/zeppelin/ocic/test.r")
testpath = SparkFiles.get('test.r')
stdoutdata = subprocess.getoutput("Rscript " + testpath)
print(stdoutdata)
Cheers!
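(Side note for later readers: subprocess.getoutput() merges stderr into stdout and hides the exit code, so Rscript failures can pass silently. A variant with subprocess.run(), available on Python 3.5+, surfaces them; a sketch reusing the same path:)
%livy2.pyspark
from pyspark import SparkFiles
import subprocess
sc.addFile("hdfs:///user/zeppelin/ocic/test.r")
result = subprocess.run(["Rscript", SparkFiles.get("test.r")],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)
print(result.returncode)
print(result.stdout)
print(result.stderr)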
On Tue, Aug 28, 2018 at 6:09 PM Jeff Zhang wrote:
Do you run it under yarn-cluster mode? Then you must ensure your R script is
shipped to that driver (via sc.addFile or by setting livy.spark.files).
You also need to make sure R is installed on all hosts of the yarn cluster,
because the driver may run on any node of the cluster.
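(For the livy.spark.files route mentioned above: Zeppelin's livy interpreter forwards livy.spark.* properties to Spark, so the script can also be shipped via the interpreter settings instead of in code. A sketch of the property, reusing the HDFS path from this thread:)
livy.spark.files = hdfs:///user/zeppelin/test.r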
Lian Jiang wrote on Aug ..., 2018:
Thanks Lucas. We tried and got the same error. Below is the code:
%livy2.pyspark
import subprocess
sc.addFile("hdfs:///user/zeppelin/test.r")
stdoutdata = subprocess.getoutput("Rscript test.r")
print(stdoutdata)
Fatal error: cannot open file 'test.r': No such file or directory
sc.addFile adds t
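(The error fits addFile() downloading into Spark's scratch directory rather than the process's working directory, so the bare name 'test.r' does not resolve for Rscript. A quick way to see where the file actually landed, as a sketch:)
%livy2.pyspark
from pyspark import SparkFiles
print(SparkFiles.getRootDirectory())  # scratch dir that addFile() downloads into
print(SparkFiles.get("test.r"))       # absolute local path to the shipped copy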
Have you tried SparkContext.addFile() (not addPyFile()) to add your R script?
https://spark.apache.org/docs/2.2.0/api/python/pyspark.html#pyspark.SparkContext.addFile
From: Lian Jiang
Sent: 27 August 2018 22:42
To: users@zeppelin.apache.org
Subject: EXT: Python script calls R script in Zeppelin o