Re: Python script calls R script in Zeppelin on Hadoop

2018-08-28 Thread Lian Jiang
Thanks Jeff. This worked:

    %livy2.pyspark
    from pyspark import SparkFiles
    import subprocess

    sc.addFile("hdfs:///user/zeppelin/ocic/test.r")
    testpath = SparkFiles.get('test.r')
    stdoutdata = subprocess.getoutput("Rscript " + testpath)
    print(stdoutdata)

Cheers!

On Tue, Aug 28, 2018 at 6:09 PM Jeff
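A slightly more defensive way to launch the script, as a sketch only. It assumes the local path has already been resolved (for example with SparkFiles.get('test.r') after sc.addFile), passes the arguments as a list instead of concatenating a shell string, and raises if the script exits with a non-zero status. The helper name run_script and its interpreter parameter are hypothetical; they are not part of Spark, Livy, or Zeppelin.

```python
import subprocess


def run_script(script_path, interpreter="Rscript"):
    """Run script_path with the given interpreter and return its stdout.

    Hypothetical helper, not part of Spark, Livy, or Zeppelin.
    """
    # A list of arguments avoids shell quoting problems when the
    # resolved path contains spaces or other special characters.
    result = subprocess.run(
        [interpreter, script_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Surface the interpreter's own error text (e.g. "cannot open file").
        raise RuntimeError(
            "{} exited with {}: {}".format(
                interpreter, result.returncode, result.stderr.strip()
            )
        )
    return result.stdout
```

In the notebook this would be called as print(run_script(SparkFiles.get('test.r'))) after the sc.addFile call. If the interpreter itself is missing on the host, subprocess.run raises FileNotFoundError rather than returning.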

Re: Python script calls R script in Zeppelin on Hadoop

2018-08-28 Thread Jeff Zhang
Do you run it under yarn-cluster mode? If so, you must ensure your R script is shipped to the driver (via sc.addFile or by setting livy.spark.files). You also need to make sure R is installed on all hosts of the YARN cluster, because the driver may run on any node of the cluster.

Lian Jiang wrote in August 2018
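One way to check the second point from inside the notebook is to look for Rscript on the PATH of whichever host the driver landed on. This is a minimal sketch: the helper name find_interpreter is hypothetical, and it inspects only the host it runs on, not the whole cluster.

```python
import shutil
import socket


def find_interpreter(name="Rscript"):
    """Return (hostname, path); path is None if name is not on PATH.

    Hypothetical helper for illustration only.
    """
    return socket.gethostname(), shutil.which(name)


host, path = find_interpreter()
if path is None:
    print("Rscript not found on PATH on host {}".format(host))
else:
    print("Rscript found at {} on host {}".format(path, host))
```

Because the driver can be scheduled on a different node each time, a single successful check does not prove that every host has R installed.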

Re: Python script calls R script in Zeppelin on Hadoop

2018-08-28 Thread Lian Jiang
Thanks Lucas. We tried and got the same error. Below is the code:

    %livy2.pyspark
    import subprocess

    sc.addFile("hdfs:///user/zeppelin/test.r")
    stdoutdata = subprocess.getoutput("Rscript test.r")
    print(stdoutdata)

    Fatal error: cannot open file 'test.r': No such file or directory

sc.addFile adds t

RE: Python script calls R script in Zeppelin on Hadoop

2018-08-28 Thread Partridge, Lucas (GE Aviation)
Have you tried SparkContext.addFile() (not addPyFile()) to add your R script?

https://spark.apache.org/docs/2.2.0/api/python/pyspark.html#pyspark.SparkContext.addFile

From: Lian Jiang
Sent: 27 August 2018 22:42
To: users@zeppelin.apache.org
Subject: EXT: Python script calls R script in Zeppelin o