Thanks Lucas. We tried and got the same error. Below is the code:

%livy2.pyspark
import subprocess
sc.addFile("hdfs:///user/zeppelin/test.r")
stdoutdata = subprocess.getoutput("Rscript test.r")
print(stdoutdata)
Fatal error: cannot open file 'test.r': No such file or directory

sc.addFile adds test.r to the Spark context. However, subprocess does not use the Spark context. An HDFS path does not work either:

subprocess.getoutput("Rscript hdfs:///user/zeppelin/test.r")

Any idea how to make Python call an R script? Appreciate it!

On Tue, Aug 28, 2018 at 1:13 AM Partridge, Lucas (GE Aviation) <lucas.partri...@ge.com> wrote:

> Have you tried SparkContext.addFile() (not addPyFile()) to add your R script?
>
> https://spark.apache.org/docs/2.2.0/api/python/pyspark.html#pyspark.SparkContext.addFile
>
> *From:* Lian Jiang <jiangok2...@gmail.com>
> *Sent:* 27 August 2018 22:42
> *To:* users@zeppelin.apache.org
> *Subject:* EXT: Python script calls R script in Zeppelin on Hadoop
>
> Hi,
>
> We are using HDP 3.0 (with Zeppelin 0.8.0) and are migrating Jupyter notebooks to Zeppelin. One issue we came across is that a Python script calling an R script does not work in Zeppelin.
>
> %livy2.pyspark
> import os
> sc.addPyFile("hdfs:///user/zeppelin/my.py")
> import my
> my.test()
>
> my.test() calls the R script like: ['Rscript', 'myR.r']
>
> Fatal error: cannot open file 'myR.r': No such file or directory
>
> When running this notebook in Jupyter, both my.py and myR.r exist in the same folder. I understand the story changes on Hadoop because the scripts run in containers.
>
> My question:
> Is this scenario supported in Zeppelin? How do I add an R script into a Python Spark context so that the Python script can find the R script? Appreciate it!
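One thing that may be worth trying: sc.addFile downloads the file to a node-local working directory, but subprocess only sees the local filesystem, so the bare name "test.r" (or an hdfs:// URI) won't resolve. PySpark exposes the downloaded location via pyspark.SparkFiles.get("test.r"); passing that absolute path to Rscript may fix it (untested on Livy, so treat it as a guess). The principle, sketched without Spark or R so it runs anywhere (a stand-in Python script plays the role of test.r, and the path resolution stands in for SparkFiles.get):

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the cluster setup: create a script in some local directory,
# the way sc.addFile materializes test.r on each node.
with tempfile.TemporaryDirectory() as d:
    script = os.path.join(d, "test_script.py")  # hypothetical stand-in for test.r
    with open(script, "w") as f:
        f.write("print('hello from script')\n")

    # subprocess needs the absolute local path, not a bare filename or HDFS URI.
    # In the real notebook this path would come from SparkFiles.get("test.r"),
    # and the command would be 'Rscript <path>'.
    out = subprocess.getoutput('"%s" "%s"' % (sys.executable, script))
    print(out)
```

In the pyspark paragraph that would look like: sc.addFile("hdfs:///user/zeppelin/test.r"), then subprocess.getoutput("Rscript " + SparkFiles.get("test.r")).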