Thanks Lucas. We tried and got the same error. Below is the code:

%livy2.pyspark
import subprocess
sc.addFile("hdfs:///user/zeppelin/test.r")
stdoutdata = subprocess.getoutput("Rscript test.r")
print(stdoutdata)
Fatal error: cannot open file 'test.r': No such file or directory

sc.addFile adds test.r to the Spark context. However, subprocess does not use the Spark context. An HDFS path does not work either:

subprocess.getoutput("Rscript hdfs:///user/zeppelin/test.r")

Any idea how to make Python call an R script? Appreciate it!

On Tue, Aug 28, 2018 at 1:13 AM Partridge, Lucas (GE Aviation) <lucas.partri...@ge.com> wrote:

> Have you tried SparkContext.addFile() (not addPyFile()) to add your R script?
>
> https://spark.apache.org/docs/2.2.0/api/python/pyspark.html#pyspark.SparkContext.addFile
>
> *From:* Lian Jiang <jiangok2...@gmail.com>
> *Sent:* 27 August 2018 22:42
> *To:* users@zeppelin.apache.org
> *Subject:* EXT: Python script calls R script in Zeppelin on Hadoop
>
> Hi,
>
> We are using HDP 3.0 (with Zeppelin 0.8.0) and are migrating Jupyter notebooks to Zeppelin. One issue we came across is that a Python script calling an R script does not work in Zeppelin.
>
> %livy2.pyspark
> import os
> sc.addPyFile("hdfs:///user/zeppelin/my.py")
> import my
> my.test()
>
> my.test() calls the R script like: ['Rscript', 'myR.r']
>
> Fatal error: cannot open file 'myR.r': No such file or directory
>
> When running this notebook in Jupyter, both my.py and myR.r exist in the same folder. I understand the story changes on Hadoop because the scripts run in containers.
>
> My question:
> Is this scenario supported in Zeppelin? How do I add an R script into a Python Spark context so that the Python script can find the R script? Appreciate it!
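One thing that may be worth trying: sc.addFile downloads the file to a node-local working directory, but subprocess only sees the local filesystem, so the bare name "test.r" (or an hdfs:// URI) won't resolve. PySpark exposes the downloaded location via pyspark.SparkFiles.get("test.r"); passing that absolute path to Rscript may fix it (untested on Livy, so treat it as a guess). The principle, sketched without Spark or R so it runs anywhere (a stand-in Python script plays the role of test.r, and the path resolution stands in for SparkFiles.get):

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the cluster setup: create a script in some local directory,
# the way sc.addFile materializes test.r on each node.
with tempfile.TemporaryDirectory() as d:
    script = os.path.join(d, "test_script.py")  # hypothetical stand-in for test.r
    with open(script, "w") as f:
        f.write("print('hello from script')\n")

    # subprocess needs the absolute local path, not a bare filename or HDFS URI.
    # In the real notebook this path would come from SparkFiles.get("test.r"),
    # and the command would be 'Rscript <path>'.
    out = subprocess.getoutput('"%s" "%s"' % (sys.executable, script))
    print(out)
```

In the pyspark paragraph that would look like: sc.addFile("hdfs:///user/zeppelin/test.r"), then subprocess.getoutput("Rscript " + SparkFiles.get("test.r")).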