Thanks Shivaram! Will give it a try and let you know. Regards, Pawan Venugopal
On Mon, Apr 7, 2014 at 3:38 PM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

> You can create standalone jobs in SparkR as just R files that are run
> using the sparkR script. These commands are sent to a Spark cluster, and
> the examples in the SparkR repository
> (https://github.com/amplab-extras/SparkR-pkg#examples-unit-tests) are in
> fact standalone jobs.
>
> However, I don't think that will completely solve your use case of combining
> Streaming with R. We don't yet have a way to call R functions from Spark's
> Java or Scala API. So right now, one thing you can try is to save data from
> Spark Streaming to HDFS and then run a SparkR job which reads in the files.
>
> Regarding the other idea of calling R from Scala -- it might be possible
> to do that in your code if the classpath etc. is set up correctly. I haven't
> tried it myself, but do let us know if you get it to work.
>
> Thanks
> Shivaram
>
>
> On Mon, Apr 7, 2014 at 2:21 PM, pawan kumar <pkv...@gmail.com> wrote:
>
>> Hi,
>>
>> Is it possible to create a standalone job in Scala using SparkR? If so,
>> could you provide information on the setup process (such as the SBT
>> dependencies and where to include the JAR files)?
>>
>> This is my use case:
>>
>> 1. I have a standalone Spark Streaming job running on my local machine
>> which streams Twitter data.
>> 2. I have an R script which performs sentiment analysis.
>>
>> I am looking for an optimal way to combine these two operations into a
>> single job and run it with the "sbt run" command.
>>
>> I came across this document, which talks about embedding R in Scala
>> (http://dahl.byu.edu/software/jvmr/dahl-payne-uppalapati-2013.pdf), but I
>> was not sure whether that would work well within the Spark context.
>>
>> Thanks,
>> Pawan Venugopal
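
For reference, here is a minimal sketch of the standalone R-file approach Shivaram describes above. The file name, master URL, and app name are placeholders, and the calls follow the examples in the SparkR-pkg README; save it as something like example_job.R and launch it with ./sparkR example_job.R local[2]:

library(SparkR)

# Connect to Spark; the master URL and app name are local placeholders.
sc <- sparkR.init(master = "local[2]", appName = "StandaloneExample")

# Distribute a small vector across 2 slices and run a trivial transformation.
rdd <- parallelize(sc, 1:1000, 2)
squares <- lapply(rdd, function(x) { x * x })

# Actions such as count() and take() trigger the computation.
print(count(squares))
print(take(squares, 5))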
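
And a sketch of the HDFS hand-off suggested for the streaming use case: the existing Scala streaming job would persist the tweet text to an HDFS directory (for example with DStream.saveAsTextFiles), and a separate SparkR job, run afterwards with the sparkR script, would read those files and apply the R sentiment code. The HDFS path and score_sentiment() below are stand-ins for Pawan's own setup:

library(SparkR)

sc <- sparkR.init(master = "local[2]", appName = "TweetSentiment")

# Read whatever the streaming job wrote out; this path/glob is hypothetical.
tweets <- textFile(sc, "hdfs://localhost:9000/twitter/tweets-*")

# Placeholder for the real sentiment function from the existing R script.
score_sentiment <- function(text) {
  nchar(text) %% 3 - 1   # dummy score in {-1, 0, 1}
}

# Score each tweet and pull a few results back to the driver.
scored <- lapply(tweets, function(line) {
  list(text = line, score = score_sentiment(line))
})
print(take(scored, 5))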