Re: Creating new Spark context when running in Secure YARN fails

2015-11-20 Thread Hari Shreedharan
Can you try this: https://github.com/apache/spark/pull/9875. I believe this patch should fix the issue here. Thanks, Hari Shreedharan > On Nov 11, 2015, at 1:59 PM, Ted Yu wrote: > > Please take a look at > yarn/src/main/scal…

Re: Monitoring tools for spark streaming

2015-09-28 Thread Hari Shreedharan
+1. The Streaming UI should give you more than enough information. Thanks, Hari On Mon, Sep 28, 2015 at 9:55 PM, Shixiong Zhu wrote: > Which version are you using? Could you take a look at the new Streaming UI > in 1.4.0? > Best Regards, > Shixiong Zhu > 2015-09-29 7:52 GMT+08:00 Siva : >> H…
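For readers who also want the batch metrics the Streaming UI charts but in headless form, the StreamingListener API exposes the same numbers programmatically. A minimal sketch; the listener class name and the println logging are my own illustration, not anything from the thread:

```scala
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Prints the per-batch delays that the Streaming UI charts.
class BatchMetricsListener extends StreamingListener {
  override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
    val info = batch.batchInfo
    println(s"batch ${info.batchTime}: " +
      s"schedulingDelay=${info.schedulingDelay.getOrElse(-1L)} ms, " +
      s"processingDelay=${info.processingDelay.getOrElse(-1L)} ms")
  }
}

// assuming `ssc` is an existing StreamingContext
def registerMetrics(ssc: StreamingContext): Unit =
  ssc.addStreamingListener(new BatchMetricsListener)
```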

Re: Spark Streaming failing on YARN Cluster

2015-08-19 Thread Hari Shreedharan
It looks like you are having issues with the files getting distributed to the cluster. What is the exception you are getting now? On Wednesday, August 19, 2015, Ramkumar V wrote: > Thanks a lot for your suggestion. I had modified HADOOP_CONF_DIR in > spark-env.sh so that core-site.xml is under H…

Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Hari Shreedharan
Btw, if you want to write to Spark Streaming from Flume -- there is a sink (it is a part of Spark, not Flume). See Approach 2 here: http://spark.apache.org/docs/latest/streaming-flume-integration.html On Wed, Nov 19, 2014 at 12:41 PM, Hari Shreedharan <hshreedha...@cloudera.com> wrote: …
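To make Approach 2 concrete: once the Spark sink is configured in the Flume agent, the Spark side pulls from it with FlumeUtils.createPollingStream. A minimal sketch, where "agent-host" and 9988 are placeholders that must match the sink's configuration in the Flume agent:

```scala
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.flume.FlumeUtils

// assuming `ssc` is an existing StreamingContext; host and port are
// placeholders matching the Spark sink configured in the Flume agent
def flumeSinkLines(ssc: StreamingContext): DStream[String] =
  FlumeUtils.createPollingStream(ssc, "agent-host", 9988)
    .map(e => new String(e.event.getBody.array()))
```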

Re: Spark Streaming with Flume or Kafka?

2014-11-19 Thread Hari Shreedharan
As of now, you can feed Spark Streaming from both Kafka and Flume. Currently, though, there is no API to write data back to either of the two directly. I sent a PR which should eventually add something like this: https://github.com/harishreedharan/spark/blob/Kafka-output/external/kafka/src/main/scal…
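Until a built-in output API like that PR lands, the usual workaround is to write to Kafka yourself inside foreachRDD, creating a producer per partition. A sketch of that pattern, not the PR's API; the broker address, topic name, and String payloads are all illustrative assumptions:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.streaming.dstream.DStream

// assuming `dstream` is an existing DStream[String]; broker and topic are placeholders
def writeToKafka(dstream: DStream[String]): Unit = {
  dstream.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // one producer per partition avoids shipping a producer from the driver
      val props = new Properties()
      props.put("bootstrap.servers", "broker-host:9092")
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      val producer = new KafkaProducer[String, String](props)
      records.foreach(r => producer.send(new ProducerRecord[String, String]("out-topic", r)))
      producer.close()
    }
  }
}
```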

Re: Spark and Scala

2014-09-12 Thread Hari Shreedharan
No, Scala primitives remain primitives. Unless you create an RDD using one of the many methods, you will not be able to access any of the RDD methods. There is no automatic porting. Spark is an application as far as Scala is concerned - there is no compilation (except, of course, the Scala, JIT co…
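In other words, nothing becomes an RDD until you hand it to Spark explicitly. A minimal sketch illustrating the distinction (the app name and local master are placeholders):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("RddExample").setMaster("local[*]"))
    val nums = Seq(1, 2, 3, 4)      // an ordinary Scala collection: no RDD methods here
    val rdd = sc.parallelize(nums)  // explicitly creating an RDD is what unlocks them
    println(rdd.map(_ * 2).collect().mkString(","))  // this map is RDD.map, run by Spark
    sc.stop()
  }
}
```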

Re: Spark Streaming on Yarn Input from Flume

2014-08-07 Thread Hari Shreedharan
Do you see anything suspicious in the logs? How did you run the application? On Thu, Aug 7, 2014 at 10:02 PM, XiaoQinyu wrote: > Hi~ > > I run a Spark Streaming app to receive data from Flume events. When I run on > standalone, Spark Streaming can receive the Flume events normally. But if I > run t…

Re: store spark streaming dstream in hdfs or cassandra

2014-07-31 Thread Hari Shreedharan
Off the top of my head, you can use the ForEachDStream, to which you pass in the code that writes to Hadoop, and then register that as an output stream, so the function you pass in is periodically executed and causes the data to be written to HDFS. If you are ok with the data being in text forma…
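In the public API this pattern is exposed as DStream.foreachRDD, which creates a ForEachDStream and registers it as an output stream under the hood. A minimal sketch for text output, with the HDFS path being a placeholder:

```scala
import org.apache.spark.streaming.dstream.DStream

// assuming `lines` is an existing DStream[String]; the HDFS path is illustrative
def persistToHdfs(lines: DStream[String]): Unit = {
  lines.foreachRDD { (rdd, time) =>
    // runs once per batch interval; each batch lands in its own directory
    rdd.saveAsTextFile(s"hdfs:///user/example/stream-out/batch-${time.milliseconds}")
  }
}
```

For plain text, `lines.saveAsTextFiles("hdfs:///user/example/stream-out")` achieves the same result in a single call.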

Re: Spark and Flume integration - do I understand this correctly?

2014-07-29 Thread Hari Shreedharan
Hi, Deploying Spark with Flume is pretty simple. What you'd need to do is: 1. Start your Spark Flume DStream receiver on some machine using one of the FlumeUtils.createStream methods - where you need to specify the hostname and port of the worker node on which you want the Spark executor to r…
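A minimal sketch of that first step, the push-based receiver; "worker-host-1" and 41414 are placeholders, and the Avro sink in the Flume agent must be configured to send to the same address:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

object FlumePushExample {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("FlumePushExample"), Seconds(10))
    // The receiver binds to this host/port on a worker node; the Flume
    // agent's Avro sink must point at the same address (both placeholders).
    val flumeStream = FlumeUtils.createStream(ssc, "worker-host-1", 41414)
    flumeStream.map(e => new String(e.event.getBody.array())).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```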