Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

2014-10-21 Thread Akhil Das
Hi Buntu, You could do something similar to the following: val receiver_stream = new ReceiverInputDStream(ssc) { > override def getReceiver(): Receiver[Nothing] = ??? //Whatever > }.map((x : String) => (null, x)) > val config = new Configuration() > config.set("mongo.output.uri",

Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

2014-10-10 Thread Akhil Das
You can convert this ReceiverInputDStream into PairRDDFunctions and call the saveAsNewAP

Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

2014-10-09 Thread Buntu Dev
Basically I'm attempting to convert a JSON stream to Parquet and I get this error without the .values or .map(_._2) : value saveAsNewAPIHadoopFile is not a member of org.apache.spark.streaming.dstream.ReceiverInputDStream[(String, String)] On Thu, Oct 9, 2014 at 10:15 PM, Sean Owen wrote: >

Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

2014-10-09 Thread Sean Owen
Your RDD does not contain pairs, since you ".map(_._2)" (BTW that can just be ".values"). "Hadoop files" means "SequenceFiles" and those store key-value pairs. That's why the method only appears for RDD[(K,V)]. On Fri, Oct 10, 2014 at 3:50 AM, Buntu Dev wrote: > Thanks Sean, but I'm importing org
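
For readers following the thread, here is a minimal sketch of the point above, not code from any of the posters. It assumes a DStream[(String, String)] like the one in the error message; the object name SaveStreamSketch, the helper saveBatches, and the outputDir parameter are hypothetical, and TextOutputFormat stands in for whatever output format is actually wanted (the Parquet setup discussed in the thread is not shown). saveAsNewAPIHadoopFile is defined for pair RDDs, so the sketch reaches it through foreachRDD. Pair DStreams also expose a plural saveAsNewAPIHadoopFiles(prefix, suffix, ...) method, which may be why the singular name is reported as "not a member" of the stream.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
import org.apache.spark.SparkContext._ // pair-RDD implicits; required on Spark releases before 1.3
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.Time
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper object, for illustration only.
object SaveStreamSketch {

  // Writes every batch of a (key, value) stream to its own directory under outputDir.
  def saveBatches(stream: DStream[(String, String)], outputDir: String): Unit = {
    stream.foreachRDD { (rdd: RDD[(String, String)], time: Time) =>
      rdd
        // Keep the RDD as pairs: the save method is only available on RDD[(K, V)].
        .map { case (_, value) => (NullWritable.get(), new Text(value)) }
        .saveAsNewAPIHadoopFile(
          s"$outputDir/batch-${time.milliseconds}", // one directory per batch (assumed layout)
          classOf[NullWritable],
          classOf[Text],
          classOf[TextOutputFormat[NullWritable, Text]],
          new Configuration())
    }
  }
}
```

Calling SaveStreamSketch.saveBatches(stream, "/tmp/out") before ssc.start() would register the output operation; the path here is only a placeholder.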

Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

2014-10-09 Thread Buntu Dev
Thanks Sean, but I'm importing org.apache.spark.streaming.StreamingContext._ Here are the spark imports: import org.apache.spark.streaming._ import org.apache.spark.streaming.StreamingContext._ import org.apache.spark.streaming.kafka._ import org.apache.spark.SparkConf val stream =

Re: How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile

2014-10-09 Thread Sean Owen
I think you have not imported org.apache.spark.streaming.StreamingContext._ ? This gets you the implicits that provide these methods. On Thu, Oct 9, 2014 at 8:40 PM, bdev wrote: > I'm using KafkaUtils.createStream for the input stream to pull messages from > kafka which seems to return a Receiver