Sean

DStream.saveAsTextFiles internally calls foreachRDD and saveAsTextFile for
each interval:

def saveAsTextFiles(prefix: String, suffix: String = "") {
  val saveFunc = (rdd: RDD[T], time: Time) => {
    val file = rddToFileName(prefix, suffix, time)
    rdd.saveAsTextFile(file)
  }
  this.foreachRDD(saveFunc)
}
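[Editor's note: a minimal usage sketch, not from the thread; the socket source,
output path, and 10-second batch interval are assumptions. Each batch interval
produces its own output directory named prefix-<timeInMillis>.suffix.]

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("SaveAsTextFilesSketch")
val ssc = new StreamingContext(conf, Seconds(10))

// Hypothetical source; any DStream[String] behaves the same way.
val lines = ssc.socketTextStream("localhost", 9999)

// Writes one directory per batch, e.g. hdfs:///data/out-1427022540000.txt
lines.saveAsTextFiles("hdfs:///data/out", "txt")

ssc.start()
ssc.awaitTermination()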
Regards
Deenar
P.S. The mail archive on nabble does not seem to show all responses.
-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: 22 March 2015 11:49
To: Deenar Toraskar
Cc: user@spark.apache.org
Subject: Re: converting DStream[String] into RDD[String] in spark streaming
On Sun, Mar 22, 2015 at 8:43 AM, deenar.toraskar wrote:
> 1) if there are no sliding window calls in this streaming context, will
> there be just one file written per interval?
As many files as there are partitions will be written in each interval.
> 2) if there is a sliding window call in the same
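[Editor's note: a sketch of the point above about partitions, not from the
original message; it reuses the hypothetical lines DStream from the earlier
sketch. Each partition of the batch RDD becomes one part-file, so coalescing
before the save cuts the output to one file per interval, at the cost of
write parallelism.]

lines.foreachRDD { (rdd, time) =>
  // coalesce(1) => a single part-file in each batch's output directory
  rdd.coalesce(1).saveAsTextFile(s"hdfs:///data/out-${time.milliseconds}")
}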
Thanks Dear, it is good to save this data to HDFS and then load it back into
an RDD :)
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/converting-DStream-String-into-RDD-String-in-spark-streaming-tp20253p20258.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
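[Editor's note: loading the saved batches back is a glob read; a sketch,
assuming an existing SparkContext sc and the hypothetical output prefix and
suffix used in the sketches above.]

// Every batch directory matching the glob is read into one RDD[String].
val reloaded = sc.textFile("hdfs:///data/out-*.txt")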
DStream.foreachRDD gives you an RDD[String] for each interval of
course. I don't think it makes sense to say a DStream can be converted
into one RDD since it is a stream. The past elements are inherently
not supposed to stick around for a long time, and future elements
aren't known. You may consider saving each RDD to HDFS as it arrives, and
then reading the saved files back as one RDD later.
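[Editor's note: a sketch of the per-interval pattern described here, again
using the hypothetical lines DStream from above. Each batch arrives as its own
RDD[String] inside foreachRDD; that per-batch RDD is the unit you work with,
rather than one RDD for the whole stream.]

lines.foreachRDD { rdd =>
  // rdd holds only the elements received in this batch interval
  println(s"Got ${rdd.count()} lines in this interval")
}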