Hello Subacini,
Until someone more knowledgeable suggests a better and more straightforward
approach with a working code snippet, I suggest the following
workaround / hack:
inputStream.foreachRDD { rdd =>
  val myStr = rdd.toDebugString
  // process the myStr string value; it contains the input file path(s)
}
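The idea above can be sketched in plain Scala: the lineage text returned by `rdd.toDebugString` mentions the HDFS path of each input file, so a regex can pull it back out. The sample debug string below is hypothetical (in the real job it would come from `rdd.toDebugString` inside `foreachRDD`), and the path and regex are assumptions based on the filename pattern from the question:

```scala
// Hypothetical sample of what rdd.toDebugString might return for a
// text file RDD; the real string comes from Spark's lineage output.
val debugStr =
  "(2) hdfs://namenode/data/ABC_1421893256000.txt MapPartitionsRDD[1]\n" +
  " |  hdfs://namenode/data/ABC_1421893256000.txt HadoopRDD[0]"

// Pull out every HDFS path ending in .txt; toSet de-duplicates the
// repeated mentions across lineage stages.
val pathPattern = """hdfs://\S+\.txt""".r
val paths = pathPattern.findAllIn(debugStr).toSet
println(paths) // Set(hdfs://namenode/data/ABC_1421893256000.txt)
```

This is only a string-parsing workaround, so it is fragile against changes in the debug-string format across Spark versions.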
Thank you Emre. This helps; I am able to get the filename.
But I am not sure how to fit this into the DStream RDD.
val inputStream = ssc.textFileStream("/hdfs Path/")
inputStream is a DStream of RDDs, and in foreachRDD I am doing my processing:
inputStream.foreachRDD(rdd => {
  // how to get the filename here??
})
Hello,
Did you check the following?
http://themodernlife.github.io/scala/spark/hadoop/hdfs/2014/09/28/spark-input-filename/
http://apache-spark-user-list.1001560.n3.nabble.com/access-hdfs-file-name-in-map-td6551.html
--
Emre Sevinç
On Fri, Feb 6, 2015 at 2:16 AM, Subacini B wrote:
Hi All,
We have filenames with a timestamp, say ABC_1421893256000.txt, and the
timestamp needs to be extracted from the file name for further processing. Is
there a way to get the input file name picked up by a Spark Streaming job?
Thanks in advance
Subacini
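Once the filename is available, the timestamp extraction the question describes can be sketched in plain Scala. The helper below is hypothetical and assumes the ABC_<epoch-millis>.txt pattern from the question:

```scala
// Hypothetical helper: extract the epoch-millisecond timestamp embedded
// in a filename like ABC_1421893256000.txt. Returns None when the
// filename does not match the expected pattern.
def timestampFromName(fileName: String): Option[Long] =
  """_(\d+)\.txt$""".r.findFirstMatchIn(fileName).map(_.group(1).toLong)

val ts = timestampFromName("ABC_1421893256000.txt")
println(ts) // Some(1421893256000)
```

Using an `Option[Long]` keeps malformed filenames from throwing inside the streaming job; callers can filter out `None` results.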