Yes, that looks like a solution, but it is quite tricky: you have to parse the debug string to get the file name, and it also relies on HadoopRDD to expose the file name :)
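To make the "parse the debug string" step concrete, here is a minimal sketch in plain Scala (no Spark dependency). It assumes that each lineage line for a file-backed RDD looks like `<path> NewHadoopRDD[<id>] at ...` or `<path> HadoopRDD[<id>] at ...`; that layout is an assumption, not a stable API, and may differ between Spark versions. The object name `DebugStringFileNames`, the helper `fileNames`, and the sample string are all made up for illustration.

```scala
object DebugStringFileNames {
  // Assumption: the path is a whitespace-free token containing "/" that sits
  // directly before "HadoopRDD[<id>]" or "NewHadoopRDD[<id>]" on a lineage line.
  // Paths containing spaces would not be matched by this pattern.
  private val HadoopLine = """([^\s|]*/[^\s|]*)\s+(?:New)?HadoopRDD\[\d+\]""".r

  /** Extracts the distinct path-like tokens found in a toDebugString-style text. */
  def fileNames(debugString: String): List[String] =
    HadoopLine.findAllMatchIn(debugString).map(_.group(1)).toList.distinct

  def main(args: Array[String]): Unit = {
    // Hypothetical sample text, not captured from a real job.
    val sample =
      "(2) UnionRDD[3] at foreach at Demo.scala:20 []\n" +
      " |  file:/home/akhld/input/part-00000 NewHadoopRDD[1] at fileStream at Demo.scala:15 []\n" +
      " |  file:/home/akhld/input/part-00001 NewHadoopRDD[2] at fileStream at Demo.scala:15 []"

    fileNames(sample).foreach(println)
  }
}
```

In a streaming job the input would be the string returned by `toDebugString` on each batch RDD instead of the sample. Because this depends on the printed lineage format, it is fragile, which is the trickiness mentioned above.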
2015-04-29 14:52 GMT+08:00 Akhil Das <ak...@sigmoidanalytics.com>:

> It is possible to access the filename, its a bit tricky though.
>
>     val fstream = ssc.fileStream[LongWritable, IntWritable,
>       SequenceFileInputFormat[LongWritable, IntWritable]]("/home/akhld/input/")
>
>     fstream.foreach(x =>{
>       //You can get it with this object.
>       println(x.values.toDebugString)
>
>     } )
>
> [image: Inline image 1]
>
> Thanks
> Best Regards
>
> On Wed, Apr 29, 2015 at 8:33 AM, bit1...@163.com <bit1...@163.com> wrote:
>
>> For the SparkContext#textFile, if a directory is given as the path
>> parameter ,then it will pick up the files in the directory, so the same
>> thing will occur.
>>
>> ------------------------------
>> bit1...@163.com
>>
>>
>> *From:* Saisai Shao <sai.sai.s...@gmail.com>
>> *Date:* 2015-04-29 10:54
>> *To:* Vadim Bichutskiy <vadim.bichuts...@gmail.com>
>> *CC:* bit1...@163.com; lokeshkumar <lok...@dataken.net>; user
>> <user@spark.apache.org>
>> *Subject:* Re: Re: Spark streaming - textFileStream/fileStream - Get
>> file name
>>
>> I think it might be useful in Spark Streaming's file input stream, but
>> not sure is it useful in SparkContext#textFile, since we specify the file
>> by our own, so why we still need to know the file name.
>>
>> I will open up a JIRA to mention about this feature.
>>
>> Thanks
>> Jerry
>>
>>
>> 2015-04-29 10:49 GMT+08:00 Vadim Bichutskiy <vadim.bichuts...@gmail.com>:
>>
>>> I was wondering about the same thing.
>>>
>>> Vadim
>>>
>>> On Tue, Apr 28, 2015 at 10:19 PM, bit1...@163.com <bit1...@163.com>
>>> wrote:
>>>
>>>> Looks to me that the same thing also applies to the
>>>> SparkContext.textFile or SparkContext.wholeTextFile, there is no way in RDD
>>>> to figure out the file information where the data in RDD is from
>>>>
>>>> ------------------------------
>>>> bit1...@163.com
>>>>
>>>>
>>>> *From:* Saisai Shao <sai.sai.s...@gmail.com>
>>>> *Date:* 2015-04-29 10:10
>>>> *To:* lokeshkumar <lok...@dataken.net>
>>>> *CC:* spark users <user@spark.apache.org>
>>>> *Subject:* Re: Spark streaming - textFileStream/fileStream - Get file
>>>> name
>>>>
>>>> I think currently there's no API in Spark Streaming you can use to get
>>>> the file names for file input streams. Actually it is not trivial to
>>>> support this, may be you could file a JIRA with wishes you want the
>>>> community to support, so anyone who is interested can take a crack on this.
>>>>
>>>> Thanks
>>>> Jerry
>>>>
>>>>
>>>> 2015-04-29 0:13 GMT+08:00 lokeshkumar <lok...@dataken.net>:
>>>>
>>>>> Hi Forum,
>>>>>
>>>>> Using spark streaming and listening to the files in HDFS using
>>>>> textFileStream/fileStream methods, how do we get the fileNames which
>>>>> are read by these methods?
>>>>>
>>>>> I used textFileStream which has file contents in JavaDStream and I got
>>>>> no success with fileStream as it is throwing me a compilation error with
>>>>> spark version 1.3.1.
>>>>>
>>>>> Can someone please tell me if we have an API function or any other way
>>>>> to get the file names that these streaming methods read?
>>>>>
>>>>> Thanks
>>>>> Lokesh
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-textFileStream-fileStream-Get-file-name-tp22692.html
>>>>> Sent from the Apache Spark User List mailing list archive at
>>>>> Nabble.com.