I think it might be useful in Spark Streaming's file input stream, but I'm not sure it is useful in SparkContext#textFile: since we specify the file ourselves, why would we still need to know the file name?
I will open a JIRA to track this feature request.

Thanks
Jerry

2015-04-29 10:49 GMT+08:00 Vadim Bichutskiy <vadim.bichuts...@gmail.com>:

> I was wondering about the same thing.
>
> Vadim
>
> On Tue, Apr 28, 2015 at 10:19 PM, bit1...@163.com <bit1...@163.com> wrote:
>
>> Looks to me that the same thing also applies to the
>> SparkContext.textFile or SparkContext.wholeTextFile, there is no way in RDD
>> to figure out the file information where the data in RDD is from
>>
>> ------------------------------
>> bit1...@163.com
>>
>>
>> *From:* Saisai Shao <sai.sai.s...@gmail.com>
>> *Date:* 2015-04-29 10:10
>> *To:* lokeshkumar <lok...@dataken.net>
>> *CC:* spark users <user@spark.apache.org>
>> *Subject:* Re: Spark streaming - textFileStream/fileStream - Get file name
>> I think currently there's no API in Spark Streaming you can use to get
>> the file names for file input streams. Actually it is not trivial to
>> support this, may be you could file a JIRA with wishes you want the
>> community to support, so anyone who is interested can take a crack on this.
>>
>> Thanks
>> Jerry
>>
>>
>> 2015-04-29 0:13 GMT+08:00 lokeshkumar <lok...@dataken.net>:
>>
>>> Hi Forum,
>>>
>>> Using spark streaming and listening to the files in HDFS using
>>> textFileStream/fileStream methods, how do we get the fileNames which are
>>> read by these methods?
>>>
>>> I used textFileStream which has file contents in JavaDStream and I got no
>>> success with fileStream as it is throwing me a compilation error with
>>> spark version 1.3.1.
>>>
>>> Can someone please tell me if we have an API function or any other way to
>>> get the file names that these streaming methods read?
>>>
>>> Thanks
>>> Lokesh
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-textFileStream-fileStream-Get-file-name-tp22692.html