I think it might be useful in Spark Streaming's file input stream, but I'm
not sure it is useful in SparkContext#textFile: since we specify the files
ourselves, why would we still need to know the file name?

I will open a JIRA to propose this feature.

Thanks
Jerry


2015-04-29 10:49 GMT+08:00 Vadim Bichutskiy <vadim.bichuts...@gmail.com>:

> I was wondering about the same thing.
>
> Vadim
>
> On Tue, Apr 28, 2015 at 10:19 PM, bit1...@163.com <bit1...@163.com> wrote:
>
>> It looks to me that the same thing also applies to
>> SparkContext.textFile or SparkContext.wholeTextFiles: there is no way in an
>> RDD to figure out which file the data in the RDD came from.
>>
>> ------------------------------
>> bit1...@163.com
>>
>>
>> *From:* Saisai Shao <sai.sai.s...@gmail.com>
>> *Date:* 2015-04-29 10:10
>> *To:* lokeshkumar <lok...@dataken.net>
>> *CC:* spark users <user@spark.apache.org>
>> *Subject:* Re: Spark streaming - textFileStream/fileStream - Get file
>> name
>> I think there is currently no API in Spark Streaming you can use to get
>> the file names for file input streams. It is actually not trivial to
>> support this. Maybe you could file a JIRA describing what you would like the
>> community to support, so anyone who is interested can take a crack at it.
>>
>> Thanks
>> Jerry
>>
>>
>> 2015-04-29 0:13 GMT+08:00 lokeshkumar <lok...@dataken.net>:
>>
>>> Hi Forum,
>>>
>>> Using Spark Streaming to watch files in HDFS with the
>>> textFileStream/fileStream methods, how do we get the names of the files
>>> that these methods read?
>>>
>>> I used textFileStream, which gives me the file contents in a JavaDStream,
>>> but I had no success with fileStream because it gives me a compilation
>>> error with Spark version 1.3.1.
>>>
>>> Can someone please tell me if we have an API function or any other way to
>>> get the file names that these streaming methods read?
>>>
>>> Thanks
>>> Lokesh
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-textFileStream-fileStream-Get-file-name-tp22692.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>>
>>>
>>
>
