For SparkContext#textFile, if a directory is given as the path parameter, it
will pick up the files in that directory, so the same thing will occur.



bit1...@163.com
 
From: Saisai Shao
Date: 2015-04-29 10:54
To: Vadim Bichutskiy
CC: bit1...@163.com; lokeshkumar; user
Subject: Re: Re: Spark streaming - textFileStream/fileStream - Get file name
I think it might be useful for Spark Streaming's file input stream, but I'm not
sure it is useful for SparkContext#textFile: since we specify the file
ourselves, why would we still need to know the file name?
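
When the files are listed explicitly, the name is already known on the driver
and can be attached to each line. A rough PySpark sketch of that idea follows.
It assumes `sc` is an existing SparkContext, and `tagger` and `read_tagged` are
hypothetical helper names, not part of Spark.

```python
def tagger(path):
    """Return a function that pairs a line with the file it came from."""
    def tag(line):
        return (path, line)
    return tag


def read_tagged(sc, paths):
    """Build one RDD of (path, line) pairs from an explicit list of files.

    Assumption: `sc` is an existing SparkContext and `paths` is non-empty.
    """
    # One RDD per file, each line tagged with the path we passed in ourselves.
    rdds = [sc.textFile(p).map(tagger(p)) for p in paths]
    return sc.union(rdds)
```

This only helps when the driver lists the files itself; it does not cover the
case where a whole directory is passed as the path.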

I will open a JIRA for this feature.

Thanks
Jerry


2015-04-29 10:49 GMT+08:00 Vadim Bichutskiy <vadim.bichuts...@gmail.com>:
I was wondering about the same thing.

Vadim

On Tue, Apr 28, 2015 at 10:19 PM, bit1...@163.com <bit1...@163.com> wrote:
Looks to me that the same thing also applies to SparkContext.textFile: there is
no way in the RDD to figure out which file the data came from.
(SparkContext.wholeTextFiles does return the file path as the key of each
record, but it reads each file as a single record.)
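
For batch jobs, one possible workaround is `SparkContext.wholeTextFiles`, which
returns (path, content) pairs rather than bare lines. A minimal PySpark sketch
follows. It assumes `sc` is an existing SparkContext; `split_with_path` and
`lines_with_path` are hypothetical helper names. Because each file is read as a
single record, this is only reasonable for small files.

```python
def split_with_path(path_and_content):
    """Turn one (path, content) pair into a list of (path, line) pairs."""
    path, content = path_and_content
    return [(path, line) for line in content.splitlines()]


def lines_with_path(sc, directory):
    """Read every file under `directory`, keeping the source path per line.

    Assumption: `sc` is an existing SparkContext. wholeTextFiles yields one
    (path, content) record per file, so each file must fit in memory.
    """
    return sc.wholeTextFiles(directory).flatMap(split_with_path)
```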



bit1...@163.com
 
From: Saisai Shao
Date: 2015-04-29 10:10
To: lokeshkumar
CC: spark users
Subject: Re: Spark streaming - textFileStream/fileStream - Get file name
I think there is currently no API in Spark Streaming for getting the file names
of file input streams. It is actually not trivial to support this. Maybe you
could file a JIRA describing what you would like the community to support, so
that anyone who is interested can take a crack at it.

Thanks
Jerry


2015-04-29 0:13 GMT+08:00 lokeshkumar <lok...@dataken.net>:
Hi Forum,

When using Spark Streaming to monitor files in HDFS with the
textFileStream/fileStream methods, how do we get the names of the files that
are read by these methods?

I used textFileStream, which returns the file contents in a JavaDStream, and I
had no success with fileStream because it gives me a compilation error with
Spark version 1.3.1.

Can someone please tell me if we have an API function or any other way to
get the file names that these streaming methods read?

Thanks
Lokesh



