Re: how to monitor multi directories in spark streaming task

2015-05-13 Thread Ankur Chauhan
Hi, You could retroactively union an existing DStream with one from a newly created file. Then when another file is "detected", you would need to re-union the stream and create another DStream. It seems like the implementation of FileInputDStream only
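
A minimal sketch of that pairwise approach in Scala, assuming the directory paths from this thread and a placeholder 60-second batch interval; note that input DStreams generally have to be set up before the StreamingContext is started, which is what makes re-unioning on the fly awkward:

  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val ssc = new StreamingContext(new SparkConf().setAppName("union-two-dirs"), Seconds(60))

  // One fileStream per directory; each emits (LongWritable, Text) records for newly detected files.
  val a = ssc.fileStream[LongWritable, Text, TextInputFormat]("/user/root/2015/05/11")
  val b = ssc.fileStream[LongWritable, Text, TextInputFormat]("/user/root/2015/05/12")

  // Pairwise DStream#union, then keep just the text lines.
  val both = a.union(b).map(_._2.toString)
  both.print()

  ssc.start()   // input streams cannot be added after this point
  ssc.awaitTermination()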

Re: how to monitor multi directories in spark streaming task

2015-05-13 Thread lisendong
but in fact the directories are not ready at the beginning of my task. for example: /user/root/2015/05/11/data.txt /user/root/2015/05/12/data.txt /user/root/2015/05/13/data.txt like this, and one new directory per day. how to create the new DStream for tomorrow’s new directory (/user/root/20

Re: how to monitor multi directories in spark streaming task

2015-05-13 Thread Ankur Chauhan
I would suggest creating one DStream per directory and then using StreamingContext#union(...) to get a union DStream. -- Ankur On 13/05/2015 00:53, hotdog wrote: > I want to use fileStream in spark streaming to monitor multi > hdfs directories,
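
A minimal sketch of that suggestion in Scala, assuming the list of directories is known when the job starts (the app name and batch interval are placeholders):

  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val ssc = new StreamingContext(new SparkConf().setAppName("multi-dir-monitor"), Seconds(60))

  // One fileStream per monitored HDFS directory.
  val dirs = Seq("/user/root/2015/05/11", "/user/root/2015/05/12", "/user/root/2015/05/13")
  val perDir = dirs.map { d =>
    ssc.fileStream[LongWritable, Text, TextInputFormat](d).map(_._2.toString)
  }

  // Combine them into a single DStream with StreamingContext#union.
  val all = ssc.union(perDir)
  all.print()

  ssc.start()
  ssc.awaitTermination()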

how to monitor multi directories in spark streaming task

2015-05-13 Thread hotdog
ing of the three classes: LongWritable, Text, TextInputFormat, but it doesn't work... -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/how-to-monitor-multi-directories-in-spark-streaming-task-tp22863.html Sent from the Apache Spark User List mailing
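
For reference, a minimal fileStream call with those three types in Scala (the directory path is taken from this thread; the app name and batch interval are placeholders):

  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val ssc = new StreamingContext(new SparkConf().setAppName("file-stream-example"), Seconds(60))

  // Type parameters are key, value, and InputFormat; the stream yields (LongWritable, Text) pairs for new files.
  val lines = ssc
    .fileStream[LongWritable, Text, TextInputFormat]("/user/root/2015/05/13")
    .map(_._2.toString)
  lines.print()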