Hi Renyi,

This is the intended behavior of the streaming HdfsWordCount example.  It
uses 'textFileStream', which monitors an HDFS directory for newly created
files and pushes their contents into a DStream, so a file that was already
in the directory when the job started is not picked up.  The job is meant
to run indefinitely until it is interrupted, for example with Ctrl-C.
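
If you do want the job to exit once the directory goes quiet, one option is
to count empty batches on the driver and stop the context yourself.  Below
is a rough PySpark sketch of that idea, not something the shipped example
does.  The names IdleTracker, run, max_idle_batches and the app name are
made up for illustration, and the thresholds are arbitrary.

```python
import threading
import time


class IdleTracker:
    """Counts consecutive empty batches (hypothetical helper, not part of Spark)."""

    def __init__(self, max_idle_batches):
        self.max_idle_batches = max_idle_batches
        self._idle = 0
        # The foreachRDD callback runs on a different thread than the polling loop.
        self._lock = threading.Lock()

    def record(self, batch_is_empty):
        # Any non-empty batch resets the counter.
        with self._lock:
            self._idle = self._idle + 1 if batch_is_empty else 0

    def should_stop(self):
        with self._lock:
            return self._idle >= self.max_idle_batches


def run(directory, batch_seconds=2, max_idle_batches=5):
    # Imported here so IdleTracker can be exercised without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="HdfsWordCountWithIdleStop")  # assumed app name
    ssc = StreamingContext(sc, batch_seconds)
    tracker = IdleTracker(max_idle_batches)

    # Only files created after the job starts show up in this stream.
    lines = ssc.textFileStream(directory)
    lines.foreachRDD(lambda rdd: tracker.record(rdd.isEmpty()))

    counts = (lines.flatMap(lambda line: line.split(" "))
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.pprint()

    ssc.start()
    # Poll from the main thread instead of stopping inside the callback.
    while not tracker.should_stop():
        time.sleep(batch_seconds)
    # Positional args: stop the SparkContext too, and stop gracefully.
    ssc.stop(True, True)
```

Calling run("localdir") would then return after five consecutive empty
batches.  Note that the count starts as soon as the job does, so if no new
file arrives in the first few batches it will exit without processing
anything.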

-bryan
On Nov 13, 2015 10:52 AM, "Renyi Xiong" <renyixio...@gmail.com> wrote:

> Hi,
>
> I'm trying to run the following 1.4.1 sample after putting a words.txt
> under localdir
>
> bin\run-example org.apache.spark.examples.streaming.HdfsWordCount localdir
>
> 2 questions
>
> 1. It does not pick up words.txt, because it's 'old' I guess - is there
> any option to get it picked up?
> 2. I managed to put a 'new' file in on the fly, which got picked up, but
> after processing it the program doesn't stop (it keeps generating empty
> RDDs instead). Is there any option to make it stop when no new files come
> in? (Otherwise it blocks the others when I want to run multiple samples.)
>
> Thanks,
> Renyi.
>
