I am using Spark Streaming to process text files. I have a feeder program that
moves files into the directory that Spark Streaming monitors.
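
A minimal sketch of the kind of move the feeder performs is below. It is not
the actual feeder code: the class name FeederMove and the method
moveIntoStreamingDir are hypothetical, and it assumes the source file and the
streaming directory are on the same filesystem so that the rename can be
atomic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class FeederMove {
    // Hypothetical helper: moves a finished file into the streaming directory
    // with a single atomic rename, so a reader never sees a partial file.
    // Assumption: source and streamingDir are on the same filesystem;
    // otherwise Files.move throws AtomicMoveNotSupportedException.
    public static Path moveIntoStreamingDir(Path source, Path streamingDir)
            throws IOException {
        Path target = streamingDir.resolve(source.getFileName());
        return Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

It would be called as FeederMove.moveIntoStreamingDir(file, Paths.get("/mnt/streamingDir")).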

If the feeder puts another file into the streaming directory while Spark is
still processing a particular file, Spark sometimes does not pick up the new
file for processing.

The streaming directory is on an NFS-mounted shared drive so that the Spark
slaves running on Mesos can also access it.

Below is the simplified code:

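  import org.apache.spark.SparkConf;
  import org.apache.spark.streaming.Duration;
  import org.apache.spark.streaming.api.java.JavaDStream;
  import org.apache.spark.streaming.api.java.JavaStreamingContext;

  final SparkConf sparkConf = new SparkConf();

  // 2-second batch interval
  final JavaStreamingContext javaStreamingContext =
      new JavaStreamingContext(sparkConf, new Duration(2000));

  // Monitor the NFS-mounted streaming directory for new text files
  final JavaDStream<String> dstreamRdd =
      javaStreamingContext.textFileStream("/mnt/streamingDir");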

The DStream is processed further later in the job.

Any idea why Spark Streaming misses files in the streaming directory once
they have been moved there by the feeder?
