fileStream is not designed to work with a continuously updating file. One of
the main design goals of Spark is immutability (to guarantee fault tolerance
by recomputation), and a file that is being appended to (mutated) defeats
that. Rather, fileStream is designed to pick up new files that are added
atomically to the monitored directory (using a move or rename, so each file
appears complete in a single operation).
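To illustrate that pattern, here is a minimal, self-contained sketch (not from the original thread): it monitors a directory for files that are moved in atomically and prints a line count per batch. The directory path /data/incoming, the class/app name, and the 10-second batch interval are illustrative assumptions.

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class DirectoryWatch {
  public static void main(String[] args) throws Exception {
    SparkConf conf = new SparkConf().setAppName("DirectoryWatch").setMaster("local[2]");
    // 10-second batch interval (illustrative)
    JavaStreamingContext jssc = new JavaStreamingContext(conf, new Duration(10000));

    // Each batch picks up only files that were atomically moved/renamed into
    // this directory since the last batch; appends to existing files are not seen.
    JavaDStream<String> lines = jssc.textFileStream("/data/incoming");
    lines.count().print();

    jssc.start();
    jssc.awaitTermination();
  }
}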
Hi Akil,
It didn't work. Here is the code...
package com.paypal;
import org.apache.spark.SparkConf;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.sp...
Try this out:
JavaStreamingContext ctx = new JavaStreamingContext(...);

// textFileStream returns a JavaDStream<String> of the lines of new files
// that appear in the monitored directory
JavaDStream<String> lines = ctx.textFileStream("whatever");

JavaDStream<String> words = lines.flatMap(
    new FlatMapFunction<String, String>() {
      public Iterable<String> call(String s) {
        return Arrays.asList(s.split(" "));
      }
    });

JavaPairDStream<String, Integer> ones = words
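That last line is cut off in the archive. For completeness, here is a hedged sketch of how the standard streaming word count usually continues; the mapToPair/reduceByKey step below follows the stock Spark example rather than being quoted from the original reply.

// Hypothetical continuation following the standard word-count pattern.
// Additional imports needed: scala.Tuple2,
// org.apache.spark.api.java.function.PairFunction,
// org.apache.spark.api.java.function.Function2,
// org.apache.spark.streaming.api.java.JavaPairDStream
JavaPairDStream<String, Integer> ones = words.mapToPair(
    new PairFunction<String, String, Integer>() {
      public Tuple2<String, Integer> call(String s) {
        return new Tuple2<String, Integer>(s, 1);
      }
    });

JavaPairDStream<String, Integer> counts = ones.reduceByKey(
    new Function2<Integer, Integer, Integer>() {
      public Integer call(Integer a, Integer b) {
        return a + b;
      }
    });

counts.print();
ctx.start();
ctx.awaitTermination();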