Tathagata, thank you for the response.
I have two receivers in my Spark Streaming job: one reads an endless stream of
data from Flume and the other reads data from an HDFS directory. However,
files are not moved into HDFS frequently (let's say they are moved every
10 minutes). This is where I need to c
As TD mentions, there's no such thing as an 'empty DStream'. Some intervals
of a DStream could be empty, in which case the related RDD will be empty.
This means that you should express such a condition in terms of the RDDs of
the DStream. Translated into code:
dstream.foreachRDD{ rdd =>
if (!rdd.isEmpt
What do you mean by checking when a "DStream is empty"? A DStream represents
an endless stream of data, so checking whether it is empty at a given point
in time does not make sense.
FYI, there is RDD.isEmpty()
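A minimal sketch of the per-batch check, for illustration only. To keep it runnable without a Spark dependency, the stream is modeled as a plain list of batches, one list of records per interval; the class name BatchEmptyCheck and the method processNonEmpty are hypothetical and not part of Spark. In an actual job the same test would be made with rdd.isEmpty() on each RDD inside foreachRDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a DStream: one list of records per batch interval.
class BatchEmptyCheck {

    // Runs the "action" only for intervals that actually contain records and
    // returns the indexes of those intervals.
    static List<Integer> processNonEmpty(List<List<String>> batches) {
        List<Integer> processed = new ArrayList<>();
        for (int i = 0; i < batches.size(); i++) {
            List<String> batch = batches.get(i);
            // The check is made per batch, the analogue of rdd.isEmpty().
            if (!batch.isEmpty()) {
                processed.add(i);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Three intervals; only the second one received data.
        List<List<String>> batches = Arrays.asList(
            Collections.<String>emptyList(),
            Arrays.asList("line from a new file"),
            Collections.<String>emptyList());
        System.out.println(processNonEmpty(batches)); // prints [1]
    }
}
```

The stream object itself always exists, so a check on it says nothing about whether data arrived; only the individual batches can be empty or non-empty.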
On Wed, Oct 21, 2015 at 10:03 AM, diplomatic Guru
wrote:
> I tried below code but sti
I tried the code below, but it still carries out the action even though there
is no new data.
JavaPairInputDStream<LongWritable, Text> input =
    ssc.fileStream(iFolder, LongWritable.class, Text.class,
                   TextInputFormat.class);
if (input != null) {
    // do some action if it is not empty
}
On 21 October 2015 at 18:00, diplomatic Gu