Marking files as read in Spark Streaming

2016-07-11 Thread soumick dasgupta
Hi, I am looking for a way in Spark Streaming to mark the files in HDFS that I have already read. The goal is to avoid reading the same file twice by mistake and to confirm that every record in a given file has been read. Thank you, Soumick
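
One possible way to approach the question above is a marker-file convention: after a file has been read to the end, write a small companion file next to it and skip any file that already has one. The sketch below is only an illustration of that idea, not a Spark Streaming feature. It runs against a local directory rather than HDFS, and the `.done` suffix and the function names (`unread_files`, `process_file`, `run_batch`) are hypothetical choices made for this example. On HDFS the same pattern would presumably go through the Hadoop FileSystem API instead of `pathlib`.

```python
from pathlib import Path

DONE_SUFFIX = ".done"  # hypothetical marker convention, not a Spark or HDFS setting


def unread_files(directory):
    """Return data files that do not yet have a marker file beside them."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file()
        and not p.name.endswith(DONE_SUFFIX)
        and not p.with_name(p.name + DONE_SUFFIX).exists()
    )


def process_file(path, handle_record):
    """Read every record, then write the marker only after a complete read."""
    count = 0
    with path.open() as f:
        for line in f:
            handle_record(line.rstrip("\n"))
            count += 1
    # The marker is written last, so a failure mid-file leaves the file unmarked
    # and it will be picked up again on the next run.
    marker = path.with_name(path.name + DONE_SUFFIX)
    marker.write_text(str(count))
    return count


def run_batch(directory, handle_record):
    """Process each unread file once; return {file name: record count}."""
    return {p.name: process_file(p, handle_record) for p in unread_files(directory)}
```

Storing the record count in the marker gives a simple way to check later how many records were read from each file. Note that a file which fails partway through is reprocessed from the start, so records from it may be handled more than once unless the downstream write is idempotent.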

Spark Streaming stateful operation to HBase

2016-06-08 Thread soumick dasgupta
RDD is emitting the result from the previous state; it should not print/write that value. Thank you, Soumick Dasgupta