Dibyendu,

Thanks for getting back.

I believe you are absolutely right. We were under the assumption that the
raw data was being recomputed, but after further tests that turns out not to
be happening. This applies to Kafka as well.

Unfortunately, the issue is of major priority for us.

Regarding your suggestion, I would prefer to have the problem resolved
within Spark's internals: once the data is replicated we should be able to
access it again without having to pull it back from Kafka, or from any other
stream affected by this issue. If, for example, there is a large number of
batches to be recomputed, I would rather have them recomputed in a
distributed way than overload the batch interval with a huge amount of Kafka
messages. A rough sketch of the setup I have in mind follows below.
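
For illustration, this is roughly the kind of receiver-based setup I am
assuming, with a replicated storage level; the ZooKeeper address, consumer
group and topic below are just placeholders, not values from our deployment:

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ReplicatedKafkaStream {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ReplicatedKafkaStream")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Replicated storage level (_2): the expectation is that once the
    // received blocks are replicated, Spark can recover them from the
    // surviving replica instead of pulling the data back from Kafka.
    val stream = KafkaUtils.createStream(
      ssc,
      "zkhost:2181",                // ZooKeeper quorum (placeholder)
      "example-consumer-group",     // consumer group id (placeholder)
      Map("example-topic" -> 1),    // topic -> number of receiver threads
      StorageLevel.MEMORY_AND_DISK_SER_2)

    // Simple downstream computation on the message values.
    stream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}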

I don't yet have enough knowledge of where the issue lies, or of the
internal Spark code, so I can't really tell how difficult the implementation
would be.

Thanks,
Rod



