I'm writing a Spark Streaming application that uses RabbitMQ to consume
events. One RabbitMQ feature I intend to use is bulk acknowledgement of
messages, i.e. instead of acking messages one by one, acking only the last
event in a batch acknowledges the entire batch.
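To make the semantics concrete, here is a minimal in-memory sketch of RabbitMQ's cumulative ack behavior (what `basic_ack` with `multiple=True` does over delivery tags). The `bulk_ack` helper and the tag sets are illustrative, not part of any real client API:

```python
def bulk_ack(outstanding, last_tag):
    """Model cumulative ack: acking delivery tag N acknowledges every
    outstanding delivery with tag <= N. Returns the tags still unacked."""
    return {tag for tag in outstanding if tag > last_tag}

outstanding = {1, 2, 3, 4, 5}
remaining = bulk_ack(outstanding, 3)  # acks tags 1, 2 and 3
# remaining == {4, 5}
```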

Before I commit to doing so, I'd like to know whether Spark Streaming
always processes RDDs in the order they arrive, i.e. if RDD1 arrives before
RDD2, is it guaranteed that RDD2 will never be scheduled/processed before
RDD1 has finished?

This is crucial to the ack logic: if RDD2 can potentially be processed
while RDD1 is still being processed, then acking the last event in RDD2
would also ack all events in RDD1, even though they may not have been
completely processed yet.
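The hazard described above can be sketched with the same cumulative-ack model; the batches, tags, and `bulk_ack` helper are hypothetical, used only to show what happens if the second batch finishes first:

```python
def bulk_ack(outstanding, last_tag):
    """Model cumulative ack: acking tag N acknowledges every outstanding
    delivery with tag <= N. Returns the tags still unacked."""
    return {tag for tag in outstanding if tag > last_tag}

batch1 = [1, 2, 3]   # still being processed
batch2 = [4, 5, 6]   # completed out of order, before batch1
outstanding = set(batch1 + batch2)

# Acking batch2's last tag also acks batch1's tags, so a failure while
# batch1 is still in flight would lose those messages.
remaining = bulk_ack(outstanding, max(batch2))
# remaining == set(): nothing left unacked, including unprocessed batch1
```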



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Are-Spark-Streaming-RDDs-always-processed-in-order-tp23616.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
