Hi Pedro,

Thanks for reaching out. A Samza app using KVStores can expect a read-your-writes guarantee because:

a. in the failure-free case, the app reads and writes the local RocksDB store, which provides that guarantee, and
b. in case of failures, Samza resumes input (and state) consumption from the last complete checkpoint, at which point everything written to the store is guaranteed to have been persisted/flushed to the Kafka changelog.

At checkpoint time, Samza simply flushes the local RocksDB store and the Kafka changelog producer, and then proceeds to checkpoint the input only if both flushes succeed.
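To make the ordering concrete, here is a minimal Python sketch of the commit sequence described above. It is not Samza's actual code: `LocalStore`, `ChangelogProducer`, `Task` and all of their methods are hypothetical stand-ins, and the in-memory dict and lists only imitate RocksDB and Kafka. The point it illustrates is that reads are served from the local store, and the input checkpoint is written only after both flushes have succeeded.

```python
class LocalStore:
    """Hypothetical stand-in for the local RocksDB-backed KV store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Reads come from the local store, never from the changelog,
        # so a put() is visible to the next get() in the same task.
        return self._data.get(key)

    def flush(self):
        # A real store would persist its in-memory writes to disk here.
        pass


class ChangelogProducer:
    """Hypothetical stand-in for the Kafka changelog producer."""

    def __init__(self, fail_on_flush=False):
        self._pending = []      # sent but not yet acknowledged
        self.durable = []       # acknowledged by the (imaginary) broker
        self.fail_on_flush = fail_on_flush

    def send(self, key, value):
        self._pending.append((key, value))

    def flush(self):
        if self.fail_on_flush:
            raise OSError("changelog flush failed")
        self.durable.extend(self._pending)
        self._pending.clear()


class Task:
    """Hypothetical task that processes input and commits periodically."""

    def __init__(self, store, changelog):
        self.store = store
        self.changelog = changelog
        self.input_offset = 0
        self.checkpointed_offset = None

    def process(self, key, value):
        self.store.put(key, value)        # local write
        self.changelog.send(key, value)   # asynchronous changelog write
        self.input_offset += 1

    def commit(self):
        # Order matters: flush state first, checkpoint input last.
        # If either flush raises, the checkpoint is never written, so a
        # restart resumes from the previous complete checkpoint.
        self.store.flush()
        self.changelog.flush()
        self.checkpointed_offset = self.input_offset
```

With this ordering, a checkpointed input offset always implies that every store write made before it is already in the changelog. A restart that restores the store from the changelog and replays input from the checkpoint therefore never misses one of its own earlier writes.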
--
thanks
rayman

On Thu, Nov 14, 2019 at 1:42 PM Pedro Silvestre <pmfsilves...@gmail.com> wrote:

> Hello all,
>
> I was reading through the Samza paper (
> http://www.vldb.org/pvldb/vol10/p1634-noghabi.pdf, very nicely written by
> the way), and in the section on fault-tolerance I noticed that the
> changelog is implemented with read-your-writes guarantees. Knowing that
> this changelog is a Kafka stream, I cannot find any information on whether
> Kafka provides read-your-writes guarantees.
>
> Intuitively, since producers and consumers are separate entities I would
> expect this guarantee to not exist: a process acting as both a producer and
> consumer, which executes a produce() followed by a poll() is not guaranteed
> to read the produced record immediately.
>
> So, how is the read-your-writes changelog implemented?
>
> Regards,
>
> Pedro Silvestre
>