Thank you Jake and Navina for your timely answers :)

Regards,
Jack

On Tue, Aug 2, 2016 at 12:18 PM, Navina Ramesh <nram...@linkedin.com.invalid> wrote:

> Hi Jack,
>
> You are right. In the situation where state *doesn't need to be recovered
> upon failure*, you could use an in-memory store instead of RocksDB.
> The only limitation there is the size of your dataset. If it exceeds
> the allocated memory, nothing will guard against that. Also, if you
> have a "TTL" or cached dataset, you may have to manually time out the
> entries.
>
> If you do use RocksDB with a changelog, the data gets persisted not only to
> the disk but also to the changelog topic. So, if the container fails and
> restarts on another host, it can recover the state from the changelog.
>
> Additionally, if you enable the host-affinity feature for your Samza
> application, a failed container may get allocated on the same host. In this
> case, it will reuse the state already persisted on disk, after making sure
> that it is caught up with the changelog stream.
>
> HTH,
> Navina
>
> On Tue, Aug 2, 2016 at 11:51 AM, Jack Huang <jackhu...@mz.com> wrote:
>
> > Hi all,
> >
> > Is there any reason to use RocksDB without associating it with a changelog
> > in Kafka? My understanding is that even though Rocks persists data to disk,
> > when a container fails the partition might be restarted on a different
> > machine, where there is no persisted data on disk. In that case, wouldn't
> > it make sense to use an in-memory store (e.g. java.util.Map) instead, since
> > Rocks effectively does not provide persistence under Samza's architecture?
> >
> > Thanks,
> > Jack
>
> --
> Navina R.
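
For anyone finding this thread later: the point above about having to manually time out entries in an in-memory store could look roughly like the sketch below. This is only an illustration under stated assumptions. TtlMap is a hypothetical helper and not part of Samza, the clock is injected so the behaviour can be checked without sleeping, and the clock is assumed never to go backwards. It does nothing about the other limitation mentioned above, the dataset outgrowing the allocated memory.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper (not a Samza API): an in-memory map with a fixed per-entry TTL. */
final class TtlMap<K, V> {
    /** A value together with the time it was written. */
    private static final class Timed<V> {
        final V value;
        final long writtenAtMs;

        Timed(V value, long writtenAtMs) {
            this.value = value;
            this.writtenAtMs = writtenAtMs;
        }
    }

    // Insertion order equals write order because put() removes before inserting.
    private final Map<K, Timed<V>> entries = new LinkedHashMap<>();
    private final long ttlMs;
    private final LongSupplier clock; // assumed monotonic, in milliseconds

    TtlMap(long ttlMs, LongSupplier clock) {
        this.ttlMs = ttlMs;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.remove(key); // move a rewritten key to the end of the ordering
        entries.put(key, new Timed<>(value, clock.getAsLong()));
    }

    /** Returns the value, or null if the key is absent or its entry has expired. */
    V get(K key) {
        Timed<V> timed = entries.get(key);
        if (timed == null) {
            return null;
        }
        if (isExpired(timed)) {
            entries.remove(key);
            return null;
        }
        return timed.value;
    }

    /** Drops expired entries and returns how many were removed; call it periodically. */
    int evictExpired() {
        int removed = 0;
        Iterator<Timed<V>> it = entries.values().iterator();
        while (it.hasNext()) {
            if (!isExpired(it.next())) {
                break; // entries are ordered oldest first, so the rest are still live
            }
            it.remove();
            removed++;
        }
        return removed;
    }

    int size() {
        return entries.size();
    }

    private boolean isExpired(Timed<V> timed) {
        return clock.getAsLong() - timed.writtenAtMs >= ttlMs;
    }
}
```

In a real job you would probably pass System::currentTimeMillis as the clock and call evictExpired() from whatever periodic hook the task already has. Since none of this state survives a container restart, it only suits the case described above where the state does not need to be recovered after a failure.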