Thanks, Yi Pan. I have one more question: does the KV-store consume from its Kafka topic automatically, or only on restore()? If it is only on restore(), do I have to implement a StreamTask that consumes the Kafka topic and calls the add() method myself?
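For concreteness, this is roughly what I have in mind: a minimal sketch, assuming the ids arrive as message keys and the store is registered in the job config under the placeholder name "unique-id-store".

    import org.apache.samza.config.Config;
    import org.apache.samza.storage.kv.KeyValueStore;
    import org.apache.samza.system.IncomingMessageEnvelope;
    import org.apache.samza.task.InitableTask;
    import org.apache.samza.task.MessageCollector;
    import org.apache.samza.task.StreamTask;
    import org.apache.samza.task.TaskContext;
    import org.apache.samza.task.TaskCoordinator;

    public class UniqueIdTask implements StreamTask, InitableTask {
      private KeyValueStore<String, Long> store;

      @Override
      @SuppressWarnings("unchecked")
      public void init(Config config, TaskContext context) {
        // "unique-id-store" is a placeholder; it must match the
        // stores.<name>.* entries in the job config.
        store = (KeyValueStore<String, Long>) context.getStore("unique-id-store");
      }

      @Override
      public void process(IncomingMessageEnvelope envelope,
                          MessageCollector collector,
                          TaskCoordinator coordinator) {
        // New records are written here; KeyValueStore exposes put()/get()/delete().
        // The value is the insertion time, so old entries can be pruned later.
        String uniqueId = (String) envelope.getKey();
        store.put(uniqueId, System.currentTimeMillis());
      }
    }

My understanding is that the changelog topic is only replayed by the store on restore(), so new records have to be written from process() like this; please correct me if that's wrong.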
On Fri, Oct 2, 2015 at 2:01 PM, Yi Pan <nickpa...@gmail.com> wrote:

> Hi, Jae Hyeon,
>
> Good to see you back on the mailing list again! Regarding your questions,
> please see the answers below:
>
> > My KeyValueStore usage is a little bit different from usual cases because
> > I have to cache all unique ids for the past six hours, which can be
> > configured for the retention usage. Unique ids won't be repeated, such as
> > a timestamp. In this case, log.cleanup.policy=compact will keep growing
> > the KeyValueStore size, right?
>
> It will grow as big as the cumulative size of your unique ids.
>
> > Can I use Samza KeyValueStore for topics with log.cleanup.policy=delete?
> > If not, what's your recommended way for state management of a
> > non-changelog Kafka topic? If it's possible, how does Kafka cleanup
> > remove outdated records in the KeyValueStore?
>
> I am not quite sure about your definition of "non-changelog" Kafka topics.
> If you want to retire some of the old records in a KV-store periodically,
> you will have to run the pruning manually in the window() method in the
> current release. In the upcoming 0.10 release, we have incorporated the
> RocksDB TTL feature in the KV-store, which automatically prunes old
> entries from the RocksDB store. That said, the TTL feature is not yet
> fully synchronized w/ the Kafka cleanup; that is ongoing work. The
> recommendation is to use the TTL feature and set the Kafka changelog to be
> time-retention based, w/ a retention time longer than the RocksDB TTL to
> ensure no data loss.
>
> Hope the above answered your questions.
>
> Cheers!
>
> -Yi
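For reference, below is a rough sketch of the manual pruning in window() that Yi describes, assuming each stored value is the insertion timestamp and reusing the placeholder store name "unique-id-store" from above. In practice this would live on the same task class as the process() sketch, with task.window.ms controlling how often window() fires.

    import org.apache.samza.config.Config;
    import org.apache.samza.storage.kv.Entry;
    import org.apache.samza.storage.kv.KeyValueIterator;
    import org.apache.samza.storage.kv.KeyValueStore;
    import org.apache.samza.task.InitableTask;
    import org.apache.samza.task.MessageCollector;
    import org.apache.samza.task.TaskContext;
    import org.apache.samza.task.TaskCoordinator;
    import org.apache.samza.task.WindowableTask;

    public class UniqueIdPruningTask implements WindowableTask, InitableTask {
      private static final long RETENTION_MS = 6L * 60 * 60 * 1000; // six hours

      private KeyValueStore<String, Long> store;

      @Override
      @SuppressWarnings("unchecked")
      public void init(Config config, TaskContext context) {
        store = (KeyValueStore<String, Long>) context.getStore("unique-id-store");
      }

      @Override
      public void window(MessageCollector collector, TaskCoordinator coordinator) {
        long cutoff = System.currentTimeMillis() - RETENTION_MS;
        // Full scan of the store; for a very large store, keys prefixed with a
        // timestamp plus range() would avoid touching every entry.
        KeyValueIterator<String, Long> it = store.all();
        try {
          while (it.hasNext()) {
            Entry<String, Long> entry = it.next();
            if (entry.getValue() < cutoff) {
              store.delete(entry.getKey());
            }
          }
        } finally {
          it.close(); // KV iterators hold RocksDB resources and must be closed
        }
      }
    }

And, if I read the 0.10 recommendation correctly, the store config would look roughly like the following (key names from memory, so they should be double-checked against the 0.10 configuration docs); the changelog topic's retention.ms is a Kafka topic-level setting and would be set to something longer than the TTL, e.g. twelve hours:

    # Placeholder store name; key names are from memory, please verify against
    # the Samza 0.10 configuration table.
    stores.unique-id-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
    stores.unique-id-store.changelog=kafka.unique-id-store-changelog
    # Six hours, matching RETENTION_MS above.
    stores.unique-id-store.rocksdb.ttl.ms=21600000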