Thanks, Yi Pan. I have one more question: does the KV-store consume from its Kafka topic automatically, or only on restore()? If it is only on restore(), do I have to implement a StreamTask that consumes the Kafka topic and calls the add() method myself?
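For concreteness, this is roughly what I have in mind: a minimal sketch, assuming the ids arrive as message keys and the store is registered in the job config under the placeholder name "unique-id-store".

    import org.apache.samza.config.Config;
    import org.apache.samza.storage.kv.KeyValueStore;
    import org.apache.samza.system.IncomingMessageEnvelope;
    import org.apache.samza.task.InitableTask;
    import org.apache.samza.task.MessageCollector;
    import org.apache.samza.task.StreamTask;
    import org.apache.samza.task.TaskContext;
    import org.apache.samza.task.TaskCoordinator;

    public class UniqueIdTask implements StreamTask, InitableTask {
      private KeyValueStore<String, Long> store;

      @Override
      @SuppressWarnings("unchecked")
      public void init(Config config, TaskContext context) {
        // "unique-id-store" is a placeholder; it must match the
        // stores.<name>.* entries in the job config.
        store = (KeyValueStore<String, Long>) context.getStore("unique-id-store");
      }

      @Override
      public void process(IncomingMessageEnvelope envelope,
                          MessageCollector collector,
                          TaskCoordinator coordinator) {
        // New records are written here; KeyValueStore exposes put()/get()/delete().
        // The value is the insertion time, so old entries can be pruned later.
        String uniqueId = (String) envelope.getKey();
        store.put(uniqueId, System.currentTimeMillis());
      }
    }

My understanding is that the changelog topic is only replayed by the store on restore(), so new records have to be written from process() like this; please correct me if that's wrong.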
On Fri, Oct 2, 2015 at 2:01 PM, Yi Pan <nickpa...@gmail.com> wrote:

> Hi, Jae Hyeon,
>
> Good to see you back on the mailing list again! Regarding your questions,
> please see the answers below:
>
> > My KeyValueStore usage is a little bit different from usual cases because
> > I have to cache all unique ids for the past six hours, which can be
> > configured for the retention usage. Unique ids won't be repeated, such as
> > a timestamp. In this case, log.cleanup.policy=compact will keep growing
> > the KeyValueStore size, right?
>
> It will grow as big as the cumulative size of your unique ids.
>
> > Can I use Samza KeyValueStore for topics with log.cleanup.policy=delete?
> > If not, what's your recommended way for state management of a
> > non-changelog Kafka topic? If it's possible, how does Kafka cleanup
> > remove outdated records in the KeyValueStore?
>
> I am not quite sure about your definition of "non-changelog" Kafka topics.
> If you want to retire some of the old records in a KV-store periodically,
> you will have to run the pruning manually in the window() method in the
> current release. In the upcoming 0.10 release, we have incorporated the
> RocksDB TTL feature in the KV-store, which automatically prunes old
> entries from the RocksDB store. That said, the TTL feature is not yet
> fully synchronized w/ the Kafka cleanup; that is ongoing work. The
> recommendation is to use the TTL feature and set the Kafka changelog to be
> time-retention based, w/ a retention time longer than the RocksDB TTL to
> ensure no data loss.
>
> Hope the above answered your questions.
>
> Cheers!
>
> -Yi
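For reference, below is a rough sketch of the manual pruning in window() that Yi describes, assuming each stored value is the insertion timestamp and reusing the placeholder store name "unique-id-store" from above. In practice this would live on the same task class as the process() sketch, with task.window.ms controlling how often window() fires.

    import org.apache.samza.config.Config;
    import org.apache.samza.storage.kv.Entry;
    import org.apache.samza.storage.kv.KeyValueIterator;
    import org.apache.samza.storage.kv.KeyValueStore;
    import org.apache.samza.task.InitableTask;
    import org.apache.samza.task.MessageCollector;
    import org.apache.samza.task.TaskContext;
    import org.apache.samza.task.TaskCoordinator;
    import org.apache.samza.task.WindowableTask;

    public class UniqueIdPruningTask implements WindowableTask, InitableTask {
      private static final long RETENTION_MS = 6L * 60 * 60 * 1000; // six hours

      private KeyValueStore<String, Long> store;

      @Override
      @SuppressWarnings("unchecked")
      public void init(Config config, TaskContext context) {
        store = (KeyValueStore<String, Long>) context.getStore("unique-id-store");
      }

      @Override
      public void window(MessageCollector collector, TaskCoordinator coordinator) {
        long cutoff = System.currentTimeMillis() - RETENTION_MS;
        // Full scan of the store; for a very large store, keys prefixed with a
        // timestamp plus range() would avoid touching every entry.
        KeyValueIterator<String, Long> it = store.all();
        try {
          while (it.hasNext()) {
            Entry<String, Long> entry = it.next();
            if (entry.getValue() < cutoff) {
              store.delete(entry.getKey());
            }
          }
        } finally {
          it.close(); // KV iterators hold RocksDB resources and must be closed
        }
      }
    }

And, if I read the 0.10 recommendation correctly, the store config would look roughly like the following (key names from memory, so they should be double-checked against the 0.10 configuration docs); the changelog topic's retention.ms is a Kafka topic-level setting and would be set to something longer than the TTL, e.g. twelve hours:

    # Placeholder store name; key names are from memory, please verify against
    # the Samza 0.10 configuration table.
    stores.unique-id-store.factory=org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
    stores.unique-id-store.changelog=kafka.unique-id-store-changelog
    # Six hours, matching RETENTION_MS above.
    stores.unique-id-store.rocksdb.ttl.ms=21600000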