Ivan,

From your description it seems Kafka stores the "source of truth" for the
data and the k-v store is built by consuming from Kafka, right? In that
case a time/size-based data retention policy is usually not preferred, as it
may delete data unexpectedly while people are querying the k-v store.
If you have to enforce some retention policy, then I would suggest using log
compaction at the Kafka layer plus an app-level thread that cleans up
the data in both Kafka and the k-v store according to your policy (in a
compacted topic, a message with a null payload acts as the delete marker
for its key).
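
As a rough sketch of what one pass of such a cleanup thread might look like
(not a definitive implementation): the names cleanup_once, find_expired_keys
and send_tombstone are hypothetical, the policy is assumed to be "expire keys
not updated for max_age_s seconds", and the Kafka producer is stubbed out as
a callback so no particular client API is implied.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Assumed shape of the k-v store: key -> (value, last update time in seconds).
Store = Dict[str, Tuple[bytes, float]]


def find_expired_keys(kv_store: Store, max_age_s: float, now: float) -> List[str]:
    """Return keys whose last update is older than max_age_s, in sorted order."""
    return sorted(k for k, (_, ts) in kv_store.items() if now - ts > max_age_s)


def cleanup_once(
    kv_store: Store,
    send_tombstone: Callable[[str], None],
    max_age_s: float,
    now: Optional[float] = None,
) -> List[str]:
    """Run one cleanup pass and return the keys that were expired.

    send_tombstone(key) is a placeholder for "publish a message with this key
    and a null payload to the compacted topic".
    """
    now = time.time() if now is None else now
    expired = find_expired_keys(kv_store, max_age_s, now)
    for key in expired:
        # Write the delete marker first, so Kafka stays the source of truth
        # even if the process dies before the local delete.
        send_tombstone(key)
        del kv_store[key]
    return expired
```

In a real application this would run periodically in a background thread,
with send_tombstone wired to whatever producer you use.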

Guozhang


On Sun, Mar 1, 2015 at 8:07 AM, Ivan Balashov <ibalas...@gmail.com> wrote:

> 2015-03-01 18:41 GMT+03:00 Jay Kreps <jay.kr...@gmail.com>:
> > They are mutually exclusive. Can you expand on the motivation/use for
> > combining them?
>
> Thanks, Jay
>
> Let's say we need to build key-value storage semantically connected to
> the data that is also stored in Kafka.
> Once the particular pieces of data are gone due to retention
> expiration there might be no need to keep relevant pieces in the
> kv-storage.
> On the other hand, kv-storage most likely will benefit from
> compaction, since its keys receive multiple updates.
>
> If this is not available oob, looks like the same can now be achieved
> by manually scanning compacted topic and issuing "delete" markers.
>



-- 
-- Guozhang