Ivan,

From your description, it seems Kafka stores the "source of truth" for the data, and the k-v store is built by consuming from Kafka, right? In that case a time- or size-based retention policy is usually not preferred, since it may delete data unexpectedly while people are querying the k-v store. If you have to enforce some retention policy, I would suggest using log compaction at the Kafka layer together with an app-level thread that cleans up the data in both Kafka and the k-v store according to your policy.
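To make the idea concrete, here is a rough sketch of what one pass of such a cleanup thread could look like. It is only an illustration, not a tested implementation: the log is modeled as an in-memory list rather than a real consumer, and `Record`, `cleanup_pass`, and the `send` callback are hypothetical names. It also assumes the application tracks a timestamp per record, for example inside the payload. The one Kafka-specific fact it relies on is that a message with a null value acts as a delete marker for its key in a compacted topic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a message in the compacted topic."""
    key: str
    value: Optional[bytes]  # None models a delete marker (null value)
    timestamp_ms: int       # assumed to be tracked by the application


def latest_per_key(log: Iterable[Record]) -> Dict[str, Record]:
    """Replay the log in order; the last record for each key wins."""
    latest: Dict[str, Record] = {}
    for rec in log:
        latest[rec.key] = rec
    return latest


def expired_keys(log: Iterable[Record], now_ms: int, max_age_ms: int) -> List[str]:
    """Keys whose latest live record is older than the retention policy allows."""
    return sorted(
        key
        for key, rec in latest_per_key(log).items()
        if rec.value is not None and now_ms - rec.timestamp_ms > max_age_ms
    )


def cleanup_pass(
    log: Iterable[Record],
    kv_store: Dict[str, bytes],
    now_ms: int,
    max_age_ms: int,
    send: Callable[[Record], None],
) -> List[str]:
    """Issue a delete marker for each expired key and drop it from the k-v store."""
    keys = expired_keys(log, now_ms, max_age_ms)
    for key in keys:
        # In a real deployment `send` would hand a null-valued message
        # for this key to a Kafka producer.
        send(Record(key, None, now_ms))
        kv_store.pop(key, None)
    return keys
```

Since the same pass removes the key from both places, the k-v store never keeps an entry whose source record has been deleted. A key that is updated again after the pass simply gets a new record that supersedes the delete marker.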
Guozhang

On Sun, Mar 1, 2015 at 8:07 AM, Ivan Balashov <ibalas...@gmail.com> wrote:

> 2015-03-01 18:41 GMT+03:00 Jay Kreps <jay.kr...@gmail.com>:
> > They are mutually exclusive. Can you expand on the motivation/use for
> > combining them?
>
> Thanks, Jay
>
> Let's say we need to build key-value storage semantically connected to
> the data that also stored in kafka.
> Once the particular pieces of data are gone due to retention
> expiration there might be no need to keep relevant pieces in the
> kv-storage.
> On the other hand, kv-storage most likely will benefit from
> compaction, since its keys receive multiple updates.
>
> If this is not available oob, looks like the same can now be achieved
> by manually scanning compacted topic and issuing "delete" markers.
>

--
-- Guozhang