We've tried to use Kafka not as a persistent store, but as a long-term archival store. One outstanding issue we've had is that the broker holds an open file handle on every file in the log! The other is that when you create a long-term archival log on shared storage, you can't simply access that data from another cluster, because the metadata is stored in ZooKeeper rather than in the log.
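
To put a rough number on the file handle issue, here is a minimal back-of-the-envelope sketch. It assumes 1 GiB segments (the default `log.segment.bytes`) and two files per segment (a `.log` file plus an `.index` file). The function name and the example sizes are hypothetical, and the real count depends on your Kafka version and settings.

```python
# Assumption: 1 GiB matches the default log.segment.bytes; adjust to your broker config.
SEGMENT_BYTES = 1024 ** 3
# Assumption: each segment consists of a .log file plus an .index file.
FILES_PER_SEGMENT = 2


def estimate_open_files(partition_bytes, partitions,
                        segment_bytes=SEGMENT_BYTES,
                        files_per_segment=FILES_PER_SEGMENT):
    """Rough count of files a broker keeps open for a topic that is never deleted."""
    # Integer ceiling division; a partition always has at least one (active) segment.
    segments_per_partition = max(1, -(-partition_bytes // segment_bytes))
    return segments_per_partition * files_per_segment * partitions


if __name__ == "__main__":
    # Hypothetical archive: 100 partitions on one broker, 500 GiB each.
    print(estimate_open_files(500 * 1024 ** 3, 100))  # prints 100000
```

Comparing the result against the broker's open-file limit (`ulimit -n`) gives a quick sense of how soon an ever-growing log would hit it.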

--Scott Thibault

On Mon, Jul 13, 2015 at 4:44 AM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:

> Would it be possible to document how to configure Kafka to never delete
> messages in a topic? It took a good while to figure this out, and I see it
> as an important use case for Kafka.
>
> On Sun, Jul 12, 2015 at 3:02 PM Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
>
> > > On 10. jul. 2015, at 23.03, Jay Kreps <j...@confluent.io> wrote:
> > >
> > > If I recall correctly, setting log.retention.ms and log.retention.bytes
> > > to -1 disables both.
> >
> > Thanks!
> >
> > > On Fri, Jul 10, 2015 at 1:55 PM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
> > >
> > >>> On 10. jul. 2015, at 15.16, Shayne S <shaynest...@gmail.com> wrote:
> > >>>
> > >>> There are two ways you can configure your topics: with log compaction
> > >>> or with no cleaning. The choice depends on your use case. Are the
> > >>> records uniquely identifiable and will they receive updates? Then log
> > >>> compaction is the way to go. If they are truly read only, you can go
> > >>> without log compaction.
> > >>
> > >> I'd rather be free to use the key for partitioning, and the records are
> > >> immutable (they're event records), so disabling compaction altogether
> > >> would be preferable. How is that accomplished?
> > >>
> > >>> We have small processes which consume a topic and perform upserts to
> > >>> our various database engines. It's easy to change how it all works and
> > >>> simply consume the single source of truth again.
> > >>>
> > >>> I've written a bit about log compaction here:
> > >>> http://www.shayne.me/blog/2015/2015-06-25-everything-about-kafka-part-2/
> > >>>
> > >>> On Fri, Jul 10, 2015 at 3:46 AM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
> > >>>
> > >>>> I'd like to use Kafka as a persistent store – sort of as an
> > >>>> alternative to HDFS. The idea is that I'd load the data into various
> > >>>> other systems in order to solve specific needs such as full-text
> > >>>> search, analytics, indexing by various attributes, etc. I'd like to
> > >>>> keep a single source of truth, however.
> > >>>>
> > >>>> I'm struggling a bit to understand how I can configure a topic to
> > >>>> retain messages indefinitely. I want to make sure that my data isn't
> > >>>> deleted. Is there a guide to configuring Kafka like this?