Am I correct in assuming that Kafka will only retain a file handle for the last segment of the log? If the number of handles grows unbounded, then it would be an issue. But I plan on writing to this topic continuously anyway, so not separating data into cold and hot storage is the entire point.
Daniel Schierbeck

> On 13. jul. 2015, at 15.41, Scott Thibault <scott.thiba...@multiscalehn.com> wrote:
>
> We've tried to use Kafka not as a persistent store, but as a long-term archival store. An outstanding issue we've had with that is that the broker holds an open file handle on every file in the log! The other issue we've had is that when you create a long-term archival log on shared storage, you can't simply access that data from another cluster, because the metadata is stored in ZooKeeper rather than in the log.
>
> --Scott Thibault
>
>> On Mon, Jul 13, 2015 at 4:44 AM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
>>
>> Would it be possible to document how to configure Kafka to never delete messages in a topic? It took a good while to figure this out, and I see it as an important use case for Kafka.
>>
>>> On Sun, Jul 12, 2015 at 3:02 PM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
>>>
>>>> On 10. jul. 2015, at 23.03, Jay Kreps <j...@confluent.io> wrote:
>>>>
>>>> If I recall correctly, setting log.retention.ms and log.retention.bytes to -1 disables both.
>>>
>>> Thanks!
>>>
>>>> On Fri, Jul 10, 2015 at 1:55 PM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
>>>>
>>>>>> On 10. jul. 2015, at 15.16, Shayne S <shaynest...@gmail.com> wrote:
>>>>>>
>>>>>> There are two ways you can configure your topics: log compaction, or no cleaning at all. The choice depends on your use case. Are the records uniquely identifiable, and will they receive updates? Then log compaction is the way to go. If they are truly read-only, you can go without log compaction.
>>>>>
>>>>> I'd rather be free to use the key for partitioning, and the records are immutable — they're event records — so disabling compaction altogether would be preferable. How is that accomplished?
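To make Jay's suggestion concrete, here is a sketch of the broker-level settings in `server.properties`; these are broker-wide defaults, and the exact behavior may vary by Kafka version, so treat this as a starting point rather than a definitive recipe:

```properties
# server.properties — broker-wide defaults (sketch)
# -1 disables time-based log deletion
log.retention.ms=-1
# -1 disables size-based log deletion
log.retention.bytes=-1
# "delete" with retention disabled means segments are simply never cleaned;
# "compact" would instead keep only the latest record per key
log.cleanup.policy=delete
```

Note that these are defaults for all topics on the broker; per-topic overrides (see below in the thread) take precedence where set.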
>>>>>> We have small processes which consume a topic and perform upserts to our various database engines. It's easy to change how it all works and simply consume the single source of truth again.
>>>>>>
>>>>>> I've written a bit about log compaction here: http://www.shayne.me/blog/2015/2015-06-25-everything-about-kafka-part-2/
>>>>>>
>>>>>> On Fri, Jul 10, 2015 at 3:46 AM, Daniel Schierbeck <daniel.schierb...@gmail.com> wrote:
>>>>>>
>>>>>>> I'd like to use Kafka as a persistent store – sort of as an alternative to HDFS. The idea is that I'd load the data into various other systems in order to solve specific needs such as full-text search, analytics, indexing by various attributes, etc. I'd like to keep a single source of truth, however.
>>>>>>>
>>>>>>> I'm struggling a bit to understand how I can configure a topic to retain messages indefinitely. I want to make sure that my data isn't deleted. Is there a guide to configuring Kafka like this?
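For Daniel's per-topic case, the same knobs exist as topic-level overrides (the same names without the `log.` prefix). A sketch of creating such a topic with the era-appropriate CLI — the topic name, partition count, and replication factor here are illustrative, not from the thread:

```
# Create a topic whose messages are never deleted or compacted.
# Topic name, partitions, and replication factor are illustrative.
kafka-topics.sh --zookeeper localhost:2181 --create --topic events \
  --partitions 8 --replication-factor 3 \
  --config retention.ms=-1 \
  --config retention.bytes=-1 \
  --config cleanup.policy=delete
```

Because these are topic-level configs, they override whatever broker-wide retention defaults are in place, which also addresses Daniel's wish to keep the key free for partitioning: `cleanup.policy=delete` combined with disabled retention never touches the data, regardless of keys.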