Awesome! Thanks Dave! Right now we're using a chain of Apache Flume instances to do this (since Flume doesn't support SSL->Kafka), but Dory with persistence would let us simplify that a lot by sending straight from Dory into Kafka.
-J

On Mon, Jun 13, 2016 at 12:23 PM, Dave Peterson <d...@dspeterson.com> wrote:

> Hmm... interesting use case. I can see the usefulness of persistence in
> that scenario. Thanks very much for the feedback! I just added your
> idea to Dory's wiki page http://dory.wikidot.com . Progress on adding
> features has been slow lately, since I work on Dory in my limited spare
> time. But I welcome all the outside help I can get. If you or others
> are interested in working on adding a persistence feature, that would be
> great. Otherwise hopefully I'll get a chance to work on it myself at
> some point. Thanks again for the suggestion!
>
>
> Regards,
> Dave
>
>
> -----Original Message-----
> From: "Jason J. W. Williams" <jasonjwwilli...@gmail.com>
> Sent: Monday, June 13, 2016 10:45am
> To: users@kafka.apache.org
> Subject: Re: Introducing Dory
>
> Hi Dave,
>
> Dory sounds very exciting. Without persistence it's less useful for
> clients connected over a WAN, since if the WAN goes wonky you could
> build up quite a queue until it comes back.
>
> -J
>
> On Mon, Jun 13, 2016 at 3:00 AM, Dave Peterson <d...@dspeterson.com>
> wrote:
>
> > Hi Arya,
> >
> > In the case of a kernel panic or other failure serious enough to bring
> > a machine down (hardware failure, power outage, etc.), message loss is
> > definitely possible, and not totally preventable regardless of what
> > precautions software takes to avoid data loss. To ensure high
> > throughput, Dory batches messages together before forwarding them to
> > Kafka. The amount of batching Dory performs is configurable on a
> > per-topic basis, and in practice will typically cause a delay of
> > between several hundred milliseconds and several seconds before
> > messages are forwarded to Kafka. Thus, if a machine on which Dory is
> > running crashes, you can expect to lose up to a few seconds of batched
> > messages depending on the batching configuration.
> > Additionally, you may lose some sent messages for which Dory has not
> > yet received an ACK from Kafka indicating successful receipt and
> > persistence to disk.
> >
> > To narrow the window of possible message loss, one can set a very
> > short batching delay, or even completely disable batching for one or
> > more topics. However, this is not recommended since it will have a
> > serious adverse effect on the performance of your Kafka cluster.
> > Another possibility would be for Dory to immediately persist messages
> > to local disk on receipt, even before they are batched, and then
> > delete them only after a non-error ACK has been received from Kafka.
> > This would greatly narrow, although not totally eliminate, the window
> > for message loss. However, it would add substantial complexity to
> > Dory, and impose a substantial disk I/O performance cost. An
> > important consideration is that when you write to a file, the kernel's
> > default action is to not immediately write the data to disk. It
> > buffers the data in its page cache for a period of time (typically up
> > to 30 seconds) to improve throughput. If the machine crashes, all of
> > this unwritten data is lost. You can get around this by calling
> > fsync() on a file descriptor or specifying O_DIRECT | O_SYNC when
> > opening a file, but this will degrade I/O performance. Dory avoids
> > persisting data to disk with the view that these downsides outweigh
> > the benefits of avoiding up to a few seconds of data loss when a
> > machine crashes. However, a feature I intend to add (described later
> > in this email) will provide useful information about which messages
> > were successfully delivered to Kafka in the event of a machine crash.
> >
> > Dory takes great care to avoid message loss due to circumstances
> > other than machine crashes.
> > When Dory starts up, it preallocates a fixed amount of memory
> > (configurable with the --msg_buffer_max N command line option) for
> > storing messages received from clients. Dory holds on to each message
> > it receives until one of the following has occurred:
> >
> >   - It receives an ACK from Kafka indicating that the message has
> >     been successfully received and persisted to disk.
> >
> >   - It receives an error ACK from Kafka that causes the message to
> >     be immediately discarded. Certain error ACKs cause Dory to
> >     immediately discard messages, and others cause it to pause,
> >     update its metadata, and attempt redelivery based on the new
> >     metadata. ACK error 2 (invalid message, i.e. CRC error) causes
> >     Dory to immediately attempt redelivery without updating its
> >     metadata. The actions Dory takes for various error ACK values
> >     are documented here:
> >
> >     https://github.com/dspeterson/dory/blob/master/doc/design.md#dispatcher
> >
> >   - As mentioned above, certain error ACKs will cause Dory to
> >     attempt redelivery. Each time this occurs for a given message,
> >     Dory increments a failed delivery attempt count on the message.
> >     Once the count exceeds a certain value (specified by the
> >     --max_failed_delivery_attempts N command line option), Dory
> >     discards the message.
> >
> >   - It has determined that the message is undeliverable due to a
> >     problem such as the message being malformed (see
> >
> >     https://github.com/dspeterson/dory/blob/master/doc/sending_messages.md
> >
> >     which documents Dory's input message format) or having an
> >     invalid topic. In this case, the message is discarded.
> >
> >   - It has exhausted its preallocated memory for storing messages.
> >     This may occur in the case where a network-related outage or
> >     serious problem with the Kafka cluster persists for an extended
> >     period of time, preventing normal message delivery. In this
> >     case, new messages received from clients are discarded.
> >
> >   - Dory receives a shutdown signal (SIGTERM or SIGINT). In this
> >     case, Dory immediately stops accepting new messages from
> >     clients, and for a certain configurable period of time attempts
> >     to deliver any pending messages. Once the time limit expires,
> >     any remaining messages are lost. This type of message loss
> >     can be avoided by not shutting Dory down until no more clients
> >     are sending it messages, and Dory has been given enough time to
> >     process all pending messages.
> >
> > A critical aspect of Dory's behavior is that every discard is tracked
> > along with the reason why the discard occurred. Dory provides a web
> > interface from which periodic discard reports and other status
> > information can be obtained. Thus, in the rare event when discards
> > occur, they can be tracked and recorded on a per-topic basis. The
> > intent is for monitoring infrastructure to periodically ask Dory for
> > a discard report, alert an administrator if message loss occurred, and
> > optionally store the discard reports in a database so that a queryable
> > history of data quality is available.
> >
> > A relevant feature I intend to add at some point is described here:
> >
> > http://dory.wikidot.com/wiki:commit-points
> >
> > This notion of a commit point can be useful in the event of a machine
> > crash. If monitoring infrastructure periodically queries Dory for
> > commit point information, then in the event of a machine crash, we
> > know that any resulting message loss is limited to messages with
> > timestamps > T, where T is the latest commit point.
> >
> > I hope this helps answer your questions, and that I haven't inundated
> > you with too much information. Let me know if you have more
> > questions.
> >
> >
> > Regards,
> > Dave
> >
> >
> > -----Original Message-----
> > From: "Arya Ketan" <ketan.a...@gmail.com>
> > Sent: Sunday, June 12, 2016 11:52am
> > To: users@kafka.apache.org
> > Subject: Re: Introducing Dory
> >
> > Hi Dave,
> > Dory looks pretty interesting. I had a few further questions on it:
> > a) How does Dory handle kernel panics?
> > b) What kind of message guarantees does Dory provide, and can you
> > share some of the design decisions taken to enable those guarantees,
> > whatever they are?
> >
> > Thanks
> > Arya
> >
> > On Sun, Jun 12, 2016 at 8:10 PM, Dave Peterson <d...@dspeterson.com>
> > wrote:
> >
> > > Thanks! Enjoy :-)
> > >
> > > On 6/12/2016 12:24 AM, Gwen Shapira wrote:
> > >
> > >> Dory is pretty cool (even though it is named after a somewhat dorky
> > >> fish). Thank you for sharing :)
> > >>
> > >> On Sun, Jun 12, 2016 at 1:24 AM, Dave Peterson <d...@dspeterson.com>
> > >> wrote:
> > >>
> > >>> Hello Kafka users,
> > >>>
> > >>> Version 1.1.0 of Dory is now available. See
> > >>> https://github.com/dspeterson/dory for details. Dory is the
> > >>> successor to Bruce (https://github.com/tagged/bruce), a Kafka
> > >>> producer daemon I created while working at if(we)
> > >>> (http://www.ifwe.co/). The code has seen a number of improvements
> > >>> since its initial release in September 2014. The list of example
> > >>> clients for various programming languages has also been extended.
> > >>> Dory maintains full backward compatibility with Bruce, so existing
> > >>> users can easily switch.
> > >>>
> > >>> The latest release adds support for receiving messages from
> > >>> clients by UNIX domain stream socket or local TCP. Although UNIX
> > >>> domain datagrams are still the preferred means of sending messages
> > >>> in most cases, the option of using stream sockets facilitates
> > >>> sending messages too large to fit in a single datagram.
> > >>> The local TCP option facilitates adding support for clients
> > >>> written in programming languages that do not provide easy access
> > >>> to UNIX domain sockets.
> > >>>
> > >>> Dory's wiki page http://dory.wikidot.com/start contains a list of
> > >>> ideas for additional features and other improvements. Community
> > >>> feedback is welcomed and appreciated. If you have ideas for things
> > >>> you would like to see in future releases, please add them to the
> > >>> list. Also, please contribute code if you can afford the time.
> > >>>
> > >>>
> > >>> Thanks,
> > >>> Dave Peterson
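
The persistence idea discussed in this thread (write each message to local disk on receipt, before batching, and delete it only after Kafka returns a non-error ACK) can be illustrated with a minimal Python sketch. This is not Dory code: per the thread, Dory does not persist messages to disk. The DiskSpool class, its method names, and the one-file-per-message layout are all hypothetical and chosen only for illustration. The sketch assumes a POSIX system, where a directory can be opened and passed to fsync().

```python
import os


class DiskSpool:
    """Hypothetical spool: one file per message that has not been ACKed."""

    SUFFIX = ".msg"

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        # Resume numbering after any messages left over from a previous run.
        ids = [int(name[:-len(self.SUFFIX)])
               for name in os.listdir(directory)
               if name.endswith(self.SUFFIX)]
        self.next_id = max(ids, default=-1) + 1

    def _path(self, msg_id):
        # Zero-padded names keep lexical order equal to numeric order.
        return os.path.join(self.directory, "%020d%s" % (msg_id, self.SUFFIX))

    def _sync_directory(self):
        # fsync the directory so the rename or delete itself is durable.
        fd = os.open(self.directory, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)

    def persist(self, payload):
        """Durably store payload on receipt; return its spool id."""
        msg_id = self.next_id
        self.next_id += 1
        final_path = self._path(msg_id)
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the data out of the page cache
        os.replace(tmp_path, final_path)  # atomic: no half-written .msg file
        self._sync_directory()
        return msg_id

    def ack(self, msg_id):
        """Delete a message once a non-error ACK has been received."""
        os.remove(self._path(msg_id))
        self._sync_directory()

    def pending(self):
        """Return (id, payload) pairs still awaiting an ACK, oldest first."""
        result = []
        for name in sorted(os.listdir(self.directory)):
            if name.endswith(self.SUFFIX):
                with open(os.path.join(self.directory, name), "rb") as f:
                    result.append((int(name[:-len(self.SUFFIX)]), f.read()))
        return result
```

After a crash and restart, constructing a new DiskSpool on the same directory and calling pending() yields the messages that still need to be sent. The fsync() on every message is the disk I/O cost mentioned in the thread; syncing once per group of messages would reduce that cost, but would widen the loss window again.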