Hi Ben, Thanks for creating the ticket. Having check-and-set capability will be sweet :) Are you planning to implement this yourself? Or is it just an idea for the community?
Gwen On Thu, Jun 11, 2015 at 8:01 PM, Ben Kirwin <b...@kirw.in> wrote: > As it happens, I submitted a ticket for this feature a couple days ago: > > https://issues.apache.org/jira/browse/KAFKA-2260 > > Couldn't find any existing proposals for similar things, but it's > certainly possible they're out there... > > On the other hand, I think you can solve your particular issue by > reframing the problem: treating the messages as 'requests' or > 'commands' instead of statements of fact. In your flight-booking > example, the log would correctly reflect that two different people > tried to book the same flight; the stream consumer would be > responsible for finalizing one booking, and notifying the other client > that their request had failed. (In-browser or by email.) > > On Wed, Jun 10, 2015 at 5:04 AM, Daniel Schierbeck > <daniel.schierb...@gmail.com> wrote: >> I've been working on an application which uses Event Sourcing, and I'd like >> to use Kafka as opposed to, say, a SQL database to store events. This would >> allow me to easily integrate other systems by having them read off the >> Kafka topics. >> >> I do have one concern, though: the consistency of the data can only be >> guaranteed if a command handler has a complete picture of all past events >> pertaining to some entity. >> >> As an example, consider an airline seat reservation system. Each >> reservation command issued by a user is rejected if the seat has already >> been taken. If the seat is available, a record describing the event is >> appended to the log. This works great when there's only one producer, but >> in order to scale I may need multiple producer processes. This introduces a >> race condition: two command handlers may simultaneously receive a command >> to reserver the same seat. The event log indicates that the seat is >> available, so each handler will append a reservation event – thus >> double-booking that seat! >> >> I see three ways around that issue: >> 1. Don't use Kafka for this. >> 2. Force a singler producer for a given flight. This will impact >> availability and make routing more complex. >> 3. Have a way to do optimistic locking in Kafka. >> >> The latter idea would work either on a per-key basis or globally for a >> partition: when appending to a partition, the producer would indicate in >> its request that the request should be rejected unless the current offset >> of the partition is equal to x. For the per-key setup, Kafka brokers would >> track the offset of the latest message for each unique key, if so >> configured. This would allow the request to specify that it should be >> rejected if the offset for key k is not equal to x. >> >> This way, only one of the command handlers would succeed in writing to >> Kafka, thus ensuring consistency. >> >> There are different levels of complexity associated with implementing this >> in Kafka depending on whether the feature would work per-partition or >> per-key: >> * For the per-partition optimistic locking, the broker would just need to >> keep track of the high water mark for each partition and reject conditional >> requests when the offset doesn't match. >> * For per-key locking, the broker would need to maintain an in-memory table >> mapping keys to the offset of the last message with that key. This should >> be fairly easy to maintain and recreate from the log if necessary. It could >> also be saved to disk as a snapshot from time to time in order to cut down >> the time needed to recreate the table on restart. There's a small >> performance penalty associated with this, but it could be opt-in for a >> topic. >> >> Am I the only one thinking about using Kafka like this? Would this be a >> nice feature to have?