As it happens, I submitted a ticket for this feature a couple days ago: https://issues.apache.org/jira/browse/KAFKA-2260
Couldn't find any existing proposals for similar things, but it's certainly
possible they're out there... On the other hand, I think you can solve your
particular issue by reframing the problem: treat the messages as 'requests'
or 'commands' instead of statements of fact. In your flight-booking example,
the log would correctly reflect that two different people tried to book the
same flight; the stream consumer would be responsible for finalizing one
booking and notifying the other client that their request had failed.
(In-browser or by email.)

On Wed, Jun 10, 2015 at 5:04 AM, Daniel Schierbeck
<daniel.schierb...@gmail.com> wrote:
> I've been working on an application which uses Event Sourcing, and I'd like
> to use Kafka as opposed to, say, a SQL database to store events. This would
> allow me to easily integrate other systems by having them read off the
> Kafka topics.
>
> I do have one concern, though: the consistency of the data can only be
> guaranteed if a command handler has a complete picture of all past events
> pertaining to some entity.
>
> As an example, consider an airline seat reservation system. Each
> reservation command issued by a user is rejected if the seat has already
> been taken. If the seat is available, a record describing the event is
> appended to the log. This works great when there's only one producer, but
> in order to scale I may need multiple producer processes. This introduces a
> race condition: two command handlers may simultaneously receive a command
> to reserve the same seat. The event log indicates that the seat is
> available, so each handler will append a reservation event – thus
> double-booking that seat!
>
> I see three ways around that issue:
> 1. Don't use Kafka for this.
> 2. Force a single producer for a given flight. This will impact
> availability and make routing more complex.
> 3. Have a way to do optimistic locking in Kafka.
>
> The latter idea would work either on a per-key basis or globally for a
> partition: when appending to a partition, the producer would indicate in
> its request that the request should be rejected unless the current offset
> of the partition is equal to x. For the per-key setup, Kafka brokers would
> track the offset of the latest message for each unique key, if so
> configured. This would allow the request to specify that it should be
> rejected if the offset for key k is not equal to x.
>
> This way, only one of the command handlers would succeed in writing to
> Kafka, thus ensuring consistency.
>
> There are different levels of complexity associated with implementing this
> in Kafka depending on whether the feature would work per-partition or
> per-key:
> * For the per-partition optimistic locking, the broker would just need to
> keep track of the high water mark for each partition and reject conditional
> requests when the offset doesn't match.
> * For per-key locking, the broker would need to maintain an in-memory table
> mapping keys to the offset of the last message with that key. This should
> be fairly easy to maintain and recreate from the log if necessary. It could
> also be saved to disk as a snapshot from time to time in order to cut down
> the time needed to recreate the table on restart. There's a small
> performance penalty associated with this, but it could be opt-in for a
> topic.
>
> Am I the only one thinking about using Kafka like this? Would this be a
> nice feature to have?
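To make the "requests, not facts" reframing concrete, here's a minimal sketch of a command-handling consumer: because Kafka already serializes messages within a partition, a consumer reading booking *commands* in log order can let the first request for a seat win and mark the rest as failed. All names here are hypothetical, and plain dicts stand in for a real Kafka client.

```python
# Sketch of the "commands, not facts" approach: reservation *requests* are
# appended to the log; a consumer processes them in log order and decides
# the outcome. Hypothetical example -- no real Kafka client involved.

def handle_commands(commands):
    """Consume booking commands in log order; first request per seat wins."""
    taken = {}      # seat -> client that got it
    outcomes = []   # (client, seat, accepted?) notifications to send back
    for cmd in commands:
        seat, client = cmd["seat"], cmd["client"]
        if seat in taken:
            outcomes.append((client, seat, False))  # notify: booking failed
        else:
            taken[seat] = client
            outcomes.append((client, seat, True))   # notify: booking confirmed
    return outcomes

# Two clients race for seat 12A; the log serializes them, so exactly one wins.
log = [
    {"client": "alice", "seat": "12A"},
    {"client": "bob",   "seat": "12A"},
    {"client": "bob",   "seat": "12B"},
]
print(handle_commands(log))
# [('alice', '12A', True), ('bob', '12A', False), ('bob', '12B', True)]
```

The key point is that multiple producers can append commands concurrently; consistency comes from a single consumer (per partition) making the final decision.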
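For comparison, the per-key optimistic locking proposed above could be sketched like this: the broker keeps a table mapping each key to the offset of its latest message, and a conditional append succeeds only when the producer's expected offset matches. This is an in-memory stand-in for a broker partition, not anything in Kafka's actual API; the class and method names are made up for illustration.

```python
# Sketch of the proposed per-key optimistic locking. The "broker" tracks the
# offset of the latest message per key and rejects a conditional append whose
# expected offset is stale. Hypothetical -- not a real Kafka feature or API.

class Partition:
    def __init__(self):
        self.log = []       # appended (key, value) records
        self.latest = {}    # key -> offset of last message with that key

    def conditional_append(self, key, value, expected_offset):
        """Append only if the latest offset for `key` equals expected_offset
        (None meaning 'no message with this key yet'). Returns (ok, offset)."""
        current = self.latest.get(key)
        if current != expected_offset:
            return False, current  # reject: someone else appended first
        self.log.append((key, value))
        offset = len(self.log) - 1
        self.latest[key] = offset
        return True, offset

p = Partition()
# Both handlers read the log before either wrote, so both expect "no message
# yet" for the seat's key -- only the first conditional append can succeed.
ok1, _ = p.conditional_append("seat-12A", "reserved-by-alice", None)
ok2, _ = p.conditional_append("seat-12A", "reserved-by-bob", None)
print(ok1, ok2)  # True False
```

The losing producer would then re-read the log, see the seat is taken, and reject the command, which is exactly the compare-and-set behavior the proposal describes.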