I have been thinking a little about this. I don't think CAS actually requires any particular broker support. Rather the two writers just write messages with some deterministic check-and-set criteria and all the replicas read from the log and check this criteria before applying the write. This mechanism has the downside that it creates additional writes when there is a conflict and requires waiting on the full roundtrip (write and then read) but it has the advantage that it is very flexible as to the criteria you use.
An alternative strategy for accomplishing the same thing a bit more efficiently is to elect leaders amongst the writers themselves. This would require broker support for single writer to avoid the possibility of split brain. I like this approach better because the leader for a partition can then do anything they want on their local data to make the decision of what is committed, however the downside is that the mechanism is more involved. -Jay On Fri, Jun 12, 2015 at 6:43 AM, Ben Kirwin <b...@kirw.in> wrote: > Gwen: Right now I'm just looking for feedback -- but yes, if folks are > interested, I do plan to do that implementation work. > > Daniel: Yes, that's exactly right. I haven't thought much about > per-key... it does sound useful, but the implementation seems a bit > more involved. Want to add it to the ticket? > > On Fri, Jun 12, 2015 at 7:49 AM, Daniel Schierbeck > <daniel.schierb...@gmail.com> wrote: > > Ben: your solutions seems to focus on partition-wide CAS. Have you > > considered per-key CAS? That would make the feature more useful in my > > opinion, as you'd greatly reduce the contention. > > > > On Fri, Jun 12, 2015 at 6:54 AM Gwen Shapira <gshap...@cloudera.com> > wrote: > > > >> Hi Ben, > >> > >> Thanks for creating the ticket. Having check-and-set capability will be > >> sweet :) > >> Are you planning to implement this yourself? Or is it just an idea for > >> the community? > >> > >> Gwen > >> > >> On Thu, Jun 11, 2015 at 8:01 PM, Ben Kirwin <b...@kirw.in> wrote: > >> > As it happens, I submitted a ticket for this feature a couple days > ago: > >> > > >> > https://issues.apache.org/jira/browse/KAFKA-2260 > >> > > >> > Couldn't find any existing proposals for similar things, but it's > >> > certainly possible they're out there... > >> > > >> > On the other hand, I think you can solve your particular issue by > >> > reframing the problem: treating the messages as 'requests' or > >> > 'commands' instead of statements of fact. In your flight-booking > >> > example, the log would correctly reflect that two different people > >> > tried to book the same flight; the stream consumer would be > >> > responsible for finalizing one booking, and notifying the other client > >> > that their request had failed. (In-browser or by email.) > >> > > >> > On Wed, Jun 10, 2015 at 5:04 AM, Daniel Schierbeck > >> > <daniel.schierb...@gmail.com> wrote: > >> >> I've been working on an application which uses Event Sourcing, and > I'd > >> like > >> >> to use Kafka as opposed to, say, a SQL database to store events. This > >> would > >> >> allow me to easily integrate other systems by having them read off > the > >> >> Kafka topics. > >> >> > >> >> I do have one concern, though: the consistency of the data can only > be > >> >> guaranteed if a command handler has a complete picture of all past > >> events > >> >> pertaining to some entity. > >> >> > >> >> As an example, consider an airline seat reservation system. Each > >> >> reservation command issued by a user is rejected if the seat has > already > >> >> been taken. If the seat is available, a record describing the event > is > >> >> appended to the log. This works great when there's only one producer, > >> but > >> >> in order to scale I may need multiple producer processes. This > >> introduces a > >> >> race condition: two command handlers may simultaneously receive a > >> command > >> >> to reserver the same seat. The event log indicates that the seat is > >> >> available, so each handler will append a reservation event – thus > >> >> double-booking that seat! > >> >> > >> >> I see three ways around that issue: > >> >> 1. Don't use Kafka for this. > >> >> 2. Force a singler producer for a given flight. This will impact > >> >> availability and make routing more complex. > >> >> 3. Have a way to do optimistic locking in Kafka. > >> >> > >> >> The latter idea would work either on a per-key basis or globally for > a > >> >> partition: when appending to a partition, the producer would > indicate in > >> >> its request that the request should be rejected unless the current > >> offset > >> >> of the partition is equal to x. For the per-key setup, Kafka brokers > >> would > >> >> track the offset of the latest message for each unique key, if so > >> >> configured. This would allow the request to specify that it should be > >> >> rejected if the offset for key k is not equal to x. > >> >> > >> >> This way, only one of the command handlers would succeed in writing > to > >> >> Kafka, thus ensuring consistency. > >> >> > >> >> There are different levels of complexity associated with implementing > >> this > >> >> in Kafka depending on whether the feature would work per-partition or > >> >> per-key: > >> >> * For the per-partition optimistic locking, the broker would just > need > >> to > >> >> keep track of the high water mark for each partition and reject > >> conditional > >> >> requests when the offset doesn't match. > >> >> * For per-key locking, the broker would need to maintain an in-memory > >> table > >> >> mapping keys to the offset of the last message with that key. This > >> should > >> >> be fairly easy to maintain and recreate from the log if necessary. It > >> could > >> >> also be saved to disk as a snapshot from time to time in order to cut > >> down > >> >> the time needed to recreate the table on restart. There's a small > >> >> performance penalty associated with this, but it could be opt-in for > a > >> >> topic. > >> >> > >> >> Am I the only one thinking about using Kafka like this? Would this > be a > >> >> nice feature to have? > >> >