Ben: your solutions seems to focus on partition-wide CAS. Have you
considered per-key CAS? That would make the feature more useful in my
opinion, as you'd greatly reduce the contention.

On Fri, Jun 12, 2015 at 6:54 AM Gwen Shapira <gshap...@cloudera.com> wrote:

> Hi Ben,
>
> Thanks for creating the ticket. Having check-and-set capability will be
> sweet :)
> Are you planning to implement this yourself? Or is it just an idea for
> the community?
>
> Gwen
>
> On Thu, Jun 11, 2015 at 8:01 PM, Ben Kirwin <b...@kirw.in> wrote:
> > As it happens, I submitted a ticket for this feature a couple days ago:
> >
> > https://issues.apache.org/jira/browse/KAFKA-2260
> >
> > Couldn't find any existing proposals for similar things, but it's
> > certainly possible they're out there...
> >
> > On the other hand, I think you can solve your particular issue by
> > reframing the problem: treating the messages as 'requests' or
> > 'commands' instead of statements of fact. In your flight-booking
> > example, the log would correctly reflect that two different people
> > tried to book the same flight; the stream consumer would be
> > responsible for finalizing one booking, and notifying the other client
> > that their request had failed. (In-browser or by email.)
> >
> > On Wed, Jun 10, 2015 at 5:04 AM, Daniel Schierbeck
> > <daniel.schierb...@gmail.com> wrote:
> >> I've been working on an application which uses Event Sourcing, and I'd
> like
> >> to use Kafka as opposed to, say, a SQL database to store events. This
> would
> >> allow me to easily integrate other systems by having them read off the
> >> Kafka topics.
> >>
> >> I do have one concern, though: the consistency of the data can only be
> >> guaranteed if a command handler has a complete picture of all past
> events
> >> pertaining to some entity.
> >>
> >> As an example, consider an airline seat reservation system. Each
> >> reservation command issued by a user is rejected if the seat has already
> >> been taken. If the seat is available, a record describing the event is
> >> appended to the log. This works great when there's only one producer,
> but
> >> in order to scale I may need multiple producer processes. This
> introduces a
> >> race condition: two command handlers may simultaneously receive a
> command
> >> to reserver the same seat. The event log indicates that the seat is
> >> available, so each handler will append a reservation event – thus
> >> double-booking that seat!
> >>
> >> I see three ways around that issue:
> >> 1. Don't use Kafka for this.
> >> 2. Force a singler producer for a given flight. This will impact
> >> availability and make routing more complex.
> >> 3. Have a way to do optimistic locking in Kafka.
> >>
> >> The latter idea would work either on a per-key basis or globally for a
> >> partition: when appending to a partition, the producer would indicate in
> >> its request that the request should be rejected unless the current
> offset
> >> of the partition is equal to x. For the per-key setup, Kafka brokers
> would
> >> track the offset of the latest message for each unique key, if so
> >> configured. This would allow the request to specify that it should be
> >> rejected if the offset for key k is not equal to x.
> >>
> >> This way, only one of the command handlers would succeed in writing to
> >> Kafka, thus ensuring consistency.
> >>
> >> There are different levels of complexity associated with implementing
> this
> >> in Kafka depending on whether the feature would work per-partition or
> >> per-key:
> >> * For the per-partition optimistic locking, the broker would just need
> to
> >> keep track of the high water mark for each partition and reject
> conditional
> >> requests when the offset doesn't match.
> >> * For per-key locking, the broker would need to maintain an in-memory
> table
> >> mapping keys to the offset of the last message with that key. This
> should
> >> be fairly easy to maintain and recreate from the log if necessary. It
> could
> >> also be saved to disk as a snapshot from time to time in order to cut
> down
> >> the time needed to recreate the table on restart. There's a small
> >> performance penalty associated with this, but it could be opt-in for a
> >> topic.
> >>
> >> Am I the only one thinking about using Kafka like this? Would this be a
> >> nice feature to have?
>

Reply via email to