Hi,

Couple of comments on this.

What you're proposing is difficult to do at scale and would require some
kind of Paxos-style algorithm just to do the update only if it differs - at
that point it would be easier to simply go ahead and do the update.

Also, it seems like a conflation of concerns - in an event sourcing model,
we save the immutable event and represent current state in another,
separate data structure. Perhaps Cassandra would work well here - that
data model might provide what you're looking for out of the box.
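
For what it's worth, a minimal sketch of that split, assuming a "user-events"
topic keyed by user id (the topic name and string types are just illustrative):
the consumer replays the immutable events and keeps current state in a
separate structure.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class UserProjection {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "user-projection");
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.StringDeserializer");

            // the separate "current state" structure; in practice this would be
            // Cassandra or another queryable store rather than an in-memory map
            Map<String, String> currentState = new HashMap<>();

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("user-events"));
                while (true) {
                    for (ConsumerRecord<String, String> record :
                             consumer.poll(Duration.ofMillis(500))) {
                        // the event log stays immutable; only the derived view changes
                        currentState.put(record.key(), record.value());
                    }
                }
            }
        }
    }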

Just as I don't recommend people use data stores as queuing mechanisms, I
also recommend not using a queuing mechanism as a primary datastore - mixed
semantics.

--
*Colin*
+1 612 859-6129


On Mon, Jan 5, 2015 at 4:47 AM, Daniel Schierbeck <
daniel.schierb...@gmail.com> wrote:

> I'm trying to design a system that uses Kafka as its primary data store by
> persisting immutable events into a topic and keeping a secondary index in
> another data store. The secondary index would store the "entities". Each
> event would pertain to some "entity", e.g. a user, and those entities are
> stored in an easily queryable way.
>
> Kafka seems well suited for this, but there's one thing I'm having problems
> with. I cannot guarantee that only one process writes events about an
> entity, which makes the design vulnerable to integrity issues.
>
> For example, say that a user can have multiple email addresses assigned,
> and the EmailAddressRemoved event is published when the user removes one.
> There's an integrity constraint, though: every user MUST have at least one
> email address. As far as I can see, there's no way to stop two separate
> processes from looking up a user entity, seeing that there are two email
> addresses assigned, and each publishing an EmailAddressRemoved event. The
> end result would violate the constraint.
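>
> To make the race concrete, each of the two processes would effectively be
> running something like the sketch below at the same time (illustrative Java;
> the topic name, user id and address are made up):
>
>     import java.util.Properties;
>     import org.apache.kafka.clients.producer.KafkaProducer;
>     import org.apache.kafka.clients.producer.ProducerRecord;
>
>     public class RemoveEmailAddress {
>         public static void main(String[] args) {
>             Properties props = new Properties();
>             props.put("bootstrap.servers", "localhost:9092");
>             props.put("key.serializer",
>                       "org.apache.kafka.common.serialization.StringSerializer");
>             props.put("value.serializer",
>                       "org.apache.kafka.common.serialization.StringSerializer");
>
>             try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
>                 // both processes read the secondary index here and each sees
>                 // two addresses, so both checks pass...
>                 int addressCount = 2;   // stand-in for a secondary-index lookup
>                 if (addressCount > 1) {
>                     // ...and both publish an EmailAddressRemoved event
>                     producer.send(new ProducerRecord<>("user-events", "user-42",
>                             "EmailAddressRemoved:bob@example.com"));
>                 }
>             }
>             // once both events are replayed, the user has zero addresses left
>         }
>     }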
>
> If I'm wrong in saying that this isn't possible I'd love some feedback!
>
> My current thinking is that Kafka could relatively easily support this kind
> of application with a small additional API. Kafka already has the abstract
> notion of entities through its key-based retention policy. If the produce
> API were modified to accept an integer OffsetConstraint, the following
> algorithm (sketched in code after the list) could determine whether the
> request should proceed:
>
> 1. For every key seen, keep track of the offset of the latest message
> referencing the key.
> 2. When an OffsetConstraint is specified in the produce API call, compare
> that value with the latest offset for the message key.
> 2.1. If they're identical, allow the operation to continue.
> 2.2. If they're not identical, fail with some OptimisticLockingFailure.
>
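> In code, the broker-side check from those steps would amount to something
> like this (a rough, self-contained sketch; the OffsetConstraint parameter and
> the failure type are the proposed additions, and the real log append is
> stubbed out):
>
>     import java.util.Map;
>     import java.util.concurrent.ConcurrentHashMap;
>
>     class ConditionalLog {
>         // step 1: latest offset of the newest message seen for each key
>         private final Map<String, Long> latestOffsetByKey = new ConcurrentHashMap<>();
>         private long nextOffset = 0;
>
>         // offsetConstraint is the offset of the latest event the producer saw
>         // for this key, or -1L if the producer expects the key to be brand new
>         synchronized long append(String key, String value, long offsetConstraint) {
>             long latest = latestOffsetByKey.getOrDefault(key, -1L);
>             if (latest != offsetConstraint) {
>                 // step 2.2: another producer wrote an event for this key in the meantime
>                 throw new IllegalStateException(
>                         "OptimisticLockingFailure: latest offset for key is " + latest);
>             }
>             // step 2.1: offsets match, so the write proceeds
>             long offset = nextOffset++;          // stand-in for the real log append
>             latestOffsetByKey.put(key, offset);
>             return offset;
>         }
>     }
>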
> Would such a feature be completely out of scope for Kafka?
>
> Best regards,
> Daniel Schierbeck
>
