Thanks for your detailed reply. Viewing each partition as an ordered
event bus is helpful.
The problem then becomes working out a strategy for mapping individual
DDD aggregates to partitions that spreads load across the cluster and
still allows the cluster to grow.
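For example, a rough sketch of one such mapping, assuming the aggregate ID
is used as the record key so Kafka's default partitioner hashes aggregates
across however many partitions the topic has (the topic name, aggregate ID
and event payload below are made up):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AggregateKeyedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Key by aggregate ID: the default partitioner hashes the key, so all
        // events for one aggregate land on the same partition (and stay ordered),
        // while many aggregates share a fixed, modest number of partitions.
        String aggregateId = "order-42";              // made-up aggregate ID
        String event = "{\"type\":\"OrderPlaced\"}";  // serialized domain event

        producer.send(new ProducerRecord<>("order.events", aggregateId, event));
        producer.close();
    }
}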
On 30/03/16 01:07, Helleren, Erik wrote:
Well, if a partition is too large a unit of order for your tastes, you can
add publisher IDs to some metadata, or force the partition mapping and use
the key as an extra level of partitioning. And pick a topicName that
describes all the traffic on that topic. An example:
topicName = "ad.click.events"; // let's say we are a web advertising company and want to aggregate click events
publisherId = "ad.click.handling.server.1234"; // this is the name of the entity publishing to the topic, server #1234
targetPartition = config.targetPartition; // this needs to be configured somehow, and known to all consumers
usingPublisherIDWithPartition = new ProducerRecord<>(topicName, targetPartition, publisherId, payload);
Then, on consumers, you can use the kafka API to listen to a targeted
partition, and then filter to just the publisher ID you are interested in.
You still get an "event sourced" bus that would let you reliably replay
those messages in the same order to any number of consumers.
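A minimal consumer-side sketch of that, assuming string keys and values
and that targetPartition and publisherId carry the same values the
producers were configured with (the class name and the printing of events
are just placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class TargetedPartitionConsumer {
    public static void main(String[] args) {
        String topicName = "ad.click.events";
        int targetPartition = 0;                               // assumed: agreed with the producers
        String publisherId = "ad.click.handling.server.1234";  // the publisher we want to follow

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("enable.auto.commit", "false");              // no consumer group; track position ourselves
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // Listen to one targeted partition directly instead of subscribing with a group.
        consumer.assign(Collections.singletonList(new TopicPartition(topicName, targetPartition)));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                // Filter to just the publisher ID we are interested in;
                // other publishers share the same partition.
                if (publisherId.equals(record.key())) {
                    System.out.printf("offset=%d event=%s%n", record.offset(), record.value());
                }
            }
        }
    }
}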
If you are willing to embrace kafka's consumer API to scale consumers,
your life becomes even easier. Example:

topicName = "ad.click.events";
publisherId = "ad.click.handling.server.1234";
usingOnlyPublisherID = new ProducerRecord<>(topicName, publisherId, payload);
Kafka will ensure that all messages with the same publisher ID are sent to
the same partition. And from there, all messages with the same publisher
ID are sent in the same order to all consumers. That is, provided there
are no mid-execution changes to the number of partitions, which is a
manual operation. Also, this assumes the number of publisherIds >> the
number of partitions within the topic.
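A rough sketch of the scaled-out consumer side under that approach,
assuming string keys and values (the group name is made up): with a shared
group.id, Kafka spreads the topic's partitions across however many
instances join the group, while each publisher ID still maps to a single
partition and so keeps its order.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ClickEventGroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "ad.click.aggregators");  // made-up group name; run N copies to scale out
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // Subscribing (rather than assigning) lets Kafka balance partitions across the group.
        consumer.subscribe(Collections.singletonList("ad.click.events"));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                // record.key() is the publisherId; all records with one key arrive in order.
                System.out.printf("publisher=%s event=%s%n", record.key(), record.value());
            }
        }
    }
}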
The thing to keep in mind is that, while you would decrease any broker's
overhead, you are implicitly limiting your throughput. And feel free to
replace "publisherID" with "TargetConsumerID" or "OrderedEventBusID".
On 3/29/16, 7:38 AM, "Mark van Leeuwen" <m...@vl.id.au> wrote:
Thanks for sharing your experience.
I'm surprised no one else has responded. Maybe there are few people
using Kafka for event sourcing.
I did find one answer to my question
http://stackoverflow.com/questions/26060535/kafka-as-event-store-in-event-sourced-system?rq=1
I guess using a single topic partition as the event source for multiple
aggregates is similar to using a single table shard for that purpose in
a relational DB.
On 29/03/16 02:28, Daniel Schierbeck wrote:
I ended up abandoning the use of Kafka as a primary event store, for
several reasons. One is the partition granularity issue; another is the
lack of a way to guarantee exclusive write access, i.e. ensure that
only a
single process can commit an event for an aggregate at any one time.
On Mon, 28 Mar 2016 at 16:54 Mark van Leeuwen <m...@vl.id.au> wrote:
Hi all,
When using Kafka for event sourcing in a CQRS style app, what approach
do you recommend for mapping DDD aggregates to topic partitions?
Assigning a partition to each aggregate seems at first to be the right
approach: events can be replayed in correct order for each aggregate
and
there is no mixing of events for different aggregates.
But this page
http://www.confluent.io/blog/how-to-choose-the-number-of-topics-partitions-in-a-kafka-cluster
has the recommendation to "limit the number of partitions per broker to
two to four thousand and the total number of partitions in the cluster to
low tens of thousand".
One partition per aggregate will far exceed that number.
Thanks,
Mark