Re: Best practices Partition Key

Dmitry Minkovsky Thu, 25 Jan 2018 07:04:34 -0800

> one entity - one topic, because I need to ensure the proper ordering of
> the events.

This is a great insight. I discovered that keeping entity-related things
on one topic is much easier than splitting entity-related things onto
multiple topics. If you have one topic, replaying that topic is trivial. If
you have multiple topics, replaying those topics requires careful
synchronization. In my case, I am doing event capture and I have
entity-related events on multiple topics. For example, for a user entity I
have topics `join-requests` and `settings-update-requests`. Having separate
topics is superficially nicer in terms of consuming them with Kafka
Streams: you can set up topic-specific serdes. But the benefit you get from
this is dwarfed by the complexity of then having to synchronize these two
streams if you want to replay them. Your situation seems simpler though
because you are not even doing event capture, but just logging complete
entities out of Cassandra.
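
To make the replay difference concrete, here is a minimal sketch in plain Python (not Kafka code). The event tuples, timestamps, and the `replay_*` helper names are invented for illustration. It assumes each topic is already ordered by timestamp and that timestamps are comparable across topics, which is exactly the assumption that gets fragile in practice.

```python
import heapq
from operator import itemgetter

# Hypothetical events: (timestamp, user_id, event_type). All values are invented.
join_requests = [(1, "u1", "join"), (4, "u2", "join")]
settings_update_requests = [(2, "u1", "settings-update"), (3, "u1", "settings-update")]


def replay_single_topic(topic):
    """One topic: the log order already is the replay order."""
    return list(topic)


def replay_multiple_topics(*topics):
    """Several topics: interleave by timestamp to rebuild one global order.

    Assumes every input is already sorted by timestamp. Ties across topics
    fall back to argument order, which may not match what really happened.
    """
    return list(heapq.merge(*topics, key=itemgetter(0)))


merged = replay_multiple_topics(join_requests, settings_update_requests)
for event in merged:
    print(event)
```

With a single topic there is nothing to merge, so none of the tie-breaking or clock questions come up.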

> If I use Kafka as a datastore and search through the records,

Interactive Queries API makes this very nice.
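
Conceptually, what you end up querying is a key-value view built from the topic: the latest record per key. The sketch below is plain Python, not the Kafka Streams API. The `materialize` helper and the sample records are made up, and treating a `None` value as a delete is an assumption modeled on how tombstones usually work.

```python
# Hypothetical changelog of (key, value) records, oldest first. Data is invented.
records = [
    ("customer-1", {"name": "Ana", "city": "Madrid"}),
    ("customer-2", {"name": "Luis", "city": "Sevilla"}),
    ("customer-1", {"name": "Ana", "city": "Bilbao"}),  # later update, same key
]


def materialize(changelog):
    """Fold a changelog into a key -> latest-value table."""
    table = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)  # assumed convention: None means "delete this key"
        else:
            table[key] = value  # later records overwrite earlier ones
    return table


store = materialize(records)
print(store["customer-1"]["city"])
```

Keying the records by the same ID you want to look up (for example the customer ID) is what makes this kind of point lookup possible.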

On Thu, Jan 25, 2018 at 8:47 AM, Maria Pilar <pilife...@gmail.com> wrote:

> Hi everyone,
>
> I'm trying to understand the best practice for defining the partition key. I
> have defined some topics that are related to entities in a Cassandra
> data model. The relationship is one-to-one, one entity - one topic, because
> I need to ensure the proper ordering of the events. I have created one
> partition for each topic to ensure it as well.
>
> If I use Kafka as a datastore and search through the records, I know
> that it could be a best practice to use the partition key of Cassandra (e.g.
> Customer ID) as the partition key in Kafka.
>
> any comment please ??
>
> thanks
>
