Hi Hemant,

Flink passes your configurations to the Kafka consumer, so you could check
if you can subscribe to only one partition there.

However, I would discourage that approach. I don't see the benefit to just
subscribing to the topic entirely and have dedicated processing for the
different devices.

If you are concerned about the order, you shouldn't. Since all events of a
specific device-id reside in the same source partition, events are in-order
in Kafka (responsibility of producer, but I'm assuming that because of your
mail) and thus they are also in order in non-keyed streams in Flink. Any
keyBy on device-id or composite key involving device-id, would also retain
the order.

If you have exactly one partition per device-id, you could even go with
`DataStreamUtil#reinterpretAsKeyedStream` to avoid any shuffling.

Let me know if I misunderstood your use case or if you have further
questions.

Best,

Arvid

On Wed, Feb 19, 2020 at 8:39 AM hemant singh <hemant2...@gmail.com> wrote:

> Hello Flink Users,
>
> I have a use case where I am processing metrics from different type of
> sources(one source will have multiple devices) and for aggregations as well
> as build alerts order of messages is important. To maintain customer data
> segregation I plan to have single topic for each customer with each source
> stream data to one kafka partition.
> To maintain ordering I am planning to push data for a single source type
> to single partitions. Then I can create keyedstream so that each of the
> device-id I have a single stream which has ordered data for each device-id.
>
> However, flink-kafka consumer I don't see that I can read from a specific
> partition hence flink consumer read from multiple kafka partitions. So even
> if I try to create a keyedstream on source type(and then write to a
> partition for further processing like keyedstream on device-id) I think
> ordering will not be maintained per source type.
>
> Only other option I feel I am left with is have single partition for the
> topic so that flink can subscribe to the topic and this maintains the
> ordering, the challenge is too many topics(as I have this configuration for
> multiple customers) which is not advisable for a kafka cluster.
>
> Can anyone shed some light on how to handle this use case.
>
> Thanks,
> Hemant
>

Reply via email to