Hi,

I have different events from a device, each of which constitutes a
different metric for the same device. Each of these events is produced by
the device at an interval of a few milliseconds to a minute.

Event1(Device1) -> Stream1 -> Metric1
Event2(Device1) -> Stream2 -> Metric2
...
Event100(Device1) -> Stream100 -> Metric100

The number of events can go up to a few hundred for each data protocol, and
we have around 4-5 data protocols. Metrics from different streams make up a
record. For example, for Device1 above:

Device1 -> Metric1, Metric2, Metric15 form a single record for the
device. Currently, in the development phase, I am using an interval join to
achieve this, that is, to create a record with the latest data from the
different streams (events).
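
To make the intended semantics concrete, here is a rough plain-Python
sketch. It is not the interval join itself and not my actual job code, only
an illustration of "build a record from the latest value of each metric per
device". The class name, the field names, and the choice of Metric1, Metric2
and Metric15 as the record definition are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical record definition: the metrics that make up one record.
RECORD_METRICS = ("Metric1", "Metric2", "Metric15")


@dataclass
class LatestValueJoiner:
    """Keeps the most recent value of each required metric per device."""

    required: Tuple[str, ...] = RECORD_METRICS
    # device_id -> {metric_name: (event_time_ms, value)}
    state: Dict[str, Dict[str, Tuple[int, float]]] = field(default_factory=dict)

    def on_event(self, device_id: str, metric: str,
                 event_time_ms: int, value: float) -> Optional[dict]:
        """Update state; return a full record once every metric has a value."""
        if metric not in self.required:
            return None  # this metric is not part of the record
        latest = self.state.setdefault(device_id, {})
        previous = latest.get(metric)
        # Drop out-of-order events that are older than the value already held.
        if previous is not None and previous[0] > event_time_ms:
            return None
        latest[metric] = (event_time_ms, value)
        if len(latest) < len(self.required):
            return None  # still waiting for at least one metric
        record = {"device": device_id}
        record.update({m: latest[m][1] for m in self.required})
        return record


joiner = LatestValueJoiner()
joiner.on_event("Device1", "Metric1", 1000, 1.0)
joiner.on_event("Device1", "Metric2", 1005, 2.0)
print(joiner.on_event("Device1", "Metric15", 1010, 15.0))
```

In a stream processor the per-device dictionary would be keyed state, and
old entries would need some expiry, which this sketch leaves out.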

Based on the data protocol, I have 4-5 topics. Currently the data for a
single event is pushed to one partition of the Kafka topic (producer key ->
event_type + data_protocol). So essentially one topic is made up of many
streams. I filter on the key to define the streams.
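
Roughly, the keying and filtering described above looks like the following
plain-Python sketch. The `Message` type, the `|` separator in the key, and
the function names are assumptions for illustration only; no Kafka client is
involved here.

```python
from typing import Iterable, Iterator, NamedTuple


class Message(NamedTuple):
    key: str        # producer key: event_type + data_protocol
    device_id: str
    value: float


def make_key(event_type: str, data_protocol: str) -> str:
    """Build the producer key; the '|' separator is an assumption."""
    return f"{event_type}|{data_protocol}"


def select_stream(messages: Iterable[Message], event_type: str,
                  data_protocol: str) -> Iterator[Message]:
    """Yield only the messages of one logical stream within a shared topic."""
    wanted = make_key(event_type, data_protocol)
    return (m for m in messages if m.key == wanted)


topic = [
    Message(make_key("Event1", "ProtoA"), "Device1", 1.0),
    Message(make_key("Event2", "ProtoA"), "Device1", 2.0),
    Message(make_key("Event1", "ProtoA"), "Device2", 3.0),
]
stream1 = list(select_stream(topic, "Event1", "ProtoA"))
print([m.device_id for m in stream1])
```

With this layout every consumer of the topic reads all messages and discards
the ones whose key does not match, which is part of why I am asking whether
this is the right approach.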

My question is: is this the correct way to stream the data? I had thought
of maintaining a separate topic per event, but in that case the number of
topics could grow to a few thousand, which becomes challenging to maintain,
and I am not sure Kafka handles that well.

I know there are traditional ways to do this, like pushing the data to a
time-series DB and then joining the data for the different metrics, but
that is something which will not scale, and this processing should also be
as close to real time as possible.

Are there better ways to handle this use case, or am I on the correct path?

Thanks,
Hemant
