Hi Hemant,
what you described is an aggregation. You aggregate 15 small records into
one large record. The concept of aggregation goes beyond pure numeric
operations; for example, forming one large string with CONCAT is also a
type of aggregation.
In your case, I'd still follow my general outline
Hi Arvid,
I don't want to aggregate all events, rather I want to create a record for
a device combining data from multiple events. Each of this event gives me a
metric for a device, so for example if I want one record for device-id=1
the metric will look like metric1, metric2, metric3 where m
Hi Hemant,
In general, you want to keep all data coming from one device in one Kafka
partition, such that the timestamps of that device are monotonically
increasing. Thus, when processing data from one device, you have ensured
that no out-of-order events with respect to this device happen.
If you
Hello Flink Users,
Any views on this question of mine.
Thanks,
Hemant
On Tue, May 12, 2020 at 7:00 PM hemant singh wrote:
> Hello Roman,
>
> Thanks for your response.
>
> I think partitioning you described (event type + protocol type) is subject
> to data skew. Including a device ID should sol
Hello Roman,
Thanks for your response.
I think partitioning you described (event type + protocol type) is subject
to data skew. Including a device ID should solve this problem.
Also, including "protocol_type" into the key and having topic per
protocol_type seems redundant.
Each protocol is in sin
Hello Hemant,
Thanks for your reply.
I think partitioning you described (event type + protocol type) is subject
to data skew. Including a device ID should solve this problem.
Also, including "protocol_type" into the key and having topic per
protocol_type seems redundant.
Furthermore, do you have
Hello Roman,
PFB my response -
As I understand, each protocol has a distinct set of event types (where
event type == metrics type); and a distinct set of devices. Is this correct?
Yes, correct. distinct events and devices. Each device emits these event.
> Based on data protocol I have 4-5 topics
Hi Hemant,
As I understand, each protocol has a distinct set of event types (where
event type == metrics type); and a distinct set of devices. Is this correct?
> Based on data protocol I have 4-5 topics. Currently the data for a single
event is being pushed to a partition of the kafka topic(produ
Hi,
I have different events from a device which constitutes different metrics
for same device. Each of these event is produced by the device in
interval of few milli seconds to a minute.
Event1(Device1) -> Stream1 -> Metric 1
Event2 (Device1) -> Stream2 -> Metric 2 ...
..
...
Even