Anomalous spikes in aggregations of keyed data

Kegel, Mark Mon, 30 Nov 2020 12:20:42 -0800

We have a high-volume (600-700 shards) Kinesis data stream on which we are 
doing a simple keying and aggregation. The logic is straightforward: Kinesis 
source, key by fields (A,B,C), window (1-minute, tumbling), aggregate by 
summing over integer field R, connect to sink.
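
For reference, the same aggregation can be recomputed offline from a dump of 
the raw records and compared against what the job emits. This is only a minimal 
sketch in plain Python, not the Flink job itself. The record layout (dicts with 
fields a, b, c, r and an epoch-millisecond timestamp ts) is an assumption made 
for illustration, and it assumes windows aligned to the epoch on that 
timestamp; whether the real job windows on event time or processing time is 
not stated here.

```python
from collections import defaultdict

WINDOW_MS = 60_000  # 1-minute tumbling windows


def window_start(ts_ms, size_ms=WINDOW_MS):
    """Return the start of the epoch-aligned tumbling window containing ts_ms."""
    return ts_ms - (ts_ms % size_ms)


def aggregate(records):
    """Sum field r per ((a, b, c), window start)."""
    sums = defaultdict(int)
    for rec in records:
        key = (rec["a"], rec["b"], rec["c"])  # hypothetical field names
        sums[(key, window_start(rec["ts"]))] += rec["r"]
    return dict(sums)


# Made-up sample records, purely illustrative.
records = [
    {"a": "x", "b": "y", "c": "z", "r": 3, "ts": 1_000},
    {"a": "x", "b": "y", "c": "z", "r": 4, "ts": 59_999},
    {"a": "x", "b": "y", "c": "z", "r": 5, "ts": 60_000},
    {"a": "p", "b": "q", "c": "r", "r": 1, "ts": 30_000},
]
result = aggregate(records)
```

If the offline sums for a spiking key and window do not match the job's output, 
that would point at the pipeline rather than at the data.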


We are seeing some anomalous spikes in our aggregations. From one minute to the 
next, the sum total for one particular key may increase 25x or more and then 
drop back down to a normal level, yet sums for other keys in the same window 
remain roughly the same, as we would expect.
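
To make "spike" concrete, the pattern described above (a one-window jump of 25x 
or more that then falls back) can be flagged mechanically in the sink output. 
The following is a rough sketch under assumptions: the function name, the input 
shape (a per-key list of (window_start, total) pairs sorted by time), and the 
sample numbers are all hypothetical.

```python
def find_spikes(series, factor=25.0):
    """Return window starts whose total is at least `factor` times
    both the preceding and the following window's total.

    series: list of (window_start, total) for one key, sorted by window_start.
    """
    spikes = []
    for i in range(1, len(series) - 1):
        current = series[i][1]
        # Compare against the larger neighbour so a sustained rise is not flagged.
        baseline = max(series[i - 1][1], series[i + 1][1])
        if baseline > 0 and current >= factor * baseline:
            spikes.append(series[i][0])
    return spikes


# Made-up totals for a single key; only the third window is a spike.
series = [(0, 100), (60_000, 110), (120_000, 2_900), (180_000, 105), (240_000, 98)]
spikes = find_spikes(series)
```

The first and last windows of a series are never flagged because they lack a 
neighbour on one side.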

We don’t see this too often. Maybe 1-5 data points (key + timestamp) in an 
hour’s worth of 1-minute windowed data will have these spikes. The data has 
fairly low cardinality. There are only roughly two hundred distinct keys.

We inspected the raw Kinesis stream and found no duplicates. It isn’t clear how 
these spikes could happen or what we might do to work around the issue, since 
the code is as idiomatic as possible.

We are running the job as part of Kinesis Data Analytics, which is using Flink 
version 1.8. To connect to Kinesis we are using the 
amazon-kinesis-connector-flink library (v1.0.4) and the EFO consumer mode.
