Re: Timeseries aggregation with many IoT devices off of one Kafka topic.

2020-02-24 Thread theo.diefent...@scoop-software.de
Hi, at the last Flink Forward in Berlin I spoke with some people about the same problem, where they had construction devices as IoT sensors which could even be offline for multiple days. They told me that the major problem for them was that the watermark in Flink is maintained per operator/subtask, e
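The stalling behaviour described above comes down to one rule: a downstream operator's event-time watermark is the minimum over the watermarks of all its inputs, so a single long-offline device holds the whole pipeline back. A minimal Python sketch of that rule (illustrative only, not Flink's actual API; the timestamps are invented):

```python
def operator_watermark(subtask_watermarks):
    """Combine per-subtask watermarks the way a downstream operator does:
    the operator can only advance to the minimum of all its inputs."""
    return min(subtask_watermarks)

# Two devices report recent timestamps (epoch millis); one device has been
# offline for roughly three days and its watermark never advanced.
online = [1_700_000_300_000, 1_700_000_305_000]
offline = [1_699_740_000_000]

# The combined watermark is stuck at the offline device's old timestamp,
# so event-time windows downstream cannot fire.
print(operator_watermark(online + offline))
```

This is why per-partition idleness handling matters when sources can go silent for days.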

Re: Process stream multiple times with different KeyBy

2020-02-17 Thread theo.diefent...@scoop-software.de
Hi Sebastian, I'd also highly recommend a recent Flink blog post to you where exactly this question was answered in quite some detail: https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html Best regards, Theo. Original message From: Eduardo Winpenny Tejedor Date

Re: How Flink Kafka Consumer works when it restarts

2020-02-12 Thread theo.diefent...@scoop-software.de
Hi Mahendra, Flink will regularly create checkpoints, and you can manually trigger savepoints. This is data managed and stored by Flink, and that data also contains the Kafka offsets. When restarting, you can configure the job to restart from the last checkpoint and/or savepoint. You can additionally configure F
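The restart behaviour described above can be reduced to a precedence rule: offsets stored in a restored checkpoint or savepoint take priority over the consumer group's committed offsets, which in turn beat the configured default start position. A hypothetical Python sketch of that rule (not the actual Flink consumer code; function and parameter names are invented):

```python
def resolve_start_offsets(checkpoint_offsets, group_offsets, default="earliest"):
    """Decide where a restarted job begins reading each partition.

    checkpoint_offsets: partition->offset map from restored state, or None
    if the job starts without any state.
    group_offsets: offsets committed to Kafka by the consumer group, or None.
    """
    if checkpoint_offsets is not None:   # restored state wins: exact resume
        return checkpoint_offsets
    if group_offsets is not None:        # no state: fall back to group offsets
        return group_offsets
    return default                       # no information at all

print(resolve_start_offsets({0: 42}, {0: 17}))  # {0: 42}
print(resolve_start_offsets(None, {0: 17}))     # {0: 17}
print(resolve_start_offsets(None, None))        # earliest
```

The key point is that after a restore, Flink reads from its own checkpointed offsets, not from what the Kafka broker has committed for the group.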

Re: ArrayIndexOutOfBoundException on checkpoint creation

2019-11-27 Thread theo.diefent...@scoop-software.de
Sorry, I forgot to mention the environment. We use Flink 1.9.1 on a Cloudera CDH 6.3.1 cluster (with Hadoop 3.0.0, but using flink-shaded 2.8.3-7. Might this be a problem? As it seems to arise from Kryo, I doubt it). Our Flink is configured with defaults. Our job uses FsStateBackend and exactly once

Re: [QUESTION] How to parallelize with explicit punctuation in Flink?

2019-10-09 Thread theo.diefent...@scoop-software.de
Hi Filip, I don't really understand your problem here. Do you have a source with a single sequential stream, where from time to time, there is a barrier element? Or do you have a source like Kafka with multiple partitions? If you have case 2 with multiple partitions, what exactly do you mean by "
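Theo's "case 1" above — a single sequential stream where barrier/punctuation elements occasionally appear — can be sketched generically: the punctuation delimits batches that can then be handed off for independent (parallel) processing. A hypothetical Python sketch of that splitting step (the sentinel and names are invented for illustration):

```python
PUNCT = object()  # sentinel standing in for an explicit punctuation element

def split_on_punctuation(stream):
    """Yield the batches of ordinary elements delimited by punctuation.

    Each yielded batch is complete and could be dispatched to a parallel
    worker, since the punctuation guarantees no earlier element follows.
    """
    batch = []
    for element in stream:
        if element is PUNCT:
            if batch:
                yield batch
            batch = []
        else:
            batch.append(element)
    if batch:  # trailing elements after the last punctuation
        yield batch

stream = [1, 2, PUNCT, 3, PUNCT, 4, 5]
print(list(split_on_punctuation(stream)))  # [[1, 2], [3], [4, 5]]
```

With multiple Kafka partitions (Theo's "case 2"), each partition would carry its own punctuation, which is exactly why the question of what "global" punctuation means needs clarifying.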

Re: Flink restoring a job from a checkpoint

2019-10-09 Thread theo.diefent...@scoop-software.de
Hi Vishaws, With "from scratch", Congxian means that Flink won't load any state automatically and starts as if there was no state. Of course, if the Kafka consumer group already exists and you have configured Flink to start from group offsets if there is no state yet, it will start from the group of

Filter events based on future events

2019-09-10 Thread theo.diefent...@scoop-software.de
Hi there, I have the following use case: I get transaction logs from multiple servers. Each server puts its logs into its own Kafka partition so that within each partition the elements are monotonically ordered by time. Within the stream of transactions, we have some special events. Let's call them
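Because each partition is monotonically ordered by time, "filtering based on future events" typically means buffering ordinary events until a later special event decides their fate. The archive truncates before the actual decision rule is described, so the COMMIT/DISCARD markers below are invented purely for illustration — a Python sketch of the general buffer-until-marker pattern, applied per partition:

```python
def filter_with_future_events(partition):
    """Buffer ordinary events; a later special event decides their fate.

    COMMIT emits everything buffered so far, DISCARD drops it. Both marker
    names are hypothetical stand-ins for the thread's real special events.
    """
    buffer, out = [], []
    for event in partition:
        if event == "COMMIT":
            out.extend(buffer)  # the future event retroactively keeps them
            buffer = []
        elif event == "DISCARD":
            buffer = []         # the future event retroactively drops them
        else:
            buffer.append(event)
    return out

log = ["a", "b", "COMMIT", "c", "DISCARD", "d", "COMMIT"]
print(filter_with_future_events(log))  # ['a', 'b', 'd']
```

The per-partition ordering is what makes this safe: once the special event arrives, no earlier-timestamped event can still show up in that partition.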