KafkaConsumerThread exception logging

2019-09-04 Thread Aminouvic
Hello, We're running a streaming application that reads data from Kafka (Flink on YARN, Flink version 1.6.1 / Kafka cluster version 0.10.1). While troubleshooting a performance issue, we've noticed several KafkaExceptions (around 3500 per minute) with the following stack trace (using Flight Reco

Re: Large rocksdb state restore/checkpoint duration behavior

2018-10-23 Thread Aminouvic
Hello, Thank you for your answer and apologies for the late response. For timers we are using: state.backend.rocksdb.timer-service.factory: rocksdb. Are we still affected by [1]? For the interruptibility, we have coalesced our timers and the application became more responsive to stop signa

Large rocksdb state restore/checkpoint duration behavior

2018-10-10 Thread Aminouvic
Hi, We are using Flink 1.6.1 on YARN with RocksDB as the state backend, incrementally checkpointed to HDFS (for data and timers). The job reads events from Kafka (~1 billion events per day), constructs user sessions using an EventTimeSessionWindow coupled with a late firing trigger and WindowFunction with Ag