Just to be sure: Is the task whose backpressure you want to monitor the Kafka Source?
There is an open issue that backpressure monitoring does not work for the Kafka Source: https://issues.apache.org/jira/browse/FLINK-3456

To circumvent that, add an "IdentityMapper" after the Kafka source, make sure it is non-chained, and monitor the backpressure on that MapFunction.

Greetings,
Stephan

On Thu, Mar 10, 2016 at 11:23 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Hi Vijay,
>
> regarding your other questions:
>
> 1) On the TaskManagers, the FlinkKafkaConsumers will write the partitions
> they are going to read in the log. There is currently no way of seeing the
> state of a checkpoint in Flink (which is the offsets).
> However, once a checkpoint is completed, the Kafka consumer commits
> the offset to the Kafka broker. (I could not find a tool to get the committed
> offsets from the broker, but they are stored either in ZK or in a special
> topic by the broker. In Kafka 0.8 that's easily doable with the
> kafka.tools.ConsumerOffsetChecker)
>
> 2) Do you see duplicate data written by the rolling file sink? Or do you
> see it somewhere else?
> HDP 2.4 is using Hadoop 2.7.1 so the truncate() of invalid data should
> actually work properly.
>
> On Thu, Mar 10, 2016 at 10:44 AM, Ufuk Celebi <u...@apache.org> wrote:
>
>> How many vertices does the web interface show and what parallelism are
>> you running? If the sleeping operator is chained you will not see
>> anything.
>>
>> If your goal is to just see some back pressure warning, you can call
>> env.disableOperatorChaining() and re-run the program. Does this work?
>>
>> – Ufuk
>>
>> On Thu, Mar 10, 2016 at 1:35 AM, Vijay Srinivasaraghavan
>> <vijikar...@yahoo.com> wrote:
>> > Hi Ufuk,
>> >
>> > I have increased the sampling size to 1000 and decreased the refresh
>> > interval by half. In my Kafka topic I have pumped a million messages,
>> > which are read by the KafkaConsumer pipeline and then passed to a
>> > transformation step where I have introduced a sleep (3 sec) for every
>> > single message received. The final step is an HDFS sink using the
>> > RollingSink API.
>> >
>> > jobmanager.web.backpressure.num-samples: 1000
>> > jobmanager.web.backpressure.refresh-interval: 30000
>> >
>> > I was hoping to see the backpressure tab in the UI display some
>> > warning, but I still see the "OK" message.
>> >
>> > This makes me wonder whether I am testing the backpressure scenario
>> > properly or not.
>> >
>> > Regards
>> > Vijay
>> >
>> > On Monday, March 7, 2016 3:19 PM, Ufuk Celebi <u...@apache.org> wrote:
>> >
>> > Hey Vijay!
>> >
>> > On Mon, Mar 7, 2016 at 8:42 PM, Vijay Srinivasaraghavan
>> > <vijikar...@yahoo.com> wrote:
>> >> 3) How can I simulate and verify backpressure? I have introduced some
>> >> delay (Thread Sleep) in the job before the sink but the "backpressure"
>> >> tab from the UI does not show any indication of whether backpressure
>> >> is working or not.
>> >
>> > If a task is slow, it is back pressuring upstream tasks, e.g. if your
>> > transformations have the sleep, the sources should be back pressured.
>> > It can happen that even with the sleep the tasks still produce their
>> > data as fast as they can and hence no back pressure is indicated in
>> > the web interface. You can increase the sleep to check this.
>> >
>> > The mechanism used to determine back pressure is based on sampling the
>> > stack traces of running tasks. You can increase the number of samples
>> > and/or decrease the delay between samples via config parameters shown
>> > in [1]. It can happen that the samples miss the back pressure
>> > indicators, but usually the defaults work fine.
>> >
>> > [1] https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#jobmanager-web-frontend
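
To make the stack-trace sampling described in the thread a bit more concrete, here is a minimal plain-Java sketch. It is only an analogy and not Flink's implementation: the class `BackPressureSketch`, its methods, the bounded queue standing in for network buffers, and all parameter values are hypothetical. A "source" thread writes into a small bounded buffer, a "map" thread drains it with an optional sleep, and a sampler repeatedly inspects the source's stack trace and reports how often it was found inside the blocking `put` call.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Plain-Java analogy, NOT Flink code; every name here is hypothetical.
class BackPressureSketch {

    // Sleep helper that restores the interrupt flag instead of throwing.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns the fraction of samples in which ArrayBlockingQueue.put
    // was on the sampled thread's stack (our stand-in "blocked" indicator).
    static double sampleBlockedRatio(Thread task, int numSamples, long delayMillis) {
        int blocked = 0;
        for (int i = 0; i < numSamples; i++) {
            for (StackTraceElement frame : task.getStackTrace()) {
                if (frame.getClassName().equals("java.util.concurrent.ArrayBlockingQueue")
                        && frame.getMethodName().equals("put")) {
                    blocked++;
                    break; // count each sample at most once
                }
            }
            sleepQuietly(delayMillis);
        }
        return (double) blocked / numSamples;
    }

    // Starts a source and a map thread connected by a 4-slot buffer,
    // samples the source thread, then stops both threads.
    static double runPipeline(long sourceSleepMillis, long mapSleepMillis,
                              int numSamples, long delayMillis) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(4);
        Thread source = new Thread(() -> {
            try {
                int record = 0;
                while (true) {
                    buffer.put(record++); // blocks while the buffer is full
                    Thread.sleep(sourceSleepMillis);
                }
            } catch (InterruptedException e) {
                // interrupted by runPipeline: stop producing
            }
        });
        Thread slowMap = new Thread(() -> {
            try {
                while (true) {
                    buffer.take();
                    Thread.sleep(mapSleepMillis); // simulated per-record work
                }
            } catch (InterruptedException e) {
                // interrupted by runPipeline: stop consuming
            }
        });
        source.setDaemon(true);
        slowMap.setDaemon(true);
        source.start();
        slowMap.start();
        sleepQuietly(100); // give the buffer time to fill before sampling
        double ratio = sampleBlockedRatio(source, numSamples, delayMillis);
        source.interrupt();
        slowMap.interrupt();
        return ratio;
    }

    public static void main(String[] args) {
        System.out.println("slow map, fast source: " + runPipeline(0, 20, 50, 5));
        System.out.println("fast map, slow source: " + runPipeline(5, 0, 50, 5));
    }
}
```

With a sleeping map step the source should spend nearly all of its time blocked on the full buffer, so the first ratio is expected to be close to 1; when the source is the slower side, the second ratio should be close to 0. The `numSamples` and `delayMillis` arguments play the same role as the number of samples and the delay between samples mentioned above: too few samples, or a downstream task that is not actually slower than its upstream, can leave the indicator at "OK".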