Job is Failing for every 2hrs - Out of Memory Exception

2024-05-14 Thread Madan D via user
Hello Team, Good morning! We have been running a Flink job with Kafka that gets restarted every 2 hours with an Out of Memory Exception. We tried increasing task manager memory, reducing parallelism, and adding a rate limit to reduce the consumption rate, but regardless, it restarts every

Re: Flink Job Failed With Kafka Exception

2023-11-06 Thread Madan D via user
Hello Hang/Lee, Thanks! In my use case we listen to multiple topics, but in a few cases one of the topics may become inactive if the producer decides to shut it down while the other topics are still receiving data. What we observe is that if one of the topics becomes inactive, the entire

Flink Job Failed With Kafka Exception

2023-10-10 Thread Madan D via user
Hello Team, We are running a Flink pipeline that consumes data from multiple topics, but we recently found that if one topic has issues with participation, etc., the whole Flink pipeline fails, which affects the other topics. Is there a way we can make the Flink pipeline keep runnin

Re: Flink Logs are missing While writing to file - Large amount of Events

2023-07-14 Thread Madan D via user
Hello Weihua, I tried increasing akka.framesize from the default to 30MB, but I still see that a few logs are missing from downstream operators when a large amount of data is being processed. Regards, Madan On Friday, 14 July 2023 at 11:45:36 am GMT-7, Madan D via user wrote: Hello Weihu

Re: Flink Logs are missing While writing to file - Large amount of Events

2023-07-14 Thread Madan D via user
reads the log file and displays it in the UI, so if the UI is accurate, there shouldn't be any issues with Flink writing files. You can check if there are any issues while transferring log file data to Splunk. Best, Weihua On Wed, Jul 12, 2023 at 1:02 AM Madan D via user wrote: Hello

Flink Logs are missing While writing to file - Large amount of Events

2023-07-11 Thread Madan D via user
Hello Team, Recently, we observed that Flink logs were missing when written to files for forwarding to Splunk to see event metrics, even though the Flink UI showed them accurately. Can you please help me understand what might be causing it? Regards, Madan

Re: Re: Questions regarding adaptive scheduler with YARN and application mode

2023-06-28 Thread Madan D via user
Hello Leon, As described by Chen below, the Adaptive Scheduler doesn't auto scale a Flink job; it only allocates the requested slots based on availability. Recently we implemented this with EMR managed scaling combined with the adaptive scheduler, since there's no direct support of auto scaling

Re: Flink Kafka Source rebalancing - 1.14.2

2023-05-09 Thread Madan D via user
at 2:04 AM Madan D via user wrote: Hello Team, We have been using the Flink Kafka consumer and recently we have been moving to the Flink Kafka source to get more advanced features, but we have been observing more rebalances, right after data is consumed and moves to the next operator, than with the Flink Kafka consumer

Flink Kafka Source rebalancing - 1.14.2

2023-05-09 Thread Madan D via user
Hello Team, We have been using the Flink Kafka consumer and recently we have been moving to the Flink Kafka source to get more advanced features, but we have been observing more rebalances, right after data is consumed and moves to the next operator, than with the Flink Kafka consumer. Can you please let us know what m

Rescale Flink - Adjust Parallelism based on TPS

2023-01-27 Thread Madan D via user
Hello Team, I would like to understand if there's a way Flink can rescale automatically by adjusting parallelism based on incoming traffic. For example, when a job starts, incoming TPS is around 100, which can be processed with parallelism 4, and when TPS increases to 1000 I might need to rescale fl

Updating Parallelism based on Traffic and Max Parallelism

2023-01-26 Thread Madan D via user
Hello Team, I am fine-tuning my application so that it can adjust parallelism automatically based on traffic, with EMR auto-scaling adding more cores as needed. I am trying to set max parallelism, but I always see it working with the default parallelism irrespective of traffic. Can you please let

Re: Flink reactive mode for application clusters on AWS EMR - Auto Scaling

2023-01-14 Thread Madan D via user
Kubernetes native integration and works more or less out of the box. Cheers, Gyula On Sat, 14 Jan 2023 at 01:16, Madan D via user wrote: Hello Team, I would like to understand auto scaling on EMR using either reactive mode or adaptive scheduler with custom or managed scaling. Can someone help me

Flink reactive mode for application clusters on AWS EMR - Auto Scaling

2023-01-13 Thread Madan D via user
Hello Team, I would like to understand auto scaling on EMR using either reactive mode or adaptive scheduler with custom or managed scaling. Can someone help me with this? Regards, Madan

Flink adaptive Scheduler enabling EMR Auto Scaling

2022-12-10 Thread Madan D via user
Hello All, We are running our application on EMR and we would like to implement elastic scaling (since reactive mode works only in standalone mode) for Flink using EMR managed scaling, where we are trying to use the adaptive scheduler to make sure the job does not fail when resources are not availabl

Exceeded Checkpoint tolerable failure

2022-12-07 Thread Madan D via user
Hello All, I am seeing the issue below after upgrading from 1.9.0 to 1.14.2 while publishing messages to pub sub, which is causing frequent job restarts and slow processing. Can you please help me? `Caused by: org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable failure thresho