Re: Heartbeat of TaskManager with id container

2023-11-03 Thread xiangyu feng
Hi Nagireddy, Please try using '-Dlog4j.configuration=file:/opt/test/flink/log4j.properties'. Regards, Xiangyu. On Fri, Nov 3, 2023 at 19:45, Y SREEKARA BHARGAVA REDDY wrote: > Yes, I went through that document. > > How can I override log4j.properties with my custom log4j.properties ( > */opt/test/flink/log4j.

Re: Heartbeat of TaskManager with id container

2023-11-02 Thread xiangyu feng
Hi Nagireddy, You can configure logging for your 1.16 job according to this doc [1]. [1] https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/deployment/advanced/logging/ Regards, Xiangyu. On Fri, Nov 3, 2023 at 08:57, Y SREEKARA BHARGAVA REDDY wrote: > Hi Xiangyu, > > I have one issue, > > I am

Re: Heartbeat of TaskManager with id container

2023-11-02 Thread Y SREEKARA BHARGAVA REDDY
Hi Xiangyu, I have one issue. I am using *Flink 1.16*. How can I specify log4j.properties on the flink run command line along with my job? For every job I need to pass a different log file. It looks like the following is not working: -Dlog4j.configurationFile= Please help me with the correct config fo

Re: Heartbeat of TaskManager with id container

2023-08-23 Thread xiangyu feng
Hi Nagireddy, I'm not sure how you are monitoring Kafka lag. AFAIK, you can check the metadata of the topic in your Kafka cluster to see the actual lag with the following command: ./kafka-consumer-groups.sh --bootstrap-server 192.168.0.107:39092 --group --describe This tool is provided with Kafka distri
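
For reference, the --describe output of kafka-consumer-groups.sh lists CURRENT-OFFSET, LOG-END-OFFSET and LAG per partition, where lag is the log end offset minus the committed offset. The following is only a minimal sketch of that arithmetic, not the tool's implementation; the PartitionOffsets type and all offset values are hypothetical. It shows how a committed offset that is ahead of the reported log end offset produces a negative lag.

```python
from dataclasses import dataclass


@dataclass
class PartitionOffsets:
    """One row of offsets for a consumer group (hypothetical values)."""
    partition: int
    current_offset: int   # last offset committed by the consumer group
    log_end_offset: int   # latest offset written to the partition


def lag(p: PartitionOffsets) -> int:
    # Lag is the distance between the end of the log and the committed offset.
    return p.log_end_offset - p.current_offset


# Hypothetical sample rows, not taken from the thread.
rows = [
    PartitionOffsets(partition=0, current_offset=1500, log_end_offset=1520),
    PartitionOffsets(partition=1, current_offset=2100, log_end_offset=2050),
]

# Partitions whose committed offset is ahead of the log end offset.
negative = [p.partition for p in rows if lag(p) < 0]

for p in rows:
    print(f"partition={p.partition} lag={lag(p)}")
print("partitions with negative lag:", negative)
```

If a monitoring system reports a negative value, comparing the two offset columns per partition this way can help narrow down which partitions are affected.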

Re: Heartbeat of TaskManager with id container

2023-08-21 Thread Y SREEKARA BHARGAVA REDDY
Thanks Xiangyu, I have one issue while running Flink with the Kafka connector. It worked fine for a couple of days, but suddenly the Kafka lag went to a negative value. I am trying to find the root cause. Any suggestions? On Sat, Aug 5, 2023 at 5:57 PM xiangyu feng wrote: > Hi Nagireddy,

Re: Heartbeat of TaskManager with id container

2023-08-05 Thread xiangyu feng
Hi Nagireddy, I'm not particularly familiar with StreamingFileSink, but I checked the implementation of HadoopFsCommitter. AFAIK, when committing files to HDFS, the committer first checks whether the temp file exists. [image: image.png] In your case, could you check why the committing

Re: Heartbeat of TaskManager with id container

2023-08-05 Thread Y SREEKARA BHARGAVA REDDY
Hi Xiangyu/Dev, Does anyone have a solution for handling the important note below in StreamingFileSink: Caused by: java.io.IOException: Cannot clean commit: Staging file does not exist. at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream$HadoopFsCommitter.commit(HadoopRecoverableFsDa

Re: Heartbeat of TaskManager with id container

2023-08-03 Thread xiangyu feng
Hi ynagireddy4u, From the exception info, I think your application hit an HDFS file issue during the commit phase of the checkpoint. Can you check why 'Staging file does not exist' occurs in the first place? Regards, Xiangyu. On Fri, Aug 4, 2023 at 12:21, Y SREEKARA BHARGAVA REDDY wrote: > Hi Xiangyu/Dev Team, > > Tha

Re: Heartbeat of TaskManager with id container

2023-08-03 Thread Y SREEKARA BHARGAVA REDDY
Hi Xiangyu/Dev Team, Thanks for the reply. In our Flink job, we increased the *checkpoint timeout to 30 min.* The *checkpoint interval is 10 min.* But while running the job we got the exception below. java.lang.RuntimeException: Error while confirming checkpoint at org.apache.flink.streaming.ru

Re: Heartbeat of TaskManager with id container

2023-08-02 Thread xiangyu feng
Hi ynagireddy4u, We have seen this exception before. It is usually caused by one of the following: 1) The TaskManager is too busy with other work to send the heartbeat to the JobMaster, or the TaskManager process might have already exited; 2) There might be a network issue between this TaskManager and the JobMaster; 3
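
To illustrate why a busy or unreachable TaskManager ends in this exception, here is a minimal sketch of timeout-based heartbeat monitoring. It is not Flink's implementation: the HeartbeatMonitor class, the ids "tm-a" and "tm-b", and the 50-second timeout are all assumptions chosen for illustration. The monitor only records when each target was last heard from, so a delayed heartbeat and a dead process look the same to it.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per target and reports expired ones.

    Illustrative only; names and behavior are assumptions, not Flink APIs.
    """

    def __init__(self, timeout_ms: int) -> None:
        self.timeout_ms = timeout_ms
        self.last_seen_ms: dict[str, int] = {}

    def report_heartbeat(self, target_id: str, now_ms: int) -> None:
        # Record the time at which the target was last heard from.
        self.last_seen_ms[target_id] = now_ms

    def timed_out(self, now_ms: int) -> list[str]:
        # A target has timed out once its silence exceeds the timeout.
        return sorted(
            target_id
            for target_id, seen in self.last_seen_ms.items()
            if now_ms - seen > self.timeout_ms
        )


monitor = HeartbeatMonitor(timeout_ms=50_000)  # assumed timeout value
monitor.report_heartbeat("tm-a", now_ms=0)
monitor.report_heartbeat("tm-b", now_ms=0)
monitor.report_heartbeat("tm-a", now_ms=40_000)  # tm-b stays silent

expired = monitor.timed_out(now_ms=60_000)
print("timed out:", expired)
```

Because the monitor cannot tell the causes apart, checking TaskManager load, process status, and network connectivity are all reasonable first steps.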

Heartbeat of TaskManager with id container

2023-07-31 Thread Y SREEKARA BHARGAVA REDDY
Hi Team, Has anyone faced the exception below? If so, please share the resolution. 2023-07-28 22:04:16 *java.util.concurrent.TimeoutException: Heartbeat of TaskManager with id container_e19_1690528962823_0382_01_05 timed out.* at org.apache.flink.runtime.jobmaster.JobMaster$TaskManager