Hi there,
We observed several jobs running on 1.11 restart due to the job leader being lost.
Digging deeper, the issue seems related to the SUSPENDED state handler in
ZooKeeperLeaderRetrievalService.
AFAIK, the SUSPENDED state is expected when ZooKeeper is not certain whether the
leader is still alive. It can be followed by RECONNECTED once the connection recovers.
Friendly ping: the fix for the entropy marker issue is ready.
The fix PR is here: https://github.com/apache/flink/pull/15442
We need someone to help review and merge it in the meantime.
Hi Till,
We did some investigation and found that the memory usage points to the RocksDB
state backend running on managed memory. So far we have only seen this bug with
the RocksDB state backend on managed memory. We followed suggestion [1] and
disabled managed memory; since then we have not seen the issue.
I felt this mi
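For reference, a minimal sketch of the workaround (not our exact setup; here applied
to a local environment, which on a cluster corresponds to setting
state.backend.rocksdb.memory.managed: false in flink-conf.yaml):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.contrib.streaming.state.RocksDBOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DisableManagedMemorySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // turn off RocksDB's use of Flink managed memory
        // (cluster equivalent: state.backend.rocksdb.memory.managed: false)
        conf.set(RocksDBOptions.USE_MANAGED_MEMORY, false);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1, conf);
        env.setStateBackend(new RocksDBStateBackend("file:///tmp/checkpoints"));
        env.fromElements(1, 2, 3).print();
        env.execute("rocksdb-unmanaged-memory-sketch");
    }
}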
We also noticed that the state paths stored in _metadata still contain the entropy
marker after we fixed the metadata directory issue. This seems related to a code
refactoring and is not covered by tests.
To make it easier to read:
@Nullable
private static EntropyInjectingFileSystem getEntropyFs(FileSystem fs) {
    // debug logging added to see which FileSystem implementation is actually in use
    LOG.warn(fs.getClass().toGenericString());
    if (fs instanceof EntropyInjectingFileSystem) {
        return (EntropyInjectingFileSystem) fs;
    }
    return null;
}
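Since the marker is only resolved when that instanceof check passes, a quick probe
like the sketch below (bucket name is a placeholder; it assumes s3.entropy.key is set
to _entropy_ and the s3 plugin is available at runtime) shows whether the marker would
be stripped for the FileSystem that actually gets loaded:

import org.apache.flink.core.fs.EntropyInjector;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

public class EntropyMarkerProbe {
    public static void main(String[] args) throws Exception {
        // placeholder path; assumes s3.entropy.key: _entropy_
        Path checkpointPath = new Path("s3://my-bucket/_entropy_/checkpoints/chk-1");
        FileSystem fs = checkpointPath.getFileSystem();
        System.out.println("loaded fs: " + fs.getClass().getName());
        // if the loaded fs (or the wrapper around it) is not recognized as entropy
        // injecting, the path comes back unchanged, which matches the marker we
        // saw left verbatim in _metadata
        Path resolved = EntropyInjector.removeEntropyMarkerIfPresent(fs, checkpointPath);
        System.out.println("resolved: " + resolved);
    }
}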
Hi Till,
Thanks for sharing the pointers related to the entropy injection feature on 1.11.
We did some investigation, and so far it looks like an edge-case handling bug.
Testing Environment:
flink 1.11.2 release with plugins
plugins/s3-fs-hadoop/flink-s3-fs-hadoop
state.backend.rocksdb.timer-service.factory
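For completeness, a rough sketch of the kind of job setup described above (bucket
name, checkpoint interval, and the incremental flag are placeholders, not the actual
config; the entropy key and the plugin jar are configured on the cluster side):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EntropyCheckpointJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // checkpoint path carries the entropy marker; flink-s3-fs-hadoop sits under
        // plugins/, and state.backend.rocksdb.timer-service.factory is set in flink-conf.yaml
        RocksDBStateBackend backend =
                new RocksDBStateBackend("s3://my-bucket/_entropy_/checkpoints", true);
        env.setStateBackend(backend);
        env.enableCheckpointing(60_000);
        env.fromElements(1, 2, 3).print();
        env.execute("entropy-injection-sketch");
    }
}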