Hi there,
We observed several jobs running on 1.11 restart due to the job leader being lost.
Digging deeper, the issue seems related to the SUSPENDED state handler in
ZooKeeperLeaderRetrievalService.
AFAIK, the SUSPENDED state is expected when ZooKeeper is not certain whether the
leader is still alive. It can be followed by RECONNECTED once the connection recovers.
Friendly ping: the fix for the entropy marker issue is ready.
The fix PR is here: https://github.com/apache/flink/pull/15442
We need someone to help review and merge it in the meantime.
Hi Till,
We did some investigation and found that the memory usage points to the RocksDB
state backend running on managed memory. So far we have only seen this bug with
the RocksDB state backend on managed memory. We followed suggestion [1] and
disabled managed memory; since then we have not seen the issue.
I felt this mi
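For reference, a minimal sketch of the workaround (not our exact setup; here applied
to a local environment, which on a cluster corresponds to setting
state.backend.rocksdb.memory.managed: false in flink-conf.yaml):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.contrib.streaming.state.RocksDBOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DisableManagedMemorySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // turn off RocksDB's use of Flink managed memory
        // (cluster equivalent: state.backend.rocksdb.memory.managed: false)
        conf.set(RocksDBOptions.USE_MANAGED_MEMORY, false);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1, conf);
        env.setStateBackend(new RocksDBStateBackend("file:///tmp/checkpoints"));
        env.fromElements(1, 2, 3).print();
        env.execute("rocksdb-unmanaged-memory-sketch");
    }
}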
We also noticed that the state paths stored in _metadata still contain the entropy
marker after we fixed the metadata directory issue. This seems related to a code
refactoring and is not covered by tests.
To make it easier to read:
@Nullable
private static EntropyInjectingFileSystem getEntropyFs(FileSystem fs) {
    // debug logging added to see which FileSystem implementation is actually in use
    LOG.warn(fs.getClass().toGenericString());
    if (fs instanceof EntropyInjectingFileSystem) {
        return (EntropyInjectingFileSystem) fs;
    }
    return null;
}
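Since the marker is only resolved when that instanceof check passes, a quick probe
like the sketch below (bucket name is a placeholder; it assumes s3.entropy.key is set
to _entropy_ and the s3 plugin is available at runtime) shows whether the marker would
be stripped for the FileSystem that actually gets loaded:

import org.apache.flink.core.fs.EntropyInjector;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

public class EntropyMarkerProbe {
    public static void main(String[] args) throws Exception {
        // placeholder path; assumes s3.entropy.key: _entropy_
        Path checkpointPath = new Path("s3://my-bucket/_entropy_/checkpoints/chk-1");
        FileSystem fs = checkpointPath.getFileSystem();
        System.out.println("loaded fs: " + fs.getClass().getName());
        // if the loaded fs (or the wrapper around it) is not recognized as entropy
        // injecting, the path comes back unchanged, which matches the marker we
        // saw left verbatim in _metadata
        Path resolved = EntropyInjector.removeEntropyMarkerIfPresent(fs, checkpointPath);
        System.out.println("resolved: " + resolved);
    }
}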
Hi Till,
Thanks for sharing the pointers related to the entropy injection feature on 1.11.
We did some investigation, and so far it looks like an edge-case handling bug.
Testing Environment:
flink 1.11.2 release with plugins
plugins/s3-fs-hadoop/flink-s3-fs-hadoop
state.backend.rocksdb.timer-service.factory
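For completeness, a rough sketch of the kind of job setup described above (bucket
name, checkpoint interval, and the incremental flag are placeholders, not the actual
config; the entropy key and the plugin jar are configured on the cluster side):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EntropyCheckpointJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // checkpoint path carries the entropy marker; flink-s3-fs-hadoop sits under
        // plugins/, and state.backend.rocksdb.timer-service.factory is set in flink-conf.yaml
        RocksDBStateBackend backend =
                new RocksDBStateBackend("s3://my-bucket/_entropy_/checkpoints", true);
        env.setStateBackend(backend);
        env.enableCheckpointing(60_000);
        env.fromElements(1, 2, 3).print();
        env.execute("entropy-injection-sketch");
    }
}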