[ https://issues.apache.org/jira/browse/FLINK-26624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-26624:
-----------------------------------
    Labels: stale-critical starter test-stability  (was: starter test-stability)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help the community manage its development. I see this issue has been marked as Critical but is unassigned and neither itself nor its Sub-Tasks have been updated for 14 days. I have gone ahead and marked it "stale-critical". If this ticket is critical, please either assign yourself or give an update. Afterwards, please remove the label or in 7 days the issue will be deprioritized.

> Running HA (hashmap, async) end-to-end test failed on azure due to unable to find master logs
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-26624
>                 URL: https://issues.apache.org/jira/browse/FLINK-26624
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.15.0, 1.17.0, 1.16.1
>            Reporter: Yun Gao
>            Priority: Critical
>              Labels: stale-critical, starter, test-stability
>         Attachments: 20230304.2-build-46800-flink-logs.tgz
>
>
> {code:java}
> Mar 12 04:31:15 Waiting for text Completed checkpoint [1-9]* for job 699ebf9bdcb51a9fe76db5463027d34c to appear 2 of times in logs...
> grep: /home/vsts/work/_temp/debug_files/flink-logs/*standalonesession-1*.log*: No such file or directory
> Mar 12 04:31:16 Starting standalonesession daemon on host fv-az302-918.
> grep: /home/vsts/work/_temp/debug_files/flink-logs/*standalonesession-1*.log*: No such file or directory
> Mar 12 04:41:23 A timeout occurred waiting for Completed checkpoint [1-9]* for job 699ebf9bdcb51a9fe76db5463027d34c to appear 2 of times in logs.
> Mar 12 04:41:23 Stopping job timeout watchdog (with pid=272045)
> Mar 12 04:41:23 Killing JM watchdog @ 273681
> Mar 12 04:41:23 Killing TM watchdog @ 274268
> Mar 12 04:41:23 [FAIL] Test script contains errors.
> Mar 12 04:41:23 Checking of logs skipped.
> Mar 12 04:41:23
> Mar 12 04:41:23 [FAIL] 'Running HA (hashmap, async) end-to-end test' failed after 10 minutes and 31 seconds! Test exited with exit code 1
> Mar 12 04:41:23
> 04:41:23 ##[group]Environment Information
> Mar 12 04:41:24 Searching for .dump, .dumpstream and related files in '/home/vsts/work/1/s'
> dmesg: read kernel buffer failed: Operation not permitted
> Mar 12 04:41:28 Stopping taskexecutor daemon (pid: 272837) on host fv-az302-918.
> Mar 12 04:41:29 Stopping standalonesession daemon (pid: 274590) on host fv-az302-918.
> Mar 12 04:41:35 Stopping zookeeper...
> Mar 12 04:41:36 Stopping zookeeper daemon (pid: 272248) on host fv-az302-918.
> The STDIO streams did not close within 10 seconds of the exit event from process '/usr/bin/bash'. This may indicate a child process inherited the STDIO streams and has not yet exited.
> ##[error]Bash exited with code '1'.
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=32945&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=b2642e3a-5b86-574d-4c8a-f7e2842bfb14

--
This message was sent by Atlassian Jira
(v8.20.10#820010)