[ https://issues.apache.org/jira/browse/FLINK-17849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113092#comment-17113092 ]
Yang Wang commented on FLINK-17849:
-----------------------------------

The reason {{YARNHighAvailabilityITCase}} hangs is that the application started by {{testJobRecoversAfterKillingTaskManager}} does not finish, so the next test, {{testKillYarnSessionClusterEntrypoint}}, does not have enough resources to start the Flink cluster. The following Flink client log confirms this.
{code:java}
28160 16:30:59,764 [main] INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
{code}
To make the test more stable, I suggest increasing {{AkkaOptions.ASK_TIMEOUT}} to 30s, as we have already done in {{YARNITCase}}.

> YARNHighAvailabilityITCase hangs in Azure Pipelines CI
> ------------------------------------------------------
>
> Key: FLINK-17849
> URL: https://issues.apache.org/jira/browse/FLINK-17849
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.11.0
> Reporter: Stephan Ewen
> Priority: Blocker
> Fix For: 1.11.0
>
> Attachments: jobmanager.log
>
>
> The test seems to hang for 15 minutes, then gets killed.
> Full logs:
> https://dev.azure.com/sewen0794/19b23adf-d190-4fb4-ae6e-2e92b08923a3/_apis/build/builds/25/logs/121

--
This message was sent by Atlassian Jira
(v8.3.4#803005)