[ 
https://issues.apache.org/jira/browse/FLINK-17849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113092#comment-17113092
 ] 

Yang Wang commented on FLINK-17849:
-----------------------------------

The {{YARNHighAvailabilityITCase}} hangs because the application started by 
{{testJobRecoversAfterKillingTaskManager}} does not finish, so the next test, 
{{testKillYarnSessionClusterEntrypoint}}, does not have enough resources to 
start its Flink cluster. The following Flink client log confirms this:
{code:java}
28160 16:30:59,764 [main] INFO  org.apache.flink.yarn.YarnClusterDescriptor [] - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
{code}
To make the test more stable, I suggest increasing 
{{AkkaOptions.ASK_TIMEOUT}} to 30s, as we already do in 
{{YARNITCase}}.
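
For reference, {{AkkaOptions.ASK_TIMEOUT}} corresponds to the configuration key 
{{akka.ask.timeout}}. A minimal sketch of the proposed setting, written as a 
Flink configuration entry, could look like the following. Where exactly the 
test applies it (for example on the {{Configuration}} passed to the cluster 
descriptor) is an assumption and is not shown here.
{code}
# Raise the ask timeout for the test to 30 seconds
akka.ask.timeout: 30 s
{code}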

> YARNHighAvailabilityITCase hangs in Azure Pipelines CI
> ------------------------------------------------------
>
>                 Key: FLINK-17849
>                 URL: https://issues.apache.org/jira/browse/FLINK-17849
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.11.0
>            Reporter: Stephan Ewen
>            Priority: Blocker
>             Fix For: 1.11.0
>
>         Attachments: jobmanager.log
>
>
> The test seems to hang for 15 minutes, then gets killed.
> Full logs: 
> https://dev.azure.com/sewen0794/19b23adf-d190-4fb4-ae6e-2e92b08923a3/_apis/build/builds/25/logs/121



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
