[ https://issues.apache.org/jira/browse/FLINK-10842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696615#comment-16696615 ]
ASF GitHub Bot commented on FLINK-10842:
----------------------------------------

twalthr commented on a change in pull request #7073: [FLINK-10842][E2E tests] fix broken waiting loops in common.sh
URL: https://github.com/apache/flink/pull/7073#discussion_r235892506

 ##########
 File path: flink-end-to-end-tests/test-scripts/common.sh
 ##########
 @@ -242,30 +245,45 @@ function start_taskmanagers {
  }

  function start_and_wait_for_tm {
 -    local url="${REST_PROTOCOL}://${NODENAME}:8081/taskmanagers"
 -
 -    tm_query_result=$(curl ${CURL_SSL_ARGS} -s "${url}")
 -
 +    tm_query_result=`query_running_tms`
      # we assume that the cluster is running
      if ! [[ ${tm_query_result} =~ \{\"taskmanagers\":\[.*\]\} ]]; then
          echo "Your cluster seems to be unresponsive at the moment: ${tm_query_result}" 1>&2
          exit 1
      fi
 -    running_tms=`curl ${CURL_SSL_ARGS} -s "${url}" | grep -o "id" | wc -l`
 -
 +    running_tms=`query_number_of_running_tms`
      ${FLINK_DIR}/bin/taskmanager.sh start
 +    wait_for_number_of_running_tms $((running_tms+1))
 +}
 -    for i in {1..10}; do
 -        local new_running_tms=`curl ${CURL_SSL_ARGS} -s "${url}" | grep -o "id" | wc -l`
 -        if [ $((new_running_tms-running_tms)) -eq 0 ]; then
 -            echo "TaskManager is not yet up."
 +function query_running_tms {
 +    local url="${REST_PROTOCOL}://${NODENAME}:8081/taskmanagers"
 +    curl ${CURL_SSL_ARGS} -s "${url}"

 Review comment:
 Yes, that makes sense to me. Thanks for the clarification.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Waiting loops are broken in e2e/common.sh
> -----------------------------------------
>
>                 Key: FLINK-10842
>                 URL: https://issues.apache.org/jira/browse/FLINK-10842
>             Project: Flink
>          Issue Type: Bug
>          Components: E2E Tests
>    Affects Versions: 1.7.0
>            Reporter: Andrey Zagrebin
>            Assignee: Andrey Zagrebin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.8.0
>
>
> There are 3 loops in flink-end-to-end-tests/test-scripts/common.sh where the
> script waits for some event to happen (for i in {1..10}; do):
> - wait_dispatcher_running
> - start_and_wait_for_tm
> - wait_job_running
> All loops have 10 iterations and the loop breaks if the awaited event
> happens. If timeout occurs then the script does not fail and the function
> just continues after 10 iterations ignoring that the awaited event did not
> happen.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
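
For readers of the archive, the failure mode described in the issue (a bounded loop that falls through silently once its iterations are used up) can be illustrated with a minimal sketch of a waiting loop that reports the timeout instead. The helper name wait_until and its arguments are hypothetical and are not taken from the pull request or from common.sh; the actual fix may be structured differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper (an illustration, not code from the pull request):
# polls a command until it succeeds and returns non-zero on timeout, so the
# caller cannot silently continue when the awaited event never happens.
# Usage: wait_until <max_attempts> <sleep_seconds> <command> [args...]
function wait_until {
    local max_attempts=$1
    local sleep_seconds=$2
    shift 2

    local i
    for ((i = 1; i <= max_attempts; i++)); do
        # Run the check in the current shell; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        echo "Attempt ${i}/${max_attempts}: condition not yet met." 1>&2
        sleep "${sleep_seconds}"
    done

    # All attempts used up: signal the failure explicitly.
    echo "Timed out after ${max_attempts} attempts." 1>&2
    return 1
}
```

A caller would then propagate the failure itself, for example with `wait_until 10 1 some_check || exit 1`, where some_check stands for whatever condition the script is waiting on.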