[jira] [Created] (FLINK-9678) Remove hard-coded sleeps in HA E2E test

Chesnay Schepler (JIRA) Wed, 27 Jun 2018 05:11:13 -0700

Chesnay Schepler created FLINK-9678:
---------------------------------------


             Summary: Remove hard-coded sleeps in HA E2E test
                 Key: FLINK-9678
                 URL: https://issues.apache.org/jira/browse/FLINK-9678
             Project: Flink
          Issue Type: Improvement
          Components: Distributed Coordination, Tests
    Affects Versions: 1.5.0, 1.6.0
            Reporter: Chesnay Schepler


{{test_ha.sh}} uses two hard-coded sleeps:
{code:bash}
# let the job run for a while to take some checkpoints
sleep 20

for (( c=0; c<${JM_KILLS}; c++ )); do
    # kill the JM and wait for watchdog to
    # create a new one which will take over
    kill_jm
    sleep 60
done{code}
These sleeps are always troublesome: they either make the test brittle when 
they are too small, or cause the test to idle when they are too large.

The first sleep should be replaced with {{wait_num_checkpoints}}.

I'm not entirely sure about the semantics of the second sleep, but I guess 
we're waiting for the new JM to continue the job execution. In this case I 
suggest instead querying the job status via REST and waiting until the job 
is running.
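
A rough sketch of what the REST-based wait could look like, not a final implementation. The helper names {{query_job_state}} and {{wait_job_running}}, the {{REST_ADDRESS}} default, and the timeout values are made up for illustration. It also assumes that {{/jobs/<jobid>}} returns compact JSON whose first {{"state"}} field is the job state.

```shell
#!/usr/bin/env bash

# Assumed default; the actual test setup may expose REST elsewhere.
REST_ADDRESS="${REST_ADDRESS:-localhost:8081}"

# Hypothetical helper: print the job state (e.g. RUNNING), or nothing
# if the REST endpoint is unreachable, as it is while the JM is down.
function query_job_state {
    local job_id=$1
    curl -s --max-time 5 "http://${REST_ADDRESS}/jobs/${job_id}" \
        | grep -o '"state":"[A-Z_]*"' \
        | head -n 1 \
        | cut -d '"' -f 4
}

# Hypothetical helper: poll until the job reports RUNNING, or give up
# after the timeout (in seconds) and return a non-zero exit code.
function wait_job_running {
    local job_id=$1
    local timeout=${2:-120}
    local interval=${3:-1}
    local waited=0
    while (( waited < timeout )); do
        if [[ "$(query_job_state "${job_id}")" == "RUNNING" ]]; then
            return 0
        fi
        sleep "${interval}"
        waited=$(( waited + interval ))
    done
    echo "Job ${job_id} was not RUNNING after ${timeout}s" >&2
    return 1
}

# Possible use inside the kill loop (JOB_ID is assumed to be known):
#   kill_jm
#   wait_job_running "${JOB_ID}" 120 || exit 1
```

The timeout keeps a JM that never recovers from hanging the test forever, while a fast recovery no longer costs the full 60 seconds.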

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
