gaborgsomogyi commented on pull request #16917: URL: https://github.com/apache/flink/pull/16917#issuecomment-903127554
I recently ran the test in a loop with extra logging and a check on a similar path. The test failed only after a very long time, and my findings are the following:

* `waitApplicationFinishedElseKillIt` returned without an exception because the application state reached `FINISHED`
* YARN then started to clean up directories and so on
* Depending on how fast this clean-up is, `flinkUberjar` is either still there or already deleted

All in all I agree with the direction, but personally I would add an `Assert.fail` for the case where `flinkUberjar` is missing (before getting the file status [here](https://github.com/apache/flink/blob/82c1cc12ec6830d6e9cff27eb77dbebbe354f703/flink-yarn-tests/src/test/java/org/apache/flink/yarn/YARNFileReplicationITCase.java#L165)). I think that would help later analysis (job already failed or killed).

I'm now starting the loop with the suggested change to see whether this solves it. I will come back with the result in a couple of days...
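To make the `Assert.fail` suggestion concrete, here is a minimal sketch of the guard I have in mind. It is not the actual test code: the class and method names (`UberjarPresenceCheck`, `sizeOrFail`) are hypothetical, `java.nio.file` stands in for the Hadoop `FileSystem` used in the real test, and a plain `AssertionError` stands in for JUnit's `Assert.fail` so that the snippet runs without extra dependencies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UberjarPresenceCheck {

    // Hypothetical helper: fail with a descriptive message when the uberjar is
    // already gone, instead of letting the later status lookup throw.
    // Files.size(...) is only a stand-in for the real file status lookup.
    static long sizeOrFail(Path flinkUberjar) throws IOException {
        if (!Files.exists(flinkUberjar)) {
            // In the real test this would be Assert.fail(...)
            throw new AssertionError(
                    "flinkUberjar is missing: " + flinkUberjar
                            + " (job already failed or killed, or clean-up already ran?)");
        }
        return Files.size(flinkUberjar);
    }

    public static void main(String[] args) throws IOException {
        // Simulate the race: the jar exists first, then clean-up deletes it.
        Path jar = Files.createTempFile("flink-dist", ".jar");
        System.out.println("size while present: " + sizeOrFail(jar));

        Files.delete(jar);
        try {
            sizeOrFail(jar);
        } catch (AssertionError e) {
            System.out.println("failed as intended: " + e.getMessage());
        }
    }
}
```

The point of the explicit check is only the failure message: a missing jar then shows up as a clear assertion failure rather than an exception from the status lookup.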