[ https://issues.apache.org/jira/browse/FLINK-34202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17818269#comment-17818269 ]
Xingbo Huang commented on FLINK-34202:
--------------------------------------

I reviewed all the stages where timeouts occurred and found that they all ran on AlibabaCI001. By contrast, every other stage that succeeded consistently took around 2 hours and 40 minutes. In the logs, I didn't notice any tests being stuck or having an overly long runtime, so I think the timeout is largely due to AlibabaCI001 not being fast enough to complete the tests for all 4 Python versions within 4 hours.

I submitted a PR to have the nightly CI randomly select one Python version for testing rather than running all 4 Python versions. With only one Python version being tested, the stage should not exceed 2 hours even if the machine's performance is poor (the CI run triggered by a PR only tests the latest Python version, which takes about 40 minutes).

> python tests take suspiciously long in some of the cases
> --------------------------------------------------------
>
>                 Key: FLINK-34202
>                 URL: https://issues.apache.org/jira/browse/FLINK-34202
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>    Affects Versions: 1.17.2, 1.19.0, 1.18.1
>            Reporter: Matthias Pohl
>            Assignee: Xingbo Huang
>            Priority: Critical
>              Labels: pull-request-available, test-stability
>
> [This release-1.18 build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603&view=logs&j=3e4dd1a2-fe2f-5e5d-a581-48087e718d53&t=b4612f28-e3b5-5853-8a8b-610ae894217a]
> has the python stage running into a timeout without any obvious reason. The
> [python stage run for JDK17|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56603&view=logs&j=b53e1644-5cb4-5a3b-5d48-f523f39bcf06]
> was also getting close to the 4h timeout.
> I'm creating this issue for documentation purposes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
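
For illustration only, the idea in the comment (test one randomly chosen Python version per nightly run instead of all four) might be sketched as below. This is not the code from the PR. The version list, the function names, and the `seed` parameter are all assumptions made for the example; the comment only states that there are 4 versions and does not name them.

```python
import random

# Placeholder list: the comment says 4 Python versions are tested,
# but does not say which ones. These values are assumed.
PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11"]


def pick_python_version(versions, seed=None):
    """Return one version chosen uniformly at random.

    Passing a seed makes the choice reproducible, which can help
    when re-running a failed nightly build with the same version.
    """
    rng = random.Random(seed)
    return rng.choice(list(versions))


def fits_in_budget(minutes_per_version, num_versions, budget_minutes):
    """True if testing the versions one after another stays within the budget."""
    return minutes_per_version * num_versions <= budget_minutes


if __name__ == "__main__":
    chosen = pick_python_version(PYTHON_VERSIONS)
    print(f"nightly run would test Python {chosen}")
```

The numbers in the comment are consistent with each other: 2 hours 40 minutes for 4 versions is 160 / 4 = 40 minutes per version, which matches the roughly 40 minutes reported for a single-version PR run. A machine would need to be about three times slower than that before one version alone exceeded the 2-hour estimate.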