This is an automated email from the ASF dual-hosted git repository.

dbecker pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 99e8170997f18db0f63d451af89ca32320ebb465
Author: Yida Wu <[email protected]>
AuthorDate: Mon Feb 5 13:59:13 2024 -0800

    IMPALA-12721: Fix flaky tests involving check_deleted_file_fd()
    
    check_deleted_file_fd() was introduced in IMPALA-12681, but some
    spilling test cases that use it are flaky.
    
    This patch fixes the issue by adding a retry to
    check_deleted_file_fd(). If the first check still finds a deleted
    file that is referenced by an open file descriptor, the function
    waits and checks once more. In local testing, the file is removed
    after the test even when the test fails, and additional test logging
    confirmed that the call to delete the file handle comes before the
    call to remove the file. There is no known explanation for the
    failure; a delay somewhere in the process may be responsible. With
    the retry in place, the test case ran 200 times without a failure.
    
    Tests:
    Reran the testcase 200 times without a failure.
    
    Change-Id: I900aab7dc9833015ce140253ff40da28a6ed3ba6
    Reviewed-on: http://gerrit.cloudera.org:8080/21000
    Reviewed-by: Impala Public Jenkins <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
---
 tests/custom_cluster/test_scratch_disk.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tests/custom_cluster/test_scratch_disk.py b/tests/custom_cluster/test_scratch_disk.py
index d1ab74fc8..bc08c647b 100644
--- a/tests/custom_cluster/test_scratch_disk.py
+++ b/tests/custom_cluster/test_scratch_disk.py
@@ -26,6 +26,7 @@ import shutil
 import stat
 import subprocess
 import tempfile
+import time
 
 from tests.common.custom_cluster_test_suite import CustomClusterTestSuite
 from tests.common.skip import SkipIf
@@ -307,6 +308,10 @@ class TestScratchDir(CustomClusterTestSuite):
     assert pids
     for pid in pids:
       deleted_files = self.find_deleted_files_in_fd(pid)
+      if deleted_files is not None:
+        # Retry once after a delay if the first check fails.
+        time.sleep(15)
+        deleted_files = self.find_deleted_files_in_fd(pid)
       assert deleted_files is None
 
   @pytest.mark.execute_serially
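
The patch retries once after a fixed 15-second sleep. As a hedged alternative, the same idea can be written as a polling loop that returns as soon as the check passes. The sketch below is not part of the commit: wait_until_none() and its parameters are hypothetical names, and it assumes Python 3 (for time.monotonic) and that the check returns None on success, as find_deleted_files_in_fd() does in the diff above.

```python
import time


def wait_until_none(check, timeout_s=15.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
  """Calls check() until it returns None or timeout_s has elapsed.

  Returns the last value produced by check(): None on success, otherwise
  whatever check() still reported when the deadline passed.
  """
  deadline = clock() + timeout_s
  result = check()
  # Keep polling only while the check still reports something and
  # the deadline has not passed.
  while result is not None and clock() < deadline:
    sleep(interval_s)
    result = check()
  return result
```

In the test loop this could be used as `deleted_files = wait_until_none(lambda: self.find_deleted_files_in_fd(pid))`, followed by the existing `assert deleted_files is None`. Polling keeps the same worst-case wait as the fixed sleep but finishes sooner when the file descriptor is released quickly.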
