Bug#986527: Patches for flaky build and cython unavailability

Ahzo Sat, 31 Jul 2021 16:57:54 -0700

Control: tags -1 patch

Hi,

The main problem making the sagemath test suite flaky is that it randomly
aborts with 'Too many open files'.
As a result, only a small part of the test suite actually runs when the build
is heavily parallelized.
This can be seen by reporting not only the number of failed tests but also the
number of tests run, which fluctuates significantly.

The problem occurs because every worker that has finished but has not yet been
logged holds an open fd (a pipe used to read the output of the child actually
doing the tests).
Thus, when following a long-running worker, i.e. logging its messages while it
is still running, so many finished tests can accumulate that the open files
limit (ulimit -n) is reached.

However, there should be no open pipe per finished worker, as the test suite
calls 'os.close(self.rmessages)' before waiting to log the messages.
So this seems to be caused by something that Python does behind the scenes.
Removing the single line 'finished.append(w)' in src/sage/doctest/forker.py
prevents the open fd increase, though at the cost of hardly logging any test
output.

This problem can be avoided by simply logging every finished test, but no
running one.
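
The effect can be illustrated with a toy model. This is only a sketch: it
assumes that each finished but unlogged worker keeps exactly one pipe fd open,
which matches the observed symptom but does not explain why it happens.
FakeWorker and peak_unlogged_fds are hypothetical names and are not part of
sage.doctest.forker.

```python
import os


class FakeWorker:
    """Hypothetical stand-in for a doctest worker; not Sage's real class."""

    def __init__(self):
        # The parent keeps the read end; the write end belongs to the child.
        self.rmessages, write_end = os.pipe()
        os.close(write_end)

    def log_and_release(self):
        # Assumption: logging a finished worker is when its fd is released.
        os.close(self.rmessages)
        self.rmessages = None


def peak_unlogged_fds(n_finished, follow_running_worker):
    """Return the peak number of fds held by finished, unlogged workers."""
    unlogged = []
    peak = 0
    for _ in range(n_finished):
        worker = FakeWorker()
        if follow_running_worker:
            # Output of finished workers is held back while another
            # worker is being followed.
            unlogged.append(worker)
        else:
            # Log every finished worker right away.
            worker.log_and_release()
        peak = max(peak, len(unlogged))
    # The followed worker finally ends: flush the backlog.
    for worker in unlogged:
        worker.log_and_release()
    return peak
```

In this model the backlog grows by one fd per finished test while a worker is
being followed, and stays at zero when finished tests are logged immediately.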

With only the 0001-Report-the-number-of-total-tests-run.patch, the result is
something like:
Success: 5 of 71435 tests failed, up to 200 failures are tolerated

Adding the dt-Do-not-follow-a-running-worker.patch, the result becomes:
Success: 194 of 361139 tests failed, up to 200 failures are tolerated
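
For reference, the total is obtained by summing the per-file '[N tests' lines
of the log. A rough Python equivalent of the grep/sed/awk pipeline added to
debian/tests.mk could look as follows. The log excerpt is made up for
illustration (only the singular.py line is taken from this report), and
total_tests is a hypothetical helper name.

```python
import re

# Same pattern as the grep in TESTS_TOTAL: four spaces, then "[N tests".
TESTS_LINE = re.compile(r"^    \[([0-9]+) tests")


def total_tests(log_text):
    """Sum the per-file test counts found in a doctest log."""
    total = 0
    for line in log_text.splitlines():
        match = TESTS_LINE.match(line)
        if match:
            total += int(match.group(1))
    return total


# Made-up log excerpt, for illustration only.
sample_log = (
    "sage -t --long src/sage/interfaces/singular.py\n"
    "    [404 tests, 3.87 s]\n"
    "sage -t --long src/sage/example.py\n"
    "    [0 tests, 0.00 s]\n"
    "sage -t --long src/sage/other.py\n"
    "    [12 tests, 0.10 s]\n"
)
print(total_tests(sample_log))  # 404 + 0 + 12 = 416
```

Files that abort before printing such a line contribute nothing to the sum,
which is why a low total indicates that tests were skipped.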

These 194 failures are pretty close to the threshold of 200, so it is not
particularly surprising that this can fail in some environments.
Slightly exceeding this threshold triggered the build failure in this bug and
also the one in bug #983931.

Increasing the threshold to 300 should make that rather unlikely, though.
And considering that there are more than 360 thousand tests, fewer than 300
failures means more than 99.9 % of the tests succeeded.
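
The percentage can be checked with a few lines of Python, using the total of
361139 tests reported above (pass_rate_percent is just an illustrative helper):

```python
def pass_rate_percent(failed, total):
    """Percentage of tests that did not fail."""
    return 100.0 * (total - failed) / total


total = 361139  # total number of tests from the run reported above
for failed in (194, 300):
    rate = pass_rate_percent(failed, total)
    print(f"{failed} failures: {rate:.3f} % of the tests passed")
```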

The "cython: not found" issue is trivial to fix and important, because
otherwise 'sage --cython' does not work and there is no '--cython3' option
(unlike e.g. the '--ipython3' option).

After adding the 0002-Tolerate-up-to-300-failing-tests.patch and the
u2-Adapt-to-python2-removal.patch the test result is:
Success: 189 of 361139 tests failed, up to 300 failures are tolerated

It would also be a good idea to include a backport of commit 5cf493ca51 ("Avoid
libgmp's new lazy allocation") in the next sagemath upload, as that fixes a
severe memory leak (see bug #964848).

As to the crashes, I can't reproduce any crash when testing
interfaces/singular.py:
sage -t --long --random-seed=0 src/sage/interfaces/singular.py
[404 tests, 3.87 s]

This crash does not always happen in the reproducible builds either, e.g.
the following log shows it first crashing and then passing this test:
https://tests.reproducible-builds.org/debian/rbuild/bullseye/amd64/sagemath_9.2-2.rbuild.log.gz
[...]
sage -t --long --random-seed=0 src/sage/interfaces/singular.py
Killed due to segmentation fault
[...]
sage -t --long --random-seed=0 src/sage/interfaces/singular.py
[404 tests, 21.06 s]
[...]

However, a number of other crashes happen during every test run, but only one
of them causes a test failure:
sage -t --long --random-seed=0 src/sage/interfaces/tests.py
**********************************************************************
File "src/sage/interfaces/tests.py", line 34, in sage.interfaces.tests
Failed example:
subprocess.call("echo syntax error | ecl", **kwds) in (0, 255)
Expected:
True
Got:
False
**********************************************************************

Similar crashes sometimes also occur when testing interfaces/lisp.py, but
without causing the test to fail.
This is a problem in ecl, which crashes when both stdout and stderr are full,
see bug #710953.

Then there is a crash in nauty-gentourng triggered by
src/sage/graphs/digraph_generators.py.
For details see bug #991750.

There are also two SIGABRT crashes in mwrank triggered by
src/sage/interfaces/mwrank.py.
These seem to be intentional due to invalid input.

Finally, there are some python crashes (5 SIGQUIT, 1 SIGABRT, 1 SIGSEGV) that
are all caused intentionally by the test suite.

So none of these crashes are problems in sagemath itself.

Regards,
Ahzo

From 5c741b0066c861504483b5ed66915b01ddd078b0 Mon Sep 17 00:00:00 2001
From: Ahzo <a...@tutanota.com>
Date: Sat, 31 Jul 2021 13:15:51 +0200
Subject: [PATCH 1/2] Report the number of total tests run

This makes it easier to notice when tests get skipped.
---
 debian/rules    | 10 ++++++----
 debian/tests.mk |  4 ++++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/debian/rules b/debian/rules
index f984695..9e904e2 100755
--- a/debian/rules
+++ b/debian/rules
@@ -163,11 +163,12 @@ endif
 had-few-failures:
 	if ! test -f $(LOGFILE); then echo "Error: log file $(LOGFILE) not found"; false; fi
 	N_TEST_FAILURES="$$($(TESTS_MK) failed-tests-total-normal)"; \
+	N_TESTS="$$($(TESTS_MK) tests-total)"; \
 	  if ! test $${N_TEST_FAILURES} -le $(MAX_TEST_FAILURES); then \
-	    echo "Error: $${N_TEST_FAILURES} tests failed, up to $(MAX_TEST_FAILURES) failures are tolerated"; \
+	    echo "Error: $${N_TEST_FAILURES} of $${N_TESTS} tests failed, up to $(MAX_TEST_FAILURES) failures are tolerated"; \
 	    false; \
 	  else \
-	    echo "Success: $${N_TEST_FAILURES} tests failed, up to $(MAX_TEST_FAILURES) failures are tolerated"; \
+	    echo "Success: $${N_TEST_FAILURES} of $${N_TESTS} tests failed, up to $(MAX_TEST_FAILURES) failures are tolerated"; \
 	  fi
 	if ! test -z "$$($(TESTS_MK) failed-tests-special $(IGNORE_FAILURES))"; then \
 	  echo "Error: critical test failures (e.g. timeout, segfault, etc.)"; false; fi
@@ -176,11 +177,12 @@ had-not-too-many-failures:
 	echo "Checking number of failed tests to determine whether to rerun tests in series..."
 	if ! test -f $(LOGFILE); then echo "Error: log file $(LOGFILE) not found"; false; fi
 	N_TEST_FAILURES="$$($(TESTS_MK) failed-tests-total-normal)"; \
+	N_TESTS="$$($(TESTS_MK) tests-total)"; \
 	  if ! test $${N_TEST_FAILURES} -le $(MAX_TEST_FAILURES_RERUN); then \
-	    echo "No: $${N_TEST_FAILURES} tests failed, up to $(MAX_TEST_FAILURES_RERUN) failures are tolerated for rerun"; \
+	    echo "No: $${N_TEST_FAILURES} of $${N_TESTS} tests failed, up to $(MAX_TEST_FAILURES_RERUN) failures are tolerated for rerun"; \
 	    false; \
 	  else \
-	    echo "Yes: $${N_TEST_FAILURES} tests failed, up to $(MAX_TEST_FAILURES_RERUN) failures are tolerated for rerun"; \
+	    echo "Yes: $${N_TEST_FAILURES} of $${N_TESTS} tests failed, up to $(MAX_TEST_FAILURES_RERUN) failures are tolerated for rerun"; \
 	  fi
 
 run_tests = \
diff --git a/debian/tests.mk b/debian/tests.mk
index 2bb2322..da707ad 100755
--- a/debian/tests.mk
+++ b/debian/tests.mk
@@ -14,6 +14,10 @@ check:
 check-failed:
 	$(SAGE) -t -p --all --long --logfile=$(LOGFILE) -f $(SAGE_TEST_FLAGS)
 
+TESTS_TOTAL = grep '^    \[\([0-9]*\) tests' $(LOGFILE) | grep -v "\[0 tests" | sed 's/.*\[\([0-9]*\) tests.*/\1/' | awk '{s+=$$1} END {print s}'
+tests-total:
+	$(TESTS_TOTAL)
+
 FAILED_TESTS = grep '^sage -t .*  \#' $(LOGFILE)
 failed-tests:
 	$(FAILED_TESTS)
-- 
2.30.2

From 8ca73a8db8b77ae6343f7718f6e4e55afcb3b569 Mon Sep 17 00:00:00 2001
From: Ahzo <a...@tutanota.com>
Date: Sat, 31 Jul 2021 13:24:33 +0200
Subject: [PATCH 2/2] Tolerate up to 300 failing tests

Even with 300 failures more than 99.9 % of the tests pass.
---
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/rules b/debian/rules
index 9e904e2..3ffc022 100755
--- a/debian/rules
+++ b/debian/rules
@@ -201,7 +201,7 @@ run_tests_with_retry = \
 
 override_dh_auto_test-arch:
 ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
-	$(call run_tests_with_retry,arch,$(SAGE_TEST_FLAGS_ARCH) src/sage,200)
+	$(call run_tests_with_retry,arch,$(SAGE_TEST_FLAGS_ARCH) src/sage,300)
 endif
 
 override_dh_auto_test-indep:
-- 
2.30.2

From 2af1cafaa2785f641aa8cf317c963b8150c0ba10 Mon Sep 17 00:00:00 2001
From: Ahzo <a...@tutanota.com>
Date: Sat, 31 Jul 2021 13:23:58 +0200
Subject: [PATCH] Do not follow a running worker

While it is running, no other results can be logged.

Every finished, but not yet logged worker, holds an open fd.
Thus when following a long running worker, so many finished tests can accumulate, that the open files limit (ulimit -n) is reached.
This then causes the test suite to fail with 'OSError: [Errno 24] Too many open files'.

Thus simply log every finished test, but no running one.
---
 sage/src/sage/doctest/forker.py | 31 +------------------------------
 1 file changed, 1 insertion(+), 30 deletions(-)

diff --git a/sage/src/sage/doctest/forker.py b/sage/src/sage/doctest/forker.py
index cb3667659e..fd7550da93 100644
--- a/sage/src/sage/doctest/forker.py
+++ b/sage/src/sage/doctest/forker.py
@@ -1817,11 +1817,6 @@ class DocTestDispatcher(SageObject):
         # If exitfirst is set and we got a failure.
         abort_now = False
 
-        # One particular worker that we are "following": we report the
-        # messages while it's running. For other workers, we report the
-        # messages if there is no followed worker.
-        follow = None
-
         # Install signal handler for SIGCHLD
         signal.signal(signal.SIGCHLD, dummy_handler)
 
@@ -1895,15 +1890,9 @@ class DocTestDispatcher(SageObject):
                     workers = new_workers
 
                     # Similarly, process finished workers.
-                    new_finished = []
                     for w in finished:
                         if opt.exitfirst and w.result[1].failures:
                             abort_now = True
-                        elif follow is not None and follow is not w:
-                            # We are following a different worker, so
-                            # we cannot report now.
-                            new_finished.append(w)
-                            continue
 
                         # Report the completion of this worker
                         log(w.messages, end="")
@@ -1918,9 +1907,8 @@ class DocTestDispatcher(SageObject):
                         pending_tests -= 1
 
                         restart = True
-                        follow = None
 
-                    finished = new_finished
+                    finished = []
 
                     if abort_now:
                         break
@@ -1972,23 +1960,6 @@ class DocTestDispatcher(SageObject):
                         if w.rmessages is not None and w.rmessages in rlist:
                             w.read_messages()
 
-                    # Find a worker to follow: if there is only one worker,
-                    # always follow it. Otherwise, take the worker with
-                    # the earliest deadline of all workers whose
-                    # messages are more than just the heading.
-                    if follow is None:
-                        if len(workers) == 1:
-                            follow = workers[0]
-                        else:
-                            for w in workers:
-                                if len(w.messages) > w.heading_len:
-                                    if follow is None or w.deadline < follow.deadline:
-                                        follow = w
-
-                    # Write messages of followed worker
-                    if follow is not None:
-                        log(follow.messages, end="")
-                        follow.messages = ""
         finally:
             # Restore SIGCHLD handler (which is to ignore the signal)
             signal.signal(signal.SIGCHLD, signal.SIG_DFL)
-- 
2.30.2

From 1f1dd9da522c288a4dda381828854162a65cabf1 Mon Sep 17 00:00:00 2001
From: Ahzo <a...@tutanota.com>
Date: Sat, 31 Jul 2021 13:24:39 +0200
Subject: [PATCH] Adapt to python2 removal

Use ipython3 and cython3 instead of ipython and cython.

Also call ipython3 directly instead of from SAGE_LOCAL, as the latter is
set to /usr anyway in the package.
During the build SAGE_LOCAL points to sage/local, but for some reason
ipython3 is not linked into its bin subfolder.
---
 sage/src/bin/sage                 | 6 +++---
 sage/src/sage/interfaces/tests.py | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/sage/src/bin/sage b/sage/src/bin/sage
index 265f2a82dd..eefe36762a 100755
--- a/sage/src/bin/sage
+++ b/sage/src/bin/sage
@@ -527,12 +527,12 @@ fi
 
 if [ "$1" = '-ipython' -o "$1" = '--ipython' ]; then
     shift
-    exec "$SAGE_LOCAL"/bin/ipython "$@"
+    exec ipython3 "$@"
 fi
 
 if [ "$1" = '-ipython3' -o "$1" = '--ipython3' ]; then
     shift
-    exec "$SAGE_LOCAL"/bin/ipython3 "$@"
+    exec ipython3 "$@"
 fi
 
 if [ "$1" = '-jupyter' -o "$1" = '--jupyter' ]; then
@@ -546,7 +546,7 @@ fi
 
 if [ "$1" = "-cython" -o "$1" = '--cython' -o "$1" = '-pyrex' -o "$1" = "--pyrex" ]; then
     shift
-    exec cython "$@"
+    exec cython3 "$@"
 fi
 
 if [ "$1" = '-gap' -o "$1" = '--gap' ]; then
diff --git a/sage/src/sage/interfaces/tests.py b/sage/src/sage/interfaces/tests.py
index a6847a9c85..29f604a0a5 100644
--- a/sage/src/sage/interfaces/tests.py
+++ b/sage/src/sage/interfaces/tests.py
@@ -37,7 +37,7 @@ Test that write errors to stderr are handled gracefully by GAP
     True
     sage: subprocess.call("echo syntax error | gp", **kwds)
     0
-    sage: subprocess.call("echo syntax error | ipython", **kwds) in (0, 1, 120)
+    sage: subprocess.call("echo syntax error | ipython3", **kwds) in (0, 1, 120)
     True
     sage: subprocess.call("echo syntax error | Singular", **kwds)
     0
-- 
2.30.2
