On Wed, Sep 18, 2019 at 04:25:14PM +0530, Kuntal Ghosh wrote:
Hello Michael,

On Wed, Sep 18, 2019 at 6:28 AM Michael Paquier <mich...@paquier.xyz> wrote:

On my side, I have let this thing run for a couple of hours with a
patched version to include a sleep between the rename and the sync but
I could not reproduce it either:
#!/bin/bash
attempt=0
while true; do
        attempt=$((attempt+1))
        echo "Attempt $attempt"
        cd $HOME/postgres/src/test/recovery/
        PROVE_TESTS=t/006_logical_decoding.pl make check > /dev/null 2>&1
        ERRNUM=$?
        if [ $ERRNUM != 0 ]; then
                echo "Failed at attempt $attempt"
                exit $ERRNUM
        fi
done
I think the failing test is src/test/subscription/t/010_truncate.pl.
I've tried to reproduce the same failure using your script in OS X
10.14 and Ubuntu 18.04.2 (Linux version 5.0.0-23-generic), but
couldn't reproduce the same.


I kinda suspect it might be just a coincidence that it fails during that
particular test. What likely plays a role here is a checkpoint timing
(AFAICS that's the thing removing the file).  On most systems the tests
complete before any checkpoint is triggered, hence no issue.

Maybe aggressively triggering checkpoints on the running cluter from
another session would do the trick ...

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply via email to