Hi Horiguchi-san,
In a project I helped with, I encountered an issue where
the archive command kept failing. I thought this issue was
related to the problem in this thread, so I'm sharing it here.
If I should create a new thread, please let me know.
* Problem
- The archive_command is failed always.
* Conditions under which the problem occurs (parameters)
- archive_mode=always
- Using the test command in archive_command
* Probable cause
- I guess that is because the .history file already exists,
and the test command fails.
(but if we use archive_mode=on, archive_command is successful).
* How to reproduce
- Attached is a script to reproduce the problem.
Note: the script will remove $PGDATA when it started
The test command is listed as an example of the use of archive_command
in postgresql.conf, and the project faced this problem because it used
the example as is. If this behavior is a specification, it would be
better not to write the test command as a usage example.
Or maybe there should be a note that the test command should not be used
when archive_mode=always. Maybe, I'm missing something, sorry.
Regards,
Tatsuro Yamada
set -v
pg_ctl stop
rm -rf $PGDATA
initdb --encoding=UTF8 --no-locale
mkdir $PGDATA/arc
cat <<@EOF >> $PGDATA/postgresql.conf
archive_mode = always
archive_command = 'test ! -f arc/%f && cp %p arc/%f'
restore_command = 'cp arc/%f %p'
recovery_target_timeline = 'latest'
@EOF
touch $PGDATA/standby.signal
pg_ctl start
sleep 3
pg_ctl promote
sleep 3
psql -c "create table test (a int);"
# These are the trigger for archive_command
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();" # OK
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();" # OK
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();" # OK
sleep 1
touch $PGDATA/standby.signal
pg_ctl stop
pg_ctl start
echo "---- NG ---- (If we use archive_mode=on, archive_command is successful)"
sleep 10
pg_ctl promote
sleep 3
# These are the trigger for archive_command
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();" #
NG: archive_command failed
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();" #
NG: archive_command failed
psql -c "insert into test values (1); checkpoint; select pg_switch_wal();" #
NG: archive_command failed
waiting for server to shut down.... done
server stopped
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/postgres/PG140/DATA ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Tokyo
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
pg_ctl -D /home/postgres/PG140/DATA -l logfile start
waiting for server to start....2021-05-28 12:19:06.378 JST [8215] LOG:
starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
2021-05-28 12:19:06.378 JST [8215] LOG: listening on IPv6 address "::1", port
1400
2021-05-28 12:19:06.379 JST [8215] LOG: listening on IPv4 address "127.0.0.1",
port 1400
2021-05-28 12:19:06.381 JST [8215] LOG: listening on Unix socket
"/tmp/.s.PGSQL.1400"
2021-05-28 12:19:06.386 JST [8216] LOG: database system was shut down at
2021-05-28 12:19:06 JST
cp: cannot stat ‘arc/00000002.history’: No such file or directory
2021-05-28 12:19:06.393 JST [8216] LOG: entering standby mode
cp: cannot stat ‘arc/000000010000000000000001’: No such file or directory
2021-05-28 12:19:06.402 JST [8216] LOG: consistent recovery state reached at
0/16E9CF8
2021-05-28 12:19:06.402 JST [8216] LOG: invalid record length at 0/16E9CF8:
wanted 24, got 0
2021-05-28 12:19:06.402 JST [8215] LOG: database system is ready to accept
read only connections
cp: cannot stat ‘arc/00000002.history’: No such file or directory
cp: cannot stat ‘arc/000000010000000000000001’: No such file or directory
cp: cannot stat ‘arc/00000002.history’: No such file or directory
done
server started
waiting for server to promote....cp: cannot stat
‘arc/000000010000000000000001’: No such file or directory
2021-05-28 12:19:09.462 JST [8216] LOG: received promote request
2021-05-28 12:19:09.463 JST [8216] LOG: redo is not required
cp: cannot stat ‘arc/000000010000000000000001’: No such file or directory
cp: cannot stat ‘arc/00000002.history’: No such file or directory
2021-05-28 12:19:09.478 JST [8216] LOG: selected new timeline ID: 2
2021-05-28 12:19:09.502 JST [8216] LOG: archive recovery complete
cp: cannot stat ‘arc/00000001.history’: No such file or directory
2021-05-28 12:19:09.510 JST [8215] LOG: database system is ready to accept
connections
done
server promoted
CREATE TABLE
pg_switch_wal
---------------
0/17025F0
(1 row)
pg_switch_wal
---------------
0/20001B0
(1 row)
pg_switch_wal
---------------
0/30001D8
(1 row)
waiting for server to shut down....2021-05-28 12:19:13.669 JST [8215] LOG:
received fast shutdown request
2021-05-28 12:19:13.671 JST [8215] LOG: aborting any active transactions
2021-05-28 12:19:13.672 JST [8215] LOG: background worker "logical replication
launcher" (PID 8239) exited with exit code 1
2021-05-28 12:19:13.672 JST [8219] LOG: shutting down
2021-05-28 12:19:13.713 JST [8215] LOG: database system is shut down
done
server stopped
waiting for server to start....2021-05-28 12:19:13.798 JST [8267] LOG:
starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
2021-05-28 12:19:13.799 JST [8267] LOG: listening on IPv6 address "::1", port
1400
2021-05-28 12:19:13.799 JST [8267] LOG: listening on IPv4 address "127.0.0.1",
port 1400
2021-05-28 12:19:13.801 JST [8267] LOG: listening on Unix socket
"/tmp/.s.PGSQL.1400"
2021-05-28 12:19:13.804 JST [8268] LOG: database system was shut down at
2021-05-28 12:19:13 JST
cp: cannot stat ‘arc/00000003.history’: No such file or directory
2021-05-28 12:19:13.806 JST [8268] LOG: entering standby mode
2021-05-28 12:19:13.809 JST [8268] LOG: restored log file "00000002.history"
from archive
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
2021-05-28 12:19:13.819 JST [8268] LOG: consistent recovery state reached at
0/50000A0
2021-05-28 12:19:13.819 JST [8268] LOG: invalid record length at 0/50000A0:
wanted 24, got 0
2021-05-28 12:19:13.820 JST [8267] LOG: database system is ready to accept
read only connections
2021-05-28 12:19:13.825 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:13.825 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
cp: cannot stat ‘arc/00000003.history’: No such file or directory
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
done
server started
---- NG ---- (When we use archive_mode=on, archive_command is successful)
2021-05-28 12:19:14.831 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:14.831 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:15.837 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:15.837 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:15.838 JST [8275] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
waiting for server to promote....cp: cannot stat
‘arc/000000020000000000000005’: No such file or directory
2021-05-28 12:19:23.895 JST [8268] LOG: received promote request
2021-05-28 12:19:23.896 JST [8268] LOG: redo is not required
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
2021-05-28 12:19:23.911 JST [8268] LOG: selected new timeline ID: 3
2021-05-28 12:19:23.948 JST [8268] LOG: archive recovery complete
2021-05-28 12:19:23.952 JST [8268] LOG: restored log file "00000002.history"
from archive
2021-05-28 12:19:23.956 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:23.956 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:23.960 JST [8267] LOG: database system is ready to accept
connections
done
server promoted
2021-05-28 12:19:24.962 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:24.962 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:25.968 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:25.968 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:25.969 JST [8275] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
2021-05-28 12:19:25.973 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:25.973 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:26.979 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:26.979 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
pg_switch_wal
---------------
0/5000328
(1 row)
pg_switch_wal
---------------
0/6000220
(1 row)
pg_switch_wal
---------------
0/7000240
(1 row)
2021-05-28 12:19:27.985 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:27.985 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:27.985 JST [8275] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
2021-05-28 12:19:27.989 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:27.989 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:28.995 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:28.995 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:30.002 JST [8275] LOG: archive command failed with exit code 1
2021-05-28 12:19:30.002 JST [8275] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:19:30.002 JST [8275] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
waiting for server to shut down....... done
server stopped
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "C".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory /home/postgres/PG140/DATA ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Asia/Tokyo
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
pg_ctl -D /home/postgres/PG140/DATA -l logfile start
waiting for server to start....2021-05-28 12:28:18.988 JST [8911] LOG:
starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
2021-05-28 12:28:18.988 JST [8911] LOG: listening on IPv6 address "::1", port
1400
2021-05-28 12:28:18.988 JST [8911] LOG: listening on IPv4 address "127.0.0.1",
port 1400
2021-05-28 12:28:18.991 JST [8911] LOG: listening on Unix socket
"/tmp/.s.PGSQL.1400"
2021-05-28 12:28:18.996 JST [8913] LOG: database system was shut down at
2021-05-28 12:28:18 JST
cp: cannot stat ‘arc/00000002.history’: No such file or directory
2021-05-28 12:28:19.001 JST [8913] LOG: entering standby mode
cp: cannot stat ‘arc/000000010000000000000001’: No such file or directory
2021-05-28 12:28:19.009 JST [8913] LOG: consistent recovery state reached at
0/16E9CF8
2021-05-28 12:28:19.009 JST [8913] LOG: invalid record length at 0/16E9CF8:
wanted 24, got 0
2021-05-28 12:28:19.010 JST [8911] LOG: database system is ready to accept
read only connections
cp: cannot stat ‘arc/00000002.history’: No such file or directory
cp: cannot stat ‘arc/000000010000000000000001’: No such file or directory
cp: cannot stat ‘arc/00000002.history’: No such file or directory
done
server started
waiting for server to promote....cp: cannot stat
‘arc/000000010000000000000001’: No such file or directory
2021-05-28 12:28:22.076 JST [8913] LOG: received promote request
2021-05-28 12:28:22.076 JST [8913] LOG: redo is not required
cp: cannot stat ‘arc/000000010000000000000001’: No such file or directory
cp: cannot stat ‘arc/00000002.history’: No such file or directory
2021-05-28 12:28:22.091 JST [8913] LOG: selected new timeline ID: 2
2021-05-28 12:28:22.115 JST [8913] LOG: archive recovery complete
cp: cannot stat ‘arc/00000001.history’: No such file or directory
2021-05-28 12:28:22.123 JST [8911] LOG: database system is ready to accept
connections
done
server promoted
CREATE TABLE
pg_switch_wal
---------------
0/17025F0
(1 row)
pg_switch_wal
---------------
0/20001B0
(1 row)
pg_switch_wal
---------------
0/30001D8
(1 row)
waiting for server to shut down....2021-05-28 12:28:26.282 JST [8911] LOG:
received fast shutdown request
2021-05-28 12:28:26.284 JST [8911] LOG: aborting any active transactions
2021-05-28 12:28:26.284 JST [8911] LOG: background worker "logical replication
launcher" (PID 8936) exited with exit code 1
2021-05-28 12:28:26.285 JST [8917] LOG: shutting down
2021-05-28 12:28:26.315 JST [8911] LOG: database system is shut down
done
server stopped
waiting for server to start....2021-05-28 12:28:26.409 JST [8965] LOG:
starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-39), 64-bit
2021-05-28 12:28:26.410 JST [8965] LOG: listening on IPv6 address "::1", port
1400
2021-05-28 12:28:26.410 JST [8965] LOG: listening on IPv4 address "127.0.0.1",
port 1400
2021-05-28 12:28:26.412 JST [8965] LOG: listening on Unix socket
"/tmp/.s.PGSQL.1400"
2021-05-28 12:28:26.416 JST [8966] LOG: database system was shut down at
2021-05-28 12:28:26 JST
cp: cannot stat ‘arc/00000003.history’: No such file or directory
2021-05-28 12:28:26.418 JST [8966] LOG: entering standby mode
2021-05-28 12:28:26.421 JST [8966] LOG: restored log file "00000002.history"
from archive
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
2021-05-28 12:28:26.427 JST [8966] LOG: consistent recovery state reached at
0/50000A0
2021-05-28 12:28:26.427 JST [8966] LOG: invalid record length at 0/50000A0:
wanted 24, got 0
2021-05-28 12:28:26.428 JST [8965] LOG: database system is ready to accept
read only connections
2021-05-28 12:28:26.430 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:26.430 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
cp: cannot stat ‘arc/00000003.history’: No such file or directory
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
done
server started
---- NG ---- (If we use archive_mode=on, archive_command is successful)
2021-05-28 12:28:27.436 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:27.436 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:28.443 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:28.443 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:28.443 JST [8973] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
waiting for server to promote....cp: cannot stat
‘arc/000000020000000000000005’: No such file or directory
2021-05-28 12:28:36.506 JST [8966] LOG: received promote request
2021-05-28 12:28:36.506 JST [8966] LOG: redo is not required
cp: cannot stat ‘arc/000000020000000000000005’: No such file or directory
cp: cannot stat ‘arc/00000003.history’: No such file or directory
2021-05-28 12:28:36.521 JST [8966] LOG: selected new timeline ID: 3
2021-05-28 12:28:36.543 JST [8966] LOG: archive recovery complete
2021-05-28 12:28:36.546 JST [8966] LOG: restored log file "00000002.history"
from archive
2021-05-28 12:28:36.549 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:36.549 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:36.552 JST [8965] LOG: database system is ready to accept
connections
done
server promoted
2021-05-28 12:28:37.556 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:37.556 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:38.562 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:38.562 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:38.562 JST [8973] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
2021-05-28 12:28:38.566 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:38.566 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:39.572 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:39.572 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
pg_switch_wal
---------------
0/5000328
(1 row)
pg_switch_wal
---------------
0/6000220
(1 row)
pg_switch_wal
---------------
0/7000240
(1 row)
2021-05-28 12:28:40.578 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:40.578 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:40.578 JST [8973] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later
2021-05-28 12:28:40.583 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:40.583 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:41.586 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:41.586 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:42.592 JST [8973] LOG: archive command failed with exit code 1
2021-05-28 12:28:42.592 JST [8973] DETAIL: The failed archive command was:
test ! -f arc/00000002.history && cp pg_wal/00000002.history
arc/00000002.history
2021-05-28 12:28:42.593 JST [8973] WARNING: archiving write-ahead log file
"00000002.history" failed too many times, will try again later