Hi, I just tried another round of tests, without running "sync; echo 3 > /proc/sys/vm/drop_caches', still got the same error, following FATAL errors are found in pg_log (slave), can anyone please advise how to resolve this error?
regards LOG: entering standby mode LOG: consistent recovery state reached at 11/42000318 LOG: redo starts at 11/42000280 LOG: invalid record length at 11/42000318 LOG: database system is ready to accept read only connections LOG: streaming replication successfully connected to primary FATAL: could not send data to WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. LOG: unexpected pageaddr 10/D2EC0000 in log file 18, segment 5, offset 15466496 LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed LOG: streaming replication successfully connected to primary FATAL: could not receive data from WAL stream: FATAL: requested WAL segment 000000010000001200000005 has already been removed On 6 Aug 2013, at 12:39 PM, ascot.m...@gmail.com wrote: > Hi, > > I found the problem should be because I tried to clean RAM cache in the slave > by running "sync; echo 3 > /proc/sys/vm/drop_caches' > that caused the "receiver" of slave gone away. > > ps -ef | grep receiver > postgres 6182 6178 0 12:11 ? 00:00:06 postgres: wal receiver > process streaming D/FB8DA000 > sync; echo 3 > /proc/sys/vm/drop_caches > ps -ef | grep receiver > root 8804 30447 0 12:29 pts/2 00:00:00 grep --color=auto receiver > > regards > > On 6 Aug 2013, at 10:44 AM, ascot.m...@gmail.com wrote: > >> Hi, >> >> I am doing some stress tests to a pair of PG servers to monitor the >> pg_stat_replication, during the test, the pg_stat_replication suddenly >> became empty. >> >> PG version: 9.2.4 >> O/S: Ubuntu: 12.04 >> >> Since I need to monitor the replication lag from time to time, if the >> pg_stat_replication becomes empty, the lag calculation in the slave will be >> wrong. >> Please advise if this is a bug. >> >> regards >> >> >> >> >> How to reproduce: >> >> session 1: Master server - try to insert a large number of records into a >> test table >> postgres=# drop table test;CREATE TABLE test (id INTEGER PRIMARY KEY); >> INSERT INTO test VALUES (generate_series(1,100000000)); EXPLAIN ANALYZE >> SELECT COUNT(*) FROM test; >> >> 2) session 2: Master server - check the byte_lag from time to time >> postgres=# SELECT >> sent_offset - ( >> replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS >> byte_lag >> FROM ( >> SELECT >> client_addr, >> ('x' || lpad(split_part(sent_location, '/', 1), 8, >> '0'))::bit(32)::bigint AS sent_xlog, >> ('x' || lpad(split_part(replay_location, '/', 1), 8, >> '0'))::bit(32)::bigint AS replay_xlog, >> ('x' || lpad(split_part(sent_location, '/', 2), 8, >> '0'))::bit(32)::bigint AS sent_offset, >> ('x' || lpad(split_part(replay_location, '/', 2), 8, >> '0'))::bit(32)::bigint AS replay_offset >> FROM pg_stat_replication >> ) AS s; >> byte_lag >> ---------- >> 2097216 >> (1 row) >> >> postgres=# SELECT >> >> sent_offset - ( >> replay_offset - (sent_xlog - replay_xlog) * 255 * 16 ^ 6 ) AS >> byte_lag >> FROM ( >> SELECT >> client_addr, >> ('x' || lpad(split_part(sent_location, '/', 1), 8, >> '0'))::bit(32)::bigint AS sent_xlog, >> ('x' || lpad(split_part(replay_location, '/', 1), 8, >> '0'))::bit(32)::bigint AS replay_xlog, >> ('x' || lpad(split_part(sent_location, '/', 2), 8, >> '0'))::bit(32)::bigint AS sent_offset, >> ('x' || lpad(split_part(replay_location, '/', 2), 8, >> '0'))::bit(32)::bigint AS replay_offset >> FROM pg_stat_replication >> ) AS s; >> byte_lag >> ---------- >> (0 rows) >> >> >> 3) session 3: Slave server - >> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = >> pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - >> pg_last_xact_replay_timestamp()) END AS log_delay; >> log_delay >> ----------- >> 0 >> (1 row) >> >> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = >> pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - >> pg_last_xact_replay_timestamp()) END AS log_delay; >> log_delay >> ----------- >> 4.873282 >> (1 row) >> >> . >> . >> . >> postgres=# SELECT CASE WHEN pg_last_xlog_receive_location() = >> pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - >> pg_last_xact_replay_timestamp()) END AS log_delay; >> log_delay >> ------------- >> 4070.325329 >> (1 row) >> >> >