Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.

2025-03-18 Thread Evgeniy Ratkov
On 08/09/2024 15:26, Heikki Linnakangas wrote: > 2. Independently of pg_rewind: When you start PostgreSQL, it will first > try to recover all the WAL it has locally in pg_wal. That goes wrong if > you have set a recovery target TLI. For example, imagine this situation: > > - Reco

Re: pg_rewind - enable wal_log_hints or data-checksums

2025-02-17 Thread Bowen Shi
Hi Michael, I first use initdb, and set wal_log_hints=off, data_checksums=off, and full_page_writes=on. Starting pg and running for a while. Then switch over happened, I used the following commands: 1. Old master postgresql.conf set wal_log_hints=on, then start and stop pg. 2. using pg_rewind

Re: Trouble using pg_rewind to undo standby promotion

2024-11-07 Thread Craig McIlwee
t back. > Setting archive_mode = on and a restore_command that reads from the WAL archive did the trick. With those changes in place, I was able to successfully run pg_rewind and get the promoted standby back onto timeline 1. Thanks for the tips. Craig

Re: Trouble using pg_rewind to undo standby promotion

2024-11-07 Thread Craig McIlwee
On Thu, Nov 7, 2024 at 4:47 AM Torsten Förtsch wrote: > Your point of divergence is in the middle of the 7718/00BF file. So, > you should have 2 such files eventually, one on timeline 1 and the other on > timeline 2. > > Are you archiving WAL on the promoted machine in a way that your > resto
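
A minimal sketch of why one divergence point yields two files with the same segment number: a WAL segment file name is the timeline ID followed by the segment position, so the same LSN maps to a different file on each timeline. The sketch assumes the default 16 MB segment size. The LSN `7718/BF800000` and the helper names are hypothetical, chosen only to land in the middle of the `7718/00BF` segment mentioned above.

```python
# Assumption: default WAL segment size of 16 MB (clusters initialised with
# a different --wal-segsize need a different value here).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn: str) -> int:
    """Convert an LSN written as 'HI/LO' (hex) into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(timeline: int, lsn: str,
                  segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the WAL segment file containing `lsn` on `timeline`."""
    segno = parse_lsn(lsn) // segment_size
    segs_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


if __name__ == "__main__":
    # Hypothetical divergence LSN inside segment 7718/00BF.
    for tli in (1, 2):
        print(tli, wal_file_name(tli, "7718/BF800000"))
```

Only the first eight hex digits differ between the two names, which is how an archive can hold both the timeline 1 and the timeline 2 copy of the segment.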

Re: Trouble using pg_rewind to undo standby promotion

2024-11-07 Thread Torsten Förtsch
a command that populates the > WAL archive. > > I would like to be able to promote standby 2 (hereon referred to just as > 'standby'), perform some writes, then rewind it back to the point before > promotion so it can become a standby again. The documentation for > p

Trouble using pg_rewind to undo standby promotion

2024-11-06 Thread Craig McIlwee
it back to the point before promotion so it can become a standby again. The documentation for pg_rewind says that this is supported and it seems like it should be straightforward, but I'm not having any luck getting this to work so I'm hoping someone can point out what I'm doing wrong.

pg_rewind Issue: Trying to read Incorrect WAL file for checkpoint record

2024-07-13 Thread azeem subhani
use pg_rewind to sync the failed primary with the current primary, but it failed. The error shows pg_rewind is trying to read from an incorrect WAL file `00010015` instead of `00040015` as per the control data: pg_rewind: servers diverged at WAL location 15A0/0
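
A rough sketch of how to read the timeline out of a WAL file name when checking which file a tool is asking for. It assumes the standard 24-hex-digit segment name (8 digits of timeline ID, then 16 digits of segment position). The two file names in the example are hypothetical, not the ones from the report above.

```python
import re
from typing import NamedTuple


class WalSegment(NamedTuple):
    timeline: int   # first 8 hex digits
    log: int        # middle 8 hex digits
    seg: int        # last 8 hex digits


# A plain WAL segment name is exactly 24 upper-case hex digits.
_WAL_NAME = re.compile(r"[0-9A-F]{24}")


def parse_wal_file_name(name: str) -> WalSegment:
    """Split a WAL segment file name into timeline, log and segment parts."""
    if not _WAL_NAME.fullmatch(name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return WalSegment(
        timeline=int(name[0:8], 16),
        log=int(name[8:16], 16),
        seg=int(name[16:24], 16),
    )


if __name__ == "__main__":
    # Hypothetical names: same segment position, different timelines.
    wanted = parse_wal_file_name("000000040000000000000015")
    requested = parse_wal_file_name("000000010000000000000015")
    print("same position:", wanted[1:] == requested[1:])
    print("timelines:", requested.timeline, "vs", wanted.timeline)
```

If the position matches and only the timeline differs, the file being requested belongs to an older timeline than the one recorded in the control data.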

Re: pg_rewind after promote

2024-03-28 Thread Laurenz Albe
On Thu, 2024-03-28 at 17:17 +0100, Emond Papegaaij wrote: > Op do 28 mrt 2024 om 16:21 schreef Laurenz Albe : > > On Thu, 2024-03-28 at 15:52 +0100, Emond Papegaaij wrote: > > > pg_rewind: source and target cluster are on the same timeline pg_rewind: > > > no rewind

Re: pg_rewind after promote

2024-03-28 Thread Emond Papegaaij
Op do 28 mrt 2024 om 16:21 schreef Laurenz Albe : > On Thu, 2024-03-28 at 15:52 +0100, Emond Papegaaij wrote: > > This works fine most of the time, but sometimes we see this message on > one of the nodes: > > pg_rewind: source and target cluster are on the same timeline pg_r

Re: pg_rewind after promote

2024-03-28 Thread Laurenz Albe
med from the new primary >  * the node that needed to be taken out of the cluster (the old primary) >is shutdown and rebooted > > This works fine most of the time, but sometimes we see this message on one of > the nodes: > pg_rewind: source and target cluster are on the same ti

pg_rewind after promote

2024-03-28 Thread Emond Papegaaij
the new primary * the node that needed to be taken out of the cluster (the old primary) is shutdown and rebooted This works fine most of the time, but sometimes we see this message on one of the nodes: pg_rewind: source and target cluster are on the same timeline pg_rewind: no rewind required This

Re: pg_rewind and replication user

2023-02-01 Thread Mateusz Henicz
Hey, If you would look into docs https://www.postgresql.org/docs/current/app-pgrewind.html on the "Notes" section you will find a list of permissions that user needs to have to be able to run pg_rewind. Cheers, Mateusz śr., 1 lut 2023, 15:09 użytkownik Wiwwo Staff napisał: > Hi!

pg_rewind and replication user

2023-02-01 Thread Wiwwo Staff
Hi! Provided my replication user created with CREATE USER repl_user REPLICATION LOGIN ENCRYPTED PASSWORD''; If I run pg_rewind referring to this user postgres@host1:~: pg_rewind -D $PGDATA --source-server="host=nre_primary port=5432 user=repl_user passfile='/var/lib/postgr

Re: pg_rewind and user / passfile

2023-01-25 Thread Wiwwo Staff
Sorry for the confusion, I must have done some crazy stuff about the user of pg_basebackup. Please just consider the question: * is there a way to tell pg_rewind to use the passfile? Thanks! On Wed, Jan 25, 2023 at 10:37 AM Wiwwo Staff wrote: > Hi! > I have noticed, if I use > pg_b

pg_rewind and user / passfile

2023-01-25 Thread Wiwwo Staff
re. If instead, on a old primary, I perform a pg_rewind, the primary_conninfo is user= password= channel_binding=prefer host=pg_blue port=5432 sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any I

pg_rewind

2021-03-31 Thread Alexey Bashtanov
Hi, I'm trying to get my head around pg_rewind. Why does it need full_page_writes and wal_log_hints on the target? As far as I could see it only needs old target WAL to see what pages have been touched since the last checkpoint before diverge point. Why can't it get this

Prevent pg_rewind destroying the data

2020-12-20 Thread Christopher Pereira
Hi, When pg_rewind is interrupted due to network errors, the cluster gets corrupted: Running pg_rewind for a second time returns "pg_rewind: fatal: target server must be shut down cleanly". Trying to fix the cluster with "/usr/pgsql-12/bin/postmaster' --single -F -

Re: how reliable is pg_rewind?

2020-08-03 Thread Curt Kolovson
Curt Kolovson wrote: > >> Any info on the reliability of pg_rewind and its limitations would be > appreciated. > > > > FWIW, we use it in production to accelerate the redeployment of > > standbys in HA configuration for 4 years now in at least one product, > > and

Re: how reliable is pg_rewind?

2020-08-02 Thread Paul Förster
Hi Curt, > On 03. Aug, 2020, at 08:25, Curt Kolovson wrote: > Thanks, Paul and Michael. I forgot to mention that we're using postgres > v10.12. 11.6 and 12.3 here. Also, please don't top-post, thanks. Cheers, Paul

Re: how reliable is pg_rewind?

2020-08-02 Thread Paul Förster
Hi Curt, hi Michael, > On 03. Aug, 2020, at 03:58, Michael Paquier wrote: > > On Sat, Aug 01, 2020 at 10:35:37AM -0700, Curt Kolovson wrote: >> Any info on the reliability of pg_rewind and its limitations would be >> appreciated. > > FWIW, we use it in p

Re: how reliable is pg_rewind?

2020-08-02 Thread Michael Paquier
On Sat, Aug 01, 2020 at 10:35:37AM -0700, Curt Kolovson wrote: > When trying to resync an old primary to become a new standby, I have found > that pg_rewind only works occasionally. How reliable/robust is pg_rewind, > and what are its limitations? We have observed that approx half ou

how reliable is pg_rewind?

2020-08-02 Thread Curt Kolovson
When trying to resync an old primary to become a new standby, I have found that pg_rewind only works occasionally. How reliable/robust is pg_rewind, and what are its limitations? We have observed that approx half our FPIs in the WALs are due to XLOG/FPI_FOR_HINT. The only reason we'v

Re: [EXTERNAL] Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-13 Thread Michael Paquier
On Wed, May 13, 2020 at 04:58:15AM +, Mariya Rampurawala wrote: > Thank you Kyotaro and Laurenz for your quick responses. > This helped me get my setup working. Please note that we have added in Postgres 13 the possibility to use a restore_command when using pg_rewind if the parameter

Re: [EXTERNAL] Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-13 Thread Mariya Rampurawala
> configuration changes so that I don't hit this issue? > > Is there anything I must change in my restore command? As mentioned in the documentation, pg_rewind uses the WAL records starting from the last checkpoint just before the divergence point. The divergence point is

Re: [EXTERNAL] Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-12 Thread Kyotaro Horiguchi
> How long is "long time after divergence"? Is there a way I can make some > > configuration changes so that I don’t hit this issue? > > Is there anything I must change in my restore command? As mentioned in the documentation, pg_rewind uses the WAL records starting from t

Re: [EXTERNAL] Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-12 Thread Laurenz Albe
tion changes so that I don’t hit this issue? > Is there anything I must change in my restore command? What you can do is to use a higher value for "wal_keep_segments". Then PostgreSQL will keep around that number of old WAL segments, which increases the chance for "pg_rewind"
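
As a back-of-the-envelope aid for picking that value, the sketch below estimates how much disk space a given "wal_keep_segments" setting can pin in pg_wal. It assumes the default 16 MB segment size. The function name and the sample value of 1000 are illustrative only.

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def retained_wal_bytes(wal_keep_segments: int,
                       segment_size: int = WAL_SEGMENT_SIZE) -> int:
    """Approximate disk space held back by wal_keep_segments, in bytes."""
    if wal_keep_segments < 0:
        raise ValueError("wal_keep_segments must not be negative")
    return wal_keep_segments * segment_size


if __name__ == "__main__":
    segments = 1000  # illustrative value, not a recommendation
    gib = retained_wal_bytes(segments) / 1024 ** 3
    print(f"{segments} segments keep about {gib:.1f} GiB of WAL")
```

The trade-off is disk space against the length of time the old primary can stay diverged while the needed WAL is still available locally.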

Re: [EXTERNAL] Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-12 Thread Mariya Rampurawala
ipts. > My set up consists of two nodes, Master and Slave. When master fails, the slave is promoted to master. But when I try to re-register the old master as slave, the pg_rewind command fails. Details below. ... > 1. Rewind again: > 2. -bash-4.2$ /usr/pgsql-12/bin/pg_

Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-12 Thread Kyotaro Horiguchi
r the old master as > slave, the pg_rewind command fails. Details below. ... > 1. Rewind again: > 2. -bash-4.2$ /usr/pgsql-12/bin/pg_rewind -D /pg_mnt/pg-12/data > --source-server="host=10.209.57.17 port=5432 user=postgres dbname=postgres" > > pg_rewind: servers d

Re: PostgreSQL-12 replication failover, pg_rewind fails

2020-05-12 Thread Mariya Rampurawala
Hello, Can someone please help me with the below query? Regards, Mariya From: Mariya Rampurawala Date: Sunday, 10 May 2020 at 2:55 PM To: "pgsql-general@lists.postgresql.org" , "pgsql-gene...@postgresql.org" Subject: PostgreSQL-12 replication failover, pg_rewind fails

PostgreSQL-12 replication failover, pg_rewind fails

2020-05-10 Thread Mariya Rampurawala
Hi, I am working on providing HA for replication, using automation scripts. My set up consists of two nodes, Master and Slave. When master fails, the slave is promoted to master. But when I try to re-register the old master as slave, the pg_rewind command fails. Details below. Non-default

pg_rewind: invalid record length

2019-09-17 Thread Ben Wheatley
Hi, We're encountering an issue when running pg_rewind, and are looking for advice on how to proceed. We have a set of 3 Postgres instances which are being restored from the same physical disk snapshot (which was taken from a standby on a production system), in order to test a disaster rec

pg_rewind and full_page_writes on zfs

2019-05-17 Thread Malte Swart
Hi, I am currently building a PostgreSQL cluster with physical replication. I want to disable full_page_writes as the underlying filesystem (ZFS) prevents partial page writes. I would like to use pg_rewind for faster reintegration of diverted nodes but pg_rewind refuses to act if

Re: pg_rewind success even though getting error 'record with incorrect prev-link'

2019-01-30 Thread Abdullah Al Maruf
> The only *error* I see is when you apparently manually kill the process. You mean walreceiver process?? 'FATAL: terminating walreceiver process due to administrator command' ? Actually, I didn't kill the receiver. It is done by postgres itself, as far as I understand. I restart this node using

Re: pg_rewind success even though getting error 'record with incorrect prev-link'

2019-01-29 Thread Ron
e is not streaming from master. pg_rewind still resolves timeline conflict, but it's not fixing this second error. Any work around?? My scenario, in short, I have 1 master nodes (0th node) and three standby nodes (1st, 2nd & 3rd node). When I make the 3rd node as

Re: pg_rewind success even though getting error 'record with incorrect prev-link'

2019-01-29 Thread Abdullah Al Maruf
or, `LOG: record with incorrect prev-link 1/21B at 0/B98` Actually, the 1st error is not making any issue. This node starts to streaming from primary successfully. But when the second error comes, It appears every 5 seconds. And, the node is not streaming from master. pg_rewind still

Re: pg_rewind success even though getting error 'record with incorrect prev-link'

2019-01-29 Thread Michael Paquier
On Tue, Jan 29, 2019 at 07:13:11PM +0600, Abdullah Al Maruf wrote: > When I try to attach an old master with 'archiving set to on` as a new > standby, `pg_rewind` doesn't throw any error, But, when the database > starts, The following error appears: > > ``` > LOG:

pg_rewind success even though getting error 'record with incorrect prev-link'

2019-01-29 Thread Abdullah Al Maruf
When I try to attach an old master with 'archiving set to on` as a new standby, `pg_rewind` doesn't throw any error, But, when the database starts, The following error appears: ``` LOG: invalid record length at 0/B98: wanted 24, got 0 LOG: started streaming WAL from primary at 0/

Re: help with startup slave after pg_rewind

2018-09-20 Thread Michael Paquier
On Wed, Sep 19, 2018 at 10:29:44PM +, Dylan Luong wrote: > After promoting slave to master, I completed a pg_rewind of the slave > (old master) to the new master. But when I try to start the slave I am > getting the following error. > > I tried to run pg_rewind again, but now i

help with startup slave after pg_rewind

2018-09-19 Thread Dylan Luong
Hi After promoting slave to master, I completed a pg_rewind of the slave (old master) to the new master. But when I try to start the slave I am getting the following error. 2018-09-20 07:53:51 ACST [20265]: [2-1] db=[unknown],user=replicant app=[unknown],host=10.69.20.22(51271) FATAL: the

Re: Pg_rewind cannot load history wal

2018-08-08 Thread Abhinav Mehta
Yes, consider using Repmgr. > On 08-Aug-2018, at 3:20 AM, Richard Schmidt > wrote: > > We think we have found our missing step. We needed to do an ordered shutdown > of the original primary before promoting the standby > I.e. > > >1. Make some changes to the A (the primary) and check that th

FW: Pg_rewind cannot load history wal

2018-08-08 Thread Richard Schmidt
We think we have found our missing step. We needed to do an ordered shutdown of the original primary before promoting the standby I.e. >1. Make some changes to the A (the primary) and check that they are replicated >to the B (the standby) Missing step: Perform ordered shutdown of A (the pr

Re: Pg_rewind cannot load history wal

2018-08-04 Thread Christophe Pettus
> On Aug 4, 2018, at 13:50, Michael Paquier wrote: > > Hm? The specific situation is if pg_rewind is attached to the target before the forced post-recovery checkpoint completes, the target can be corrupted: https://www.postgresql.org/message-id/ece3b665-e9dd-43ff-b6a6-

Re: Pg_rewind cannot load history wal

2018-08-04 Thread Michael Paquier
On Sat, Aug 04, 2018 at 07:54:36AM -0700, Christophe Pettus wrote: > Would having pg_rewind do a checkpoint on the source actually cause > anything to break, as opposed to a delay while the checkpoint > completes? Users relying only on streaming without archives would be impacted as po

Re: Pg_rewind cannot load history wal

2018-08-04 Thread Christophe Pettus
> On Aug 4, 2018, at 06:13, Michael Paquier wrote: > > Well, since its creation we have the tool behave this way. I am not > sure either that we can have pg_rewind create a checkpoint on the source > node each time a rewind is done, as it may not be necessary, and it >

Re: Pg_rewind cannot load history wal

2018-08-04 Thread Michael Paquier
On Sat, Aug 04, 2018 at 04:59:45AM -0700, Andres Freund wrote: > On 2018-08-04 10:54:22 +0100, Simon Riggs wrote: >> pg_rewind doesn't work correctly. Documenting a workaround doesn't change >> that. > > Especially because most people will only understand this af

Re: Pg_rewind cannot load history wal

2018-08-04 Thread Andres Freund
On 2018-08-04 10:54:22 +0100, Simon Riggs wrote: > On 4 August 2018 at 07:56, Michael Paquier wrote: > >> Sounds like we should write a pending timeline change to the control > >> file and have pg_rewind check that instead. > >> > >> I'd call this

Re: Pg_rewind cannot load history wal

2018-08-04 Thread Simon Riggs
g timeline change to the control >> file and have pg_rewind check that instead. >> >> I'd call this a timing bug, not a doc issue. > > Well, having pg_rewind enforce a checkpoint on the promoted standby > could cause a performance hit as well if we do it mandatori

Re: Pg_rewind cannot load history wal

2018-08-03 Thread Michael Paquier
9.3 to let the startup request a checkpoint to the checkpointer process instead of doing it itself. > Sounds like we should write a pending timeline change to the control > file and have pg_rewind check that instead. > > I'd call this a timing bug, not a doc issue. Well, having pg_rew

Re: Pg_rewind cannot load history wal

2018-08-03 Thread Simon Riggs
w primary) for A (soon to >> be standby) >> 6. Add a recovery.conf to A (soon to be standby). File contains >> recovery_target_timeline = 'latest' and restore_command = 'cp >> /ice-dev/wal_archive/%f "%p" >> 7. Run pg_rewind on A - this a

Re: Pg_rewind cannot load history wal

2018-08-03 Thread Michael Paquier
oon to be standby). File contains > recovery_target_timeline = 'latest' and restore_command = 'cp > /ice-dev/wal_archive/%f "%p" > 7. Run pg_rewind on A - this appears to work as it returns the > message 'source and target cluster are on the same timeline n

FW: Pg_rewind cannot load history wal

2018-08-02 Thread Richard Schmidt
> Now once your master A can’t become slave of B. Isn’t that the exact situation that pg_rewind should take care of? This email and any attachments may contain confidential information. If you are not the intended recipient, your use or communication of the information is strictly prohibi

Re: Pg_rewind cannot load history wal

2018-08-01 Thread Abhinav Mehta
al primary) > Add the replication slot to B (the new primary) for A (soon to be standby) > Add a recovery.conf to A (soon to be standby). File contains > recovery_target_timeline = 'latest' and restore_command = 'cp > /ice-dev/wal_archive/%f "%p" > Run pg_rewind o

Pg_rewind cannot load history wal

2018-08-01 Thread Richard Schmidt
4. Switch off A (the original primary) 5. Add the replication slot to B (the new primary) for A (soon to be standby) 6. Add a recovery.conf to A (soon to be standby). File contains recovery_target_timeline = 'latest' and restore_command = 'cp /ice-dev/wal_archive/%f "

Re: Missing WAL file after running pg_rewind

2018-01-14 Thread Michael Paquier
ew archive files of the new timeline (new master): > 0005038300C0.partial > 0006038300C0 > 0006038300C1 > 0006038300C2 > 0006038300C3 > > Looks like it has forked at C0. > But why is the new slave asking for 0006038300BE on

RE: Missing WAL file after running pg_rewind

2018-01-14 Thread Dylan Luong
(new master): 0005038300C0.partial 0006038300C0 0006038300C1 0006038300C2 0006038300C3 Looks like it has forked at C0. But why is the new slave asking for 0006038300BE on timeline during the restore after the pg_rewind? And

Re: Missing WAL file after running pg_rewind

2018-01-13 Thread Michael Paquier
On Fri, Jan 12, 2018 at 09:44:25PM +, Dylan Luong wrote: > The file exist in the archive directory of the old master but it is > for the previous timeline, ie 5 and not 6, ie > 0005038300BE. Can I just rename the file to 6 timeline? Ie > 0006038300BE What are the conte

RE: Missing WAL file after running pg_rewind

2018-01-12 Thread Dylan Luong
] Sent: Friday, 12 January 2018 12:08 PM To: Dylan Luong Cc: pgsql-general@lists.postgresql.org Subject: Re: Missing WAL file after running pg_rewind On Thu, Jan 11, 2018 at 04:58:02PM +, Dylan Luong wrote: > The steps I took were: > > 1. Stop all watchdogs > > 2.

Re: Missing WAL file after running pg_rewind

2018-01-11 Thread Michael Paquier
On Thu, Jan 11, 2018 at 04:58:02PM +, Dylan Luong wrote: > The steps I took were: > > 1. Stop all watchdogs > > 2. Start/stop the old master > > 3. Run 'checkpoint' on new master > > 4. Run the pg_rewind on old master to resyn

Missing WAL file after running pg_rewind

2018-01-11 Thread Dylan Luong
Hi We had a failover situation where our monitoring watchdog processes promoted the slave to become the new master. I restarted the old master database to ensure a clean stop/start and performed pg_rewind on the old master to resync with the new master. However, after successful rewind, there