On Fri, Aug 29, 2014 at 3:46 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> [FWIW: proper quoting makes answering easier and thus more likely]
>
> On 2014-08-29 15:37:51 -0700, Patrick Krecker wrote:
> > I ran the following on the local endpoint of spiped:
> >
> > while [ true ]; do psql -h localhost -p 5445 judicata -U marbury -c "select
> > modtime, pg_last_xlog_receive_location(), pg_last_xlog_replay_location()
> > from replication_time"; done;
> >
> > And the same command on production and I was able to verify that the xlogs
> > for a given point in time were the same (modtime is updated every second by
> > an upstart job):
> >
> > spiped from office -> production:
> >           modtime           | pg_last_xlog_receive_location | pg_last_xlog_replay_location
> > ----------------------------+-------------------------------+------------------------------
> >  2014-08-29 15:23:25.563766 | 177/2E80C9F8                  | 177/2E80C9F8
> >
> > Ran directly on production replica:
> >           modtime           | pg_last_xlog_receive_location | pg_last_xlog_replay_location
> > ----------------------------+-------------------------------+------------------------------
> >  2014-08-29 15:23:25.563766 | 177/2E80C9F8                  | 177/2E80C9F8
> >
> > To me, this is sufficient proof that spiped is indeed talking to the
> > machine I think it's talking to (also lsof reports the correct hostname).
> >
> > I created another basebackup from the currently stuck postgres instance on
> > another machine and I also get this error:
> >
> > 2014-08-29 15:27:30 PDT FATAL: could not receive data from WAL stream:
> > ERROR: requested starting point 177/2D000000 is ahead of the WAL flush
> > position of this server 174/B76D16A8
>
> Uh. this indicates that the machine you're talking to is *not* one of
> the above as it has a flush position of '174/B76D16A8' - not something
> that's really possible when the node actually is at '177/2E80C9F8'.
>
> Could you run, on the standby that's having problems, the following
> command:
> psql 'host=127.0.0.1 port=5445 user=XXX password=XXX' -c 'IDENTIFY_SYSTEM;'
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

RE: quoting, I wonder if Gmail is messing it up somehow? Or am I doing
something else wrong? Sorry :(

First, I apologize for the misleading information: when I made another
basebackup and tried to use it, I configured the machine to cascade from the
stuck replica, *not* from the spiped endpoint. When I properly connected it
to the spiped endpoint, it synced up fine, giving this log line:

2014-08-29 16:16:21 PDT LOG: started streaming WAL from primary at 177/4F000000 on timeline 1

The command as you gave it reported a syntax error, but I googled a little
bit and ran this one instead:

psql 'replication=1 dbname=XXX host=127.0.0.1 port=5445 user=XXX password=XXX' -c 'IDENTIFY_SYSTEM;'

And it gave me this output:

      systemid       | timeline |   xlogpos
---------------------+----------+--------------
 5964163898407843711 |        1 | 177/53091990
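
For reference, the "is ahead of the WAL flush position" error quoted above comes down to a numeric comparison of two xlog locations. Here is a minimal sketch in Python, assuming the usual reading of a location as two hexadecimal halves (high 32 bits / low 32 bits). The helper name parse_lsn is hypothetical and not part of PostgreSQL or psql:

```python
def parse_lsn(lsn: str) -> int:
    """Convert an 'X/Y' xlog location (two hex numbers) into one integer.

    Assumption: X is the high 32 bits and Y the low 32 bits of the position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Values copied from the FATAL log line quoted earlier in this thread.
requested = parse_lsn("177/2D000000")  # starting point the new standby asked for
flushed = parse_lsn("174/B76D16A8")    # flush position reported by its upstream

# The upstream refuses to stream when the request is ahead of what it has flushed.
print(requested > flushed)  # True
```

The same helper could be applied to the xlogpos value returned by IDENTIFY_SYSTEM to check whether a given upstream is far enough along for a particular standby.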