Re: psql: FATAL: the database system is starting up

2019-06-02 Thread Tom K
Hey Adrian, Fixed it. I saw the post from jebriggs but that didn't work for me, so I posted here. Anyway, here's how I resolved it: When I ran an strace on the postgres startup line, I got this: open("pg_logical/replorigin_checkpoint", O_RDONLY) = 6 write(2, "2019-06-02 14:50:34.777 EDT [283"...,

Re: psql: FATAL: the database system is starting up

2019-06-02 Thread Adrian Klaver
On 6/2/19 11:14 AM, Tom K wrote: Nope. wal_level was set to replica, not logical.  Unless you mean What was the role of this cluster in the original setup? The cluster was the backend database for a number of applications.  The aim was to point applications to a single large cluster in

Re: psql: FATAL: the database system is starting up

2019-06-02 Thread Tom K
On Sun, Jun 2, 2019 at 11:47 AM Adrian Klaver wrote: > On 6/1/19 8:07 PM, Tom K wrote: > > > > > https://www.postgresql.org/docs/10/app-postgres.html > > Single-User Mode > > ... > > > > and see if that at least gets the server started. This is a highly > > restricted so do no

Re: psql: FATAL: the database system is starting up

2019-06-02 Thread Adrian Klaver
On 6/1/19 8:07 PM, Tom K wrote: https://www.postgresql.org/docs/10/app-postgres.html Single-User Mode ... and see if that at least gets the server started. This is highly restricted, so do not expect much usability. These servers did crash before; however, didn't notice a

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 8:53 PM Adrian Klaver wrote: > On 6/1/19 5:32 PM, Tom K wrote: > > > > > > > Trying what we did above but on the second node: > > > > Was this node the primary? > > To me the below looks like there are replication slots set up that are > failing. Not sure how to deal with t

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 5:32 PM, Tom K wrote: Trying what we did above but on the second node: Was this node the primary? To me the below looks like there are replication slots set up that are failing. Not sure how to deal with this at the moment. You might try single-user mode: https://www.postgres

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
That file just generates the postgres configs. Here is what is generated: -bash-4.2$ cat postgresql.conf # Do not edit this file manually! # It will be overwritten by Patroni! include 'postgresql.base.conf' cluster_name = 'postgres' hot_standby = 'on' listen_addresses = '192.168.0.124' max_conn

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 5:21 PM, Tom K wrote: On Sat, Jun 1, 2019 at 7:34 PM Adrian Klaver wrote: On 6/1/19 4:22 PM, Tom K wrote: > Looks like this crash was far more catastrophic than I thought. By the looks of things, thinkin

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 8:25 PM Tom K wrote: > So the best bet will be trying to get through this error then: > > [ PSQL02 ] > PANIC:replication check point has wrong magic 0 instead of 307747550 > > > > > On Sat, Jun 1, 2019 at 8:21 PM Tom K wrote: > >> >> >> On Sat, Jun 1, 2019 at 7:34 PM Adri

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
So the best bet will be trying to get through this error then: [ PSQL02 ] PANIC:replication check point has wrong magic 0 instead of 307747550 On Sat, Jun 1, 2019 at 8:21 PM Tom K wrote: > > > On Sat, Jun 1, 2019 at 7:34 PM Adrian Klaver > wrote: > >> On 6/1/19 4:22 PM, Tom K wrote: >> > >
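The expected value in the PANIC above, 307747550, is 0x1257DADE in hex, the magic number PostgreSQL writes at the start of pg_logical/replorigin_checkpoint. A minimal sketch for checking what the first four bytes of that file actually hold; the helper names are illustrative, and little-endian byte order (as on x86_64) is an assumption:

```python
import struct
from pathlib import Path

# 307747550 from the PANIC message, i.e. 0x1257DADE.
EXPECTED_MAGIC = 0x1257DADE


def read_checkpoint_magic(path):
    """Return the first 4 bytes of the file as an unsigned int, or None if the file is too short."""
    data = Path(path).read_bytes()[:4]
    if len(data) < 4:
        return None
    # Assumption: the file was written on a little-endian host.
    return struct.unpack("<I", data)[0]


def magic_is_valid(path):
    """True only if the file starts with the expected magic number."""
    return read_checkpoint_magic(path) == EXPECTED_MAGIC
```

A return value of 0 from read_checkpoint_magic would be consistent with the "wrong magic 0" in the message, meaning the file exists but begins with zero bytes. Run it against a copy of the file rather than the live data directory.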

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 7:34 PM Adrian Klaver wrote: > On 6/1/19 4:22 PM, Tom K wrote: > > Looks like this crash was far more catastrophic than I thought. By the looks of things, thinking on psql02 would be my best bet. > The more I look at it the more I think the replicat

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 4:22 PM, Tom K wrote: Looks like this crash was far more catastrophic than I thought. By the looks of things, thinking on psql02 would be my best bet. The more I look at it the more I think the replication was not doing what you thought it was doing. That psql02 was the pri

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 4:22 PM, Tom K wrote: I thought you said you had copied in data directories from the other nodes, did I remember correctly? Yep, you remembered correctly. I copied the files as they were, out to a temporary folder under root for each node but never dug into base/ etc an

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 7:12 PM Adrian Klaver wrote: > On 6/1/19 3:56 PM, Tom K wrote: > > > > > > > > > postgres=# select oid, datname from pg_database ; > >oid | datname > > ---+--- > > 13806 | postgres > > 1 | template1 > > 13805 | template0 > > (3 rows) > > > > So t

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 3:56 PM, Tom K wrote:
postgres=# select oid, datname from pg_database ;
  oid  |  datname
-------+-----------
 13806 | postgres
     1 | template1
 13805 | template0
(3 rows)
So there are only the system databases available
-bash-4.2$ cd /data/patroni/
-bash-4.2$ ls -altr

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 6:36 PM Adrian Klaver wrote: > On 6/1/19 3:14 PM, Tom K wrote: > > > > > > > ** Correction. There is postgres, template1 and template2 but none of > > the other databases we had. > > In a psql session do: > > select oid, datname from pg_database ; > > Then go to /data/patr

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 6:39 PM Adrian Klaver wrote: > On 6/1/19 3:14 PM, Tom K wrote: > > > > > ** Correction. There is postgres, template1 and template2 but none of > > the other databases we had. > > Just noticed, is that really template2 or is it actually template0? Apologies. You're right

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 3:14 PM, Tom K wrote: ** Correction.  There is postgres, template1 and template2 but none of the other databases we had. Just noticed, is that really template2 or is it actually template0? -- Adrian Klaver adrian.kla...@aklaver.com

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 3:14 PM, Tom K wrote: ** Correction. There is postgres, template1 and template2 but none of the other databases we had. In a psql session do: select oid, datname from pg_database ; Then go to /data/patroni and drill down to the base directory. In that directory there should be
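PostgreSQL keeps each database's files in a subdirectory of base/ named after that database's oid, so the pg_database output can be compared against a directory listing. A minimal sketch of that comparison; the helper name is illustrative, and the oid-to-name mapping is assumed to be typed in by hand from the query result:

```python
from pathlib import Path


def unlisted_database_dirs(base_dir, catalog):
    """Return oid-named subdirectories of base_dir that have no entry in catalog.

    catalog maps oid -> datname, as returned by
    "select oid, datname from pg_database".
    """
    known = {int(oid) for oid in catalog}
    extra = []
    for entry in Path(base_dir).iterdir():
        # Database directories have purely numeric names; skip anything else.
        if entry.is_dir() and entry.name.isdigit() and int(entry.name) not in known:
            extra.append(entry.name)
    return sorted(extra, key=int)
```

For example, with catalog = {13806: "postgres", 1: "template1", 13805: "template0"} and base_dir pointing at the base directory under the data directory (the exact path is an assumption here), any name returned is a directory on disk that the running server's catalog does not know about.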

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 6:05 PM Tom K wrote: > On Sat, Jun 1, 2019 at 5:51 PM Adrian Klaver wrote: >> On 6/1/19 2:31 PM, Tom K wrote: >>> Spoke too soon. There are no tables when it's started without the recovery.conf file. >> Were there any errors in the start

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 5:51 PM Adrian Klaver wrote: > On 6/1/19 2:31 PM, Tom K wrote: > > Spoke too soon. There are no tables when it's started without the recovery.conf file. > Were there any errors in the startup? Nothing I would discern as a startup error: [root@psql03

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 2:31 PM, Tom K wrote: Spoke too soon. There are no tables when it's started without the recovery.conf file. Were there any errors in the startup? Are there databases in the cluster, system (template1, postgres, etc.) or user? Did you start against the correct PG_DATA directo

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 2:30 PM, Tom K wrote: On Sat, Jun 1, 2019 at 4:52 PM Adrian Klaver > wrote: On 6/1/19 12:42 PM, Tom K wrote: > > > > Of note are the characters f2W below.  I see nothing in the postgres > source code to indicate this is

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 5:30 PM Tom K wrote: > > > On Sat, Jun 1, 2019 at 4:52 PM Adrian Klaver > wrote: > >> On 6/1/19 12:42 PM, Tom K wrote: >> > >> > >> >> > >> > Of note are the characters f2W below. I see nothing in the postgres >> > source code to indicate this is any recognizable postgres

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 4:52 PM Adrian Klaver wrote: > On 6/1/19 12:42 PM, Tom K wrote: > > > > > > > > > Of note are the characters f2W below. I see nothing in the postgres > > source code to indicate this is any recognizable postgres message. A > > part of me suspects that the postgres binarie

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 2:08 PM, Tom K wrote: Yep, cheap LAB hardware with no power redundancy ( yet ) . I don't suppose a pg_dump was done anytime recently? -- Adrian Klaver adrian.kla...@aklaver.com

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 4:11 PM Adrian Klaver wrote: > On 6/1/19 12:32 PM, Tom K wrote: > > > > > > > > > What if you move the recovery.conf file out? > > > > > > Will try. > > > > > > > > The below looks like missing/corrupted/incorrect files. Hard to tell > > without knowing what Pat

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 12:42 PM, Tom K wrote: Of note are the characters f2W below.  I see nothing in the postgres source code to indicate this is any recognizable postgres message.  A part of me suspects that the postgres binaries got corrupted.   Had this case occur with glib-common and a reinstall

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 3:32 PM Tom K wrote: > > > On Sat, Jun 1, 2019 at 9:55 AM Adrian Klaver > wrote: > >> On 5/31/19 7:53 PM, Tom K wrote: >> > >> >> > There are two places to connect with the Patroni community: on >> github, >> > via Issues and PRs, and on channel #patroni in the Pos

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 6/1/19 12:32 PM, Tom K wrote: What if you move the recovery.conf file out? Will try. The below looks like missing/corrupted/incorrect files. Hard to tell without knowing what Patroni did? Storage disappeared from underneath these clusters.  The OS was of course still

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Tom K
On Sat, Jun 1, 2019 at 9:55 AM Adrian Klaver wrote: > On 5/31/19 7:53 PM, Tom K wrote: > > > > > There are two places to connect with the Patroni community: on > github, > > via Issues and PRs, and on channel #patroni in the PostgreSQL Slack. > If > > you're using Patroni, or just int

Re: psql: FATAL: the database system is starting up

2019-06-01 Thread Adrian Klaver
On 5/31/19 7:53 PM, Tom K wrote: There are two places to connect with the Patroni community: on github, via Issues and PRs, and on channel #patroni in the PostgreSQL Slack. If you're using Patroni, or just interested, please join us. Will post there as well.  Thank you.  My thin

Re: psql: FATAL: the database system is starting up

2019-05-31 Thread Tom K
On Wed, May 29, 2019 at 10:28 AM Adrian Klaver wrote: > On 5/28/19 6:59 PM, Tom K wrote: > > > > > > On Tue, May 28, 2019 at 9:53 AM Adrian Klaver > > wrote: > > > > > > > Correct. Master election occurs through Patroni. WAL level is set to: > > > > wal_level

Re: psql: FATAL: the database system is starting up

2019-05-29 Thread Adrian Klaver
On 5/28/19 6:59 PM, Tom K wrote: On Tue, May 28, 2019 at 9:53 AM Adrian Klaver > wrote: Correct.  Master election occurs through Patroni.  WAL level is set to: wal_level = 'replica' So no archiving. > > After the most recent crash 2-3 weeks

Re: psql: FATAL: the database system is starting up

2019-05-28 Thread Tom K
On Tue, May 28, 2019 at 9:53 AM Adrian Klaver wrote: > On 5/27/19 9:59 PM, Tom K wrote: > > Hey guys, > > I'm running Patroni w/ PostgreSQL 10, ETCD, Haproxy and Keepalived on 3 RHEL 7.6 VMs. Every now and then the underlying storage crashes, taking out the cluster. On recovery

Re: psql: FATAL: the database system is starting up

2019-05-28 Thread Adrian Klaver
On 5/27/19 9:59 PM, Tom K wrote: Hey guys, I'm running Patroni w/ PostgreSQL 10, ETCD, Haproxy and Keepalived on 3 RHEL 7.6 VMs. Every now and then the underlying storage crashes, taking out the cluster. On recovery, PostgreSQL tends to come up while other databases just blow up. That is