Re: where does postgres take its timezone information from?

2019-11-13 Thread Tom Lane
Thomas Munro  writes:
> FreeBSD users already have the choice between zoneinfo from base or
> the misc/zoneinfo port if for some reason they want to control tzdata
> updates separately.  PostgreSQL and FreeBSD both track tzdata closely,
> and both pushed a commit for version 2019c into their stable branches
> within a couple of weeks of it being released, so I don't foresee any
> problem with this change, and staying in sync with libc seems to
> outweigh any other concerns IMHO.

Note that the normal situation, on a platform with a well-maintained
tzdata package, is that tzdata acquired via Postgres is going to lag
behind.  That's because we don't ship tzdata updates until our next
quarterly release, while the OS vendor probably has a much more
streamlined process for package updates.

In the case of 2019c, I pushed it into our code while it was still
pretty fresh, because it was just a few days to 12rc1 and I thought
the RC should contain the latest tzdata.  But usually we only bother
to sync from tzdata upstream when a quarterly release is impending.

In theory, it's possible that a tzdata update could break Postgres.
But it'd probably break a lot of other code too.  In practice, the
IANA people are well aware of that hazard, so there is a *long*
delay between when they introduce a new tzcode feature and when
they're willing to start relying on it in tzdata.  We try to stay
fairly current on our copy of tzcode, so that should only be a live
hazard for out-of-support Postgres branches.

(I was reminded of this just the other day when I had occasion
to run the 9.2 regression tests, and they failed because of a
no-longer-valid assumption about Venezuelan time.  I should
rebuild that legacy installation without --with-system-tzdata,
I guess, so that it uses the tzdata it's expecting.)

regards, tom lane




CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Moreno Andreo

Hi,
    I need to create a CASE (I think) statement to check for a string 
pattern and, based on its value, write a substring into a different column 
(alias).
I'm trying to create a COPY statement to port a table into another 
database, which has a table in a different format (that's why the aliases).


Let's write it in pseudoSQL:

given this

select pattern from tbl;
pattern
--
foo1234
bar5678
baz9012

That's what I'm trying to achieve

select
    pattern,
        CASE when pattern like 'foo%' then ltrim(pattern, 'foo') as foo
             when pattern like 'bar%' then ltrim(pattern, 'bar') as bar
             when pattern like 'baz%' then ltrim(pattern, 'baz') as baz
        END
from tbl;

 foo  | bar  | baz
------+------+------
 1234 |      |
      | 5678 |
      |      | 9012
(1234 goes in column foo, 5678 in bar, and 9012 in baz)


Is it possible?

Thanks in advance
Moreno.-





Re: CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Geoff Winkless
On Wed, 13 Nov 2019 at 16:24, Moreno Andreo  wrote:
> |foo   |bar  |baz |
>   1234
>  5678
>  9012
> (hoping text formatting is ok... 1234 should go in column foo, 5678 in
> bar and 9012 in baz)
>
> Is it possible?

Simplest way in plain SQL would be individual case statements for each
column, I think.

SELECT pattern,
  CASE WHEN pattern LIKE 'foo%' THEN SUBSTR(pattern, 4) ELSE '' END AS foo,
  CASE WHEN pattern LIKE 'bar%' THEN SUBSTR(pattern, 4) ELSE '' END AS bar,
  CASE WHEN pattern LIKE 'baz%' THEN SUBSTR(pattern, 4) ELSE '' END AS baz
FROM tbl;

Geoff
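
[Editor's note: the per-column CASE approach above can be exercised end to end against a scratch table. The sketch below uses the table name and sample values from the original question; note the commas required between the CASE expressions.]

```sql
-- Scratch table reproducing the example data from the thread.
CREATE TEMP TABLE tbl (pattern text);
INSERT INTO tbl VALUES ('foo1234'), ('bar5678'), ('baz9012');

-- One CASE expression per output column: the substring lands in the
-- column whose prefix matches; the other columns stay empty.
SELECT pattern,
       CASE WHEN pattern LIKE 'foo%' THEN substr(pattern, 4) ELSE '' END AS foo,
       CASE WHEN pattern LIKE 'bar%' THEN substr(pattern, 4) ELSE '' END AS bar,
       CASE WHEN pattern LIKE 'baz%' THEN substr(pattern, 4) ELSE '' END AS baz
FROM tbl;

-- Result (psql formatting approximate):
--  pattern | foo  | bar  | baz
-- ---------+------+------+------
--  foo1234 | 1234 |      |
--  bar5678 |      | 5678 |
--  baz9012 |      |      | 9012
```

`substr(pattern, 4)` works here only because all three prefixes happen to be three characters long; with prefixes of differing lengths, something like `substr(pattern, length('foo') + 1)` per branch would be safer.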




Re: CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Andrew Kerber
So what you are doing is transforming the table format from vertical to
horizontal.  I think you will want to use a union to join the table to
itself along with the case statement to produce the output you are looking
for.

On Wed, Nov 13, 2019 at 10:37 AM Geoff Winkless  wrote:

> On Wed, 13 Nov 2019 at 16:24, Moreno Andreo 
> wrote:
> > |foo   |bar  |baz |
> >   1234
> >  5678
> >  9012
> > (hoping text formatting is ok... 1234 should go in column foo, 5678 in
> > bar and 9012 in baz)
> >
> > Is it possible?
>
> Simplest way in plain SQL would be individual case statements for each
> column, I think.
>
> SELECT pattern,
>   CASE WHEN pattern LIKE 'foo%' THEN SUBSTR(pattern, 4) ELSE '' END AS foo,
>   CASE WHEN pattern LIKE 'bar%' THEN SUBSTR(pattern, 4) ELSE '' END AS bar,
>   CASE WHEN pattern LIKE 'baz%' THEN SUBSTR(pattern, 4) ELSE '' END AS baz
> FROM tbl;
>
> Geoff
>
>
>

-- 
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'


Re: CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Moreno Andreo

On 13/11/19 17:48, Andrew Kerber wrote:
So what you are doing is transforming the table format from vertical 
to horizontal.  I think you will want to use a union to join the table 
to itself along with the case statement to produce the output you are 
looking for.


Not precisely: the string pattern is only part of a bigger table (30 
columns in total). What I'm trying to achieve is what Geoff explained: 
just splitting the values into 3 different columns based on the string 
pattern.


Thanks for your time
Moreno.





Re: CASE(?) to write in a different column based on a string pattern

2019-11-13 Thread Moreno Andreo

On 13/11/19 17:36, Geoff Winkless wrote:


Simplest way in plain SQL would be individual case statements for each
column, I think.

SELECT pattern,
   CASE WHEN pattern LIKE 'foo%' THEN SUBSTR(pattern, 4) ELSE '' END AS foo,
   CASE WHEN pattern LIKE 'bar%' THEN SUBSTR(pattern, 4) ELSE '' END AS bar,
   CASE WHEN pattern LIKE 'baz%' THEN SUBSTR(pattern, 4) ELSE '' END AS baz
FROM tbl;

Geoff



Geoff,
    it worked perfectly!

Thanks a lot!
Moreno.-





terminated by signal 11: Segmentation fault

2019-11-13 Thread Gerrit Fouche
Hi All,

This is the second time I have gotten this error since PostgreSQL 12 was
officially released. My version:
PostgreSQL 12.0 (Ubuntu 12.0-2.pgdg18.04+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit

I am running a Warehouse Management system in a Live environment.
Postgresql was installed from http://apt.postgresql.org/pub/repos/apt
bionic-pgdg InRelease

When this error occurs, all other active processes are also terminated. Both
of these statements have been in production for years. The server's OS is a
new deployment on VMware. Can you please point me in the right direction to
solve it?

2019-11-13 11:18:31.296 SAST,,,5033,,5dc7a74d.13a9,75,,2019-11-10 07:59:41
SAST,,0,LOG,0,"server process (PID 17257) was terminated by signal 11:
Segmentation fault","Failed process was running: UPDATE stock_items SET
""parties_role_id"" = NULL, ""inventory_transaction_stock_id"" = NULL,
""stock_type_id"" = 3, ""inventory_quantity"" = 1, ""previous_location_id""
= 7638, ""updated_at"" = '2019-11-13 11:18:30.30', ""location_code"" =
'KROMCO', ""status_id"" = NULL, ""parties_role_name"" = NULL, ""destroyed""
= 't', ""inventory_transaction_id"" = 32504133, ""status_code"" =
'available', ""current_reference_id"" = NULL, ""inventory_reference"" =
'11460516', ""party_name"" = NULL, ""location_id"" = 5150,
""stock_type_code"" = 'BIN', ""created_on"" = '2019-05-21 14:11:12.793316'
WHERE ""id"" = 5197826"""
2019-11-13 11:18:31.296 SAST,,,5033,,5dc7a74d.13a9,76,,2019-11-10 07:59:41
SAST,,0,LOG,0,"terminating any other active server processes",""


/pglog/postgresql-2019-11-05_00.csv:2019-11-05 09:20:58.291
SAST,,,19626,,5da40a0c.4caa,3,,2019-10-14 07:39:24
SAST,,0,LOG,0,"server process (PID 12242) was terminated by signal 11:
Segmentation fault","Failed process was running: UPDATE bins SET
tipped_date_time='Tue Nov 05 07:29:17 UTC
2019',exit_reference_date_time='Tue Nov 05 07:29:17 UTC
2019',exit_ref='PRESORT_BIN_TIPPED',ps_tipped_lot_no='12755',updated_by =
'',affected_by_program = '',affected_by_env = '',affected_by_function
='',updated_at = '2019-11-05 09:10:05' WHERE (bin_number = '11375907')
"""


Last autovacuum time - what is it?

2019-11-13 Thread Durumdara
Hello!

I need to ask about the last_autovacuum column in *pg_stat_all_tables.*

I'm not sure whether this is the starting time or the time the daemon finished.

I caused an "accident" by changing many rows in a huge table.
I didn't mean to trigger autovacuum with this operation.

They reported slowdowns, but I'm not sure whether this caused the problem or
something else did.
The timestamp points to night time, but I don't know whether it marks the
start of the cleaning or when the daemon finished, so I don't know how to
calculate the daemon's working interval.

Thank you for your help!

Best regards
dd

last_autovacuum (timestamp with time zone): Last time at which this table was
vacuumed by the autovacuum daemon


Re: terminated by signal 11: Segmentation fault

2019-11-13 Thread Tom Lane
Gerrit Fouche  writes:
> This is the second time I get this error since Postgresql 12 was officially
> released. My version:
> PostgreSQL 12.0 (Ubuntu 12.0-2.pgdg18.04+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit

Given that this failed in an UPDATE, I suspect it's a known problem:
if you have a BEFORE UPDATE row-level trigger, and a concurrent update
on the same row happens, 12.0 can dump core due to a slot memory
management mistake.  There's a fix in 12.1, due out tomorrow.

regards, tom lane
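
[Editor's note: to check whether a given database contains triggers matching Tom's description, the catalog can be queried directly. This is a sketch; the tgtype bit values below are taken from PostgreSQL's pg_trigger catalog definitions.]

```sql
-- List user-defined row-level BEFORE triggers that fire on UPDATE,
-- i.e. the combination that could hit this 12.0 crash under
-- concurrent updates of the same row.
-- tgtype bits: 1 = FOR EACH ROW, 2 = BEFORE, 16 = fires on UPDATE.
SELECT tgrelid::regclass AS table_name,
       tgname            AS trigger_name
FROM pg_trigger
WHERE NOT tgisinternal
  AND (tgtype & 1)  <> 0   -- row-level
  AND (tgtype & 2)  <> 0   -- BEFORE
  AND (tgtype & 16) <> 0;  -- UPDATE
```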




Re: Last autovacuum time - what is it?

2019-11-13 Thread Tom Lane
Durumdara  writes:
> I need to ask about the last_autovacuum column in *pg_stat_all_tables.*
> I'm not sure whether this is a starting time, or when the daemon finished.

Looks like it's the time that the vacuum operation finished.

regards, tom lane
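
[Editor's note: since the view records only this single finish timestamp, the start time and duration of a run can't be derived from it directly. One way to recover them, a sketch using the standard autovacuum logging parameter, is to have autovacuum log every run; each log entry reports the elapsed time.]

```sql
-- Log every autovacuum/autoanalyze action (0 = no minimum duration).
-- Requires superuser; takes effect on configuration reload.
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();
```

With that in place, the server log shows one entry per completed autovacuum, including how long it took, which together with last_autovacuum brackets the daemon's working interval.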




`pg_ls_dir` can query some directories, but not others

2019-11-13 Thread Brennan Vincent
Copying here a question I asked on StackOverflow:
https://stackoverflow.com/questions/58846076

===

On my system, `/home` and `/etc` have exactly the same permissions:

```
$ ls -ld /home /etc
drwxr-xr-x 67 root root 4096 Nov 13 15:59 /etc
drwxr-xr-x  3 root root 4096 Oct 18 13:45 /home
```

However, Postgres can read one, but not the other:

```
test=# select count(*) from (select pg_ls_dir('/etc')) a;
count 
---
  149
(1 row)

test=# select count(*) from (select pg_ls_dir('/home')) a;
ERROR:  could not open directory "/home": Permission denied
```

Even though the user the DB is running as can, in fact, run `ls /home`:
```
$ sudo -u postgres ls /home > /dev/null && echo "ls succeeded"
ls succeeded
```

What is going on?

My postgres version is 11.5, running on Arch Linux.






Re: `pg_ls_dir` can query some directories, but not others

2019-11-13 Thread Adrian Klaver

On 11/13/19 2:32 PM, Brennan Vincent wrote:

Copying here a question I asked on StackOverflow:
https://stackoverflow.com/questions/58846076

===

On my system, `/home` and `/etc` have exactly the same permissions:

```
$ ls -ld /home /etc
drwxr-xr-x 67 root root 4096 Nov 13 15:59 /etc
drwxr-xr-x  3 root root 4096 Oct 18 13:45 /home
```

However, Postgres can read one, but not the other:

```
test=# select count(*) from (select pg_ls_dir('/etc')) a;
count
---
   149
(1 row)

test=# select count(*) from (select pg_ls_dir('/home')) a;
ERROR:  could not open directory "/home": Permission denied
```

Even though the user the DB is running as can, in fact, run `ls /home`:
```
$ sudo -u postgres ls /home > /dev/null && echo "ls succeeded"
ls succeeded
```

What is going on?


Works here (Postgres 11.5, openSUSE Leap 15):

drwxr-xr-x 149 root root  12288 Nov 13 15:24 etc/
drwxr-xr-x   4 root root   4096 Jun  7  2018 home/

production_(postgres)# select count(*) from (select pg_ls_dir('/etc')) a;
 count
---
   339
(1 row)

production_(postgres)# select count(*) from (select pg_ls_dir('/home')) a;
 count
---
 2
(1 row)

SELinux (or equivalent) in play?




My postgres version is 11.5, running on Arch Linux.







--
Adrian Klaver
adrian.kla...@aklaver.com




ERROR: COPY escape must be a single one-byte character (multi-delimiter appears to work on Postgres 9.0 but does not on Postgres 9.2)

2019-11-13 Thread Brandon Ragland
Hello,

I have a Talend enterprise job that loads data into a PostgreSQL database
via the COPY command. When migrating to a new server this command fails
with the following error message: org.postgresql.util.PSQLException:ERROR:
COPY escape must be a single one-byte character

The thing is, I looked over the documentation for both Postgres 9.0 and
9.2. Both say that multi-byte delimiters are not allowed, so I'm very
confused about why this job works perfectly on Postgres 9.0 but not on 9.2.

I am unable to edit this Talend job, as it's very old and we no longer have
the source code for it. I cannot see what the actual delimiter is, nor
exactly how the COPY command is being run, such as whether it pushes
directly to the server via the Postgres driver, or creates a temporary CSV
file somewhere and then loads the data into the server. I believe we set up
multi-byte delimiters because of the various special characters in a few
of the columns of multiple tables.

I am not aware of any edits to the source code of the old 9.0 Postgres
server.

The reason we are migrating servers is the end of life of CentOS 5.
The new server runs CentOS 7. I believe that both servers are using the
default Postgres versions from the default CentOS repositories. I
know for sure that the CentOS 7 server is running the default
Postgres version, as I installed it myself through yum.

Any help would be greatly appreciated.

Also, is there a way to copy the old Postgres server, dependencies, and
executables to our new server, in case the source was modified?

Brandon Ragland
Software Engineer
BREAKFRONT SOFTWARE
Office: 704.688.4085 | Mobile: 240.608.9701 | Fax: 704.973.0607


Re: ERROR: COPY escape must be a single one-byte character (multi-delimiter appears to work on Postgres 9.0 but does not on Postgres 9.2)

2019-11-13 Thread Adrian Klaver

On 11/13/19 4:40 PM, Brandon Ragland wrote:

Hello,

I have a Talend enterprise job that loads data into a PostgreSQL 
database via the COPY command. When migrating to a new server this 
command fails with the following error message: 
org.postgresql.util.PSQLException:ERROR: COPY escape must be a single 
one-byte character


Does the Postgres log offer any more information?

Can you crank up the error level to get more info?

Are the encodings for the Postgres server/OS different ?




The thing is, I looked over the documentation for both Postgres 9.0 and 
9.2. Both documentations say that multi-byte delimiters are not allowed. 
So I'm very confused on why this job works perfectly on Postgres 9.0 but 
not on 9.2.


I am unable to edit this Talend job, as it's very old and we do not have 
the source code for the job anymore. I am unable to see what the actual 
delimiter is. I am also unable to see exactly how the COPY command is 
being run, such as whether it's pushing directly to the server via the 
Postgres driver, or if it's created a temporary CSV file somewhere and 
then loading the data into the server. I believe the reason we have 
multi byte delimiters setup is due to the use of various special 
characters in a few of the columns for multiple tables.


I am not aware of any edits to the source code of the old 9.0 Postgres 
server.


The reason we are migrating servers is due to the end of life for CentOS 
5. The new server runs CentOS 7. I believe that both servers are using 
the default Postgres versions that come in the default CentOS 
repositories. I know for sure that the CentOS 7 server is indeed running 
the default Postgres version, as I installed it myself through yum.


Any help would be greatly appreciated.

Also, is there a way to copy the old Postgres server, dependencies, and 
executables to our new server, in case the source was modified?


Brandon Ragland
Software Engineer
BREAKFRONT SOFTWARE
Office: 704.688.4085 | Mobile: 240.608.9701 | Fax: 704.973.0607



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: `pg_ls_dir` can query some directories, but not others

2019-11-13 Thread Brennan Vincent



> On Nov 13, 2019, at 6:33 PM, Adrian Klaver  wrote:
> 
> On 11/13/19 2:32 PM, Brennan Vincent wrote:
>> Copying here a question I asked on StackOverflow:
>> https://stackoverflow.com/questions/58846076
>> ===
>> On my system, `/home` and `/etc` have exactly the same permissions:
>> ```
>> $ ls -ld /home /etc
>> drwxr-xr-x 67 root root 4096 Nov 13 15:59 /etc
>> drwxr-xr-x  3 root root 4096 Oct 18 13:45 /home
>> ```
>> However, Postgres can read one, but not the other:
>> ```
>> test=# select count(*) from (select pg_ls_dir('/etc')) a;
>> count
>> ---
>>   149
>> (1 row)
>> test=# select count(*) from (select pg_ls_dir('/home')) a;
>> ERROR:  could not open directory "/home": Permission denied
>> ```
>> Even though the user the DB is running as can, in fact, run `ls /home`:
>> ```
>> $ sudo -u postgres ls /home > /dev/null && echo "ls succeeded"
>> ls succeeded
>> ```
>> What is going on?
> 
> Works here(Postgres 11.5, openSuSE Leap 15):
> 
> drwxr-xr-x 149 root root  12288 Nov 13 15:24 etc/
> drwxr-xr-x   4 root root   4096 Jun  7  2018 home/
> 
> production_(postgres)# select count(*) from (select pg_ls_dir('/etc')) a;
> count
> ---
>   339
> (1 row)
> 
> production_(postgres)# select count(*) from (select pg_ls_dir('/home')) a;
> count
> ---
> 2
> (1 row)
> 
> SELinux (or equivalent) in play?
> 
> 
>> My postgres version is 11.5, running on Arch Linux.
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com

Mystery solved: Arch's bundled systemd service file for postgresql sets 
`ProtectHome=true`, which runs the daemon in a file system namespace that 
blocks access to /home.
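
[Editor's note: for anyone who actually needs pg_ls_dir() or other server-side file access under /home on such a system, the setting can be relaxed with a systemd drop-in rather than by editing the packaged unit file. A sketch; the unit name postgresql.service is the Arch default and may differ on other distributions.]

```ini
# Created with: systemctl edit postgresql.service
# (written to /etc/systemd/system/postgresql.service.d/override.conf)
[Service]
ProtectHome=read-only
```

followed by a restart of the service. `ProtectHome=read-only` keeps the hardening for writes while allowing reads; `ProtectHome=false` removes the restriction entirely.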




Re: WAL shipping replica failed to recover database with error: invalid contrecord length 1956 at FED/38FFE208

2019-11-13 Thread Kyotaro Horiguchi
Hello.

At Wed, 2 Oct 2019 19:24:02 -0400, Stephen Frost  wrote in 
> Greetings,
> 
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Stephen Frost  writes:
> > > * Aleš Zelený (zeleny.a...@gmail.com) wrote:
> > >> But recovery on replica failed to proceed WAL file
> > >> 00010FED0039  with log message: " invalid contrecord length
> > >> 1956 at FED/38FFE208".
> > 
> > > Err- you've drawn the wrong conclusion from that message (and you're
> > > certainly not alone- it's a terrible message and we should really have a
> > > HINT there or something).
> > 
> > Yeah, those messages are all pretty ancient, from when WAL was new and not
> > to be trusted much.  Perhaps the thing to do is move the existing info
> > into DETAIL and make the primary message be something like "reached
> > apparent end of WAL stream".
> 
> Yes, +1 on that.

The attached is something like that. AFAICS we can assume that an
invalid record means the end of WAL as long as we are fetching
successive records and XLogPageRead() has not logged anything.

As the current comment in ReadRecord says, we don't have a message if
standby has been triggered, but that's not always the case. Still, we
may have a message if a new segment is available after triggering.

I used a kind-of-tricky way to handle the optional errdetail, but it may
be better to have some additional feature as an ereport
subfunction. (Maybe named "errif" or something?)

I think it works fine as expected but I find one arguable behavior.

In the first patch, the "end of WAL" message is accompanied by source,
LSN and timeline.

LOG:  reached end of WAL in archive at 0/A360 on timeline 8
DETAIL:  invalid record length at 0/A360: wanted 24, got 0

The last two seem fine, but the first one shows "archive" even on
promotion. It is right that we scan the archive after promotion, but it
seems confusing. So I attached another patch that shows the operation
instead of the WAL source.

LOG:  reached end of WAL during streaming replication at 0/A560 on 
timeline 8
DETAIL:  invalid record length at 0/A560: wanted 24, got 0

What do you think about this?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2bd3d0e5e5..70fd34659a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -852,6 +852,9 @@ static bool bgwriterLaunched = false;
 static int	MyLockNo = 0;
 static bool holdingAllLocks = false;
 
+/* Have we complained about the record at the location? */
+static XLogRecPtr lastComplaint = 0;
+
 #ifdef WAL_DEBUG
 static MemoryContext walDebugCxt = NULL;
 #endif
@@ -889,6 +892,7 @@ static int	XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
 static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		bool fetching_ckpt, XLogRecPtr tliRecPtr);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
+static bool have_complained_at(XLogRecPtr RecPtr);
 static void XLogFileClose(void);
 static void PreallocXlogFiles(XLogRecPtr endptr);
 static void RemoveTempXlogFiles(void);
@@ -4264,8 +4268,11 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 		record = XLogReadRecord(xlogreader, RecPtr, &errormsg);
 		ReadRecPtr = xlogreader->ReadRecPtr;
 		EndRecPtr = xlogreader->EndRecPtr;
+
 		if (record == NULL)
 		{
+			XLogRecPtr reportptr = RecPtr ? RecPtr : EndRecPtr;
+
 			if (readFile >= 0)
 			{
 close(readFile);
@@ -4273,13 +4280,24 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 			}
 
 			/*
-			 * We only end up here without a message when XLogPageRead()
-			 * failed - in that case we already logged something.
+			 * When we end up here while reading successive records, we reached
+			 * the end of WAL unless XLogPageRead() has logged something. We
+			 * may or may not have a message here depending on the situation.
+			 * Otherwise we just report the received message if any and if
+			 * needed.
+			 *
+			 * Note: errormsg is already translated.
 			 */
-			if (errormsg)
-ereport(emode_for_corrupt_record(emode,
- RecPtr ? RecPtr : EndRecPtr),
-		(errmsg_internal("%s", errormsg) /* already translated */ ));
+			if (RecPtr == InvalidXLogRecPtr && !have_complained_at(EndRecPtr))
+ereport(emode,
+		(errmsg("reached end of WAL in %s at %X/%X on timeline %u",
+xlogSourceNames[currentSource],
+(uint32) (reportptr >> 32), (uint32) reportptr,
+ThisTimeLineID),
+		 (errormsg ? errdetail_internal("%s", errormsg) : 0)));
+			else if (errormsg)
+ereport(emode_for_corrupt_record(emode, reportptr),
+		(errmsg_internal("%s", errormsg)));
 		}
 
 		/*
@@ -4331,7 +4349,7 @@ ReadRecord(XLogReaderState *xlogreader, XLogRecPtr RecPtr, int emode,
 !fetching_ckpt)
 			{
 ereport(DEBUG1,
-		(errmsg_internal("reached end of WAL in pg_wal, entering archive recovery")));
+		(errmsg_internal

root page 3 of index "pg_class_oid_index" has level 0, expected 1

2019-11-13 Thread Abhijit Gharami
Hi,

During a major version upgrade from PostgreSQL v9.4 to v9.6 I am facing the
following error:

Performing Consistency Checks
-
Checking cluster versions   ok

connection to database failed: FATAL:  root page 3 of index
"pg_class_oid_index" has level 0, expected 1


could not connect to old postmaster started with the command:
"/postgresql94/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D
"/data/directory/postgresql" -o "-p 5432 -b  -c listen_addresses='' -c
unix_socket_permissions=0700 -c
unix_socket_directories='/home/postgresql_user'" start
Failure, exiting


Also, I tried to take a pg_dumpall of the same database
cluster (PostgreSQL v9.4), and it failed with this error:
pg_dump: [archiver (db)] connection to database "template1" failed: FATAL:
 root page 3 of index "pg_class_oid_index" has level 0, expected 1
pg_dumpall: pg_dump failed on database "template1", exiting

Please suggest how should I proceed with major version upgrade in this case.

Regards,
Abhijit


Re: root page 3 of index "pg_class_oid_index" has level 0, expected 1

2019-11-13 Thread Laurenz Albe
Abhijit Gharami wrote:
> During major version upgrade from PostgreSQL v9.4 to v9.6 I facing following 
> error:
> 
> Performing Consistency Checks
> -
> Checking cluster versions   ok
> 
> connection to database failed: FATAL:  root page 3 of index 
> "pg_class_oid_index" has level 0, expected 1
> 
> 
> could not connect to old postmaster started with the command:
> "/postgresql94/bin/pg_ctl" -w -l "pg_upgrade_server.log" -D 
> "/data/directory/postgresql" -o "-p 5432 -b  -c listen_addresses='' -c 
> unix_socket_permissions=0700 -c
> unix_socket_directories='/home/postgresql_user'" start
> Failure, exiting
> 
> 
> Again, I was trying to take the pg_dumpall of the same database 
> cluster(PostgreSQL v9.4) and it failed with error:
> pg_dump: [archiver (db)] connection to database "template1" failed: FATAL:  
> root page 3 of index "pg_class_oid_index" has level 0, expected 1
> pg_dumpall: pg_dump failed on database "template1", exiting
> 
> Please suggest how should I proceed with major version upgrade in this case.

You are experiencing catalog corruption.

Try stopping the cluster and starting it with

  postgres -P -D /path/to/data/dir

Then connect and run

  REINDEX TABLE pg_catalog.pg_class;

That should fix the corrupted index.

If you have data corruption, you shouldn't use pg_upgrade to upgrade.
pg_dumpall / psql is the way to go.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: terminated by signal 11: Segmentation fault

2019-11-13 Thread Gerrit Fouche
Hi Tom,
Thank you, we have that same scenario.
Regards
Gerrit

On Wed, 13 Nov 2019, 20:57 Tom Lane,  wrote:

> Gerrit Fouche  writes:
> > This is the second time I get this error since Postgresql 12 was
> officially
> > released. My version:
> > PostgreSQL 12.0 (Ubuntu 12.0-2.pgdg18.04+1) on x86_64-pc-linux-gnu,
> > compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit
>
> Given that this failed in an UPDATE, I suspect it's a known problem:
> if you have a BEFORE UPDATE row-level trigger, and a concurrent update
> on the same row happens, 12.0 can dump core due to a slot memory
> management mistake.  There's a fix in 12.1, due out tomorrow.
>
> regards, tom lane
>


Re: root page 3 of index "pg_class_oid_index" has level 0, expected 1

2019-11-13 Thread Michael Paquier
On Thu, Nov 14, 2019 at 08:26:48AM +0100, Laurenz Albe wrote:
> If you have data corruption, you shouldn't use pg_upgrade to upgrade.
> pg_dumpall / psql is the way to go.

Please refer to this wiki page:
https://wiki.postgresql.org/wiki/Corruption

If you have a cluster in such a state, you risk having corruption that
has spread, and you should not reuse this data folder as-is.  You can
think of this REINDEX as a short-term measure to retrieve as much of
your data as you can so that you can dump it, and then restore it into
a safer location.  For corruption of system indexes, you also have the
option of using ignore_system_indexes.
--
Michael

