Hello,
 
Does anyone else have any insight or information around this issue?  I can't 
find anything out there on the web.
 
If it's a simple incompatibility then fine, but I'd still like to understand 
why the indexes on the secondary node become corrupt when the data itself 
seems to be replicated without issue.
 
Thanks,
 
James  

________________________________
 From: James Cowell <jcow...@btinternet.com>
To: 'Jeff Janes' <jeff.ja...@gmail.com> 
Cc: pgsql-general@postgresql.org 
Sent: Wednesday, 5 December 2012, 18:56
Subject: Re: [GENERAL] Corrupt indexes on slave when using pg_bulkload on master
 

Hi Jeff,

Thanks for your reply.

I can understand that it would be incompatible if you bypassed the WAL logs,
but I've got it configured to use the buffered loader so that WAL logs are
written.  The data replicates fine, the indexes don't, yet a reindex on the
primary fixes the index on the secondary.  I'd like to understand why if
possible.
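
For reference, my control file looks roughly like this (a sketch -- the table
name and file path are placeholders, and ON_DUPLICATE_KEEP is how I have it
overwriting rows on a unique-key conflict):

```
# pg_bulkload control file (sketch; names/paths are placeholders)
TABLE = public.my_table          # target table
INFILE = /data/hourly_load.csv   # hourly CSV input
TYPE = CSV
DELIMITER = ","
WRITER = BUFFERED                # load through shared buffers so WAL is written
ON_DUPLICATE_KEEP = NEW          # replace existing rows on constraint violation
```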

I've seen mention of bug(s) around corrupt indexes on the secondary node,
there was a fix in 9.1.6, but it doesn't appear to be the same problem I'm
having. 

I've also seen other people with similar errors that aren't using
pg_bulkload, for example:
http://postgresql.1045698.n5.nabble.com/BUG-7562-could-not-read-block-0-in-file-quot-base-16385-16585-quot-read-only-0-of-8192-bytes-td5724738.html

Cheers,

James

-----Original Message-----
From: Jeff Janes [mailto:jeff.ja...@gmail.com] 
Sent: 05 December 2012 18:02
To: James Cowell
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Corrupt indexes on slave when using pg_bulkload on
master

On Wed, Dec 5, 2012 at 5:17 AM, James Cowell <jcow...@btinternet.com> wrote:
> I'm using pg_bulkload to load large amounts of CSV data into a 
> postgres database hourly.
>
> This database is replicated to a second node.

pg_bulkload is fundamentally incompatible with PITR, streaming, and forms of
replication that depend on those things.

Cheers,

Jeff



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


I'm using pg_bulkload to load large amounts of CSV data into a Postgres
database hourly.

This database is replicated to a second node.

Whenever a bulk load happens, the indexes on the updated tables on the
secondary node become corrupt and are unusable until a reindex is run on the
primary node.  I get the error below on node 2:

ERROR: index "tablename" contains unexpected zero page at block 0
SQL state: XX002
Hint: Please REINDEX it.

I'm assuming this is because of the way pg_bulkload builds the index on the
primary, and possibly has something to do with the way pg_bulkload overwrites
rows in the event of a constraint violation.  But if something works on the
primary, shouldn't the replicated node be able to replay the WAL?

I've tried this on 9.1.6 and 9.1.5 on RHEL 6.3 with pg_bulkload build
3.1.1-1.pg91.rhel6, and it happens every time.

Does anyone have any experience in this area or advice they could give?  If
you can point out something stupid I'm doing, that would be very welcome :)
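
The workaround I'm currently using is below (table and index names are
placeholders).  Running it on the primary fixes the standby as well, which I
take to mean the rebuild itself is WAL-logged and replayed there:

```sql
-- Run on the primary; the standby picks up the rebuilt index via WAL replay.
REINDEX TABLE public.my_table;

-- Or rebuild just the one broken index:
REINDEX INDEX public.my_table_pkey;
```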
Thanks,
James
