[GENERAL] Data corruption

2014-11-13 Thread Russell Keane
We appear to have had some corruption on a customer's postgres cluster. They are on 9.0.17 32bit Windows Server 2003 - Service pack 2 Intel Xeon 2.66GHZ 4GB Memory Raid is setup but doesn't look good - just now showing status of Degraded!! The RAID doesn't look too well currently has status De

Re: [GENERAL] DATA corruption after promoting slave to master

2014-11-07 Thread Kirit Parmar
Hey Shaun, Thanks for quick reply We have not changed the default_statistics_target, so it shall remain to its default value 100. I would like to share our observation on this if you can infer anything from it. : It started using query plan 1 after a our scheduled vacuum run and it continued to

Re: [GENERAL] DATA corruption after promoting slave to master

2014-11-07 Thread Shaun Thomas
Krit, It sounds like you might be hitting edge-case statistics problems on recent data. We've had issues with this before, ourselves. I'd personally recommend increasing your default_statistics_target to 400 for your entire database in general. But it's possible that won't help this. Recent dat

Re: [GENERAL] DATA corruption after promoting slave to master

2014-11-06 Thread Kirit Parmar
Hey Shaun, Hope you are keeping well. Will you please help me with this un-common behaviour of postgres. When executing the query (1) it takes around 62 seconds. However when disabling the index scan the same query (2) executes in around 2 seconds. Is there any reason why the query planner prefers

Re: [GENERAL] DATA corruption after promoting slave to master

2014-11-06 Thread Shaun Thomas
Hi Krit, It looks like your actual problem is here: >  Index Scan using t1_orderid_creationtime_idx on t1 >  (cost=0.43..1181104.36 rows=9879754 width=158) > (actual time=0.021..60830.724 rows=2416614 loops=1 This index scan estimates 9.8M rows, and had to touch 2.4M. The issue is that your LI

Re: [GENERAL] DATA corruption after promoting slave to master

2014-06-26 Thread Karthik Iyer
A full dump and restore would definitely help. I tend not to suggest that often because I work with very large databases that are usually extremely cumbersome to dump and restore. But yeah, if you can get a successful pg_dump from your database, a restore should obviously clean up all of you

Re: [GENERAL] DATA corruption after promoting slave to master

2014-06-26 Thread Karthik Iyer
Hi Shaun, We reindexed all the primary and unique keys of all the tables, But we did not reindex the tables. You think we should do that also ? Yes, you need to reindex. Part of the problem with this kind of table corruption, is that PostgreSQL has applied data and index page modification

Re: [GENERAL] DATA corruption after promoting slave to master

2014-06-26 Thread Karthik Iyer
Thanks Shaun. We reindexed all the primary and unique keys of all the tables, But we did not reindex the tables. You think we should do that also ? Also, do you think we should do a clean dump restore to eliminate all data inconsistencies. One more query : We managed to get the old server

Re: [GENERAL] DATA corruption after promoting slave to master

2014-06-26 Thread Shaun Thomas
On 06/26/2014 10:34 AM, Karthik Iyer wrote: Any inputs here? You think a pgdump and restore would help more ? A full dump and restore would definitely help. I tend not to suggest that often because I work with very large databases that are usually extremely cumbersome to dump and restore.

Re: [GENERAL] DATA corruption after promoting slave to master

2014-06-26 Thread Shaun Thomas
On 06/26/2014 09:44 AM, Karthik Iyer wrote: We reindexed all the primary and unique keys of all the tables, But we did not reindex the tables. You think we should do that also ? Yes, you need to reindex. Part of the problem with this kind of table corruption, is that PostgreSQL has applied d

Re: [GENERAL] DATA corruption after promoting slave to master

2014-06-25 Thread Shaun Thomas
On 06/25/2014 06:29 AM, Karthik Iyer wrote: [2] We also have a daemon process which copies the latest partial WAL log file (which postgres is currently writing to, under pg_xlog/) every 3 secs to a different location. No. No, no, no, no no. No. Also, no. Partial WAL files are not valid for r

[GENERAL] DATA corruption after promoting slave to master

2014-06-25 Thread Karthik Iyer
Hello, We are facing issues of some DB inconsistencies, while promoting the slave to master. [1] We have master-slave with Hot Standby Setup (log shipping). [2] We also have a daemon process which copies the latest partial WAL log file (which postgres is currently writing to, under pg_xlog

[GENERAL] data corruption when using base backups generated from hot standby

2013-01-10 Thread Lonni J Friedman
Greetings, I'm running postgres-9.2.2 in a Linux-x86_64 cluster with 1 master and several hot standby servers. Since upgrading to 9.2.2 from 9.1.x a few months ago, I switched from generating a base backup on the master, to generating it on a dedicated slave/standby (to reduce the load on the mast

Re: [GENERAL] Data corruption (8.2.5 Windows XP)

2009-06-30 Thread Marc Schablewski
regis.boum...@steria.com wrote: > Is there a reason for this? > Is there a way to "repair" the database? > > I guess your primary key index got damaged. You could try to REINDEX the table or DROP and recreate the index manually. Perhaps you should do a backup of your data directory before reinde

Re: [GENERAL] Data corruption (8.2.5 Windows XP)

2009-06-30 Thread Alan Hodgson
On Tuesday 30 June 2009, regis.boum...@steria.com wrote: > SELECT * FROM t_table t WHERE t.id=1; => no result > > Is there a reason for this? > Is there a way to "repair" the database? > reindex -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subsc

[GENERAL] Data corruption (8.2.5 Windows XP)

2009-06-30 Thread regis . boumera
After a crash, some data disappeared in some tables. Furthermore, there is a table with an INTEGER as primary key. One of its row is corrupted. All global requests (no WHERE clause) succeed, but each request with a WHERE clause on the primary acts like this values does not exist. Example : SELEC

Re: [GENERAL] Data corruption problem...

2006-12-02 Thread Ragnar
I haven't seen any replies to this, so I will, although I am not a j2ee/jdbc person. On fim, 2006-11-30 at 14:14 -0700, Nathan Wilhelmi wrote: > Server: 8.0.3 on Solaris 9 > JDBC Driver: 8.0.311 ... > delete from X where id in (select bad_id from Z where name='qwerty'); > delete from Y where id i

[GENERAL] Data corruption problem...

2006-12-01 Thread Nathan Wilhelmi
Hello - Ran into an interesting problem last week with data unexpectedly being deleted. Has anyone else experienced this and have any suggestions on how to prevent this in the future? Here is the environment: Server: 8.0.3 on Solaris 9 JDBC Driver: 8.0.311 This is the series of statements tha

Re: [GENERAL] Data corruption

2006-11-15 Thread Ardian Xharra
nesday, November 15, 2006 11:38 AM Subject: Re: [GENERAL] Data corruption Do you have antivirus software installed on the server? I seem to recall A/V software being a common source of "permission denied" errors when running Postgresql on Windows. Regards, Shelby Cain - Origina

Re: [GENERAL] Data corruption

2006-11-15 Thread Shelby Cain
ff Davis <[EMAIL PROTECTED]> Cc: postgreSQL postgreSQL Sent: Tuesday, November 14, 2006 1:36:48 PM Subject: Re: [GENERAL] Data corruption I'm running under Windows. PostgreSQL runs through user postgres created during the installat

Re: [GENERAL] Data corruption

2006-11-14 Thread Ardian Xharra
- Original Message - From: "Jeff Davis" <[EMAIL PROTECTED]> To: "Ardian Xharra" <[EMAIL PROTECTED]> Cc: "postgreSQL postgreSQL" Sent: Tuesday, November 14, 2006 2:12 PM Subject: Re: [GENERAL] Data corruption On Tue, 2006-11-14 at 13:34 -05

Re: [GENERAL] Data corruption

2006-11-14 Thread Jeff Davis
On Tue, 2006-11-14 at 13:34 -0500, Ardian Xharra wrote: > Hi all, > > We have a database installed on Intel Xeon (Dell) and running on > postgreSQL 8.1. > And the database couldn't start. Here is the logfile of what happend: > > 2006-11-14 00:28:38 PANIC: could not write to log file 6, segment

[GENERAL] Data corruption

2006-11-14 Thread Ardian Xharra
Hi all,   We have a database installed on Intel Xeon (Dell) and running on postgreSQL 8.1. And the database couldn't start. Here is the logfile of what happend:   2006-11-14 00:28:38 PANIC:  could not write to log file 6, segment 239 at offset 6430720, length 16384: Permission denied   2006-11

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-09 Thread Tom Lane
Noel Faux <[EMAIL PROTECTED]> writes: > To clarify, when set on, every time it hits this error, postgres will > rezero that block? It'll only "re" zero if the page gets dropped from shared memory without there having been any occasion to write it out. Otherwise, the first write will clobber the

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-09 Thread Noel Faux
To clarify, when set on, every time it hits this error, postgres will rezero that block? Michael Fuhr wrote: On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote: Given that this seems problem has occurred a number of times for a number I've written a small step by step proc

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-09 Thread Michael Fuhr
On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote: > Given that this seems problem has occurred a number of times for a > number I've written a small step by step procedure to address this > issue. Is there any other comments you which to add. I was thinking > that this should be added

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Michael Fuhr
On Thu, Mar 09, 2006 at 03:57:46PM +1100, Noel Faux wrote: > Given that this seems problem has occurred a number of times for a > number I've written a small step by step procedure to address this > issue. Is there any other comments you which to add. I was thinking > that this should be added

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Noel Faux
Given that this seems problem has occurred a number of times for a number I've written a small step by step procedure to address this issue.  Is there any other comments you which to add.  I was thinking that this should be added to the FAQ / troubleshooting in the docs. How to repair corrupte

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Michael Fuhr
On Thu, Mar 09, 2006 at 12:37:52PM +1100, Noel Faux wrote: > I've been watching the post: Re: [GENERAL] Fixing up a corrupted toast table > In there they mention deletion of the bad rows from the table based on > the citid. If I could come up with a def of a back row, would this > work, or are t

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Michael Fuhr
On Thu, Mar 09, 2006 at 12:29:17PM +1100, Noel Faux wrote: > Thanks for all your help Michael, we wish to do a vacuum and dump before > the upgrade to 8.02. 8.0.7 and 8.1.3 are the latest versions in their respective branches; those are the versions to run to get the latest bug fixes. > Do you b

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Noel Faux
I've been watching the post: Re: [GENERAL] Fixing up a corrupted toast table In there they mention deletion of the bad rows from the table based on the citid.  If I could come up with a def of a back row, would this work, or are there other issues? Cheers Noel Michael Fuhr wrote: On Thu, M

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Noel Faux
Thanks for all your help Michael, we wish to do a vacuum and dump before the upgrade to 8.02.  Do you believe this data corruption is a postgres issue of an OS / hardware issue? Cheers Noel Michael Fuhr wrote: On Thu, Mar 09, 2006 at 11:13:40AM +1100, Noel Faux wrote: Ok it worke

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Michael Fuhr
On Thu, Mar 09, 2006 at 11:13:40AM +1100, Noel Faux wrote: > Ok it worked but we ran into another bad block :( > /vacuumdb: vacuuming of database "monashprotein" failed: ERROR: invalid > page header in block 9022937 of relation "gap" > / > So the command we used was: > dd bs=8k seek=110025 conv=no

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-08 Thread Noel Faux
Ok it worked but we ran into another bad block :( vacuumdb: vacuuming of database "monashprotein" failed: ERROR:  invalid page header in block 9022937 of relation "gap" So the command we used was: dd bs=8k seek=110025 conv=notrunc count=1 if=/dev/zero of=/usr/local/postgresql/postgresql-7.4.8/

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-06 Thread Michael Fuhr
On Tue, Mar 07, 2006 at 01:41:44PM +1100, Noel Faux wrote: > Here is the output from the pg_filedump; is there anything which looks > suss and where would we re-zero the data, if that's the next step: [...] > Block 110025 > - > Block Of

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-06 Thread Michael Fuhr
On Mon, Mar 06, 2006 at 05:17:54PM +1100, Noel Faux wrote: > dd bs=8k skip=115860 count=1 > if=/usr/local/postgresql/postgresql-7.4.8/data/base/37958/111685332.68 | > od -x Wrong block (115860) -- you used the number from my earlier message, which was based on the bad block being 902292. After

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-05 Thread Michael Fuhr
On Mon, Mar 06, 2006 at 11:57:53AM +1100, Noel Faux wrote: > >Anyway, if the block size is 8192 > >then 902292 sould be in the .6 file. If you can spare the time > >then you might run the dd and od commands that Tom Lane mentions > >in the above message and post the output. > Here's the output:

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-05 Thread Noel Faux
Is your table really over 100G? Yeap 600+ million rows. Anyway, if the block size is 8192 then 902292 sould be in the .6 file. If you can spare the time then you might run the dd and od commands that Tom Lane mentions in the above message and post the output. Here's the output: 000 0

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-02 Thread Michael Fuhr
On Fri, Mar 03, 2006 at 09:56:40AM +1100, Noel Faux wrote: > Which config file will tell us how big the bock sizes are? Run the query "SHOW block_size" in the database or use pg_controldata from the shell. It's probably 8192; changing it is done at compile time. -- Michael Fuhr ---

Re: [GENERAL] Data corruption zero a file - help!!

2006-03-02 Thread Noel Faux
Thanks for the pointers Michael! Which config file will tell us how big the bock sizes are? Cheers Noel Michael Fuhr wrote: On Wed, Mar 01, 2006 at 04:12:53PM +1100, Noel Faux wrote: Now after doing some searches I managed to work out that the data corruption starts at 902292.13

Re: [GENERAL] Data corruption zero a file - help!!

2006-02-28 Thread Michael Fuhr
On Tue, Feb 28, 2006 at 10:54:48PM -0700, Michael Fuhr wrote: > Is your table really over 100G? Anyway, if the block size is 8192 > then 902292 sould be in the .6 file. If you can spare the time > then you might run the dd and od commands that Tom Lane mentions > in the above message and post the

Re: [GENERAL] Data corruption zero a file - help!!

2006-02-28 Thread Michael Fuhr
On Wed, Mar 01, 2006 at 04:12:53PM +1100, Noel Faux wrote: > Now after doing some searches I managed to work out that the data > corruption starts at 902292.137 > using this sql: > SELECT * FROM gap WHERE ctid = '(902292,$x)' > Where $x I changed from 1-150. > > as mentioned on this > post:http:

[GENERAL] Data corruption zero a file - help!!

2006-02-28 Thread Noel Faux
Hi all, I posted this on the novice mailing list and as yet had no response, hopefully someone here can help. While we where trying to do a vacuum / pg_dump we encountered the following error: [EMAIL PROTECTED]:~$ pg_dumpall -d > dump.pg pg_dump: dumpClasses(): SQL command failed pg_dump:

Re: [GENERAL] Data Corruption in case of abrupt failure

2004-03-17 Thread Andrew Sullivan
On Tue, Mar 16, 2004 at 01:16:56PM +0530, Shridhar Daithankar wrote: > I believe using hdparm on linux, one can see whether or not write caching > is enabled and also set the caching. > > I don't think any IDE disk would lie about write caching status. (If one > does then it is really hard to f

Re: [GENERAL] Data Corruption in case of abrupt failure

2004-03-17 Thread Bruce Momjian
Rod K wrote: > Does anyone know the equivalent to hdparm on FreeBSD? Yes, something like: In FreeBSD, add "hw.ata.wc=0" to /boot/loader.conf. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a h

Re: [GENERAL] Data Corruption in case of abrupt failure

2004-03-17 Thread Rod K
> [EMAIL PROTECTED] > Subject: Re: [GENERAL] Data Corruption in case of abrupt failure > > > On Tue, 16 Mar 2004, Tom Lane wrote: > > > "Keith C. Perry" <[EMAIL PROTECTED]> writes: > > > I've read threads like this before and because I've ne

Re: [GENERAL] Data Corruption in case of abrupt failure

2004-03-17 Thread Tom Lane
"scott.marlowe" <[EMAIL PROTECTED]> writes: > On Tue, 16 Mar 2004, Tom Lane wrote: >> What I'd suggest is to set up a simple test involving a long string of >> very small transactions (a bunch of separate INSERTs into a table with >> no indexes works fine). Time it twice, once with "fsync" enabled

Re: [GENERAL] Data Corruption in case of abrupt failure

2004-03-17 Thread scott.marlowe
On Tue, 16 Mar 2004, Tom Lane wrote: > "Keith C. Perry" <[EMAIL PROTECTED]> writes: > > I've read threads like this before and because I've never lost data on > > servers with IDE drives after doing some basic torture tests > > (e.g. pulling the plug in the middle of an update et al), I don't > >

Re: [GENERAL] Data Corruption in case of abrupt failure

2004-03-10 Thread scott.marlowe
On Wed, 3 Mar 2004, satish satish wrote: > Hi, > > I am trying to do some reliability tests on postgre SQL. I have > use-case where the power can go off abruptly. I initiated 10,000 insert > operations and pulled out the cable in the middle. I had auto-commit > option turned on. I observed 2