Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages

2013-04-05 Thread Sergey Konoplev
On Tue, Apr 2, 2013 at 11:26 AM, Andres Freund  wrote:
> The attached patch fixes this, although I don't like the way knowledge of the
> point up to which StartupSUBTRANS zeroes pages is handled.

Thank you for the patch, Andres.

Is it included in 9.2.4?

BTW, it has happened again and I am going to make a copy of the
cluster to be able to provide you some extra information. Do you still
need it?

--
Kind regards,
Sergey Konoplev
Database and Software Consultant

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: gray...@gmail.com


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages

2013-04-05 Thread Andres Freund
On 2013-04-05 07:10:12 -0700, Sergey Konoplev wrote:
> On Tue, Apr 2, 2013 at 11:26 AM, Andres Freund  wrote:
> > The attached patch fixes this, although I don't like the way knowledge of the
> > point up to which StartupSUBTRANS zeroes pages is handled.
> 
> Thank you for the patch, Andres.
> 
> Is it included in 9.2.4?

No. Too late for that. It hasn't been committed yet.

> BTW, it has happened again and I am going to make a copy of the
> cluster to be able to provide you some extra information. Do you still
> need it?

Cool. It would be very helpful if you could apply the patch and verify
that it works; it was written somewhat blindly. Also, I am afraid that
at least last time there was a second bug involved.

Could you show the log?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages

2013-04-05 Thread Sergey Konoplev
On Fri, Apr 5, 2013 at 7:15 AM, Andres Freund  wrote:
> Cool. It would be very helpful if you could apply the patch and verify
> that it works; it was written somewhat blindly. Also, I am afraid that
> at least last time there was a second bug involved.

Okay, I will try to talk to the admins, but I am afraid it could take a long time.

> Could you show the log?

2013-04-05 17:26:31 MSK 2113 @ from  [vxid: txid:0] [] LOG:  database
system was shut down in recovery at 2013-04-05 17:18:02 MSK
2013-04-05 17:26:32 MSK 2113 @ from  [vxid: txid:0] [] LOG:  entering
standby mode
2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] LOG:  redo
starts at 25BD/907338F8
2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] LOG:  file
"pg_subtrans/28E5" doesn't exist, reading as zeroes
2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] CONTEXT:
xlog redo xid assignment xtop 686136255: subxacts: 686137344 686137345
686137346 686137347 686137348 686137349 686137350 686137351 686137352
686137353 686137354 686137355 686137356 686137357 686137358 686137359
686137360 686137361 686137362 686137363 686137364 686137365 686137366
686137367 686137368 686137369 686137370 686137371 686137372 686137373
686137374 686137375 686137376 686137377 686137378 686137379 686137380
686137381 686137382 686137383 686137384 686137385 686137386 686137387
686137388 686137389 686137390 686137391 686137392 686137393 686137394
686137395 686137396 686137397 686137398 686137399 686137400 686137401
686137402 686137403 686137404 686137405 686137406 686137407
2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] LOG:  file
"pg_subtrans/28E5" doesn't exist, reading as zeroes
2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] CONTEXT:
xlog redo xid assignment xtop 686136255: subxacts: 686139330 686139331
686139332 686139333 686139334 686139335 686139336 686139337 686139338
686139339 686139340 686139341 686139342 686139343 686139344 686139345
686139346 686139347 686139348 686139349 686139350 686139351 686139352
686139353 686139354 686139355 686139356 686139357 686139358 686139359
686139360 686139361 686139362 686139363 686139364 686139365 686139366
686139367 686139368 686139369 686139370 686139371 686139372 686139373
686139374 686139375 686139376 686139377 686139378 686139379 686139380
686139381 686139382 686139383 686139384 686139385 686139386 686139387
686139388 686139389 686139390 686139391 686139392 686139393

[some more like this]

2013-04-05 17:26:36 MSK 2113 @ from  [vxid:1/0 txid:0] [] LOG:  file
"pg_subtrans/28E6" doesn't exist, reading as zeroes
2013-04-05 17:26:36 MSK 2113 @ from  [vxid:1/0 txid:0] [] CONTEXT:
xlog redo xid assignment xtop 686216055: subxacts: 686222447 686222448
686222449 686222450 686222451 686222452 686222453 686222454 686222455
686222456 686222457 686222459 686222460 686222461 686222462 686222463
686222464 686222502 686222561 686222647 686222722 686223272 686223359
686223360 686223361 686223363 686223364 686223365 686223366 686223367
686223368 686223369 686223370 686223371 686223372 686223373 686223374
686223375 686223376 686223377 686223378 686223379 686223380 686223381
686223382 686223383 686223384 686223385 686223386 686223387 686223388
686223389 686223390 686223391 686223392 686223393 686223394 686223395
686223396 686223397 686223398 686223399 686223400 686223401
2013-04-05 17:26:36 MSK 2113 @ from  [vxid:1/0 txid:0] [] FATAL:
could not access status of transaction 686225586
2013-04-05 17:26:36 MSK 2113 @ from  [vxid:1/0 txid:0] [] DETAIL:
Could not read from file "pg_subtrans/28E6" at offset 253952: Success.
2013-04-05 17:26:36 MSK 2113 @ from  [vxid:1/0 txid:0] [] CONTEXT:
xlog redo xid assignment xtop 686225585: subxacts: 686225586 686225587
686225588 686225589 686225590 686225591 686225592 686225593 686225594
686225595 686225596 686225597 686225598 686225599 686225600 686225601
686225602 686225603 686225604 686225605 686225606 686225607 686225608
686225609 686225610 686225611 686225612 686225613 686225614 686225615
686225616 686225617 686225621 686225622 686225625 686225626 686225628
686225632 686225633 686225636 686225637 686225638 686225639 686225640
686225641 686225644 686225645 686225646 686225649 686225650 686225657
686225658 686225661 686225662 686225665 686225666 686225670 686225671
686225672 686225673 686225678 686225679 686225684 686225685
2013-04-05 17:26:36 MSK 2110 @ from  [vxid: txid:0] [] LOG:  startup
process (PID 2113) exited with exit code 1
2013-04-05 17:26:36 MSK 2110 @ from  [vxid: txid:0] [] LOG:
terminating any other active server processes
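
For readers following the log: the segment name and offset in the FATAL
message can be reproduced from the transaction ID alone. The sketch below
assumes stock build settings (8192-byte pages, 4-byte transaction IDs, 32
pages per segment file); those constants and the function name are
illustrative assumptions, not something stated in this thread, and a build
with a different block size would give different numbers.

```python
# Sketch only: locate a transaction ID inside pg_subtrans.
# Assumed constants (stock build defaults, not taken from this thread).
BLCKSZ = 8192              # bytes per page
XID_SIZE = 4               # bytes stored per transaction ID
PAGES_PER_SEGMENT = 32     # pages per segment file

XIDS_PER_PAGE = BLCKSZ // XID_SIZE   # 2048 entries per page


def subtrans_location(xid):
    """Return (segment file name, byte offset of the page, entry index in the page)."""
    page = xid // XIDS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    page_in_segment = page % PAGES_PER_SEGMENT
    entry = xid % XIDS_PER_PAGE
    return "%04X" % segment, page_in_segment * BLCKSZ, entry


if __name__ == "__main__":
    # Transaction ID from the FATAL line in the log above.
    print(subtrans_location(686225586))   # ('28E6', 253952, 178)
```

Under those assumptions the result matches the log: file "pg_subtrans/28E6",
offset 253952, which is the last page of that segment.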


--
Kind regards,
Sergey Konoplev


Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages

2013-04-05 Thread Andres Freund
On 2013-04-05 07:22:08 -0700, Sergey Konoplev wrote:
> On Fri, Apr 5, 2013 at 7:15 AM, Andres Freund  wrote:
> > Cool. It would be very helpful if you could apply the patch and verify
> > that it works; it was written somewhat blindly. Also, I am afraid that
> > at least last time there was a second bug involved.
> 
> Okay, I will try to talk to the admins, but I am afraid it could take a long time.

Ok.
 
> > Could you show the log?
> 
> 2013-04-05 17:26:31 MSK 2113 @ from  [vxid: txid:0] [] LOG:  database
> system was shut down in recovery at 2013-04-05 17:18:02 MSK
> 2013-04-05 17:26:32 MSK 2113 @ from  [vxid: txid:0] [] LOG:  entering
> standby mode
> 2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] LOG:  redo
> starts at 25BD/907338F8
> 2013-04-05 17:26:32 MSK 2113 @ from  [vxid:1/0 txid:0] [] LOG:  file
> "pg_subtrans/28E5" doesn't exist, reading as zeroes

Looks like it could be fixed by the patch. But that seems to imply that
you restarted not long before that? Could you check if there's a
different error before those?

Greetings,

Andres Freund

PS: The tander.ru addresses seem to bounce all mail I send them...



[BUGS] Installation crashes

2013-04-05 Thread HGuardia
Hello, I'm using Windows 7 Starter 32-bit and I have the following issues:

When I tried to install an older version (8.3), it showed me a message like
this:
"An error has occurred executing the microsoft vc++ runtime installer"

Newer versions show me this error:
"prerunscript.command.line.error" (with version 9.2.4). I'm not an advanced
user, but here is the log file:


Log started 04/05/2013 at 10:35:49
Preferred installation mode : qt
Trying to init installer in mode qt
Mode qt successfully initialized
Executing
C:\Users\Jenni\AppData\Local\Temp/postgresql_installer_958a0fe061/temp_check_comspec.bat
 
Script exit code: 0

Script output:
 "test ok"

Script stderr:
 

Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2 Data
Directory. Setting variable iDataDirectory to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2 Base
Directory. Setting variable iBaseDirectory to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2 Service
ID. Setting variable iServiceName to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2 Service
Account. Setting variable iServiceAccount to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2 Super
User. Setting variable iSuperuser to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2
Branding. Setting variable iBranding to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2 Version.
Setting variable brandingVer to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2
Shortcuts. Setting variable iShortcut to empty value
Could not find registry key
HKEY_LOCAL_MACHINE\SOFTWARE\PostgreSQL\Installations\postgresql-9.2
DisableStackBuilder. Setting variable iDisableStackBuilder to empty value
[10:36:08] Existing base directory: 
[10:36:08] Existing data directory: 
[10:36:08] Using branding: PostgreSQL 9.2
[10:36:08] Using Super User: postgres and Service Account: NT
AUTHORITY\NetworkService
[10:36:08] Using Service Name: postgresql-9.2
Executing cscript //NoLogo
"C:\Users\Jenni\AppData\Local\Temp\postgresql_installer_958a0fe061\prerun_checks.vbs"
Script exit code: 1

Script output:
 CScript Error: Can't find script engine "VBScript" for script
"C:\Users\Jenni\AppData\Local\Temp\postgresql_installer_958a0fe061\prerun_checks.vbs".

Script stderr:
 Program ended with an error exit code

Error running cscript //NoLogo
"C:\Users\Jenni\AppData\Local\Temp\postgresql_installer_958a0fe061\prerun_checks.vbs"
: Program ended with an error exit code




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Installation-crashes-tp5750926.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.




Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages

2013-04-05 Thread Sergey Konoplev
On Fri, Apr 5, 2013 at 7:33 AM, Andres Freund  wrote:
> Looks like it could be fixed by the patch. But that seems to imply that
> you restarted not long before that? Could you check if there's a
> different error before those?

Yes, it happened straight after the restart this time. There are no
errors in the logs before it.


--
Kind regards,
Sergey Konoplev


[BUGS] BUG #8043: 9.2.4 doesn't open WAL files from archive, only looks in pg_xlog

2013-04-05 Thread bohmer
The following bug has been logged on the website:

Bug reference:  8043
Logged by:  Jeff Bohmer
Email address:  boh...@visionlink.org
PostgreSQL version: 9.2.4
Operating system:   CentOS 5.9 x86_64 kernel 2.6.18-348.3.1.el5
Description:

Hi pgsql-bugs,

PG version: 9.2.4, from yum.postgresql.org
OS: CentOS 5.9 x86_64 kernel 2.6.18-348.3.1.el5

Upgrading from 9.2.3 to 9.2.4 has broken archive recovery for me. Probably
related to this 9.2.4 change:

Ensure we do crash recovery before entering archive recovery,
if the database was not stopped cleanly and a recovery.conf
file is present (Heikki Linnakangas, Kyotaro Horiguchi,
Mitsumasa Kondo)

When starting the PostgreSQL 9.2.4 service on a base backup, I get this:

2013-04-05 12:49:04 MDT [10302]: [1-1] user=,db= LOG:  database system was
interrupted; last known up at 2013-04-05 10:02:01 MDT
2013-04-05 12:49:04 MDT [10302]: [2-1] user=,db= LOG:  starting archive
recovery
2013-04-05 12:49:04 MDT [10302]: [3-1] user=,db= LOG:  could not open file
"pg_xlog/0001002F002D" (log file 47, segment 45): No such file
or directory
2013-04-05 12:49:04 MDT [10302]: [4-1] user=,db= LOG:  invalid primary
checkpoint record
2013-04-05 12:49:04 MDT [10302]: [5-1] user=,db= LOG:  could not open file
"pg_xlog/0001002F002C" (log file 47, segment 44): No such file
or directory
2013-04-05 12:49:04 MDT [10302]: [6-1] user=,db= LOG:  invalid secondary
checkpoint record
2013-04-05 12:49:04 MDT [10302]: [7-1] user=,db= PANIC:  could not locate a
valid checkpoint record
2013-04-05 12:49:04 MDT [10297]: [2-1] user=,db= LOG:  startup process (PID
10302) was terminated by signal 6: Aborted
2013-04-05 12:49:04 MDT [10297]: [3-1] user=,db= LOG:  aborting startup due
to startup process failure
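
The log pairs each file name with its numbers, for example "(log file 47,
segment 45)". As a rough sketch, assuming the 9.2-era naming scheme of three
8-digit uppercase hexadecimal fields (timeline, log file, segment), the
segment name can be rebuilt from those numbers. The timeline value 1 is an
assumption, and the helper name is made up for illustration. The file names
as they appear in this archived copy are shorter than 24 characters, so they
may have lost characters somewhere along the way.

```python
# Sketch only: build a WAL segment file name from its three components.
# Assumes the 9.2-era layout: timeline, log file, segment, each as 8 hex digits.
def wal_segment_name(timeline, log_id, segment):
    """Return the 24-character WAL segment file name."""
    return "%08X%08X%08X" % (timeline, log_id, segment)


if __name__ == "__main__":
    # "log file 47, segment 45" from the log above; timeline 1 is assumed.
    print(wal_segment_name(1, 47, 45))   # 000000010000002F0000002D
```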

The WAL file 0001002F002D does exist in my WAL archive, but not
in pg_xlog. I exclude pg_xlog files when taking the base backup, per the
instructions in the documentation:

You can, however, omit from the backup dump the files within
the cluster's pg_xlog/ subdirectory. This slight adjustment
is worthwhile because it reduces the risk of mistakes when
restoring.

I use a custom base backup script to call pg_start/stop_backup() and make
the backup with rsync.

The restore_command in recovery.conf is never called by PG 9.2.4 during
startup. I confirmed this by adding a "touch /tmp/restore_command.`date
+%H:%M:%S`" line at the beginning of the shell script I use for my
restore_command. No such files are created when starting PG 9.2.4.
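
The tracing trick described above can be sketched as a small wrapper. This is
only an illustration of the idea, not the script used in this report: the
archive path, the marker naming, and the function name are all hypothetical.

```python
# Sketch only: a restore_command wrapper that leaves a marker each time it runs.
import os
import shutil
import sys
import time

ARCHIVE_DIR = "/path/to/wal_archive"   # hypothetical archive location
MARKER_DIR = "/tmp"                    # where the trace markers are written


def restore(wal_name, dest_path, archive_dir=ARCHIVE_DIR, marker_dir=MARKER_DIR):
    """Record the call, then copy the requested WAL file; return an exit status."""
    stamp = time.strftime("%H%M%S")
    marker = os.path.join(marker_dir, "restore_command.%s.%s" % (stamp, wal_name))
    open(marker, "w").close()          # proof that the server invoked us
    source = os.path.join(archive_dir, wal_name)
    if not os.path.isfile(source):
        return 1                       # non-zero: file is not in the archive
    shutil.copy(source, dest_path)
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(restore(sys.argv[1], sys.argv[2]))
```

With a wrapper like this, recovery.conf would carry something along the lines
of restore_command = 'python /path/to/restore_wal.py %f %p' (the script path
is a placeholder). If no marker files appear after startup, the server never
called the command.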

After downgrading back to 9.2.3, archive recovery works using the very same
base backup, recovery.conf file, and restore_command. The log indicates that
PG 9.2.3 begins recovery by pulling WAL files from the archive instead of
pg_xlog:

2013-04-05 13:01:14 MDT [16981]: [1-1] user=,db= LOG:  database system was
interrupted; last known up at 2013-04-05 10:02:01 MDT
2013-04-05 13:01:14 MDT [16981]: [2-1] user=,db= LOG:  starting archive
recovery
2013-04-05 13:01:14 MDT [16981]: [3-1] user=,db= LOG:  restored log file
"0001002F002D" from archive
2013-04-05 13:01:14 MDT [16981]: [4-1] user=,db= LOG:  consistent recovery
state reached at 2F/2D80
2013-04-05 13:01:14 MDT [16981]: [5-1] user=,db= LOG:  redo starts at
2F/2D80
2013-04-05 13:01:15 MDT [16981]: [6-1] user=,db= LOG:  restored log file
"0001002F002E" from archive
2013-04-05 13:01:15 MDT [16981]: [7-1] user=,db= LOG:  restored log file
"0001002F002F" from archive

2013-04-05 13:01:17 MDT [16981]: [25-1] user=,db= LOG:  redo done at
2F/3F07B4D0
2013-04-05 13:01:17 MDT [16981]: [26-1] user=,db= LOG:  last completed
transaction was at log time 2013-04-05 12:53:01.069826-06
2013-04-05 13:01:17 MDT [16981]: [27-1] user=,db= LOG:  restored log file
"0001002F003F" from archive
2013-04-05 13:01:17 MDT [16981]: [28-1] user=,db= LOG:  selected new
timeline ID: 2
2013-04-05 13:01:17 MDT [16981]: [29-1] user=,db= LOG:  archive recovery
complete
2013-04-05 13:01:17 MDT [16991]: [1-1] user=,db= LOG:  checkpoint starting:
end-of-recovery immediate wait
2013-04-05 13:01:17 MDT [16991]: [2-1] user=,db= LOG:  checkpoint complete:
wrote 327 buffers (0.1%); 0 transaction log file(s) added, 0 removed,
0 recycled; write=0.015 s, sync=0.000 s, total=0.035 s; sync files=0,
longest=0.000 s, average=0.000 s
2013-04-05 13:01:17 MDT [16978]: [2-1] user=,db= LOG:  database system is
ready to accept connections

So, the behavior has definitely changed between 9.2.3 and 9.2.4. Is this a
bug in 9.2.4?

Or, must I now include pg_xlog files when taking base backups with 9.2.4,
contrary to the documentation?

Or, is there a way to make PG 9.2.4 use restore_command for recovery, as
9.2.3 does?

Thank you,
- Jeff




Re: [BUGS] BUG #8043: 9.2.4 doesn't open WAL files from archive, only looks in pg_xlog

2013-04-05 Thread Jeff Janes
On Fri, Apr 5, 2013 at 12:27 PM,  wrote:

> The following bug has been logged on the website:
>
> Bug reference:  8043
> Logged by:  Jeff Bohmer
> Email address:  boh...@visionlink.org
> PostgreSQL version: 9.2.4
> Operating system:   CentOS 5.9 x86_64 kernel 2.6.18-348.3.1.el5
> Description:
>
> Hi pgsql-bugs,
>
> PG version: 9.2.4, from yum.postgresql.org
> OS: CentOS 5.9 x86_64 kernel 2.6.18-348.3.1.el5
>
> Upgrading from 9.2.3 to 9.2.4 has broken archive recovery for me. Probably
> related to this 9.2.4 change:
>
> Ensure we do crash recovery before entering archive recovery,
> if the database was not stopped cleanly and a recovery.conf
> file is present (Heikki Linnakangas, Kyotaro Horiguchi,
> Mitsumasa Kondo)
>
> When starting the PostgreSQL 9.2.4 service on a base backup, I get this:
>
> 2013-04-05 12:49:04 MDT [10302]: [1-1] user=,db= LOG:  database system was
> interrupted; last known up at 2013-04-05 10:02:01 MDT
> 2013-04-05 12:49:04 MDT [10302]: [2-1] user=,db= LOG:  starting archive
> recovery
> 2013-04-05 12:49:04 MDT [10302]: [3-1] user=,db= LOG:  could not open file
> "pg_xlog/0001002F002D" (log file 47, segment 45): No such file
> or directory
>

 ...


> I use a custom base backup script to call pg_start/stop_backup() and make
> the backup with rsync.
>
> The restore_command in recovery.conf is never called by PG 9.2.4 during
> startup. I confirmed this by adding a "touch /tmp/restore_command.`date
> +%H:%M:%S`" line at the beginning of the shell script I use for my
> restore_command. No such files are created when starting PG 9.2.4.
>
> After downgrading back to 9.2.3, archive recovery works using the very same
> base backup, recovery.conf file, and restore_command. The log indicates
> that
> PG 9.2.3 begins recovery by pulling WAL files from the archive instead of
> pg_xlog:
>


I can reproduce the behavior you report only if I remove the "backup_label"
file from the restored data directory before I begin recovery.  Of course,
doing that renders the backup invalid, as without it recovery is very
likely to begin from the wrong WAL recovery location.

I think it is appropriate that 9.2.4 refuses to cooperate in those
circumstances, and it was a bug that 9.2.3 did allow it.

Do you have a "backup_label" file?



> Or, must I now include pg_xlog files when taking base backups with 9.2.4,
> contrary to the documentation?
>


You do not need to include pg_xlog, but you do need to include
backup_label.  And you always did need to include it--if you were not
including it in the past, then you were playing with fire and it is only due
to luck that your database survived.
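
One way to guard against this, sketched under the assumption that the restore
is scripted: check that backup_label made it into the restored data directory
before starting the server. The helper name and the exit message are made up
for illustration.

```python
# Sketch only: refuse to start recovery if the restored base backup
# is missing files it must contain (here just backup_label).
import os
import sys


def missing_backup_files(data_dir, required=("backup_label",)):
    """Return the required files that are absent from a restored data directory."""
    return [name for name in required
            if not os.path.isfile(os.path.join(data_dir, name))]


if __name__ == "__main__" and len(sys.argv) == 2:
    missing = missing_backup_files(sys.argv[1])
    if missing:
        sys.exit("not starting recovery, missing: " + ", ".join(missing))
```

Run it against the restored data directory before starting the server; an
empty result means backup_label is present.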

Cheers,

Jeff


[BUGS]

2013-04-05 Thread Roberto Lemos
Good evening. I need to install PostgreSQL, but when I run the installer the
following message appears: The password specified was incorrect. Please enter
the correct password for the postgres windows user account.
I do not know what to do: I never set a password, and yet this message appears.

Thanks


Re: [BUGS]

2013-04-05 Thread John R Pierce

On 4/5/2013 4:07 PM, Roberto Lemos wrote:
> Good evening. I need to install PostgreSQL, but when I run the installer the
> following message appears: The password specified was incorrect. Please
> enter the correct password for the postgres windows user account.
> I do not know what to do: I never set a password, and yet this message
> appears.
>
> Thanks


A) This is not a bug, and should be on a discussion list or forum.


B) You probably had a prior version installed at some point. Go into
the Windows "Computer Management" tool, Local Users and Groups, Users,
find the postgres account, and manually set a password there. Then
tell the postgres installer what you set it to.



--
john r pierce  37N 122W
somewhere on the middle of the left coast


