[BUGS] BUG #7532: cannot update to 9.2

2012-09-12 Thread arsen_mmx
The following bug has been logged on the website:

Bug reference:  7532
Logged by:  arnes
Email address:  arsen_...@163.com
PostgreSQL version: 9.1.5
Operating system:   ubuntu 11.04 natty
Description:

I cannot update my postgresql to 9.2. Is there anything wrong with the
repository?
My friend uses ubuntu 11.10, and he can update to 9.2.





Re: [BUGS] BUG #7532: cannot update to 9.2

2012-09-12 Thread Martin Pitt
Hello Arsen,

arsen_...@163.com [2012-09-12  1:48 +]:
> Bug reference:  7532
> Logged by:  arnes
> Email address:  arsen_...@163.com
> PostgreSQL version: 9.1.5
> Operating system:   ubuntu 11.04 natty
> Description:
> 
> I cannot update my postgresql to 9.2. Is there anything wrong with the
> repository?
> My friend uses ubuntu 11.10, and he can update to 9.2.

This is not an upstream bug for this list. Neither Ubuntu 11.04 nor
Ubuntu 11.10 contains 9.2, since 9.2 had not been released yet when those
versions shipped.

Presumably your friend is using
https://launchpad.net/~pitti/+archive/postgresql or
http://pgapt.debian.net/  (but be aware that neither is officially
supported).
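
For reference, pulling 9.2 from the PPA would look roughly like this (an
untested sketch; the PPA name is taken from the URL above, and the usual
caveats about unsupported packages apply):

    sudo add-apt-repository ppa:pitti/postgresql
    sudo apt-get update
    sudo apt-get install postgresql-9.2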

Martin
-- 
Martin Pitt| http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)




Re: [BUGS] BUG #6704: ALTER EXTENSION postgis SET SCHEMA leaves dangling relations

2012-09-12 Thread Dimitri Fontaine
Hi,

Sorry for being late to the party… been distracted…
Bruce Momjian  writes:
> On Fri, Jun 22, 2012 at 10:37:10PM -0400, Tom Lane wrote:
>> j...@pgexperts.com writes:
>> > DROP and CREATE extension appear to work fine, but if you ALTER EXTENSION
>> > postgis SET SCHEMA foo, it leaves a few relations behind.
>> 
>> What it seems to be leaving behind is indexes ... also relation rowtypes.
>> 
>> A bit of looking shows that ALTER EXTENSION SET SCHEMA calls
>> AlterObjectNamespace_oid on the table.  AlterObjectNamespace_oid
>> calls AlterRelationNamespaceInternal, and nothing else.  In comparison,
>> ALTER TABLE SET SCHEMA (AlterTableNamespace) calls
>> AlterRelationNamespaceInternal and about four other things.  I'm not
>> sure if this was broken before the last round of refactoring in this
>> area, but for sure it's broken now.

Looking at that code, my theory of how we got there is that in the
submitted extension patch I only used DEPENDENCY_INTERNAL, and Tom
introduced the much better DEPENDENCY_EXTENSION tracking. With the
former, indexes and sequences and constraints were found by the
dependency-walking code, but in the latter model only the main relation
is registered.

I need to do some testing of dependency tracking on SERIAL-generated
sequences compared to manually created sequences in extension scripts;
I think we track sequences directly only in the manual case.

I think we need to share more code between
AlterRelationNamespaceInternal and AlterTableNamespace, but I'm not sure
whether that isn't exactly what Álvaro tried with his _oid() attempt that
failed.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [BUGS] BUG #7516: PL/Perl crash

2012-09-12 Thread Marko Tiikkaja

On 9/12/12 1:50 AM, Tom Lane wrote:
> Marko Tiikkaja writes:
>> Joel Jacobson managed to narrow it down to this test case, which crashes
>> consistently on Ubuntu 12.04 both with and without your patch.  I,
>> however, wasn't able to reproduce the problem on my OS X Mountain Lion.
>
> Doesn't reproduce for me either ...

Ok, I can reproduce it on my Ubuntu virtual machine.

The problem looks like this:

#0  free_plperl_function (prodesc=0x24195a0) at plperl.c:2428
#1  0x7ff47babc045 in plperl_call_handler (fcinfo=0x247c160) at plperl.c:1710
#2  0x005663fa in ExecMakeFunctionResult (fcache=0x247c0f0, econtext=0x247bf00, isNull=0x247ca78 "", isDone=0x247cb90) at execQual.c:1917
#3  0x00568942 in ExecTargetList (isDone=0x7fff1402fc0c, itemIsDone=0x247cb90, isnull=0x247ca78 "", values=0x247ca60, econtext=0x247bf00, targetlist=0x247cb60) at execQual.c:5210
#4  ExecProject (projInfo=<optimized out>, isDone=0x7fff1402fc0c) at execQual.c:5425
#5  0x00578aea in ExecResult (node=0x247bdf0) at nodeResult.c:155
#6  0x00561b18 in ExecProcNode (node=0x247bdf0) at execProcnode.c:367
#7  0x0055ef2a in ExecutePlan (dest=0xad4600, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, planstate=0x247bdf0, estate=0x247bce0) at execMain.c:1440
#8  standard_ExecutorRun (queryDesc=0x244f6d0, direction=<optimized out>, count=0) at execMain.c:314
#9  0x0058192d in _SPI_pquery (tcount=<optimized out>, fire_triggers=1 '\001', queryDesc=<optimized out>) at spi.c:2110
#10 _SPI_execute_plan (paramLI=0x245e1f0, snapshot=<optimized out>, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0, plan=<optimized out>) at spi.c:1922
#11 0x00581e57 in SPI_execute_plan_with_paramlist (plan=0x2476c30, params=0x245e1f0, read_only=0 '\000', tcount=0) at spi.c:423
#12 0x7ff47b0c7075 in exec_run_select (estate=0x7fff14030270, expr=0x24569f0, maxtuples=0, portalP=0x0) at pl_exec.c:4573
#13 0x7ff47b0ca7b4 in exec_stmt_perform (estate=0x7fff14030270, stmt=<optimized out>) at pl_exec.c:1413
#14 exec_stmt (stmt=0x2456ab0, estate=0x7fff14030270) at pl_exec.c:1289
#15 exec_stmts (estate=0x7fff14030270, stmts=<optimized out>) at pl_exec.c:1248
#16 0x7ff47b0cce59 in exec_stmt_block (estate=0x7fff14030270, block=0x2457448) at pl_exec.c:1047
#17 0x7ff47b0ca798 in exec_stmt (stmt=0x2457448, estate=0x7fff14030270) at pl_exec.c:1281
#18 exec_stmts (estate=0x7fff14030270, stmts=<optimized out>) at pl_exec.c:1248
#19 0x7ff47b0ccfcc in exec_stmt_block (estate=0x7fff14030270, block=0x2457508) at pl_exec.c:1186
#20 0x7ff47b0cda37 in plpgsql_exec_function (func=0x243a140, fcinfo=0x7fff14030580) at pl_exec.c:324
#21 0x7ff47b0c26d3 in plpgsql_call_handler (fcinfo=0x7fff14030580) at pl_handler.c:122
#22 0x00566d4d in ExecMakeTableFunctionResult (funcexpr=0x2443368, econtext=0x24428b0, expectedDesc=0x2443110, randomAccess=0 '\000') at execQual.c:2146
#23 0x00578381 in FunctionNext (node=0x24427a0) at nodeFunctionscan.c:65
#24 0x00568dde in ExecScanFetch (recheckMtd=0x578300 <FunctionRecheck>, accessMtd=0x578310 <FunctionNext>, node=0x24427a0) at execScan.c:82
#25 ExecScan (node=0x24427a0, accessMtd=0x578310 <FunctionNext>, recheckMtd=0x578300 <FunctionRecheck>) at execScan.c:132
#26 0x00561a78 in ExecProcNode (node=0x24427a0) at execProcnode.c:416
#27 0x0055ef2a in ExecutePlan (dest=0xad4600, direction=<optimized out>, numberTuples=0, sendTuples=1 '\001', operation=CMD_SELECT, planstate=0x24427a0, estate=0x2442690) at execMain.c:1440
#28 standard_ExecutorRun (queryDesc=0x243f700, direction=<optimized out>, count=0) at execMain.c:314
#29 0x0058192d in _SPI_pquery (tcount=<optimized out>, fire_triggers=1 '\001', queryDesc=<optimized out>) at spi.c:2110
#30 _SPI_execute_plan (paramLI=0x243f670, snapshot=<optimized out>, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0, plan=<optimized out>) at spi.c:1922
#31 0x00581f39 in SPI_execute_plan (plan=0x23ee750, Values=0x243df38, Nulls=0x24376b0 "  RCHAR", read_only=0 '\000', tcount=0) at spi.c:391
#32 0x7ff47babce2d in plperl_spi_exec_prepared (query=0x2437690 "0x22930e0", attr=0x0, argc=2, argv=0x243df18) at plperl.c:3425
#33 0x7ff47babe810 in XS__spi_exec_prepared (my_perl=<optimized out>, cv=<optimized out>) at SPI.xs:141
#34 0x7ff47b7e67ff in Perl_pp_entersub (my_perl=0x237d800) at pp_hot.c:3046
#35 0x7ff47b7ddc96 in Perl_runops_standard (my_perl=0x237d800) at run.c:41
#36 0x7ff47b77959e in Perl_call_sv (my_perl=0x237d800, sv=0x24017f8, flags=10) at perl.c:2647
#37 0x7ff47bab5f62 in plperl_call_perl_func (desc=0x24195a0, fcinfo=0x242fa90) at plperl.c:2056
#38 0x7ff47baba45b in plperl_func_handler (fcinfo=0x242fa90) at plperl.c:2196
#39 plperl_call_handler (fcinfo=0x242fa90) at plperl.c:1705
#40 0x005663fa in ExecMakeFunctionResult (fcache=0x242fa20, econtext=0x242f830, isNull=0x24303a8 "", isDone=0x24304c0) at execQual.c:1917
#41 0x00568942 in ExecTargetList (isDone=0x7fff140312fc, itemIsDone=0x24304c0, isnull=0x24303a8 "", values=0x2430390, econtext=0x242f830, targetlist=0x2430490) at execQual.c:5210
#42 ExecProject (projInfo=<optimized out>, is

[BUGS] BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

2012-09-12 Thread amit . kapila
The following bug has been logged on the website:

Bug reference:  7533
Logged by:  Amit Kapila
Email address:  amit.kap...@huawei.com
PostgreSQL version: 9.2.0
Operating system:   Suse
Description:

M host is primary, S host is standby and CS host is cascaded standby. 

1. Set up postgresql-9.2beta2/RC1 on all hosts.
2. Execute the command initdb on host M to create a fresh database.
3. Modify the configuration file postgresql.conf on host M like this:
     listen_addresses = 'M'
     port = 15210
     wal_level = hot_standby
     max_wal_senders = 4
     hot_standby = on
4. Modify the configuration file pg_hba.conf on host M like this:
     host    replication    repl    M/24    md5
5. Start the server on host M as primary.
6. Connect one client to the primary server and create a user 'repl':
     Create user repl superuser password '123';
7. Use the command pg_basebackup on host S to retrieve the database of the
primary host:
     pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P -v -h M -p 15210 -U repl -W
8. Copy recovery.conf.sample from the share folder of the package to the
database folder of host S, then rename this file to recovery.conf.
9. Modify the file recovery.conf on host S as below:
     standby_mode = on
     primary_conninfo = 'host=M port=15210 user=repl password=123'
10. Modify the file postgresql.conf on host S as follows:
     listen_addresses = 'S'
11. Start the server on host S as standby server.
12. Use the command pg_basebackup on host CS to retrieve the database of the
standby host:
     pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P -v -h M -p 15210 -U repl -W
13. Modify the file recovery.conf on host CS as below:
     standby_mode = on
     primary_conninfo = 'host=S port=15210 user=repl password=123'
14. Modify the file postgresql.conf on host S as follows:
     listen_addresses = 'CS'
15. Start the server on host CS as cascaded standby server node.
16. Try to connect a client to host CS, but it gives the error:
FATAL:  the database system is starting up


Observations related to bug
--
In the above scenario it is observed that the startup process has read all
data (in our defect scenario minRecoveryPoint is 5016220) up to position
5016220, and then checks for recovery consistency with the following
condition in the function CheckRecoveryConsistency:
if (!reachedConsistency &&
XLByteLE(minRecoveryPoint, EndRecPtr) &&
XLogRecPtrIsInvalid(ControlFile->backupStartPoint))

At this point the first two conditions are true, but the last condition is
not, because redo has not yet been applied and hence backupStartPoint has
not been reset. So it does not signal the postmaster about the consistent
state. After this it applies the redo, resets backupStartPoint, and goes to
read the next set of records. Since all records have already been read, it
starts waiting for a new record from the standby node. But since no new
record is coming from the standby node, it keeps waiting and never gets a
chance to recheck the recovery-consistency level. Hence client connections
are not allowed.
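
To make the ordering concrete, here is a toy model of the loop described
above (plain C, heavily simplified, and explicitly not the actual xlog.c
code; all names are illustrative):

    #include <stdbool.h>
    #include <stdio.h>

    static bool backupStartPointSet = true;  /* cleared only once redo is applied */
    static bool reachedConsistency = false;

    static void CheckRecoveryConsistencyToy(void)
    {
        /* mirrors the three-part test quoted above (the LSN checks are elided) */
        if (!reachedConsistency && !backupStartPointSet)
        {
            reachedConsistency = true;
            printf("consistent: postmaster signalled, connections allowed\n");
        }
    }

    int main(void)
    {
        /* replay the one remaining record, as in the report */
        CheckRecoveryConsistencyToy();  /* (1) runs BEFORE redo: still inconsistent */
        backupStartPointSet = false;    /* (2) redo resets backupStartPoint */

        /* (3) the real loop now blocks reading the next WAL record; no new
         * record arrives, so the check never runs again and clients keep
         * getting "the database system is starting up" */
        if (!reachedConsistency)
            printf("stuck: consistent state never signalled\n");
        return 0;
    }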






[BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-09-12 Thread amit . kapila
The following bug has been logged on the website:

Bug reference:  7534
Logged by:  Amit Kapila
Email address:  amit.kap...@huawei.com
PostgreSQL version: 9.2.0
Operating system:   Suse 10
Description:

1. Both master and standby machines are connected normally.
2. Then you use the command "ifconfig ip down" to make the network card of
master and standby go down.

Observation
The master can detect the abnormal connection, but the standby can't detect
it and shows a connected channel for a long time.






Re: [BUGS] BUG #6704: ALTER EXTENSION postgis SET SCHEMA leaves dangling relations

2012-09-12 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of Wed Sep 12 06:51:43 -0300 2012:
> Hi,
> 
> Sorry for being late to the party… been distracted…

Welcome ;-)

> > On Fri, Jun 22, 2012 at 10:37:10PM -0400, Tom Lane wrote:

> >> A bit of looking shows that ALTER EXTENSION SET SCHEMA calls
> >> AlterObjectNamespace_oid on the table.  AlterObjectNamespace_oid
> >> calls AlterRelationNamespaceInternal, and nothing else.  In comparison,
> >> ALTER TABLE SET SCHEMA (AlterTableNamespace) calls
> >> AlterRelationNamespaceInternal and about four other things.  I'm not
> >> sure if this was broken before the last round of refactoring in this
> >> area, but for sure it's broken now.
> 
> Looking at that code, my theory of how we got there is that in the
> submitted extension patch I only used DEPENDENCY_INTERNAL, and Tom
> introduced the much better DEPENDENCY_EXTENSION tracking. With the
> former, indexes and sequences and constraints were found by the
> dependency-walking code, but in the latter model only the main relation
> is registered.
> 
> I need to do some testing of dependency tracking on SERIAL-generated
> sequences compared to manually created sequences in extension scripts;
> I think we track sequences directly only in the manual case.

Well, what I saw was that both the table and its SERIAL-generated
sequence got a DEPENDENCY_EXTENSION row in pg_depend, which is exactly
what (IMV) causes the problem.  One of my proposals is to tweak the code
to avoid that row (but if we do that, then we need to do something about
databases that contain such rows today).
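
For anyone wanting to inspect this, the rows in question can be listed with
a query along these lines (a sketch; 'e' is the deptype letter for
DEPENDENCY_EXTENSION):

    -- list every object directly registered as belonging to an extension
    SELECT classid::regclass AS catalog, objid, deptype
    FROM pg_depend
    WHERE refclassid = 'pg_extension'::regclass
      AND deptype = 'e';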

> I think we need to share more code between
> AlterRelationNamespaceInternal and AlterTableNamespace, but I'm not sure
> whether that isn't exactly what Álvaro tried with his _oid() attempt that
> failed.

What I did was create an AlterTableNamespaceInternal that goes through
all the things that must be moved (this means calling
AlterRelationNamespaceInternal for the table, and then doing some more
calls to move the sequences, indexes, constraints).  With this change,
both AlterTableNamespace and AlterObjectNamespace_oid can call it.
Previously, AlterObjectNamespace_oid was calling
AlterRelationNamespaceInternal directly for the relation only.

As far as I can see, that split of AlterTableNamespace is still needed.
The problem here is that we need something *beyond* that fix.  Did you
look at my patches?

Am I making sense?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-09-12 Thread Magnus Hagander
On Wed, Sep 12, 2012 at 1:54 PM,   wrote:
> The following bug has been logged on the website:
>
> Bug reference:  7534
> Logged by:  Amit Kapila
> Email address:  amit.kap...@huawei.com
> PostgreSQL version: 9.2.0
> Operating system:   Suse 10
> Description:
>
> 1. Both master and standby machines are connected normally.
> 2. Then you use the command "ifconfig ip down" to make the network card of
> master and standby go down.
>
> Observation
> The master can detect the abnormal connection, but the standby can't detect
> it and shows a connected channel for a long time.

The master will detect it quicker, because it will get an error when
it tries to send something.

But the standby should detect it either when sending the feedback
message (what's your wal_receiver_status_interval set to?) or when
the kernel does (have you configured tcp keepalive on the slave
somehow?)

Oh, and what do you actually mean by "long time"?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-09-12 Thread Fujii Masao
On Wed, Sep 12, 2012 at 8:54 PM,   wrote:
> The following bug has been logged on the website:
>
> Bug reference:  7534
> Logged by:  Amit Kapila
> Email address:  amit.kap...@huawei.com
> PostgreSQL version: 9.2.0
> Operating system:   Suse 10
> Description:
>
> 1. Both master and standby machines are connected normally.
> 2. Then you use the command "ifconfig ip down" to make the network card of
> master and standby go down.
>
> Observation
> The master can detect the abnormal connection, but the standby can't detect
> it and shows a connected channel for a long time.

What about setting keepalives_xxx libpq parameters?
http://www.postgresql.org/docs/devel/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS

Keepalives are not a perfect solution for detecting a terminated connection,
but they would help to a certain extent. If you need something like a
walreceiver version of replication_timeout, such a feature has not been
implemented yet. Please feel free to implement it!
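
As an illustration (a sketch only, reusing the host/port from the report;
the timeout values are arbitrary), the keepalive parameters can go straight
into primary_conninfo in recovery.conf:

    # probe after 30s idle, every 10s, declare the link dead after 3 lost probes
    primary_conninfo = 'host=M port=15210 user=repl keepalives=1 keepalives_idle=30 keepalives_interval=10 keepalives_count=3'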

Regards,

-- 
Fujii Masao




Re: [BUGS] initdb.exe changes --locale option

2012-09-12 Thread Mike Toews
I've found a general solution: with the locale string, replace the
first ", " (comma space) with "_".

Around line 33 of initcluster.vbs, add:

strLocale = Replace(strLocale,", ","_",1,1)
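' e.g. turns "English, New Zealand" into "English_New Zealand" (first ", " only)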

I think it is fine to show "English, New Zealand" in the drop-down
menu for the GUI installer, but initcluster.vbs needs to do the
replacement to "English_New Zealand" for initialisation to work
correctly.

My testing was conducted using a Python script (http://pastebin.com/9epyWz7x)
which produces a tab-delimited table of input locales, the locale
chosen by initdb.exe, and the default language for text search
configuration.

The results from 200 locales show some significant problems with
locale detection: most "Language, Country" inputs are substituted
with only one country (you will pick up the pattern if you look at the
data). Secondly, there are cases that are completely off: "Tamazight
(Latin), Algeria" : "English_United Kingdom.1252", which is corrected
to "Tamazight (Latin)_Algeria.1252" with the proper substitution.

However, there are three corner cases (of 200) that either sort of
break things or don't resolve anything:

Original: Chinese (Traditional), Macao S.A.R. : Chinese (Traditional)_Taiwan.950
Replaced: Chinese (Traditional)_Macao S.A.R. : English_United Kingdom.1252

Original: Lao, Lao P.D.R. : Lao_Lao P.D.R..1252
Replaced: Lao_Lao P.D.R. : English_United Kingdom.1252

Original: Norwegian (Bokmål), Norway : English_United Kingdom.1252
Replaced: Norwegian (Bokmål)_Norway : English_United Kingdom.1252

(Note: I'm testing on a Windows Vista computer from the UK)

Lastly, I had a look at the source code of initdb.c, which appears to
assume only POSIX locales of the format:
[language[_territory][.codeset][@modifier]]

E.g., see find_matching_ts_config, which assumes this locale format:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/bin/initdb/initdb.c;h=824c7fa7e4c76e0a3b8204ce0cdd21564f23d5df;hb=HEAD#l886

It should probably handle the WIN32 logic separately from POSIX
locales, but that's a deeper matter.

-Mike




[BUGS] Re: Probable bug with CreateFakeRelcacheEntry (now with reproducible test case)

2012-09-12 Thread Jeff Davis
Indeed, this is a nasty bug that leads to data corruption. The following
sequence results in corruption of the visibility map, but I believe it
can be shown to cause problems for a btree or GIN index as well. So it's
recoverable if you do a VACUUM or a reindex.

drop table foo;
create table foo(i int);
create index foo_idx on foo(i);
insert into foo values(1);
vacuum foo;
checkpoint;
insert into foo values(2);

-- Send a SIGQUIT to the postmaster, and restart it.
-- Now the buffer for the VM is in shared memory without BM_PERMANENT.
-- VM block comes from backup block which still has VM bit set, but wal
-- record for insert (2) unsets it in memory.
 
delete from foo where i = 2;
-- This checkpoint will *not* write out the buffer because it's not
-- BM_PERMANENT. Therefore, it remains the same on disk, with the VM
-- bit set.
checkpoint;

-- Send a SIGQUIT to the postmaster and restart. WAL records prior to
-- that last checkpoint aren't replayed, so VM bit is still set.

set enable_seqscan=false;
select * from foo where i = 2;
vacuum foo;
  WARNING:  page is not marked all-visible but visibility map bit is
  set in relation "foo" page 0
  VACUUM
select * from foo;
   i 
  ---
   1
  (1 row)

This bug seems particularly troublesome because the right fix would be
to include the relpersistence in the WAL records that need it. But that
can't be backported (right?).

The problem might not be a live problem before
7e4911b2ae33acff7b85234b91372133ec6df9d4, because that bug was hiding
this one.

Regards,
Jeff Davis





Re: [BUGS] BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

2012-09-12 Thread Fujii Masao
On Wed, Sep 12, 2012 at 8:47 PM,   wrote:
> The following bug has been logged on the website:
>
> Bug reference:  7533
> Logged by:  Amit Kapila
> Email address:  amit.kap...@huawei.com
> PostgreSQL version: 9.2.0
> Operating system:   Suse
> Description:
>
> M host is primary, S host is standby and CS host is cascaded standby.
>
> 1. Set up postgresql-9.2beta2/RC1 on all hosts.
> 2. Execute the command initdb on host M to create a fresh database.
> 3. Modify the configuration file postgresql.conf on host M like this:
>      listen_addresses = 'M'
>      port = 15210
>      wal_level = hot_standby
>      max_wal_senders = 4
>      hot_standby = on
> 4. Modify the configuration file pg_hba.conf on host M like this:
>      host    replication    repl    M/24    md5
> 5. Start the server on host M as primary.
> 6. Connect one client to the primary server and create a user 'repl':
>      Create user repl superuser password '123';
> 7. Use the command pg_basebackup on host S to retrieve the database of the
> primary host:
>      pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P -v -h M -p 15210 -U repl -W
> 8. Copy recovery.conf.sample from the share folder of the package to the
> database folder of host S, then rename this file to recovery.conf.
> 9. Modify the file recovery.conf on host S as below:
>      standby_mode = on
>      primary_conninfo = 'host=M port=15210 user=repl password=123'
> 10. Modify the file postgresql.conf on host S as follows:
>      listen_addresses = 'S'
> 11. Start the server on host S as standby server.
> 12. Use the command pg_basebackup on host CS to retrieve the database of the
> standby host:
>      pg_basebackup -D /opt/t38917/data -F p -x fetch -c fast -l repl_backup -P -v -h M -p 15210 -U repl -W
> 13. Modify the file recovery.conf on host CS as below:
>      standby_mode = on
>      primary_conninfo = 'host=S port=15210 user=repl password=123'
> 14. Modify the file postgresql.conf on host S as follows:
>      listen_addresses = 'CS'
> 15. Start the server on host CS as cascaded standby server node.
> 16. Try to connect a client to host CS, but it gives the error:
> FATAL:  the database system is starting up

This procedure didn't reproduce the problem in HEAD. But when I restarted
the master server between steps 11 and 12, I was able to reproduce the
problem.

> Observations related to bug
> --
> In the above scenario it is observed that the startup process has read all
> data (in our defect scenario minRecoveryPoint is 5016220) up to position
> 5016220, and then checks for recovery consistency with the following
> condition in the function CheckRecoveryConsistency:
> if (!reachedConsistency &&
> XLByteLE(minRecoveryPoint, EndRecPtr) &&
> XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
>
> At this point the first two conditions are true, but the last condition is
> not, because redo has not yet been applied and hence backupStartPoint has
> not been reset. So it does not signal the postmaster about the consistent
> state. After this it applies the redo, resets backupStartPoint, and goes to
> read the next set of records. Since all records have already been read, it
> starts waiting for a new record from the standby node. But since no new
> record is coming from the standby node, it keeps waiting and never gets a
> chance to recheck the recovery-consistency level. Hence client connections
> are not allowed.

If cascaded standby starts a recovery at a normal checkpoint record,
this problem will not happen. Because if wal_level is set to hot_standby,
an XLOG_RUNNING_XACTS WAL record always follows the normal
checkpoint record. So while the XLOG_RUNNING_XACTS record is being replayed,
ControlFile->backupStartPoint can be reset, and then the cascaded standby
can pass through the consistency test.

The problem happens when cascaded standby starts a recovery at a
shutdown checkpoint record. In this case, no WAL record might follow
the checkpoint one yet. So, after replaying the shutdown checkpoint
record, cascaded standby needs to wait for new WAL record to appear
before reaching the code block for resetting ControlFile->backupStartPoint.
The cascaded standby cannot reach a consistent state and a client cannot
connect to the cascaded standby until new WAL has arrived.

Attached patch will fix the problem. In this patch, if recovery is
beginning at a shutdown checkpoint record, any ControlFile fields
(like backupStartPoint) required for checking that an end-of-backup is
reached are not set at first. IOW, cascaded standby thinks that the
database is consistent from the beginning. This is safe because
a shutdown checkpoint record means that there is no running database
activity at that point and the database is in consistent state.

Comments? Review?

Regards,

-- 
Fujii Masao


cascaded_standby_and_shutdown_checkpoint_v1.patch
Description: Binary data


[BUGS] BUG #7535: ERROR: variable not found in subplan target list

2012-09-12 Thread ldm
The following bug has been logged on the website:

Bug reference:  7535
Logged by:  Louis-David Mitterrand
Email address:  l...@apartia.fr
PostgreSQL version: 9.2.0
Operating system:   debian testing
Description:

/*error in 9.2*/
drop database error92;
create database error92;
\c error92
CREATE TABLE price (
id_price serial primary key
);
CREATE TABLE cabin_class (
id_cabin_class serial primary key,
cabin_class_name text not null
);
CREATE TABLE cabin_type (
id_cabin_type serial primary key,
id_cabin_class integer references cabin_class not null,
cabin_type_name text not null,
cabin_type_code text not null
);
CREATE TABLE cabin_category (
id_cabin_category serial primary key,
id_cabin_type integer references cabin_type not null,
cabin_cat_name text,
cabin_cat_code text
);
CREATE TABLE alert_cruise (
id_alert_cruise serial primary key,
/*id_cruise integer references cruise not null,
id_currency integer references currency,*/
enabled boolean default true not null,
md5_code text DEFAULT md5(now()::text || random()::text)
);
CREATE TABLE alert_cabin_category (
id_alert_cruise integer references alert_cruise,
id_cabin_category integer references cabin_category not null
);
CREATE TABLE alert_cabin_type (
id_alert_cruise integer references alert_cruise,
id_cabin_type integer references cabin_type not null
);
CREATE TABLE alert_cabin_class (
id_alert_cruise integer references alert_cruise,
id_cabin_class integer references cabin_class not null
);
CREATE VIEW alert_cruise_all AS
select c.*, ac.enabled
from (
-- ac.id_cabin_class is the first one, so other UNION parts will
-- use that column name as well
select 'class' as type,cl.cabin_class_name as type_name,ac.*
from alert_cabin_class ac
join cabin_class cl using (id_cabin_class)
union all
select 'type' as type,cl.cabin_class_name||case when 
cl.cabin_class_name =
ct.cabin_type_name then '' else ' ~ '||ct.cabin_type_name end as
type_name,ac.*
from alert_cabin_type ac
join cabin_type ct using (id_cabin_type)
join cabin_class cl using (id_cabin_class)
union all
select 'category' as type,cl.cabin_class_name||
case when cl.cabin_class_name = ct.cabin_type_name
then '' else ' ~ '||ct.cabin_type_name end||' ~ '||cc.cabin_cat_code
as type_name,ac.*
from alert_cabin_category ac
join cabin_category cc
join cabin_type ct using (id_cabin_type)
join cabin_class cl using (id_cabin_class)
using (id_cabin_category)
)
as c join alert_cruise ac using (id_alert_cruise);
create view alert_to_category as
select ac.*, cl.cabin_class_name, ct.cabin_type_name, cc.cabin_cat_name,
cc.id_cabin_category, cc.cabin_cat_code,
case when type='class' then cl.cabin_class_name
when type='type' then ct.cabin_type_name
when type='category' then ct.cabin_type_name||' '||cc.cabin_cat_code
end as cabin_name
from alert_cruise_all ac
left join cabin_class cl on (
(ac.type = 'class' and cl.id_cabin_class=ac.id_cabin_class)
or (ac.type = 'type' and cl.id_cabin_class=(select id_cabin_class
from cabin_type where id_cabin_type=ac.id_cabin_class))
or (ac.type = 'category' and cl.id_cabin_class=(select
ct2.id_cabin_class from cabin_type ct2 join cabin_category cc2 using
(id_cabin_type) where cc2.id_cabin_category=ac.id_cabin_class))
)
join cabin_type ct on (
(ac.type = 'class' and  ct.id_cabin_class=cl.id_cabin_class)
or (ac.type = 'type' and ct.id_cabin_type=ac.id_cabin_class)
or (ac.type = 'category' and ct.id_cabin_type=(select id_cabin_type
from cabin_category where id_cabin_category=ac.id_cabin_class))
)
join cabin_category cc on (
(ac.type = 'category' and 
cc.id_cabin_category=ac.id_cabin_class)
or (ac.type != 'category' and ct.id_cabin_type=cc.id_cabin_type)
);
select 
first_value(max(p.id_price)) over () as id_price1
,ac.cabin_name
from alert_to_category ac
join price p on (p.id_price=ac.id_alert_cruise)
group by ac.cabin_name;
/*
select 
first_value(max(p.id_price)) over () as id_price1
,ac.cabin_name
from alert_to_category ac
join price p on (p.id_cruise=ac.id_cruise)
group by ac.cabin_name;
*/
/*EOF*/






Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-09-12 Thread Amit Kapila
On Wednesday, September 12, 2012 10:12 PM Magnus Hagander wrote:
> On Wed, Sep 12, 2012 at 1:54 PM,   wrote:
>> The following bug has been logged on the website:
>
>> Bug reference:  7534
>> Logged by:  Amit Kapila
>> Email address:  amit.kap...@huawei.com
>> PostgreSQL version: 9.2.0
>> Operating system:   Suse 10
>> Description:
>
>> 1. Both master and standby machines are connected normally.
>> 2. Then you use the command "ifconfig ip down" to make the network card of
>> master and standby go down.
>
>> Observation
>> The master can detect the abnormal connection, but the standby can't detect
>> it and shows a connected channel for a long time.

> The master will detect it quicker, because it will get an error when
> it tries to send something.

> But the standby should detect it either when sending the feedback
> message (what's your wal_receiver_status_interval set to?) or when
> the kernel does (have you configured tcp keepalive on the slave
> somehow?)
  wal_receiver_status_interval - 10s (we have not changed this; the
default is used).
  We have tried tcp keepalive as well; it may not be able to detect the
failure because the receiver is anyway trying to send receiver status.
  The send() socket call from XLogWalRcvSendReply() fails only after being
called many times, because send() keeps accumulating data in the socket's
internal buffer until the buffer is full, even if the other side's recv()
has not received the data.
  Also, the walsender fails due to the replication_timeout parameter, not
due to a send failure.
  So in my opinion, the foolproof solution would be to have a mechanism
(replication_timeout) in the walreceiver similar to the walsender's.

> Oh, and what do you actually mean by "long time"?
  15-20 mins.







Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-09-12 Thread Amit Kapila
On Wednesday, September 12, 2012 10:15 PM Fujii Masao wrote:
> On Wed, Sep 12, 2012 at 8:54 PM,   wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:  7534
>> Logged by:  Amit Kapila
>> Email address:  amit.kap...@huawei.com
>> PostgreSQL version: 9.2.0
>> Operating system:   Suse 10
>> Description:
>
>> 1. Both master and standby machines are connected normally.
>> 2. Then you use the command "ifconfig ip down" to make the network card of
>> master and standby go down.
>
>> Observation
>> The master can detect the abnormal connection, but the standby can't detect
>> it and shows a connected channel for a long time.

> What about setting keepalives_xxx libpq parameters?
> http://www.postgresql.org/docs/devel/static/libpq-connect.html#LIBPQ-PARAMKEYWORDS

> Keepalives are not a perfect solution for detecting a terminated connection,
> but they would help to a certain extent.

We have tried enabling keepalive, but it didn't work, maybe because the
walreceiver is trying to send receiver status.
It fails to send that after many attempts of the same.

> If you need something like a walreceiver version of replication_timeout,
> such a feature has not been implemented yet.
> Please feel free to implement it!

 I would like to implement such a feature for the walreceiver, but there is
one point of confusion: whether to use the same configuration parameter
(replication_timeout) for the walreceiver as for the master, or introduce a
new configuration parameter (receiver_replication_timeout).

 The only point in having different timeout parameters for walsender and
walreceiver is the case of a standby that has both a walsender and a
walreceiver (to send logs to a cascaded standby); in such a case somebody
might want different timeout values for the two.
 OTOH it will create confusion to have too many parameters. My opinion is to
have one timeout parameter for both walsender and walreceiver.
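
In postgresql.conf terms, the two candidate shapes would look like this (the
second name is hypothetical, merely the one proposed above):

    replication_timeout = 60s             # existing walsender-side timeout
    #receiver_replication_timeout = 60s   # hypothetical walreceiver-side twin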

Let me know your suggestions/opinions on the same.

Note- I am marking cc to pgsql-hackers, as it will be a feature request.

With Regards,
Amit Kapila.






Re: [BUGS] BUG #7533: Client is not able to connect cascade standby incase basebackup is taken from hot standby

2012-09-12 Thread Amit Kapila
On Thursday, September 13, 2012 12:34 AM Fujii Masao wrote:
> On Wed, Sep 12, 2012 at 8:47 PM,   wrote:
>> The following bug has been logged on the website:
>
>> Bug reference:  7533
>> Logged by:  Amit Kapila
>> Email address:  amit.kap...@huawei.com
>> PostgreSQL version: 9.2.0
>> Operating system:   Suse
>> Description:
>
>> M host is primary, S host is standby and CS host is cascaded standby.
>


> This procedure didn't reproduce the problem in HEAD. But when I restarted
> the master server between steps 11 and 12, I was able to reproduce the
> problem.

  We also observed that it didn't appear in 9.2rc1, due to commit
b8b69d89905e04b910bcd65efce1791477b45d35 by Tom.
  The reason is that after the above fix the checkpoint WAL will come from
the master, so the cascaded standby will not get stuck in the loop.
  However, if we increase the checkpoint interval, this does appear;
sometimes we need to try 4-5 times.

>> Observations related to bug
>> --
>> In the above scenario it is observed that the startup process has read all
>> data (in our defect scenario minRecoveryPoint is 5016220) up to position
>> 5016220, and then checks for recovery consistency with the following
>> condition in the function CheckRecoveryConsistency:
>> if (!reachedConsistency &&
>> XLByteLE(minRecoveryPoint, EndRecPtr) &&
>> XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
>>
>> At this point the first two conditions are true, but the last condition is
>> not, because redo has not yet been applied and hence backupStartPoint has
>> not been reset. So it does not signal the postmaster about the consistent
>> state. After this it applies the redo, resets backupStartPoint, and goes to
>> read the next set of records. Since all records have already been read, it
>> starts waiting for a new record from the standby node. But since no new
>> record is coming from the standby node, it keeps waiting and never gets a
>> chance to recheck the recovery-consistency level. Hence client connections
>> are not allowed.

> If cascaded standby starts a recovery at a normal checkpoint record,
> this problem will not happen. Because if wal_level is set to hot_standby,
> an XLOG_RUNNING_XACTS WAL record always follows the normal
> checkpoint record. So while the XLOG_RUNNING_XACTS record is being replayed,
> ControlFile->backupStartPoint can be reset, and then the cascaded standby
> can pass through the consistency test.

> The problem happens when cascaded standby starts a recovery at a shutdown
> checkpoint record. In this case, no WAL record might follow the checkpoint
> one yet. So, after replaying the shutdown checkpoint record, cascaded
> standby needs to wait for new WAL record to appear before reaching the code
> block for resetting ControlFile->backupStartPoint.
> The cascaded standby cannot reach a consistent state and a client cannot
> connect to the cascaded standby until new WAL has arrived.

In the above scenario we are not doing a shutdown, so how can a shutdown
checkpoint record come?

Also, for the normal checkpoint case I have done a brief analysis:
I have observed in the code that ControlFile->minRecoveryPoint is updated
while replaying XLOG_BACKUP_END WAL.
On hot standby S, this means that ControlFile->minRecoveryPoint will point to
an LSN after the checkpoint record.

Now when recovery starts on cascaded standby CS after a basebackup from hot
standby S, minRecoveryPoint should point to an LSN after the checkpoint
record's LSN, so it might create a problem.


> Attached patch will fix the problem. In this patch, if recovery is beginning
> at a shutdown checkpoint record, any ControlFile fields (like
> backupStartPoint) required for checking that an end-of-backup is reached are
> not set at first. IOW, cascaded standby thinks that the database is
> consistent from the beginning. This is safe because a shutdown checkpoint
> record means that there is no running database activity at that point and
> the database is in consistent state.

  I shall test with the patch.

With Regards,
Amit Kapila.



