Re: How to discover what table is

2020-03-11 Thread PegoraroF10
Well, for now it's solved, but I'll explain what happened so it can be handled
better in the future.
Suppose that on the master you have a database with hundreds of schemas sharing
the same structure, so the table Customer exists 500 times in that DB. That
database is replicated with a publication/subscription covering all tables,
just that. To add new schemas to this database you go to the replica and create
them with structure only, go to the master and create them with data, then go
back to the replica and refresh the publication. Now imagine that one of those
schemas was created on the replica with data. This is where the problem occurs:
the message just says that the PK of table Customer is duplicated, but it is
not easy to find which table, because that table and that key exist in 90% of
my schemas. If the error message told us which schema the table belongs to,
that would be great.







Force WAL cleanup on running instance

2020-03-11 Thread Torsten Krah
Hi,

I am building a docker image with a postgresql 12.2 instance and while
doing so and importing a dump and running some update scripts wal size
is increasing.

When finished I don't need all those wal files anymore and tried to
force the daemon to clean them up and tried this:

select pg_switch_wal();
CHECKPOINT;

and did wait for a minute.

Sometimes it works and wal files are cleaned and moved away so my image
size is way smaller - but it does not happen always in that minute.

So is there a way to tell postgres to force the housekeeping of the wal
stuff via a statement / command line tool?
In a "normal" running instance it just takes care of itself and it will
happen sooner or later and it doesn't really matter when that will
happen - but with my docker image, which is automatically built, it would
be nice to have a deterministic way of triggering that to reduce the final
image size.

Is there some way to trigger this?

kind regards

Torsten
 
-- 






Re: Streaming replication - 11.5

2020-03-11 Thread Nicola Contu
These are the lines before

2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [1-1]
db=cmdv3,user=zabbix_check ERROR:  recovery is in progress
2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [2-1]
db=cmdv3,user=zabbix_check HINT:  WAL control functions cannot be executed
during recovery.
2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [3-1]
db=cmdv3,user=zabbix_check STATEMENT:  select
greatest(0,pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) from
pg_stat_replication where client_addr ='10.150.20.22'

That query is made by Zabbix. So I stopped the zabbix agent and tested
again. But still failing, because of this now :

pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_51199"

*pg_basebackup: could not receive data from WAL stream: SSL SYSCALL error:
EOF detected*^C4699810/504983062 kB (70%), 0/1 tablespace
(...ql11/data/base/16401/231363544.2)


here the full log starting right before the last try :

2020-03-11 09:22:44 GMT [] [12598]: [4508-1] db=,user= LOG:  restartpoint
complete: wrote 19565 buffers (0.2%); 0 WAL file(s) added, 0 removed, 7
recycled; write=270.014 s, sync=0.009 s, total=270.036 s; sync files=804,
longest=0.001 s, average=0.000 s; distance=131239 kB, estimate=725998 kB
2020-03-11 09:22:44 GMT [] [12598]: [4509-1] db=,user= LOG:  recovery
restart point at 643A/D8C05F70
2020-03-11 09:22:44 GMT [] [12598]: [4510-1] db=,user= DETAIL:  Last
completed transaction was at log time 2020-03-11 09:22:44.050084+00.
2020-03-11 09:23:14 GMT [] [12598]: [4511-1] db=,user= LOG:  restartpoint
starting: time
2020-03-11 09:27:44 GMT [] [12598]: [4512-1] db=,user= LOG:  restartpoint
complete: wrote 17069 buffers (0.2%); 0 WAL file(s) added, 0 removed, 17
recycled; write=269.879 s, sync=0.006 s, total=269.902 s; sync files=811,
longest=0.001 s, average=0.000 s; distance=120469 kB, estimate=665445 kB
2020-03-11 09:27:44 GMT [] [12598]: [4513-1] db=,user= LOG:  recovery
restart point at 643A/E01AB438
2020-03-11 09:27:44 GMT [] [12598]: [4514-1] db=,user= DETAIL:  Last
completed transaction was at log time 2020-03-11 09:27:43.945485+00.
2020-03-11 09:27:44 GMT [] [12598]: [4515-1] db=,user= LOG:  restartpoint
starting: force wait
2020-03-11 09:29:24 GMT [10.222.8.2(47834)] [50961]: [1-1]
db=cmdv3,user=nis LOG:  duration: 1402.004 ms  statement: SELECT id, name,
parent_id, parent, short_name, sales_rep_id FROM mmx_clients;
2020-03-11 09:29:34 GMT [10.222.8.2(47834)] [50961]: [2-1]
db=cmdv3,user=nis LOG:  duration: 9493.259 ms  statement: SELECT slid,
gnid, sof_id, client_id, product FROM mmx_slids;
2020-03-11 09:32:14 GMT [] [12598]: [4516-1] db=,user= LOG:  restartpoint
complete: wrote 71260 buffers (0.8%); 0 WAL file(s) added, 0 removed, 13
recycled; write=269.953 s, sync=0.012 s, total=269.979 s; sync files=760,
longest=0.002 s, average=0.000 s; distance=123412 kB, estimate=611242 kB
2020-03-11 09:32:14 GMT [] [12598]: [4517-1] db=,user= LOG:  recovery
restart point at 643A/E7A30498
2020-03-11 09:32:14 GMT [] [12598]: [4518-1] db=,user= DETAIL:  Last
completed transaction was at log time 2020-03-11 09:32:13.916101+00.
2020-03-11 09:32:44 GMT [] [12598]: [4519-1] db=,user= LOG:  restartpoint
starting: time
2020-03-11 09:37:14 GMT [] [12598]: [4520-1] db=,user= LOG:  restartpoint
complete: wrote 27130 buffers (0.3%); 0 WAL file(s) added, 0 removed, 12
recycled; write=270.026 s, sync=0.007 s, total=270.052 s; sync files=814,
longest=0.001 s, average=0.000 s; distance=280595 kB, estimate=578177 kB
2020-03-11 09:37:14 GMT [] [12598]: [4521-1] db=,user= LOG:  recovery
restart point at 643A/F8C351C8
2020-03-11 09:37:14 GMT [] [12598]: [4522-1] db=,user= DETAIL:  Last
completed transaction was at log time 2020-03-11 09:37:14.067443+00.
2020-03-11 09:37:44 GMT [] [12598]: [4523-1] db=,user= LOG:  restartpoint
starting: time
2020-03-11 09:42:14 GMT [] [12598]: [4524-1] db=,user= LOG:  restartpoint
complete: wrote 26040 buffers (0.3%); 0 WAL file(s) added, 0 removed, 9
recycled; write=269.850 s, sync=0.019 s, total=269.886 s; sync files=834,
longest=0.002 s, average=0.000 s; distance=236392 kB, estimate=543999 kB
2020-03-11 09:42:14 GMT [] [12598]: [4525-1] db=,user= LOG:  recovery
restart point at 643B/730F3F8
2020-03-11 09:42:14 GMT [] [12598]: [4526-1] db=,user= DETAIL:  Last
completed transaction was at log time 2020-03-11 09:42:13.900088+00.
2020-03-11 09:42:44 GMT [] [12598]: [4527-1] db=,user= LOG:  restartpoint
starting: time
2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1]
db=[unknown],user=replicator LOG:  terminating walsender process due to
replication timeout
2020-03-11 09:47:14 GMT [] [12598]: [4528-1] db=,user= LOG:  restartpoint
complete: wrote 20966 buffers (0.2%); 0 WAL file(s) added, 0 removed, 9
recycled; write=270.048 s, sync=0.014 s, total=270.085 s; sync files=852,
longest=0.001 s, average=0.000 s; distance=183749 kB, estimate=507974 kB
2020-03-11 09:47:14 GMT [] [12598]: [4529-1] db=,user= LOG:  recovery
rest

Re: How to set a value when NULL

2020-03-11 Thread sivapostg...@yahoo.com
Hello,
Need to set a value of zero when the field value is NULL in a trigger function.
Tried with
NEW.fieldname = NULLIF(NEW.fieldname, 0)
in a before insert/update trigger.
Looks like it's not working. Am I doing anything wrong?
Happiness Always
BKR Sivaprakash


RE: Patterns to look for in the PostgreSQL server log

2020-03-11 Thread Kevin Brannen
>From: Mageshwaran Janarthanam 
>
>Hi Team...I am trying to setup some monitoring over the PostgreSQL server log. 
>I am not clear which error I should be most concerned about. Could you please 
>share your thoughts on what pattern I should search in the log file?

I'd treat it like any other log and apply negative filters; i.e. filter out
things that aren't a problem, which leaves things you do care about and
anything unexpected (which might cause you to add another filter to remove
it for next time if you find it's not really a problem).

Adrian is giving good advice too. Understand the log levels and set your config
appropriately for your situation and comfort level.


Re: How to set a value when NULL

2020-03-11 Thread David G. Johnston
On Wednesday, March 11, 2020, sivapostg...@yahoo.com 
wrote:

> Hello,
> Need to set a value of Zero when the field value is NULL in trigger
> function.
>
> Tried with,
> NEW.fieldname = NULLIF(NEW.fieldname, 0)
>
> in before insert/update trigger.
>
> Looks like it's not working.  I'm doing anything wrong
>

NULLIF does the inverse of what you want - your expression returns null if
fieldname has a value of 0.

COALESCE is what you want.

Coalesce(fieldname, 0) — returns the first, non-null argument.
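For example, a minimal sketch of that in a BEFORE trigger (the table t and the
integer column fieldname here are hypothetical):

CREATE OR REPLACE FUNCTION default_fieldname_to_zero() RETURNS trigger AS $$
BEGIN
    -- replace NULL with 0 before the row is written
    NEW.fieldname := COALESCE(NEW.fieldname, 0);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER default_fieldname_to_zero
    BEFORE INSERT OR UPDATE ON t
    FOR EACH ROW EXECUTE PROCEDURE default_fieldname_to_zero();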

David J.


Re: Force WAL cleanup on running instance

2020-03-11 Thread Michael Lewis
I don't know the answer to your stated question. I am curious if you have
set wal_level = minimal and if not, if that would be appropriate for your
use case and might render your concern a non-issue.
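If it does fit, a hedged sketch of that change (only appropriate when no WAL
archiving or streaming replication is needed from the instance; on PostgreSQL
12, wal_level = minimal also requires max_wal_senders = 0, and both settings
need a restart):

-- run as superuser, then restart the instance
ALTER SYSTEM SET wal_level = 'minimal';
ALTER SYSTEM SET max_wal_senders = 0;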


Re: Web users as database users?

2020-03-11 Thread Michael Lewis
On Fri, Sep 20, 2019 at 8:19 AM Tom Lane  wrote:

> There is a restriction on how many distinct GRANTs you can
> issue against any one object --- performance will get bad if the ACL
> list gets too large.
>


Any ballpark numbers here? Are we talking 50 or 8000?


Re: Web users as database users?

2020-03-11 Thread Andrei Zhidenkov
I used to use a different approach:

1. Create auth() pl/python procedure as follows:

create or replace
function auth(auser_id integer) returns void as $$
GD['user_id'] = auser_id
$$ language plpythonu;

This procedure is supposed to be called after a successful authorisation (in
the database or on the application side).

2. Create get_current_user() procedure:

create or replace
function get_current_user() returns integer as $$
return GD.get('user_id')
$$ language plpythonu stable security definer;

Now you can get the current user id from every SQL query or stored procedure.
It works fast because the Python shared dictionary GD is always present in
memory.
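A hypothetical usage sketch (both calls have to run on the same connection,
because GD is kept per session):

SELECT auth(42);               -- remember user id 42 for this session
SELECT get_current_user();     -- later queries read it back and get 42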

> On 11. Mar 2020, at 15:46, Michael Lewis  wrote:
> 
> On Fri, Sep 20, 2019 at 8:19 AM Tom Lane  > wrote:
> There is a restriction on how many distinct GRANTs you can
> issue against any one object --- performance will get bad if the ACL
> list gets too large.
> 
> 
> Any ballpark numbers here? Are we talking 50 or 8000?



Re: Force WAL cleanup on running instance

2020-03-11 Thread Torsten Krah
On Wednesday, 11.03.2020 at 08:42 -0600, Michael Lewis wrote:
> I don't know the answer to your stated question. I am curious if you
> have
> set wal_level = minimal and if not, if that would be appropriate for
> your
> use case and might render your concern a non-issue.

Hi Michael,

I am already running with "wal_level = minimal" set.

kind regards

Torsten





Re: Web users as database users?

2020-03-11 Thread Tom Lane
Michael Lewis  writes:
> On Fri, Sep 20, 2019 at 8:19 AM Tom Lane  wrote:
>> There is a restriction on how many distinct GRANTs you can
>> issue against any one object --- performance will get bad if the ACL
>> list gets too large.

> Any ballpark numbers here? Are we talking 50 or 8000?

More like the former.  aclitems are 12 bytes each, so once you get
past ~100 items in an object's ACL list, the array is going to
get large enough to be subject to toasting, greatly increasing the
time to access it.

That's not even counting the question of whether scanning lots
of items to determine access privileges is expensive.  The code
that deals with ACL lists isn't terribly well optimized.

I'm not aware that anyone's done formal performance testing,
but if you want to have a lot of roles in the system, my
expectation is that you'd be better off granting privileges
to a small number of group roles and then granting group
role membership as appropriate.
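For example, a sketch of that pattern with hypothetical role and object names:

-- one set of grants, made to a small number of group roles
CREATE ROLE app_readers NOLOGIN;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readers;

-- login roles just get membership, keeping each object's ACL list short
CREATE ROLE alice LOGIN;
GRANT app_readers TO alice;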

regards, tom lane




Re: ERROR: invalid memory alloc request size 1073741824

2020-03-11 Thread Stefan Blanke

Hi,

We've upgraded to PostgreSQL 11.5 (postgresql.org rhel 6 rpm) and I have 
had another occurrence of this invalid alloc of 1GB. Apologies for never 
providing a query plan when discussing this two years ago; we decided to 
move to a newer PostgreSQL to see if the issue went away but took a 
while to complete the move.


The invalid alloc still only occurs occasionally every few months on a 
query that we run every minute; so unfortunately we still don't have a 
contained reproducible test case.


This is the SQL we are running with a query plan - the query plan is 
from a new empty database, so the planner has no stats.


CREATE OR REPLACE FUNCTION
create_table()
RETURNS TABLE (a INTEGER, b INTEGER)
AS $$
UPDATE y
SET c = true,
d = false
WHERE e IS NOT true
AND f IS NOT true
AND g = 1
AND h = 0
AND i = 0
AND (j IS NULL
OR j > 0)
RETURNING y.a, y.b;
$$ LANGUAGE SQL;

-- Prepared statement (PQprepare)
CREATE TEMPORARY TABLE temp_table AS
SELECT * FROM create_table()

-- Prepared statement (PQprepare)
SELECT y.a, y.b,
 x.k,
 x.l,
 y.m,
 y.n,
 y.o
 FROM temp_table
 INNER JOIN y ON temp_table.b = y.b
 AND temp_table.a = y.a
 INNER JOIN x ON x.a = y.a

-- The two prepared statements are executed one after another
-- in the order shown many times an hour.

The query plan for the second prepared statement is:

 Nested Loop  (cost=17.14..64.38 rows=16 width=112)
   ->  Hash Join  (cost=17.00..61.47 rows=16 width=80)
 Hash Cond: ((temp_table.b = y.b) AND (temp_table.a = y.a))
 ->  Seq Scan on temp_table  (cost=0.00..32.60 rows=2260 width=8)
 ->  Hash  (cost=12.80..12.80 rows=280 width=76)
   ->  Seq Scan on y  (cost=0.00..12.80 rows=280 width=76)
   ->  Index Scan using x_pkey on x  (cost=0.14..0.18 rows=1 width=40)
 Index Cond: (a = temp_table.a)

Thanks,
Stefan

On 31/01/2018 21:23, Tomas Vondra wrote:



On 01/31/2018 09:51 PM, Jan Wieck wrote:



On Wed, Jan 31, 2018 at 12:32 PM, Stefan Blanke
mailto:stefan.bla...@framestore.com>> wrote:

 >
 > I'll bet you it's not that. It's quite unlikely that would fail with
 > exactly 1GB request size. It seems much more like a buffer that we keep
 > to be power of 2. The question is which one.

 I had dismissed corruption before writing in. It's exactly 1GB every
 time this has happened - and we can dump the full dataset
 periodically without issue.

 >> I have my money on a corrupted TOAST entry. Is this happening on
 >> trustworthy hardware or beige box with no ECC or RAID?

 It's good quality commercial hardware in our colo - no exactly sure
 what.


If it is a sporadic issue and you can dump the full dataset, then I just
lost my money (Tomas, you coming to PGConf in Jersey City?).



Unfortunately no, but I'm sure there will be other opportunities to buy
me a beer ;-) Like pgcon, for example.



But then, if this is a plain COPY to stdout ... I am wondering what
could possibly be in that path that wants to allocate 1GB. Or is this
not so plain but rather a COPY ... SELECT ...?



That's what I've been guessing, and why I was asking for a query plan.


regards






Re: ERROR: invalid memory alloc request size 1073741824

2020-03-11 Thread Tom Lane
Stefan Blanke  writes:
> We've upgraded to PostgreSQL 11.5 (postgresql.org rhel 6 rpm) and I have 
> had another occurrence of this invalid alloc of 1GB. Apologies for never 
> providing a query plan when discussing this two years ago; we decided to 
> move to a newer PostgreSQL to see if the issue went away but took a 
> while to complete the move.

> The invalid alloc still only occurs occasionally every few months on a 
> query that we run every minute; so unfortunately we still don't have a 
> contained reproducible test case.

Given the lack of stats, I wonder whether the issue could be related
to the plan sometimes being horribly bad, eg due to the temp table
being much larger than expected.  (A possible mechanism would be
hash table bloat, perhaps, but that's getting way ahead of the
evidence.)

Could you adjust your process to log the actual temp table size
each time, ie "select count(*) from temp_table" in between the
two steps, and then note whether the failures are correlated
with unusual temp table sizes?
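For instance, something along these lines could be run (and its result logged
by the application) between the two prepared statements:

-- hypothetical instrumentation step
SELECT count(*) AS temp_table_rows FROM temp_table;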

regards, tom lane




Re: Force WAL cleanup on running instance

2020-03-11 Thread Simon Riggs
On Wed, 11 Mar 2020 at 08:59, Torsten Krah  wrote:

> Hi,
>
> I am building a docker image with a postgresql 12.2 instance and while
> doing so and importing a dump and running some update scripts wal size
> is increasing.
>
> When finished I don't need all those wal files anymore and tried to
> force the daemon to clean them up and tried this:
>
> select pg_switch_wal();
> CHECKPOINT;
>
> and did wait for a minute.
>
> Sometimes it works and wal files are cleaned and moved away so my image
> size is way smaller - but it does not happen always in that minute.
>
> So is there a way to tell postgres to force the housekeeping of the wal
> stuff via a statement / command line tool?
> In a "normal" running instance it just takes care of itself and it will
> happen sooner or later and it doesn't really matter when that will
> happen - but with my docker image which is automatically build it would
> be nice to have a deterministic way of trigger that to reduce the final
> size image.


The size of the task varies, so it sometimes takes longer than 60s, depending
upon your hardware.

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Solutions for the Enterprise


Re: Force WAL cleanup on running instance

2020-03-11 Thread Justin
A question for everyone: isn't this a problem with the order of operations?

Switching the WAL file and then running the checkpoint means the checkpoint can
cross WAL files, so the previous WAL file cannot be deleted?

To my understanding the order of operations should be

CHECKPOINT
which flushes everything to disk, then
pg_switch_wal()

which creates an empty WAL file, and the previous WAL can then be deleted?
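In SQL, a minimal sketch of the order being described:

CHECKPOINT;               -- flush everything to disk first
SELECT pg_switch_wal();   -- then switch to a new WAL segment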

http://www.interdb.jp/pg/pgsql09.html#_9.7.
https://www.postgresql.org/docs/current/wal-configuration.html

Or am I missing something?

On Wed, Mar 11, 2020 at 11:45 AM Simon Riggs  wrote:

> On Wed, 11 Mar 2020 at 08:59, Torsten Krah  wrote:
>
>> Hi,
>>
>> I am building a docker image with a postgresql 12.2 instance and while
>> doing so and importing a dump and running some update scripts wal size
>> is increasing.
>>
>> When finished I don't need all those wal files anymore and tried to
>> force the daemon to clean them up and tried this:
>>
>> select pg_switch_wal();
>> CHECKPOINT;
>>
>> and did wait for a minute.
>>
>> Sometimes it works and wal files are cleaned and moved away so my image
>> size is way smaller - but it does not happen always in that minute.
>>
>> So is there a way to tell postgres to force the housekeeping of the wal
>> stuff via a statement / command line tool?
>> In a "normal" running instance it just takes care of itself and it will
>> happen sooner or later and it doesn't really matter when that will
>> happen - but with my docker image which is automatically build it would
>> be nice to have a deterministic way of trigger that to reduce the final
>> size image.
>
>
> The size of the task varies, so sometimes takes longer than 60s, depending
> upon your hardware.
>
> --
> Simon Riggs    http://www.2ndQuadrant.com/
> 
> PostgreSQL Solutions for the Enterprise
>


Re: Force WAL cleanup on running instance

2020-03-11 Thread Torsten Krah
On Wednesday, 11.03.2020 at 15:45, Simon Riggs wrote:
> The size of the task varies, so sometimes takes longer than 60s,
> depending
> upon your hardware.

Yes, that's what I am observing, and why I am asking whether there is some
SELECT statement or command which triggers that task and returns after it has
finished, as I can't predict how long that task may run.

kind regards

Torsten

-- 






Re: Force WAL cleanup on running instance

2020-03-11 Thread Jerry Sievers
Torsten Krah  writes:

> On Wednesday, 11.03.2020 at 15:45, Simon Riggs wrote:
>
>> The size of the task varies, so sometimes takes longer than 60s,
>> depending
>> upon your hardware.
>
> Yes that's what I am observing and why I am asking if there is some
> select statement or command which triggers that task and returns after
> it has finished - as I can't predict how long that task may run.

If your site can afford a restart after the bulk load,

1. Clean shutdown.
2. pg_resetwal
3. Start 

That should leave you with a very small number of WAL files, perhaps just one,
though I've not run it lately to reverify.

HTH

>
> kind regards
>
> Torsten

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net




Re: Force WAL cleanup on running instance

2020-03-11 Thread Paul Förster
Hi Justin,

that came to my mind also. Then I tried it and found that a new WAL is not
always created. I admit I tried on a test DC with no other transactions going
on, so maybe I should have tested it on a busier system. Anyway, I also always
do the checkpoint first and then the WAL switch, which in my case is also an
old habit from Oracle. ;-)

Cheers,
Paul


> On 11. Mar, 2020, at 16:51, Justin  wrote:
> 
> Question everyone isn't this a problem with the order of operations?
> 
> switching the wal files  then running checkpoint means the Checkpoint can 
> cross wal files,  so the previous wal file can not be deleted???
> 
> To my understanding the order operations should be 
> 
> Checkpoint 
> which  flushes everything to disk, then
> pg_switch_wal()
> 
> which creates an empty wal file and the previous wal can be deleted?
>  
> http://www.interdb.jp/pg/pgsql09.html#_9.7.
> https://www.postgresql.org/docs/current/wal-configuration.html
> 
> Or am i missing something?
> 
> On Wed, Mar 11, 2020 at 11:45 AM Simon Riggs  wrote:
> On Wed, 11 Mar 2020 at 08:59, Torsten Krah  wrote:
> Hi,
> 
> I am building a docker image with a postgresql 12.2 instance and while
> doing so and importing a dump and running some update scripts wal size
> is increasing.
> 
> When finished I don't need all those wal files anymore and tried to
> force the daemon to clean them up and tried this:
> 
> select pg_switch_wal();
> CHECKPOINT;
> 
> and did wait for a minute.
> 
> Sometimes it works and wal files are cleaned and moved away so my image
> size is way smaller - but it does not happen always in that minute.
> 
> So is there a way to tell postgres to force the housekeeping of the wal
> stuff via a statement / command line tool?
> In a "normal" running instance it just takes care of itself and it will
> happen sooner or later and it doesn't really matter when that will
> happen - but with my docker image which is automatically build it would
> be nice to have a deterministic way of trigger that to reduce the final
> size image.
> 
> The size of the task varies, so sometimes takes longer than 60s, depending 
> upon your hardware.
> 
> -- 
> Simon Riggs    http://www.2ndQuadrant.com/
> PostgreSQL Solutions for the Enterprise





Re: Force WAL cleanup on running instance

2020-03-11 Thread Torsten Krah
On Wednesday, 11.03.2020 at 11:46 -0500, Jerry Sievers wrote:
> If your site can afford a restart after the bulk load,
> 
> 1. Clean shutdown.
> 2. pg_resetwal
> 3. Start 
> 
> That should leave you with a very small N WAL files, perhaps just 1,
> though I've not run it lately to reverify.

Thanks - that worked like a charm and is exactly what I was looking for. Of
course I can afford that; the instance is shut down after the init anyway. So
after running pg_resetwal this was left:

root@53e0d45ce5d7:/var/lib/postgresql/data/pg_wal# ls -lh
total 17M
-rw--- 1 postgres postgres  16M Mär 11 17:23 00010012
drwx-- 2 postgres postgres 4,0K Mär 11 17:13 archive_status

Nice one :) thanks.

kind regards

Torsten





Re: How to discover what table is

2020-03-11 Thread Adrian Klaver

On 3/11/20 1:22 AM, PegoraroF10 wrote:

Well, for now it´s solved but I´ll explain what happens to solve it better on
future.
Suppose on Master you have a database with hundreds of schemas with same
structure, so table Customer happens 500 times on that DB. That database
being replicated with publication/subscription for all tables model, just
that. Now to have new schemas added to this database you go to replica,
create them with structure only, go to master and create them with data, go
back to replica and refresh publication. But then imagine that one of those
schemas you´ve created on replica with data. This is where problem occurs
and message is just PK of Table Customer is duplicated but is not easy to
find which table because I´ll have that table and that key on 90% of my
schemas. If, on error message we just have which schema belongs that table
would be great.


Some digging found that the ERROR uses get_rel_name for the relation 
name and that led to:



~/src/backend/utils/cache/lsyscache.c

/*
 * get_rel_name
 *  Returns the name of a given relation.
 *
 * Returns a palloc'd copy of the string, or NULL if no such relation.
 *
 * NOTE: since relation name is not unique, be wary of code that uses this
 * for anything except preparing error messages.
 */

Going any further is going to need someone with more knowledge of the 
above to chime in.












--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Streaming replication - 11.5

2020-03-11 Thread Adrian Klaver

On 3/11/20 2:54 AM, Nicola Contu wrote:

These are the lines before

2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [1-1] 
db=cmdv3,user=zabbix_check ERROR:  recovery is in progress
2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [2-1] 
db=cmdv3,user=zabbix_check HINT:  WAL control functions cannot be 
executed during recovery.
2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [3-1] 
db=cmdv3,user=zabbix_check STATEMENT:  select 
greatest(0,pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) from 
pg_stat_replication where client_addr ='10.150.20.22'


That query is made by Zabbix. So I stopped the zabbix agent and tested 
again. But still failing, because of this now :


pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_51199"
*pg_basebackup: could not receive data from WAL stream: SSL SYSCALL 
error: EOF detected
*^C4699810/504983062 kB (70%), 0/1 tablespace 
(...ql11/data/base/16401/231363544.2)


So you started over with a pg_basebackup?

Also from below:

2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1] 
db=[unknown],user=replicator LOG:  terminating walsender process due to 
replication timeout


Where are the master and standby in relation to each other network wise?

Intervening firewalls, network latency issues?






here the full log starting right before the last try :

2020-03-11 09:22:44 GMT [] [12598]: [4508-1] db=,user= LOG: 
  restartpoint complete: wrote 19565 buffers (0.2%); 0 WAL file(s) 
added, 0 removed, 7 recycled; write=270.014 s, sync=0.009 s, 
total=270.036 s; sync files=804, longest=0.001 s, average=0.000 s; 
distance=131239 kB, estimate=725998 kB
2020-03-11 09:22:44 GMT [] [12598]: [4509-1] db=,user= LOG:  recovery 
restart point at 643A/D8C05F70
2020-03-11 09:22:44 GMT [] [12598]: [4510-1] db=,user= DETAIL:  Last 
completed transaction was at log time 2020-03-11 09:22:44.050084+00.
2020-03-11 09:23:14 GMT [] [12598]: [4511-1] db=,user= LOG: 
  restartpoint starting: time
2020-03-11 09:27:44 GMT [] [12598]: [4512-1] db=,user= LOG: 
  restartpoint complete: wrote 17069 buffers (0.2%); 0 WAL file(s) 
added, 0 removed, 17 recycled; write=269.879 s, sync=0.006 s, 
total=269.902 s; sync files=811, longest=0.001 s, average=0.000 s; 
distance=120469 kB, estimate=665445 kB
2020-03-11 09:27:44 GMT [] [12598]: [4513-1] db=,user= LOG:  recovery 
restart point at 643A/E01AB438
2020-03-11 09:27:44 GMT [] [12598]: [4514-1] db=,user= DETAIL:  Last 
completed transaction was at log time 2020-03-11 09:27:43.945485+00.
2020-03-11 09:27:44 GMT [] [12598]: [4515-1] db=,user= LOG: 
  restartpoint starting: force wait
2020-03-11 09:29:24 GMT [10.222.8.2(47834)] [50961]: [1-1] 
db=cmdv3,user=nis LOG:  duration: 1402.004 ms  statement: SELECT id, 
name, parent_id, parent, short_name, sales_rep_id FROM mmx_clients;
2020-03-11 09:29:34 GMT [10.222.8.2(47834)] [50961]: [2-1] 
db=cmdv3,user=nis LOG:  duration: 9493.259 ms  statement: SELECT slid, 
gnid, sof_id, client_id, product FROM mmx_slids;
2020-03-11 09:32:14 GMT [] [12598]: [4516-1] db=,user= LOG: 
  restartpoint complete: wrote 71260 buffers (0.8%); 0 WAL file(s) 
added, 0 removed, 13 recycled; write=269.953 s, sync=0.012 s, 
total=269.979 s; sync files=760, longest=0.002 s, average=0.000 s; 
distance=123412 kB, estimate=611242 kB
2020-03-11 09:32:14 GMT [] [12598]: [4517-1] db=,user= LOG:  recovery 
restart point at 643A/E7A30498
2020-03-11 09:32:14 GMT [] [12598]: [4518-1] db=,user= DETAIL:  Last 
completed transaction was at log time 2020-03-11 09:32:13.916101+00.
2020-03-11 09:32:44 GMT [] [12598]: [4519-1] db=,user= LOG: 
  restartpoint starting: time
2020-03-11 09:37:14 GMT [] [12598]: [4520-1] db=,user= LOG: 
  restartpoint complete: wrote 27130 buffers (0.3%); 0 WAL file(s) 
added, 0 removed, 12 recycled; write=270.026 s, sync=0.007 s, 
total=270.052 s; sync files=814, longest=0.001 s, average=0.000 s; 
distance=280595 kB, estimate=578177 kB
2020-03-11 09:37:14 GMT [] [12598]: [4521-1] db=,user= LOG:  recovery 
restart point at 643A/F8C351C8
2020-03-11 09:37:14 GMT [] [12598]: [4522-1] db=,user= DETAIL:  Last 
completed transaction was at log time 2020-03-11 09:37:14.067443+00.
2020-03-11 09:37:44 GMT [] [12598]: [4523-1] db=,user= LOG: 
  restartpoint starting: time
2020-03-11 09:42:14 GMT [] [12598]: [4524-1] db=,user= LOG: 
  restartpoint complete: wrote 26040 buffers (0.3%); 0 WAL file(s) 
added, 0 removed, 9 recycled; write=269.850 s, sync=0.019 s, 
total=269.886 s; sync files=834, longest=0.002 s, average=0.000 s; 
distance=236392 kB, estimate=543999 kB
2020-03-11 09:42:14 GMT [] [12598]: [4525-1] db=,user= LOG:  recovery 
restart point at 643B/730F3F8
2020-03-11 09:42:14 GMT [] [12598]: [4526-1] db=,user= DETAIL:  Last 
completed transaction was at log time 2020-03-11 09:42:13.900088+00.
2020-03-11 09:42:44 GMT [] [12598]: [4527-1] db=,user= LOG: 
  restartpoint starting: time
2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1] 
db=[unknown

Re: Streaming replication - 11.5

2020-03-11 Thread Nicola Contu
I am actually cascading.
The master is in nyh, the first slave is in Dallas, and the one having
problems is in Dallas as well, on the same switch as the one replicating
from the master.

It always worked, so I'm not sure what is wrong now. We just encrypted the
disks on all servers.


On Wed, 11 Mar 2020 at 18:57, Adrian Klaver wrote:

> On 3/11/20 2:54 AM, Nicola Contu wrote:
> > These are the lines before
> >
> > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [1-1]
> > db=cmdv3,user=zabbix_check ERROR:  recovery is in progress
> > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [2-1]
> > db=cmdv3,user=zabbix_check HINT:  WAL control functions cannot be
> > executed during recovery.
> > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [3-1]
> > db=cmdv3,user=zabbix_check STATEMENT:  select
> > greatest(0,pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) from
> > pg_stat_replication where client_addr ='10.150.20.22'
> >
> > That query is made by Zabbix. So I stopped the zabbix agent and tested
> > again. But still failing, because of this now :
> >
> > pg_basebackup: starting background WAL receiver
> > pg_basebackup: created temporary replication slot "pg_basebackup_51199"
> > *pg_basebackup: could not receive data from WAL stream: SSL SYSCALL
> > error: EOF detected
> > *^C4699810/504983062 kB (70%), 0/1 tablespace
> > (...ql11/data/base/16401/231363544.2)
>
> So you started over with a pg_basebackup?
>
> Also from below:
>
> 2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1]
> db=[unknown],user=replicator LOG:  terminating walsender process due to
> replication timeout
>
> Where are the master and standby in relation to each other network wise?
>
> Intervening firewalls, network latency issues?
>
>
>
> >
> >
> > here the full log starting right before the last try :
> >
> > 2020-03-11 09:22:44 GMT [] [12598]: [4508-1] db=,user= LOG:
> >   restartpoint complete: wrote 19565 buffers (0.2%); 0 WAL file(s)
> > added, 0 removed, 7 recycled; write=270.014 s, sync=0.009 s,
> > total=270.036 s; sync files=804, longest=0.001 s, average=0.000 s;
> > distance=131239 kB, estimate=725998 kB
> > 2020-03-11 09:22:44 GMT [] [12598]: [4509-1] db=,user= LOG:  recovery
> > restart point at 643A/D8C05F70
> > 2020-03-11 09:22:44 GMT [] [12598]: [4510-1] db=,user= DETAIL:  Last
> > completed transaction was at log time 2020-03-11 09:22:44.050084+00.
> > 2020-03-11 09:23:14 GMT [] [12598]: [4511-1] db=,user= LOG:
> >   restartpoint starting: time
> > 2020-03-11 09:27:44 GMT [] [12598]: [4512-1] db=,user= LOG:
> >   restartpoint complete: wrote 17069 buffers (0.2%); 0 WAL file(s)
> > added, 0 removed, 17 recycled; write=269.879 s, sync=0.006 s,
> > total=269.902 s; sync files=811, longest=0.001 s, average=0.000 s;
> > distance=120469 kB, estimate=665445 kB
> > 2020-03-11 09:27:44 GMT [] [12598]: [4513-1] db=,user= LOG:  recovery
> > restart point at 643A/E01AB438
> > 2020-03-11 09:27:44 GMT [] [12598]: [4514-1] db=,user= DETAIL:  Last
> > completed transaction was at log time 2020-03-11 09:27:43.945485+00.
> > 2020-03-11 09:27:44 GMT [] [12598]: [4515-1] db=,user= LOG:
> >   restartpoint starting: force wait
> > 2020-03-11 09:29:24 GMT [10.222.8.2(47834)] [50961]: [1-1]
> > db=cmdv3,user=nis LOG:  duration: 1402.004 ms  statement: SELECT id,
> > name, parent_id, parent, short_name, sales_rep_id FROM mmx_clients;
> > 2020-03-11 09:29:34 GMT [10.222.8.2(47834)] [50961]: [2-1]
> > db=cmdv3,user=nis LOG:  duration: 9493.259 ms  statement: SELECT slid,
> > gnid, sof_id, client_id, product FROM mmx_slids;
> > 2020-03-11 09:32:14 GMT [] [12598]: [4516-1] db=,user= LOG:
> >   restartpoint complete: wrote 71260 buffers (0.8%); 0 WAL file(s)
> > added, 0 removed, 13 recycled; write=269.953 s, sync=0.012 s,
> > total=269.979 s; sync files=760, longest=0.002 s, average=0.000 s;
> > distance=123412 kB, estimate=611242 kB
> > 2020-03-11 09:32:14 GMT [] [12598]: [4517-1] db=,user= LOG:  recovery
> > restart point at 643A/E7A30498
> > 2020-03-11 09:32:14 GMT [] [12598]: [4518-1] db=,user= DETAIL:  Last
> > completed transaction was at log time 2020-03-11 09:32:13.916101+00.
> > 2020-03-11 09:32:44 GMT [] [12598]: [4519-1] db=,user= LOG:
> >   restartpoint starting: time
> > 2020-03-11 09:37:14 GMT [] [12598]: [4520-1] db=,user= LOG:
> >   restartpoint complete: wrote 27130 buffers (0.3%); 0 WAL file(s)
> > added, 0 removed, 12 recycled; write=270.026 s, sync=0.007 s,
> > total=270.052 s; sync files=814, longest=0.001 s, average=0.000 s;
> > distance=280595 kB, estimate=578177 kB
> > 2020-03-11 09:37:14 GMT [] [12598]: [4521-1] db=,user= LOG:  recovery
> > restart point at 643A/F8C351C8
> > 2020-03-11 09:37:14 GMT [] [12598]: [4522-1] db=,user= DETAIL:  Last
> > completed transaction was at log time 2020-03-11 09:37:14.067443+00.
> > 2020-03-11 09:37:44 GMT [] [12598]: [4523-1] db=,user= LOG:
> >   restartpoint starting: time
> > 2020-03-11 09:42:14 GMT [] [12598]: [4524-1] db=,user= LOG:
> >   restartpoint complete: wrote 26040 buff

Re: Web users as database users?

2020-03-11 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I'm not aware that anyone's done formal performance testing,
> but if you want to have a lot of roles in the system, my
> expectation is that you'd be better off granting privileges
> to a small number of group roles and then granting group
> role membership as appropriate.

Right- keep the ACL lists small for individual objects, then grant
membership in the appropriate roles to the actual users who log in and
such.  Having lots of roles works out a lot better that way.

Thanks,

Stephen


signature.asc
Description: PGP signature


Querying an index's btree version

2020-03-11 Thread Darren Lafreniere
Hello,

We've read that PG 12 has improved btree index support, and that the latest
internal btree version was bumped from 3 to 4. Is it possible to query the
btree version that a particular index is using? We'd like to automatically
start a concurrent re-index if we detect any btree indexes are still on
version 3.

Thank you,
Darren Lafreniere


Re: Querying an index's btree version

2020-03-11 Thread Peter Geoghegan
On Wed, Mar 11, 2020 at 1:26 PM Darren Lafreniere
 wrote:
> We've read that PG 12 has improved btree index support, and that the latest 
> internal btree version was bumped from 3 to 4. Is it possible to query the 
> btree version that a particular index is using? We'd like to automatically 
> start a concurrent re-index if we detect any btree indexes are still on 
> version 3.

It's possible, but you have to install the superuser-only pageinspect
extension. Here is how you'd determine that an index called
'pg_aggregate_fnoid_index' is on version 4:

regression=# create extension pageinspect;
CREATE EXTENSION
regression=# select version from bt_metap('pg_aggregate_fnoid_index');
 version
---------
       4
(1 row)
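Building on that, a hedged sketch of a catalog-wide check (assuming the
pageinspect extension is installed and this runs as a superuser; any index
reported as version 3 would be a REINDEX candidate):

SELECT n.nspname AS schema_name,
       c.relname AS index_name,
       (bt_metap(c.oid::regclass::text)).version AS btree_version
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_am am ON am.oid = c.relam
WHERE c.relkind = 'i'           -- plain indexes only
  AND am.amname = 'btree'
  AND c.relpersistence <> 't';  -- skip temporary indexes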

-- 
Peter Geoghegan




Re: Streaming replication - 11.5

2020-03-11 Thread Adrian Klaver

On 3/11/20 11:59 AM, Nicola Contu wrote:

I am actually cascading.
The master is in nyh, the first slave is in Dallas and the one having 
problems is in Dallas as well on the same switch of the one replicating 
from the master.


It always worked not sure what is wrong now. We just encrypted disks on 
all servers


Do you have before and after on CPU load, I/O throughput?

Do system logs show anything relevant during replication drop out?




On Wed, 11 Mar 2020 at 18:57, Adrian Klaver wrote:


On 3/11/20 2:54 AM, Nicola Contu wrote:
 > These are the lines before
 >
 > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [1-1]
 > db=cmdv3,user=zabbix_check ERROR:  recovery is in progress
 > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [2-1]
 > db=cmdv3,user=zabbix_check HINT:  WAL control functions cannot be
 > executed during recovery.
 > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [3-1]
 > db=cmdv3,user=zabbix_check STATEMENT:  select
 > greatest(0,pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) from
 > pg_stat_replication where client_addr ='10.150.20.22'
 >
 > That query is made by Zabbix. So I stopped the zabbix agent and
tested
 > again. But still failing, because of this now :
 >
 > pg_basebackup: starting background WAL receiver
 > pg_basebackup: created temporary replication slot
"pg_basebackup_51199"
 > *pg_basebackup: could not receive data from WAL stream: SSL SYSCALL
 > error: EOF detected
 > *^C4699810/504983062 kB (70%), 0/1 tablespace
 > (...ql11/data/base/16401/231363544.2)

So you started over with a pg_basebackup?

Also from below:

2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1]
db=[unknown],user=replicator LOG:  terminating walsender process due to
replication timeout

Where are the master and standby in relation to each other network wise?

Intervening firewalls, network latency issues?



 >
 >
 > here the full log starting right before the last try :
 >
 > 2020-03-11 09:22:44 GMT [] [12598]: [4508-1] db=,user= LOG:
 >   restartpoint complete: wrote 19565 buffers (0.2%); 0 WAL file(s)
 > added, 0 removed, 7 recycled; write=270.014 s, sync=0.009 s,
 > total=270.036 s; sync files=804, longest=0.001 s, average=0.000 s;
 > distance=131239 kB, estimate=725998 kB
 > 2020-03-11 09:22:44 GMT [] [12598]: [4509-1] db=,user= LOG:
  recovery
 > restart point at 643A/D8C05F70
 > 2020-03-11 09:22:44 GMT [] [12598]: [4510-1] db=,user= DETAIL:  Last
 > completed transaction was at log time 2020-03-11 09:22:44.050084+00.
 > 2020-03-11 09:23:14 GMT [] [12598]: [4511-1] db=,user= LOG:
 >   restartpoint starting: time
 > 2020-03-11 09:27:44 GMT [] [12598]: [4512-1] db=,user= LOG:
 >   restartpoint complete: wrote 17069 buffers (0.2%); 0 WAL file(s)
 > added, 0 removed, 17 recycled; write=269.879 s, sync=0.006 s,
 > total=269.902 s; sync files=811, longest=0.001 s, average=0.000 s;
 > distance=120469 kB, estimate=665445 kB
 > 2020-03-11 09:27:44 GMT [] [12598]: [4513-1] db=,user= LOG:
  recovery
 > restart point at 643A/E01AB438
 > 2020-03-11 09:27:44 GMT [] [12598]: [4514-1] db=,user= DETAIL:  Last
 > completed transaction was at log time 2020-03-11 09:27:43.945485+00.
 > 2020-03-11 09:27:44 GMT [] [12598]: [4515-1] db=,user= LOG:
 >   restartpoint starting: force wait
 > 2020-03-11 09:29:24 GMT [10.222.8.2(47834)] [50961]: [1-1]
 > db=cmdv3,user=nis LOG:  duration: 1402.004 ms  statement: SELECT id,
 > name, parent_id, parent, short_name, sales_rep_id FROM mmx_clients;
 > 2020-03-11 09:29:34 GMT [10.222.8.2(47834)] [50961]: [2-1]
 > db=cmdv3,user=nis LOG:  duration: 9493.259 ms  statement: SELECT
slid,
 > gnid, sof_id, client_id, product FROM mmx_slids;
 > 2020-03-11 09:32:14 GMT [] [12598]: [4516-1] db=,user= LOG:
 >   restartpoint complete: wrote 71260 buffers (0.8%); 0 WAL file(s)
 > added, 0 removed, 13 recycled; write=269.953 s, sync=0.012 s,
 > total=269.979 s; sync files=760, longest=0.002 s, average=0.000 s;
 > distance=123412 kB, estimate=611242 kB
 > 2020-03-11 09:32:14 GMT [] [12598]: [4517-1] db=,user= LOG:
  recovery
 > restart point at 643A/E7A30498
 > 2020-03-11 09:32:14 GMT [] [12598]: [4518-1] db=,user= DETAIL:  Last
 > completed transaction was at log time 2020-03-11 09:32:13.916101+00.
 > 2020-03-11 09:32:44 GMT [] [12598]: [4519-1] db=,user= LOG:
 >   restartpoint starting: time
 > 2020-03-11 09:37:14 GMT [] [12598]: [4520-1] db=,user= LOG:
 >   restartpoint complete: wrote 27130 buffers (0.3%); 0 WAL file(s)
 > added, 0 removed, 12 recycled; write=270.026 s, sync=0.007 s,
 > total=270.052 s; sync files=814, longest=0.001 s, average=0.000 s;
 > distance=280595 kB, estimate=578177 kB
 > 

Re: Query to retrieve the index columns when a function is used.

2020-03-11 Thread David Rowley
On Tue, 10 Mar 2020 at 02:16, Sterpu Victor  wrote:
> I'm testing on Postgresql 12.1 and I have a index like this:
> "check_dates_gist" EXCLUDE USING gist (id_test1 WITH =, id_test2 WITH =, 
> tsrange(valid_from::timestamp without time zone, valid_to::timestamp without 
> time zone) WITH &&)
>
> When I run this query:
> "select pc.relname, pi.indisunique, pi.indisprimary, array_agg(a.attname) as 
> attname
> FROM pg_class pc
> JOIN pg_index pi ON pc.oid = pi.indexrelid AND pc.oid IN
> (SELECT indexrelid FROM pg_index, pg_class WHERE pg_class.relname='test' AND 
> pg_class.oid=pg_index.indrelid)
> JOIN pg_attribute a ON a.attrelid = pc.oid
> GROUP BY pc.relname,pi.indisunique, pi.indisprimary;"
> I retrieve the index but there is no detail about the columns valid_from and 
> valid_to.
>
> How can I retrieve this detail?

Those details are in the indexprs column.  There's an item there for
each 0 valued indkey. It's not going to be particularly easy for you
to parse that from SQL.  Internally in PostgreSQL, we have functions
that could give you that information easily, but to access those from
SQL you'd need to write something in C.  The C function is named
pull_varattnos(). That will give you a Bitmapset for each of the
columns that are found. You'd need to write a set-returning function
to return those values then join that to pg_attribute.

Alternatively, it likely wouldn't be impossible to do at the SQL level
with regexp_matches(), but building an expression to reliably extract
what you want might not be an easy task. On a quick test, I see that:

select indexrelid, unnest(r.e) from pg_index i, lateral
regexp_matches(i.indexprs, 'VAR :varno 1 :varattno (\d{1,})', 'g')
r(e) where i.indexprs is not null;

does kick out the correct results for the expression indexes I have in
my database, but there may be some more complex expressions that break
it.
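If the goal is just to see which columns and expressions the index covers,
another SQL-level option (returning the definition as text rather than a parsed
column list) is pg_get_indexdef(); a small sketch against the table from the
original question:

-- reconstructs the full CREATE INDEX text, expression columns included
SELECT i.indexrelid::regclass AS index_name,
       pg_get_indexdef(i.indexrelid) AS definition
FROM pg_index i
WHERE i.indrelid = 'test'::regclass;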




Re: Streaming replication - 11.5

2020-03-11 Thread Nicola Contu
CPU load on the server to be built? No.
System logs don't show anything relevant, unfortunately.

On Wed, 11 Mar 2020 at 21:34, Adrian Klaver wrote:

> On 3/11/20 11:59 AM, Nicola Contu wrote:
> > I am actually cascading.
> > The master is in nyh, the first slave is in Dallas and the one having
> > problems is in Dallas as well on the same switch of the one replicating
> > from the master.
> >
> > It always worked not sure what is wrong now. We just encrypted disks on
> > all servers
>
> Do you have before and after on CPU load, I/O throughput?
>
> Do system logs show anything relevant during replication drop out?
>
> >
> >
> > On Wed, 11 Mar 2020 at 18:57, Adrian Klaver wrote:
> >
> > On 3/11/20 2:54 AM, Nicola Contu wrote:
> >  > These are the lines before
> >  >
> >  > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [1-1]
> >  > db=cmdv3,user=zabbix_check ERROR:  recovery is in progress
> >  > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [2-1]
> >  > db=cmdv3,user=zabbix_check HINT:  WAL control functions cannot be
> >  > executed during recovery.
> >  > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [3-1]
> >  > db=cmdv3,user=zabbix_check STATEMENT:  select
> >  > greatest(0,pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) from
> >  > pg_stat_replication where client_addr ='10.150.20.22'
> >  >
> >  > That query is made by Zabbix. So I stopped the zabbix agent and
> > tested
> >  > again. But still failing, because of this now :
> >  >
> >  > pg_basebackup: starting background WAL receiver
> >  > pg_basebackup: created temporary replication slot
> > "pg_basebackup_51199"
> >  > *pg_basebackup: could not receive data from WAL stream: SSL
> SYSCALL
> >  > error: EOF detected
> >  > *^C4699810/504983062 kB (70%), 0/1 tablespace
> >  > (...ql11/data/base/16401/231363544.2)
> >
> > So you started over with a pg_basebackup?
> >
> > Also from below:
> >
> > 2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1]
> > db=[unknown],user=replicator LOG:  terminating walsender process due
> to
> > replication timeout
> >
> > Where are the master and standby in relation to each other network
> wise?
> >
> > Intervening firewalls, network latency issues?
> >
> >
> >
> >  >
> >  >
> >  > here the full log starting right before the last try :
> >  >
> >  > 2020-03-11 09:22:44 GMT [] [12598]: [4508-1] db=,user= LOG:
> >  >   restartpoint complete: wrote 19565 buffers (0.2%); 0 WAL file(s)
> >  > added, 0 removed, 7 recycled; write=270.014 s, sync=0.009 s,
> >  > total=270.036 s; sync files=804, longest=0.001 s, average=0.000 s;
> >  > distance=131239 kB, estimate=725998 kB
> >  > 2020-03-11 09:22:44 GMT [] [12598]: [4509-1] db=,user= LOG:
> >   recovery
> >  > restart point at 643A/D8C05F70
> >  > 2020-03-11 09:22:44 GMT [] [12598]: [4510-1] db=,user= DETAIL:
>  Last
> >  > completed transaction was at log time 2020-03-11
> 09:22:44.050084+00.
> >  > 2020-03-11 09:23:14 GMT [] [12598]: [4511-1] db=,user= LOG:
> >  >   restartpoint starting: time
> >  > 2020-03-11 09:27:44 GMT [] [12598]: [4512-1] db=,user= LOG:
> >  >   restartpoint complete: wrote 17069 buffers (0.2%); 0 WAL file(s)
> >  > added, 0 removed, 17 recycled; write=269.879 s, sync=0.006 s,
> >  > total=269.902 s; sync files=811, longest=0.001 s, average=0.000 s;
> >  > distance=120469 kB, estimate=665445 kB
> >  > 2020-03-11 09:27:44 GMT [] [12598]: [4513-1] db=,user= LOG:
> >   recovery
> >  > restart point at 643A/E01AB438
> >  > 2020-03-11 09:27:44 GMT [] [12598]: [4514-1] db=,user= DETAIL:
>  Last
> >  > completed transaction was at log time 2020-03-11
> 09:27:43.945485+00.
> >  > 2020-03-11 09:27:44 GMT [] [12598]: [4515-1] db=,user= LOG:
> >  >   restartpoint starting: force wait
> >  > 2020-03-11 09:29:24 GMT [10.222.8.2(47834)] [50961]: [1-1]
> >  > db=cmdv3,user=nis LOG:  duration: 1402.004 ms  statement: SELECT
> id,
> >  > name, parent_id, parent, short_name, sales_rep_id FROM
> mmx_clients;
> >  > 2020-03-11 09:29:34 GMT [10.222.8.2(47834)] [50961]: [2-1]
> >  > db=cmdv3,user=nis LOG:  duration: 9493.259 ms  statement: SELECT
> > slid,
> >  > gnid, sof_id, client_id, product FROM mmx_slids;
> >  > 2020-03-11 09:32:14 GMT [] [12598]: [4516-1] db=,user= LOG:
> >  >   restartpoint complete: wrote 71260 buffers (0.8%); 0 WAL file(s)
> >  > added, 0 removed, 13 recycled; write=269.953 s, sync=0.012 s,
> >  > total=269.979 s; sync files=760, longest=0.002 s, average=0.000 s;
> >  > distance=123412 kB, estimate=611242 kB
> >  > 2020-03-11 09:32:14 GMT [] [12598]: [4517-1] db=,user= LOG:
> >   recovery
> >  > restart point at 643A/E7A30498
> >  > 2020-03-11 09:32:14 GMT [] [12598]: [4518-1] db=,us

Re: Querying an index's btree version

2020-03-11 Thread Darren Lafreniere
Thank you for the info, Peter.

Separate but related follow-up question: when you restore a DB from a
backup, does the restored index use the old format or the latest one?

Thank you,
Darren Lafreniere



On Wed, Mar 11, 2020 at 4:30 PM Peter Geoghegan  wrote:

> On Wed, Mar 11, 2020 at 1:26 PM Darren Lafreniere
>  wrote:
> > We've read that PG 12 has improved btree index support, and that the
> latest internal btree version was bumped from 3 to 4. Is it possible to
> query the btree version that a particular index is using? We'd like to
> automatically start a concurrent re-index if we detect any btree indexes
> are still on version 3.
>
> It's possible, but you have to install the superuser-only pageinspect
> extension. Here is how you'd determine that an index called
> 'pg_aggregate_fnoid_index' is on version 4:
>
> regression=# create extension pageinspect;
> CREATE EXTENSION
> regression=# select version from bt_metap('pg_aggregate_fnoid_index');
>  version
> -
>4
> (1 row)
>
> --
> Peter Geoghegan
>


-- 
*Darren Lafreniere*
Senior Software Engineer | oneZero Financial Systems

site: www.onezero.com
email: dlafreni...@onezero.com




Re: Querying an index's btree version

2020-03-11 Thread Peter Geoghegan
On Wed, Mar 11, 2020 at 2:13 PM Darren Lafreniere
 wrote:
> when you restore a DB from a backup, does the restored index use the old 
> format or the latest one?

If you use pg_restore, it uses the latest index format.

If you're using pg_upgrade, the version won't change unless and until
you REINDEX. This includes cases where you're running pg_upgrade
against a restored physical backup.
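A hedged sketch of the rebuild step described in the original question, for
any index still reported as version 3 (REINDEX ... CONCURRENTLY is available
from PostgreSQL 12; the index name is hypothetical):

REINDEX INDEX CONCURRENTLY some_v3_index;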

-- 
Peter Geoghegan




Re: strange locks on PG 11 with Golang programs

2020-03-11 Thread Peter J. Holzer
On 2020-03-09 11:02:37 +0200, Achilleas Mantzios wrote:
> Fully review your programs for connection / xaction leaks. Do you use a
> connection pool?

Go's sql package encourages the use of connection pools (type DB) over
single connections (type Conn):

| Prefer running queries from DB unless there is a specific need for a
| continuous single database connection.

All tutorials I've seen follow this recommendation, so a Go programmer
might not even be aware that connections exist.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


pg_restore restores out of order

2020-03-11 Thread Kevin Brannen
I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last 
things I'm verifying is backup/restore and it's failing for no reason I can 
figure out. So I'm looking for pointers on this.

If it matters, the code is from the version 12.2 from the Pg site, RPMs for 
Centos 6 (.10).

The backup is made like:

# $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms --schema=public

As far as I can tell, it worked well as the result is about the right size, 
exit code was at least 0. I then go change a couple of things so I know if the 
restore works or not (i.e. they should disappear).

Restore looks something like:

# cd $exp
# echo "
ALTER SCHEMA public RENAME TO savepublic;
CREATE SCHEMA public AUTHORIZATION nmsroot;
" | $PGPATH/psql -d nms
# /usr/pgsql-12/bin/pg_restore --jobs=2 --dbname=nms --schema=public .

So we move the current schema to the side just in case something goes wrong and 
we need to move it back, create an empty schema for it, then restore into that 
schema. Then it goes bad...

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers nmsroot
pg_restore: error: COPY failed for table "subscribers": ERROR:  relation 
"su_profiles" does not exist
LINE 1: SELECT srvc_data   FROM su_profiles WHERE su_profile...
^
QUERY:  SELECT srvc_data   FROM su_profiles WHERE su_profile_pk = 
p_profile_fk
CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character 
varying) line 7 at SQL statement
COPY subscribers, line 1: "61   1002\N  SU_1002 t   \N  4   
\N  1   250 0   2015-06-22 16:56:27.79333+00 nmsmgr  \N 
 \N  \N"

And it goes further downhill with missing FK's, etc in other tables later 
because of the missing data. The exit code is definitely not 0.

I've tried removing the "--jobs=2" so it'll go single threaded -- it still 
fails. I also did a new dump with the default plain text dump, and the restore 
with that also failed in the same way. Checking out the backup itself, I get:

# pg_restore -Fd . --list | egrep 'su_profile|subscriber' > k1

254; 1259 23653 TABLE public subscribers nmsroot
335; 1259 24222 TABLE public su_profiles nmsroot
5442; 2604 18650 DEFAULT public su_profiles su_profile_pk nmsroot
5303; 2604 18651 DEFAULT public subscribers subscriber_pk nmsroot
6437; 0 24222 TABLE DATA public su_profiles nmsroot
6356; 0 23653 TABLE DATA public subscribers nmsroot
... constraints, index, triggers, FK -- down here

I've removed FUNCTIONS, VIEWS, & SEQ objects from the list. That list is in 
order that is in the file. I suppose creating the tables in that order is fine, 
but there is a FK constraint in subscribers pointing to su_profiles, so that 
order does worry me a little, except all the FK stuff is after the data load so 
I really should be fine. The data looks like it'll be in the right order (at 
least for --jobs=1, yet that failed too).

I don't see anything extra in the Pg error log either.

I don't think I'm running into any issues with "other DB objects being 
missing". Looking at the release notes, I see there were some changes for 12.1 
with parallel restores, but I can't tell if that's affecting me or not. I know 
that the public schema & search_path changed for dump & restore in 10.3, but 
this doesn't look like that's a problem here. I've reread the doc pages on 
pg_dump and pg_restore and don't see anything that looks like it'd be bad for me.

Can anyone see anything that looks wrong? Suggest something else to try or look 
at?

Kevin




Re: Querying an index's btree version

2020-03-11 Thread Laurenz Albe
On Wed, 2020-03-11 at 17:12 -0400, Darren Lafreniere wrote:
> Separate but related follow-up question: when you restore a DB from a backup, 
> does the restored index use the old format or the latest one?

If you restore a pg_dump, you will have the latest version.

If you restore a file system backup, you will have what you had before.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_restore restores out of order

2020-03-11 Thread Adrian Klaver

On 3/11/20 2:46 PM, Kevin Brannen wrote:

I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last 
things I'm verifying is backup/restore and it's failing for no reason I can 
figure out. So I'm looking for pointers on this.

If it matters, the code is from the version 12.2 from the Pg site, RPMs for 
Centos 6 (.10).

The backup is made like:

# $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms --schema=public


Which version of pg_dump 9.6.5 or 12.2?

More below.


As far as I can tell, it worked well as the result is about the right size, 
exit code was at least 0. I then go change a couple of things so I know if the 
restore works or not (i.e. they should disappear).

Restore looks something like:

# cd $exp
# echo "
ALTER SCHEMA public RENAME TO savepublic;
CREATE SCHEMA public AUTHORIZATION nmsroot;
" | $PGPATH/psql -d nms
# /usr/pgsql-12/bin/pg_restore --jobs=2 --dbname=nms --schema=public .

So we move the current schema to the side just in case something goes wrong and 
we need to move it back, create an empty schema for it, then restore into that 
schema. Then it goes bad...

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers nmsroot
pg_restore: error: COPY failed for table "subscribers": ERROR:  relation 
"su_profiles" does not exist
LINE 1: SELECT srvc_data   FROM su_profiles WHERE su_profile...
 ^
QUERY:  SELECT srvc_data   FROM su_profiles WHERE su_profile_pk = 
p_profile_fk
CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character 
varying) line 7 at SQL statement
COPY subscribers, line 1: "61   1002\N  SU_1002 t   \N  4   \N   
   1   250 0   2015-06-22 16:56:27.79333+00 nmsmgr  \N  \N  
\N"


What is in public.check_data_ip(integer,integer,character varying)?

Is it a trigger function on subscribers?



And it goes further downhill with missing FK's, etc in other tables later 
because of the missing data. The exit code is definitely not 0.

I've tried removing the "--jobs=2" so it'll go single threaded -- it still 
fails. I also did a new dump with the default plain text dump, and the restore with that 
also failed in the same way. Checking out the backup itself, I get:

# pg_restore -Fd . --list | egrep 'su_profile|subscriber' > k1

254; 1259 23653 TABLE public subscribers nmsroot
335; 1259 24222 TABLE public su_profiles nmsroot
5442; 2604 18650 DEFAULT public su_profiles su_profile_pk nmsroot
5303; 2604 18651 DEFAULT public subscribers subscriber_pk nmsroot
6437; 0 24222 TABLE DATA public su_profiles nmsroot
6356; 0 23653 TABLE DATA public subscribers nmsroot
... constraints, index, triggers, FK -- down here

I've removed FUNCTIONS, VIEWS, & SEQ objects from the list. The list is in the 
order it appears in the file. I suppose creating the tables in that order is fine, but 
there is an FK constraint in subscribers pointing to su_profiles, so that order does 
worry me a little, except all the FK stuff is after the data load, so I really 
should be fine. The data looks like it'll be in the right order (at least for 
--jobs=1, yet that failed too).

I don't see anything extra in the Pg error log either.

I don't think I'm running into any issues with "other DB objects being missing". 
Looking at the release notes, I see there were some changes for 12.1 with parallel restores, but 
I can't tell if that's affecting me or not. I know that the public schema & search_path 
changed for dump & restore in 10.3, but this doesn't look like that's a problem here. I've 
reread the doc pages on pg_dump and pg_restore and don't see anything that look like it'd be bad 
for me.

Can anyone see anything that looks wrong? Suggest something else to try or look 
at?

Kevin





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Streaming replication - 11.5

2020-03-11 Thread Adrian Klaver

On 3/11/20 2:12 PM, Nicola Contu wrote:

CPU load on the server to be built? No.


CPU load, I/O load on the servers in the replication chain.

Basically you have just recently, it seems, imposed extra overhead on the 
process by encrypting/decrypting, and from what I gather from an earlier post 
your replication then started failing. I'm trying to establish whether the 
former is causing the latter.
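
As a side note, a quick way to watch lag from the upstream while you test,
using the same pg_stat_replication columns your Zabbix check already queries
(just a sketch):

-- Run on the upstream; replay_lag_bytes grows if the standby falls behind.
SELECT client_addr, state, sync_state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;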


Not sure what you are referring to in regards to server to be built?


System logs don't show anything relevant unfortunately

On Wed, 11 Mar 2020 at 21:34, Adrian Klaver wrote:


On 3/11/20 11:59 AM, Nicola Contu wrote:
 > I am actually cascading.
 > The master is in nyh, the first slave is in Dallas and the one
having
 > problems is in Dallas as well on the same switch of the one
replicating
 > from the master.
 >
 > It always worked not sure what is wrong now. We just encrypted
disks on
 > all servers

Do you have before and after on CPU load, I/O throughput?

Do system logs show anything relevant during replication drop out?

 >
 >
 >      > On Wed, 11 Mar 2020 at 18:57, Adrian Klaver
 >      > <adrian.kla...@aklaver.com> wrote:
 >
 >     On 3/11/20 2:54 AM, Nicola Contu wrote:
 >      > These are the lines before
 >      >
 >      > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [1-1]
 >      > db=cmdv3,user=zabbix_check ERROR:  recovery is in progress
 >      > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [2-1]
 >      > db=cmdv3,user=zabbix_check HINT:  WAL control functions
cannot be
 >      > executed during recovery.
 >      > 2020-03-11 09:05:08 GMT [127.0.0.1(40214)] [43853]: [3-1]
 >      > db=cmdv3,user=zabbix_check STATEMENT:  select
 >      > greatest(0,pg_wal_lsn_diff(pg_current_wal_lsn(),
replay_lsn)) from
 >      > pg_stat_replication where client_addr ='10.150.20.22'
 >      >
 >      > That query is made by Zabbix. So I stopped the zabbix
agent and
 >     tested
 >      > again. But still failing, because of this now :
 >      >
 >      > pg_basebackup: starting background WAL receiver
 >      > pg_basebackup: created temporary replication slot
 >     "pg_basebackup_51199"
 >      > *pg_basebackup: could not receive data from WAL stream:
SSL SYSCALL
 >      > error: EOF detected
 >      > *^C4699810/504983062 kB (70%), 0/1 tablespace
 >      > (...ql11/data/base/16401/231363544.2)
 >
 >     So you started over with a pg_basebackup?
 >
 >     Also from below:
 >
 >     2020-03-11 09:43:53 GMT [10.150.20.22(54906)] [51199]: [1-1]
 >     db=[unknown],user=replicator LOG:  terminating walsender
process due to
 >     replication timeout
 >
 >     Where are the master and standby in relation to each other
network wise?
 >
 >     Intervening firewalls, network latency issues?
 >
 >
 >
 >      >
 >      >
 >      > here the full log starting right before the last try :
 >      >
 >      > 2020-03-11 09:22:44 GMT [] [12598]: [4508-1] db=,user= LOG:
 >      >   restartpoint complete: wrote 19565 buffers (0.2%); 0 WAL
file(s)
 >      > added, 0 removed, 7 recycled; write=270.014 s, sync=0.009 s,
 >      > total=270.036 s; sync files=804, longest=0.001 s,
average=0.000 s;
 >      > distance=131239 kB, estimate=725998 kB
 >      > 2020-03-11 09:22:44 GMT [] [12598]: [4509-1] db=,user= LOG:
 >       recovery
 >      > restart point at 643A/D8C05F70
 >      > 2020-03-11 09:22:44 GMT [] [12598]: [4510-1] db=,user=
DETAIL:  Last
 >      > completed transaction was at log time 2020-03-11
09:22:44.050084+00.
 >      > 2020-03-11 09:23:14 GMT [] [12598]: [4511-1] db=,user= LOG:
 >      >   restartpoint starting: time
 >      > 2020-03-11 09:27:44 GMT [] [12598]: [4512-1] db=,user= LOG:
 >      >   restartpoint complete: wrote 17069 buffers (0.2%); 0 WAL
file(s)
 >      > added, 0 removed, 17 recycled; write=269.879 s, sync=0.006 s,
 >      > total=269.902 s; sync files=811, longest=0.001 s,
average=0.000 s;
 >      > distance=120469 kB, estimate=665445 kB
 >      > 2020-03-11 09:27:44 GMT [] [12598]: [4513-1] db=,user= LOG:
 >       recovery
 >      > restart point at 643A/E01AB438
 >      > 2020-03-11 09:27:44 GMT [] [12598]: [4514-1] db=,user=
DETAIL:  Last
 >      > completed transaction was at log time 2020-03-11
09:27:43.945485+00.
 >      > 2020-03-11 09:27:44 GMT [] [12598]: [4515-1] db=,user= LOG:
 >      >   restartpoint starting: force wait
 >      > 2020-03-11 09:29:24 GMT [10.222.8.2(47834)] [50961]: [1-1]
 >      > db=cmdv3,user=nis LOG:  duration: 1402.004 ms  statement:
SELECT id,
 > 

Re: Query to retrieve the index columns when a function is used.

2020-03-11 Thread Tom Lane
David Rowley  writes:
> On Tue, 10 Mar 2020 at 02:16, Sterpu Victor  wrote:
>> How can I retrieve this detail?

> Those details are in the indexprs column.  There's an item there for
> each 0 valued indkey. It's not going to be particularly easy for you
> to parse that from SQL.  Internally in PostgreSQL, we have functions
> that could give you that information easily, but to access those from
> SQL you'd need to write something in C.

You could use pg_get_indexdef().  I agree that parsing indexprs from
SQL is not to be recommended --- aside from being very complex, it
would break on a regular basis, because we change those structs often.
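
For example, a small sketch (the table and index names here are made up):

CREATE TABLE foo (id int, name text);
CREATE INDEX idx_foo_lower ON foo (lower(name));

-- Whole definition, including the expression column:
SELECT pg_get_indexdef('idx_foo_lower'::regclass);
-- Or one column at a time: 0 means the whole definition, N the Nth column.
SELECT pg_get_indexdef('idx_foo_lower'::regclass, 1, true);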

regards, tom lane




RE: pg_restore restores out of order

2020-03-11 Thread Kevin Brannen
>Adrian Klaver wrote:
>On 3/11/20 2:46 PM, Kevin Brannen wrote:
>> I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last 
>> things I'm verifying is backup/restore and it's failing for no reason I can 
>> figure out. So I'm looking for pointers on this.
>>
>> If it matters, the code is from the version 12.2 from the Pg site, RPMs for 
>> Centos 6 (.10).
>>
>> The backup is made like:
>>
>> # $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms
>> --schema=public
>
>Which version of pg_dump 9.6.5 or 12.2?

Both pg_dump and pg_restore are with 12.2, on an already converted 12.2 DB.
So I'm 12.2 all the way on my test system by this point. :)

>>...
>> So we move the current schema to the side just in case something goes wrong 
>> and we need to move it back, create an empty schema for it, then restore 
>> into that schema. Then it goes bad...
>>
>> pg_restore: while PROCESSING TOC:
>> pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers
>> nmsroot
>> pg_restore: error: COPY failed for table "subscribers": ERROR:  relation 
>> "su_profiles" does not exist
>> LINE 1: SELECT srvc_data   FROM su_profiles WHERE su_profile...
>>  ^
>> QUERY:  SELECT srvc_data   FROM su_profiles WHERE su_profile_pk = 
>> p_profile_fk
>> CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character 
>> varying) line 7 at SQL statement
>> COPY subscribers, line 1: "61   1002\N  SU_1002 t   \N  4
>>\N  1   250 0   2015-06-22 16:56:27.79333+00 nmsmgr  
>> \N  \N  \N"
>
>What is in public.check_data_ip(integer,integer,character varying)?
>
>Is it a trigger function on subscribers?

Not quite...

nms=# \d subscribers
   Table "public.subscribers"
...
Check constraints:
"chk_su_data_ip" CHECK (check_data_ip(profile_fk, unit_id, data_ip))


The first line of the check_data_ip() function is the offender:

SELECT srvc_data INTO data
FROM su_profiles
WHERE su_profile_pk = p_profile_fk;

Hmm, why is this a problem now and not before?

(Probably the usual reason of code "tightening" as we go forward, or so I'd 
guess.)

I don't see any option that mentions "CHECK" in the pg_restore doc in regards to
possibly turning it off...more research...

And oh, this hurts. :( From the docs on CHECK constraints:

PostgreSQL does not support CHECK constraints that reference table data other
than the new or updated row being checked. While a CHECK constraint that
violates this rule may appear to work in simple tests, it cannot guarantee
that the database will not reach a state in which the constraint condition
is false (due to subsequent changes of the other row(s) involved). This
would cause a database dump and reload to fail. ...

It goes on to say a trigger is the right way to do this. So yeah, a "tightening
of the code" seems to be hitting me because of bad code from the past.

Problem identified and I'll solve it tomorrow.
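
(For reference, a rough sketch of the trigger version I have in mind, reusing
the existing function; the column names come from the constraint above, the
rest is guesswork:)

ALTER TABLE subscribers DROP CONSTRAINT chk_su_data_ip;

CREATE FUNCTION trg_chk_su_data_ip() RETURNS trigger AS $$
BEGIN
    -- Same validation the CHECK constraint did, but triggers are restored in
    -- the post-data section, after the COPY, so pg_restore won't trip over it.
    IF NOT check_data_ip(NEW.profile_fk, NEW.unit_id, NEW.data_ip) THEN
        RAISE EXCEPTION 'invalid data_ip % for profile %',
            NEW.data_ip, NEW.profile_fk;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER chk_su_data_ip
    BEFORE INSERT OR UPDATE ON subscribers
    FOR EACH ROW EXECUTE FUNCTION trg_chk_su_data_ip();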

Thank you so much Adrian for helping me to figure this out!

Kevin



Back Port Request for INVALID Startup Packet

2020-03-11 Thread Virendra Kumar
Hi Team,
Can you please back-port the patch for the case where a 0-byte packet is sent to 
a PG instance (health checks)? The instance complains about an invalid startup 
packet and floods the log, which increases the log size considerably if the 
health checks run every 3 seconds or so.
Patch Requested - 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=342cb650e
Please let me know if you have already back ported this to supported versions.

Regards,
Virendra Kumar


Re: pg_restore restores out of order

2020-03-11 Thread Adrian Klaver

On 3/11/20 4:11 PM, Kevin Brannen wrote:

Adrian Klaver wrote:
On 3/11/20 2:46 PM, Kevin Brannen wrote:

I'm working thru our system to upgrade from 9.6.5 to 12.2. One of the last 
things I'm verifying is backup/restore and it's failing for no reason I can 
figure out. So I'm looking for pointers on this.

If it matters, the code is from the version 12.2 from the Pg site, RPMs for 
Centos 6 (.10).

The backup is made like:

# $PGPATH/pg_dump --format=d --jobs=2 --file=$exp --dbname=nms
--schema=public


Which version of pg_dump 9.6.5 or 12.2?


Both pg_dump and pg_restore are with 12.2, on an already converted 12.2 DB.
So I'm 12.2 all the way on my test system by this point. :)


...
So we move the current schema to the side just in case something goes wrong and 
we need to move it back, create an empty schema for it, then restore into that 
schema. Then it goes bad...

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 6356; 0 23653 TABLE DATA subscribers
nmsroot
pg_restore: error: COPY failed for table "subscribers": ERROR:  relation 
"su_profiles" does not exist
LINE 1: SELECT srvc_data   FROM su_profiles WHERE su_profile...
  ^
QUERY:  SELECT srvc_data   FROM su_profiles WHERE su_profile_pk = 
p_profile_fk
CONTEXT:  PL/pgSQL function public.check_data_ip(integer,integer,character 
varying) line 7 at SQL statement
COPY subscribers, line 1: "61   1002\N  SU_1002 t   \N  4   \N   
   1   250 0   2015-06-22 16:56:27.79333+00 nmsmgr  \N  \N  
\N"


What is in public.check_data_ip(integer,integer,character varying)?

Is it a trigger function on subscribers?


Not quite...

nms=# \d subscribers
Table "public.subscribers"
...
Check constraints:
 "chk_su_data_ip" CHECK (check_data_ip(profile_fk, unit_id, data_ip))


The first line of the check_data_ip() function is the offender:

SELECT srvc_data INTO data
FROM su_profiles
WHERE su_profile_pk = p_profile_fk;

Hmm, why is this a problem now and not before?

(Probably the usual reason of code "tightening" as we go forward, or so I'd 
guess.)

I don't see any option that mentions "CHECK" in the pg_restore doc in regards to
possibly turning it off...more research...

And oh, this hurts. :( From the docs on CHECK constraints:

PostgreSQL does not support CHECK constraints that reference table data other
than the new or updated row being checked. While a CHECK constraint that
violates this rule may appear to work in simple tests, it cannot guarantee
that the database will not reach a state in which the constraint condition
is false (due to subsequent changes of the other row(s) involved). This
would cause a database dump and reload to fail. ...

It goes on to say a trigger is the right way to do this. So yeah, a "tightening
of the code" seems to be hitting me because of bad code from the past.

Problem identified and I'll solve it tomorrow.


Glad it was solved.

However, looking a gift horse in the mouth, did this not error when 
moving from the 9.6.5 instance to the first 12.2 instance?


Or

Was the constraint added in the first 12.2 instance?



Thank you so much Adrian for helping me to figure this out!

Kevin



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Back Port Request for INVALID Startup Packet

2020-03-11 Thread Tom Lane
Virendra Kumar  writes:
> Can you please back-port the patch for the case where a 0-byte packet is sent to 
> a PG instance (health checks)? The instance complains about an invalid startup 
> packet and floods the log, which increases the log size considerably if the 
> health checks run every 3 seconds or so.
> Patch Requested - 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=342cb650e

We generally don't like to change behavior of back branches without
ironclad agreement that the existing behavior is a bug ... which this
surely isn't.  Also, the discussion leading up to that patch specifically
considered and rejected back-patching; so I'm disinclined to overrule
that decision now.

I would suggest that an every-three-second health check is not
appropriate, especially one that is so minimal that it only
detects whether the postmaster is alive.
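
(A probe that actually authenticates and runs a query tells you much more than
a bare connection attempt; a sketch of what such a check might execute:)

-- Confirms the server accepts connections and executes queries, and reports
-- whether this node is currently a standby:
SELECT pg_is_in_recovery() AS in_recovery, now() AS checked_at;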

regards, tom lane




encrypt/decrypt between javascript and postgresql.

2020-03-11 Thread AC Gomez
I'm trying to encrypt/decrypt between javascript and postgresql.

I'm using this:
https://gist.github.com/vlucas/2bd40f62d20c1d49237a109d491974eb algorithm
to encrypt my text, and then in PostgreSQL I use PGCRYPTO.decrypt_iv to
decrypt the text.

I pass in 'ThisISMySign' to the Encrypt function.

Encrypted string returned from
above: "fc9a03cbc8a57d4061570575f197c29c:a319a4bf354516f392ba96a895478af6"

I have to remove the colon to get anything out, and so I run this:

select
decrypt_iv(decode('fc9a03cbc8a57d4061570575f197c29ca319a4bf354516f392ba96a895478af6','hex')::bytea,
'sKCx49VgtHZ59bJOTLcU0Gr06ogUnDJi'::bytea, 'null'::bytea,
'aes-cbc/pad:pkcs');

Gives me this: 6 á¶ðÒÿÆÛÏBSïÅThisISMySign

"ThisISMySign" was the original string. So I'm getting the right result in
half of the decrypted string.

The parameter after the key, the 3rd parameter, can be any string. That just
changes the first part of the output, the garbage part.

In decrypt_iv I tried using the encryption algorithm name in the javascript
used to encrypt, but that gets me nowhere.

I cannot see what I'm missing here.
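
The only other thing I can think to try is splitting the string on the colon
and passing the first half as the IV instead of 'null'. A sketch, not verified:

SELECT decrypt_iv(
    decode(split_part('fc9a03cbc8a57d4061570575f197c29c:a319a4bf354516f392ba96a895478af6', ':', 2), 'hex'),
    'sKCx49VgtHZ59bJOTLcU0Gr06ogUnDJi'::bytea,
    decode(split_part('fc9a03cbc8a57d4061570575f197c29c:a319a4bf354516f392ba96a895478af6', ':', 1), 'hex'),
    'aes-cbc/pad:pkcs');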

Thanks


Re: Back Port Request for INVALID Startup Packet

2020-03-11 Thread Virendra Kumar
Hi Tom,
Thank you for your reply!

This is a simple patch; would it have a bad impact if applied to prior versions, 
or are there other constraints that prevent doing that? I am just trying to 
understand this a bit.
On AWS RDS we have the primary and secondary hosts known in advance in most cases. 
So if the primary instance fails over it will be the other host, and hence we have 
to update the active nodes in the targets using a Lambda function. AWS RDS fails 
over very quickly, mostly under 3 seconds, and hence we have set the health checks 
to 3 seconds. I'll go back to the AWS folks and see if they can do this in prior 
releases.

Regards,
Virendra Kumar

 

On Wednesday, March 11, 2020, 5:29:38 PM PDT, Tom Lane wrote:

 Virendra Kumar  writes:
> Can you please back-port the patch for the case where a 0-byte packet is sent to 
> a PG instance (health checks)? The instance complains about an invalid startup 
> packet and floods the log, which increases the log size considerably if the 
> health checks run every 3 seconds or so.
> Patch Requested - 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=342cb650e

We generally don't like to change behavior of back branches without
ironclad agreement that the existing behavior is a bug ... which this
surely isn't.  Also, the discussion leading up to that patch specifically
considered and rejected back-patching; so I'm disinclined to overrule
that decision now.

I would suggest that an every-three-second health check is not
appropriate, especially one that is so minimal that it only
detects whether the postmaster is alive.

            regards, tom lane  

Re: Back Port Request for INVALID Startup Packet

2020-03-11 Thread Laurenz Albe
On Thu, 2020-03-12 at 01:16 +, Virendra Kumar wrote:
> This is a simple patch; would it have a bad impact if applied to prior versions,
> or are there other constraints that prevent doing that? I am just trying to
> understand this a bit.

It is not that this patch would have a terrible impact.

There is a good reason for being very strict about what to backport: we want 
users to install the latest minor release without worrying whether it will 
change any behavior they rely on.

If users are not confident that they can always install the latest minor 
release without extra testing, they won't install it and will keep running 
old, buggy releases.
This would be bad for PostgreSQL's reputation of being stable and reliable.

So also "harmless" changes that don't actually fix a bug are not backported.

> On AWS RDS we have the primary and secondary hosts known in advance in most cases.
> So if the primary instance fails over it will be the other host, and hence we have
> to update the active nodes in the targets using a Lambda function. AWS RDS fails
> over very quickly, mostly under 3 seconds, and hence we have set the health checks
> to 3 seconds. I'll go back to the AWS folks and see if they can do this in prior
> releases.

Hm.  A system that fails over within three seconds seems fragile to me.
Doesn't that mean that every little hiccup will cause a failover?

Maybe I don't understand what you are doing, but wouldn't it be better to catch
errors whenever you perform a database operation and retry the operation if the
error indicates that you have lost the connection?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com