Re: Hot and PITR backups in same server

2020-05-19 Thread Kyotaro Horiguchi
At Mon, 18 May 2020 11:11:44 +0300, "Andrus"  wrote in 
> Hi!
> 
> >This is confused or at least confusing.
> >- "max_wal_size" of 5TB is clearly insane.

As Laurenz said, even if you set it to 5TB, no WAL files older than
those needed by the last checkpoint will remain. If you don't
need a hot standby, you can use pg_receivewal to save WAL files on the
"standby" server.  If you need the hot standby, WAL files should be
archived.

> >- I don't understand what you mean by "Start backup server for hot
> >- standby backups".
> >Do I get it right that you want to copy a streaming replication
> >standby server's data
> directory to perform PITR?
> 
> I want to copy only pg_wal directory contents from this.

If so, what you need seems to be pg_receivewal, not a full-fledged
server.

> After pg_basebackup has finished, copy of its data directory is saved
> for possible PITR.
> Its pg_wal contents will be replaced with current pg_wal directory
> contents.
> 
> After that recovery time will set it postgres.conf and separate server
> in 5433 is used to preform PITR using this data directory.
> 
> >That doesn't seem like a good plan, because the standby
> >usually won't be much behind the primary server, and you can only
> >recover to a later
> >point in time.
> 
> I will get data from copy taken when hot backup server was created and
> replace its pg_wal directory contents from pg_wal in current backup
> server pg_wal directory.
> Since max pg wal size is big it should contain all WAL segments from
> time where base backup was taken.

As mentioned above, that assumption is wrong.  You need to archive WAL
files out of the pg_wal directory, or use pg_receivewal.

> >If you care to elaborate, perhaps the question can be answered.
> 
> Currently binary async hot standby backup server is working OK and
> replicates main sever database almost immediately.
> 
> However if important data is deleted in same day, it cannot recovered
> since hot standby server database has replicated same data as in
> master server.
> Master server disk space is limited. It cannot keep large number of
> wal segments.
> Standby server has lot of free disk space.

recovery_min_apply_delay might be useful for you.  With that setting,
the standby follows the primary with the configured delay (minutes or hours).

https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-RECOVERY-MIN-APPLY-DELAY
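A minimal sketch of what that could look like on a v12 standby (the delay value is only
an example, and the same setting can instead go straight into the standby's
postgresql.conf):

ALTER SYSTEM SET recovery_min_apply_delay = '12h';
SELECT pg_reload_conf();  -- reload so the new delay takes effect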

I haven't tried it myself, but the following steps would perhaps
work.

0. Data is found to be broken on the primary.

1. Stop the standby immediately and take a cold backup including
 the pg_wal directory.

2. You may find the just-dropped data still present on the standby.

3. If you need to advance the standby, you can recover up to an
  arbitrary LSN or time using recovery_target_lsn/recovery_target_time.
  If you go too far, start again from the backup taken in step 1.
  
https://www.postgresql.org/docs/12/runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY-TARGET


> I'm looking for a way to collect data for PITR recovery (WAL segments)
> in standby server.
> 
> I have initial base backup of data directory created using
> pg_basebackup.
> All WAL segments from master server after pg_basebackup should saved
> in backup server and should be used for PITR recovery when needed.
> 
> How to use hot standby server for this ?
> 
> Or should separate cluster in 5433 port created and pg_rewind or
> something other used for PITR.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Huge tables, trying to delete OID's taking 6+hours per table

2020-05-19 Thread Tory M Blue
The db is only 2TB, these tables are about 50% of the DB.


Just wondering what I can boost to give me some elevated temp performance
for what I would think would be a fairly quick operation.

Give this query more work_mem? Mine are set pretty low, based on previous
performance notes, this is a 9.5 server.

Thanks
Tory


Logical replication troubles

2020-05-19 Thread Anders Bøgh Bruun
Hi,

I have run into a (to me) weird issue with logical replication. We are
running Zalando's postgres-operator in our Kubernetes clusters and have
recently had a use-case where we wanted to start doing logical replication
of select tables to a data warehouse, also running postgres. It worked as
expected at first, but then after a pod-restart in Kubernetes, the
replication slots that were created for the subscription were gone.

A bit of reading later, and I learned we need to tell Patroni which slots
should be permanently available, so we specify a slot and try to set this
up, but then run into an error which says the publication does not exist,
even though we can verify that it does. At first I suspected Patroni's
handling of the replication slots to be the cause of the problem, but about
a week's worth of learning and experimenting later, I can now reliably
reproduce the problem in pure postgres.

Patroni is kind of the catalyst, since my findings are that if the
replication slot is created before data is inserted into the source
database, and a publication is created, then it breaks. If the replication
slot is created after data is inserted and the publication is created, then
it works. We just can't tell Patroni not to create it until some arbitrary
point in time. I am guessing this is either a bug or a case of us not
knowing what we are doing...

I have created a Github gist demonstrating the problem:
https://gist.github.com/drzero42/02b4082ce002c1d90ddd64f5fe03aee0

Anybody able to help? :)

-- 
Anders Bøgh Bruun

Infrastructure Architect

CellPoint digital
cellpointdigital.com
*WE MAKE TRAVEL EASIER™*

M: +45 31 14 87 41
E: and...@cellpointdigital.com

Chicago | *Copenhagen* | Dubai | London | Miami | Pune | Singapore


Re: Huge tables, trying to delete OID's taking 6+hours per table

2020-05-19 Thread Tory M Blue
On Tue, May 19, 2020 at 12:17 AM Tory M Blue  wrote:

>
> The db is only 2TB, these tables are about 50% of the DB.
>
>
> Just wondering what I can boost to give me some elevated temp performance
> for what I would think would be a fairly quick operation.
>
> Give this query more work_mem? Mine are set pretty low, based on previous
> performance notes, this is a 9.5 server.
>
> Thanks
> Tory
>

The command I'm using is

ALTER TABLE tablename SET WITHOUT OIDS;

Would a drop column oid be better?

Tory


Re: Hot and PITR backups in same server

2020-05-19 Thread Andrus

Hi!

Thank you.


As Laurenz said, even if you set it to 5TB, no WAL files older than
those needed by the last checkpoint will remain. If you don't
need a hot standby, you can use pg_receivewal to save WAL files on the
"standby" server.  If you need the hot standby, WAL files should be
archived.


So pg_receivewal should run in parallel with the hot standby server to receive
WAL files into a separate directory.

Will each WAL file be transferred two times in this case? One time by the hot standby
server and a second time by pg_receivewal.
The main server is priced by the amount of bandwidth used.
How can each WAL file be received only once? This would decrease network
bandwidth, and thus cost, by half.

pg_receivewal has a compression option. Will this compress WAL files before
they are transferred over the network?


If so, what you need seems to be pg_receivewal, not a full-fledged
server.


For hot standby and PITR on the same server, the following steps are required:

1. Create a base backup using pg_basebackup.
2. Create a copy of the base backup for PITR.
3. Start the hot standby server using the data directory created in step 1.
4. Run pg_receivewal as a background process to save WAL files into the backup created
in step 2.

If PITR is required, specify the recovery time in the postgresql.conf of the base backup
copy created in step 2.
Start a second postgresql server instance on port 5433 which uses this backup for
recovery.

Is this OK, or should something be changed?

How can pg_receivewal be run in the background? Is there some option which creates such a service so that it will automatically restart if the server is
restarted?


How can the main server be allowed to keep a sufficient number of WAL segments?
A replication slot cannot be used: if the backup server stops, the replication slot causes the main server to fill its disk space with untransferred WAL
files.

After that the main server will also stop with a "no space left on device" error.

Or is there some option to reserve some disk space or limit WAL size so
that the main server can continue if the backup server crashes?

Andrus. 






idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Ishan Joshi
Hi Team,

I have set idle_in_transaction_session_timeout at the user level, set to
5 min. I can see that after 5 min a session in the "idle in transaction" state gets
killed, and it logs the details in the postgresql log file with "FATAL: terminating 
connection due to idle-in-transaction timeout" in v11.5
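
(For reference, a per-user setting of this kind would typically have been put in place
with something like the following; the role name here is only an example.)

ALTER ROLE app_user SET idle_in_transaction_session_timeout = '5min';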

But testing the same in v12.2, the session gets killed in the state
"idle in transaction", but the log entry is not generated in the postgresql log file.

Are there any specific changes related to this in v12.2, or any additional
parameters that need to be set for these logs to be generated?


Thanks & Regards,
Ishan Joshi




Re: Huge tables, trying to delete OID's taking 6+hours per table

2020-05-19 Thread Tom Lane
Tory M Blue  writes:
> The command i'm using is
> ALTER TABLE tablename SET WITHOUT OIDS;
> Would a drop column oid be better?

Unfortunately, you're kind of stuck.  OIDs are not like regular columns
(at least before v12) --- they are integrated into the tuple header in
a hackish way, and so there's no way to get rid of them without a table
rewrite.

regards, tom lane




Re: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Adrian Klaver

On 5/19/20 2:08 AM, Ishan Joshi wrote:

Hi Team,

I have set idle_in_transaction_session_timeout on user level and it is 
set for 5 min. I can see after 5 min the session with “idle in 
transaction” state got killed and it log the details in postgresql log 
file with “FATAL: terminating connection due to idle-in-transaction 
timeout” in v11.5


But I have testing the same in v12.2, The session getting killed with 
state “idle in transaction”  but the log is not generated in postgresql 
log file.


Is there any specific changes related to this for v12.2 or any 
additional parameters need to set for generating logs.


Looking at code in ~/backend/tcop/postgres.c it has not changed.

Are you reusing the same *.conf in 12 as you did in 11?

What are your logging settings?



Thanks & Regards,

Ishan Joshi






--
Adrian Klaver
adrian.kla...@aklaver.com




RE: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Ishan Joshi
Thank Adrian for response.



Yes I have the same *.conf file as v11.5 in fact I have copied the same conf 
files in v12.2



Logging parameter are as follows


Parameter                        Values
log_autovacuum_min_duration      1s
log_checkpoints                  on
log_connections                  on
log_destination                  stderr
log_directory                    log
log_disconnections               on
log_duration                     off
log_error_verbosity              default
log_executor_stats               off
log_file_mode                    600
log_filename                     postgresql-%Y-%m-%d_%H%M%S.log
log_hostname                     off
log_line_prefix                  %m %d %u %l %s pid:%p xid:%x ip:%h
log_lock_waits                   on
log_min_duration_statement       -1
log_min_error_statement          error
log_min_messages                 warning
log_parser_stats                 off
log_planner_stats                off
log_replication_commands         on
log_rotation_age                 1h
log_rotation_size                0
log_statement                    none
log_statement_stats              off
log_temp_files                   0
log_transaction_sample_rate      0
log_truncate_on_rotation         on
logging_collector                on







Thanks & Regards,

Ishan Joshi



-Original Message-
From: Adrian Klaver 
Sent: Tuesday, May 19, 2020 7:28 PM
To: Ishan Joshi ; pgsql-gene...@postgresql.org
Subject: Re: idle_in_transaction_session_timeout not getting log in PG v12.2



On 5/19/20 2:08 AM, Ishan Joshi wrote:

> Hi Team,

>

> I have set idle_in_transaction_session_timeout on user level and it is

> set for 5 min. I can see after 5 min the session with "idle in

> transaction" state got killed and it log the details in postgresql log

> file with "FATAL: terminating connection due to idle-in-transaction

> timeout" in v11.5

>

> But I have testing the same in v12.2, The session getting killed with

> state "idle in transaction"  but the log is not generated in

> postgresql log file.

>

> Is there any specific changes related to this for v12.2 or any

> additional parameters need to set for generating logs.



Looking at code in ~/backend/tcop/postgres.c it has not changed.



Are you reusing the same *.conf in 12 as you did in 11?



What are your logging settings?



>

> Thanks & Regards,

>

> Ishan Joshi

>






--

Adrian Klaver

adrian.kla...@aklaver.com



Re: Unique index on hash of jsonb value - correct solution?

2020-05-19 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> On Mon, 2020-05-18 at 18:43 +0200, Albrecht Dreß wrote:
> > in a database I have a table with a text, a jsonb and a bytea column, which
> > together shall be unique, like:
> > 
> > 
> >   Column  |  Type  | Collation | Nullable |   Default
> > -++---+--+-
> >   id  | bigint |   | not null | 
> > nextval('demotab_id_seq'::regclass)
> >   textval | text   |   | not null |
> >   jsonval | jsonb  |   | not null |
> >   blobval | bytea  |   | not null |
> > Indexes:
> >  "demo_idx" UNIQUE, btree (textval, jsonval, blobval)
> > 
> > 
> > This seems to work just fine in most cases, but adding a bigger json value 
> > (the text
> > and the bytea columns are always short) results in a “index row size 2840 
> > exceeds
> > maximum 2712 for index "…"” error.  Following the hint in the error message,
> > I replaced the index by
> > 
> > 
> > Indexes:
> >  "demo_idx" UNIQUE, btree (textval, md5(jsonval::text), blobval)
> > 
> > 
> > which seems to solve the issue.

That's only going to work up to a certain size for that text and blob
value too, of course..  This is looking like it might be some kind of KV
store which is generally discouraged.

> > My question: is this approach (a) correct and (b) still safe if the items 
> > in the
> > jsonb (always a dict in my case) are re-ordered?  I tested a few cases, and 
> > trying
> > to insert something like e.g. '{"a":1,"b":2}' and '{"b":2,"a":1}' actually 
> > does
> > produce the same hash (i.e. the 2nd insert attempt is rejected due to the 
> > index),
> > but is this guaranteed by design for every case?  Or is there a better 
> > solution
> > for this use case?
> 
> "jsonb" uses an internal binary representation that reorders the attributes in
> a deterministic fashion, so yes, that is guaranteed.

Sure- but the md5() isn't going to run on the jsonb data directly, it'll
run on the text representation that's returned, and I'm not at all
convinced that the project has agreed to make that text form always be
canonical and identical forever, including across major version
upgrades..  Further, there are some other cases to consider- such as: do
you think that 0.1230 is different from 0.123?  Since we'll
store numbers in jsonb as numeric, we preserve trailing fractional
zeroes.

eg:

select md5(c1::text), md5(c2::text), c1, c2
  from (select '{"reading": 1.230e-5}'::jsonb, '{"reading": 1.23e-5}'::jsonb) as x(c1,c2);
               md5                |               md5                |         c1          |         c2
----------------------------------+----------------------------------+---------------------+--------------------
 d793380db9196092889ccdb4c84f2bd4 | 4fb5efd00c7a0f0e0e6f6375b8a834d5 | {"reading": 0.1230} | {"reading": 0.123}
(1 row)

You might wonder what PG normally returns when comparing those
numbers, but that's easy to see:

select '1.230e-5'::numeric = '1.23e-5'::numeric;
 ?column? 
----------
 t
(1 row)

> I would use an "md5" hash for the "bytea" column as well to keep the index 
> smaller.

This goes to the point above that you might want to include the other
columns in the hash, or hash them independently if needed, to avoid
hitting the max index row size.

> There can be collisions with an md5 hash, so it is possible for duplicates to
> creep in.  Besides, if you have an auto-generated "bigint" column, you should
> make that the primary key.  Perhaps then the uniqueness condition is no longer
> necessary.

The issue isn't just with the md5 hash, as illustrated above.  Having an
alternative column that can be used as a primary key certainly does
*not* remove the general need to ensure that these columns are unique,
if that's what the data model calls for.

Overall, if what you need is uniqueness here, I'd strongly reconsider
the choice to use jsonb to store this data and instead see if you can
break the data out into proper columns with a proper unique constraint
across them (or across some hash of the combination of them that's done
in a way that is clear and unambiguous).
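
As a rough sketch of that last idea, with invented attribute names standing in for
whatever the jsonb dict currently holds, the uniqueness then lives on ordinary, short
columns rather than on a hash of a jsonb text rendering:

CREATE TABLE demotab (
    id      bigserial PRIMARY KEY,
    textval text    NOT NULL,
    sensor  text    NOT NULL,   -- formerly a key inside jsonval
    reading numeric NOT NULL,   -- formerly a key inside jsonval
    blobval bytea   NOT NULL,   -- short, e.g. a fixed-length digest
    UNIQUE (textval, sensor, reading, blobval)
);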

Thanks,

Stephen




Re: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Adrian Klaver

On 5/19/20 7:12 AM, Ishan Joshi wrote:

Thank Adrian for response.

Yes I have the same *.conf file as v11.5 in fact I have copied the same 
conf files in v12.2


Logging parameter are as follows


I'm not seeing anything obviously wrong with below.

Are you sure you are looking at the correct log?



Parameter                        Values
log_autovacuum_min_duration      1s
log_checkpoints                  on
log_connections                  on
log_destination                  stderr
log_directory                    log
log_disconnections               on
log_duration                     off
log_error_verbosity              default
log_executor_stats               off
log_file_mode                    600
log_filename                     postgresql-%Y-%m-%d_%H%M%S.log
log_hostname                     off
log_line_prefix                  %m %d %u %l %s pid:%p xid:%x ip:%h
log_lock_waits                   on
log_min_duration_statement       -1
log_min_error_statement          error
log_min_messages                 warning
log_parser_stats                 off
log_planner_stats                off
log_replication_commands         on
log_rotation_age                 1h
log_rotation_size                0
log_statement                    none
log_statement_stats              off
log_temp_files                   0
log_transaction_sample_rate      0
log_truncate_on_rotation         on
logging_collector                on

Thanks & Regards,

Ishan Joshi

-Original Message-
From: Adrian Klaver 
Sent: Tuesday, May 19, 2020 7:28 PM
To: Ishan Joshi ; pgsql-gene...@postgresql.org
Subject: Re: idle_in_transaction_session_timeout not getting log in PG v12.2

On 5/19/20 2:08 AM, Ishan Joshi wrote:

 > Hi Team,

 >

 > I have set idle_in_transaction_session_timeout on user level and it is

 > set for 5 min. I can see after 5 min the session with “idle in

 > transaction” state got killed and it log the details in postgresql log

 > file with “FATAL: terminating connection due to idle-in-transaction

 > timeout” in v11.5

 >

 > But I have testing the same in v12.2, The session getting killed with

 > state “idle in transaction”  but the log is not generated in

 > postgresql log file.

 >

 > Is there any specific changes related to this for v12.2 or any

 > additional parameters need to set for generating logs.

Looking at code in ~/backend/tcop/postgres.c it has not changed.

Are you reusing the same *.conf in 12 as you did in 11?

What are your logging settings?

 >

 > Thanks & Regards,

 >

 > Ishan Joshi

 >


--

Adrian Klaver

adrian.kla...@aklaver.com 






--
Adrian Klaver
adrian.kla...@aklaver.com




RE: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Ishan Joshi
Yes Adrian...!!!

Thanks & Regards,
Ishan Joshi

-Original Message-
From: Adrian Klaver  
Sent: Tuesday, May 19, 2020 7:53 PM
To: Ishan Joshi ; pgsql-gene...@postgresql.org
Subject: Re: idle_in_transaction_session_timeout not getting log in PG v12.2

On 5/19/20 7:12 AM, Ishan Joshi wrote:
> Thank Adrian for response.
> 
> Yes I have the same *.conf file as v11.5 in fact I have copied the 
> same conf files in v12.2
> 
> Logging parameter are as follows

I'm not seeing anything obviously wrong with below.

Are you sure you are looking at the correct log?

> 
> Parameter                        Values
> log_autovacuum_min_duration      1s
> log_checkpoints                  on
> log_connections                  on
> log_destination                  stderr
> log_directory                    log
> log_disconnections               on
> log_duration                     off
> log_error_verbosity              default
> log_executor_stats               off
> log_file_mode                    600
> log_filename                     postgresql-%Y-%m-%d_%H%M%S.log
> log_hostname                     off
> log_line_prefix                  %m %d %u %l %s pid:%p xid:%x ip:%h
> log_lock_waits                   on
> log_min_duration_statement       -1
> log_min_error_statement          error
> log_min_messages                 warning
> log_parser_stats                 off
> log_planner_stats                off
> log_replication_commands         on
> log_rotation_age                 1h
> log_rotation_size                0
> log_statement                    none
> log_statement_stats              off
> log_temp_files                   0
> log_transaction_sample_rate      0
> log_truncate_on_rotation         on
> logging_collector                on
> 
> Thanks & Regards,
> 
> Ishan Joshi
> 
> -Original Message-
> From: Adrian Klaver 
> Sent: Tuesday, May 19, 2020 7:28 PM
> To: Ishan Joshi ; pgsql-gene...@postgresql.org
> Subject: Re: idle_in_transaction_session_timeout not getting log in PG 
> v12.2
> 
> On 5/19/20 2:08 AM, Ishan Joshi wrote:
> 
>  > Hi Team,
> 
>  >
> 
>  > I have set idle_in_transaction_session_timeout on user level and it 
> is
> 
>  > set for 5 min. I can see after 5 min the session with "idle in
> 
>  > transaction" state got killed and it log the details in postgresql 
> log
> 
>  > file with "FATAL: terminating connection due to idle-in-transaction
> 
>  > timeout" in v11.5
> 
>  >
> 
>  > But I have testing the same in v12.2, The session getting killed 
> with
> 
>  > state "idle in transaction"  but the log is not generated in
> 
>  > postgresql log file.
> 
>  >
> 
>  > Is there any specific changes related to this for v12.2 or any
> 
>  > additional parameters need to set for generating logs.
> 
> Looking at code in ~/backend/tcop/postgres.c it has not changed.
> 
> Are you reusing the same *.conf in 12 as you did in 11?
> 
> What are your logging settings?
> 
>  >
> 
>  > Thanks & Regards,
> 
>  >
> 
>  > Ishan Joshi
> 
>  >
> 
> 
>  >
> 
> --
> 
> Adrian Klaver
> 
> adrian.kla...@aklaver.com 
> 
> 


--
Adrian Klaver
adrian.kla...@aklaver.com

Re: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Adrian Klaver

On 5/19/20 7:40 AM, Ishan Joshi wrote:

Yes Adrian...!!!


Well then:

1) What OS and version and did it change between 11 and 12?

2) How are you starting the server and did that change?

3) When did you copy the 11 *.conf files over and did you
restart the 12 server after?

4) What is being logged to the file e.g. connections/disconnects, etc?



Thanks & Regards,
Ishan Joshi





--
Adrian Klaver
adrian.kla...@aklaver.com




RE: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Ishan Joshi
Hi Adrian,



1) What OS and version and did it change between 11 and 12? --> It is CentOS 7, 
and both v11.5 and v12.2 are on the same server.



2) How are you starting the server and did that change?  --> through systemctl 
start postgresql-12



3) When did you copy the 11 *.conf files over and did you restart the 12 server 
after? --> copy and restart the server 12



4) What is being logged to the file e.g. connections/disconnects, etc? --> yes 
these are getting logged



Thanks & Regards,

Ishan Joshi



-Original Message-
From: Adrian Klaver 
Sent: Tuesday, May 19, 2020 8:28 PM
To: Ishan Joshi ; pgsql-gene...@postgresql.org
Subject: Re: idle_in_transaction_session_timeout not getting log in PG v12.2



On 5/19/20 7:40 AM, Ishan Joshi wrote:

> Yes Adrian...!!!



Well then:



1) What OS and version and did it change between 11 and 12?



2) How are you starting the server and did that change?



3) When did you copy the 11 *.conf files over and did you restart the 12 server 
after?



4) What is being logged to the file e.g. connections/disconnects, etc?



>

> Thanks & Regards,

> Ishan Joshi

>







--

Adrian Klaver

adrian.kla...@aklaver.com



Re: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Adrian Klaver

On 5/19/20 8:07 AM, Ishan Joshi wrote:

Hi Adrian,

1) What OS and version and did it change between 11 and 12? --> it is 
cent os 7 an both v11.5 and v12.2 are on same server.


2) How are you starting the server and did that change?  --> through 
systemctl start postgresql-12


3) When did you copy the 11 *.conf files over and did you restart the 12 
server after? --> copy and restart the server 12


4) What is being logged to the file e.g. connections/disconnects, etc? 
--> yes these are getting logged


Hmm, I'm at a loss for an explanation. All I have left is:

1) Do you see anything relevant in the logs when the session is killed?

2) Is the transaction actually being killed? In other words, does it 
still show up in pg_stat_activity?:


https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW
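
For example, something along these lines from another session (a plain catalog check,
nothing version-specific):

select pid, state, state_change
from pg_stat_activity
where state = 'idle in transaction';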



Thanks & Regards,

Ishan Joshi

-Original Message-
From: Adrian Klaver 
Sent: Tuesday, May 19, 2020 8:28 PM
To: Ishan Joshi ; pgsql-gene...@postgresql.org
Subject: Re: idle_in_transaction_session_timeout not getting log in PG v12.2

On 5/19/20 7:40 AM, Ishan Joshi wrote:

 > Yes Adrian...!!!

Well then:

1) What OS and version and did it change between 11 and 12?

2) How are you starting the server and did that change?

3) When did you copy the 11 *.conf files over and did you restart the 12 
server after?


4) What is being logged to the file e.g. connections/disconnects, etc?

 >

 > Thanks & Regards,

 > Ishan Joshi

 >

--

Adrian Klaver

adrian.kla...@aklaver.com 






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Huge tables, trying to delete OID's taking 6+hours per table

2020-05-19 Thread Tory M Blue
On Tue, May 19, 2020 at 6:40 AM Tom Lane  wrote:

> Tory M Blue  writes:
> > The command i'm using is
> > ALTER TABLE tablename SET WITHOUT OIDS;
> > Would a drop column oid be better?
>
> Unfortunately, you're kind of stuck.  OIDs are not like regular columns
> (at least before v12) --- they are integrated into the tuple header in
> a hackish way, and so there's no way to get rid of them without a table
> rewrite.
>
> regards, tom lane
>

Poop :) kind of figured that, so it's just painful.

But I guess if it's doing a table rewrite, are there any configuration 
params I could boost to help it? Shared_buffers (give it more), work_mem, 
maintenance_work_mem, temp_buffers, anything you can think of?

Thanks again Tom

Tory


Re: Huge tables, trying to delete OID's taking 6+hours per table

2020-05-19 Thread Ron

On 5/19/20 11:51 AM, Tory M Blue wrote:



On Tue, May 19, 2020 at 6:40 AM Tom Lane wrote:

Tory M Blue  writes:
> The command i'm using is
> ALTER TABLE tablename SET WITHOUT OIDS;
> Would a drop column oid be better?

Unfortunately, you're kind of stuck.  OIDs are not like regular columns
(at least before v12) --- they are integrated into the tuple header in
a hackish way, and so there's no way to get rid of them without a table
rewrite.

                        regards, tom lane


Poop :) kind of figured that, so it's just painful.

But  I guess if it's doing a table rewrite, is there any configuration 
params I could boost to help it? Shared_buffers, give it more, work mem, 
maintenance mem, temp buffers anything you can think of?


There's an alternative if this is a "transaction table" (named, in this 
example, FOO) which never gets updated (only inserted into and selected from).


Create a new, partitioned, oid-free copy of the table (named NEW_FOO) that's 
populated with *most* of the records (all except the most recent).  When 
ready to cut over, you'd stop the applications, copy over the most current 
records from FOO to NEW_FOO and then rename FOO to OLD_FOO and NEW_FOO to FOO 
(a rough SQL sketch follows).


Then you can drop OLD_FOO.
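
A rough sketch of that cut-over, assuming FOO has an indexed insert timestamp (the
column name and cut-off date are illustrative, and NEW_FOO's partitions are assumed
to cover the copied rows):

BEGIN;
INSERT INTO new_foo SELECT * FROM foo WHERE inserted_at >= '2020-05-01';
ALTER TABLE foo RENAME TO old_foo;
ALTER TABLE new_foo RENAME TO foo;
COMMIT;
-- later, once nothing depends on it any more:
DROP TABLE old_foo;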

--
Angular momentum makes the world go 'round.


Re: idle_in_transaction_session_timeout not getting log in PG v12.2

2020-05-19 Thread Adrian Klaver

On 5/19/20 8:07 AM, Ishan Joshi wrote:

Hi Adrian,

1) What OS and version and did it change between 11 and 12? --> it is 
cent os 7 an both v11.5 and v12.2 are on same server.


2) How are you starting the server and did that change?  --> through 
systemctl start postgresql-12


3) When did you copy the 11 *.conf files over and did you restart the 12 
server after? --> copy and restart the server 12


4) What is being logged to the file e.g. connections/disconnects, etc? 
--> yes these are getting logged


Just an FYI it worked on 12.2 and 12.3 instances on my machines:

--2020-05-19 10:01:22.489 PDT-0LOG:  parameter 
"idle_in_transaction_session_timeout" changed to "30s"
[unknown]-[unknown]-2020-05-19 10:01:46.970 PDT-0LOG:  connection 
received: host=[local]
[unknown]-aklaver-2020-05-19 10:01:46.971 PDT-0LOG:  connection 
authorized: user=aklaver database=task_manager application_name=psql
psql-aklaver-2020-05-19 10:02:22.869 PDT-0FATAL:  terminating connection 
due to idle-in-transaction timeout
psql-aklaver-2020-05-19 10:02:22.869 PDT-0LOG:  disconnection: session 
time: 0:00:35.899 user=aklaver database=task_manager host=[local]




Thanks & Regards,

Ishan Joshi



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Unique index on hash of jsonb value - correct solution?

2020-05-19 Thread Albrecht Dreß

Hi,

first, thanks a lot for your input!

Am 19.05.20 16:16 schrieb(en) Stephen Frost:

That's only going to work up to a certain size for that text and blob value 
too, of course..


These three columns (stripped-down code, there are more in the table) are a 
“lazy” import from various python scripts.  The text and bytea (actually a 
sha256 hash) fields always have a fixed, short size, but the json varies 
between a single item and a quite large dict, so extracting the data into 
columns seems to be cumbersome.


This is looking like it might be some kind of KV store which is generally 
discouraged.


Well, yes, somehow…


"jsonb" uses an internal binary representation that reorders the attributes in 
a deterministic fashion, so yes, that is guaranteed.


Sure- but the md5() isn't going to run on the jsonb data directly, it'll run on 
the text representation that's returned, and I'm not at all convinced that the 
project has agreed to make that text form always be canonical and identical 
forever, including across major version upgrades..


Ok, I see.


Further, there are some other cases to consider- such as: do you think that 
0.1230 is different from 0.123?  Since we'll store numbers in jsonb as 
numeric, we preserve trailing fractional zeroes.


Good point; currently no problem for me (strings and ints only in the json 
dicts), but might be a serious issue in the future.


Overall, if what you need is uniqueness here, I'd strongly reconsider the 
choice to use jsonb to store this data and instead see if you can break the 
data out into proper columns with a proper unique constraint across them (or 
across some hash of the combination of them that's done in a way that is clear 
and unambiguous).


Got your point, I hope – probably the clean solution would be a defined, sorted 
serialisation plus hashing of the json (and the text and bytea columns) either 
in the python producers, or in the (already existing) db function called by the 
python scripts for inserting data…

Thanks again,
Albrecht.



A limit clause can cause a poor index choice

2020-05-19 Thread Nick Cleaton
The attached script builds a 10G test table which demonstrates a
problem that we have in production with postgresql 12.3-1.pgdg18.04+1
on ubuntu linux. Indexes:

test_orders_o_date_idx btree(o_date)
test_orders_customer_id_o_date_idx btree(customer_id, o_date)

We query for the most recent orders for sets of customers, and
sometimes none of those customers have any orders and the results are
empty:

explain analyze select * from test_orders where
customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
order by o_date desc;

QUERY PLAN

 Sort  (cost=24848.96..24870.67 rows=8686 width=1839) (actual
time=1.101..1.102 rows=0 loops=1)
   Sort Key: o_date DESC
   Sort Method: quicksort  Memory: 25kB
   ->  Index Scan using test_orders_customer_id_o_date_idx on
test_orders  (cost=0.43..17361.20 rows=8686 width=1839) (actual
time=1.047..1.047 rows=0 loops=1)
 Index Cond: (customer_id = ANY
('{9993,9997,9912,9954,9100,9101,9102,9234,9500,9512}'::integer[]))
 Planning Time: 3.821 ms
 Execution Time: 1.174 ms
(7 rows)

So far so good. But if we add a limit clause to the query then the
plan goes very wrong:

explain analyze select * from test_orders where
customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
order by o_date desc limit 10;

  QUERY PLAN
-
 Limit  (cost=0.43..1660.98 rows=10 width=1839) (actual
time=4990.424..4990.424 rows=0 loops=1)
   ->  Index Scan Backward using test_orders_o_date_idx on test_orders
 (cost=0.43..1442355.43 rows=8686 width=1839) (actual
time=4990.423..4990.423 rows=0 loops=1)
 Filter: (customer_id = ANY
('{9993,9997,9912,9954,9100,9101,9102,9234,9500,9512}'::integer[]))
 Rows Removed by Filter: 500
 Planning Time: 0.063 ms
 Execution Time: 4990.435 ms


Is there something we can adjust to get it to prefer
test_orders_customer_id_o_date_idx even when there's a limit clause ?
#!/usr/bin/python3

import random
import datetime

secs_in_day = 24*60*60

longstr = """iufdpoaiusoto3u5034534i5j345k345lku09s80s9dfjwer.,newrwwerwerwerlwerjlwejrlkewjr""" * 10

print("""

drop table if exists test_orders;
drop sequence if exists test_orders_id_seq;

CREATE SEQUENCE test_orders_id_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;

CREATE TABLE test_orders (
id integer DEFAULT nextval('test_orders_id_seq'::regclass) NOT NULL,
o_date timestamp with time zone NOT NULL,
customer_id integer,
str1 text,
num1 integer,
long1 text,
long2 text,
long3 text,
long4 text
);

COPY test_orders(o_date, customer_id, str1, num1, long1, long2, long3, long4) FROM stdin;""")

for day in range(5000):
    orders = [(secs_in_day * day + random.randrange(secs_in_day), customer) for customer in range(day, day+1000)]
    for o_date, customer_id in sorted(orders):
        print(datetime.datetime.fromtimestamp(1234234234 + o_date).isoformat(),
              customer_id,
              "blah",
              random.randrange(100),
              longstr,
              longstr,
              longstr,
              longstr,
              sep="\t")

print("""\\.

create index test_orders_o_date_idx on test_orders using btree(o_date);
create index test_orders_customer_id_o_date_idx on test_orders using btree(customer_id, o_date);

analyze test_orders;
""")


Configuring more than one hot standby server

2020-05-19 Thread Santhosh Kumar
Hi Community,
 I read about "hot standby" servers, and as far as the documentation goes, it
does not explicitly explain whether it is possible to configure more
than one database as a hot standby server. If we can configure more than
one hot standby server, which database will take the role of master in
case the original master instance crashes? Does a leader election happen?

Regards,
KRS


Clarification relation logical replication

2020-05-19 Thread Santhosh Kumar
Hi,
 Can I have a cluster of multiple instances with logical replication
enabled for all tables with every single instance as both publisher and
subscriber, to mimic multi master replication?

The documentation says, care must be taken on subscribed publication
objects to not overlap? Ref :
https://www.postgresql.org/docs/12/logical-replication-subscription.html

"A subscriber node may have multiple subscriptions if desired. It is
possible to define multiple subscriptions between a single
publisher-subscriber pair, in which case care must be taken to ensure that
the subscribed publication objects don't overlap."


If the above multi master scenario is not possible in logical replication,
and if is one publisher and multiple subscribers, what will happen if the
publisher instance crashes? How to enable hot standby in cluster setup
where there is one publisher of logical replication and rest all instances
are subscribers?

Regards,
KRS


Re: A limit clause can cause a poor index choice

2020-05-19 Thread Mohamed Wael Khobalatte
Hi Nick,

I believe a second ordering, by id desc, will get your query to use the
right index, and shouldn't be functionally different from what you would
expect.

```
select * from test_orders where

customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])

order by o_date desc, id desc limit 10;
```

I didn't look closely as to why from your data though. I'll leave it to
more experienced people to comment as to why the planner misjudged your
query badly. What happens when you raise the limit? Say to a 1000?

On Tue, May 19, 2020 at 3:00 PM Nick Cleaton  wrote:

> The attached script builds a 10G test table which demonstrates a
> problem that we have in production with postgresql 12.3-1.pgdg18.04+1
> on ubuntu linux. Indexes:
>
> test_orders_o_date_idx btree(o_date)
> test_orders_customer_id_o_date_idx btree(customer_id, o_date)
>
> We query for the most recent orders for sets of customers, and
> sometimes none of those customers have any orders and the results are
> empty:
>
> explain analyze select * from test_orders where
> customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
> order by o_date desc;
>
> QUERY PLAN
>
> 
>  Sort  (cost=24848.96..24870.67 rows=8686 width=1839) (actual
> time=1.101..1.102 rows=0 loops=1)
>Sort Key: o_date DESC
>Sort Method: quicksort  Memory: 25kB
>->  Index Scan using test_orders_customer_id_o_date_idx on
> test_orders  (cost=0.43..17361.20 rows=8686 width=1839) (actual
> time=1.047..1.047 rows=0 loops=1)
>  Index Cond: (customer_id = ANY
> ('{9993,9997,9912,9954,9100,9101,9102,9234,9500,9512}'::integer[]))
>  Planning Time: 3.821 ms
>  Execution Time: 1.174 ms
> (7 rows)
>
> So far so good. But if we add a limit clause to the query then the
> plan goes very wrong:
>
> explain analyze select * from test_orders where
> customer_id=any(ARRAY[9993,9997,9912,9954,9100,9101,9102,9234,9500,9512])
> order by o_date desc limit 10;
>
>   QUERY PLAN
>
> -
>  Limit  (cost=0.43..1660.98 rows=10 width=1839) (actual
> time=4990.424..4990.424 rows=0 loops=1)
>->  Index Scan Backward using test_orders_o_date_idx on test_orders
>  (cost=0.43..1442355.43 rows=8686 width=1839) (actual
> time=4990.423..4990.423 rows=0 loops=1)
>  Filter: (customer_id = ANY
> ('{9993,9997,9912,9954,9100,9101,9102,9234,9500,9512}'::integer[]))
>  Rows Removed by Filter: 500
>  Planning Time: 0.063 ms
>  Execution Time: 4990.435 ms
>
>
> Is there something we can adjust to get it to prefer
> test_orders_customer_id_o_date_idx even when there's a limit clause ?
>


Re: A limit clause can cause a poor index choice

2020-05-19 Thread Michael Lewis
What does pg_stats say about column customer_id? Specifically, how many
ndistinct, and what is the sum of the most common values' frequencies? If you have 5000
distinct customer_id values, and the (default 100) most common values only
cover 2% of the total rows, then the optimizer will assume that any given
customer_id outside that list will yield approx reltuples * .98 / ( 5000 - 100 ) rows. So if
your table has 1 million rows, the estimate would be roughly
200 rows in the table per customer_id in your array.

Looking at your query plan, the optimizer expects rows=8686 for those
customer_ids and it knows you only want 10 of the most recent ones. It made
the right call based on the information it has.

Increase default_statistics_target, at least on that column, and see if you
get a much better plan (a per-column sketch follows the query below). I don't
know where I got this query from online, but here ya go. I'd be curious how
frac_MCV in this changes when default_statistics_target is more like 250 or
500 and the table is analyzed again to reflect that change.


SELECT
    ( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
    tablename,
    attname,
    inherited,
    null_frac,
    n_distinct,
    array_length(most_common_vals,1) n_mcv,
    array_length(histogram_bounds,1) n_hist,
    correlation,
    *
FROM pg_stats
WHERE
    schemaname = 'public'
    AND tablename='test_orders'
    AND attname='customer_id'
ORDER BY 1;
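
A per-column version of that suggestion might look like the following (the value 500
is only an example):

ALTER TABLE test_orders ALTER COLUMN customer_id SET STATISTICS 500;
ANALYZE test_orders;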


Re: Configuring more than one hot standby server

2020-05-19 Thread Kyotaro Horiguchi
At Tue, 19 May 2020 22:39:20 +0530, Santhosh Kumar  
wrote in 
> Hi Community,
>  I read about "hot stand by" server and as far as the document goes, it
> does not explicitly explain about, whether is it possible to configure more
> than on database as a Hot stand by server. If we can configure more than
> one hot stand by server, which database will take the role of master, in
> case the original master instance crashes? Does a leader election happen?

You can configure two or more standbys for one primary and configure
some or all of them as synchronous standbys.  However, PostgreSQL
doesn't choose the next primary automatically.

A replication cluster involving three or more servers can be
configured in a couple of ways.  You may configure them in a cascaded
replication chain, in a fanned-out configuration, or in a mixture of the
two.

If you made a cascaded replication set, you don't have a choice other
than the first standby in the chain.

   before: [primary] - [standby 1] - [standby 2] ...
   after : [old standby 1] - [standby 2] ...

If you made a fanned-out replication set, you can configure some or
all of the standbys to be synchronous, which guarantees that they have
the same data for all transactions committed on the primary.  If you
configured priority-mode synchronous standbys, you can choose the next
primary among the first n active standbys.  If you didn't configure
synchronous standbys, you would need to check all the standbys and
choose the most "advanced" one; that is, the next primary is the
standby having the largest receive LSN (see the query sketch after the
diagrams below).  You need to reconstruct the replication set anyway.

   before: [primary] -+- [standby 1] LSN = x - 8
  +- [standby 2] LSN = x - 2  <= the next primary
  +- [standby 3] LSN = x - 6

   after : [old standby 2] -+- [standby 1]
+- [standby 3]

The same applies in the case of a quorum-mode synchronous standby
setting.
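
One quick way to compare standbys, as a sketch, is to run the following on each of
them and pick the one reporting the largest value:

SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();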


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Removing Last field from CSV string

2020-05-19 Thread Samuel Roseman
Regular expressions, in my opinion, can be a very powerful text search and 
replace engine if you know how to use it.Feel free to enhance what I provided 
below; it seems to work for the example you provided.
postgres=# select regexp_replace(substring('Class V,Class VI,Class 
VII,Competitive Exam,Class VIII,Class X,Class XI,Class IX,Class 
XII',1,60),'(.*),\w+','\1');                     regexp_replace                 
     Class V,Class 
VI,Class VII,Competitive Exam,Class VIII(1 row)



Regards,

PflugerGeek
On Saturday, May 16, 2020, 10:19:28 AM CDT, Alex Magnum wrote:

Hi,
I have a string that I want to cut to 60 chars and then remove the last field 
and comma.

substring('Class V,Class VI,Class VII,Competitive Exam,Class VIII,Class X,Class 
XI,Class IX,Class XII',1,60);

substring | Class V,Class VI,Class VII,Competitive Exam,Class VIII,Class

Now I try to remove the last field and comma ",Class"
to get: Class V,Class VI,Class VII,Competitive Exam,Class VIII
Is there a function or easy way to do this? Any help would be appreciated.
Thank you,
Alex