Could not launch new process for connection: Could not allocate memory

2020-05-07 Thread Lisandro Rostagno
I need some help to understand the cause of this error and how to fix it.

I have a server running PostgreSQL 9.3.
The cluster has around 1,000 databases.
I'm running pgBouncer for connection pooling, and I set up a pool_size of
only 1 connection per database.
The total number of active connections at any time is ~80, because
most of the databases have almost no activity at all.

These are some of my PostgreSQL settings:

max_connections = 300
shared_buffers = 2GB
effective_cache_size = 2GB
maintenance_work_mem = 1GB
work_mem = 288MB
wal_buffers = 8MB
checkpoint_segments = 16


The server has 64GB of total RAM, 16 CPU cores, and it is running CentOS 7.
My stack is: Nginx, uWSGI, Redis, pgBouncer and PostgreSQL.
It's all installed in the same server, so resources must be shared between
the elements of the stack.

Redis is set to use no more than 30GB of RAM.
Nginx ~8GB of RAM.
uWSGI uses ~10GB of RAM.
PostgreSQL uses ~8GB of RAM.


I'm not an expert at PostgreSQL. I've been reading the documentation
regarding memory consumption, and considering my settings, my best guess is
that I need to set higher values for shared_buffers and
effective_cache_size. I came to that conclusion because those settings have
been in place for a long time (since when the server had far fewer
resources). Over the last few years I've been adding resources to the
server, but at the same time it has become busier. So I think shared_buffers
should be at least 8GB; what do you think about that?

Also, from what I've read, I should make effective_cache_size higher than
shared_buffers (considering the amount of RAM the server has available). Am
I right?

One more thing: what do you think about my work_mem setting? I'm planning
to reduce it (I'm not sure why it is set to 288MB). I know that the vast
majority of the queries are very simple and fast, because all the
databases correspond to simple blog applications, where most of the queries
are SELECTs and the UPDATEs involve changing some value in a specific row,
nothing complex. So, what do you think? I plan to reduce work_mem; am I
headed in the right direction?
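
To make that concrete, the change I have in mind would look roughly like
this (the numbers are only my guesses based on what's left after Redis,
Nginx and uWSGI, nothing tested yet):

shared_buffers = 8GB          # up from 2GB
effective_cache_size = 12GB   # higher than shared_buffers; roughly what is
                              # left of the 64GB for PostgreSQL + OS cache
work_mem = 32MB               # down from 288MB, since most queries are simple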

Thank you very much in advance!
Warm regards,
Lisandro.


Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Amarendra Konda
Hi,

PostgreSQL version : PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by
gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

We have noticed a huge difference in terms of execution plan (response time)
when we pass direct values vs. an inner query to the IN clause.

High-level details of the use case are as follows:

   - The SQL involves 2 tables, Process_instance (master) and
   Process_activity (child).
   - We want to fetch the TOP 50 rows from the Process_activity table for
   the given values of Process_instance.
   - When we use an inner join / inner query (query 1) between the parent
   and child tables, the LIMIT is not really taken into account. Instead it
   fetches more rows and columns than required, and only limits the result
   at the end.


*Query1*

web_1=> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT
pa.process_activity_id  FROM process_activity pa WHERE pa.app_id =
'427380312000560' AND pa.created > '1970-01-01 00:00:00' AND
pa.process_instance_id in *(SELECT pi.process_instance_id FROM
process_instance pi WHERE pi.user_id = '317079413683604' AND pi.app_id =
'427380312000560')* ORDER BY pa.process_instance_id,pa.created limit 50;


QUERY PLAN


--
 Limit  (cost=1071.47..1071.55 rows=31 width=24) (actual
time=85.958..85.991 rows=50 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id, pa.created
   Buffers: shared hit=43065
   ->  Sort  (cost=1071.47..1071.55 rows=31 width=24) (actual
time=85.956..85.971 rows=50 loops=1)
 Output: pa.process_activity_id, pa.process_instance_id, pa.created
 Sort Key: pa.process_instance_id, pa.created
 Sort Method: top-N heapsort  Memory: 28kB
 Buffers: shared hit=43065
 ->  Nested Loop  (cost=1.14..1070.70 rows=31 width=24) (actual
time=0.031..72.183 rows=46992 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id,
pa.created
   Buffers: shared hit=43065
   ->  Index Scan using fki_conv_konotor_user_user_id on
public.process_instance pi  (cost=0.43..2.66 rows=1 width=8) (actual
time=0.010..0.013 rows=2 loops=1)
 Output: pi.process_instance_id
 Index Cond: (pi.user_id = '317079413683604'::bigint)
 Filter: (pi.app_id = '427380312000560'::bigint)
 Buffers: shared hit=5
   ->  Index Scan using
process_activity_process_instance_id_app_id_created_idx on
public.process_activity pa  (cost=0.70..1053.80 rows=1425 width=24) (actual
time=0.015..20.702 rows=*23496* loops=2)

*  Output: pa.process_activity_id, pa.process_activity_type, pa.voice_url,
pa.process_activity_user_id, pa.app_id, pa.process_instance_id, pa.alias,
pa.read_by_user, pa.source, pa.label_category_id, pa.label_id,
pa.csat_response_id, pa.process_activity_fragments, pa.created, pa.updated,
pa.rule_id, pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments,
pa.status_fragment, pa.internal_meta, pa.interaction_id,
pa.do_not_translate, pa.should_translate, pa.in_reply_to*
 Index Cond: ((pa.process_instance_id =
pi.process_instance_id) AND (pa.app_id = '427380312000560'::bigint) AND
(pa.created > '1970-01-01 00:00:00'::timestamp without time zone))
 Buffers: shared hit=43060
 Planning time: 0.499 ms
 Execution time: 86.040 ms
(22 rows)

*Query 2*

web_1=>  EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT
pa.process_activity_id AS m_process_activity_id FROM process_activity pa
WHERE pa.app_id = '427380312000560' AND pa.created > '1970-01-01 00:00:00'
AND pa.process_instance_id in (
*240117466018927,325820556706970,433008275197305*) ORDER BY
pa.process_instance_id,pa.created limit 50;

   QUERY PLAN

-
 Limit  (cost=0.70..37.66 rows=50 width=24) (actual time=0.023..0.094
rows=50 loops=1)
   Output: process_activity_id, process_instance_id, created
   Buffers: shared hit=50
   ->  Index Scan using
process_activity_process_instance_id_app_id_created_idx on
public.process_activity pa  (cost=0.70..3124.97 rows=4226 width=24) (actual
time=0.022..0.079 *rows=50* loops=1)
 Output: process_activity_id, process_instance_id, created
 Index Cond: ((pa.process_instance_id = ANY
('{140117466018927,225820556706970,233008275197305}'::bigint[])) AND
(pa.app_id = '427380312000560'::bigint) AND (pa.created > '1970-01-01
00:00:00'::timestamp without time zone))
 Buffers: shared hi

RE: pg_basebackup inconsistent performance

2020-05-07 Thread Jasen Lentz


Is the backup server shared among other systems..?
No,  physical system

And there's no increased activity on the backup server either?
No

Have you looked at network traffic for the duration?  And/or disk i/o on each 
system?  If you ran a backup once and then immediately after and that's the 
'fast' case then you may be seeing performance be better due to a lot of data 
being in the filesystem cache.  pg_basebackup being single-threaded probably 
doesn't help here either, you might want to consider one of the parallel-backup 
options.

Yes, looked at all the system stats, nothing changed, just the backups running 
extremely long.

Was looking into pgbackrest, just haven't gotten it configured yet
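
For reference, the kind of minimal pgBackRest setup I've been looking at is
roughly this (stanza name and paths are placeholders, and none of it is
tested here):

# /etc/pgbackrest.conf
[global]
repo1-path=/backup/pgbackrest
# process-max controls how many parallel compression/transfer processes run
process-max=4

[main]
pg1-path=/var/lib/pgsql/11/data

# then, after pointing archive_command at pgbackrest:
pgbackrest --stanza=main stanza-create
pgbackrest --stanza=main backup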


-Original Message-
From: Stephen Frost  
Sent: Wednesday, May 6, 2020 12:30 PM
To: Jasen Lentz 
Cc: Adrian Klaver ; 
pgsql-general@lists.postgresql.org
Subject: Re: pg_basebackup inconsistent performance

Greetings,

* Jasen Lentz (jle...@sescollc.com) wrote:
> Where are the machines you are backing up from/to relative to each on the 
> network?
> Direct ethernet connection between 10G network interfaces

Is the backup server shared among other systems..?

> Is there increased activity on the database servers e.g. inserts, updates, 
> etc during the extended backups?
> Not according to sar reports

And there's no increased activity on the backup server either?

Have you looked at network traffic for the duration?  And/or disk i/o on each 
system?  If you ran a backup once and then immediately after and that's the 
'fast' case then you may be seeing performance be better due to a lot of data 
being in the filesystem cache.  pg_basebackup being single-threaded probably 
doesn't help here either, you might want to consider one of the parallel-backup 
options.

Thanks,

Stephen




Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Adrian Klaver

On 5/7/20 4:19 AM, Amarendra Konda wrote:

Hi,

PostgreSQL version : PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled 
by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit


We have noticed huge difference interms of execution plan ( response 
time) , When we pass the direct values  Vs  inner query to IN clause.


High level details of the use case are as follows

  * As part of the SQL there are 2 tables named Process_instance
(master) and Process_activity ( child)
  * Wanted to fetch TOP 50 rows from  Process_activity table for the
given values of the Process_instance.
  * When we used Inner Join / Inner query ( query1)  between parent
table and child table , LIMIT is not really taking in to account.
Instead it is fetching more rows and columns that required, and
finally limiting the result


It is doing what you told it to do, which is to SELECT all 
process_instance_id's for user_id = '317079413683604' and app_id = 
'427380312000560' and then filter further. I am going to guess that 
if you run the inner query alone you will find it returns ~23496 rows.
You might have better results if you use an actual join between 
process_activity and process_instance. Something like below (obviously 
not tested):


SELECT
pa.process_activity_id
FROM
process_activity pa
JOIN
process_instance pi
ON
pa.process_instance_id = pi.process_instance_id
WHERE
pa.app_id = '427380312000560'
AND
 pa.created > '1970-01-01 00:00:00'
AND
 pi.user_id = '317079413683604'
ORDER BY
pa.process_instance_id,
pa.created
LIMIT 50;

The second query is not equivalent as you are not filtering on user_id 
and you are filtering on only three process_instance_id's.






*Query1*

web_1=> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT 
pa.process_activity_id  FROM process_activity pa WHERE pa.app_id = 
'427380312000560' AND pa.created > '1970-01-01 00:00:00' AND 
pa.process_instance_id in *_(SELECT pi.process_instance_id FROM 
process_instance pi WHERE pi.user_id = '317079413683604' AND pi.app_id = 
'427380312000560')_* ORDER BY pa.process_instance_id,pa.created limit 50;
 
 
                                                                 QUERY PLAN
 
--
  Limit  (cost=1071.47..1071.55 rows=31 width=24) (actual 
time=85.958..85.991 rows=50 loops=1)

    Output: pa.process_activity_id, pa.process_instance_id, pa.created
    Buffers: shared hit=43065
    ->  Sort  (cost=1071.47..1071.55 rows=31 width=24) (actual 
time=85.956..85.971 rows=50 loops=1)

          Output: pa.process_activity_id, pa.process_instance_id, pa.created
          Sort Key: pa.process_instance_id, pa.created
          Sort Method: top-N heapsort  Memory: 28kB
          Buffers: shared hit=43065
          ->  Nested Loop  (cost=1.14..1070.70 rows=31 width=24) (actual 
time=0.031..72.183 rows=46992 loops=1)
                Output: pa.process_activity_id, pa.process_instance_id, 
pa.created

                Buffers: shared hit=43065
                ->  Index Scan using fki_conv_konotor_user_user_id on 
public.process_instance pi  (cost=0.43..2.66 rows=1 width=8) (actual 
time=0.010..0.013 rows=2 loops=1)

                      Output: pi.process_instance_id
                      Index Cond: (pi.user_id = '317079413683604'::bigint)
                      Filter: (pi.app_id = '427380312000560'::bigint)
                      Buffers: shared hit=5
                ->  Index Scan using 
process_activity_process_instance_id_app_id_created_idx on 
public.process_activity pa  (cost=0.70..1053.80 rows=1425 width=24) 
(actual time=0.015..20.702 rows=*23496* loops=2)
* Output: pa.process_activity_id, pa.process_activity_type, 
pa.voice_url, pa.process_activity_user_id, pa.app_id, 
pa.process_instance_id, pa.alias, pa.read_by_user, pa.source, 
pa.label_category_id, pa.label_id, pa.csat_response_id, 
pa.process_activity_fragments, pa.created, pa.updated, pa.rule_id, pa.market
ing_reply_id, pa.delivered_at, pa.reply_fragments, pa.status_fragment, 
pa.internal_meta, pa.interaction_id, pa.do_not_translate, 
pa.should_translate, pa.in_reply_to*
                      Index Cond: ((pa.process_instance_id = 
pi.process_instance_id) AND (pa.app_id = '427380312000560'::bigint) AND 
(pa.created > '1970-01-01 00:00:00'::timestamp without time zone))

                      Buffers: shared hit=43060
  Planning time: 0.499 ms
  Execution time: 86.040 ms
(22 rows)

*_Query 2_*

web_1=>  EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT 
pa.process_activity_id AS m_process_activity_id

RE: Abnormal Growth of Index Size - Index Size 3x large than table size.

2020-05-07 Thread Ashish Chugh
Hi Ravi,

Thanks for your reply. One more question from our side.

To improve performance and release index space from the database, we are 
running a FULL vacuum on a monthly basis.
The PostgreSQL website does not recommend running a FULL vacuum on a 
production database, and it also requires a long downtime along with a 
large amount of log space.

What are the recommendations regarding vacuum? Can we keep running a FULL 
vacuum monthly, or should we be relying on online autovacuum instead?

Regards,
Ashish


From: Ravi Krishna [mailto:srkrish...@comcast.net]
Sent: Wednesday, May 06, 2020 9:07 PM
To: Ashish Chugh 
Cc: pgsql-gene...@postgresql.org; Ram Pratap Maurya 

Subject: Re: Abnormal Growth of Index Size - Index Size 3x large than table 
size.




On May 6, 2020, at 10:52 AM, Ashish Chugh 
mailto:ashish.ch...@lavainternational.in>> 
wrote:

Hello Ravi,

Total number of indexes are 10 and size is 65 GB. Shall we consider this as a 
normal scenario or we need to look into the growth of the indexes as this is 
increasing day by day and table data is not increasing so drastically. Due to 
this performance degradation is there and we have to run full vacuum on monthly 
basis.
Table size is only 25 gb.
Any help in this regard is appreciable.

Indexes are stored just like tables. From a storage perspective there is no 
difference between a table and an index.
So 10 indexes adding up to 65GB, compared to 25GB for the one table, sounds 
possible.



Re: pg_restore V12 fails consistently against piped pg_dumps

2020-05-07 Thread Mohamed Wael Khobalatte
On Thu, May 7, 2020 at 2:35 AM Tom Lane  wrote:
>
> Without a concrete example it's hard to say, but maybe the issue is that
> v12 is more aggressive about parallelizing restores --- see 548e50976.

Thanks Tom, I will review those changes tonight. In the meantime, to
reproduce, I run the following:

- createdb test
- create some tables, I used a simple loop to create 10 empty ones.
- pg_dump -v -Fc test | tee test.dump > /dev/null (so it's through a
pipe, same as if the file is streamed through curl which was the
original case)
- pg_restore -j4 --verbose --clean --no-acl --no-owner -d test test.dump

That should fail consistently, with pg_restore 12 and pg_dump 12. The same
passes if I run it with earlier versions.
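
Spelled out as a script, the repro is roughly this (the table-creation loop
is just an example of what I did, not the exact statements):

createdb test
for i in $(seq 1 10); do
  psql -d test -c "CREATE TABLE t${i} (id int);"
done
pg_dump -v -Fc test | tee test.dump > /dev/null
pg_restore -j4 --verbose --clean --no-acl --no-owner -d test test.dump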




Re: Lock Postgres account after X number of failed logins?

2020-05-07 Thread Stephen Frost
Greetings,

* Geoff Winkless (pgsqlad...@geoff.dj) wrote:
> On Wed, 6 May 2020, 14:28 Stephen Frost,  wrote:
> > * Geoff Winkless (pgsqlad...@geoff.dj) wrote:
> > > On Wed, 6 May 2020 at 00:05, Tim Cross  wrote:
> > > > Where Tom's solution fails is with smaller companies that cannot afford
> > > > this level of infrastructure.
> > >
> > > Is there an objection to openldap?
> >
> > LDAP-based authentication in PG involves passing the user's password to
> > the database server in the clear (or tunneled through SSL, but that
> > doesn't help if the DB is compromised), so it's really not a good
> > solution
> 
> If your DB is compromised then (if the LDAP server is only used for the db)
> what difference does it make to lose the passwords?

Seems rather unlikely to be reasonable to set up an LDAP server for
every independent DB, but if you really want to go down that route,
you're very likely to end up in the same situation- users will reuse
their password from their AD account (and/or somewhere else).

> I was (as per the thread) suggesting a simple way for small companies to
> achieve the OP's requirements without a large infrastructure investment and
> without involving the pg team undertaking the rediscovery of novel circular
> transportation-assisting devices.

The OP's comment was actually that they *have* AD in place, but there's
cases where they don't want to use AD for one reason or another.
Suggesting setting up another service like OpenLDAP seems unlikely to
really answer that, but who knows.

> Any large company will have an AD or similar setup already, clearly I'm not
> suggesting using it in that situation.

Except that's the case that was presented here- they have AD but they
don't want to use it for some subset of accounts.

> AIUI you can configure kerberos with openldap if that's more your thing,
> fwiw, but then IME the learning curve (and thus setup cost) increases
> exponentially.

Yes, you can set up OpenLDAP with Kerberos auth through SASL, you just
need an appropriate service princ from a KDC (eg: Active Directory, MIT
Kerberos, or Heimdal).  Isn't terribly hard, and is part of what's
called "LDAPv3", which is when SASL support was added (RFC 2251... from
1997).

I have to admit that I've not tried to make PG w/ 'ldap' auth use
Kerberos to connect to an LDAP server.  While it likely could be done,
in such a case you've already got a KDC and presumably would rather be
using Kerberos to auth to all of your services, including PG.

Thanks,

Stephen




Re: New Role drop with Grant/Revokes stop working after subsequent runs

2020-05-07 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Wed, May 6, 2020 at 5:05 PM AC Gomez  wrote:
> > I suppose the main question is, why would a bunch of grant and revoke
> > commands run and not do anything, not even throw an error?
> 
> Maybe its a bug? - I doubt this kind of manipulation is all that common or
> tested given the presence of what seems to be a superior alternative.

Didn't read through the rest of this, but for this part- the SQL spec
has some rather odd requirements when it comes to GRANT/REVOKEs and what
happens if no GRANT or REVOKE is able to be performed (like: don't throw
an error).

Thanks,

Stephen




Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread David G. Johnston
On Thu, May 7, 2020 at 7:40 AM Adrian Klaver 
wrote:

> On 5/7/20 4:19 AM, Amarendra Konda wrote:
> > Hi,
> >
> > PostgreSQL version : PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled
> > by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
> >
> > We have noticed huge difference interms of execution plan ( response
> > time) , When we pass the direct values  Vs  inner query to IN clause.
> >
> > High level details of the use case are as follows
> >
> >   * As part of the SQL there are 2 tables named Process_instance
> > (master) and Process_activity ( child)
> >   * Wanted to fetch TOP 50 rows from  Process_activity table for the
> > given values of the Process_instance.
> >   * When we used Inner Join / Inner query ( query1)  between parent
> > table and child table , LIMIT is not really taking in to account.
> > Instead it is fetching more rows and columns that required, and
> > finally limiting the result
>
> It is doing what you told it to do which is SELECT all
> process_instance_i's for user_id='317079413683604' and app_id =
> '427380312000560' and then filtering further. I am going to guess that
> if you run the inner query alone you will find it returns ~23496 rows.
> You might have better results if you an actual join between
> process_activity and process_instance. Something like below(obviously
> not tested):
>

What the OP seems to want is a semi-join:

(not tested)

SELECT pa.process_activity_id
FROM process_activity pa WHERE pa.app_id = '427380312000560' AND pa.created
> '1970-01-01 00:00:00'
AND EXISTS (
  SELECT 1 FROM process_instance pi WHERE pi.app_id = pa.app_id AND
pi.user_id = '317079413683604'
)
ORDER BY
pa.process_instance_id,
pa.created limit 50;

I'm unsure exactly how this will impact the plan choice but it should be an
improvement, and in any case more correctly defines what it is you are
looking for.

David J.


wal_sender_timeout default

2020-05-07 Thread Jasen Lentz
In Postgres 11.6-2, what is the default timeout for wal_sender_timeout if it is 
commented out?


Jasen M. Lentz, M.Ed
Lead Systems Administrator
Sesco Enterprises, LLC
4977 State Route 30 East (Mailing Address Only)
Greensburg, PA 15601
W:  (724) 837-1991 x207
C:  (412) 848-5612




Postgres 11.6-2 to 11.7 worth the upgrade?

2020-05-07 Thread Jasen Lentz

Would upgrading from 11.6-2 to 11.7-current give us any performance advantages? 
 We are seeing intermittent performance problems that come and go.  Also are 
there any fixes for the wal_sender and pg_basebackup between the versions?


Re: Postgres 11.6-2 to 11.7 worth the upgrade?

2020-05-07 Thread Michael Lewis
https://www.postgresql.org/docs/release/11.7/

It doesn't seem like it. Always best to run the most current minor version
though.


Re: Postgres 11.6-2 to 11.7 worth the upgrade?

2020-05-07 Thread David G. Johnston
On Thu, May 7, 2020 at 8:58 AM Jasen Lentz  wrote:

>
>
> Would upgrading from 11.6-2 to 11.7-current give us any performance
> advantages?  We are seeing intermittent performance problems that come and
> go.  Also are there any fixes for the wal_sender and pg_basebackup between
> the versions?
>

You can peruse the release notes online but, regardless, the project policy
is that you should never skip a minor release update (though we also
don't use 11.6-2 version numbering...).

David J.


Re: wal_sender_timeout default

2020-05-07 Thread Michael Lewis
https://www.postgresql.org/docs/11/runtime-config-replication.html



Re: wal_sender_timeout default

2020-05-07 Thread Adrian Klaver

On 5/7/20 8:56 AM, Jasen Lentz wrote:
In Postgres 11.6-2, what is the default timeout for wal_sender_timeout 
if it is commented out?


Form postgresql.conf

# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.

...

#wal_sender_timeout = 60s   # in milliseconds; 0 disables



Jasen M. Lentz, M.Ed

Lead Systems Administrator

Sesco Enterprises, LLC

4977 State Route 30 East (Mailing Address Only)
Greensburg, PA 15601

W:  (724) 837-1991 x207

C:  (412) 848-5612




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: wal_sender_timeout default

2020-05-07 Thread David G. Johnston
On Thu, May 7, 2020 at 8:56 AM Jasen Lentz  wrote:

> In Postgres 11.6-2, what is the default timeout for wal_sender_timeout if
> it is commented out?
>
>
>

The same value you would get if you uncommented the commented out setting...

Also the value you would get by querying the database while the setting is
commented out (assuming it isn't being set elsewhere).

show wal_sender_timeout;
select * from pg_settings where name = 'wal_sender_timeout';

Its also the value that the documentation says is the default value.

https://www.postgresql.org/docs/12/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-SENDER

David J.


Re: Abnormal Growth of Index Size - Index Size 3x large than table size.

2020-05-07 Thread Michael Lewis
On Thu, May 7, 2020 at 8:50 AM Ashish Chugh <
ashish.ch...@lavainternational.in> wrote

> To improve performance and release index space from database, We are
> running FULL Vacuum on monthly basis.
>
> On PostgreSQL website it is not recommended to run FULL Vacuum on
> Production Database and this also requires long downtime along with huge
> log space requirement.
>
>
>
> What are the recommendations regarding vacuum. Can we run FULL Vacuum on
> monthly basis or we should be running Online Auto Vacuum instead.
>


Autovacuum should be tuned to be more aggressive if it is not keeping up.
Never turn it off. Decrease scale factor and cost_delay to get more
throughput. Perhaps increase the number of workers, particularly if there
are multiple databases in the cluster. Be aware that the cost limit is
shared among workers, so the cost limit may need to be increased when
increasing workers; otherwise you will be working on more tables concurrently
but not getting any more work done in total per minute. Consider customizing
parameters on very large tables (100 million rows or more?) to have a
smaller scale factor than your new default even. Your goal should be to
reach a "steady state" with rows being removed, that space marked as
re-usable by autovacuum, and then the new updates/inserts using that space.
If you are oscillating between 1GB and 10GB for storing a table as it
bloats and then vacuum full is done periodically, then you are doing things
wrong. If it hurts to clean up, do it more often and a little at a time.

Oh, and an old blog post I read mentioned that autovacuum reserves the full
maintenance_work_mem at the start. I don't know if that is changed (fixed)
now, but I like to have maintenance_work_mem high for index creation and
such, but I set autovacuum_work_mem lower. That way it may have to re-scan
some large indexes multiple times to finish its work, but I'm not constantly
holding large amounts of memory when vacuuming smaller tables.
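
As a sketch of the kind of tuning described above (the numbers are purely
illustrative, not a recommendation for your workload, and some of these need
a reload or restart to take effect):

ALTER SYSTEM SET autovacuum_vacuum_scale_factor = 0.05;  -- default is 0.2
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';   -- default is 20ms pre-v12
ALTER SYSTEM SET autovacuum_max_workers = 6;             -- default is 3, needs restart
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;    -- shared among the workers
ALTER SYSTEM SET autovacuum_work_mem = '256MB';          -- cap per-worker vacuum memory
-- per-table override for a very large table (table name is hypothetical):
ALTER TABLE big_events SET (autovacuum_vacuum_scale_factor = 0.01);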


Re: wal_sender_timeout default

2020-05-07 Thread Adrian Klaver

On 5/7/20 9:07 AM, Adrian Klaver wrote:

On 5/7/20 8:56 AM, Jasen Lentz wrote:
In Postgres 11.6-2, what is the default timeout for wal_sender_timeout 
if it is commented out?


Form postgresql.conf


Oops, from.



# The commented-out settings shown in this file represent the default 
values.
# Re-commenting a setting is NOT sufficient to revert it to the default 
value;

# you need to reload the server.

...

#wal_sender_timeout = 60s   # in milliseconds; 0 disables


Realized that the comment may be confusing. If you just include an 
integer it will be in milliseconds; including a time unit, e.g. s(econds), 
overrides that.
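
For example, both of these mean one minute:

wal_sender_timeout = 60s
wal_sender_timeout = 60000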






Jasen M. Lentz, M.Ed

Lead Systems Administrator

Sesco Enterprises, LLC

4977 State Route 30 East (Mailing Address Only)
Greensburg, PA 15601

W:  (724) 837-1991 x207

C:  (412) 848-5612







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Abnormal Growth of Index Size - Index Size 3x large than table size.

2020-05-07 Thread Adrian Klaver

On 5/7/20 6:34 AM, Ashish Chugh wrote:

Hi Ravi,

Thanks for your reply. One more query from our side.

To improve performance and release index space from database, We are 
running FULL Vacuum on monthly basis.


As I recently learned:

https://www.postgresql.org/message-id/1392022649.706483.1587523402642%40mail.yahoo.com

To release index space without a FULL vacuum you need to REINDEX. 
Look at the message above for more information.
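
A minimal sketch (index and table names here are hypothetical, and plain 
REINDEX blocks writes to the table while it runs):

-- check the size first
SELECT pg_size_pretty(pg_relation_size('my_big_index'));
-- rebuild a single index
REINDEX INDEX my_big_index;
-- or rebuild all indexes of one table at once
REINDEX TABLE my_table;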




On PostgreSQL website it is not recommended to run FULL Vacuum on 
Production Database and this also requires long downtime along with huge 
log space requirement.




What are the recommendations regarding vacuum. Can we run FULL Vacuum on 
monthly basis or we should be running Online Auto Vacuum instead.


Regards,

Ashish

*From:*Ravi Krishna [mailto:srkrish...@comcast.net]
*Sent:* Wednesday, May 06, 2020 9:07 PM
*To:* Ashish Chugh 
*Cc:* pgsql-gene...@postgresql.org; Ram Pratap Maurya 

*Subject:* Re: Abnormal Growth of Index Size - Index Size 3x large than 
table size.




On May 6, 2020, at 10:52 AM, Ashish Chugh
mailto:ashish.ch...@lavainternational.in>> wrote:

Hello Ravi,

Total number of indexes are 10 and size is 65 GB. Shall we consider
this as a normal scenario or we need to look into the growth of the
indexes as this is increasing day by day and table data is not
increasing so drastically. Due to this performance degradation is
there and we have to run full vacuum on monthly basis.

Table size is only 25 gb.

Any help in this regard is appreciable.

Indexes are stored just like tables. From a storage perspective there is 
no difference between a table and an index.


So 10 indexes adding up to 65GB, compared to 25GB for the one table, 
sounds possible.





--
Adrian Klaver
adrian.kla...@aklaver.com




pg_dump negation regex

2020-05-07 Thread Eudald Valcàrcel Lacasa
Hello!
I've been looking at the documentation and there seems to be no keyword for
negation purposes.

Am I missing something? Or is it just like this?

I'm trying to use pg_dump to create a database skeleton, and I have tables
like program_0, program_1, etc. I'd like to dump all of the tables except
program_1 onwards.

Does anybody happen to know a way to accomplish it?

Thank you!
Eudald


increase of xact_commit vs txid_current

2020-05-07 Thread reg_pg_stefanz

Hi,

I am confused, the documentation says for pg_stat_database
   xact_commit      Number of transactions in this database that have 
been committed

and somewhere else
    txid_current()    get current transaction ID, assigning a new one 
if the current transaction does not have one


I would have naively expected txid_current() to be more or less in line 
with xact_commit, or to increase faster, since txid_current() is 
global but xact_commit is per database.


However, xact_commit seems to increase faster than txid_current(). What 
am I missing? Are there commits that do not increase the xid number?


Thanks
Stefan





Re: Lock Postgres account after X number of failed logins?

2020-05-07 Thread Christian Ramseyer



On 06.05.20 13:48, Guillaume Lelarge wrote:
> On Wed, 6 May 2020 at 04:18, Christian Ramseyer  wrote:
> 
> Here is a quick, rough example with still some blanks to fill in - I put
> it on github for readability:
> 
> 
> The main blanks are in the postgres-action.conf section. The called
> scripts in /usr/local/bin would need to be written. It can be as simple
> as "psql -c alter role xxx nologin", but you might add some features
> like connecting to the primary server if fail2ban triggered on the
> standby. Also I'm not sure if setting nologin is the best way to disable
> an account, but I'm sure somebody on here could tell you.
> 
> 
> I already knew about fail2ban, but didn't know it could be set up this
> way. That's pretty impressive. I've just finished testing your config
> files, and it works really well (well, when you finally get rid of the
> selinux permission errors :) ). Anyway, thanks a lot for sharing this.
> 

Thanks for trying it out and the kind words, Guillaume & Ken !

There are some rough corners, I think to make it useful we would need to
do at least:

1. Write reasonable scripts for account locking/unlocking

2. Currently the lockout will also be executed for non-existing user
names and thus make the DOS worse, so we'd need a smart solution for
that (config file with valid users, or cached queries into PG from time
to time to get the existing users, or just being smarter on the log
parsing DETAILS line)

3. Examples how to combine with
https://www.postgresql.org/docs/current/auth-delay.html and/or firewall
drops, so that an attacker gets slowed down. Even if the account is
locked already, the system will still be harmed otherwise.


I'm happy to host this project if it helps enterprise adoption of
Postgres. I've converted the gist into an actual repository, and you're
all very welcome to become contributors:
https://github.com/rc9000/postgres-fail2ban-lockout

Cheers
Christian


-- 
Christian Ramseyer, netnea ag
Network Management. Security. OpenSource.
https://www.netnea.com







Re: pg_dump negation regex

2020-05-07 Thread Adrian Klaver

On 5/7/20 9:53 AM, Eudald Valcàrcel Lacasa wrote:

Hello!
I've been looking at the documentation and there seems to be no keyword 
for negation purposes.


Am I missing something? Or it's like this?


https://www.postgresql.org/docs/12/app-pgdump.html

-T pattern
--exclude-table=pattern

Do not dump any tables matching pattern. The pattern is interpreted 
according to the same rules as for -t. -T can be given more than once to 
exclude tables matching any of several patterns.


When both -t and -T are given, the behavior is to dump just the 
tables that match at least one -t switch but no -T switches. If -T 
appears without -t, then tables matching -T are excluded from what is 
otherwise a normal dump.



For what patterns you can use see:

https://www.postgresql.org/docs/12/app-psql.html#APP-PSQL-PATTERNS



I'm trying to use pg_dump to create a database skeleton and I have 
tables like program_0, program_1, etc. And I'd like to dump all the 
tables but client_1 onwards.


Does anybody happen to know a way to accomplish it?

Thank you!
Eudald



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Amarendra Konda
Hi Adrian,

Thanks for the reply. I have included the latest execution plans for various
SQL statements (inner join, subquery, and passing values instead of the
subquery).
As suggested, I tried an INNER JOIN; however, the result was similar to the
subquery.

Is there any way we can tell the optimiser to process fewer rows based on
the LIMIT value (i.e. maybe an SQL rewrite)?


*INNER SQL*

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT pi.process_instance_id AS
pi_process_instance_id FROM process_instance pi WHERE pi.user_id =
'137074931866340' AND pi.app_id = '126502930200650';
 QUERY
PLAN
-
 Index Scan using fki_conv_konotor_user_user_id on public.process_instance
pi  (cost=0.43..2.66 rows=1 width=8) *(actual time=0.018..0.019 rows=2
loops=1)*
   Output: process_instance_id
   Index Cond: (pi.user_id = '137074931866340'::bigint)
   Filter: (pi.app_id = '126502930200650'::bigint)
   Buffers: shared hit=5
 Planning time: 0.119 ms
 Execution time: 0.041 ms


*Full query - Sub query*

 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT pa.process_activity_id
AS pa_process_activity_id  FROM process_activity pa WHERE pa.app_id =
'126502930200650' AND pa.created > '1970-01-01 00:00:00' AND
pa.process_instance_id in (SELECT pi.process_instance_id AS
pi_process_instance_id FROM process_instance pi WHERE pi.user_id =
'137074931866340' AND pi.app_id = '126502930200650') ORDER BY
pa.process_instance_id, pa.created limit 50;



  QUERY PLAN



--
--
---
 Limit  (cost=1072.91..1072.99 rows=31 width=24) (actual
time=744.386..744.415 rows=50 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id, pa.created
   Buffers: shared hit=3760 read=39316
   ->  Sort  (cost=1072.91..1072.99 rows=31 width=24) (actual
time=744.384..744.396 rows=50 loops=1)
 Output: pa.process_activity_id, pa.process_instance_id, pa.created
 Sort Key: pa.process_instance_id, pa.created
 Sort Method: top-N heapsort  Memory: 28kB
 Buffers: shared hit=3760 read=39316
 ->  Nested Loop  (cost=1.14..1072.14 rows=31 width=24) (actual
time=0.044..727.297 rows=47011 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id,
pa.created
   Buffers: shared hit=3754 read=39316
   ->  Index Scan using fki_conv_konotor_user_user_id on
public.process_instance pi  (cost=0.43..2.66 rows=1 width=8) *(actual
time=0.009..0.015 rows=2 loops=1)*
 Output: pi.process_instance_id
 Index Cond: (pi.user_id = '137074931866340'::bigint)
 Filter: (pi.app_id = '126502930200650'::bigint)
 Buffers: shared hit=5
   ->  Index Scan using
process_activity_process_instance_id_app_id_created_idx on
public.process_activity pa  (cost=0.70..1055.22 rows=1427 width=24) *(actual
time=0.029..349.000 rows=23506 loops=2)*
 Output: pa.process_activity_id,
pa.process_activity_type, pa.voice_url, pa.process_activity_user_id,
pa.app_id, pa.process_instance_id, pa.alias, pa.read_by_user, pa.source,
pa.label_category_id, pa.label_id, pa.csat_respons
e_id, pa.process_activity_fragments, pa.created, pa.updated, pa.rule_id,
pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments,
pa.status_fragment, pa.internal_meta, pa.interaction_id,
pa.do_not_translate, pa.should_tr
anslate, pa.in_reply_to
 Index Cond: ((pa.process_instance_id =
pi.process_instance_id) AND (pa.app_id = '126502930200650'::bigint) AND
(pa.created > '1970-01-01 00:00:00'::timestamp without time zone))
 Buffers: shared hit=3749 read=39316
 Planning time: 2.547 ms
 Execution time: 744.499 ms
(22 rows)

*Full query - INNER JOIN*

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id
AS pa_process_activity_id  FROM process_activity pa INNER JOIN
process_instance pi ON pi.process_instance_id = pa.process_instance_id AND
pa.app_id = '126502930200650' AND pa.created > '1970-01-01 00:00:00' AND
pi.user_id = '137074931866340' AND pi.app_id = '126502930200650' ORDER BY
pa.process_instance_id, pa.created limit 50;



  QUERY PLAN



--
---

Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Amarendra Konda
Hi David,

Thanks for the reply. This has optimized the number of rows.

Can you please explain why it is getting more columns in the output, even
though we have asked for only one column?


 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id
AS pa_process_activity_id  FROM process_activity pa WHERE pa.app_id =
'126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS (
SELECT 1 FROM process_instance pi where pi.app_id = pa.app_id  AND
pi.user_id = '137074931866340') ORDER BY pa.process_instance_id,m.created
limit 50;



   QUERY PLAN



--
--
-
 Limit  (cost=1.14..37.39 rows=50 width=24) (actual time=821.283..891.629
rows=50 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id, pa.created
   Buffers: shared hit=274950
   ->  Nested Loop Semi Join  (cost=1.14..20108.78 rows=367790473
width=24) (actual time=821.282..891.607 rows=50 loops=1)
 Output: pa.process_activity_id, pa.process_instance_id, pa.created
 Buffers: shared hit=274950
 ->  Index Scan using
process_activity_process_instance_id_app_id_created_idx on
public.process_activity pa  (cost=0.70..262062725.21 rows=367790473
width=32) (actual time=821.253..891.517 rows=50 loops=1)


* Output: pa.process_activity_id, pa.process_activity_type, pa.voice_url,
pa.process_activity_user_id, pa.app_id, pa.process_instance_id, pa.alias,
pa.read_by_user, pa.source, pa.label_category_id, pa.label_id,
pa.csat_response_id, m.process_activity_fragments, pa.created, pa.updated,
pa.rule_id, pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments,
pa.status_fragment, pa.internal_meta, pa.interaction_id,
pa.do_not_translate, pa.should_translate, pa.in_reply_to*
   Index Cond: ((m.app_id = '126502930200650'::bigint) AND
(m.created > '1970-01-01 00:00:00'::timestamp without time zone))
   Buffers: shared hit=274946
 ->  Materialize  (cost=0.43..2.66 rows=1 width=8) (actual
time=0.001..0.001 rows=1 loops=50)
   Output: pi.app_id
   Buffers: shared hit=4
   ->  Index Scan using fki_conv_konotor_user_user_id on
public.process_instance pi  (cost=0.43..2.66 rows=1 width=8) (actual
time=0.020..0.020 rows=1 loops=1)
 Output: pi.app_id
 Index Cond: (pi.user_id = '137074931866340'::bigint)
 Filter: (pi.app_id = '126502930200650'::bigint)
 Buffers: shared hit=4
 Planning time: 0.297 ms
 Execution time: 891.686 ms
(20 rows)

On Thu, May 7, 2020 at 9:17 PM David G. Johnston 
wrote:

> On Thu, May 7, 2020 at 7:40 AM Adrian Klaver 
> wrote:
>
>> On 5/7/20 4:19 AM, Amarendra Konda wrote:
>> > Hi,
>> >
>> > PostgreSQL version : PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled
>> > by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
>> >
>> > We have noticed huge difference interms of execution plan ( response
>> > time) , When we pass the direct values  Vs  inner query to IN clause.
>> >
>> > High level details of the use case are as follows
>> >
>> >   * As part of the SQL there are 2 tables named Process_instance
>> > (master) and Process_activity ( child)
>> >   * Wanted to fetch TOP 50 rows from  Process_activity table for the
>> > given values of the Process_instance.
>> >   * When we used Inner Join / Inner query ( query1)  between parent
>> > table and child table , LIMIT is not really taking in to account.
>> > Instead it is fetching more rows and columns that required, and
>> > finally limiting the result
>>
>> It is doing what you told it to do which is SELECT all
>> process_instance_i's for user_id='317079413683604' and app_id =
>> '427380312000560' and then filtering further. I am going to guess that
>> if you run the inner query alone you will find it returns ~23496 rows.
>> You might have better results if you an actual join between
>> process_activity and process_instance. Something like below(obviously
>> not tested):
>>
>
> What the OP seems to want is a semi-join:
>
> (not tested)
>
> SELECT pa.process_activity_id
> FROM process_activity pa WHERE pa.app_id = '427380312000560' AND
> pa.created > '1970-01-01 00:00:00'
> AND EXISTS (
>   SELECT 1 FROM process_instance pi WHERE pi.app_id = pa.app_id AND
> pi.user_id = '317079413683604'
> )
> ORDER BY
> pa.process_instance_id,
> pa.created limit 50;
>
> I'm unsure exactly how this will impact the plan choice but it should be
> an improvement, and in any case more correctly defines what it is you are
> looking for.
>
> David J.
>
>


Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Amarendra Konda
Hi David,

In my earlier reply I overlooked another condition, so please ignore that
one.

Here is the correct one with all the needed conditions. According to the
latest plan, EXISTS is also not limiting the rows from the process_activity
table.


EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id
AS pa_process_activity_id  FROM process_activity pa WHERE pa.app_id =
'126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS (
SELECT 1 FROM process_instance pi where pi.app_id = pa.app_id AND
*pi.process_instance_id
= pa.process_instance_id * AND pi.user_id = '137074931866340') ORDER BY
pa.process_instance_id,  pa.created limit 50;



  QUERY PLAN



--
--
---
 Limit  (cost=1079.44..1079.52 rows=32 width=24) (actual
time=85.747..85.777 rows=50 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id, pa.created
   Buffers: shared hit=43070
   ->  Sort  (cost=1079.44..1079.52 rows=32 width=24) (actual
time=85.745..85.759 rows=50 loops=1)
 Output: pa.process_activity_id, pa.process_instance_id, pa.created
 Sort Key: pa.process_instance_id, pa.created
 Sort Method: top-N heapsort  Memory: 28kB
 Buffers: shared hit=43070
 ->  Nested Loop  (cost=1.14..1078.64 rows=32 width=24) (actual
time=0.025..72.115 rows=47011 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id,
pa.created
   Buffers: shared hit=43070
   ->  Index Scan using fki_conv_konotor_user_user_id on
public.process_instance pi  (cost=0.43..2.66 rows=1 width=16) (actual
time=0.010..0.015 rows=2 loops=1)
 Output: pi.app_id, pi.process_instance_id
 Index Cond: (c.user_id = '137074931866340'::bigint)
 Filter: (c.app_id = '126502930200650'::bigint)
 Buffers: shared hit=5
   ->  Index Scan using
process_activity_process_instance_id_app_id_created_idx on
public.process_activity pa  (cost=0.70..1061.62 rows=1436 width=32) *(actual
time=0.011..20.320 rows=23506 loops=2)*
 Output: pa.process_activity_id,
pa.process_activity_type, pa.voice_url, pa.process_activity_user_id,
pa.app_id, pa.process_instance_id, pa.alias, pa.read_by_user, pa.source,
pa.label_category_id, pa.label_id, pa.csat_respons
e_id, pa.process_activity_fragments, pa.created, pa.updated, pa.rule_id,
pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments,
pa.status_fragment, pa.internal_meta, pa.interaction_id,
pa.do_not_translate, pa.should_tr
anslate, pa.in_reply_to
 Index Cond: ((m.process_instance_id =
pi.process_instance_id) AND (m.app_id = '126502930200650'::bigint) AND
(m.created > '1970-01-01 00:00:00'::timestamp without time zone))
 Buffers: shared hit=43065
 Planning time: 0.455 ms
 Execution time: 85.830 ms

On Thu, May 7, 2020 at 11:19 PM Amarendra Konda 
wrote:

> Hi David,
>
> Thanks for the reply.This has optimized number of rows.
>
> Can you please explain, why it is getting more columns in output, even
> though we have asked for only one column ?
>
>
>  EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id
> AS pa_process_activity_id  FROM process_activity pa WHERE pa.app_id =
> '126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS (
> SELECT 1 FROM process_instance pi where pi.app_id = pa.app_id  AND
> pi.user_id = '137074931866340') ORDER BY pa.process_instance_id,m.created
> limit 50;
>
>
>
>QUERY PLAN
>
>
>
>
> --
>
> --
> -
>  Limit  (cost=1.14..37.39 rows=50 width=24) (actual time=821.283..891.629
> rows=50 loops=1)
>Output: pa.process_activity_id, pa.process_instance_id, pa.created
>Buffers: shared hit=274950
>->  Nested Loop Semi Join  (cost=1.14..20108.78 rows=367790473
> width=24) (actual time=821.282..891.607 rows=50 loops=1)
>  Output: pa.process_activity_id, pa.process_instance_id, pa.created
>  Buffers: shared hit=274950
>  ->  Index Scan using
> process_activity_process_instance_id_app_id_created_idx on
> public.process_activity pa  (cost=0.70..262062725.21 rows=367790473
> width=32) (act

Re: increase of xact_commit vs txid_current

2020-05-07 Thread Julien Rouhaud
On Thu, May 7, 2020 at 7:01 PM  wrote:
>
> Hi,
>
> I am confused, the documentation says for pg_stat_database
> xact_commit  Number of transactions in this database that have
> been committed
> and somewhere else
>  txid_current()get current transaction ID, assigning a new one
> if the current transaction does not have one
>
> I would have naively expected txid_current() be more or less in line
> with xact_commit,  or  increasing faster as txid_current() should be
> global but xact_commit is per database.
>
> However xact_commit seems to increases faster than txid_current(),  what
> am I missing? Are there commits that do not increase the xid number?

Indeed, read-only transactions usually don't consume a transaction id,
to avoid the need to perform a FREEZE too often.  Note that calling
txid_current() will consume a transaction id, even in a read-only
query.
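
A quick way to see this on an otherwise idle database (a rough sketch; note
that the pg_stat_database counters are updated by the stats collector with a
small delay):

SELECT txid_current();   -- assigns, and therefore consumes, an xid
SELECT 1;                -- read-only: it commits, but consumes no xid
SELECT xact_commit FROM pg_stat_database WHERE datname = current_database();
-- xact_commit counts both statements above, while the xid only advanced for
-- the txid_current() call (and for any writes).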




AutoVacuum and growing transaction XID's

2020-05-07 Thread github kran
Hello Team,

We are using PostgreSQL version 9.6.12, and over the last 4 weeks our
transaction IDs (XIDs) have increased from 195 million to 341 million. I see
the below in pg_stat_activity on the PostgreSQL DB.

1) Looking at pg_stat_activity, I noticed that the vacuum query has been
running for anywhere from a few hours to 3-5 days whenever I check. Is this
a common process for PostgreSQL to run? I have only noticed it in
pg_stat_activity over the last few weeks. Also, the query shows the table
name with "(to prevent wraparound)" for each of the tables being vacuumed.
What does this mean?

2) Does it mean I need to run a manual vacuum for these tables, given that
the transaction IDs have increased from 195 million to 341 million?

What other things do I need to check in the database around this?

Thanks !!


Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread Michael Lewis
It is trying to do a vacuum freeze. Do you have autovacuum turned off? Any
settings changed from default related to autovacuum?

https://www.postgresql.org/docs/9.6/routine-vacuuming.html
Read 24.1.5. Preventing Transaction ID Wraparound Failures

These may also be of help-
https://info.crunchydata.com/blog/managing-transaction-id-wraparound-in-postgresql
https://www.2ndquadrant.com/en/blog/managing-freezing/

Note that you need to ensure the server gets caught up, or you risk being
locked out to prevent data corruption.
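
To see how close each database and its tables are to wraparound, something
like this can help (a generic check, not specific to your setup):

SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY 2 DESC;

-- and, inside the affected database, the tables driving that age:
SELECT relname, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY 2 DESC
LIMIT 10;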


Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Amarendra Konda
Hi Virendra,

Thanks for your time.

Here is the table and index structure

* process_activity*
Table "public.process_activity"
   Column   |Type | Modifiers

+-+
 process_activity_id | bigint  | not null
default next_id()
 process_activity_type   | smallint| not null
 voice_url  | text|
 process_activity_user_id| bigint  | not null
 app_id | bigint  | not null
 process_instance_id| bigint  | not null
 alias  | text| not null
 read_by_user   | smallint| default 0
 source | smallint| default 0
 label_category_id  | bigint  |
 label_id   | bigint  |
 csat_response_id   | bigint  |
 process_activity_fragments  | jsonb   |
 created| timestamp without time zone | not null
 updated| timestamp without time zone |
 rule_id| bigint  |
 marketing_reply_id | bigint  |
 delivered_at   | timestamp without time zone |
 reply_fragments| jsonb   |
 status_fragment| jsonb   |
 internal_meta  | jsonb   |
 interaction_id | text|
 do_not_translate   | boolean |
 should_translate   | integer |
 in_reply_to| jsonb   |
Indexes:
"process_activity_pkey" PRIMARY KEY, btree (process_activity_id)
"fki_process_activity_konotor_user_user_id" btree
(process_activity_user_id) WITH (fillfactor='70')
"*process_activity_process_instance_id_app_id_created_idx*" btree
(process_instance_id, app_id, created) WITH (fillfactor='70')
"process_activity_process_instance_id_app_id_read_by_user_created_idx"
btree (process_instance_id, app_id, read_by_user, created) WITH
(fillfactor='70')
"process_activity_process_instance_id_idx" btree (process_instance_id)
WITH (fillfactor='70')




*process_instance*
 Table "public.process_instance"
 Column  |Type |  Modifiers

-+-+-
 process_instance_id | bigint  | not null default
next_id()
 process_instance_alias  | text| not null
 app_id  | bigint  | not null
 user_id | bigint  | not null

Indexes:
"process_instance_pkey" PRIMARY KEY, btree (process_instance_id)
"*fki_conv_konotor_user_user_id*" btree (user_id) WITH (fillfactor='70')

Regards, Amarendra

On Fri, May 8, 2020 at 12:01 AM Virendra Kumar  wrote:

> Sending table structure with indexes might help little further in
> understanding.
>
> Regards,
> Virendra
>
> On Thursday, May 7, 2020, 11:08:14 AM PDT, Amarendra Konda <
> amar.vij...@gmail.com> wrote:
>
>
> Hi David,
>
> In earlier reply, Over looked another condition, hence please ignore that
> one
>
> Here is the correct one with all the needed conditions. According to the
> latest one, exists also not limiting rows from the process_activity table.
>
>
> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id
> AS pa_process_activity_id  FROM process_activity pa WHERE pa.app_id =
> '126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS (
> SELECT 1 FROM process_instance pi where pi.app_id = pa.app_id AND 
> *pi.process_instance_id
> = pa.process_instance_id * AND pi.user_id = '137074931866340') ORDER BY
> pa.process_instance_id,  pa.created limit 50;
>
>
>
>   QUERY PLAN
>
>
>
>
> --
>
> --
> ---
>  Limit  (cost=1079.44..1079.52 rows=32 width=24) (actual
> time=85.747..85.777 rows=50 loops=1)
>Output: pa.process_activity_id, pa.process_instance_id, pa.created
>Buffers: shared hit=43070
>->  Sort  (cost=1079.44..1079.52 rows=32 width=24) (actual
> time=85.745..85.759 rows=50 loops=1)
>  Output: pa.process_activity_id, pa.process_instance_id, pa.created
>  Sort Key: pa.process_instance_id, pa.created
>  Sort Method: top-N heapsort  Memory: 28kB
>  Buffers: shared hit=43070
>  ->  Nest

Odd postgres12 upgrade is changing or munging a password?

2020-05-07 Thread Tory M Blue
Going from 9.5 to 12, twice now I've had a password either go missing or get
munged, and I've had to add an ALTER statement at the end of the upgrade.

The DB is functioning fine; I shut it down, do the upgrade, and the password
is munged. It seems like an odd occurrence, and we have not noticed any
other weird issues.

Anyone else see or hear of this?

Thanks
Tory


Re: pg_dump negation regex

2020-05-07 Thread Eudald Valcàrcel Lacasa
I know about the -T option, but I'm trying to write the -T "pattern" as a
regex in order to exclude all tables named program_1 onwards.
I thought you'd commonly use -T "program_!0", but the ! negation keyword is
not defined.

Thanks,
Eudald

On Thu, 7 May 2020 at 19:36, Adrian Klaver ()
wrote:

> On 5/7/20 9:53 AM, Eudald Valcàrcel Lacasa wrote:
> > Hello!
> > I've been looking at the documentation and there seems to be no keyword
> > for negation purposes.
> >
> > Am I missing something? Or it's like this?
>
> https://www.postgresql.org/docs/12/app-pgdump.html
>
> -T pattern
> --exclude-table=pattern
>
>  Do not dump any tables matching pattern. The pattern is interpreted
> according to the same rules as for -t. -T can be given more than once to
> exclude tables matching any of several patterns.
>
>  When both -t and -T are given, the behavior is to dump just the
> tables that match at least one -t switch but no -T switches. If -T
> appears without -t, then tables matching -T are excluded from what is
> otherwise a normal dump.
>
>
> For what patterns you can use see:
>
> https://www.postgresql.org/docs/12/app-psql.html#APP-PSQL-PATTERNS
>
> >
> > I'm trying to use pg_dump to create a database skeleton and I have
> > tables like program_0, program_1, etc. And I'd like to dump all the
> > tables but client_1 onwards.
> >
> > Does anybody happen to know a way to accomplish it?
> >
> > Thank you!
> > Eudald
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Odd postgres12 upgrade is changing or munging a password?

2020-05-07 Thread Adrian Klaver

On 5/7/20 11:55 AM, Tory M Blue wrote:
Going from 9.5 to 12 and 2 times now, I've had a password either go 
missing or munged. I've had to add an alter statement at the end of the 
upgrade.


What are the commands you are using?

Is it the same password?



The DB is functioning fine, shut it down, do the upgrade and the 
password is munged. Seems like an odd occurrence, we have not noted any 
other weird issues.


Anyone else see or hear of this?

Thanks
Tory



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: pg_dump negation regex

2020-05-07 Thread Rob Sargent



> On May 7, 2020, at 1:03 PM, Eudald Valcàrcel Lacasa 
>  wrote:
> 
> I know the -T command, but I'm trying to regex the "pattern" in -T in order 
> to exclude all tables named program_1 onwards.
> I believe commonly you'd use -T "program_!0", but the ! negation keyword is 
> not defined.
> 
> Thanks,
> Eudald
> 

 More like ‘program[^0].*’ if there are no leading zeros (e.g. 09), or 
‘program[1-9][0-9]*’
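
As a sketch of how that could be combined with pg_dump for the skeleton (the
database name is a placeholder and the pattern is untested against your
actual table list):

pg_dump --schema-only -T 'program_[1-9]*' mydb > skeleton.sql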



Re: Odd postgres12 upgrade is changing or munging a password?

2020-05-07 Thread Tory M Blue
Yes, same password; I'm using a basic ALTER command to put the right
password back.

I'm doing another upgrade in an hour, and will do some more checks to see
if it's trying to use another password or what. I obviously can't read the
password from the file, so I'm not sure it's even possible to tell whether
it's munged or something else.
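
(Perhaps comparing the stored hash before and after the upgrade would show
it; as a superuser, something like the following, with a hypothetical role
name:

SELECT rolname, rolpassword FROM pg_authid WHERE rolname = 'myrole';

at least that would tell me whether the stored hash text itself changed.)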

The upgrade command I'm running:

time /usr/pgsql-12/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin/
--new-bindir /usr/pgsql-12/bin/ --old-datadir /pgsql/9.5/data --new-datadir
/pgsql/12/data --link

So it's very odd, and I've not experienced this in other environments; it's
just this one. It is a bigger data set, but still very odd.

I'm also not seeing any other data issues; it just seems to be this one
password.

Thanks,

If there are commands I can run on the data before I do an ALTER, to give
someone more info, let me know.

Tory

On Thu, May 7, 2020 at 12:08 PM Adrian Klaver 
wrote:

> On 5/7/20 11:55 AM, Tory M Blue wrote:
> > Going from 9.5 to 12 and 2 times now, I've had a password either go
> > missing or munged. I've had to add an alter statement at the end of the
> > upgrade.
>
> What are the commands you are using?
>
> Is it the same password?
>
> >
> > The DB is functioning fine, shut it down, do the upgrade and the
> > password is munged. Seems like an odd occurrence, we have not noted any
> > other weird issues.
> >
> > Anyone else see or hear of this?
> >
> > Thanks
> > Tory
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Adrian Klaver

On 5/7/20 10:49 AM, Amarendra Konda wrote:

Hi David,

Thanks for the reply.This has optimized number of rows.


Yeah, but your execution time has increased by an order of magnitude. Not 
sure if that is what you want.




Can you please explain why it is returning more columns in the output, even 
though we have asked for only one column?



  EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS) SELECT pa.process_activity_id AS pa_process_activity_id
  FROM process_activity pa
  WHERE pa.app_id = '126502930200650' AND pa.created > '1970-01-01 00:00:00'
    AND EXISTS (SELECT 1 FROM process_instance pi WHERE pi.app_id = pa.app_id AND pi.user_id = '137074931866340')
  ORDER BY pa.process_instance_id, pa.created LIMIT 50;

                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.14..37.39 rows=50 width=24) (actual time=821.283..891.629 rows=50 loops=1)
   Output: pa.process_activity_id, pa.process_instance_id, pa.created
   Buffers: shared hit=274950
   ->  Nested Loop Semi Join  (cost=1.14..20108.78 rows=367790473 width=24) (actual time=821.282..891.607 rows=50 loops=1)
         Output: pa.process_activity_id, pa.process_instance_id, pa.created
         Buffers: shared hit=274950
         ->  Index Scan using process_activity_process_instance_id_app_id_created_idx on public.process_activity pa  (cost=0.70..262062725.21 rows=367790473 width=32) (actual time=821.253..891.517 rows=50 loops=1)
               *Output: pa.process_activity_id, pa.process_activity_type, pa.voice_url, pa.process_activity_user_id, pa.app_id, pa.process_instance_id, pa.alias, pa.read_by_user, pa.source, pa.label_category_id, pa.label_id, pa.csat_response_id, pa.process_activity_fragments, pa.created, pa.updated, pa.rule_id, pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments, pa.status_fragment, pa.internal_meta, pa.interaction_id, pa.do_not_translate, pa.should_translate, pa.in_reply_to*
               Index Cond: ((pa.app_id = '126502930200650'::bigint) AND (pa.created > '1970-01-01 00:00:00'::timestamp without time zone))
               Buffers: shared hit=274946
         ->  Materialize  (cost=0.43..2.66 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=50)
               Output: pi.app_id
               Buffers: shared hit=4
               ->  Index Scan using fki_conv_konotor_user_user_id on public.process_instance pi  (cost=0.43..2.66 rows=1 width=8) (actual time=0.020..0.020 rows=1 loops=1)
                     Output: pi.app_id
                     Index Cond: (pi.user_id = '137074931866340'::bigint)
                     Filter: (pi.app_id = '126502930200650'::bigint)
                     Buffers: shared hit=4
 Planning time: 0.297 ms
 Execution time: 891.686 ms
(20 rows)

On Thu, May 7, 2020 at 9:17 PM David G. Johnston wrote:


On Thu, May 7, 2020 at 7:40 AM Adrian Klaver wrote:

On 5/7/20 4:19 AM, Amarendra Konda wrote:
 > Hi,
 >
 > PostgreSQL version : PostgreSQL 9.6.2 on x86_64-pc-linux-gnu,
compiled
 > by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit
 >
 > We have noticed huge difference interms of execution plan (
response
 > time) , When we pass the direct values  Vs  inner query to IN
clause.
 >
 > High level details of the use case are as follows
 >
 >   * As part of the SQL there are 2 tables named Process_instance
 >     (master) and Process_activity ( child)
 >   * Wanted to fetch TOP 50 rows from  Process_activity table
for the
 >     given values of the Process_instance.
 >   * When we used Inner Join / Inner query ( query1)  between
parent
 >     table and child table , LIMIT is not really taking in to
account.
 >     Instead it is fetching more rows and columns that
required, and
 >     finally limiting the result

It is doing what you told it to do which is SELECT all
process_instance_i's for user_id='317079413683604' and app_id =
'427380312000560' and then filtering further. I am going to
guess that
if you run the inner query alone you will find it returns ~23496
rows.
You might have better results if you an actual join between
process_activity and process_instance. Something like
below(obviously
not tested):


What the OP seems to want is a semi-join:

(not tested)

SELECT pa.process_activity_id
FROM process_ac

Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Virendra Kumar
Here is my thought on why rows are not limited when joined vs. why they are
limited when not joined.
When not joined and the WHERE clause uses IN, it is using the index
process_activity_process_instance_id_app_id_created_idx, which has the columns
process_instance_id and created; these match the ORDER BY, so no additional
ordering is required and a direct row limit can be applied here.

When joined, it has to fetch rows according to the filter clause, join them,
and then order them (the Sort node in the plan); hence it cannot limit rows
while first fetching them from the table.
You are also missing pi.user_id = '317079413683604' in the EXISTS clause. It is
worth putting it there, running EXPLAIN again, and seeing where that takes you.
But to your point, row limitation cannot happen in the case of a join as written
in the query.

Regards,
Virendra 

On Thursday, May 7, 2020, 11:52:00 AM PDT, Amarendra Konda 
 wrote:  
 
 Hi Virendra,
Thanks for your time. 
Here is the table and index structure
 process_activity
                            Table "public.process_activity"
       Column       |            Type             |         Modifiers          
+-+
 process_activity_id         | bigint                      | not null default 
next_id()
 process_activity_type       | smallint                    | not null
 voice_url          | text                        | 
 process_activity_user_id    | bigint                      | not null
 app_id             | bigint                      | not null
 process_instance_id    | bigint                      | not null
 alias              | text                        | not null
 read_by_user       | smallint                    | default 0
 source             | smallint                    | default 0
 label_category_id  | bigint                      | 
 label_id           | bigint                      | 
 csat_response_id   | bigint                      | 
 process_activity_fragments  | jsonb                       | 
 created            | timestamp without time zone | not null
 updated            | timestamp without time zone | 
 rule_id            | bigint                      | 
 marketing_reply_id | bigint                      | 
 delivered_at       | timestamp without time zone | 
 reply_fragments    | jsonb                       | 
 status_fragment    | jsonb                       | 
 internal_meta      | jsonb                       | 
 interaction_id     | text                        | 
 do_not_translate   | boolean                     | 
 should_translate   | integer                     | 
 in_reply_to        | jsonb                       | 
Indexes:
    "process_activity_pkey" PRIMARY KEY, btree (process_activity_id)
    "fki_process_activity_konotor_user_user_id" btree 
(process_activity_user_id) WITH (fillfactor='70')
    "process_activity_process_instance_id_app_id_created_idx" btree 
(process_instance_id, app_id, created) WITH (fillfactor='70')
    "process_activity_process_instance_id_app_id_read_by_user_created_idx" 
btree (process_instance_id, app_id, read_by_user, created) WITH 
(fillfactor='70')
    "process_activity_process_instance_id_idx" btree (process_instance_id) WITH 
(fillfactor='70')
 



process_instance
                             Table "public.process_instance"
         Column          |            Type             |          Modifiers     
     
-+-+-
 process_instance_id     | bigint                      | not null default 
next_id()
 process_instance_alias  | text                        | not null
 app_id                  | bigint                      | not null
 user_id                 | bigint                      | not null
 
Indexes:
    "process_instance_pkey" PRIMARY KEY, btree (process_instance_id)
    "fki_conv_konotor_user_user_id" btree (user_id) WITH (fillfactor='70')

Regards, Amarendra
On Fri, May 8, 2020 at 12:01 AM Virendra Kumar  wrote:

Sending table structure with indexes might help little further in understanding.

Regards,
Virendra
On Thursday, May 7, 2020, 11:08:14 AM PDT, Amarendra Konda 
 wrote:  
 
 Hi David,
In my earlier reply I overlooked another condition, so please ignore that one.
Here is the correct query with all the needed conditions. According to the latest 
plan, EXISTS is also not limiting rows from the process_activity table.

EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id AS 
pa_process_activity_id  FROM process_activity pa WHERE pa.app_id = 
'126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS ( SELECT 1 
FROM process_instance pi where pi.app_id = pa.app_id AND pi.process_instance_id 
= pa.process_instance_id  AND pi.user_id = '137074931866340') ORDER BY 
pa.process_instance_id,  pa.created limit 50;

Re: Odd postgres12 upgrade is changing or munging a password?

2020-05-07 Thread Adrian Klaver

On 5/7/20 12:24 PM, Tory M Blue wrote:
Yes same password, I'm using a basic alter command to put the right 
password back.


I'm doing another upgrade in an hour, and will do some more checks to 
see if it's trying to use another password or what. I obviously can't 
read the password from the file , so knowing if it's munged or other, 
I'm not sure is possible.


Upgrade command i'm running

time /usr/pgsql-12/bin/pg_upgrade --old-bindir /usr/pgsql-9.5/bin/ 
--new-bindir /usr/pgsql-12/bin/ --old-datadir /pgsql/9.5/data 
--new-datadir /pgsql/12/data --link


So it's very odd. and I've not experienced this in other environments, 
it's just this one. Now it's a bigger data set, but very odd.


Anything different about this environment e.g. locale?

What is the encoding/character set for the database?



I'm also not seeing any other data issues, just seems to be this one 
password.


I'm assuming you have super user access so you could look at the 
password in:


https://www.postgresql.org/docs/12/view-pg-shadow.html

on the old server and then on the new server.
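
For example (the role name is a placeholder), comparing

SELECT usename, passwd FROM pg_shadow WHERE usename = 'the_role';

on both sides would show whether the stored hash actually changed, and
whether it is an md5 or SCRAM verifier.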




Thanks,

If there are commands I can run on the data before I do an alter, to 
give someone more info, let me know


Tory

On Thu, May 7, 2020 at 12:08 PM Adrian Klaver wrote:


On 5/7/20 11:55 AM, Tory M Blue wrote:
 > Going from 9.5 to 12 and 2 times now, I've had a password either go
 > missing or munged. I've had to add an alter statement at the end
of the
 > upgrade.

What are the commands you are using?

Is it the same password?

 >
 > The DB is functioning fine, shut it down, do the upgrade and the
 > password is munged. Seems like an odd occurrence, we have not
noted any
 > other weird issues.
 >
 > Anyone else see or hear of this?
 >
 > Thanks
 > Tory


-- 
Adrian Klaver

adrian.kla...@aklaver.com 




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: pg_dump negation regex

2020-05-07 Thread Eudald Valcàrcel Lacasa
You're right Rob.

Thanks, it worked!
Eudald

El jue., 7 may. 2020 a las 21:16, Rob Sargent ()
escribió:

>
>
> > On May 7, 2020, at 1:03 PM, Eudald Valcàrcel Lacasa <
> eudald.valcar...@gmail.com> wrote:
> >
> > I know the -T command, but I'm trying to regex the "pattern" in -T in
> order to exclude all tables named program_1 onwards.
> > I believe commonly you'd use -T "program_!0", but the ! negation keyword
> is not defined.
> >
> > Thanks,
> > Eudald
> >
>
>  More like ‘program[^0].*’ of there are no leading zero (e.g.09) or
> ‘program[1-9][0-9]*'


Memory footprint diff between 9.5 and 12

2020-05-07 Thread Tory M Blue
I hadn't noticed this until today, but a running 9.5 system with buffers at
10GB starts and has been running years without issues. (15GB available)

Postgres 12 will not start with that configuration, complaining about
memory availability. So Postgres 12 won't start until shared_buffers is
lowered to 6GB, but even with that, queries on my DB servers started
complaining about being unable to allocate memory ("unable to allocate").

So dropping it to 4GB (on a 15GB system) may help, but did I miss a huge
note about significant memory changes between 9.5 and 12?

Is there something else I'm missing that is important on busy systems,
something introduced in 10 or 11? Again, I'm not seeing anything noted for
12.
Thanks
Tory


Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread github kran
On Thu, May 7, 2020 at 1:33 PM Michael Lewis  wrote:

> It is trying to do a vacuum freeze. Do you have autovacuum turned off? Any
> settings changed from default related to autovacuum?
>
> https://www.postgresql.org/docs/9.6/routine-vacuuming.html
> Read 24.1.5. Preventing Transaction ID Wraparound Failures
>
> These may also be of help-
>
> https://info.crunchydata.com/blog/managing-transaction-id-wraparound-in-postgresql
> https://www.2ndquadrant.com/en/blog/managing-freezing/
>
> Note that you need to ensure the server gets caught up, or you risk being
> locked out to prevent data corruption.
>

  Thanks Mike.
1) We haven't changed anything related to autovacuum except the work_mem
parameter, which was increased to 4 GB; I believe that is not related to
autovacuum.
2) Autovacuum was not turned off, and the few vacuum parameters we have set
are *autovacuum_analyze_scale_factor = 0.02* and
*autovacuum_vacuum_scale_factor = 0.05*.
3) The database we are currently running is about 2 years old, we have
close to 40 partitions, and the datfrozenxid on the table is 343 million
whereas the default is 200 million. I would try doing a manual vacuum on
those tables where autovacuum_freeze_max_age > 200 million. Do you think
that's the right thing to do?

I will also go through these documents.

Thanks


Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread David G. Johnston
On Thu, May 7, 2020 at 10:49 AM Amarendra Konda 
wrote:

> Can you please explain, why it is getting more columns in output, even
> though we have asked for only one column ?
>
>
>
> * Output: pa.process_activity_id, pa.process_activity_type, pa.voice_url,
> pa.process_activity_user_id, pa.app_id, pa.process_instance_id, pa.alias,
> pa.read_by_user, pa.source, pa.label_category_id, pa.label_id,
> pa.csat_response_id, m.process_activity_fragments, pa.created, pa.updated,
> pa.rule_id, pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments,
> pa.status_fragment, pa.internal_meta, pa.interaction_id,
> pa.do_not_translate, pa.should_translate, pa.in_reply_to*
>
Not knowing the source code in this area at all...

I'm pretty sure it's because it doesn't matter.  The executor retrieves data
"pages", 8k blocks containing multiple records, then extracts specific full
tuples from there.  At that point it's probably just data pointers being
passed around.  It's not until the end, when the planner/executor has to
decide which subset of columns to return to the user, or when a new tuple
structure has to be created anyway (say, because of joining), that it takes
the effort of constructing a minimally necessary output column set.

David J.


Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread Tom Lane
"David G. Johnston"  writes:
> On Thu, May 7, 2020 at 10:49 AM Amarendra Konda 
> wrote:
>> Can you please explain, why it is getting more columns in output, even
>> though we have asked for only one column ?
>> * Output: pa.process_activity_id, pa.process_activity_type, pa.voice_url,
>> pa.process_activity_user_id, pa.app_id, pa.process_instance_id, pa.alias,
>> pa.read_by_user, pa.source, pa.label_category_id, pa.label_id,
>> pa.csat_response_id, m.process_activity_fragments, pa.created, pa.updated,
>> pa.rule_id, pa.marketing_reply_id, pa.delivered_at, pa.reply_fragments,
>> pa.status_fragment, pa.internal_meta, pa.interaction_id,
>> pa.do_not_translate, pa.should_translate, pa.in_reply_to*

> Not knowing the source code in this area at all...

> I'm pretty sure its because it doesn't matter.

It's actually intentional, to save a projection step within that plan
node.  We'll discard the extra columns once it matters, at some higher
plan level.

(There have been some debates on -hackers about whether this optimization
is still worth anything, given all the executor improvements that have
been made since it went in.  But it was clearly a win at the time.)

regards, tom lane




Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread David G. Johnston
On Thu, May 7, 2020 at 11:07 AM Amarendra Konda 
wrote:

> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id
> AS pa_process_activity_id  FROM process_activity pa WHERE pa.app_id =
> '126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS (
> SELECT 1 FROM process_instance pi where pi.app_id = pa.app_id AND 
> *pi.process_instance_id
> = pa.process_instance_id * AND pi.user_id = '137074931866340') ORDER BY
> pa.process_instance_id,  pa.created limit 50;
>
>
>
>
>->  Index Scan using
> process_activity_process_instance_id_app_id_created_idx on
> public.process_activity pa  (cost=0.70..1061.62 rows=1436 width=32) *(actual
> time=0.011..20.320 rows=23506 loops=2)*
>
> Index Cond: ((m.process_instance_id = pi.process_instance_id) AND
(m.app_id = '126502930200650'::bigint) AND (m.created > '1970-01-01
00:00:00'::timestamp without time zone))

I suppose during the nested loop the inner index scan could limit itself to
the first 50 entries it finds (since the first two index columns are being
held constant on each scan, m.created should define the traversal order...)
so that the output of the nested loop ends up being (max 2 x 50) 100
entries which are then sorted and only the top 50 returned.

Whether the executor could but isn't doing that here or isn't programmed to
do that (or my logic is totally off) I do not know.

David J.


Re: Memory footprint diff between 9.5 and 12

2020-05-07 Thread Alan Hodgson
On Thu, 2020-05-07 at 13:33 -0700, Tory M Blue wrote:
> I hadn't noticed this until today, but a running 9.5 system with
> buffers at 10GB starts and has been running years without issues.
> (15GB available)
> Postgres 12 will not start with that configuration, complaining about
> memory availability.  So Postgres12 won't start until shared buffers
> is 6GB, but even with that, my DB servers , postgres queries started
> complaining about being unable to allocate memory "unable to
> allocate".
> 
> So dropping them to 4GB (on a 15GB system), may help, but did I miss a
> huge note about significant memory changes between 9.5 to 12?
> 
> Is there something else I'm missing that on busy systems is important,
> something introduced in 10 or 11 as again I'm not seeing anything
> noted in 12.
> Thanks

Is this running on an otherwise identical system? Or do you have a
different kernel, overcommit settings, or swap configuration?
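
For comparison, a few quick things to capture on both machines might be
(assuming a typical Linux box):

uname -r
sysctl vm.overcommit_memory vm.overcommit_ratio
free -h
cat /proc/swaps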


Re: Explain plan changes - IN CLAUSE ( Passing direct values Vs INNER Query )

2020-05-07 Thread David Rowley
On Fri, 8 May 2020 at 10:00, David G. Johnston
 wrote:
>
> On Thu, May 7, 2020 at 11:07 AM Amarendra Konda  wrote:
>>
>> EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)  SELECT pa.process_activity_id AS 
>> pa_process_activity_id  FROM process_activity pa WHERE pa.app_id = 
>> '126502930200650' AND pa.created > '1970-01-01 00:00:00'  AND EXISTS ( 
>> SELECT 1 FROM process_instance pi where pi.app_id = pa.app_id AND 
>> pi.process_instance_id = pa.process_instance_id  AND pi.user_id = 
>> '137074931866340') ORDER BY pa.process_instance_id,  pa.created limit 50;
>>
>>
>>->  Index Scan using 
>> process_activity_process_instance_id_app_id_created_idx on 
>> public.process_activity pa  (cost=0.70..1061.62 rows=1436 width=32) (actual 
>> time=0.011..20.320 rows=23506 loops=2)
>
> > Index Cond: ((m.process_instance_id = pi.process_instance_id) AND (m.app_id 
> > = '126502930200650'::bigint) AND (m.created > '1970-01-01 
> > 00:00:00'::timestamp without time zone))
>
> I suppose during the nested loop the inner index scan could limit itself to 
> the first 50 entries it finds (since the first two index columns are being 
> held constant on each scan, m.created should define the traversal order...) 
> so that the output of the nested loop ends up being (max 2 x 50) 100 entries 
> which are then sorted and only the top 50 returned.
>
> Whether the executor could but isn't doing that here or isn't programmed to 
> do that (or my logic is totally off) I do not know.

I think the planner is likely not putting the process_activity table
on the outer side of the nested loop join due to the poor row
estimates.  If it knew that so many rows would match the join then it
likely would have done that to save from having to perform the sort at
all.  However, because the planner has put the process_instance on the
outer side of the nested loop join, it's the pathkeys from that path
that the nested loop node has, which is not the same as what the ORDER
BY needs, so the planner must add a sort step, which means that all
rows from the nested loop plan must be read so that they can be
sorted.

It might be worth trying: create index on process_instance
(user_id,app_id); as that might lower the cost of performing the join
in the opposite order and have the planner prefer that order instead.
If doing that, the OP could then ditch the
fki_conv_konotor_user_user_id index to save space.

If that's not enough to convince the planner that the opposite order
is better then certainly SET enable_sort TO off; would.
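
Spelled out, that might look like the following (not tested; the index name
is arbitrary, and CONCURRENTLY is only an assumption to avoid blocking
writes):

CREATE INDEX CONCURRENTLY process_instance_user_id_app_id_idx
    ON process_instance (user_id, app_id) WITH (fillfactor='70');
-- if the planner then prefers the new order, the old single-column index can go:
DROP INDEX CONCURRENTLY fki_conv_konotor_user_user_id;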

David




Re: walreceiver termination

2020-05-07 Thread Kyotaro Horiguchi
Hello.

At Mon, 4 May 2020 09:09:15 -0500, Justin King  wrote in 
> Would there be anyone that might be able to help troubleshoot this
> issue -- or at least give me something that would be helpful to look
> for?
> 
> https://www.postgresql.org/message-id/flat/CAGH8ccdWLLGC7qag5pDUFbh96LbyzN_toORh2eY32-2P1%3Dtifg%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/CANQ55Tsoa6%3Dvk2YkeVUN7qO-2YdqJf_AMVQxqsVTYJm0qqQQuw%40mail.gmail.com
> https://dba.stackexchange.com/questions/116569/postgresql-docker-incorrect-resource-manager-data-checksum-in-record-at-46f-6
> 
> I'm not the first one to report something similar and all the
> complaints have a different filesystem in common -- particularly ZFS
> (or btrfs, in the bottom case).  Is there anything more we can do here
> to help narrow down this issue?  I'm happy to help, but I honestly
> wouldn't even know where to begin.

The sendto() call at the end of your strace output is the "close
connection" request to the wal sender and normally should be followed by
close() and kill().  If it is really the last strace output, the
sendto() is blocked because the send buffer is full.

My diagnosis of the situation is that your replication connection had
trouble and the TCP session broke in a way the wal receiver could not
detect.  As a result, feedback message packets from the wal receiver were
held up in the TCP send buffer, and finally the last sendto() blocked
while sending the close-connection message.

If it happens constantly, routers or firewalls between the primary and
standby may be discarding sessions inadvertently.
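
If that is the case, enabling TCP keepalives on the replication connection
might help the standby notice the breakage sooner (a sketch; the values are
only illustrative, and primary_conninfo lives in recovery.conf before v12,
postgresql.conf from v12 on):

primary_conninfo = 'host=primary-host user=replicator keepalives_idle=60 keepalives_interval=10 keepalives_count=3'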

I'm not sure how ZFS can be involved in this trouble, though.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread github kran
On Thu, May 7, 2020 at 4:18 PM github kran  wrote:

>
>
> On Thu, May 7, 2020 at 1:33 PM Michael Lewis  wrote:
>
>> It is trying to do a vacuum freeze. Do you have autovacuum turned off?
>> Any settings changed from default related to autovacuum?
>>
>> https://www.postgresql.org/docs/9.6/routine-vacuuming.html
>> Read 24.1.5. Preventing Transaction ID Wraparound Failures
>>
>> These may also be of help-
>>
>> https://info.crunchydata.com/blog/managing-transaction-id-wraparound-in-postgresql
>> https://www.2ndquadrant.com/en/blog/managing-freezing/
>>
>> Note that you need to ensure the server gets caught up, or you risk being
>> locked out to prevent data corruption.
>>
>
>   Thanks Mike.
> 1)  We haven't changed anything related to autovacuum except a work_mem
> parameter which was increased to 4 GB which I believe is not related to
> autovacuum
> 2)  The vacuum was not turned off and few parameters we had on vacuum are
>  *autovacuum_analyze_scale_factor = 0.02* and 
> *autovacuum_vacuum_scale_factor
> = 0.05*
> *3) *The database curently we are running is 2 years old for now and we
> have around close to 40 partitions and the *datfrozenxid on the table is
> 343 million whereas the default is 200 million*.  I would try doing a
> manual auto vacuum on those tables
> where the *autovacuum_freeze_max_age > 200 million*. Do you think It's a
> right thing to do ?.
>
> I will also go through this documents.
>

*Few more things, 5/7 - 8:40 PM CDT*
   1)  I see there are *8 vacuum workers* (not sure what changed) running
in the background, and my concern is that all of these vacuum processes
are running with wraparound, and while they are running I can't DROP or
ALTER any other tables (e.g. remove inheritance for old tables that no
longer receive writes). *None of the ALTER TABLE or DROP TABLE DDLs get
executed even after waiting several minutes, so I terminated those queries
since I didn't have any luck.*
   2)  *The VACUUM wraparound process has been running for the last day and
several hours on other tables.*
   3)  *Can I increase autovacuum_freeze_max_age on the tables on the
production system*?

>
> Thanks
>



>
>


Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread David Rowley
On Fri, 8 May 2020 at 09:18, github kran  wrote:
> 1)  We haven't changed anything related to autovacuum except a work_mem 
> parameter which was increased to 4 GB which I believe is not related to 
> autovacuum

You might want to look into increasing vacuum_cost_limit to something
well above 200, or dropping autovacuum_vacuum_cost_delay down from 20
to something much lower. However, you say you've not changed the
autovacuum settings, but you've also said:

>1)  I see there are 8 Vacuum workers ( Not sure what changed) running in 
> the background and the concern I have is all of these vacuum processes are 
> running with wrap around and while they are running

The default is 3, so if you have 8 then the settings are non-standard.

It might be good to supply the output of:

SELECT name,setting from pg_Settings where name like '%vacuum%';

You should know that the default speed that autovacuum runs at is
quite slow in 9.6. If you end up with all your autovacuum workers tied
up with anti-wraparound vacuums then other tables are likely to get
neglected and that could lead to stale stats or bloated tables. Best
to aim to get auto-vacuum running faster or aim to perform some manual
vacuums of tables that are over their max freeze age during an
off-peak period to make use of the lower load during those times.
Start with tables in pg_class with the largest age(relfrozenxid).
You'll still likely want to look at the speed autovacuum runs at
either way.
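
For example, something along these lines (untested) lists candidates
ordered by age:

SELECT oid::regclass AS table_name, age(relfrozenxid) AS xid_age
FROM pg_class
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 20;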

Please be aware that the first time a new cluster crosses the
autovacuum_freeze_max_age threshold can be a bit of a pain point as it
can mean that many tables require auto-vacuum activity all at once.
The impact of this is compounded if you have many tables that never
receive an UPDATE/DELETE as auto-vacuum, in 9.6, does not visit those
tables for any other reason. After the first time, the relfrozenxids
of tables tend to be more staggered so their vacuum freeze
requirements are also more staggered and that tends to cause fewer
problems.

David




Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread David Rowley
On Fri, 8 May 2020 at 13:51, github kran  wrote:
>   I can't either DROP or ALTER any other tables ( REMOVE Inheritance for 
> any of old tables where the WRITES are not getting written to). Any of the 
> ALTER TABLE OR DROP TABLE  DDL's arer not getting exeucted even I WAITED FOR 
> SEVERAL MINUTES , so I have terminated those queries as I didn't have luck.

The auto-vacuum freeze holds an SharedUpdateExclusiveLock on the table
being vacuumed. If you try any DDL that requires an
AccessExclusiveLock, it'll have to wait until the vacuum has
completed. If you leave the DDL running then all accesses to the table
will be queued behind the ungranted AccessExclusiveLock.  It's likely
a good idea to always run DDL with a fairly short lock_timeout, just
in case this happens.
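
For example (a sketch; the timeout value and table names are placeholders):

SET lock_timeout = '5s';
ALTER TABLE some_old_partition NO INHERIT some_parent_table;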

>3)  Can I increase the  autovacuum_freeze_max_age on the tables on 
> production system ?

Yes, but you cannot increase the per-table setting above the global
setting. Changing the global setting requires a restart.
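
For example (a sketch; the numbers are only illustrative):

-- per table; only values below the global setting have any effect:
ALTER TABLE some_partition SET (autovacuum_freeze_max_age = 150000000);
-- globally, followed by a server restart:
ALTER SYSTEM SET autovacuum_freeze_max_age = 400000000;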

David




Question correct Way to switch Slave to master

2020-05-07 Thread Hauke Homburg

Hello,

I am currently working on a master/slave setup with PostgreSQL 11 on Red Hat 8. 
I am building the servers with Ansible.


When I need to restore the master server, I want to build it as a slave, 
connect it to the current master to sync the data, then rebuild the 
Postgres server itself with Ansible and start the server with the synced 
slave data.


I don't know if I can do this without data loss.

I have read https://www.postgresql.org/docs/11/standby-settings.html, so 
I think the only difference between master and slave is the recovery.conf; 
Postgres makes no other changes in the database itself. Is this correct?
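
For reference, re-seeding the restored machine as a slave would look
roughly like this (a sketch; host name, replication user and data
directory are placeholders):

pg_basebackup -h current-master -U replicator -D /var/lib/pgsql/11/data -R -X stream -P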


Many thanks

Hauke

--
www.compi-creative.net


www.libre-chat.net





Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread github kran
Thanks David for your replies.

On Thu, May 7, 2020 at 11:01 PM David Rowley  wrote:

> On Fri, 8 May 2020 at 09:18, github kran  wrote:
> > 1)  We haven't changed anything related to autovacuum except a work_mem
> parameter which was increased to 4 GB which I believe is not related to
> autovacuum
>
> It might want to look into increasing vacuum_cost_limit to something
> well above 200 or dropping autovacuum_vacuum_cost_delay down from 20
> to something much lower. However, you say you've not changed the
> autovacuum settings, but you've also said:
>
> >1)  I see there are 8 Vacuum workers ( Not sure what changed) running
> in the background and the concern I have is all of these vacuum processes
> are running with wrap around and while they are running
>

   - Yes, I said it was originally 3, but I noticed the work_mem parameter
   was changed a few weeks back to 4 GB, and from that day onwards there has
   been an increasing trend in MaxUsedTransactionIds, from 200 million to 347
   million (it's been growing day by day for the last 2-3 weeks).
   - Do you think there could be a formula for how the number of workers
   could have increased based on this increase in work_mem, controlled by the
   database?


> The default is 3, so if you have 8 then the settings are non-standard.
>
> It might be good to supply the output of:
>
> SELECT name,setting from pg_Settings where name like '%vacuum%';
>
   Output of the vacuum settings:

                name                 |  setting  | min_val |  max_val   | boot_val  | reset_val
-------------------------------------+-----------+---------+------------+-----------+-----------
 autovacuum                          | on        |         |            | on        | on
 autovacuum_analyze_scale_factor     | 0.02      | 0       | 100        | 0.1       | 0.02
 autovacuum_analyze_threshold        | 50        | 0       | 2147483647 | 50        | 50
 autovacuum_freeze_max_age           | 200000000 | 100000  | 2000000000 | 200000000 | 200000000
 autovacuum_max_workers              | 8         | 1       | 262143     | 3         | 8
 autovacuum_multixact_freeze_max_age | 400000000 | 10000   | 2000000000 | 400000000 | 400000000
 autovacuum_naptime                  | 5         | 1       | 2147483    | 60        | 5
 autovacuum_vacuum_cost_delay        | 5         | -1      | 100        | 20        | 5
 autovacuum_vacuum_cost_limit        | -1        | -1      | 10000      | -1        | -1
 autovacuum_vacuum_scale_factor      | 0.05      | 0       | 100        | 0.2       | 0.05
 autovacuum_vacuum_threshold         | 50        | 0       | 2147483647 | 50        | 50
 autovacuum_work_mem                 | -1        | -1      | 2147483647 | -1        | -1


>
> You should know that the default speed that autovacuum runs at is
> quite slow in 9.6. If you end up with all your autovacuum workers tied
> up with anti-wraparound vacuums then other tables are likely to get
> neglected and that could lead to stale stats or bloated tables. Best
> to aim to get auto-vacuum running faster or aim to perform some manual
> vacuums of tables that are over their max freeze age during an
> off-peak period to make use of the lower load during those times.
> Start with tables in pg_class with the largest age(relfrozenxid).
> You'll still likely want to look at the speed autovacuum runs at
> either way.
>
> Please be aware that the first time a new cluster crosses the
> autovacuum_freeze_max_age threshold can be a bit of a pain point as it
> can mean that many tables require auto-vacuum activity all at once.
> The impact of this is compounded if you have many tables that never
> receive an UPDATE/DELETE as auto-vacuum, in 9.6, does not visit those
> tables for any other reason. After the first time, the relfrozenxids
> of tables tend to be more staggered so their vacuum freeze
> requirements are also more staggered and that tends to cause fewer
> problems.
>

  The current situation is that autovacuum kicked in on 8 tables, each tied
to a worker, and it's running very slowly in 9.6 as you mentioned.
   I observed that VACUUM on those 8 tables has been running for the last
15 hours, while other processes have been running for 1+ hour and others
for a few minutes on different tables.

   Finally, I'll wait for your reply to see what can be done about this
VACUUM and the growing TXID values.

   - Do you think I should consider changing work_mem back to the 4 MB it
   was originally?
   - Can I apply your recommendations on a production instance directly, or
   would you prefer I apply them in another environment first before
   applying them on prod?
   - Also, as I said, I want to clean up a few unused tables or vacuum them
   manually, but the current system doesn't allow me to do that considering
   these factors.
   - I will try to run VACUUM manually during off-peak hours. Can I stop the
   manual VACUUM process if it takes more than 10 minutes, or how long can I
   let it run?

David
>


Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread github kran
On Thu, May 7, 2020 at 11:04 PM David Rowley  wrote:

> On Fri, 8 May 2020 at 13:51, github kran  wrote:
> >   I can't either DROP or ALTER any other tables ( REMOVE Inheritance
> for any of old tables where the WRITES are not getting written to). Any of
> the ALTER TABLE OR DROP TABLE  DDL's arer not getting exeucted even I
> WAITED FOR SEVERAL MINUTES , so I have terminated those queries as I didn't
> have luck.
>
> The auto-vacuum freeze holds an SharedUpdateExclusiveLock on the table
> being vacuumed. If you try any DDL that requires an
> AccessExclusiveLock, it'll have to wait until the vacuum has
> completed. If you leave the DDL running then all accesses to the table
> will be queued behind the ungranted AccessExclusiveLock.  It's likely
> a good idea to always run DDL with a fairly short lock_timeout, just
> in case this happens.
>
*How large a value can I assign to lock_timeout so that I can safely test my
DDL commands without impacting other sessions?*

>
> >3)  Can I increase the  autovacuum_freeze_max_age on the tables on
> production system ?
>


>
> Yes, but you cannot increase the per-table setting above the global
> setting. Changing the global setting requires a restart.
>
How can I change the global setting of autovacuum_freeze_max_age?


> David
>


Re: pg_temp schema created while using DB Link

2020-05-07 Thread Laurenz Albe
On Fri, 2020-05-08 at 10:11 +0400, Jill Jade wrote:
> We have an issue with DB_Link from Oracle to PostgreSQL. When we try to 
> access tables from
> Oracle database via DB_link, temp schemas(pg_temp,pg_toast) are automatically 
> created
> in the database while using the query below.
> 
> create table as select * from table@oralink  
> 
> Even if we reload the database, the temp schemas are still there. 
> 
> Do you know the reason why the temp schemas are being created with DB_link?
> Does anyone have a solution for this problem?

These schemas contain temporary tables and are not connected to your connection
from Oracle (unless you create temporary tables via that connection).

They are an implementation detail and should not bother you.

What is your problem with these schemas?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Memory footprint diff between 9.5 and 12

2020-05-07 Thread Laurenz Albe
On Thu, 2020-05-07 at 13:33 -0700, Tory M Blue wrote:
> I hadn't noticed this until today, but a running 9.5 system with buffers at 
> 10GB starts
> and has been running years without issues. (15GB available)
> 
> Postgres 12 will not start with that configuration, complaining about memory 
> availability.
> So Postgres12 won't start until shared buffers is 6GB, but even with that, my 
> DB servers ,
> postgres queries started complaining about being unable to allocate memory 
> "unable to allocate".
> 
> So dropping them to 4GB (on a 15GB system), may help, but did I miss a huge 
> note about
> significant memory changes between 9.5 to 12?
> 
> Is there something else I'm missing that on busy systems is important, 
> something introduced
> in 10 or 11 as again I'm not seeing anything noted in 12.

There must be something else running on the machine that allocates memory.

Did you perchance run the 9.5 and the v12 server on the same machine?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Odd postgres12 upgrade is changing or munging a password?

2020-05-07 Thread Laurenz Albe
On Thu, 2020-05-07 at 11:55 -0700, Tory M Blue wrote:
> Going from 9.5 to 12 and 2 times now, I've had a password either go missing 
> or munged.
> I've had to add an alter statement at the end of the upgrade.
> 
> The DB is functioning fine, shut it down, do the upgrade and the password is 
> munged.
> Seems like an odd occurrence, we have not noted any other weird issues.
> 
> Anyone else see or hear of this?

The only explanation I can come up with is that "password_encryption" is set to
"scram-sha-256" on the v12 server.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: pg_restore V12 fails consistently against piped pg_dumps

2020-05-07 Thread Laurenz Albe
On Thu, 2020-05-07 at 11:04 -0400, Mohamed Wael Khobalatte wrote:
> Thanks Tom, I will review those changes tonight. In the meantime, to
> reproduce, I run the following:
> 
> - createdb test
> - create some tables, I used a simple loop to create 10 empty ones.
> - pg_dump -v -Fc test | tee test.dump > /dev/null (so it's through a
> pipe, same as if the file is streamed through curl which was the
> original case)
> - pg_restore -j4 --verbose --clean --no-acl --no-owner -d test test.dump
> 
> That should fail consistently with pg_restore 12 and pg_dump 12. The same
> passes if I run it with earlier versions.
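
(For reference, the table-creation loop might look roughly like this -- a
guess at what was used, with arbitrary table names:

for i in $(seq 1 10); do psql test -c "CREATE TABLE t_${i} (id int)"; done
)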

I just tried that and didn't encounter any errors.

Please come up with a more complete example.

Are you OS user "postgres" when you run that?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com