Re: Query much slower on 9.6.5 than on 9.3.5

2019-04-07 Thread Joe Conway
On 4/5/19 6:23 AM, Rob Northcott wrote:
> Ah, fair enough – I’ll get into the habit of doing that then.  Thanks again.
> 
> Rob

> *From:*Ron 
> *Sent:* 05 April 2019 11:07
> *To:* pgsql-general@lists.postgresql.org
> *Subject:* Re: Query much slower on 9.6.5 than on 9.3.5
> 
> pg_dump/restore gets rid of all the dead space, and you should *always*
> run an ANALYZE after pg_restore, since pg_restore doesn't populate the
> statistics tables.

After the dump/restore cycle I would recommend a "VACUUM FREEZE ANALYZE"
if you can tolerate the time it takes, as this will not only ensure that
the statistics are populated, but it will also get future rewrites of
the rows out of the way all at once (visibility hint bits and freezing
of the tuples).
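
A minimal sketch of that step, run in the restored database once pg_restore
finishes (VERBOSE is optional; this rewrites every table once, so budget the
time accordingly):

  -- database-wide: freezes tuples, sets visibility hint bits, rebuilds statistics
  VACUUM (FREEZE, ANALYZE, VERBOSE);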

Joe
-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Logical replication failed recovery

2019-04-07 Thread Lou Tseng
Hi folks,

Is there a good tutorial to recover from logical replication out of sync?  We 
ran into this error stating WAL has been removed and now the replica is out of 
sync.  Also, is there a way to increase the number of WAL segments kept on the master?

Thanks!


2019-04-07 12:28:37.180 UTC [22292] ERROR:  could not receive data from WAL 
stream: ERROR:  requested WAL segment 0001520800EB has already been 
removed

2019-04-07 12:28:37.182 UTC [114186] LOG:  worker process: logical replication 
worker for subscription 8907992 (PID 22292) exited with exit code 1



Lou Tseng


lts...@advancedpricing.com
Advanced Medical Pricing Solutions
35 Technology Parkway South, Suite. 100
Peachtree Corners, GA 30092



Re: Logical replication failed recovery

2019-04-07 Thread Pavan Teja
Hi Lou,

Try setting the wal_keep_segments parameter to a higher value. I think it will
suffice to solve the problem.
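
A hedged sketch of setting it on the publisher (the value of 1024 segments,
roughly 16 GB of WAL, is only illustrative; on PostgreSQL 13 and later the
parameter is replaced by wal_keep_size):

  ALTER SYSTEM SET wal_keep_segments = 1024;
  SELECT pg_reload_conf();  -- picked up on reload, no restart required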

Regards,
Pavan Teja,
9841380956

On Sun, 7 Apr, 2019, 6:53 PM Lou Tseng,  wrote:

> Hi folks,
>
> Is there a good tutorial to recover from logical replication out of sync?
> We ran into this error stating WAL has been removed and now replica is out
> of sync.  Also, is there a way to increase the number of WAL kept on the
> master?
>
> Thanks!
>
> 2019-04-07 12:28:37.180 UTC [22292] ERROR:  could not receive data from
> WAL stream: ERROR:  requested WAL segment 0001520800EB has
> already been removed
>
> 2019-04-07 12:28:37.182 UTC [114186] LOG:  worker process: logical
> replication worker for subscription 8907992 (PID 22292) exited with exit
> code 1
>
>
> Lou Tseng
>
> lts...@advancedpricing.com
> 
> Advanced Medical Pricing Solutions 
> 35 Technology Parkway South, Suite. 100
> Peachtree Corners, GA 30092
>
>


Re: pg_upgrade --jobs

2019-04-07 Thread Sherrylyn Branchaw
are there any shortcuts to upgrading that would circumvent exporting the
entire schema?

By "shortcuts," do you mean you want to minimize the time and energy you
put into the upgrade, or that you want to minimize database downtime? If
you mean downtime, I was able to upgrade a customer-facing database with
~350,000 tables from Postgres 9.0 to 9.6 last year with only 86 seconds of
downtime, using Slony, but I had to make many custom modifications to Slony
and test thoroughly beforehand, and it was not for the faint of heart, the
pressed for time, or the inexperienced. There may be better ways (and if
so, I would be curious to learn about them), but Slony was the tool with
which I was most familiar at the time.

This method does, of course, require exporting the entire schema, but
because our only constraint was to minimize customer downtime, and the
database was online while the schema was being exported, we didn't care how
long it took. Your constraints may be different.

For those reading: we do know that 350,000 tables is Doing It Wrong, and
we're getting rid of them, but we decided being on an EOLed version of
Postgres was worse and should be fixed first.

Sherrylyn


Re: Logical replication failed recovery

2019-04-07 Thread Adrian Klaver

On 4/7/19 6:22 AM, Lou Tseng wrote:

Hi folks,

Is there a good tutorial to recover from logical replication out of 
sync?  We ran into this error stating WAL has been removed and now 
replica is out of sync.  Also, is there a way to increase the number of 
WAL kept on the master?


I have to believe there is more to the story. When a subscription is 
enabled it sets up a replication slot 
(www.postgresql.org/docs/11/warm-standby.html#STREAMING-REPLICATION-SLOTS) 
that should ensure that the WAL is not recycled until it is used. Did 
you stop a Subscription and then restart it? Or one of the other 
scenarios shown here?:

https://www.postgresql.org/docs/11/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
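
A sketch of how to check on the publisher whether the subscription's slot is
still there and which WAL position it is holding back:

  SELECT slot_name, slot_type, active, restart_lsn
  FROM pg_replication_slots;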



Thanks!

2019-04-07 12:28:37.180 UTC [22292] ERROR:  could not receive data from 
WAL stream: ERROR:  requested WAL segment 0001520800EB has 
already been removed


2019-04-07 12:28:37.182 UTC [114186] LOG:  worker process: logical 
replication worker for subscription 8907992 (PID 22292) exited with exit 
code 1




Lou Tseng

lts...@advancedpricing.com 

Advanced Medical Pricing Solutions 
35 Technology Parkway South, Suite. 100
Peachtree Corners, GA 30092





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: pg_upgrade --jobs

2019-04-07 Thread Adrian Klaver

On 4/6/19 5:47 PM, senor wrote:

Thanks Tom for the explanation. I assumed it was my ignorance of how the schema 
was handled that was making this look like a problem that had already been 
solved and I was missing something.

I fully expected the "You're Doing It Wrong" part. That is out of my control 
but not beyond my influence.

I suspect I know the answer to this but have to ask. Using a simplified example 
where there are 100K sets of 4 tables, each representing the output of a single 
job, are there any shortcuts to upgrading that would circumvent exporting the 
entire schema? I'm sure a different DB design would be better but that's not 
what I'm working with.


An answer is going to depend on more information:

1) What is the time frame for moving from one version to another?
Both the setup and the actual downtime.

2) There are 500,000+ tables, but what is the amount of data involved?

3) Are all the tables active?

4) How are the tables distributed across databases in the cluster and 
schemas in each database?





Thanks


From: Ron 
Sent: Saturday, April 6, 2019 4:57 PM
To: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

On 4/6/19 6:50 PM, Tom Lane wrote:

senor  writes:


[snip]

The --link option to pg_upgrade would be so much more useful if it
weren't still bound to serially dumping the schemas of half a million
tables.



To be perfectly blunt, if you've got a database with half a million
tables, You're Doing It Wrong.

Heavy (really heavy) partitioning?

--
Angular momentum makes the world go 'round.






--
Adrian Klaver
adrian.kla...@aklaver.com




assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Konstantin Izmailov
Hi,
I'm experimenting with Postgres 10 and protocol v3. I noticed that the
Postgres allows executing multiple queries simultaneously (I basically
commented out a check that prevents sending another query in libpq while
previous result(s) reading is not complete). Things appear like working,
but I just wanted to ask if anyone else tried the same (logically separate
results from multiple simultaneous queries)?

Afaik libpq was not designed for that scenario, but it seems the Postgres
server supports it. Any thought or information will be appreciated.

Thank you!
Konstantin


Re: assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Pavel Stehule
Hi

ne 7. 4. 2019 v 20:47 odesílatel Konstantin Izmailov 
napsal:

> Hi,
> I'm experimenting with Postgres 10 and protocol v3. I noticed that the
> Postgres allows executing multiple queries simultaneously (I basically
> commented out a check that prevents sending another query in libpq while
> previous result(s) reading is not complete). Things appear like working,
> but I just wanted to ask if anyone else tried the same (logically separate
> results from multiple simultaneous queries)?
>
>
Postgres cannot execute simultaneous queries in one session, so queries
have to be executed serially every time.

Regards

Pavel



> Afaik libpq was not designed for that scenario, but it seems the Postgres
> server supports it. Any thought or information will be appreciated.
>
> Thank you!
> Konstantin
>


Re: pg_upgrade --jobs

2019-04-07 Thread senor
Thank you Adrian. I'm not sure if I can provide as much as you'd need for a 
definite answer but I'll give you what I have.

The original scheduled downtime for one installation was 24 hours. By 21 hours 
it had not completed the pg_dump schema-only so it was returned to operation.
The amount of data per table is widely varied. Some daily tables are 100-200GB 
and thousands of reports tables with stats are much smaller. I'm not connected 
to check now but I'd guess 1GB max. We chose to use the --link option partly 
because some servers do not have the disk space to copy. The time necessary to 
copy 1-2TB was also going to be an issue.
The vast majority of activity is on current day inserts and stats reports of 
that data. All previous days and existing reports are read only.
As is all too common, the DB usage grew with no redesign so it is a single 
database on a single machine with a single schema.
I get the impression there may be an option of getting the schema dump while in 
service but possibly not in this scenario. Plan B is to drop a lot of tables 
and deal with imports later.

I appreciate the help. 


From: Adrian Klaver 
Sent: Sunday, April 7, 2019 8:19 AM
To: senor; pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

On 4/6/19 5:47 PM, senor wrote:
> Thanks Tom for the explanation. I assumed it was my ignorance of how the 
> schema was handled that was making this look like a problem that had already 
> been solved and I was missing something.
>
> I fully expected the "You're Doing It Wrong" part. That is out of my control 
> but not beyond my influence.
>
> I suspect I know the answer to this but have to ask. Using a simplified 
> example where there are 100K sets of 4 tables, each representing the output 
> of a single job, are there any shortcuts to upgrading that would circumvent 
> exporting the entire schema? I'm sure a different DB design would be better 
> but that's not what I'm working with.

An answer is going to depend on more information:

1) What is the time frame for moving from one version to another?
Both the setup and the actual downtime.

2) There are 500,000+ tables, but what is the amount of data involved?

3) Are all the tables active?

4) How are the tables distributed across databases in the cluster and
schemas in each database?


>
> Thanks
>
> 
> From: Ron 
> Sent: Saturday, April 6, 2019 4:57 PM
> To: pgsql-general@lists.postgresql.org
> Subject: Re: pg_upgrade --jobs
>
> On 4/6/19 6:50 PM, Tom Lane wrote:
>
> senor  writes:
>
>
> [snip]
>
> The --link option to pg_upgrade would be so much more useful if it
> weren't still bound to serially dumping the schemas of half a million
> tables.
>
>
>
> To be perfectly blunt, if you've got a database with half a million
> tables, You're Doing It Wrong.
>
> Heavy (really heavy) partitioning?
>
> --
> Angular momentum makes the world go 'round.
>
>
>


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How to use full-text search URL parser to filter query results by domain name?

2019-04-07 Thread Jess Wren
On 4/6/19 11:42 PM, haman...@t-online.de wrote:
> Hi,
>
> I have no real idea about solving the complete problem, and would probably try
> something with a temp table first.
> For extracting the hostname from a url you could use
>
> select regexp_replace(url, 'https?://([^/]*)/.*', '\1')
>
> instead of the fulltext parser
>
> Best regards
> Wolfgang

Thanks Wolfgang, I understand that I could implement a function using
regex for this, or just create an extra column/table to store the
hostname data. But there are other parts of the application where I'll
need to extract URL path, others where i'll want to extract scheme, etc.
Since postgres has builtin capabilities for parsing URLs to do alll of
this, I'd rather just use the builtin functions instead of writing them
myself using regex or having to generate a temp table each time I do a
lookup.

So although I'm aware that there are a variety of ways to extract
hostname (right now I'm just doing it in Python), I'm really most
interested in understanding how to use the builtin Postgres URL parsers
to extract host, url path, etc and how to appropriately create indexes
based on them. The documentation for the URL parser is very sparse, and
I can't find much info online either.





Re: pg_upgrade --jobs

2019-04-07 Thread senor
I just noticed I missed Sherrylyn's post.
I did some reading about Slony and believe it would be useful if I had the 
time to dig in. As pointed out, it's not an out-of-the-box solution. It is 
included on the TODO list though. For now I can only dream of the 86 second 
down time.

Thanks


From: Sherrylyn Branchaw 
Sent: Sunday, April 7, 2019 6:43 AM
To: senor
Cc: pgsql-general@lists.postgresql.org
Subject: Re: pg_upgrade --jobs

are there any shortcuts to upgrading that would circumvent exporting the entire 
schema?

By "shortcuts," do you mean you want to minimize the time and energy you put 
into the upgrade, or that you want to minimize database downtime? If you mean 
downtime, I was able to upgrade a customer-facing database with ~350,000 tables 
from Postgres 9.0 to 9.6 last year with only 86 seconds of downtime, using 
Slony, but I had to make many custom modifications to Slony and test thoroughly 
beforehand, and it was not for the faint of heart, the pressed for time, or the 
inexperienced. There may be better ways (and if so, I would be curious to learn 
about them), but Slony was the tool with which I was most familiar at the time.

This method does, of course, require exporting the entire schema, but because 
our only constraint was to minimize customer downtime, and the database was 
online while the schema was being exported, we didn't care how long it took. 
Your constraints may be different.

For those reading: we do know that 350,000 tables is Doing It Wrong, and we're 
getting rid of them, but we decided being on an EOLed version of Postgres was 
worse and should be fixed first.

Sherrylyn




Re: assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Andres Freund
Hi,

On 2019-04-07 20:57:56 +0200, Pavel Stehule wrote:
> ne 7. 4. 2019 v 20:47 odesílatel Konstantin Izmailov 
> napsal:
> 
> > Hi,
> > I'm experimenting with Postgres 10 and protocol v3. I noticed that the
> > Postgres allows executing multiple queries simultaneously (I basically
> > commented out a check that prevents sending another query in libpq while
> > previous result(s) reading is not complete). Things appear like working,
> > but I just wanted to ask if anyone else tried the same (logically separate
> > results from multiple simultaneous queries)?
> >
> >
> Postgres cannot to execute simultaneous queries in one session. So queries
> should be executed in serial form every time.

I think what Konstantin is really talking about is pipelining
(i.e. sending multiple queries without waiting for the results
inbetween, thereby reducing latency), and that is actually supported by
postgres. Some drivers make fairly extensive use of it (e.g. the pgjdbc
driver).

As for libpq: No, that's currently not supported. There is a patch that
I hope to get into v13 however: https://commitfest.postgresql.org/20/1317

Greetings,

Andres Freund




Re: pg_upgrade --jobs

2019-04-07 Thread Adrian Klaver

On 4/7/19 12:05 PM, senor wrote:

Thank you Adrian. I'm not sure if I can provide as much as you'd need for a 
definite answer but I'll give you what I have.

The original scheduled downtime for one installation was 24 hours. By 21 hours 
it had not completed the pg_dump schema-only so it was returned to operation.


So this is more than one cluster?

I am assuming the below was repeated at different sites?


The amount of data per table is widely varied. Some daily tables are 100-200GB 
and thousands of reports tables with stats are much smaller. I'm not connected 
to check now but I'd guess 1GB max. We chose to use the --link option partly 
because some servers do not have the disk space to copy. The time necessary to 
copy 1-2TB was also going to be an issue.
The vast majority of activity is on current day inserts and stats reports of 
that data. All previous days and existing reports are read only.
As is all too common, the DB usage grew with no redesign so it is a single 
database on a single machine with a single schema.
I get the impression there may be an option of getting the schema dump while in 
service but possibly not in this scenario. Plan B is to drop a lot of tables 
and deal with imports later.


I take the above to mean that a lot of the tables are cruft, correct?



I appreciate the help.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: pg_upgrade --jobs

2019-04-07 Thread Melvin Davidson
> The original scheduled downtime for one installation was 24 hours. By 21
> hours it had not completed the pg_dump schema-only so it was returned to
> operation.

To me, your best option is to create a slony cluster with the version you
need to upgrade to.
When slony is in sync, simply make it the master and switch to it. It may
take a while for
slony replication to be in sync, but when it is, there will be very little
down time to switch
over.

Slony 

On Sun, Apr 7, 2019 at 3:36 PM Adrian Klaver 
wrote:

> On 4/7/19 12:05 PM, senor wrote:
> > Thank you Adrian. I'm not sure if I can provide as much as you'd need
> for a definite answer but I'll give you what I have.
> >
> > The original scheduled downtime for one installation was 24 hours. By 21
> hours it had not completed the pg_dump schema-only so it was returned to
> operation.
>
> So this is more then one cluster?
>
> I am assuming the below was repeated at different sites?
>
> > The amount of data per table is widely varied. Some daily tables are
> 100-200GB and thousands of reports tables with stats are much smaller. I'm
> not connected to check now but I'd guess 1GB max. We chose to use the
> --link option partly because some servers do not have the disk space to
> copy. The time necessary to copy 1-2TB was also going to be an issue.
> > The vast majority of activity is on current day inserts and stats
> reports of that data. All previous days and existing reports are read only.
> > As is all too common, the DB usage grew with no redesign so it is a
> single database on a single machine with a single schema.
> > I get the impression there may be an option of getting the schema dump
> while in service but possibly not in this scenario. Plan B is to drop a lot
> of tables and deal with imports later.
>
> I take the above to mean that a lot of the tables are cruft, correct?
>
> >
> > I appreciate the help.
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>

-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!


Re: assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Konstantin Izmailov
Yes, Andres, I meant "pipelining", I just couldn't find the correct word. Thank
you for the answer(s)!

I also made changes in my own copy of libpq, and they work fine. I think
the pipelining support is needed in libpq. Btw, how can I get the patch
code? I want to compare your approach with mine. I couldn't figure out how
to get the patch from the link.

Thanks so much!


On Sun, Apr 7, 2019 at 12:21 PM Andres Freund  wrote:

> Hi,
>
> On 2019-04-07 20:57:56 +0200, Pavel Stehule wrote:
> > ne 7. 4. 2019 v 20:47 odesílatel Konstantin Izmailov 
> > napsal:
> >
> > > Hi,
> > > I'm experimenting with Postgres 10 and protocol v3. I noticed that the
> > > Postgres allows executing multiple queries simultaneously (I basically
> > > commented out a check that prevents sending another query in libpq
> while
> > > previous result(s) reading is not complete). Things appear like
> working,
> > > but I just wanted to ask if anyone else tried the same (logically
> separate
> > > results from multiple simultaneous queries)?
> > >
> > >
> > Postgres cannot to execute simultaneous queries in one session. So
> queries
> > should be executed in serial form every time.
>
> I think what Konstantin is really talking about is pipelining
> (i.e. sending multiple queries without waiting for the results
> inbetween, thereby reducing latency), and that is actually supported by
> postgres. Some drivers make fairly extensive use of it (e.g. the pgjdbc
> driver).
>
> As for libpq: No, that's currently not supported. There is a patch that
> I hope to get into v13 however: https://commitfest.postgresql.org/20/1317
>
> Greetings,
>
> Andres Freund
>


Re: assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Konstantin Izmailov
Never mind, I found the link to the github in emails from the link. Thanks
again!

Konstantin

On Sun, Apr 7, 2019 at 1:28 PM Konstantin Izmailov  wrote:

> Yes, Andres, I meant "pipelining", just couldn't choose correct word.
> Thank you for the answer(s)!
>
> I also made changes in my own copy of libpq, and they work fine. I think
> the pipelining support is needed in libpq. Btw, how can I get the patch
> code? I want to compare your approach with mine. I couldn't figure out how
> to get the patch from the link.
>
> Thanks so much!
>
>
> On Sun, Apr 7, 2019 at 12:21 PM Andres Freund  wrote:
>
>> Hi,
>>
>> On 2019-04-07 20:57:56 +0200, Pavel Stehule wrote:
>> > ne 7. 4. 2019 v 20:47 odesílatel Konstantin Izmailov 
>> > napsal:
>> >
>> > > Hi,
>> > > I'm experimenting with Postgres 10 and protocol v3. I noticed that the
>> > > Postgres allows executing multiple queries simultaneously (I basically
>> > > commented out a check that prevents sending another query in libpq
>> while
>> > > previous result(s) reading is not complete). Things appear like
>> working,
>> > > but I just wanted to ask if anyone else tried the same (logically
>> separate
>> > > results from multiple simultaneous queries)?
>> > >
>> > >
>> > Postgres cannot to execute simultaneous queries in one session. So
>> queries
>> > should be executed in serial form every time.
>>
>> I think what Konstantin is really talking about is pipelining
>> (i.e. sending multiple queries without waiting for the results
>> inbetween, thereby reducing latency), and that is actually supported by
>> postgres. Some drivers make fairly extensive use of it (e.g. the pgjdbc
>> driver).
>>
>> As for libpq: No, that's currently not supported. There is a patch that
>> I hope to get into v13 however: https://commitfest.postgresql.org/20/1317
>>
>> Greetings,
>>
>> Andres Freund
>>
>


Re: assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Andres Freund
Hi,

On postgres mailing lists please don't write your reply at the top of a
fully quoted email. We like the reply to be inline and trimmed to the
necessary parts.

On 2019-04-07 13:28:46 -0700, Konstantin Izmailov wrote:
> Yes, Andres, I meant "pipelining", just couldn't choose correct word. Thank
> you for the answer(s)!
> 
> I also made changes in my own copy of libpq, and they work fine. I think
> the pipelining support is needed in libpq. Btw, how can I get the patch
> code? I want to compare your approach with mine. I couldn't figure out how
> to get the patch from the link.

Hm, odd. There's a link on the page "Latest attachment" - but for
unknown reasons that's broken. I've attached it for now, but will also
inquire with the webadmin team about what's up.

Greetings,

Andres Freund
>From ba93ae02eca024997f2ce6a9c2c2987aea4a77b8 Mon Sep 17 00:00:00 2001
From: Prabakaran 
Date: Fri, 12 Jan 2018 10:09:09 +1100
Subject: [PATCH] Pipelining-batch-support-for-libpq-code-v16

---
 doc/src/sgml/libpq.sgml| 502 +
 doc/src/sgml/lobj.sgml |   4 +
 .../libpqwalreceiver/libpqwalreceiver.c|   3 +
 src/interfaces/libpq/exports.txt   |   5 +
 src/interfaces/libpq/fe-connect.c  |  28 +
 src/interfaces/libpq/fe-exec.c | 595 +++--
 src/interfaces/libpq/fe-protocol2.c|   6 +
 src/interfaces/libpq/fe-protocol3.c|  15 +-
 src/interfaces/libpq/libpq-fe.h|  24 +-
 src/interfaces/libpq/libpq-int.h   |  47 +-
 10 files changed, 1186 insertions(+), 43 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 4e46451..6aae637 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -4752,6 +4752,500 @@ int PQflush(PGconn *conn);
 
  
 
+ 
+  Batch mode and query pipelining
+
+  
+   libpq
+   batch mode
+  
+
+  
+   libpq
+   pipelining
+  
+
+  
+   libpq supports queueing up queries into
+   a pipeline to be executed as a batch on the server. Batching queries allows
+   applications to avoid a client/server round-trip after each query to get
+   the results before issuing the next query.
+  
+
+  
+   When to use batching
+
+   
+Much like asynchronous query mode, there is no performance disadvantage to
+using batching and pipelining. It increases client application complexity
+and extra caution is required to prevent client/server deadlocks but
+can sometimes offer considerable performance improvements.
+   
+
+   
+Batching is most useful when the server is distant, i.e. network latency
+(ping time) is high, and when many small operations are being performed in
+rapid sequence. There is usually less benefit in using batches when each
+query takes many multiples of the client/server round-trip time to execute.
+A 100-statement operation run on a server 300ms round-trip-time away would take
+30 seconds in network latency alone without batching; with batching it may spend
+as little as 0.3s waiting for results from the server.
+   
+
+   
+Use batches when your application does lots of small
+INSERT, UPDATE and
+DELETE operations that can't easily be transformed into
+operations on sets or into a
+COPY operation.
+   
+
+   
+Batching is not useful when information from one operation is required by the
+client before it knows enough to send the next operation. The client must
+introduce a synchronisation point and wait for a full client/server
+round-trip to get the results it needs. However, it's often possible to
+adjust the client design to exchange the required information server-side.
+Read-modify-write cycles are especially good candidates; for example:
+
+ BEGIN;
+ SELECT x FROM mytable WHERE id = 42 FOR UPDATE;
+ -- result: x=2
+ -- client adds 1 to x:
+ UPDATE mytable SET x = 3 WHERE id = 42;
+ COMMIT;
+
+could be much more efficiently done with:
+
+ UPDATE mytable SET x = x + 1 WHERE id = 42;
+
+   
+
+   
+
+ The batch API was introduced in PostgreSQL 10.0, but clients using PostgresSQL 10.0 version of libpq can
+ use batches on server versions 7.4 and newer. Batching works on any server
+ that supports the v3 extended query protocol.
+
+   
+
+  
+
+  
+   Using batch mode
+
+   
+To issue batches the application must switch
+a connection into batch mode. Enter batch mode with PQenterBatchMode(conn) or test
+whether batch mode is active with PQbatchStatus(conn). In batch mode only asynchronous operations are permitted, and
+COPY is not recommended as it most likely will trigger failure in batch processing. 
+Using any synchronous command execution functions such as PQfn,
+PQexec or one of its sibling functions are error conditions.
+Functions allowed in batch mode are described i

Re: assembling PGresults from multiple simultaneous queries (libpq, singlerowmode)

2019-04-07 Thread Konstantin Izmailov
Got it! Thanks!


Re: pg_upgrade --jobs

2019-04-07 Thread Sherrylyn Branchaw
> It may take a while for slony replication to be in sync, but when it is,
there will be very little down time to switch over.

I agree in principle, which is why I chose Slony over pg_upgrade for my
company's very similar situation, but my experience was that, out of the
box, Slony was projected to take unacceptably long (months) to sync our
350,000 tables, and downtime was going to be many hours. In order to get
those numbers down, I had to reduce the number of per-table steps Slony was
executing, e.g. by eliminating unnecessary-for-us ones and by rewriting
others to happen in bulk.

Here's something I didn't know Slony did when I didn't have 350,000 tables
to worry about: add a table to replication, run through the existing list
of replicated tables to make sure it's captured any changes that have
happened in the meantime, add one more table to replication, run through
the existing list of tables to make sure no changes have happened, and so
on. The more tables you add, the longer it takes to add the next table.
Here's another thing I didn't know it did: during the switchover, manage 4
triggers per table serially on primary and standby. 4 * 350,000 * 2 = 2.8
million triggers. (I knew it managed 4 triggers per table, but not that it
was serial or how long that would take when pg_trigger had almost 3 million
relevant rows.)

I would love to help the OP out in a more hands-on way (I have
upgrade-via-Slony consulting experience), as well as to open source the
custom changes I came up with, but I'm debating whether I have the time to
offer to take on another project right now. I'm also reluctant to summarize
exactly what I did, because messing with pg_catalog directly is very
delicate and likely to go wrong, and I don't recommend it to the
inexperienced all, or to the experienced if they have alternatives.

> Plan B is to drop a lot of tables and deal with imports later.

If it were me, I would take a copy of my database, restore it to a sandbox
environment, set up Slony, and get an estimate for the projected sync time.
Let it run for a few hours to see how dramatically the sync time slows down
as the number of tables grows. I.e. don't count the number of replicated
tables after an hour and assume that's your hourly rate of syncing. If it
looks like you can sync your entire database in a few days or less, then
let it run and test out your failover time. If that's acceptable, you're
good to go.

If sync time looks like it's going to be incredibly long on your schema +
hardware + Postgres version, etc., then failover time probably would be
too. In that case, temporarily dropping the tables you can drop may be
preferable to the complexity of making changes to speed up Slony, if you
don't have a seriously experienced DBA on hand.

Sherrylyn


Unable to Vacuum Large Defragmented Table

2019-04-07 Thread Igal Sapir
I have a table for which pg_relation_size() shows only 31MB, but
pg_total_relation_size() shows a whopping 84GB.

The database engine is running inside a Docker container, with the data
mounted as a volume from a partition on the host's file system.

When I try to run `VACUUM FULL`, the disk usage goes up until it reaches
the full capacity of the partition (about 27GB of free space), at which
point it fails.

How can I reclaim the disk space here other than write the data to a new
table and drop the old one?

Thank you,

Igal


Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread Adrian Klaver

On 4/7/19 3:08 PM, Igal Sapir wrote:
I have a table for which pg_relation_size() shows only 31MB, but 
pg_total_relation_size() shows a whopping 84GB.


What does:

pg_indexes_size()

show.

Also how many indexes are on the table?
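
For reference, a sketch of the size breakdown ("mytable" stands in for the 
actual table name):

  SELECT pg_size_pretty(pg_relation_size('mytable'))       AS heap,
         pg_size_pretty(pg_indexes_size('mytable'))        AS indexes,
         pg_size_pretty(pg_total_relation_size('mytable')) AS total;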



The database engine is running inside a Docker container, with the data 
mounted as a volume from a partition on the host's file system.


When I try to run `VACUUM FULL`, the disk usage goes up until it reaches 
the full capacity of the partition (about 27GB of free space), at which 
point it fails.


Yeah it would:

https://www.postgresql.org/docs/11/sql-vacuum.html
"Selects “full” vacuum, which can reclaim more space, but takes much 
longer and exclusively locks the table. This method also requires extra 
disk space, since it writes a new copy of the table and doesn't release 
the old copy until the operation is complete. ..."




How can I reclaim the disk space here other than write the data to a new 
table and drop the old one?


Thank you,

Igal






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 10:09, Igal Sapir  wrote:
>
> I have a table for which pg_relation_size() shows only 31MB, but 
> pg_total_relation_size() shows a whopping 84GB.
>
> The database engine is running inside a Docker container, with the data 
> mounted as a volume from a partition on the host's file system.
>
> When I try to run `VACUUM FULL`, the disk usage goes up until it reaches the 
> full capacity of the partition (about 27GB of free space), at which point it 
> fails.

That sort of indicates that the table might not be as bloated as you
seem to think it is.  Remember that variable length attributes can be
toasted and stored in the relation's toast table.
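
A sketch of how to confirm that, assuming the table is called "mytable" (the 
toast relation is found through pg_class):

  SELECT reltoastrelid::regclass AS toast_table,
         pg_size_pretty(pg_relation_size(reltoastrelid)) AS toast_size
  FROM pg_class
  WHERE relname = 'mytable';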

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread Igal Sapir
David,

On Sun, Apr 7, 2019 at 6:20 PM David Rowley 
wrote:

> On Mon, 8 Apr 2019 at 10:09, Igal Sapir  wrote:
> >
> > I have a table for which pg_relation_size() shows only 31MB, but
> pg_total_relation_size() shows a whopping 84GB.
> >
> > The database engine is running inside a Docker container, with the data
> mounted as a volume from a partition on the host's file system.
> >
> > When I try to run `VACUUM FULL`, the disk usage goes up until it reaches
> the full capacity of the partition (about 27GB of free space), at which
> point it fails.
>
> That sort of indicates that the table might not be as bloated as you
> seem to think it is.  Remember that variable length attributes can be
> toasted and stored in the relation's toast table.
>

I think that you're on to something here.  The table has a JSONB column
which has possibly been toasted.

I have deleted many rows from the table itself though, and still fail to
reclaim disk space.  Is there something else I should do to delete the
toasted data?

Thanks,

Igal


Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 14:19, Igal Sapir  wrote:
>
> On Sun, Apr 7, 2019 at 6:20 PM David Rowley  
> wrote:
>>
>> On Mon, 8 Apr 2019 at 10:09, Igal Sapir  wrote:
>> >
>> > I have a table for which pg_relation_size() shows only 31MB, but 
>> > pg_total_relation_size() shows a whopping 84GB.
>> >
>> > The database engine is running inside a Docker container, with the data 
>> > mounted as a volume from a partition on the host's file system.
>> >
>> > When I try to run `VACUUM FULL`, the disk usage goes up until it reaches 
>> > the full capacity of the partition (about 27GB of free space), at which 
>> > point it fails.
>>
>> That sort of indicates that the table might not be as bloated as you
>> seem to think it is.  Remember that variable length attributes can be
>> toasted and stored in the relation's toast table.
>
>
> I think that you're on to something here.  The table has a JSONB column which 
> has possibly toasted.
>
> I have deleted many rows from the table itself though, and still fail to 
> reclaim disk space.  Is there something else I should do to delete the 
> toasted data?

The toast data is part of the data. It's just stored out of line since
there's a hard limit of just under 8k per tuple and since tuples
cannot span multiple pages, PostgreSQL internally breaks them into
chunks, possibly compresses them and stores them in the toast table.
This can occur for any variable length type.

This means if you want to remove the toast data, then you'll need to
remove the data from the main table, either the form of deleting rows
or updating them to remove the toasted values.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread Igal Sapir
David,

On Sun, Apr 7, 2019 at 7:28 PM David Rowley 
wrote:

> On Mon, 8 Apr 2019 at 14:19, Igal Sapir  wrote:
> >
> > On Sun, Apr 7, 2019 at 6:20 PM David Rowley <
> david.row...@2ndquadrant.com> wrote:
> >>
> >> On Mon, 8 Apr 2019 at 10:09, Igal Sapir  wrote:
> >> >
> >> > I have a table for which pg_relation_size() shows only 31MB, but
> pg_total_relation_size() shows a whopping 84GB.
> >> >
> >> > The database engine is running inside a Docker container, with the
> data mounted as a volume from a partition on the host's file system.
> >> >
> >> > When I try to run `VACUUM FULL`, the disk usage goes up until it
> reaches the full capacity of the partition (about 27GB of free space), at
> which point it fails.
> >>
> >> That sort of indicates that the table might not be as bloated as you
> >> seem to think it is.  Remember that variable length attributes can be
> >> toasted and stored in the relation's toast table.
> >
> >
> > I think that you're on to something here.  The table has a JSONB column
> which has possibly toasted.
> >
> > I have deleted many rows from the table itself though, and still fail to
> reclaim disk space.  Is there something else I should do to delete the
> toasted data?
>
> The toast data is part of the data. It's just stored out of line since
> there's a hard limit of just under 8k per tuple and since tuples
> cannot span multiple pages, PostgreSQL internally breaks them into
> chunks, possibly compresses them and stores them in the toast table.
> This can occur for any variable length type.
>
> This means if you want to remove the toast data, then you'll need to
> remove the data from the main table, either the form of deleting rows
> or updating them to remove the toasted values.
>

However, I have now deleted about 50,000 rows more and the table has only
119,688 rows.  The pg_relation_size() still shows 31MB and
pg_total_relation_size() still shows 84GB.

It doesn't make sense that after deleting about 30% of the rows the values
here do not change.

Attempting to copy the data to a different table results in the out of disk
error as well, so that is in line with your assessment.  But it actually
just shows the problem.  The new table to which the data was copied (though
failed due to out of disk) shows 0 rows, but  pg_total_relation_size() for
that table shows 27GB.  So now I have an "empty" table that takes 27GB of
disk space.

This is mostly transient data, so I don't mind deleting rows, but if some
day this could happen in production then I have to know how to deal with it
without losing all of the data.

Thanks,

Igal


Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread David Rowley
On Mon, 8 Apr 2019 at 14:57, Igal Sapir  wrote:
> However, I have now deleted about 50,000 rows more and the table has only 
> 119,688 rows.  The pg_relation_size() still shows 31MB and 
> pg_total_relation_size() still shows 84GB.
>
> It doesn't make sense that after deleting about 30% of the rows the values 
> here do not change.

deleting rows does not make the table any smaller, it just creates
dead rows in the table.  VACUUM tries to release the space used by
those dead rows and turns it back into free space.  Normal vacuum (not
FULL) can only shrink the table if completely empty pages are found at
the end of the table.

> Attempting to copy the data to a different table results in the out of disk 
> error as well, so that is in line with your assessment.  But it actually just 
> shows the problem.  The new table to which the data was copied (though failed 
> due to out of disk) shows 0 rows, but  pg_total_relation_size() for that 
> table shows 27GB.  So now I have an "empty" table that takes 27GB of disk 
> space.

I think the best thing to do is perform a normal VACUUM on the table
then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do: SELECT *
FROM pgstattuple('<table_name>'); and the same again on the toast table.
If your table still contains many dead rows then perhaps an open
transaction is stopping rows from being turned into free space. Once
pgstattuple reports that "tuple_len" from the table, its toast table
and all its indexes has been reduced to an acceptable value then you
should try a VACUUM FULL.  Remember that VACUUM FULL must also write
WAL, so if WAL is on the same volume, then you'll need to consider
space required for that when deciding how much data to remove from the
table.
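
A sketch of those checks, with "mytable" standing in for the real table name:

  CREATE EXTENSION IF NOT EXISTS pgstattuple;
  SELECT * FROM pgstattuple('mytable');
  -- and the same for its toast table
  SELECT * FROM pgstattuple((SELECT reltoastrelid::regclass
                             FROM pg_class WHERE relname = 'mytable'));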

> This is mostly transient data, so I don't mind deleting rows, but if some day 
> this could happen in production then I have to know how to deal with it 
> without losing all of the data.

For the future, it would be better to delete more often than waiting
until the table grows too large.  A normal VACUUM will turn space used
by dead tuples back into free space, so if done often enough there
won't be a need to vacuum full.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: Fwd: Postgresql with nextcloud in Windows Server

2019-04-07 Thread 김준형
Sorry for the late reply, but my server has been working without problems for a while.

> What problem occurs?
> Where is the Windows server?

The problem is that the Windows server doesn't accept new connections or
non-admin connections; only already-established admin connections stay alive.
The Windows server is not close to my workspace, so I access it over a
remote connection.

> Not sure you know that if you cannot connect to the Windows server?

I know I cannot connect to the Windows server, but sometimes I could still get
an admin remote connection when the connection problem occurred.
At that time I tried to stop the PostgreSQL server, but it didn't work.


Lastly, here are the logs I got.

1) The Windows system logs

1096 Process closing UDP socket with local port number 64347 is running
longer than expected. The local port number can not be used until the close
operation is completed. This problem is usually caused by malfunctioning
network drivers. Make sure you have the latest updates for third-party
networking software, including Windows and NIC drivers, firewalls, and
other security products.

2) PostgreSQL logs

...
2019-04-06 04:34:03.984 KST [129704] LOG:  connection received:
host=128.1.99.51 port=40602
2019-04-06 04:34:03.990 KST [129704] LOG:  connection authorized:
user=oc_abmiadmin database=abminext
2019-04-06 04:34:05.117 KST [129720] LOG:  connection received:
host=128.1.99.51 port=40604
2019-04-06 04:34:05.125 KST [129720] LOG:  connection authorized:
user=oc_abmiadmin database=abminext
2019-04-06 04:34:05.125 KST [129720] FATAL:  remaining connection slots are
reserved for non-replication superuser connections
2019-04-06 04:34:05.171 KST [129736] LOG:  connection received:
host=128.1.99.51 port=40606
2019-04-06 04:34:05.179 KST [129736] LOG:  connection authorized:
user=oc_abmiadmin database=abminext
2019-04-06 04:34:05.179 KST [129736] FATAL:  remaining connection slots are
reserved for non-replication superuser connections
...

P.S.: PostgreSQL max_connections = 1200. Is there a problem here?
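
A sketch of how current usage could be compared against those limits while the
problem is building up (run as a superuser so the reserved slots are still
reachable):

  SELECT count(*)                                          AS connections,
         current_setting('max_connections')                AS max_connections,
         current_setting('superuser_reserved_connections') AS reserved
  FROM pg_stat_activity;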



On Tue, Apr 2, 2019 at 11:30 PM Adrian Klaver wrote:

> On 4/1/19 10:03 PM, 김준형 wrote:
> >
> > From: *김준형* mailto:wnsuddl...@gmail.com>>
> > Date: Tue, Apr 2, 2019 at 2:02 PM
> > Subject: Re: Postgresql with nextcloud in Windows Server
> > To: Adrian Klaver  > >
> >
> >
> > Thanks for your reply and I hope this answers can help your questions
> >
> > 1) The nextcloud is running on PostgreSQL server. Cloud system needs
> > PostgreSQL server.
>
> Well I got that backwards. Forget that Nextcloud is an ownCloud fork.
>
> >
> > 2) Nextcloud system try to connect PostgreSQL server all time.
> > 2019-03-27 20:46:59.396 LOG:  connection received: host=xxx.xxx.xxx.xxx
> > port=
> > 2019-03-27 20:46:59.403 LOG:  connection authorized: user=user_name
> > database=db_name
> > 2019-03-27 20:46:59.463 LOG:  disconnection: session time: 0:00:00.067
> > user=user_name database=db_name host=xxx.xxx.xxx.xxx port=
> > this connections repeat almost per 10sec.
> > Other clients well... use this PostgreSQL but not so much.(almost 30
> > people use this PostgreSQL include nextcloud system users)
>
> There is nothing unusual about the above, just shows a client
> successfully connecting and then disconnecting. I set up an ownCloud
> server years ago as a test and as I remember it was constantly talking
> to the Postgres server. In postgresql.conf you could set log_statement =
> 'all' to see what is actually being done by the client(s). Just be aware
> this will generate a lot of logs so you probably want to do this for
> short period only.
>
> >
> >
> > 3) Yes. log files doesn't shows problems clearly. I just checked log
> > files and saw difference when server couldn't connected.
> > 2019-03-27 20:46:59.396 LOG:  connection received: host=xxx.xxx.xxx.xxx
> > port=
> > 2019-03-27 20:46:59.403 LOG:  connection authorized: user=user_name
> > database=db_name
> > this log repeated and no disconnection log.
>
> This shows a successful connection. The disconnection maybe much further
> in the future or has not happened at all. To see current connections
> select from pg_stat_activity:
>
> www.postgresql.org/docs/11/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW
>
> >
> > 4) After problem occur, if try to connect to windows server, windows
> > remote access shows time-out error. PostgreSQL server also too.
>
> What problem occurs?
> Where is the Windows server?
>
> >
> > 5) Before the server doesn't work normally, there is no problem to use
> > PostgreSQL (even nextcloud system, too.)
> >
> > 6) No, It doesn't work. PostgreSQL service status doesn't changed.
>
> Not sure you know that if you cannot connect to the Windows server?
>
> >
> > 7) When I restart server, I check PostgreSQL data and I see some schema
> > data disappeared(only data). log files says 'db system was not properly
> > shut down' so 'automatic recover in progress' when PostgreSQL server
> > started after Windows Server get restarted.
> > I think this 'not properly shut dow

Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread Igal Sapir
David,

On Sun, Apr 7, 2019 at 8:11 PM David Rowley 
wrote:

> On Mon, 8 Apr 2019 at 14:57, Igal Sapir  wrote:
> > However, I have now deleted about 50,000 rows more and the table has
> only 119,688 rows.  The pg_relation_size() still shows 31MB and
> pg_total_relation_size() still shows 84GB.
> >
> > It doesn't make sense that after deleting about 30% of the rows the
> values here do not change.
>
> deleting rows does not make the table any smaller, it just creates
> dead rows in the table.  VACUUM tries to release the space used by
> those dead rows and turns it back into free space.  Normal vacuum (not
> FULL) can only shrink the table if completely empty pages are found at
> the end of the table.
>

ACK


>
> > Attempting to copy the data to a different table results in the out of
> disk error as well, so that is in line with your assessment.  But it
> actually just shows the problem.  The new table to which the data was
> copied (though failed due to out of disk) shows 0 rows, but
> pg_total_relation_size() for that table shows 27GB.  So now I have an
> "empty" table that takes 27GB of disk space.
>
> I think the best thing to do is perform a normal VACUUM on the table
>

Running VACUUM on the newly created table cleared the 27GB so that's good
(I was planning to run normal VACUUM but ran FULL).


> then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do: SELECT *
> FROM pgstattuple('<table_name>'); and the same again on the toast table.
> If your table still contains many dead rows then perhaps an open
> transaction is stopping rows from being turned into free space.


I am not sure how to read the below.  I see a lot of "free_space" but not
"dead":

-[ RECORD 1 ]--+
?column?   | primary
table_len  | 32055296
tuple_count| 120764
tuple_len  | 9470882
tuple_percent  | 29.55
dead_tuple_count   | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 20713580
free_percent   | 64.62
-[ RECORD 2 ]--+
?column?   | toast
table_len  | 88802156544
tuple_count| 15126830
tuple_len  | 30658625743
tuple_percent  | 34.52
dead_tuple_count   | 0
dead_tuple_len | 0
dead_tuple_percent | 0
free_space | 57653329312
free_percent   | 64.92


> Once pgstattuples reports that "tuple_len" from the table, its toast table
> and all its indexes has been reduced to an acceptable value then you
> should try a VACUUM FULL.  Remember that VACUUM FULL must also write
> WAL, so if WAL is on the same volume, then you'll need to consider
> space required for that when deciding how much data to remove from the
> table.
>

WAL is on the same volume.  The PGDATA directory is mounted in a Docker
container.

Isn't there any way to do an in-place VACUUM or pause the WAL at the risk
of losing some data if recovery is required?

There is a catch-22 here.  I can't reclaim the disk space because that
requires disk space.  Surely I'm not the first one to have encountered that
problem with Postgres.


>
> > This is mostly transient data, so I don't mind deleting rows, but if
> some day this could happen in production then I have to know how to deal
> with it without losing all of the data.
>
> For the future, it would be better to delete more often than waiting
> until the table grows too large.  A normal VACUUM will turn space used
> by dead tuples back into free space, so if done often enough there
> won't be a need to vacuum full.
>

ACK.  This issue came up while implementing a retention policy that will be
enforced regularly.

Thank you for all of your help,

Igal


Re: Unable to Vacuum Large Defragmented Table

2019-04-07 Thread Pavel Stehule
po 8. 4. 2019 v 7:57 odesílatel Igal Sapir  napsal:

> David,
>
> On Sun, Apr 7, 2019 at 8:11 PM David Rowley 
> wrote:
>
>> On Mon, 8 Apr 2019 at 14:57, Igal Sapir  wrote:
>> > However, I have now deleted about 50,000 rows more and the table has
>> only 119,688 rows.  The pg_relation_size() still shows 31MB and
>> pg_total_relation_size() still shows 84GB.
>> >
>> > It doesn't make sense that after deleting about 30% of the rows the
>> values here do not change.
>>
>> deleting rows does not make the table any smaller, it just creates
>> dead rows in the table.  VACUUM tries to release the space used by
>> those dead rows and turns it back into free space.  Normal vacuum (not
>> FULL) can only shrink the table if completely empty pages are found at
>> the end of the table.
>>
>
> ACK
>
>
>>
>> > Attempting to copy the data to a different table results in the out of
>> disk error as well, so that is in line with your assessment.  But it
>> actually just shows the problem.  The new table to which the data was
>> copied (though failed due to out of disk) shows 0 rows, but
>> pg_total_relation_size() for that table shows 27GB.  So now I have an
>> "empty" table that takes 27GB of disk space.
>>
>> I think the best thing to do is perform a normal VACUUM on the table
>>
>
> Running VACUUM on the newly created table cleared the 27GB so that's good
> (I was planning to run normal VACUUM but ran FULL).
>

You can drop some indexes, then run VACUUM FULL, and recreate the dropped
indexes again afterwards.



>
>> then CREATE EXTENSION IF NOT EXISTS pgstattuple; then do: SELECT *
>> FROM pgstattuple('<table_name>'); and the same again on the toast table.
>> If your table still contains many dead rows then perhaps an open
>> transaction is stopping rows from being turned into free space.
>
>
> I am not sure how to read the below.  I see a lot of "free_space" but not
> "dead":
>
> -[ RECORD 1 ]--+
> ?column?   | primary
> table_len  | 32055296
> tuple_count| 120764
> tuple_len  | 9470882
> tuple_percent  | 29.55
> dead_tuple_count   | 0
> dead_tuple_len | 0
> dead_tuple_percent | 0
> free_space | 20713580
> free_percent   | 64.62
> -[ RECORD 2 ]--+
> ?column?   | toast
> table_len  | 88802156544
> tuple_count| 15126830
> tuple_len  | 30658625743
> tuple_percent  | 34.52
> dead_tuple_count   | 0
> dead_tuple_len | 0
> dead_tuple_percent | 0
> free_space | 57653329312
> free_percent   | 64.92
>


It says your table can be reduced by about 60%.


>
>
>> Once pgstattuples reports that "tuple_len" from the table, its toast table
>> and all its indexes has been reduced to an acceptable value then you
>> should try a VACUUM FULL.  Remember that VACUUM FULL must also write
>> WAL, so if WAL is on the same volume, then you'll need to consider
>> space required for that when deciding how much data to remove from the
>> table.
>>
>
> WAL is on the same volume.  The PGDATA directory is mounted in a Docker
> container.
>
> Isn't there any way to do an in-place VACUUM or pause the WAL at the risk
> of losing some data if recovery is required?
>
> There is a catch-22 here.  I can't reclaim the disk space because that
> requires disk space.  Surely I'm not the first one to have encountered that
> problem with Postgres.
>
>
>>
>> > This is mostly transient data, so I don't mind deleting rows, but if
>> some day this could happen in production then I have to know how to deal
>> with it without losing all of the data.
>>
>> For the future, it would be better to delete more often than waiting
>> until the table grows too large.  A normal VACUUM will turn space used
>> by dead tuples back into free space, so if done often enough there
>> won't be a need to vacuum full.
>>
>
> ACK.  This issue came up while implementing a retention policy that will
> be enforced regularly.
>
> Thank you for all of your help,
>
> Igal
>
>


Re: 10.2: high cpu usage on update statement

2019-04-07 Thread Laurenz Albe
Kevin Wilkinson wrote:
> on 10.2, we're seeing very high cpu usage when doing an update statement 
> on a relatively small table (1GB). one of the updated columns is text, 
> about 1k bytes. there are four threads doing similar updates 
> concurrently to the same table (but different rows). each thread does an 
update about every two seconds, i.e., the table gets updated every 1/2 
> second. the stack trace below shows the process stuck in reading the 
> update results. this seems very odd. has anyone seen something similar?  
> this is a modest server of 8 cores, all of which are 90% busy.

Try to profile the server ("perf" on Linux) to see where the time is spent.

Are there any foreign key constraints pointing to the table being updated?
Then make sure that either no key column is updated or that the foreign
keys are indexed.
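
A sketch of how to list the foreign keys that point at the table ("mytable" is
illustrative); each referencing column should be indexed if the referenced key
values ever change:

  SELECT conname,
         conrelid::regclass  AS referencing_table,
         confrelid::regclass AS referenced_table
  FROM pg_constraint
  WHERE contype = 'f'
    AND confrelid = 'mytable'::regclass;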

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com