Size on disk of INT and BIGINT - not sure I'm getting it?

2021-05-28 Thread Pól Ua Laoínecháin
Hi all,

I did the following (Asus laptop, Intel core i5, SATA SSD):

BIGINT and INTEGER tables the same size? Comparing tables of 100M records.

Create tables:

=

test=# CREATE TABLE big_int (x BIGINT NOT NULL);
CREATE TABLE


test=# CREATE TABLE int_32 (y INT);
CREATE TABLE



Do my inserts:


=

test=# INSERT INTO big_int SELECT 1 FROM GENERATE_SERIES(1, 100000000);
INSERT 0 100000000

test=# INSERT INTO int_32 SELECT 1 FROM GENERATE_SERIES(1, 100000000);
INSERT 0 100000000
Time: 115153.540 ms (01:55.154)

=

Checks on my table/record sizes:


tables:



test=# SELECT pg_size_pretty( pg_total_relation_size('big_int'));
 pg_size_pretty
----------------
 3458 MB
(1 row)

Time: 0.494 ms



test=# SELECT pg_size_pretty( pg_total_relation_size('int_32'));
 pg_size_pretty
----------------
 3458 MB
(1 row)

Time: 0.828 ms

Oops! *Exactly* the same size!

==

Double checked and tried to be more precise:

=

Time: 0.639 ms
test=# SELECT pg_total_relation_size('big_int');
 pg_total_relation_size
------------------------
             3625803776
(1 row)

Time: 0.355 ms

test=# SELECT pg_total_relation_size('int_32');
 pg_total_relation_size
------------------------
             3625803776
(1 row)


Same size to the byte!!

=

Final check to see what's going on?

=

test=# SELECT SUM(pg_column_size(x))/100000000 FROM big_int;
 ?column?
----------
        8
(1 row)

Time: 3873.954 ms (00:03.874)
test=# SELECT SUM(pg_column_size(y))/100000000 FROM int_32;
 ?column?
----------
        4
(1 row)

Time: 3913.429 ms (00:03.913)
test=#
test=#

===

I'm trying to make sure that my table takes up 8 bytes for each entry
in the big_int table and 4 bytes for each entry in int_32.

I'd be grateful for

a) an explanation as to what's going on?

and

b) what do I have to do to ensure that records will take up the space
that I want them to?

This latter part would possibly be covered by a good answer to a).


Any decent references, URLs... &c. much appreciated - although not to
source code please - a bit above my pay grade! :-)


TIA and rgs,


Pól...


Re: Size on disk of INT and BIGINT - not sure I'm getting it?

2021-05-28 Thread Francisco Olarte
Pól:

On Fri, May 28, 2021 at 11:36 AM Pól Ua Laoínecháin  wrote:
> BIGINTEGER and INTEGER tables the same size? Comparing tables of 100M records.
...
> test=# CREATE TABLE big_int (x BIGINT NOT NULL);
> test=# CREATE TABLE int_32 (y INT);
Bear in mind a row has more data than just the fields (things like
xmin, xmax and some others), and there are alignment issues, among other
things.

If, for example, the whole row needs to be 8-byte aligned, then in a
synthetic example with a single field the rows may be the same length
in both cases; note how your tables occupy
3458 MB but the data is just 800/400 MB.
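Francisco's alignment point can be checked with back-of-the-envelope arithmetic. The sketch below (a rough illustration in Python; constants taken from the storage page layout documentation, ignoring the free-space and visibility map forks that pg_total_relation_size also counts) shows why a single int4 column ends up no smaller than a single int8 column:

```python
PAGE = 8192           # default block size
PAGE_HEADER = 24      # page header
LINE_POINTER = 4      # per-tuple item pointer
TUPLE_HEADER = 23     # heap tuple header (xmin, xmax, infomask, ...)
MAXALIGN = 8

def align(n, a=MAXALIGN):
    # round n up to the next multiple of a
    return -(-n // a) * a

def table_bytes(col_bytes, nrows):
    row = align(TUPLE_HEADER) + col_bytes  # header padded to 24, then data
    row = align(row)                       # whole tuple is MAXALIGN'd:
                                           # 24+4=28 -> 32, 24+8=32 -> 32
    per_page = (PAGE - PAGE_HEADER) // (row + LINE_POINTER)
    pages = -(-nrows // per_page)
    return pages * PAGE

# Both come out identical, within ~1 MB of the observed 3625803776
# (the remainder is roughly the FSM/VM fork overhead).
print(table_bytes(4, 100_000_000))  # 3624779776
print(table_bytes(8, 100_000_000))  # 3624779776
```

With a second column in the row, the padding would land differently and the two tables would start to diverge in size, which is why trying 2/4/8 fields per row is instructive.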

You may want to investigate putting in some more realistic data (i.e.,
try 2/4/8 fields per row) to gain some insight, and reading
https://www.postgresql.org/docs/current/storage-page-layout.html, which
gives some pointers on where to go next.

Francisco Olarte.




Modelling versioning in Postgres

2021-05-28 Thread Laura Smith
Hi

I was wondering what the current thinking is on ways to model versioning in 
Postgres.

The overall premise is that the latest version is the current version unless a 
rollback has occurred, in which case versions get tracked from the rollback 
point (forking ?).

My initial naïve starting point is something along the lines of :

create table objects (
objectID uuid,
versionID uuid,
versionTS timestamp,
objectData text
);

This obviously gives a fool-proof answer to "latest version is the current 
version" because it's a simple case of a "where objectID=x order by versionTS 
desc limit 1" query.  However, it clearly doesn't cover the rollback-to-prior 
scenarios.

I then thought about adding a simple "versionActive boolean".

But the problem with that is it needs hand-holding somewhere, because there can 
only be one active version, so it would introduce the need for an "active 
switch" script somewhere that activated the desired version and deactivated the 
others.  It is also perhaps not the right way to deal with tracking of changes 
post-rollback.

How have others approached the problem ?

N.B. If it makes any difference, I'm dealing with a 12.5 install here, but this 
could easily be pushed up to 13 if there are benefits.

Thanks for your time.

Laura




Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Vijaykumar Jain
Yes,
I too see growth when text type is used, but not when int or even fixed
size char(10) is used.

I always thought truncate was similar to delete + vacuum full,
but checking your scenarios, I did not see an update in
pg_stat_user_tables on truncate for vacuums.

Then I checked
PostgreSQL Internals: TRUNCATE (pykello.github.io)

to help understand truncation better.

But I still do not understand how a column of type text, which is
variable-length, results in memory growth (there are no rows inserted; I
understand that for long strings the db does work to compress them, move them
to toast tables, etc.), but these are empty rows.

Maybe someone else will be able to explain what is going on.




On Fri, 28 May 2021 at 06:52, Nick Muerdter  wrote:

> I've been seeing what looks like unbounded memory growth (until the OOM
> killer kicks in and kills the postgres process) when running a pl/pgsql
> function that performs TRUNCATE statements against various temporary tables
> in a loop. I think I've been able to come up with some fairly simple
> reproductions of the issue in isolation, but I'm trying to figure out if
> this is a memory leak or if I'm perhaps doing something wrong with tuning
> or other settings.
>
> What I've observed:
>
> - The memory growth occurs if the temp table has indexes or a primary key
> set on it.
> - Alternatively, the memory growth also occurs if the temp table has
> certain column types on it (eg, "text" types).
> - If the table doesn't have indexes and only has integer columns present,
> then the memory growth does *not* occur.
> - I originally saw this against a PostgreSQL 12 server, but I've tested
> this against PostgreSQL 9.6.22, 12.7, and 13.3 Docker containers and
> reproduced it against all versions in the containers.
>
> Here are 2 separate examples that seem to show the memory growth on the
> server (the first being a table with a "text" column, the second example
> having no text column but a primary key index):
>
> DO $$
>   DECLARE
> i bigint;
>   BEGIN
> CREATE TEMPORARY TABLE pg_temp.foo (id integer, bar text);
>
> FOR i IN 1..2 LOOP
>   TRUNCATE pg_temp.foo;
> END LOOP;
>   END
> $$
>
> DO $$
>   DECLARE
> i bigint;
>   BEGIN
> CREATE TEMPORARY TABLE pg_temp.foo (id integer);
> ALTER TABLE pg_temp.foo ADD PRIMARY KEY (id);
>
> FOR i IN 1..2 LOOP
>   TRUNCATE pg_temp.foo;
> END LOOP;
>   END
> $$
>
> Compare that to this example (which doesn't have an index or any other
> column types that trigger this), which does *not* show any memory growth:
>
> DO $$
>   DECLARE
> i bigint;
>   BEGIN
> CREATE TEMPORARY TABLE pg_temp.foo (id integer);
>
> FOR i IN 1..2 LOOP
>   TRUNCATE pg_temp.foo;
> END LOOP;
>   END
> $$
>
> Any help in determining what's going on here (or if there are other ways
> to go about this) would be greatly appreciated!
>
> Thank you!
> Nick
>
>
>

-- 
Thanks,
Vijay
Mumbai, India


Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Ravi Krishna

Truncate is not delete + vacuum.
It creates a new empty table, followed by a rename of the existing table to the 
new empty table, and finally dropping of the old table.






Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Ahmet Demir
Hi Ravi,

I am not sure about that:
"It creates a new empty table, followed by rename of the existing table to
the new empty table and finally dropping of the old table."

You mean the table is re-created with a new oid?

thanks
Ahmet



Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Ravi Krishna

> I am not sure about that:
> "It creates a new empty table, followed by rename of the existing table to the new
> empty table and finally dropping of the old table."
>
> You mean the table is re-created with a new oid?

I don't think the oid changes, but the relfilenode on disk changes. So let me 
rephrase it:


truncate does the following:


1 - create a new empty file on the disk.
2 - at commit time, map the table oid to the new empty file.
3 - drop the old file.



Re: Modelling versioning in Postgres

2021-05-28 Thread Michael van der Kolff
One thing you could consider is a range type for your "versionTS" field
instead of a single point in time.

So that would be:

CREATE TABLE objects (
  objectID uuid,
  versionID uuid,
  validRange tsrange,
  objectData text
);

See https://www.postgresql.org/docs/12.5/rangetypes.html for more
information.

In particular, you can enforce the obvious business rule, that there is no
objectID with overlapping validRanges (as long as you have the btree_gist
extension):

CREATE EXTENSION btree_gist;
CREATE TABLE objects (
  objectID uuid,
  versionID uuid,
  validRange tsrange,
  objectData text,
  EXCLUDE USING GIST(objectID WITH =, validRange WITH &&)
);
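The business rule the `&&` exclusion enforces (no two versions of the same object with overlapping valid ranges) can be illustrated outside SQL. A rough Python sketch of the overlap test, assuming the default half-open `[)` semantics of tsrange (names here are illustrative):

```python
from datetime import datetime

def overlaps(a, b):
    # tsrange '&&' semantics for half-open [start, end) ranges:
    # two ranges overlap iff each starts before the other ends
    return a[0] < b[1] and b[0] < a[1]

v1 = (datetime(2021, 1, 1), datetime(2021, 2, 1))
v2 = (datetime(2021, 2, 1), datetime(2021, 3, 1))
v3 = (datetime(2021, 1, 15), datetime(2021, 2, 15))

assert not overlaps(v1, v2)  # adjacent versions are allowed
assert overlaps(v1, v3)      # this pair would violate the constraint
```

A nice property of this model is that "the current version" is simply the row whose range has an open upper bound, and a rollback is just closing one range and opening a new one.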



Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Tom Lane
Vijaykumar Jain  writes:
> I too see growth when text type is used, but not when int or even fixed
> size char(10) is used.
> ...
> but then i still do not understand how a col type *text* which is dynamic
> results in mem growth (coz there are no rows inserted, i understand for
> long strings db does work to compress, move them to toast tables etc) but
> these are empty rows.

The text column would cause the table to have an associated toast table [1],
which in turn would have an index.  Both of those would be reallocated as
new files on-disk during TRUNCATE, just like the table proper.

A plausible theory here is that TRUNCATE leaks some storage associated
with an index's relcache entry, but not any for a plain table.

regards, tom lane

[1] https://www.postgresql.org/docs/current/storage-toast.html




Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Vijaykumar Jain
I tried to reproduce this, tracking memory allocation.

demo=# DO $$
DECLARE
  i bigint;
BEGIN
  CREATE TEMPORARY TABLE pg_temp.foo (id int) WITH (
    AUTOVACUUM_ENABLED = 0, TOAST.AUTOVACUUM_ENABLED = 0 );
  FOR i IN 1..2 LOOP
    TRUNCATE pg_temp.foo;
  END LOOP;
END
$$;

in a parallel tmux session.

strace -p 1620 --trace=memory

no movement/ no new output




when I replace the column type with text:

demo=# DO $$
DECLARE
  i bigint;
BEGIN
  CREATE TEMPORARY TABLE pg_temp.foo (id text) WITH (
    AUTOVACUUM_ENABLED = 0, TOAST.AUTOVACUUM_ENABLED = 0 );
  FOR i IN 1..2 LOOP
    TRUNCATE pg_temp.foo;
  END LOOP;
END
$$;

strace -p 1620 --trace=memory
strace: Process 1620 attached
--- SIGINT {si_signo=SIGINT, si_code=SI_USER, si_pid=1878, si_uid=1001} ---
brk(0x556c502ad000) = 0x556c502ad000
brk(0x556c502ed000) = 0x556c502ed000
brk(0x556c5036d000) = 0x556c5036d000
brk(0x556c5046d000) = 0x556c5046d000
brk(0x556c5066d000) = 0x556c5066d000
brk(0x556c50a6d000) = 0x556c50a6d000
brk(0x556c5126d000) = 0x556c5126d000

brk(2) - Linux manual page (man7.org)

It seems it does repeatedly allocate memory.
I am not a C developer :), so please ignore this if I am off track.






-- 
Thanks,
Vijay
Mumbai, India


Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Nick Muerdter
On Fri, May 28, 2021, at 7:22 AM, Tom Lane wrote:
> The text column would cause the table to have an associated toast table [1],
> which in turn would have an index.  Both of those would be reallocated as
> new files on-disk during TRUNCATE, just like the table proper.
> 
> A plausible theory here is that TRUNCATE leaks some storage associated
> with an index's relcache entry, but not any for a plain table.
> 
>   regards, tom lane
> 
> [1] https://www.postgresql.org/docs/current/storage-toast.html

Yeah, I forgot to mention this originally, but I see memory growth against a 
"varchar(501)" field, but *not* against a "varchar(500)" field, so I was 
wondering if there was some length threshold that triggered something with 
toast table behavior somehow involved. But if the toast table involves an 
index, then maybe all of this gets back to just the indexes like you say.

And I originally thought this issue was limited to temp tables, but now I'm not 
so sure. I seem to be able to reproduce the memory growth against regular 
tables (both normal and UNLOGGED) too:

DO $$
  DECLARE
i bigint;
  BEGIN
CREATE TABLE public.foo (id integer, bar text);

FOR i IN 1..2 LOOP
  TRUNCATE public.foo;
END LOOP;
  END
$$

The memory growth seems to be slower in this case, so maybe that's why I didn't 
catch it earlier, but I think it's maybe growing at the same rate, it's just 
that this loop goes slower against real tables than temp tables. For example, I 
see similar total memory growth by the time this reaches 100,000 loops for 
either temp or non-temp tables, the temp version just reaches that point a lot 
more quickly (which makes sense).

Thanks!
Nick




Re: How long to get a password reset ???

2021-05-28 Thread Adrian Klaver

On 5/27/21 8:42 PM, Dean Gibson (DB Administrator) wrote:

It's pretty simple:

 1. Not having used this mailing list for a while, I went to
https://lists.postgresql.org/ to make sure my settings were as I
wanted them.
 2. I attempted to log in with the above eMail address, which is
obviously part of this list, since I receive messages to that
address from this list.
 3. All known passwords failed.
 4. I clicked on the link to the "password reset" form, entered my above
eMail address, & pressed "Reset Password."  The response was a web
page that said I'd soon get an eMail for resetting my password.
 5. The rest is history.  Meaning zero history (response).  And yes, I
did it twice to make sure I didn't type the eMail address
incorrectly (I cut & pasted the 2nd time).  I can try again if it
will help (barring the usual definition of insanity).


I just tried it and it worked. You might want to check whether the email 
got caught in a Spam filter. Otherwise give it another try.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: How long to get a password reset ???

2021-05-28 Thread Magnus Hagander
On Fri, May 28, 2021 at 5:42 AM Dean Gibson (DB Administrator)
 wrote:
>
> It's pretty simple:
>
> Not having used this mailing list for a while, I went to 
> https://lists.postgresql.org/ to make sure my settings were as I wanted them.
> I attempted to log in with the above eMail address, which is obviously part 
> of this list, since I receive messages to that address from this list.
> All known passwords failed.
> I clicked on the link to the "password reset" form, entered my above eMail 
> address, & pressed "Reset Password."  The response was a web page that said 
> I'd soon get an eMail for resetting my password.
> The rest is history.  Meaning zero history (response).  And yes, I did it 
> twice to make sure I didn't type the eMail address incorrectly (I cut & 
> pasted the 2nd time).  I can try again if it will help (barring the usual 
> definition of insanity).
>
>
> That was six hours ago.  Since it looks like I (now) have satisfactory access 
> to the list (I was moderated for a few hours), I will post (in a separate 
> thread) my PostgreSQL issue, & wait for the next full moon (26 days) for my 
> password reset.

You do not appear to have an account under your ultimeth.com address,
or under the mailpen.com address you used in your first email.

However, if you have not used the website for a long time, it is
possible your subscription was migrated over from the old
installation, in which case it can exist without an email address. In
this case, you can just go ahead and create a new account with your
email address, and it will automatically get connected to your
existing subscriptions, and you will be set.

(And in general, please send questions like this to
webmas...@postgresql.org, so they don't have to go out to thousands of
people. They'll still be picked up of course, but that will be more
targeted.)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Adrian Klaver

On 5/27/21 8:41 PM, Dean Gibson (DB Administrator) wrote:
I started to use PostgreSQL v7.3 in 2003 on my home Linux systems (4 at 
one point), gradually moving to v9.0 w/ replication in 2010.  In 2017 I 
moved my 20GB database to AWS/RDS, gradually upgrading to v9.6, & was 
entirely satisfied with the result.


In March of this year, AWS announced that v9.6 was nearing end of 
support, & AWS would forcibly upgrade everyone to v12 on January 22, 
2022, if users did not perform the upgrade earlier.  My first attempt 
was successful as far as the upgrade itself, but complex queries that 
normally ran in a couple of seconds on v9.x, were taking minutes in v12.


Did you run a plain ANALYZE 
(https://www.postgresql.org/docs/12/sql-analyze.html) on the tables in 
the new install?




I didn't have the time in March to diagnose the problem, other than some 
futile adjustments to server parameters, so I reverted back to a saved 
copy of my v9.6 data.


On Sunday, being retired, I decided to attempt to solve the issue in 
earnest.  I have now spent five days (about 14 hours a day), trying 
various things.  Keeping the v9.6 data online for web users, I've 
"forked" the data into a new copy, & updated it in turn to PostgreSQL 
v10, v11, v12, & v13.  All exhibit the same problem: As you will see 
below, it appears that versions 10 & above are doing a sequential scan 
of some of the "large" (200K rows) tables. Note that the expected & 
actual run times for v9.6 & v13.2 both differ by more than *two orders 
of magnitude*. Rather than post a huge eMail (ha ha), I'll start with 
this one, that shows an "EXPLAIN ANALYZE" from both v9.6 & v13.2, 
followed by the related table & view definitions.  With one exception, 
table definitions are from the FCC (Federal Communications Commission); 
the view definitions are my own.







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Tom Lane
"Nick Muerdter"  writes:
> I've been seeing what looks like unbounded memory growth (until the OOM 
> killer kicks in and kills the postgres process) when running a pl/pgsql 
> function that performs TRUNCATE statements against various temporary tables 
> in a loop. I think I've been able to come up with some fairly simple 
> reproductions of the issue in isolation, but I'm trying to figure out if this 
> is a memory leak or if I'm perhaps doing something wrong with tuning or other 
> settings.

Hmm, so the "leak" occurs here:

#0  AllocSetAlloc (context=0x2b70820, size=528) at aset.c:730
#1  0x009617fc in MemoryContextAlloc (context=0x2b70820, 
size=size@entry=528) at mcxt.c:875
#2  0x009234b9 in AddInvalidationMessage (listHdr=0x2b70be8, 
msg=msg@entry=0x7ffdb35893d0) at inval.c:244
#3  0x009235c0 in AddRelcacheInvalidationMessage (relId=37554, 
dbId=, hdr=) at inval.c:414
#4  RegisterRelcacheInvalidation (dbId=, relId=37554)
at inval.c:520
#5  0x00923b87 in CacheInvalidateRelcacheByTuple (
classTuple=classTuple@entry=0x2bf82e8) at inval.c:1328
#6  0x005da654 in index_update_stats (rel=0x7f23b49eb4d0, 
hasindex=, reltuples=) at index.c:2872
#7  0x005dbca2 in index_build (
heapRelation=heapRelation@entry=0x7f23b49eb4d0, 
indexRelation=indexRelation@entry=0x7f23b49eb900, 
indexInfo=indexInfo@entry=0x2bb5250, isreindex=isreindex@entry=true, 
parallel=parallel@entry=false) at index.c:3051
#8  0x005d66be in RelationTruncateIndexes (
heapRelation=heapRelation@entry=0x7f23b49eb4d0) at heap.c:3305
#9  0x005d9875 in heap_truncate_one_rel (rel=rel@entry=0x7f23b49e72b0)
at heap.c:3387
#10 0x0068047f in ExecuteTruncateGuts (explicit_rels=0x2bb4b78, 
relids=, relids_logged=0x0, behavior=DROP_RESTRICT, 
restart_seqs=false) at tablecmds.c:1957
#11 0x00680b65 in ExecuteTruncate (stmt=0x2bcd538) at tablecmds.c:1709

That is, we're accumulating a record of system catalog invalidation
events, which have to be kept until end of transaction when they'll
be sent out to other backends.  (For the case of events on temp
tables, maybe that wouldn't be strictly necessary, but this certainly
can't be avoided for normal tables.)  So it's not really a leak, but
just a record of unfinished work caused by the TRUNCATEs.
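That "record of unfinished work" can be pictured with a toy model (illustrative Python; `Transaction`, `truncate` and `commit` here are hypothetical names, not PostgreSQL internals): each TRUNCATE appends invalidation messages to a per-transaction queue that is only drained at commit, so a long loop of TRUNCATEs inside one transaction grows memory linearly.

```python
class Transaction:
    """Toy model of per-transaction invalidation-message accumulation."""

    def __init__(self):
        self.inval_queue = []

    def truncate(self, rel, has_index):
        # Each TRUNCATE registers relcache invalidations that must be
        # retained until commit so other backends can be notified.
        self.inval_queue.append(("relcache", rel))
        if has_index:
            # extra message for the (toast) index's relcache entry
            self.inval_queue.append(("relcache", rel + "_index"))

    def commit(self):
        n = len(self.inval_queue)
        self.inval_queue.clear()  # memory is released only here
        return n

tx = Transaction()
for _ in range(1000):
    tx.truncate("foo", has_index=True)
print(len(tx.inval_queue))  # 2000 -- grows with every loop iteration
```

This matches the observed behavior: the growth is bounded by the transaction, not a true leak, and tables without indexes (in this model, `has_index=False` plus the table-side message being optimized away) accumulate nothing.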

However ... what seems odd is that we only get an inval message when
there's an index involved.  In principle such messages need to be
issued against the table as well.  I suspect that we're optimizing
away the message for the table, either on the grounds that it's temp
or that it's new in the current transaction; but the index code path
didn't get the same TLC.

Alternatively, maybe it's a bug that there's no message for the table.

regards, tom lane




WARNING: oldest xmin is far in the past

2021-05-28 Thread Alban Hertroys

Good day,

We have a PG 11.11 instance here that serves as a data warehouse for us. 
This morning I was investigating an issue with our ETLs and discovered 
this error in the logs, which keeps repeating:


2021-05-28 15:01:54.094 CEST [20164]   WARNING:  oldest xmin is far in 
the past
2021-05-28 15:01:54.094 CEST [20164]   HINT:  Close open transactions 
soon to avoid wraparound problems.
    You might also need to commit or roll back old prepared 
transactions, or drop stale replication slots.


We don't have any idle-in-transaction sessions, but we do have a 
replication slot that turns out to have been inactive for an unknown length of time.


The current situation around our xid's is this:

avbv=# select slot_name, slot_type, database, active, catalog_xmin, restart_lsn, confirmed_flush_lsn from pg_replication_slots ;
   slot_name   | slot_type | database | active | catalog_xmin |  restart_lsn  | confirmed_flush_lsn
---------------+-----------+----------+--------+--------------+---------------+---------------------
 debezium_prod | logical   | avbv     | t      |    616648922 | 1166/C45B5140 | 1167/65C7AA0
(1 row)

avbv=# select datname, datfrozenxid from pg_database ;
    datname    | datfrozenxid
---------------+--------------
 postgres      |    610128180
 speeltuin     |    610128180
 template1     |    610128180
 template0     |    591773830
 reportinfo    |    610128180
 avbv_20190314 |    610128180
 avbv          |    610128180
 ensfocus-tst  |    610128180
 ensfocus      |    610128180
 ensfocuswf8   |    610128180
 portal_prd    |    610128180
 portal_tst    |    610128180
(12 rows)

Clearly, the gap between the higher frozen xids (610128180) and the 
replication slot's xmin (616648922) is rather small: a mere ~6.5M xids 
apart.
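For concreteness, the distance between the slot's catalog_xmin and the frozen xids can be computed directly from the values shown above (plain arithmetic, no assumptions beyond the numbers in the query output):

```python
# Values from pg_replication_slots and pg_database above.
catalog_xmin = 616_648_922
datfrozenxid = 610_128_180

gap = catalog_xmin - datfrozenxid
print(gap)  # 6520742, i.e. ~6.5M transactions

# Wraparound danger is measured against a ~2^31 horizon; the warning
# fires well before the oldest xmin actually falls that far behind.
WRAP_LIMIT = 2**31
print(f"{gap / WRAP_LIMIT:.4f}")  # a small fraction of the horizon
```

So the slot's xmin is only a few million transactions ahead of the frozen xids; what matters for the warning is how far the oldest xmin lags the *current* xid, which is why the stale slot is the prime suspect.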


We have that single logical replication slot that Debezium subscribes 
to, to push committed records for some tables to Kafka. Those are tables 
that get frequent inserts, a batch of new records arrives about every 15 
minutes, 24/7.


As mentioned, initially when I detected this problem, the Debezium 
connector (the subscriber) had failed to attach. Restarting it fixed 
that (that's a known issue that was recently discovered in the current 
version 1.4.0). I had hopes the xmin issue would be gone once it caught 
up, but it did catch up earlier today and the issue remains...


I did already take several actions in attempts to solve the issue, so 
far to little avail:


* I restarted the database, closing any idle in transaction sessions 
that might have gone unnoticed otherwise
* I ran vacuumdb -a -U postgres, which printed a number of repetitions of 
the same error message on the console
* I ran vacuumdb -a -F -U postgres
* I added a heartbeat interval of 10000ms (10s) to the Debezium 
connector, although I didn't think that was necessary
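For anyone hitting the same warning: besides replication slots, the other 
usual suspects named in the HINT can be checked with the standard catalog 
views (a sketch):

```sql
-- Old prepared transactions that were never committed or rolled back:
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;

-- Backends pinning an old snapshot (oldest first):
SELECT pid, state, xact_start, age(backend_xmin) AS backend_xmin_age
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY backend_xmin_age DESC;
```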


Should I just wait for the replication slot xmin to increase into a safe 
area? It is slowly increasing, whereas the frozen xid's have remained the 
same for as long as I have been monitoring this issue.

Or is there some action I should take?



For the record:

avbv=# select version();
 version
--
 PostgreSQL 11.11 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 
20150623 (Red Hat 4.8.5-44), 64-bit

(1 row)


Regards,

Alban Hertroys

P.S. Sorry about below company disclaimer, there is nothing I can do 
about that.




Alban Hertroys 
D: 8776 |M:  |T: +31 (0)53 4888 888 | E: alban.hertr...@apollotyres.com

Apollo Vredestein B.V.| Ir. E.L.C. Schiffstraat 370, 7547 RD Enschede, The 
Netherlands
Chamber of Commerce number: 34223268

   
The information contained in this e-mail is intended solely for the use of the individual or entity to whom it is addressed. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or action in relation to the contents of this information is strictly prohibited and may be unlawful and request you to delete this message and any attachments and advise the sender by return e-mail. The confidentiality of this message is not warranted. Apollo Vredestein and its subsidiaries rule out any and every liability resulting from this or any other electronic transmission.
Please consider the environment before printing this e-mail 



Re: WARNING: oldest xmin is far in the past

2021-05-28 Thread Vijaykumar Jain
If the replication slot is still inactive, it will prevent vacuum from doing
the cleanup, no matter how often vacuum is run manually.

Did the slot show as active after the restart of the connector?

If it is active, then maybe increase maintenance_work_mem to a more
aggressive value and lower the autovacuum naptime, to ensure vacuum gets the
priority to get the cleanup done quickly.

https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
https://www.cybertec-postgresql.com/en/tuning-autovacuum-postgresql/
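Concretely, that tuning could look like the following; the values are
illustrative starting points, not recommendations:

```sql
-- Wake autovacuum up more often (default is 1min) and give vacuum more memory:
ALTER SYSTEM SET autovacuum_naptime = '10s';
ALTER SYSTEM SET maintenance_work_mem = '1GB';
SELECT pg_reload_conf();
```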



Re: How long to get a password reset ???

2021-05-28 Thread Ron

On 5/28/21 10:02 AM, Magnus Hagander wrote:

On Fri, May 28, 2021 at 5:42 AM Dean Gibson (DB Administrator)
 wrote:

It's pretty simple:

1. Not having used this mailing list for a while, I went to 
https://lists.postgresql.org/ to make sure my settings were as I wanted them.
2. I attempted to log in with the above eMail address, which is obviously 
part of this list, since I receive messages to that address from this list.
3. All known passwords failed.
4. I clicked on the link to the "password reset" form, entered my above 
eMail address, & pressed "Reset Password."  The response was a web page that 
said I'd soon get an eMail for resetting my password.
5. The rest is history.  Meaning zero history (response).  And yes, I did it 
twice to make sure I didn't type the eMail address incorrectly (I cut & 
pasted the 2nd time).  I can try again if it will help (barring the usual 
definition of insanity).


That was six hours ago.  Since it looks like I (now) have satisfactory access to 
the list (I was moderated for a few hours), I will post (in a separate thread) my 
PostgreSQL issue, & wait for the next full moon (26 days) for my password reset.

You do not appear to have an account under your ultimeth.com address,
or under the mailpen.com address you used in your first email.

However, if you have not used the website for a long time, it is
possible your subscription was migrated over from the old
installation, in which case it can exist without an email address.


This must be why I wasn't able to reset my password.


In this case, you can just go ahead and create a new account with your
email address, and it will automatically get connected to your
existing subscriptions, and you will be set.


This important info should be mentioned at https://www.postgresql.org/list/ 
to eliminate a lot of angst and confusion.


--
Angular momentum makes the world go 'round.




Re: TRUNCATE memory leak with temporary tables?

2021-05-28 Thread Tom Lane
I wrote:
> However ... what seems odd is that we only get an inval message when
> there's an index involved.  In principle such messages need to be
> issued against the table as well.  I suspect that we're optimizing
> away the message for the table, either on the grounds that it's temp
> or that it's new in the current transaction; but the index code path
> didn't get the same TLC.

Yeah, after a bit of digging, ExecuteTruncateGuts knows this:

 * Normally, we need a transaction-safe truncation here.  However, if
 * the table was either created in the current (sub)transaction or has
 * a new relfilenode in the current (sub)transaction, then we can just
 * truncate it in-place, because a rollback would cause the whole
 * table or the current physical file to be thrown away anyway.

The in-place-truncate code path isn't issuing any invalidation message,
which seems okay as far as I can think at the moment.  But that only
applies to truncating the table proper.  Indexes are "truncated" by
rebuilding them as empty, so that any required infrastructure such
as metapages will be recreated properly.  So that's largely the same
code path as a REINDEX, and index.c is careful to send out an inval
for that:

 * NOTE: an important side-effect of this operation is that an SI invalidation
 * message is sent out to all backends --- including me --- causing relcache
 * entries to be flushed or updated with the new data.  This must happen even
 * if we find that no change is needed in the pg_class row.  When updating
 * a heap entry, this ensures that other backends find out about the new
 * index.  When updating an index, it's important because some index AMs
 * expect a relcache flush to occur after REINDEX.

Maybe that could be optimized in this situation, but it's not totally
clear how, especially given the angle about index AM expectations.

Anyway, I don't see a lot of low-hanging fruit here as far as avoiding
storing such messages is concerned.

We could reduce the rate of the leak if inval.c were more stingy about
memory allocation for the invalidation message list.  In this particular
example, it's allocating space for 32 (FIRSTCHUNKSIZE) messages per
statement, but then only using 2.  So perhaps FIRSTCHUNKSIZE could be
reduced?  I'm not sure that anyone has ever made a survey of how many
messages different sorts of DDL tend to produce, so I suspect that number
was mostly plucked from the air.

A more aggressive idea that I've thought of but never done anything about
is that once we've accumulated more than a certain number of messages,
maybe we should just figure on issuing a full cache reset rather than
continuing to track individual inval events.  This'd not only bound
the storage required for pending invals, but very possibly be more
efficient when the time comes to actually process them.

regards, tom lane




Re: How different is AWS-RDS postgres?

2021-05-28 Thread Rob Sargent
Found it.  Though there are indeed separate transactions for the bulk 
copy vs. the target update, I had missed that there is an over-arching 
transaction in play, and of course when things go south, /all/ things go 
south.


I get away with this, for now by bumping the tomcat idle time to an 
hour, which is serious overkill.











Re: Modelling versioning in Postgres

2021-05-28 Thread Laura Smith
Thanks both for the interesting idea of using tsrange, but also for introducing 
me to EXCLUDE USING GIST, I had never heard of it before.

Have a good weekend

‐‐‐ Original Message ‐‐‐
On Friday, 28 May 2021 14:13, Michael van der Kolff  
wrote:

> One thing you could consider is a range type for your "versionTS" field 
> instead of a single point in time.
>
> So that would be:
>
> CREATE TABLE objects (
>   objectID uuid,
>   versionID uuid,
>   validRange tsrange,
>   objectData text
> );
>
> See https://www.postgresql.org/docs/12.5/rangetypes.html for more information.
>
> In particular, you can enforce the obvious business rule, that there is no 
> objectID with overlapping validRanges (as long as you have the btree_gist 
> extension):
>
> CREATE EXTENSION btree_gist;
> CREATE TABLE objects (
>   objectID uuid,
>   versionID uuid,
>   validRange tsrange,
>   objectData text,
>   EXCLUDE USING GIST(objectID WITH =, validRange WITH &&)
> );
>
> On Fri, May 28, 2021 at 8:20 PM Laura Smith 
>  wrote:
>
> > Hi
> >
> > I was wondering what the current thinking is on ways to model versioning in 
> > Postgres.
> >
> > The overall premise is that the latest version is the current version 
> > unless a rollback has occurred, in which case versions get tracked from the 
> > rollback point (forking ?).
> >
> > My initial naïve starting point is something along the lines of :
> >
> > create table objects (
> > objectID uuid,
> > versionID uuid,
> > versionTS timestamp,
> > objectData text
> > );
> >
> > This obviously creates a fool-proof answer to "latest version is the 
> > current version" because its a simple case of an "where objectID=x order by 
> > versionTS desc limit 1" query.  However it clearly doesn't cover the 
> > rollback to prior scenarios.
> >
> > I then though about adding a simple "versionActive boolean".
> >
> > But the problem with that is it needs hand-holding somewhere because there 
> > can only be one active version and so it would introduce the need for a 
> > "active switch" script somewhere that activated the desired version and 
> > deactivated the others.  It also perhaps is not the right way to deal with 
> > tracking of changes post-rollback.
> >
> > How have others approached the problem ?
> >
> > N.B. If it makes any difference, I'm dealing with a 12.5 install here, but 
> > this could easily be pushed up to 13 if there are benefits.
> >
> > Thanks for your time.
> >
> > Laura
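The exclusion constraint in the suggestion above can be exercised like this
(table as defined in the quoted message; the UUIDs are arbitrary examples):

```sql
INSERT INTO objects (objectID, versionID, validRange, objectData)
VALUES ('00000000-0000-0000-0000-000000000001',
        '00000000-0000-0000-0000-0000000000aa',
        tsrange('2021-01-01', '2021-02-01'), 'v1');

-- An overlapping range for the same objectID is rejected:
-- ERROR: conflicting key value violates exclusion constraint
INSERT INTO objects (objectID, versionID, validRange, objectData)
VALUES ('00000000-0000-0000-0000-000000000001',
        '00000000-0000-0000-0000-0000000000ab',
        tsrange('2021-01-15', '2021-03-01'), 'v2');
```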




Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Dean Gibson (DB Administrator)

On 2021-05-28 08:12, Adrian Klaver wrote:

On 5/27/21 8:41 PM, Dean Gibson (DB Administrator) wrote:
I started to use PostgreSQL v7.3 in 2003 on my home Linux systems (4 
at one point), gradually moving to v9.0 w/ replication in 2010.  In 
2017 I moved my 20GB database to AWS/RDS, gradually upgrading to 
v9.6, & was entirely satisfied with the result.


In March of this year, AWS announced that v9.6 was nearing end of 
support, & AWS would forcibly upgrade everyone to v12 on January 22, 
2022, if users did not perform the upgrade earlier. My first attempt 
was successful as far as the upgrade itself, but complex queries that 
normally ran in a couple of seconds on v9.x, were taking minutes in v12.


Did you run a plain 
ANALYZE(https://www.postgresql.org/docs/12/sql-analyze.html) on the 
tables in the new install?


After each upgrade (to 10, 11, 12, & 13), I did a "VACUUM FULL 
ANALYZE".  On 10 through 12, it took about 45 minutes & significant CPU 
activity, & temporarily doubled the size of the disk space required.  As 
you know, that disk space is not shrinkable under AWS's RDS.  On v13, it 
took 10 hours with limited CPU activity, & actually slightly less disk 
space required.




Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Ron

On 5/28/21 1:40 PM, Dean Gibson (DB Administrator) wrote:


After each upgrade (to 10, 11, 12, & 13), I did a "VACUUM FULL ANALYZE".  
On 10 through 12, it took about 45 minutes & significant CPU activity, & 
temporarily doubled the size of the disk space required.  As you know, 
that disk space is not shrinkable under AWS's RDS.  On v13, it took 10 
hours with limited CPU activity, & actually slightly less disk space 
required.


Under normal conditions, VACUUM FULL is pointless on a freshly-loaded 
database; in RDS, it's *anti-useful*.


That's why Adrian asked if you did a plain ANALYZE.
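For reference, the plain statistics refresh being suggested is just the
following (the table and column names in the second part are hypothetical):

```sql
-- Refresh planner statistics for every table in the database; no rewrite involved:
ANALYZE VERBOSE;

-- If one table's plans still look off, a larger sample can help:
ALTER TABLE some_table ALTER COLUMN some_column SET STATISTICS 500;
ANALYZE some_table;
```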

--
Angular momentum makes the world go 'round.


WAL accumulating, Logical Replication pg 13

2021-05-28 Thread Willy-Bas Loos
Hi , I'm upgrading a 1.5TB database from postgres 9.3 to postgres 13 on
Debian 10. This is now in an Acceptance stage (DTAP). I have encountered a
problem: the WAL is not being deleted. I now have 1.4 TB of WAL in pg_wal
and my disks are getting full. The oldest WAL file is 18 days old.
I use Logical Replication from the new cluster to another new cluster with
1 subscriber and 1 subscription.

pg_stat_subscription tells me all recent timestamps.
and this:
db=# select * from pg_replication_slots;
-[ RECORD 1 ]---+-
slot_name   | my_pub1
plugin  | pgoutput
slot_type   | logical
datoid  | 16401
database| db
temporary   | f
active  | t
active_pid  | 9480
xmin|
catalog_xmin| 269168
restart_lsn | D4/908BC268
confirmed_flush_lsn | E1/25BF5710
wal_status  | extended
safe_wal_size   |



I've had problems with diskspace on this server, with postgres crashing
because of it, then added more diskspace and postgres recovered. This
doesn't seem to be a problem now.

The *publication* has the options publish = 'insert, update, delete,
truncate', publish_via_partition_root = false
The *subscription* has the options connect = true, enabled = true,
create_slot = false, slot_name = my_pub1, synchronous_commit = 'off'

The log on the publisher says:
2021-05-25 21:25:18.973 CEST [4584] user@db LOG:  starting logical decoding
for slot "my_pub1"
2021-05-25 21:25:18.973 CEST [4584] user@db DETAIL:  Streaming transactions
committing after D6/A82B5FE0, reading WAL from D4/908BC268.
2021-05-25 21:25:18.973 CEST [4584] user@db LOG:  logical decoding found
consistent point at D4/908BC268
2021-05-25 21:25:18.973 CEST [4584] user@db DETAIL:  There are no running
transactions.
2021-05-25 21:29:49.456 CEST [4614] user@db ERROR:  replication slot
"my_pub1" is active for PID 4584
2021-05-25 21:29:54.474 CEST [4615] user@db ERROR:  replication slot
"my_pub1" is active for PID 4584

And on the subscriber:
2021-05-28 21:23:46.702 CEST [40039] LOG:  logical replication apply worker
for subscription "my_pub1" has started
2021-05-28 21:23:46.712 CEST [40039] ERROR:  could not start WAL streaming:
ERROR:  replication slot "my_pub1" is active for PID 730
2021-05-28 21:23:46.714 CEST [19794] LOG:  background worker "logical
replication worker" (PID 40039) exited with exit code 1

The postgres settings on the *publisher* are:
max_connections = 100 # (change requires restart)
tcp_keepalives_idle = 120 # TCP_KEEPIDLE, in seconds;
shared_buffers = 50GB # min 128kB
work_mem = 1GB # min 64kB
maintenance_work_mem = 10GB # min 1MB
logical_decoding_work_mem = 5GB # min 64kB
dynamic_shared_memory_type = posix # the default is the first option
max_worker_processes = 20 # (change requires restart)
max_parallel_maintenance_workers = 10 # taken from max_parallel_workers
max_parallel_workers_per_gather = 5 # taken from max_parallel_workers
max_parallel_workers = 15 # maximum number of max_worker_processes that
wal_level = logical # minimal, replica, or logical
max_wal_size = 1GB
min_wal_size = 80MB
#archive_mode = off
max_wal_senders = 10 # max number of walsender processes
wal_sender_timeout = 1min # in milliseconds; 0 disables
max_replication_slots = 7 # max number of replication slots

On postgres settings on the *subscriber*:
max_connections = 100 # (change requires restart)
tcp_keepalives_idle = 120 # TCP_KEEPIDLE, in seconds;
shared_buffers = 25GB # min 128kB
work_mem = 1GB # min 64kB
maintenance_work_mem = 10GB # min 1MB
logical_decoding_work_mem = 5GB # min 64kB
dynamic_shared_memory_type = posix # the default is the first option
max_worker_processes = 20 # (change requires restart)
max_parallel_maintenance_workers = 10 # taken from max_parallel_workers
max_parallel_workers_per_gather = 5 # taken from max_parallel_workers
max_parallel_workers = 15 # maximum number of max_worker_processes that
wal_level = logical # minimal, replica, or logical
max_wal_size = 3GB
min_wal_size = 80MB
#archive_mode = off
wal_receiver_timeout = 1min # time that receiver waits for
max_logical_replication_workers = 10 # taken from max_worker_processes
max_sync_workers_per_subscription = 5 # taken from
max_logical_replication_workers

I've tried increasing wal_sender_timeout and wal_receiver_timeout to 10
minutes each, but this had no positive effect.
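For what it's worth, the amount of WAL the slot forces the server to retain
can be read straight off pg_replication_slots (this assumes PG 13, as in the
setup described above):

```sql
-- WAL retained per slot, from its restart_lsn to the current insert position:
SELECT slot_name, active, wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```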

Some advice would be helpful
-- 
Willy-Bas Loos


Re: WAL accumulating, Logical Replication pg 13

2021-05-28 Thread Vijaykumar Jain
I am not too sure about 9.3.
I tried an upgrade from 9.6 to 11 using logical replication (pg_logical
extension).

One thing to note: logical replication initiates a copy from a snapshot,
then streams changes from then on.

I had a very high insert rate on my source tables (v9.6), and the
destination (v11) could not keep up; it had tons of indexes when I copied
the schema, and the initial copy took around a day as the table had around
12 indexes.

So at the destination (v11), I dropped all but the primary key index for
each table, started the subscription, and when it had almost caught up,
rebuilt the indexes on the destination concurrently.
That completed in 4-5 hours without stopping the source.
The migration completed in a few mins :)

not sure if this would help, but just FYI.
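A sketch of that approach on the destination; the index and table names are
hypothetical:

```sql
-- Before the initial table copy, drop the secondary indexes:
DROP INDEX CONCURRENTLY idx_events_created_at;

-- ... enable the subscription, wait until it has nearly caught up, then
-- rebuild without blocking writes:
CREATE INDEX CONCURRENTLY idx_events_created_at ON events (created_at);
```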



Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Dean Gibson (DB Administrator)

On 2021-05-28 12:38, Ron wrote:

Under normal conditions, VACUUM FULL is pointless on a freshly-loaded 
database; in RDS, it's *anti-useful*.


That's why Adrian asked if you did a plain ANALYZE.


Just now did.  No change in EXPLAIN ANALYZE output.



Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Ron

On 5/28/21 5:06 PM, Dean Gibson (DB Administrator) wrote:


Just now did.  No change in EXPLAIN ANALYZE output.


Did it run in less than 10 hours?

--
Angular momentum makes the world go 'round.


Re: AWS forcing PG upgrade from v9.6 a disaster

2021-05-28 Thread Dean Gibson (DB Administrator)

On 2021-05-28 16:51, Ron wrote:

Did it run in less than 10 hours?



The original VACUUM FULL ANALYZE ran in 10 hours.  The plain ANALYZE ran 
in 88 seconds.