Index no longer being used, destroying and recreating it restores use.

2020-06-04 Thread Koen De Groote
Greetings,

The following is using version 11.2 of PostgreSQL.

I've got a table with about 30 million rows and a particular index that up
until recently was actively being used.

And then it stopped being used, and the query that the index was made for
is now doing sequential scans.

Deleting the index and creating it again seems to fix the problem. The new
index, which is identical in composition, is being used and the query in
question no longer uses sequential scans.


It's the exact same query and the index is identical in composition. Yet
after a while the database stops using it. I'd like to find out why that is
and how to prevent it.


Also, I'm assuming this is the correct list for such a question?


In the past, I had asked this somewhere else, but then no longer had time
to spend on it: https://dba.stackexchange.com/questions/264237/

Some data I gathered then:

   1. Size of the index not being used is 101MB.
   2. Size of the index being used is 16MB.
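
For reference, a minimal sketch of the kind of queries that produce such
numbers (the index and table names are taken from later in this thread):

-- size of a single index
SELECT pg_size_pretty(pg_relation_size('index_in_question'));

-- how often each index on the table is being scanned
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE relname = 'item';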

The query takes the form of:

"select * from myTable where bool1 = true and bool2 = false and timestamp
<= ('timestampField'::timestamp without time zone) order by stringField
asc, id asc limit 100 offset 3;"

3 is an example value.
-

Here is the "explain analyze" for index used:
https://explain.depesz.com/s/H5X9y
-

Here is the "explain analyze" for index not used:
https://explain.depesz.com/s/n6bP

And I'm frankly stumped. An index growing from 16MB to 101MB isn't that big
of an increase, I would think? Is that the reason it's no longer being
used? Or is something else going on here?

The entire database to which this table belongs undergoes a "vacuum
analyze" every single night, which takes about 8 minutes. Do I perhaps need
to do something additional in terms of cleanup/maintenance?

I've tried altering statistics, to very large values even, but no changes
there either.
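
For context, "altering statistics" here means statements along these lines
(table and column names are as used later in the thread; the target value
is just an example):

ALTER TABLE item ALTER COLUMN itemCreated SET STATISTICS 1000;
ANALYZE item;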

Any help or suggestion would be appreciated.

Kind regards,
Koen De Groote


Re: Index no longer being used, destroying and recreating it restores use.

2020-06-05 Thread Koen De Groote
I've collected all relevant info (I think so, at least) and put it here:

The table in question is used to keep filepath data, of files on a
harddrive.
The query in question is used to retrieve items which should be backed up,
but have not yet been.

The relevant columns of the table:

                                 Table "public.item"
      Column      |            Type             | Collation | Nullable |             Default
------------------+-----------------------------+-----------+----------+----------------------------------
 id               | bigint                      |           | not null | nextval('item_id_seq'::regclass)
 shouldbebackedup | boolean                     |           | not null | true
 backupperformed  | boolean                     |           | not null | false
 itemCreated      | timestamp without time zone |           |          | now()
 filepath         | text                        |           |          |


The existing index, which no longer gets used:
"index_in_question" btree (shouldbebackedup, backupperformed, itemCreated,
filepath) WHERE shouldbebackedup = true AND backupperformed = false

The new index, made out of the exact same columns and conditions, gets used
immediately after creation:
CREATE INDEX CONCURRENTLY index_test ON item USING btree (shouldbebackedup,
backupperformed, itemCreated, filepath) WHERE shouldbebackedup = true AND
backupperformed = false;


The query in question will look something like this:
select * from item where shouldbebackedup=true and itemCreated<='2020-06-05
00:00:00.000' and backupperformed=false order by filepath asc, id asc limit
100 offset 10400;

Having done a count, there are around 13000 items here, without the offset
and limit.
That being said, the amount is entirely dependent on what was added on a
previous day.


I tried creating an extended statistic, like this, but it had no effect:
CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed
FROM item;

Settings from the conf file that I think are related:

shared_buffers = 1024MB
effective_cache_size = 2048MB
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 32MB

Finally, I state again that this database gets a nightly "vacuum analyze".

My thanks for looking at this and any suggestions one might have.

Regards,
Koen

On Thu, Jun 4, 2020 at 7:08 PM Tom Lane  wrote:

> Adrian Klaver  writes:
> > On 6/4/20 9:43 AM, Tom Lane wrote:
> >> It's possible that the index had bloated to the point where the planner
> >> thought it was cheaper to use a seqscan.  Did you make a note of the
> >> cost estimates for the different plans?
>
> > I missed the part where the OP pointed to a SO question. In that
> > question where links to explain.depesz.com output.
>
> Ah, I didn't bother to chase that link either.
>
> So the cost estimates are only a fraction of a percent apart, making
> it unsurprising for not-so-large changes in the index size to cause
> a flip in the apparently-cheapest plan.  The real question then is
> why the cost estimates aren't actually modeling the real execution
> times very well; and I'd venture that that question boils down to
> why is this rowcount estimate so far off:
>
> >->  Parallel Seq Scan on oscar mike_three
> > (cost=0.000..1934568.500 rows=2385585 width=3141) (actual
> > time=159.800..158018.961 rows=23586 loops=3)
> >Filter: (four AND (NOT bravo) AND (zulu <=
> > 'echo'::timestamp without time zone))
> >Rows Removed by Filter: 8610174
>
> We're not going to be able to answer that if the OP doesn't wish
> to decloak his data a bit more ... but a reasonable guess is that
> those filter conditions are correlated.  With late-model Postgres
> you might be able to improve matters by creating extended statistics
> for this table.
>
> regards, tom lane
>


Re: Index no longer being used, destroying and recreating it restores use.

2020-06-05 Thread Koen De Groote
The requested result: https://explain.depesz.com/s/G7mU

Also, the data from the statistic itself:

=> SELECT stxname, stxkeys, stxdependencies
->   FROM pg_statistic_ext
->   WHERE stxname = 's1';
 stxname | stxkeys | stxdependencies
---------+---------+-----------------
 s1      | 29 35   |


On Fri, Jun 5, 2020 at 4:15 PM Adrian Klaver 
wrote:

> On 6/5/20 7:05 AM, Koen De Groote wrote:
> > I've collected all relevant info(I think so at least) and put it here:
> >
> > The table in question is used to keep filepath data, of files on a
> > harddrive.
> > The query in question is used to retrieve items which should be backed
> > up, but have not yet been.
> >
> > The relevant columns of the table:
> >
> >  Table "public.item"
> > Column   |Type |
> > Collation | Nullable |   Default
> >
> +-+---+--+--
> >   id | bigint  |
> >| not null | nextval('item_id_seq'::regclass)
> >   shouldbebackedup   | boolean |
> >| not null | true
> >   backupperformed| boolean |
> >| not null | false
> >   itemCreated| timestamp without time zone |
> >|  | now()
> >   filepath   | text|
> >|  |
> >
> >
> > The existing index, which no longer gets used:
> > "index_in_question" btree (shouldbebackedup, backupperformed,
> > itemCreated, filepath) WHERE shouldbebackedup = true AND backupperformed
> > = false
> >
> > The new index, made out of the exact same columns and conditions, get
> > used immediately after creation:
> > CREATE INDEX CONCURRENTLY index_test ON item USING btree
> > (shouldbebackedup, backupperformed, itemCreated, filepath) WHERE
> > shouldbebackedup = true AND backupperformed = false;
> >
> >
> > The query in question will look something like this:
> > select * from item where shouldbebackedup=true and
> > itemCreated<='2020-06-05 00:00:00.000' and backupperformed=false order
> > by filepath asc, id asc limit 100 offset 10400;
>
> The result of EXPLAIN ANALYZE for above.
>
> >
> > Having done a count, there are around 13000 items here, without the
> > offset and limit.
> > That being said, the amount is entirely dependant on what was added on a
> > previous day.
> >
> >
> > I tried creating an extended statistic, like this, but it had no effect:
> > CREATE STATISTICS s1 (dependencies) ON shouldbebackedup, backupperformed
> > FROM item;
> >
> > Settings from the conf file I think are related:
> >
> > shared_buffers = 1024MB
> > effective_cache_size = 2048MB
> > random_page_cost = 1.1
> > effective_io_concurrency = 200
> > work_mem = 32MB
> >
> > Finally, I state again that this database gets a nightly "vacuum
> analyze".
> >
> > My thanks for looking at this and any suggestions one might have.
> >
> > Regards,
> > Koen
> >
> > On Thu, Jun 4, 2020 at 7:08 PM Tom Lane  > <mailto:t...@sss.pgh.pa.us>> wrote:
> >
> > Adrian Klaver  > <mailto:adrian.kla...@aklaver.com>> writes:
> >  > On 6/4/20 9:43 AM, Tom Lane wrote:
> >  >> It's possible that the index had bloated to the point where the
> > planner
> >  >> thought it was cheaper to use a seqscan.  Did you make a note of
> the
> >  >> cost estimates for the different plans?
> >
> >  > I missed the part where the OP pointed to a SO question. In that
> >  > question where links to explain.depesz.com
> > <http://explain.depesz.com> output.
> >
> > Ah, I didn't bother to chase that link either.
> >
> > So the cost estimates are only a fraction of a percent apart, making
> > it unsurprising for not-so-large changes in the index size to cause
> > a flip in the apparently-cheapest plan.  The real question then is
> > why the cost estimates aren't actually modeling the real execution
> > times very well; and I'd venture that that question boils down to
> > why is this rowcount estimate so far off:
> >

Re: Index no longer being used, destroying and recreating it restores use.

2020-06-06 Thread Koen De Groote
I'll attempt this next week.

On Fri, Jun 5, 2020, 21:11 Michael Lewis  wrote:

> Those row estimates are pretty far off.
>
> Standard indexes and partial indexes don't get custom statistics created
> on them, but functional indexes do. I wonder if a small function
> needs_backup( shouldbebackedup, backupperformed ) and an index created on
> that function would nicely alleviate the pain. I would expect PG12 and
> extended statistics of type MCVs would bypass the need for that work around
> though.
>


Re: Index no longer being used, destroying and recreating it restores use.

2020-06-08 Thread Koen De Groote
So, this query:

select * from item where shouldbebackedup=true and itemCreated<='2020-06-05
00:00:00.000' and backupperformed=false order by filepath asc, id asc limit
100 offset 10400;

Was made into a function:

create or replace function NeedsBackup(text, int, int default 100)
returns setof item as $$
BEGIN
return query select * from item where shouldbebackedup=true and
itemCreated<=$1::timestamp without time zone and backupperformed=false
order by filepath asc, id asc limit $3 offset $2;
END;
$$
language 'plpgsql';

Having read the documentation, I do think this is STABLE and not VOLATILE?
While the external process performing the query will eventually update
these rows, setting backupperformed to true instead of false, that doesn't
happen within this function.
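
For what it's worth, a sketch of how the volatility could be declared
explicitly, assuming the signature above (it can also be added right after
the LANGUAGE clause in the CREATE statement):

ALTER FUNCTION NeedsBackup(text, int, int) STABLE;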

Doing an EXPLAIN ANALYZE yields this output:

 QUERY PLAN

---
 Function Scan on public.needsbackup  (cost=0.25..10.25 rows=1000
width=5275) (actual time=80.107..80.115 rows=100 loops=1)
   Output: id, shouldbebackedup, backupperformed, itemCreated, filepath
   Function Call: needsbackup('2020-06-08 12:00:00.016'::text, 2100, 100)
   Buffers: shared read=1572
 Planning Time: 0.199 ms
 Execution Time: 80.499 ms
(6 rows)

And the actual output is identical to the full query.

I'm kind of worried I'm not seeing any more details as to how it all
worked. Yes, it's as fast as I hoped, but that also happened last time when
I re-created the index. Fast for a few weeks, then suddenly not anymore.

I tried looking up how to create indexes on functions, but found nothing
that could help, only suggestions that this is not possible.

So, at this point, what should I still look at, I wonder? EXPLAIN says it
did a function call, but what happened under the hood there?

And before I forget: thank you most kindly for the advice so far, to all
involved.

Regards,
Koen


On Sun, Jun 7, 2020 at 12:45 AM Koen De Groote  wrote:

> I'll attempt this next week.
>
> On Fri, Jun 5, 2020, 21:11 Michael Lewis  wrote:
>
>> Those row estimates are pretty far off.
>>
>> Standard indexes and partial indexes don't get custom statistics created
>> on them, but functional indexes do. I wonder if a small function
>> needs_backup( shouldbebackedup, backupperformed ) and an index created on
>> that function would nicely alleviate the pain. I would expect PG12 and
>> extended statistics of type MCVs would bypass the need for that work around
>> though.
>>
>


Re: Index no longer being used, destroying and recreating it restores use.

2020-06-09 Thread Koen De Groote
Right. In that case, the function I ended up with is this:

create or replace function still_needs_backup(bool, bool)
returns BOOLEAN as $$
BEGIN
PERFORM 1 from item where shouldbebackedup=$1 and backupperformed=$2;
  IF FOUND THEN
RETURN TRUE;
  ELSE
RETURN FALSE;
  END IF;
END;
$$
language plpgsql;


And the index for that should be this, I think: "CREATE INDEX CONCURRENTLY
index_test ON item USING btree (still_needs_backup, itemCreated, filepath)
WHERE still_needs_backup(true, false) = true;"
However postgres throws an error here, saying "ERROR:  functions in index
predicate must be marked IMMUTABLE".

I tried it also without the first argument, same error.

And I don't think I can do that, because the return is not IMMUTABLE. It is
at best STABLE, but certainly not IMMUTABLE.

So yeah, I'm probably not understanding the suggestion properly? Either
way, I still have questions about the earlier function I created, namely
how reliable its performance is, and whether the same thing will happen as
with the re-created index.

Regards,
Koen


On Mon, Jun 8, 2020 at 11:15 PM Michael Lewis  wrote:

> On Mon, Jun 8, 2020 at 2:34 PM Koen De Groote  wrote:
>
>> So, this query:
>>
>> select * from item where shouldbebackedup=true and
>> itemCreated<='2020-06-05 00:00:00.000' and backupperformed=false order by
>> filepath asc, id asc limit 100 offset 10400;
>>
>> Was made into a function:
>>
>> create or replace function NeedsBackup(text, int, int default 100)
>> returns setof item as $$
>> BEGIN
>> return query select * from item where shouldbebackedup=true and
>> itemCreated<=$1::timestamp without time zone and backupperformed=false
>> order by filepath asc, id asc limit $3 offset $2;
>> END;
>> $$
>> language 'plpgsql';
>>
>
>
> What I had meant was a function perhaps called backup_needed_still(
> backupperformed bool, shouldbebackedup bool) which would return bool; This
> could be written in SQL only with no need for plpgsql. By the way, the
> language name being in single quotes is deprecated.
>
> Then you could create an index on the table- either on the timestamp
> column where that function returns true, or just creating the index
> directly on the boolean result of that function call if that is what is
> needed to get the custom stats from a functional index. Then you would
> include the function call in your query instead of the two individual
> boolean columns.
>


Re: Index no longer being used, destroying and recreating it restores use.

2020-06-16 Thread Koen De Groote
Alright, I've done that, and that seems to be a very good result:
https://explain.depesz.com/s/xIph

The method I ended up using:

create or replace function still_needs_backup(shouldbebackedup bool,
backupperformed bool)
returns BOOLEAN as $$
   select $1 AND NOT $2;
$$
language sql immutable;

And the index is as suggested.
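
For reference, a sketch of that index, based on the suggestion quoted
below:

CREATE INDEX CONCURRENTLY index_test ON item USING btree (itemCreated)
WHERE still_needs_backup(shouldbebackedup, backupperformed);
ANALYZE item;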

It seems the row estimates have improved.

Thank you for your help. I wasn't aware functions could interact with
indexes in such a manner.

Regards,
Koen De Groote

On Mon, Jun 15, 2020 at 8:27 PM Michael Lewis  wrote:

> On Tue, Jun 9, 2020 at 6:24 AM Koen De Groote  wrote:
>
>> Right. In that case, the function I ended up with is this:
>>
>> create or replace function still_needs_backup(bool, bool)
>> returns BOOLEAN as $$
>> BEGIN
>> PERFORM 1 from item where shouldbebackedup=$1 and backupperformed=$2;
>>   IF FOUND THEN
>> RETURN TRUE;
>>   ELSE
>> RETURN FALSE;
>>   END IF;
>> END;
>> $$
>> language plpgsql;
>>
>
> I meant something like the below (not tested)-
>
> create or replace function still_needs_backup(shouldbebackedup bool,
> backupperformed bool)
> returns BOOLEAN as $$
> BEGIN
>return $1 AND NOT $2;
> END;
> $$
> language sql;
>
> CREATE INDEX CONCURRENTLY index_test ON item USING btree (itemCreated)
> WHERE still_needs_backup(shouldbebackedup, backupperformed);
> ANALYZE item;
>
>>


Clarification on Expression indexes

2020-06-16 Thread Koen De Groote
Greetings all.

The following page:
https://www.postgresql.org/docs/11/indexes-expressional.html

States the following:

Index expressions are relatively expensive to maintain, because the derived
> expression(s) must be computed for each row upon insertion and whenever it
> is updated
>

I'd like to get an idea of "relatively expensive". For instance, compared
to a partial index that filters on one or more boolean values, as opposed
to making a function that performs the identical calculation.

Let's say that over the lifetime of a row, it rarely gets updated more than
2000 times, and the majority of this is right after creation?

Is this a concern?
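
For concreteness, a sketch of the two shapes I'm comparing, reusing the
table and function from the earlier thread (index names are made up):

-- partial index, filtering directly on the boolean columns
CREATE INDEX CONCURRENTLY idx_item_partial ON item (itemCreated, filepath)
WHERE shouldbebackedup AND NOT backupperformed;

-- expression index, where the derived value must be computed for the row
-- on every insert and update
CREATE INDEX CONCURRENTLY idx_item_expr ON item
(still_needs_backup(shouldbebackedup, backupperformed), itemCreated);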

Regards,
Koen De Groote


In case of network issues, how long before archive_command does retries

2022-05-18 Thread Koen De Groote
I've got a setup where archive_command will gzip the wal archive to a
directory that is itself an NFS mount.

When the connection is gone or blocked, archive_command fails after the
timeout specified by the NFS mount, as expected (for a soft mount; a hard
mount hangs, as expected).

However, on restoring connection, it's not clear to me how long it takes
before the command is retried.

Experience says "a few minutes", but I can't find documentation on an exact
algorithm.

To be clear, the question is: if archive_command fails, what are the
specifics of retrying? Is there a timeout? How is that timeout defined?

Is this detailed somewhere? Perhaps in the source code? I couldn't find it
in the documentation.

For detail, I'm using postgres 11, running on Ubuntu 20.

Regards,
Koen


Reasons for not overwriting processed wal archives?

2022-05-18 Thread Koen De Groote
The documentation here:
https://www.postgresql.org/docs/11/continuous-archiving.html

States:

> It is advisable to test your proposed archive command to ensure that it
indeed does not overwrite an existing file, *and that it returns nonzero
status in this case*.

Why exactly is this?

Assuming a situation in which the files, as part of the archive_command,
are gzipped to their final location.
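
For concreteness, a minimal sketch of the guarded form the documentation
advises, assuming a hypothetical archive directory /mnt/nfs/wal_archive
(the same thing could be set in postgresql.conf instead):

ALTER SYSTEM SET archive_command =
  'test ! -f /mnt/nfs/wal_archive/%f.gz && gzip < %p > /mnt/nfs/wal_archive/%f.gz';
SELECT pg_reload_conf();

The leading test is what makes the command return nonzero status when a
file with that name already exists.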

If the gzip fails, or stops halfway, perhaps due to a network issue...
you're left with a file that isn't the full representation of the wal
archive.

So it needs to be done again.

In cases where the file is simply moved, this can also occur. WAL size can
be configured, and transfers might fail.

So they'd have to be done again.

If we first test for a file with that name being present, and refuse to go
forward if there is, an admin is going to have to manually intervene and
delete the half-finished file.

So isn't it better to just overwrite?

What scenario am I missing here?

Regards,
Koen


Re: In case of network issues, how long before archive_command does retries

2022-05-19 Thread Koen De Groote
Hello Laurenz,

Thanks for the reply. That would mean the source code is here:
https://github.com/postgres/postgres/blob/REL_11_0/src/backend/postmaster/pgarch.c

Just to be sure, the "signal" you speak of, this is the result of the
command executed by archive_command?

If my understanding of the code is right, then as long as no SIGTERM or
other signal arrives, it won't ever happen that a WAL archive is skipped
because the archive_command fails too many times or takes too long? It will
simply check again every 60 seconds (PGARCH_AUTOWAKE_INTERVAL)? Or is the
60 seconds the point where it stops trying, waiting for the next time
archive_command is invoked?

I'm assuming that as long as the file is still in the pg_wal directory and
as long as there is no ".done" file for that WAL archive under
pg_wal/archive_status, it will keep trying forever (or until someone
forcefully switches the timeline with, for instance, a basebackup)?
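
As a side note, a sketch of a query that should show whether archiving is
currently failing and which file it last tried:

SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;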

Apologies, I already sent this message once, but only to Laurenz. Sending
again to have it in the archives.

Regards,
Koen

On Thu, May 19, 2022 at 9:10 AM Laurenz Albe 
wrote:

> On Wed, 2022-05-18 at 22:51 +0200, Koen De Groote wrote:
> > I've got a setup where archive_command will gzip the wal archive to a
> directory that is itself an NFS mount.
> >
> > When connection is gone or blocked, archive_command fails after the
> timeout specified by the NFS mount, as expected. (for a soft mount. hard
> mount hangs, as expected)
> >
> > However, on restoring connection, it's not clear to me how long it takes
> before the command is retried.
> >
> > Experience says "a few minutes", but I can't find documentation on an
> exact algorithm.
> >
> > To be clear, the question is: if archive_command fails, what are the
> specifics of retrying? Is there a timeout? How is that timeout defined?
> >
> > Is this detailed somewhere? Perhaps in the source code? I couldn't find
> it in the documentation.
> >
> > For detail, I'm using postgres 11, running on Ubuntu 20.
>
> You can find the details in "src/backend/postmaster/pgarch.c".
>
> The archiver will try to archive three times (NUM_ARCHIVE_RETRIES) in an
> interval
> of one second, then back off until it receives a signal, PostgreSQL shutd
> down
> or a minute has passed.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>


Re: In case of network issues, how long before archive_command does retries

2022-05-20 Thread Koen De Groote
Thank you for your thorough explanation.

On Thu, May 19, 2022 at 5:47 PM Laurenz Albe 
wrote:

> On Thu, 2022-05-19 at 15:43 +0200, Koen De Groote wrote:
> > On Thu, May 19, 2022 at 9:10 AM Laurenz Albe 
> wrote:
> > > On Wed, 2022-05-18 at 22:51 +0200, Koen De Groote wrote:
> > > > When connection is gone or blocked, archive_command fails after the
> timeout specified
> > > > by the NFS mount, as expected. (for a soft mount. hard mount hangs,
> as expected)
> > > >
> > > > However, on restoring connection, it's not clear to me how long it
> takes before the command is retried.
> > > >
> > > > Experience says "a few minutes", but I can't find documentation on
> an exact algorithm.
> > > >
> > > > To be clear, the question is: if archive_command fails, what are the
> specifics of retrying?
> > > > Is there a timeout? How is that timeout defined?
> > > >
> > > > Is this detailed somewhere? Perhaps in the source code? I couldn't
> find it in the documentation.
> > > >
> > > > For detail, I'm using postgres 11, running on Ubuntu 20.
> > >
> > > You can find the details in "src/backend/postmaster/pgarch.c".
> > >
> > > The archiver will try to archive three times (NUM_ARCHIVE_RETRIES) in
> an interval
> > > of one second, then back off until it receives a signal, PostgreSQL
> shutd down
> > > or a minute has passed.
> >
> > Thanks for the reply. That would mean the source code is here:
> >
> https://github.com/postgres/postgres/blob/REL_11_0/src/backend/postmaster/pgarch.c
>
> For release 11.0, yes.
>
> > Just to be sure, the "signal" you speak of, this is the result of the
> command executed by archive_command?
>
> No, that is an operating system signal.
> PostgreSQL processes communicate by sending signals to each other, and if
> anybody
> wakes up the archiver, it will try again.
>
> > If my understanding of the code is right, if no SIGTERM or other signal
> arrives, it won't ever happen
> > that a walarchive is skipped if the archive_command fails too many times
> or takes too long? It
> > will simply check again every 60 seconds(PGARCH_AUTOWAKE_INTERVAL) ? Or
> is the 60 seconds the point
> > where it stops trying, waiting for the next time archive_command is
> invoked?
>
> Even if a signal arrives, PostgreSQL will keep trying to archive that same
> WAL segment
> that failed until it is done.
>
> This is a potential sequence of events:
>
>   try to archive -> fail
>   sleep 1 second
>   try to archive -> fail
>   sleep 1 second
>   try to archive -> fail
>   sleep 60 seconds
>   try to archive -> fail
>   sleep 1 second
>   try to archive -> fail
>   sleep 1 second
>   try to archive -> fail
>   sleep 60 seconds -> get woken up by a signal after 30 seconds
>   try to archive -> fail
>   sleep 1 second
>   try to archive -> fail
>   get shutdown request -> exit
>
> When PostgreSQL restarts, it will continue trying to archive the same
> segment.
>
> > I'm assuming that as long as the file is still in the pg_wal directory
> and as long as there is no
> > ".done" file for that walarchive under pg_wal/archive_status, it will
> keep trying forever(or until
> > someone forcefully switches the timeline with for instance a basebackup)?
>
> Yes, it will keep trying, and a timeline switch won't change that.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
>


Question regarding failover behavior

2022-08-23 Thread Koen De Groote
Hello all,

I have a system that was originally set up on 9.3, a few years ago moved to
11.2

A shared mount is used to host the basebackup and wal archives.

The failover procedure was basically manual and as follows:

1/ Take out the primary, remove the IP from the primary machine/VM
2/ Create the trigger_file on the standby, add the IP to the new primary
machine/VM
3/ Create a basebackup from the new primary
4/ Redeploy the new standby, which will unpack this basebackup from step 3
5/ The new standby will have a restore_command to get synced back up.

I see there's more recent tools like pg_rewind these days. Haven't had time
to check that out and probably won't for a while.

My question is: is the basebackup actually necessary? Can I not simply
modify the configuration of the old primary, set it up like a standby, then
start the process/container, and have it pick up the necessary WAL archives
with the restore_command?

Or is it more complicated than that? Something like a timeline jump or
something else that makes it impossible to simply set the old primary to
standby and start it again?

Kind regards,
Koen De Groote


Re: Question regarding failover behavior

2022-08-24 Thread Koen De Groote
Updating to say I tested and ran into the issue where the timeline switched
from 0C to 0D.

Trying to bring the old primary back up as standby fails. It recovers up to
a point and then gets stuck asking for the next file, which doesn't exist.

This happens regardless of whether I take the existing data directory or
restore a new basebackup.

Taking a new basebackup, with the new primary, and restoring that, works.
Standby gets in sync and everything is as expected.

Kind regards,
Koen De Groote



On Wed, Aug 24, 2022 at 1:09 AM Koen De Groote  wrote:

> Hello all,
>
> I have a system that was originally set up on 9.3, a few years ago moved
> to 11.2
>
> A shared mount is used to host the basebackup and wal archives.
>
> The failover procedure was basically manual and as follow:
>
> 1/ Take out the primary, remove the IP from the primary machine/VM
> 2/ Create the trigger_file on the standby, add the IP to the new primary
> machine/VM
> 3/ Create a basebackup from the new primary
> 4/ Redeploy the new standby, which will unpack this basebackup from step 3
> 5/ The new standby will have a restore_command to get synced back up.
>
> I see there's more recent tools like pg_rewind these days. Haven't had
> time to check that out and probably won't for a while.
>
> My question is: is the basebackup actually necessary? Can I not simply
> modify the configuration old the old primary, set it up like a standby, and
> then start the process/container, and have it pick up the necessary
> wal_archives with the restore_command?
>
> Or is it more complicated than that? Something like a timeline jump or
> something else that makes it impossible to simply set the old primary to
> standby and start it again?
>
> Kind regards,
> Koen De Groote
>


Questions on logical replication

2024-06-04 Thread Koen De Groote
I recently read the entire documentation on logical replication, but am
left with a question on the buildup of WAL

On this page:
https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT

It is written: " When dropping a subscription, the remote host is not
reachable. In that case, disassociate the slot from the subscription
using ALTER
SUBSCRIPTION before attempting to drop the subscription. If the remote
database instance no longer exists, no further action is then necessary.
If, however, the remote database instance is just unreachable, the
replication slot (and any still remaining table synchronization slots)
should then be dropped manually; otherwise it/they would continue to
reserve WAL and might eventually cause the disk to fill up. Such cases
should be carefully investigated."


Assuming a situation where I add tables 1 at a time to the publisher, and
refresh the subscription every time.

What happens if I shut down the subscriber database for a while? The
subscription isn't dropped, so am I reading it right that the disk on the
publisher will slowly be filling up with WAL? Isn't that always the case if
WAL is enabled?
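
For what it's worth, a sketch of a query on the publisher that should show
how much WAL each replication slot is currently holding back:

SELECT slot_name, slot_type, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
         AS retained_wal
FROM pg_replication_slots;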

This "cause disk to fill up" warning is quite concerning, and I'd like to
understand what could cause it and how likely it is? I thought logical
replication uses WAL by default, so doesn't that mean there has to be a log
of changes kept anyhow? Even if the WAL isn't written to disk by an
"archive_command"?

Regards,
Koen De Groote


Re: Questions on logical replication

2024-06-04 Thread Koen De Groote
Reading this:
https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS

" Replication slots provide an automated way to ensure that the primary
does not remove WAL segments until they have been received by all standbys,
and that the primary does not remove rows which could cause a recovery
conflict
<https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT>
even when the standby is disconnected. "

Am I to understand that a subscription is considered that same as a
standby, in this context?

On Wed, Jun 5, 2024 at 12:55 AM Koen De Groote  wrote:

> I recently read the entire documentation on logical replication, but am
> left with a question on the buildup of WAL
>
> On this page:
> https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
>
> It is written: " When dropping a subscription, the remote host is not
> reachable. In that case, disassociate the slot from the subscription using 
> ALTER
> SUBSCRIPTION before attempting to drop the subscription. If the remote
> database instance no longer exists, no further action is then necessary.
> If, however, the remote database instance is just unreachable, the
> replication slot (and any still remaining table synchronization slots)
> should then be dropped manually; otherwise it/they would continue to
> reserve WAL and might eventually cause the disk to fill up. Such cases
> should be carefully investigated."
>
>
> Assuming a situation where I add tables 1 at a time to the publisher, and
> refresh the subscription every time.
>
> What happens if I shut down the subscriber database for a while? The
> subscription isn't dropped, so am I reading it right that the disk on the
> publisher will slowly be filling up with WAL? Isn't that always the case if
> wall is enabled?
>
> This "cause disk to fill up" warning is quite concerning, and I'd like to
> understand what could cause it and how likely it is? I thought logical
> replication uses WAL by default, so doesn't that mean there has to be a log
> of changes kept anyhow? Even if the WAL isn't written to disk by an
> "archive_command"?
>
> Regards,
> Koen De Groote
>


Re: Questions on logical replication

2024-06-05 Thread Koen De Groote
>
> https://www.postgresql.org/docs/current/wal-configuration.html
>
> "Checkpoints are points in the sequence of transactions at which it is
> guaranteed that the heap and index data files have been updated with all
> information written before that checkpoint. At checkpoint time, all
> dirty data pages are flushed to disk and a special checkpoint record is
> written to the WAL file. (The change records were previously flushed to
> the WAL files.) In the event of a crash, the crash recovery procedure
> looks at the latest checkpoint record to determine the point in the WAL
> (known as the redo record) from which it should start the REDO
> operation. Any changes made to data files before that point are
> guaranteed to be already on disk. Hence, after a checkpoint, WAL
> segments preceding the one containing the redo record are no longer
> needed and can be recycled or removed. (When WAL archiving is being
> done, the WAL segments must be archived before being recycled or removed.)"
>

And this is the same for logical replication and physical replication, I
take it.

Thus, if a leader has a standby of the same version, and meanwhile logical
replication is being done to a newer version, both those replications are
taken into account, is that correct?


> When you set up logical replication you are 'asking' via the replication
> slot that WAL records be kept on the publisher until the subscriber
> retrieves them.

And if it cannot sync them, due to connectivity loss for instance, the WAL
records will not be removed, then?

Regards,
Koen De Groote


On Wed, Jun 5, 2024 at 1:05 AM Adrian Klaver 
wrote:

> On 6/4/24 15:55, Koen De Groote wrote:
> > I recently read the entire documentation on logical replication, but am
> > left with a question on the buildup of WAL
> >
> > On this page:
> >
> https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
> <
> https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
> >
> >
> > It is written: " When dropping a subscription, the remote host is not
> > reachable. In that case, disassociate the slot from the subscription
> > using |ALTER SUBSCRIPTION| before attempting to drop the subscription.
> > If the remote database instance no longer exists, no further action is
> > then necessary. If, however, the remote database instance is just
> > unreachable, the replication slot (and any still remaining table
> > synchronization slots) should then be dropped manually; otherwise
> > it/they would continue to reserve WAL and might eventually cause the
> > disk to fill up. Such cases should be carefully investigated."
> >
> >
> > Assuming a situation where I add tables 1 at a time to the publisher,
> > and refresh the subscription every time.
> >
> > What happens if I shut down the subscriber database for a while? The
> > subscription isn't dropped, so am I reading it right that the disk on
> > the publisher will slowly be filling up with WAL? Isn't that always the
> > case if wall is enabled?
>
> https://www.postgresql.org/docs/current/wal-configuration.html
>
> "Checkpoints are points in the sequence of transactions at which it is
> guaranteed that the heap and index data files have been updated with all
> information written before that checkpoint. At checkpoint time, all
> dirty data pages are flushed to disk and a special checkpoint record is
> written to the WAL file. (The change records were previously flushed to
> the WAL files.) In the event of a crash, the crash recovery procedure
> looks at the latest checkpoint record to determine the point in the WAL
> (known as the redo record) from which it should start the REDO
> operation. Any changes made to data files before that point are
> guaranteed to be already on disk. Hence, after a checkpoint, WAL
> segments preceding the one containing the redo record are no longer
> needed and can be recycled or removed. (When WAL archiving is being
> done, the WAL segments must be archived before being recycled or removed.)"
>
> >
> > This "cause disk to fill up" warning is quite concerning, and I'd like
> > to understand what could cause it and how likely it is? I thought
> > logical replication uses WAL by default, so doesn't that mean there has
> > to be a log of changes kept anyhow? Even if the WAL isn't written to
> > disk by an "archive_command"?
>
>
> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS
>
> "Replication slots provide an automated way to ensure that th

Re: Questions on logical replication

2024-06-06 Thread Koen De Groote
I'll give them a read, though it might take a few weekends

Meanwhile, this seems to be what I'm looking for:

From
https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS

" Replication slots provide an automated way to ensure that the primary
does not remove WAL segments until they have been received by all standbys,
and that the primary does not remove rows which could cause a recovery
conflict
<https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT>
even when the standby is disconnected."

I'm reading that as: "if there is a replication slot, if the standby is
disconnected, WAL is kept"

And if we know WAL is kept in the "pg_wal" directory, that sounds like it
could slowly but surely fill up disk space.


But again, I'll give them a read. I've read all of logical replication
already, and I feel like I didn't get my answer there.

Thanks for the help


Regards,
Koen De Groote

On Thu, Jun 6, 2024 at 12:19 AM Adrian Klaver 
wrote:

> On 6/5/24 14:54, Koen De Groote wrote:
> > https://www.postgresql.org/docs/current/wal-configuration.html
> > <https://www.postgresql.org/docs/current/wal-configuration.html>
> >
> > "Checkpoints are points in the sequence of transactions at which it
> is
> > guaranteed that the heap and index data files have been updated with
> > all
> > information written before that checkpoint. At checkpoint time, all
> > dirty data pages are flushed to disk and a special checkpoint record
> is
> > written to the WAL file. (The change records were previously flushed
> to
> > the WAL files.) In the event of a crash, the crash recovery procedure
> > looks at the latest checkpoint record to determine the point in the
> WAL
> > (known as the redo record) from which it should start the REDO
> > operation. Any changes made to data files before that point are
> > guaranteed to be already on disk. Hence, after a checkpoint, WAL
> > segments preceding the one containing the redo record are no longer
> > needed and can be recycled or removed. (When WAL archiving is being
> > done, the WAL segments must be archived before being recycled or
> > removed.)"
> >
> >
> > And this is the same for logical replication and physical replication, I
> > take it.
>
> High level explanation, both physical and logical replication use the
> WAL files as the starting point. When the recycling is done is dependent
> on various factors. My suggestion would be to read through the below to
> get a better idea of what is going. There is a lot to cover, but if you
> really want to understand it you will need to go through it.
>
> Physical replication
>
> https://www.postgresql.org/docs/current/high-availability.html
>
> 27.2.5. Streaming Replication
> 27.2.6. Replication Slots
>
> Logical replication
>
> https://www.postgresql.org/docs/current/logical-replication.html
>
> WAL
>
> https://www.postgresql.org/docs/current/wal.html
>
>
>
> >
> > Thus, if a leader has a standby of the same version, and meanwhile
> > logical replication is being done to a newer version, both those
> > replications are taken into account, is that correct?
>
> Yes, see links above.
>
>
> > And if it cannot sync them, due to connectivity loss for instance, the
> > WAL records will not be removed, then?
>
> Depends on the type of replication being done. It is possible for
> physical replication to have WAL records removed that are still needed
> downstream.
>
> From
>
>
> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION
>
> "If you use streaming replication without file-based continuous
> archiving, the server might recycle old WAL segments before the standby
> has received them. If this occurs, the standby will need to be
> reinitialized from a new base backup. You can avoid this by setting
> wal_keep_size to a value large enough to ensure that WAL segments are
> not recycled too early, or by configuring a replication slot for the
> standby. If you set up a WAL archive that's accessible from the standby,
> these solutions are not required, since the standby can always use the
> archive to catch up provided it retains enough segments."
>
> This is why it is good idea to go through the links I posted above.
>
> >
> > Regards,
> > Koen De Groote
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Questions on logical replication

2024-06-08 Thread Koen De Groote
What I'm trying to do is upgrade a PG11 database to PG16, using logical
replication.

The PG11 has an active and a standby, and there are a handful of databases.
One particular database has a few tables just over 100GB, then a few
hundred tables near 1GB.

What I'd do is start a publication with no tables and add them 1 at a time,
refreshing subscription each time.

This might take a long time, so my main questions relate to potential
network issues or various situations where the instance receiving the
logical replication suddenly stops being able to receive.

Resyncing, and the effects of WAL buildup, are my main concern.

Accidentally sent a mail to only your email, sorry for that.

Regards,
Koen De Groote

On Fri, Jun 7, 2024 at 5:15 PM Adrian Klaver 
wrote:

> On 6/6/24 15:19, Koen De Groote wrote:
> > I'll give them a read, though it might take a few weekends
> >
> > Meanwhile, this seems to be what I'm looking for:
> >
> >  From
> >
> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS
> <
> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS
> >
> >
> > " Replication slots provide an automated way to ensure that the primary
> > does not remove WAL segments until they have been received by all
> > standbys, and that the primary does not remove rows which could cause a
> > recovery conflict
> > <
> https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT>
> even when the standby is disconnected."
> >
> > I'm reading that as: "if there is a replication slot, if the standby is
> > disconnected, WAL is kept"
> >
> > And if we know WAL is kept in the "pg_wal" directory, that sounds like
> > it could slowly but surely fill up disk space.
> >
> >
> > But again, I'll give them a read. I've read all of logical replication
> > already, and I feel like I didn't get my answer there.
>
> It would be a good idea to provide an a fairly specific outline of what
> you are trying to achieve, then it would be easier for folks to offer
> suggestions on what to do or not to do.
>
> >
> > Thanks for the help
> >
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Questions on logical replication

2024-06-11 Thread Koen De Groote
> Have you looked at pg_upgrade?:

I have, but I want to keep downtime to a minimum, and from my understanding
switching over to a fully synced logical replica only requires updating
your sequences, which should be possible in less than 60 seconds.
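
The sequence step I have in mind is roughly this, per sequence, on the new
primary (sketch only, sequence and table names are just examples):

SELECT setval('item_id_seq', (SELECT max(id) FROM item));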

> 1 GB each?

Yes, each. Roughly around there.


On Sat, Jun 8, 2024 at 7:46 PM Adrian Klaver 
wrote:

> On 6/8/24 10:40, Koen De Groote wrote:
> > What I'm trying to do is upgrade a PG11 database to PG16, using logical
> > replication.
>
> Have you looked at pg_upgrade?:
>
> https://www.postgresql.org/docs/current/pgupgrade.html
>
> >
> > The PG11 has an active and a standby, there are a handful of databases.
> > On particular one has a few tables just over 100GB, then a few 100
> > tables near 1GB.
>
> 1 GB each?
>
> >
> > What I'd do is start a publication with no tables and add them 1 at a
> > time, refreshing subscription each time.
> >
> > This might take a long time, so my main questions relate to potential
> > network issues or various situations where the instance receiving the
> > logical replication, suddenly stop being able to receive.
> >
> > Resyncing, and the effects of WAL buildup, are my main concern.
> >
> > Accidentally sent a mail to only your email, sorry for that.
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Questions on logical replication

2024-06-11 Thread Koen De Groote
> If there are any errors during the replay of WAL such as missing indexes
for Replica Identities during an Update or Delete  this will cause the main
subscriber worker slot on the publisher to start backing up WAL files

And also if the connection breaks, from what I understand, is that correct?
Anything that stops the subscription, including disabling the subscription,
is that right?

> I suggest confirming all tables have replica identities or primary keys
before going any further.

Yes, I am aware of this. I made a small script that prints which tables I
have added to the publication and are done syncing, and which are currently
not being replicated.
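
For reference, the kind of query such a script can be built around on the
subscriber side, where 'r' means the table is fully synced and replicating:

SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
ORDER BY 1;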

> With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan on
the subscriber for PG 15 and earlier.

I'm also aware of this. My plan is to create a publication with no tables,
and add them 1 by 1, refreshing the subscriber each time.

I'm not planning on using "REPLICA IDENTITY FULL" anywhere.


On Sat, Jun 8, 2024 at 10:33 PM Justin  wrote:

>
> On Sat, Jun 8, 2024 at 1:41 PM Koen De Groote  wrote:
>
>> What I'm trying to do is upgrade a PG11 database to PG16, using logical
>> replication.
>>
>> The PG11 has an active and a standby, there are a handful of databases.
>> On particular one has a few tables just over 100GB, then a few 100 tables
>> near 1GB.
>>
>> What I'd do is start a publication with no tables and add them 1 at a
>> time, refreshing subscription each time.
>>
>> This might take a long time, so my main questions relate to potential
>> network issues or various situations where the instance receiving the
>> logical replication, suddenly stop being able to receive.
>>
>> Resyncing, and the effects of WAL buildup, are my main concern.
>>
>> Accidentally sent a mail to only your email, sorry for that.
>>
>> Regards,
>> Koen De Groote
>>
>>>
>>>
> This approach does not prevent WAL build up.
>
> The WAL build up occurs during the initial sync worker once that table is
> synced the WAL is replayed and released.   The parent worker then become
> responsible for replaying the WAL for that table
>
> The WAL build up is during the initial sync of the data by table NOT
> during the entire synce of all the tables that have been published.
>
> For 1 gb table the initial sync will be very fast so I doubt any
> individual table will cause any significant WAL build up to put the
> publisher at risk of of crashing
>
> Once a table becomes synced the main subscriber worker keeps the WAL
> replayed.  If there are any errors during the replay of WAL such as missing
> indexes for Replica Identities during an Update or Delete  this will cause
> the main subscriber worker slot on the publisher to start backing up WAL
> files. If there are missing replica identities the affected tables will
> have to be dropped from the publication and subscription refreshed.  The
> WAL  file is already written with incorrect information so the table on the
> subscriber table is most likely not in recoverable state.
>
> I suggest confirming all tables have replica identities or primary keys
> before going any further.With PG 11 avoid REPLICA IDENTITY FULL as this
> causes full table scan on the subscriber for PG 15 and eariler.  PG 16 on
> the subsciber can use a different unique index that has NOT NULL for all
> participating columns if the publisher is using  Replicate Identity FULL on
> the published table
>
> One must understand the above before deploying logical replication.
>
> Hope this helps
>
>


Re: Questions on logical replication

2024-06-13 Thread Koen De Groote
> Why?  what benefit does this provide you??   Add all the tables when
creating the publication and be done with it...  I get this when trying to
understand how this all works on test boxes, but for production NO idea
what you're trying to accomplish

Adding all tables at once means adding the gigantic tables as well. Disk
I/O and network traffic are a serious concern, as is increased CPU usage
affecting queries of the live system, as well as transaction wraparound.

Initial sync can be a serious concern, depending on the size of the table.

Here's a nice guide where people did a logical replication upgrade,
explaining why they did it this way:
https://knock.app/blog/zero-downtime-postgres-upgrades

On Wed, Jun 12, 2024 at 7:01 PM Justin  wrote:

>
>
> On Tue, Jun 11, 2024 at 5:43 PM Koen De Groote  wrote:
>
>> > If there are any errors during the replay of WAL such as missing
>> indexes for Replica Identities during an Update or Delete  this will cause
>> the main subscriber worker slot on the publisher to start backing up WAL
>> files
>>
>> And also if the connection breaks, from what I understand, is that
>> correct? Anything that stops the subscription, including disabling the
>> subscription, is that right?
>>
>
> Yes to all
>
>
>> > I suggest confirming all tables have replica identities or primary keys
>> before going any further.
>>
>> Yes, I am aware of this. I made me a small script that prints which
>> tables I have added to the publication and are done syncing, and which are
>> currently not being replicated.
>>
>
>
>>
>> > With PG 11 avoid REPLICA IDENTITY FULL as this causes full table scan
>> on the subscriber for PG 15 and earlier.
>>
>> I'm also aware of this. My plan is to create a publication with no
>> tables, and add them 1 by 1, refreshing the subscriber each time.
>>
>
> Why?  what benefit does this provide you??   Add all the tables when
> creating the publication and be done with it...  I get this when trying to
> understand how this all works on test boxes, but for production NO idea
> what you're trying to accomplish
>
>
>> I'm not planning on using "REPLICA IDENTITY FULL" anywhere.
>>
> Good
>


Is a VACUUM or ANALYZE necessary after logical replication?

2024-06-15 Thread Koen De Groote
I've gone over all of
https://www.postgresql.org/docs/current/logical-replication.html and the
only mentions of the word "index" I could find was in relation to replica
identity and examples of table definitions showing primary key indexes.

Nothing is said about indexes. Maybe for good reason, maybe they are fully
functional immediately after replication?

So the main question: Once a table is fully replicated, do I need to
vacuum(analyze) that table, or are the indexes on that table already
functional?

Regards,
Koen De Groote


Understanding conflicts on publications and subscriptions

2024-07-30 Thread Koen De Groote
Reading this document:
https://www.postgresql.org/docs/16/logical-replication-conflicts.html

There is talk of the "disable_on_error" option when creating a subscription.
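
For reference, a sketch of where that option sits, with placeholder names:

CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=publisher dbname=mydb user=replicator'
  PUBLICATION my_pub
  WITH (disable_on_error = true);

-- or, on an existing subscription
ALTER SUBSCRIPTION my_sub SET (disable_on_error = true);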

The conflicts this applies to are, I am assuming, only conflicts caused on
the side of the subscription?

That is, an attempt to apply new data fails because of modifications made
since the initial copy, is that correct?


I'm a bit confused by errors on the side of the publisher. Reading this
document: https://www.postgresql.org/docs/16/sql-createpublication.html

It states:

> The tables added to a publication that publishes UPDATE and/or DELETE 
> operations
must have REPLICA IDENTITY defined. Otherwise those operations will be
disallowed on those tables.

This is not related to the subscription option "disable_on_error", I take
it?

Because it sure would be nice if there was a way to do a similar thing for
the publication, disabling it on error.

Am I getting this right? "disable_on_error" is only on subscription, and
errors on the publishers related to replica identity are not tied to that?

Thanks for your time.


Re: Understanding conflicts on publications and subscriptions

2024-07-30 Thread Koen De Groote
Just to add a thought:

If the subscriber gets a bit of logic to say "Something went wrong, so I'm
automatically stopping what I'm doing", it sounds logical to give the
publisher the same ability.

On Tue, Jul 30, 2024 at 3:47 PM Koen De Groote  wrote:

> Reading this document:
> https://www.postgresql.org/docs/16/logical-replication-conflicts.html
>
> There is talk of the "disable_on_error" option when creating a
> subscription.
>
> The conflicts this applies to, I am assuming are only conflicts caused on
> the side of the subscription?
>
> As an attempt to apply new data doesn't work, because of modifications
> made since the initial copy, is that correct?
>
>
> I'm a bit confused by errors on the side of the publisher. Reading this
> document: https://www.postgresql.org/docs/16/sql-createpublication.html
>
> It states:
>
> > The tables added to a publication that publishes UPDATE and/or DELETE 
> > operations
> must have REPLICA IDENTITY defined. Otherwise those operations will be
> disallowed on those tables.
>
> This is not related to the subscription option "disable_on_error", I take
> it?
>
> Because it sure would be nice if there was a way to do a similar thing for
> the subscription, disabling it on error.
>
> Am I getting this right? "disable_on_error" is only on subscription, and
> errors on the publishers related to replica identity are not tied to that?
>
> Thanks for your time.
>


Re: Understanding conflicts on publications and subscriptions

2024-07-30 Thread Koen De Groote
David,

By "addition" do you mean "adding the table to the publication"? I suppose
that's an option, though I was more thinking about disabling the
publication if an error occurs, similarly to how a subscription is disabled
if "disable_on_error" is set to true, and an error occurs there.

However, thinking about that is fantasizing, at this point.

My main worry is understanding the behavior as it is. And if my
understanding is correct: if a table doesn't have a replica identity, any
UPDATE or DELETE statement that happens on the publisher, for that table,
will be refused.

Is that correct?

Regards,
Koen


On Tue, Jul 30, 2024 at 4:04 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tuesday, July 30, 2024, Koen De Groote  wrote:
>>
>> If the subscriber gets a bit of logic to say "Something went wrong, so
>> I'm automatically stopping what I'm doing", it sounds logical to give the
>> publisher the same ability.
>>
>
> The wording for that option is:
>  Specifies whether the subscription should be automatically disabled if
> any errors are detected by subscription workers during data replication
> from the publisher.
>
> A subscription worker has no clue what the publisher is doing.  It
> operates on the “when I see data I act on it” model.
>
> As for whether the publisher should have this clause - the errors in
> question are logical, data-oriented, errors, which the publisher is
> incapable of having.
>
> I believe what you are effectively requesting is that instead of
> disallowing updates and deletes on the added table that lacks replica
> identity you wish for the addition itself to fail.  That would have made a
> better default behavior with an option to override when the current
> behavior is desired.  But it seems too late to change this decision now.
>
> David J.
>


Re: Understanding conflicts on publications and subscriptions

2024-07-30 Thread Koen De Groote
It indeed seems to be that.

My initial thought of " will be disallowed on those tables" was "on the
subscriber side". After all, why have a publication be of any effect if
there's nobody subscribing to it.

But it appears the publication influences behavior, regardless of there
being a subscriber, which feels counter-intuitive to me.

Thanks for stepping me through it.

On Tue, Jul 30, 2024 at 4:34 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Tue, Jul 30, 2024 at 7:16 AM Koen De Groote  wrote:
>
>>  And if my understanding is correct: if a table doesn't have a replica
>> identity, any UPDATE or DELETE statement that happens on the publisher, for
>> that table, will be refused.
>>
>>
> That is how I read the sentence "Otherwise those operations will be
> disallowed on those tables."
>
> Upon adding said table to a publication, future attempts to run updates
> and deletes will result in failures in the transactions performing said DML.
>
> Feel free to experiment that the behavior indeed matches the wording in
> the documentation.
>
> David J.
>
>


On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null?

2024-08-23 Thread Koen De Groote
I'm running this query:

SELECT subname, received_lsn, latest_end_lsn,
last_msg_send_time, last_msg_receipt_time
FROM pg_catalog.pg_stat_subscription;

And "last_msg_send_time" will sometimes be null. All I can find in the
documentation is that this value represents: "Send time of last message
received from origin WAL sender; NULL for parallel apply workers"

But I have no context for what parallel apply workers are or why they are
NULL sometimes and other times not.

Can anyone explain?

Regards,
Koen De Groote


Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null?

2024-08-23 Thread Koen De Groote
Adrian,

Thanks for pointing me at the source code.

Digging a bit, the view seems to eventually get its data on last msg send
time from here:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/launcher.c;h=c566d50a072b92bd07f4179100275d0d0b1f4c7c;hb=HEAD#l1288

And in particular, this:

1332 if (worker.last_send_time == 0)
1333 nulls[5] = true;
1334 else
1335 values[5] = TimestampTzGetDatum(worker.last_send_time);
1336 if (worker.last_recv_time == 0)
1337 nulls[6] = true;
1338 else
1339 values[6] = TimestampTzGetDatum(worker.last_recv_time);

I don't have any knowledge of C, or postgres internals, so I may well be
wrong in what follows:

From the bit of comment you posted, my impression is that this means there
are separate workers that each send their update, which is then reflected in
the output of the pg_stat_subscription table... many workers, but only 1
table to show metrics, showing both the updates by the leader and those of
the parallel workers...

And these parallel workers get created on the fly, I assume? So they might
well have a last_send_time of 0 if they haven't done anything yet?

What I would expect to see is a table that tells me how a particular
publisher/subscriber pair is doing, and the metrics around that process,
with the concept of "when data was last sent" being persistent at the level
of the publisher/subscriber, not based on the lifespan of ephemeral workers
that start at 0 each time they get created, so that the table ends up
claiming "last_msg_send_time" is NULL because of it.

Am I getting that wrong? Is my understanding mistaken?
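
In the meantime, for monitoring, something like this should skip the rows
coming from parallel apply workers (assuming this version has the leader_pid
column, which is NULL for leader apply and table synchronization workers):

SELECT subname, received_lsn, latest_end_lsn,
       last_msg_send_time, last_msg_receipt_time
FROM pg_catalog.pg_stat_subscription
WHERE leader_pid IS NULL;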

Regards,
Koen De Groote





On Fri, Aug 23, 2024 at 5:02 PM Adrian Klaver 
wrote:

> On 8/23/24 07:33, Koen De Groote wrote:
> > I'm running this query:
> >
> > SELECT subname, received_lsn, latest_end_lsn,
> > last_msg_send_time, last_msg_receipt_time
> > FROM pg_catalog.pg_stat_subscription;
> >
> > And "last_msg_send_time" will sometimes be null. All I can find in the
> > documentation is that this value represents: "Send time of last message
> > received from origin WAL sender; NULL for parallel apply workers"
> >
> > But I have no context for what parallel apply workers are or why they
> > are NULL sometimes and other times not. >
> > Can anyone explain?
>
> The source has that information:
>
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/applyparallelworker.c;h=e7f7d4c5e4bd3ff63687cb21b84534a05a3962cc;hb=HEAD
>
> The top part explains the process.
>
> As to the time:
>
> /*
>   * There are three fields in each message received by the parallel apply
>   * worker: start_lsn, end_lsn and send_time. Because we have updated these
>   * statistics in the leader apply worker, we can ignore these fields in
> the
>   * parallel apply worker (see function LogicalRepApplyLoop).
>   */
>
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Logical replication without direct link between publisher and subscriber?

2024-09-09 Thread Koen De Groote
I want to have a PG instance receive logical replication from a publisher.

However, the subscriber should not have network access to the publisher, in
any way. This is for security reasons. No VPN or any setup that allows the
subscriber to send traffic to the publisher host.

The publisher, however, can reach the subscriber IP and send traffic to it.

Is there a way to start a logical replication setup without having the
subscriber create a subscription? Run a few commands on the publisher and
the subscriber suddenly starts receiving data?

From what I've googled, an option would be to use pg_recvlogical:
https://www.postgresql.org/docs/current/app-pgrecvlogical.html

Dumping changes periodically, sending them directly or uploading to cloud
storage and then downloading and applying them on the subscriber side.
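
Roughly what I had in mind, as a sketch (database and slot names are
placeholders, and the test_decoding output would still need custom tooling
to replay on the isolated side):

# create the slot once, on the publisher:
pg_recvlogical -d mydb --slot=offsite_slot --create-slot --plugin=test_decoding

# periodically, drain the accumulated changes up to the current LSN into a file:
pg_recvlogical -d mydb --slot=offsite_slot --start \
    --endpos="$(psql -Atc 'SELECT pg_current_wal_lsn()' mydb)" \
    -f - | gzip > changes-$(date +%s).gz

# upload the file, download it on the isolated side, and apply it there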

But maybe there's a simpler option someone here knows about?

Any help and/or suggestions appreciated.

Thanks,
Koen


Re: Logical replication without direct link between publisher and subscriber?

2024-09-12 Thread Koen De Groote
I've considered it, but it sounds like a lot of work and failure prone.
Even projects like Debezium seem like it's a ton to set up.

Thanks for the suggestions.

Regards,
Koen

On Wed, Sep 11, 2024 at 3:20 PM Greg Sabino Mullane 
wrote:

> Dumping changes periodically, sending them directly or uploading to cloud
>> storage and then downloading and applying them on the subscriber side.
>> But maybe there's a simpler option someone here knows about?
>
>
> How about using WAL shipping to populate a replica, and either query that
> directly or use that as the publisher to your sequestered node?
>
> Cheers,
> Greg
>
>


Re: Upgrading from 11 to 13

2021-04-01 Thread Koen De Groote
I seem to recall that, going from 11 to 12, a certain configuration file was
removed and its keys are now expected to be set in the regular
configuration file? The logic being that there should only ever be 1
configuration file.

I can't find it, but at the same time I don't recall what it's called. I
believe it has to do with streaming replication?

Is this a thing or am I imagining stuff?

On Fri, Apr 2, 2021 at 12:34 AM Bruce Momjian  wrote:

> On Tue, Mar 30, 2021 at 08:30:00AM -0700, David G. Johnston wrote:
> > On Tue, Mar 30, 2021 at 8:25 AM Daniel Westermann (DWE) <
> > daniel.westerm...@dbi-services.com> wrote:
> > The best place would be the release notes, I guess. Right at the
> beginning
> > here:
> >
> > Release notes are probably a good place too but there is already a
> section in
> > the main documentation where this needs to be added.
> >
> > https://www.postgresql.org/docs/current/upgrading.html
> >
> > Which is linked to from the main website.
> >
> > https://www.postgresql.org/support/versioning/
> >
> > An explicit sentence stating that major upgrades can skip major versions
> is
> > needed.  The document is written assuming the reading knows this, and
> just
> > makes a few minor notes on the topic:
> > e.g., "If you are upgrading across several major versions, be sure to
> read the
> > release notes for each intervening version."
>
> I have two patches --- one for our website, and another for our docs,
> though not for the release notes specifically.  The release notes do
> reference this doc section though.
>
> My idea was to link the ability to skip upgrading to intervening
> versions with the need to read intervening version release notes.
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   If only the physical world exists, free will is an illusion.
>
>


Re: Upgrading from 11 to 13

2021-04-03 Thread Koen De Groote
Yes that's it. Probably something to alert people to. If they have this set
up, they can't "just go straight to 13".
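
If I have it right, the change amounts to something like this (the connection
string is just an example):

# pre-12, in recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=primary port=5432 user=replicator'

# 12 and later: the same settings go into postgresql.conf, and an empty
# standby.signal file (or recovery.signal for point-in-time recovery)
# replaces standby_mode:
primary_conninfo = 'host=primary port=5432 user=replicator'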

On Fri, Apr 2, 2021 at 1:32 AM Alvaro Herrera 
wrote:

> On 2021-Apr-02, Koen De Groote wrote:
>
> > I seem to recall that going from 11 to 12, a certain configuration file
> was
> > removed and the keys are now expected to be set in the regular
> > configuration file? The logic being there should only ever be 1
> > configuration file.
> >
> > I can't find it, but at the same time I don't recall what it's called. I
> > believe it has to do with streaming replication?
> >
> > Is this a thing or am I imagining stuff?
>
> recovery.conf to postgresql.conf?  Yes, you're right.  (There are more
> reasons beyond "just one config file", though.)
>
> --
> Álvaro Herrera39°49'30"S 73°17'W
> "Someone said that it is at least an order of magnitude more work to do
> production software than a prototype. I think he is wrong by at least
> an order of magnitude."  (Brian Kernighan)
>


Re: Upgrading from 11 to 13

2021-04-03 Thread Koen De Groote
If an upgrade process has a part where old functionality doesn't work
anymore, or anything at all breaks, then it's not a case of "you can just
upgrade, no problems".

On Sat, Apr 3, 2021 at 8:50 PM Adrian Klaver 
wrote:

> On 4/3/21 10:37 AM, Koen De Groote wrote:
> > Yes that's it. Probably something to alert people to. If they have this
> > set up, they can't "just go straight to 13".
> >
>
> Yes you can. You run into the same issue going from 11 --> 12 --> 13,
> you would just hit it in a different portion of the process.
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


How does postgres behave if several indexes have (nearly) identical conditions?

2021-09-08 Thread Koen De Groote
Greetings all.

Example table:

CREATE TABLE my_table (
id serial PRIMARY KEY,
a001 BOOLEAN default 't',
a002 BOOLEAN default 'f',
a003 BOOLEAN default 't',
a004 BOOLEAN default 'f'
);

And these 2 indexes:

create index index_001 on my_table using btree (a001,a002,a003) where
a001=true and a002=false;

create index index_002 on my_table using btree (a003) where a001=true and
a002=false;

Now take this query:

select * from my_table where a001=true;

Which index will postgres pick? I'm wondering how postgres goes about
picking an index to consider.

And if it will consider others if the analysis of the first says a seqscan
would be better than the index it first considered?
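
For a specific case I suppose I can just look at what the planner ends up
doing, with something like:

explain (analyze, buffers) select * from my_table where a001=true;

But that only shows the chosen plan, not how the candidate indexes were
weighed against each other, which is what I'm after.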

Regards,
Koen De Groote


Re: How does postgres behave if several indexes have (nearly) identical conditions?

2021-09-08 Thread Koen De Groote
Forgot to mention, this is on Postgres 11.2

On Wed, Sep 8, 2021 at 11:04 PM Koen De Groote  wrote:

> Greetings all.
>
> Example table:
>
> CREATE TABLE my_table (
> id serial PRIMARY KEY,
> a001 BOOLEAN default 't',
> a002 BOOLEAN default 'f',
> a003 BOOLEAN default 't',
> a004 BOOLEAN default 'f'
> );
>
> And these 2 indexes:
>
> create index index_001 on my_table using btree (a001,a002,a003) where
> a001=true and a002=false;
>
> create index index_002 on my_table using btree (a003) where a001=true and
> a002=false;
>
> Now take this query:
>
> select * from my_table where a001=true;
>
> Which index will postgres pick? I'm wondering how postgres goes about
> picking an index to consider.
>
> And if it will consider others if the analysis of the first says a seqscan
> would be better than the index it first considered?
>
> Regards,
> Koen De Groote
>
>


Re: How does postgres behave if several indexes have (nearly) identical conditions?

2021-09-08 Thread Koen De Groote
And the initial setup is wrong. There should be no 'and a002=false' in the
indexes.

On Wed, Sep 8, 2021 at 11:15 PM Koen De Groote  wrote:

> Forgot to mention, this is on Postgres 11.2
>
> On Wed, Sep 8, 2021 at 11:04 PM Koen De Groote  wrote:
>
>> Greetings all.
>>
>> Example table:
>>
>> CREATE TABLE my_table (
>> id serial PRIMARY KEY,
>> a001 BOOLEAN default 't',
>> a002 BOOLEAN default 'f',
>> a003 BOOLEAN default 't',
>> a004 BOOLEAN default 'f'
>> );
>>
>> And these 2 indexes:
>>
>> create index index_001 on my_table using btree (a001,a002,a003) where
>> a001=true and a002=false;
>>
>> create index index_002 on my_table using btree (a003) where a001=true and
>> a002=false;
>>
>> Now take this query:
>>
>> select * from my_table where a001=true;
>>
>> Which index will postgres pick? I'm wondering how postgres goes about
>> picking an index to consider.
>>
>> And if it will consider others if the analysis of the first says a
>> seqscan would be better than the index it first considered?
>>
>> Regards,
>> Koen De Groote
>>
>>


Question about behavior of conditional indexes

2021-09-21 Thread Koen De Groote
Greetings all,

Working on postgres 11.

I'm researching an index growing in size and never shrinking, and not being
used anymore after a while.

The index looks like this:

"index002" btree (action_performed, should_still_perform_action,
action_performed_at DESC) WHERE should_still_perform_action = false
AND action_performed = true

So, there are 2 boolean fields, and a conditional clause for both. The
table has about 50M rows, the index barely ever goes over 100K matched rows.

The idea is to search for rows matching these conditions quickly, and then
update them. This happens daily.

After the update, these rows no longer match the index condition. At this
point, does such a row get removed from the index? Or does it stay in there
as a dead row?

I'm noticing index bloat on this index and am wondering if all these
updated rows just stay in the index?

The index currently sits at 330MB. If I drop and re-create it, it's 1.5MB.
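
For context, the index size can be checked with something like:

select pg_size_pretty(pg_relation_size('index002'));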

A cronjob runs a vacuum once per day, I can see the amount of dead rows
dropping in monitoring software.

But should this also take care of indexes? In postgres 11, you can't
reindex concurrently, so I was wondering if indexes are skipped by vacuum?
Or only in case of conditional indexes?



So I'm wondering if the behavior is as I described.

Regards,
Koen De Groote


Re: Question about behavior of conditional indexes

2021-09-22 Thread Koen De Groote
Thanks for the replies, everyone.

Gavin - I can't upgrade to a more recent version, at least not for the
foreseeable future. From what I'm reading, it's the best path forward, but
there's considerations to be made that I can't overrule.

Ninad - As I suspected about VACUUM and VACUUM FULL. Thanks for confirming

Michael - Yes, my mistakes in manually obfuscating.

Going over the documentation, it seems that after pg 11, several additions
were made, like "vacuum_index_cleanup" and "VACUUM INDEX_CLEANUP".
From what I'm reading, these did not exist in pg11. Googling a bit shows me
this:
https://www.depesz.com/2019/05/01/waiting-for-postgresql-12-allow-vacuum-to-be-run-with-index-cleanup-disabled/

From which I assume that index cleaning is the default in pg11 and cannot
be controlled. Though it appears to not be "enough" for my use case.

There is already a cronjob that does "VACUUM ANALYZE" during the night.
Though it seems the index keeps growing, regardless of cleanup... I'm not
sure what to think of that, or what to look for, apart from more aggressive
autovacuum.

I'll look into dropping and recreating this index without those fields, and
the fillfactor you mention.
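
As a sketch of what I'm planning (the table name is a placeholder):

-- build the slimmer index next to the old one, then swap:
CREATE INDEX CONCURRENTLY index002_new
    ON the_table USING btree (action_performed_at DESC)
    WHERE should_still_perform_action = false AND action_performed = true;
DROP INDEX CONCURRENTLY index002;
ALTER INDEX index002_new RENAME TO index002;

-- and more aggressive autovacuum for just this table:
ALTER TABLE the_table SET (autovacuum_vacuum_scale_factor = 0.01);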







On Tue, Sep 21, 2021 at 6:15 PM Michael Lewis  wrote:

> You likely need to tune your autovacuum settings such that the index bloat
> reaches a steady-state and does not continue to increase. When vacuum runs,
> it will remove dead (and no longer visible to any running transactions)
> tuples aka row versions from each page (8KB block by default) in the file
> for that table's data. It will also update the index, except in newer
> versions of Postgres where that behavior becomes optional (see manual for
> which version and whether it still runs by default). If you are I/O bound
> and cannot tune the system defaults to autovacuum more proactively (when a
> smaller percent of rows are dead), then perhaps just change the settings
> for that table as it seems to be functioning a bit like a queue. Or you
> might consider setting up a daily job to vacuum analyze on all tables, if
> there is a period of low system activity. If you didn't have the index on
> the columns you are updating, then reducing fillfactor would be an option
> to increase HOT updates and thereby prevent the index bloat. Alas, that's
> not an option with the index needing to reference those values that are
> being changed.
>
> "index002" btree (action_performed, should_still_perform_action, 
> action_performed_at DESC) WHERE should_still_perform_action = false AND 
> action_performed = true
>
> That index seems odd to me. Why index a single value for the first two 
> columns? I would recreate it with those skipped. Also, based on the names, 
> I'd expect the opposite for true and false. That is, the "interesting" rows 
> are where the action has NOT yet been performed yet and it is needed. I'd 
> expect the index as defined to cover most of the table rather than a small 
> fraction. Perhaps just a typo from manual obfuscation.
>
> For what it's worth, you can create new concurrently, drop old
> concurrently, then rename new to old. That would be the same result as a
> reindex concurrently.
>
>>


Using FDW to connect to a more recent postgres version?

2024-10-12 Thread Koen De Groote
The documentation speaks only of compatibility in terms of connecting to
older version:

> postgres_fdw can be used with remote servers dating back to PostgreSQL
8.3. Read-only capability is available back to 8.1.

But what about connecting to a foreign server that is more recent? Like
pg14 connecting to pg17?

Are the considerations in that case simply with new data types not being
recognized? Or are there cases where the setup can be done, but queries
won't behave as expected?

Regards,
Koen De Groote


Re: Foreign Data Wrapper behavior?

2024-10-12 Thread Koen De Groote
Ah, thanks for that

On Fri, Oct 11, 2024 at 5:59 PM Adrian Klaver 
wrote:

> On 10/11/24 04:08, Koen De Groote wrote:
> > In the release notes for postgres 17 I'm reading:
> >
> >  > The PostgreSQL foreign data wrapper (postgres_fdw), used to execute
> > queries on remote PostgreSQL instances, can now push EXISTS and IN
> > subqueries to the remote server for more efficient processing.
> >
> > I'm confused as to what this means. In older versions, are parts of
> > queries not sent to the foreign server? Or is this change meant to imply
> > the sending of only the subqueries, the result of which is then directly
> > used in pushing the entire query?
> >
> > Or am I still wrong as to what this means?
> >
> > I looked at the documentation and there doesn't seem to be any
> > indication of particular queries not being pushed to the foreign server,
> > so this wording that "can now push EXISTS and IN subqueries to the
> > remote server" is confusing.
> >
> > What am I missing?
>
> Read:
>
>
> https://www.postgresql.org/docs/current/postgres-fdw.html#POSTGRES-FDW-REMOTE-QUERY-OPTIMIZATION
>
> F.36.5. Remote Query Optimization
>
>
> As to the change in the Release Note see the --hackers discussion:
>
>
> https://www.postgresql.org/message-id/c9e2a757cf3ac2333714eaf83a9cc184%40postgrespro.ru
>
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Basebackup fails without useful error message

2024-10-20 Thread Koen De Groote
Hello Adrian, and everyone else.

It has finally happened, the backup ran into an error again, and the
verbose output set me on the right path.

I'm getting this error message:

> pg_basebackup: could not receive data from WAL stream: server closed the
connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.

Combined with the main server logging:

> terminating walsender process due to replication timeout

Now, the server is set up with an archive_command which gzips the WAL files
and writes them to a network filesystem.

From looking at machine metrics at the time, my conclusion is the following:

At the time of the error, the remote filesystem experienced a very high
queue size for new writes.

So I'm assuming the process of writing WAL files, if there is an
archive_command set, is only considered to be finished after the archive is
written, not just when the WAL file is written in pg_wal.

I'm also seeing in the documentation that the default WAL method for
pg_basebackup is "stream", which waits for these WAL files as they are
produced.

I suspect that I have 2 possible paths at this point:

1: increase wal_sender_timeout
2: run the basebackup with --wal-method=none since my restore_command is
set up to explicitly go to the very same network storage to get the
archived WAL files.

I'm going to be testing this. If someone could confirm that this is how
writing WAL files works, that being: that it is only considered "done" when
the archive_command is done, that would be great.
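
Concretely, the two options would look something like this (the timeout
value is just an example):

-- option 1: raise the timeout on the server and reload the configuration
ALTER SYSTEM SET wal_sender_timeout = '15min';
SELECT pg_reload_conf();

# option 2: don't stream WAL as part of the backup at all
pg_basebackup -h localhost -p 5432 -U basebackup_user -D /mnt/base_backup/dir \
    -Ft -z -P --wal-method=none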

Regards,
Koen De Groote


On Sun, Sep 29, 2024 at 6:08 PM Adrian Klaver 
wrote:

> On 9/29/24 08:57, Koen De Groote wrote:
> >  > What is the complete command you are using?
> >
> > The full command is:
> >
> > pg_basebackup -h localhost -p 5432 -U basebackup_user -D
> > /mnt/base_backup/dir -Ft -z -P
> >
> > So output Format as tar, gzipped, and with progress being printed.
> >
> >  > Have you looked at the Postgres log?
> >
> >  > Is --verbose being used?
> >
> > This is straight from the logs, it's the only output besides the %
> > progress counter.
> >
> > Will have a look at --verbose.
>
> When you report on that and if it does not report the error then what is?:
>
> Postgres version.
>
> OS and version.
>
> Anything special about the cluster like tablespaces, extensions,
> replication, etc.
>
>
> >
> > Regards,
> > Koen De Groote
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Basebackup fails without useful error message

2024-10-21 Thread Koen De Groote
See, I'm reading that, and my conclusion is "Ah great, it will try again,
everything will be fine."

There's a link between the archive_command, pg_basebackup and
wal_sender_timeout, but that link isn't clear from just reading all these
documentation pages when learning about their concepts separately.

It's just too much information to read and then "just know" in your head
that these must be the relations. Especially if you have dozens of other
things to do.

That's my 2 cents.

Regards,
Koen De Groote


On Sun, Oct 20, 2024 at 11:12 PM Adrian Klaver 
wrote:

> On 10/20/24 14:03, Koen De Groote wrote:
>
> > So I'm assuming the process of writing WAL files, if there is an
> > archive_command set, is only considered to be finished after the archive
> > is written, not just when the WAL file is written in pg_wal.
>
>
> https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-ARCHIVING-WAL
>
> "It is important that the archive command return zero exit status if and
> only if it succeeds. Upon getting a zero result, PostgreSQL will assume
> that the file has been successfully archived, and will remove or recycle
> it. However, a nonzero status tells PostgreSQL that the file was not
> archived; it will try again periodically until it succeeds."
>
>
> > Regards,
> > Koen De Groote
> >
> >
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?

2024-10-17 Thread Koen De Groote
Hello Muhammad,

The problem with my scenario is the changes are written as a single
transaction, with a BEGIN and COMMIT. In that transaction, there are first
inserts, then a schema change, and then inserts on the new schema.

Doing as you said does not work. The subscriber will first complain it
cannot do the last inserts. If I fix the schema, it will complain it cannot
do the first inserts.

I thought I would be able to drop the table from the publication, and then
do the subscription again, but that fails, as the subscription is disabled
and cannot be enabled again, even if I remove the table from publication.

On Thu, Oct 17, 2024 at 5:49 AM Muhammad Usman Khan 
wrote:

> Hi,
> When you execute schema-altering operations on the publisher, these
> changes are not automatically replicated to the subscriber  which causes
> the following error
> logical replication target relation "public.dummy_table" is missing
> replicated columns: "contact_email", "status", "phone_number", "username"
>
> Before making schema changes, temporarily disable the subscription to
> prevent replication errors.
> ALTER SUBSCRIPTION your_subscription_name DISABLE;
>
> Manually apply the same schema modifications to the subscriber database to
> ensure alignment.
>
> Once the schema changes are applied to both databases, re-enable the
> subscription:
> ALTER SUBSCRIPTION your_subscription_name ENABLE;
>
>
> On Thu, 17 Oct 2024 at 02:59, Koen De Groote  wrote:
>
>> If this question is more suitable for another mailing list, please let me
>> know.
>>
>> I've set up the following table on both publisher and subscriber, both
>> are pg16:
>>
>> CREATE TABLE dummy_table (
>> id SERIAL PRIMARY KEY,
>> name VARCHAR(100) NOT NULL,
>> email VARCHAR(100) UNIQUE NOT NULL,
>> age INT,
>> created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
>> );
>>
>> Added to publication, refreshed subscription.
>>
>> Add some data on the publisher side:
>> INSERT INTO dummy_table (name, email, age)
>> VALUES
>> ('John Doe', 'john@example.com', 25),
>> ('Jane Smith', 'jane.sm...@example.com', 30),
>> ('Michael Johnson', 'michae...@example.com', 45),
>> ('Emily Davis', 'emil...@example.com', 27),
>> ('Robert Brown', 'robert.br...@example.com', 40);
>>
>> The data can be seen on the subscriber. So far, so good.
>>
>> I then execute the following patch on the publisher:
>> https://gist.github.com/KoenDG/d1c06d8c740c64e4e5884d0c64b81f11
>>
>> It is a single transaction that does the following:
>>
>> 1/ Insert data, 1000 items
>> 2/ Drop a column
>> 3/ Alter a column name
>> 4/ Add 2 columns, nullable
>> 5/ Add a column and give it a unique constraint
>> 6/ Update values for a column with NULL values, added in step 4.
>> 7/ Set the column updated in step 6 to be NOT NULL
>> 8/ Create a unique index with the columns from step 3 and 6
>> 9/ Insert a column with a default value
>> 10/ Insert data for this schema, another 1000 items.
>>
>> The subscription disabled, this is to be expected, there are new columns
>> names, the schema needs to be updated on the subscriber side.
>>
>> However, it seems I'm stuck.
>>
>> I can't enable the subscription. This is to be expected, it will try to
>> resume and run into the same issues.
>>
>> Ok, I update the schema and enable again. It runs into an error for the
>> inserts of step 1. These set values for columns dropped in step 2.
>>
>> I revert to the old schema and enable again. It runs into an error again,
>> this time for values that don't exist yet at step 1.
>>
>> I tried dropping the table at the subscriber side, recreating the correct
>> schema, but this runs into the same error.
>>
>> I remove the table from the publication and retry. Same error. Even with
>> the table no longer in the publication, and the table on the subscriber
>> side dropped and re-created, I'm still getting the exact same errors of
>> "logical replication target relation "public.dummy_table" is missing
>> replicated columns: "contact_email", "status", "phone_number", "username""
>>
>>
>> The only solution I've found is to drop the table from the publication,
>> and then drop the entire subscription and set it back up again, with the
>> correct schema.
>>
>> Am I making a mistake? Or does putting all these commands in a single
>> transaction ruin my chances?
>>
>> Clarification much appreciated.
>>
>> Regards,
>> Koen De Groote
>>
>>


Re: Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?

2024-10-17 Thread Koen De Groote
Hello Michał,

Thanks for the reply. I suspected as much, I was just wondering if there
was an easy fix that didn't involve dropping the entire subscription and
having to re-do all the table because of that. Guess my only option is to
remove the affected tables from the publisher before the patch, refresh
subscription, do the patch, recreate the tables on the subscriber and do
the sync for only those tables.

I will look into your suggestion.
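
If I understand the suggestion correctly, it would look roughly like this
(all names made up), created on both sides, with the table added to the same
publication as the affected tables:

CREATE TABLE schema_changes (
    id  bigserial PRIMARY KEY,
    ddl text NOT NULL
);

CREATE FUNCTION apply_schema_change() RETURNS trigger AS $$
BEGIN
    EXECUTE NEW.ddl;   -- run the recorded DDL statement
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER replay_schema_change
    AFTER INSERT ON schema_changes
    FOR EACH ROW EXECUTE FUNCTION apply_schema_change();

-- replication does not fire ordinary triggers, so on the subscriber:
ALTER TABLE schema_changes ENABLE ALWAYS TRIGGER replay_schema_change;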

Regards,
Koen De Groote





On Thu, Oct 17, 2024 at 11:17 AM Michał Kłeczek  wrote:

>
>
> > On 17 Oct 2024, at 11:07, Koen De Groote  wrote:
> >
> > Hello Muhammad,
> >
> > The problem with my scenario is the changes are written as a single
> transaction, with a BEGIN and COMMIT. In that transaction, there are first
> inserts, then a schema change, and then inserts on the new schema.
>
> I guess until logical replication of DDL is available you’re out of luck.
>
> The best you can do is to have a separate table for recording and
> replaying schema changes.
> Create triggers that perform actual DDL operations based on DML in this
> table.
> Publish this table on the publisher in the same publication as the tables
> affected by the DDL.
>
> On the subscriber side it is the same - just make the trigger is marked as
> ENABLE REPLICA TRIGGER or ENABLE ALWAYS TRIGGER.
>
> Kind regards,
> Michał
>
>


Logical replication disabled, recovery not possible because of 1 large transaction with schema changes?

2024-10-16 Thread Koen De Groote
If this question is more suitable for another mailing list, please let me
know.

I've set up the following table on both publisher and subscriber, both are
pg16:

CREATE TABLE dummy_table (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
age INT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Added to publication, refreshed subscription.

Add some data on the publisher side:
INSERT INTO dummy_table (name, email, age)
VALUES
('John Doe', 'john@example.com', 25),
('Jane Smith', 'jane.sm...@example.com', 30),
('Michael Johnson', 'michae...@example.com', 45),
('Emily Davis', 'emil...@example.com', 27),
('Robert Brown', 'robert.br...@example.com', 40);

The data can be seen on the subscriber. So far, so good.

I then execute the following patch on the publisher:
https://gist.github.com/KoenDG/d1c06d8c740c64e4e5884d0c64b81f11

It is a single transaction that does the following:

1/ Insert data, 1000 items
2/ Drop a column
3/ Alter a column name
4/ Add 2 columns, nullable
5/ Add a column and give it a unique constraint
6/ Update values for a column with NULL values, added in step 4.
7/ Set the column updated in step 6 to be NOT NULL
8/ Create a unique index with the columns from step 3 and 6
9/ Insert a column with a default value
10/ Insert data for this schema, another 1000 items.

The subscription got disabled. This is to be expected: there are new column
names, so the schema needs to be updated on the subscriber side.

However, it seems I'm stuck.

I can't enable the subscription. This is to be expected, it will try to
resume and run into the same issues.

Ok, I update the schema and enable again. It runs into an error for the
inserts of step 1. These set values for columns dropped in step 2.

I revert to the old schema and enable again. It runs into an error again,
this time for values that don't exist yet at step 1.

I tried dropping the table at the subscriber side, recreating the correct
schema, but this runs into the same error.

I remove the table from the publication and retry. Same error. Even with
the table no longer in the publication, and the table on the subscriber
side dropped and re-created, I'm still getting the exact same errors of
"logical replication target relation "public.dummy_table" is missing
replicated columns: "contact_email", "status", "phone_number", "username""


The only solution I've found is to drop the table from the publication, and
then drop the entire subscription and set it back up again, with the
correct schema.

Am I making a mistake? Or does putting all these commands in a single
transaction ruin my chances?

Clarification much appreciated.

Regards,
Koen De Groote


Re: Basebackup fails without useful error message

2024-10-22 Thread Koen De Groote
Hello David,

I saw the backup fail. The server logged that it terminated the walsender,
and correlating the moment it failed with the metrics of my storage shows
that the storage was facing a huge IOWAIT at that time. And this was
network-mounted storage.

The backup process continued, but because the WAL could not be streamed
without error (due to a local issue), the entire backup was marked as failed.
At the end, pg_basebackup will delete the backup, in this case. There's no
flag to control this final behavior.

I'll be testing restore soon without streaming WAL, since the actual
restore I perform doesn't use the pg_wal.tar.gz file; it gets the archived
WAL files instead. At least I think it doesn't need it, hence the need for
testing.

Regards,
Koen De Groote

On Tue, Oct 22, 2024 at 12:34 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Sunday, October 20, 2024, Koen De Groote  wrote:
>>
>>
>> I'm going to be testing this. If someone could confirm that this is how
>> writing WAL files works, that being: that it is only considered "done" when
>> the archive_command is done, that would be great.
>>
>
> The archiving of WAL files by the primary does not involve a replication
> connection of any sort and thus the “WAL sender” settings are not relevant
> to it; or, here, whether or not you are archiving your WAL is immaterial
> since you are streaming it as it gets produced.
>
> If you are streaming WAL it seems highly unusual that you’d end up in a
> situation where the connection goes idle long enough that it gets killed,
> especially if the backup is still happening.  I’d probably go with
> performing the backup under a disabled (or extremely large?) timeout though
> and move on to other things.
>
> That isn’t to say I fully understand what actually is happening here…
>
> David J.
>
>


Foreign Data Wrapper behavior?

2024-10-11 Thread Koen De Groote
In the release notes for postgres 17 I'm reading:

> The PostgreSQL foreign data wrapper (postgres_fdw), used to execute
queries on remote PostgreSQL instances, can now push EXISTS and IN
subqueries to the remote server for more efficient processing.

I'm confused as to what this means. In older versions, are parts of queries
not sent to the foreign server? Or is this change meant to imply the
sending of only the subqueries, the result of which is then directly used
in pushing the entire query?

Or am I still wrong as to what this means?

I looked at the documentation and there doesn't seem to be any indication
of particular queries not being pushed to the foreign server, so this
wording that "can now push EXISTS and IN subqueries to the remote server"
is confusing.

What am I missing?

Regards,
Koen De Groote


Re: pg_wal folder high disk usage

2024-11-03 Thread Koen De Groote
A possible reason for pg_wal buildup is that there is some sort of
replication going on (logical or physical) and the receiving side of the
replication has stopped somehow.

This means: a different server that has a connection to your server and is
expecting to receive data. And your server is then expecting to have to
send data (this is the important bit). There could be multiple of these
connections.

If even 1 of these receiving servers is down, or the network is out, or
there is some other reason it is no longer requesting data from your server,
your server will notice it isn't getting confirmation from the other side
that the data was received. As such, your postgres server will keep this
data locally, expecting the situation to be resolved in the future, at which
point it will send all the data the other side hasn't gotten yet.

This is 1 option. As long as your server is configured to expect that other
server to be there, and to be receiving, the buildup will continue. Taking
the other server offline won't help; in fact, it is likely the cause of the
issue. The official documentation explains how to get rid of replication
slots; ideally your DBA should handle this.
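
A quick way to check for this is the replication slots view, with something
like:

SELECT slot_name, slot_type, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;

-- and, once a slot is confirmed to be abandoned:
-- SELECT pg_drop_replication_slot('name_of_the_slot');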

Laurenz's blog post lays out all the options. For instance, it can also
happen that your system is generating data so fast that the writing of WAL
files cannot keep up, or that your setup also does WAL archiving and the
compression for that is slow.

The post offers some ways to verify things, I suggest checking them out.

And of course, if your DBA is back, have them look at it too.

Regards,
Koen De Groote


On Fri, Nov 1, 2024 at 2:10 PM Greg Sabino Mullane 
wrote:

> On Fri, Nov 1, 2024 at 2:40 AM Muhammad Usman Khan 
> wrote:
>
>> For immediate space, move older files from pg_Wal to another storage but
>> don't delete them.
>>
>
> No, do not do this! Figure out why WAL is not getting removed by Postgres
> and let it do its job once fixed. Please recall the original poster is
> trying to figure out what to do because they are not the database admin, so
> having them figure out which WAL are "older" and safe to move is not good
> advice.
>
> Resizing the disk is a better option. Could also see if there are other
> large files on that volume that can be removed or moved elsewhere, esp.
> large log files.
>
> Hopefully all of this is moot because their DBA is back from leave. :)
>
> Cheers,
> Greg
>
>
>


Basebackup fails without useful error message

2024-09-29 Thread Koen De Groote
Having run a basebackup, I'm getting this output at the very end:

pg_basebackup: child process exited with error 1
pg_basebackup: removing contents of data directory "/mnt/base_backup/dir/"

Is there a way to get more information as to what exactly happened?

I'd like to look into fixing this or doing whatever is required so that it
doesn't happen again, but this just isn't enough info. Where do I start
looking?

Regards,
Koen De Groote


Re: Basebackup fails without useful error message

2024-09-29 Thread Koen De Groote
> What is the complete command you are using?

The full command is:

pg_basebackup -h localhost -p 5432 -U basebackup_user -D
/mnt/base_backup/dir -Ft -z -P

So output Format as tar, gzipped, and with progress being printed.

> Have you looked at the Postgres log?

> Is --verbose being used?

This is straight from the logs, it's the only output besides the % progress
counter.

Will have a look at --verbose.

Regards,
Koen De Groote

On Sun, Sep 29, 2024 at 5:01 PM Adrian Klaver 
wrote:

> On 9/29/24 07:48, Koen De Groote wrote:
> > Having run a basebackup, I'm getting this output at the very end:
> >
> > pg_basebackup: child process exited with error 1
> > pg_basebackup: removing contents of data directory
> "/mnt/base_backup/dir/"
>
> What is the complete command you are using?
>
> >
> > Is there a way to get more information as to what exactly happened?
>
> Have you looked at the Postgres log?
>
> Is --verbose being used?
>
> >
> > I'd like to look into fixing this or doing whatever is required so that
> > it doesn't happen again, but this just isn't enough info. Where do I
> > start looking?
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Running docker in postgres, SHM size of the docker container in postgres 16

2024-11-20 Thread Koen De Groote
That explains a lot.

I have the default of 2 for max_parallel_maintenance_workers set; should I
set this to 0?

I realize this is of course an improvement, but working with docker
containers, I'd like to avoid taking /dev/shm away from regular queries.

I assume setting max_parallel_maintenance_workers to 0 is the fix here. Is
there perhaps something else I should know about if I want to have control
over this?
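
For my own reference, what I have in mind is roughly this (the table name is
a placeholder):

-- per session, before the maintenance run:
SET max_parallel_maintenance_workers = 0;
VACUUM (ANALYZE, VERBOSE);

-- or per statement, on a single table:
VACUUM (PARALLEL 0, ANALYZE, VERBOSE) some_table;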

Regards,
Koen De Groote



On Wed, Nov 20, 2024 at 12:38 AM Thomas Munro 
wrote:

> On Wed, Nov 20, 2024 at 11:22 AM Koen De Groote  wrote:
> > Why would that be? It's the exact same data. The install is about 50GB
> in size. Is there something wrong with postgres 16, or did some settings
> significantly change, that I need to know about? I went over all the
> changelogs, nothing stood out as relevant to this, but that might be a lack
> of experience on my end.
>
> Parallel vacuum arrived in PostgreSQL 13, and that uses "dynamic
> shared memory" to share state between workers, and assuming
> dynamic_shared_memory_type=posix, that means shm_open(), which opens
> files under /dev/shm on Linux.
>


Running docker in postgres, SHM size of the docker container in postgres 16

2024-11-19 Thread Koen De Groote
Hello all,

Assume a machine with 16 CPU cores and 64GB of RAM. 300 max connections.

Postgres is running here, inside a docker container. Docker containers
receive a small amount of shared memory from /dev/shm

Experience teaches that the default 64MB of /dev/shm for the postgres
container is not enough for databases of even 50GB in size. Previously
I've increased the /dev/shm setting to 256MB for the postgres container.
Issues that made this necessary all came in the form of too many queries
running at once, asking for too much memory, I assume.

I recently upgraded a setup from postgres 11 to postgres 16 via logical
replication, and found that running "vacuum(analyze, verbose)" fails unless
I give the container 1GB of /dev/shm, where previously 256MB sufficed on
the old postgres 11 version.

Why would that be? It's the exact same data. The install is about 50GB in
size. Is there something wrong with postgres 16, or did some settings
significantly change, that I need to know about? I went over all the
changelogs, nothing stood out as relevant to this, but that might be a lack
of experience on my end.

Regards,
Koen De Groote


Memory settings when running postgres in a docker container

2024-11-20 Thread Koen De Groote
Assuming a machine with:

* 16 CPU cores
* 64GB RAM

Set to 500 max connections

A tool like this: https://pgtune.leopard.in.ua/

Will output recommended settings:

max_connections = 500
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 8388kB
huge_pages = try
min_wal_size = 1GB
max_wal_size = 4GB
max_worker_processes = 16
max_parallel_workers_per_gather = 4
max_parallel_workers = 16
max_parallel_maintenance_workers = 4

And they basically use up all the memory of the machine.

16GB shared buffers, 48GB effective cache size, 8MB of work_mem for some
reason...

This seems rather extreme. I feel there should be free memory for
emergencies and monitoring solutions.

And then there's the fact that postgres on this machine will be run in a
docker container. Which, on Linux, receives 64MB of /dev/shm shared memory
by default, but can be increased.

I feel like I should probably actually lower my upper limit for memory,
regardless of what the machine actually has, so I can have free memory, and
also not bring the container process itself into danger.

Is it as straightforward as putting my limit on, say 20GB, and then giving
more /dev/shm to the container? Or is there more to consider?
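
Concretely, I'm thinking of something along these lines for the container
(the values and names are just examples):

docker run -d --name pg16 \
    --shm-size=2g \
    --memory=20g \
    -v pgdata:/var/lib/postgresql/data \
    -e POSTGRES_PASSWORD=example \
    postgres:16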

Regards,
Koen De Groote


Postgres restore sometimes restores to a point 2 days in the past

2025-01-31 Thread Koen De Groote
I'm running postgres 16.6

My backup strategy is: basebackup and WAL archive. These get uploaded to
the cloud.

The restore is on an isolated machine and is performed daily. It downloads
the basebackup, unpacks it, sets a recovery.signal, and a script is
provided as restore_command, to download the WAL archives %f and unpack
them into %p

In the script, the final unpacking is simply "gzip -dc %f > %p". The gz
files are first checked with "gzip -t".
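
Roughly, the setup looks like this (the download command is a stand-in for
the real cloud CLI):

# in postgresql.conf:
# restore_command = '/usr/local/bin/restore_wal.sh %f %p'

#!/bin/sh
# restore_wal.sh <wal filename> <target path>
f="$1"
p="$2"
download-from-cloud "wal-archive/$f.gz" "/tmp/$f.gz" || exit 1   # placeholder
gzip -t "/tmp/$f.gz" || exit 1
gzip -dc "/tmp/$f.gz" > "$p" || exit 1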

If a WAL archive is asked that doesn't exist yet, the script naturally
cannot find it, and exits with status code 1. This is the end of the
recovery.

There are a few tables that are known to receive new entries multiple times
per day. However, the state of the recovery showed the latest item to be 2
days in the past. Checking the live DB, there are an expected amount of
items since that ID.

I checked the logs, the last WAL archive that got downloaded is indeed the
last one that was available. The one that failed to download on the restore
machine, was uploaded to the cloud 8 minutes later, according to the upload
logs on the live DB.

The postgres logs themselves seem perfectly normal. It logs all these WAL
recoveries, switches the timeline, and becomes available.

What could be going wrong? My main issue is that I don't know where to
start looking, since nothing in the logs seems abnormal.

Regards,
Koen De Groote


Re: Postgres restore sometimes restores to a point 2 days in the past

2025-01-31 Thread Koen De Groote
No, it's meant to be an off-site restore, so as to do a daily check that the
restore actually works.

Regards,
Koen De Groote

On Fri, Jan 31, 2025 at 2:30 PM Laurenz Albe 
wrote:

> On Fri, 2025-01-31 at 10:47 +0100, Koen De Groote wrote:
> > I'm running postgres 16.6
> >
> > My backup strategy is: basebackup and WAL archive. These get uploaded to
> the cloud.
> >
> > The restore is on an isolated machine and is performed daily. It
> downloads the
> > basebackup, unpacks it, sets a recovery.signal, and a script is provided
> as
> > restore_command, to download the WAL archives %f and unpack them into %p
> >
> > In the script, the final unpacking is simply "gzip -dc %f > %p". The gz
> files
> > are first checked with "gzip -t".
> >
> > If a WAL archive is asked that doesn't exist yet, the script naturally
> cannot
> > find it, and exits with status code 1. This is the end of the recovery.
> >
> > There are a few tables that are known to receive new entries multiple
> times
> > per day. However, the state of the recovery showed the latest item to be
> 2
> > days in the past. Checking the live DB, there are an expected amount of
> items
> > since that ID.
> >
> > I checked the logs, the last WAL archive that got downloaded is indeed
> the
> > last one that was available. The one that failed to download on the
> restore
> > machine, was uploaded to the cloud 8 minutes later, according to the
> upload
> > logs on the live DB.
> >
> > The postgres logs themselves seem perfectly normal. It logs all these WAL
> > recoveries, switches the timeline, and becomes available.
> >
> > What could be going wrong? My main issue is that I don't know where to
> start
> > looking, since nothing in the logs seems abnormal.
>
> I don't know, that all sounds like it is working as it should.
>
> If the last WAL archive that got downloaded by the "restore_command" is
> indeed
> the last one that was available, recovery did just what it is supposed to.
> If new WAL segments get archived later, that's too late.
>
> Perhaps you are looking for replication, not for restoring a backup, which
> is
> necessarily not totally up to date.
>
> Yours,
> Laurenz Albe
>


Re: Postgres restore sometimes restores to a point 2 days in the past

2025-01-31 Thread Koen De Groote
Any advice on a different mailing list that something like this would be
more suited to?

Regards,
Koen De Groote

On Fri, Jan 31, 2025 at 8:38 PM Koen De Groote  wrote:

> No, it's meant to be an off-site restore, as to do a daily check if the
> restore actually works.
>
> Regards,
> Koen De Groote
>
> On Fri, Jan 31, 2025 at 2:30 PM Laurenz Albe 
> wrote:
>
>> On Fri, 2025-01-31 at 10:47 +0100, Koen De Groote wrote:
>> > I'm running postgres 16.6
>> >
>> > My backup strategy is: basebackup and WAL archive. These get uploaded
>> to the cloud.
>> >
>> > The restore is on an isolated machine and is performed daily. It
>> downloads the
>> > basebackup, unpacks it, sets a recovery.signal, and a script is
>> provided as
>> > restore_command, to download the WAL archives %f and unpack them into %p
>> >
>> > In the script, the final unpacking is simply "gzip -dc %f > %p". The gz
>> files
>> > are first checked with "gzip -t".
>> >
>> > If a WAL archive is asked that doesn't exist yet, the script naturally
>> cannot
>> > find it, and exits with status code 1. This is the end of the recovery.
>> >
>> > There are a few tables that are known to receive new entries multiple
>> times
>> > per day. However, the state of the recovery showed the latest item to
>> be 2
>> > days in the past. Checking the live DB, there are an expected amount of
>> items
>> > since that ID.
>> >
>> > I checked the logs, the last WAL archive that got downloaded is indeed
>> the
>> > last one that was available. The one that failed to download on the
>> restore
>> > machine, was uploaded to the cloud 8 minutes later, according to the
>> upload
>> > logs on the live DB.
>> >
>> > The postgres logs themselves seem perfectly normal. It logs all these
>> WAL
>> > recoveries, switches the timeline, and becomes available.
>> >
>> > What could be going wrong? My main issue is that I don't know where to
>> start
>> > looking, since nothing in the logs seems abnormal.
>>
>> I don't know, that all sounds like it is working as it should.
>>
>> If the last WAL archive that got downloaded by the "restore_command" is
>> indeed
>> the last one that was available, recovery did just what it is supposed to.
>> If new WAL segments get archived later, that's too late.
>>
>> Perhaps you are looking for replication, not for restoring a backup,
>> which is
>> necessarily not totally up to date.
>>
>> Yours,
>> Laurenz Albe
>>
>


Re: Postgres restore sometimes restores to a point 2 days in the past

2025-01-31 Thread Koen De Groote
> What is the complete pg_basebackup command?

The command: pg_basebackup -h  -p  -U  -D
 -Ft -z -P -v --wal-method=none

So basically the same as the 2nd example here:
https://www.postgresql.org/docs/16/app-pgbasebackup.html except for the
verbose flag and the wal-method flag.

The wal-method is none for 2 reasons:
1/ Experience teaches that, in the event of storage being on a network,
timeouts to write WAL archives to the network location can cause WAL
creation during a basebackup to be considered failed, and that causes the
entire basebackup to be considered failed, even if a retry occurs. Any
failure during a basebackup will cause postgres to auto-delete it at the
very end of pg_basebackup, declaring it "unusable". This is extremely bad
in backups that take very long. Better to not include WAL files in the
basebackup, and just get them after the fact.
2/ All my WAL files are archived and uploaded to the cloud. So, I can just
have them downloaded.

This has worked for months on end, and so has restoring.

> I don't understand the above.

> What is determining that a particular WAL file should be asked for?

The postgres server itself does this. Here's the documentation:
https://www.postgresql.org/docs/16/runtime-config-wal.html#GUC-RESTORE-COMMAND

And here: https://www.postgresql.org/docs/current/warm-standby.html

In practice, Postgres will see the "standby.signal" file and start asking
for WAL files. It will read the database it has and determine what the next
WAL filename should be. And then it asks for it. And it will keep asking
for these hexadecimal filenames, 1 at a time, for as long as the command or
set of commands provided to "restore_command" returns exit code 0. If the
process receives any other exit code, it stops recovery, switches timeline,
and considers the database to be up and running in the state it's in.

It's constantly asking "I want this file now" and the script I have as the
restore command will attempt to download it from the cloud. Then it will
attempt to unzip it and move it into place. If any of these steps fails, I
return exit code 1.

> How active is the primary database you are pulling from?

Very active, plus there is automated testing to ensure everything is still
running; this generates multiple items per day on its own.

> Available where?

The cloud, as I stated: WAL files get archived and these archived files are
then uploaded to the cloud.

See documentation:
https://www.postgresql.org/docs/16/runtime-config-wal.html#GUC-ARCHIVE-COMMAND

Regards,
Koen De Groote



On Fri, Jan 31, 2025 at 5:50 PM Adrian Klaver 
wrote:

> On 1/31/25 01:47, Koen De Groote wrote:
>
> Comments in line.
>
> > I'm running postgres 16.6
> >
> > My backup strategy is: basebackup and WAL archive. These get uploaded to
> > the cloud.
> >
> > The restore is on an isolated machine and is performed daily. It
> > downloads the basebackup, unpacks it, sets a recovery.signal, and a
> > script is provided as restore_command, to download the WAL archives %f
> > and unpack them into %p
> >
>
> What is the complete pg_basebackup command?
>
> > In the script, the final unpacking is simply "gzip -dc %f > %p". The gz
> > files are first checked with "gzip -t".
> >
> > If a WAL archive is asked that doesn't exist yet, the script naturally
> > cannot find it, and exits with status code 1. This is the end of the
> > recovery.
>
> I don't understand the above.
>
> What is determining that a particular WAL file should be asked for?
>
> >
> > There are a few tables that are known to receive new entries multiple
> > times per day. However, the state of the recovery showed the latest item
> > to be 2 days in the past. Checking the live DB, there are an expected
> > amount of items since that ID.
>
> How active is the primary database you are pulling from?
>
> >
> > I checked the logs, the last WAL archive that got downloaded is indeed
> > the last one that was available. The one that failed to download on the
> > restore machine, was uploaded to the cloud 8 minutes later, according to
> > the upload logs on the live DB.
>
> Available where?
>
> If that was the last one available how could the subsequent one be a
> failure to download?
>
> >
> > The postgres logs themselves seem perfectly normal. It logs all these
> > WAL recoveries, switches the timeline, and becomes available.
> >
> > What could be going wrong? My main issue is that I don't know where to
> > start looking, since nothing in the logs seems abnormal.
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Postgres restore sometimes restores to a point 2 days in the past

2025-01-31 Thread Koen De Groote
> I should have asked earlier what is the archive command

The example from the documentation, but with GZIP. So from the
documentation:
https://www.postgresql.org/docs/16/continuous-archiving.html#BACKUP-ARCHIVING-WAL

Which becomes this: archive_command = 'test ! -f
/mnt/server/archivedir/%f.gz && gzip -c %p > /mnt/server/archivedir/%f.gz'


> Are you setting standby.signal or recovery.signal or both

Sorry, I keep confusing them. I checked my deployment code, it's
recovery.signal and only that.

Regards,
Koen De Groote


On Fri, Jan 31, 2025 at 10:26 PM Adrian Klaver 
wrote:

> On 1/31/25 12:10, Koen De Groote wrote:
> >  > What is the complete pg_basebackup command?
>
> > 2/ All my WAL files are archived and uploaded to the cloud. So, I can
> > just have them downloaded.
>
> I should have asked earlier what is the archive command?
>
> >  > What is determining that a particular WAL file should be asked for?
> >
> > The postgres server itself does this. Here's the documentation:
> >
> https://www.postgresql.org/docs/16/runtime-config-wal.html#GUC-RESTORE-COMMAND
> <
> https://www.postgresql.org/docs/16/runtime-config-wal.html#GUC-RESTORE-COMMAND
> >
> >
> > And here: https://www.postgresql.org/docs/current/warm-standby.html
> > <https://www.postgresql.org/docs/current/warm-standby.html>
> >
> > In practice, Postgres will see the "standby.signal" file and start
>
> In your OP you say:
>
> "It downloads the basebackup, unpacks it, sets a recovery.signal, ..."
>
> Are you setting standby.signal or recovery.signal or both?
>
>
> > asking for WAL files. It will read the database it has and determine
> > what the next WAL filename should be. And then it asks for it. And it
> > will keep asking for these hexadecimal filenames, 1 at a time, for as
> > long as the command or set of commands provided to "restore_command"
> > returns exit code 0. If the process receives any other exit code, it
> > stops recovery, switches timeline, and considers the database to be up
> > and running at the state its in.
> >
> > It's constantly asking "I want this file now" and the script I have as
> > the restore command will attempt to download it from the cloud. Then it
> > will attempt to unzip it and move it into place. If any of these steps
> > fails, I return exit code 1.
> >
>
> > Regards,
> > Koen De Groote
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Memory settings when running postgres in a docker container

2024-11-22 Thread Koen De Groote
Ah, see, I didn't know that.

On Wed, Nov 20, 2024 at 11:10 PM David Mullineux  wrote:

> I don't get why you think all memory will be used.
>  When you say
> shared_buffers = 16GB
> effective_cache_size = 48GB
>
> ...then this is using only 16GB for shared buffers.
>
> The effective_cache_size doesn't cause any memory to be allocated. It's
> just a hint to the optimizer.
>
> On Wed, 20 Nov 2024, 11:16 Koen De Groote,  wrote:
>
>> Assuming a machine with:
>>
>> * 16 CPU cores
>> * 64GB RAM
>>
>> Set to 500 max connections
>>
>> A tool like this: https://pgtune.leopard.in.ua/
>>
>> Will output recommended settings:
>>
>> max_connections = 500
>> shared_buffers = 16GB
>> effective_cache_size = 48GB
>> maintenance_work_mem = 2GB
>> checkpoint_completion_target = 0.9
>> wal_buffers = 16MB
>> default_statistics_target = 100
>> random_page_cost = 1.1
>> effective_io_concurrency = 200
>> work_mem = 8388kB
>> huge_pages = try
>> min_wal_size = 1GB
>> max_wal_size = 4GB
>> max_worker_processes = 16
>> max_parallel_workers_per_gather = 4
>> max_parallel_workers = 16
>> max_parallel_maintenance_workers = 4
>>
>> And they basically use up all the memory of the machine.
>>
>> 16GB shared buffers, 48GB effective cache size, 8MB of work_mem for some
>> reason...
>>
>> This seems rather extreme. I feel there should be free memory for
>> emergencies and monitoring solutions.
>>
>> And then there's the fact that postgres on this machine will be run in a
>> docker container. Which, on Linux, receives 64MB of /dev/shm shared memory
>> by default, but can be increased.
>>
>> I feel like I should probably actually lower my upper limit for memory,
>> regardless of what the machine actually has, so I can have free memory, and
>> also not bring the container process itself into danger.
>>
>> Is it as straightforward as putting my limit on, say 20GB, and then
>> giving more /dev/shm to the container? Or is there more to consider?
>>
>> Regards,
>> Koen De Groote
>>
>>
>>
>>
>>
>>
>>