Monitoring logical replication

2023-10-07 Thread Shaheed Haque
Hi,

I've been playing with logical replication (currently on PG14),
specifically in an AWS RDS Postgres context, but NOT using AWS' own
replication tooling. I'm generally familiar with the challenges of
distributed systems (such causality, time synchronisation etc), but not
especially familiar with PG.

In looking at how to tell how a given subscriber has caught up with its
publisher, there is plenty of advice around the Web, for example
https://dba.stackexchange.com/questions/314324/monitor-logical-replication-using-lsn.
Like this example, much advice ends up talking about using separate queries
on the publisher and the subscriber to compare LSNs. First, (I think) I
understand the core difficulty that comparing LSNs is inherently racy, but
given that, I'm a bit unclear as to why a single query on the publisher is
not enough...IIUC:

   - Changes sent from the publisher to the subscriber are identified by
   LSN.
   - The publisher knows it's own current latest LSN (pg_current_wal_lsn()),
   but this seems not to be exposed at the subscriber.
   - The subscriber knows what it has applied locally and even tells the
   publisher (pg_stat_subscription.latest_end_lsn), but it does not seem to
   be exposed at the publisher.

Have I missed something? Is there a way to track the LSN delta (given that
this is known to be racy) just by querying one end?

Second, how do folk "know" when replication is "done". For example, if the
two LSNs continued to match for 1 * replication lag? Or N * replication
lag? What would be a plausible N?

Third, as we know when logical replication is started, the initial table
state is captured in a snapshot, and sent across using COPY TABLE under the
covers. Now, let's say that the publisher is idle (i.e. no SQL writes to
the user's schema...obviously pg_catalog might change as replication is
configured and enabled) and that the replication starts with the publisher
as LSN_start. How could one know when the copying is done:

   - I initially assumed that the publisher's LSN would not change from
   LSN_start, but as the copying proceeds, I see that it DOES change
   (presumably because there are updates happening to pg_catalog, such as the
   temporary slots coming and going).
   - Is there some kind of singleton state on either publisher or
   subscriber that could be checked to know? (At the moment, I am counting the
   records in all copied tables).

I realise that the knowledge that the publisher is "idle" is a special
case, but right now, my test for being "done" is:

   - Number of records in copied tables matches AND the publisher's
   pg_stat_subscription matches the subscriber's pg_stat_subscription.
   latest_end_lsn.

Plus or minus the bit about replication lag, is there a better way?

Thanks, Shaheed


psql trying twice to connect to local DB

2023-10-07 Thread Steve Baldwin
I have a local DB (15.2) running in a docker container.

If I make a connection to that DB from most clients I see log entries like
this:

2023-10-07 22:32:26.518
UTC,,,16278,"172.21.0.1:33192",6521dc7a.3f96,1,"",2023-10-07
22:32:26 UTC,,0,LOG,0,"connection received: host=172.21.0.1
port=33192","","not initialized",,0
2023-10-07 22:32:26.529 UTC,"b2bc_owner","b2bcreditonline",16278,"
172.21.0.1:33192",6521dc7a.3f96,2,"authentication",2023-10-07 22:32:26
UTC,4/340,0,LOG,0,"connection authenticated: identity=""b2bc_owner""
method=scram-sha-256
(/var/lib/postgresql/data/pg_hba.conf:100)","","client backend",,0
2023-10-07 22:32:26.529 UTC,"b2bc_owner","b2bcreditonline",16278,"
172.21.0.1:33192",6521dc7a.3f96,3,"authentication",2023-10-07 22:32:26
UTC,4/340,0,LOG,0,"connection authorized: user=b2bc_owner
database=b2bcreditonline application_name=crystal","","client
backend",,0

If however I use psql I see two connection log entries - sometimes almost a
second apart. For example:

2023-10-07 22:33:16.214
UTC,,,16431,"172.21.0.1:48550",6521dcac.402f,1,"",2023-10-07
22:33:16 UTC,,0,LOG,0,"connection received: host=172.21.0.1
port=48550","","not initialized",,0
2023-10-07 22:33:17.130
UTC,,,16432,"172.21.0.1:48558",6521dcad.4030,1,"",2023-10-07
22:33:17 UTC,,0,LOG,0,"connection received: host=172.21.0.1
port=48558","","not initialized",,0
2023-10-07 22:33:17.151 UTC,"b2bc_owner","b2bcreditonline",16432,"
172.21.0.1:48558",6521dcad.4030,2,"authentication",2023-10-07 22:33:17
UTC,4/354,0,LOG,0,"connection authenticated: identity=""b2bc_owner""
method=scram-sha-256
(/var/lib/postgresql/data/pg_hba.conf:100)","","client backend",,0
2023-10-07 22:33:17.152 UTC,"b2bc_owner","b2bcreditonline",16432,"
172.21.0.1:48558",6521dcad.4030,3,"authentication",2023-10-07 22:33:17
UTC,4/354,0,LOG,0,"connection authorized: user=b2bc_owner
database=b2bcreditonline application_name=psql","","client
backend",,0

The version of psql is 15.4.

Obviously this isn't a major problem, but I'm curious if I've got something
configured incorrectly, or just something I missed from the docs.

Thanks,

Steve


Re: psql trying twice to connect to local DB

2023-10-07 Thread Steve Baldwin
It seems this is not restricted to local connections. Here's the log
entries from making a connection with the same psql client to a remote
(RDS) instance:

2023-10-07 22:47:06.506
UTC,,,23005,"10.117.100.26:55302",6521dfea.59dd,1,"",2023-10-07
22:47:06 UTC,,0,LOG,0,"connection received: host=10.117.100.26
port=55302","","not initialized",,0
2023-10-07 22:47:08.479
UTC,,,23010,"10.117.100.26:55306",6521dfec.59e2,1,"",2023-10-07
22:47:08 UTC,,0,LOG,0,"connection received: host=10.117.100.26
port=55306","","not initialized",,0
2023-10-07 22:47:09.992 UTC,"b2bc_owner","b2bcreditonline",23010,"
10.117.100.26:55306",6521dfec.59e2,2,"authentication",2023-10-07 22:47:08
UTC,29/840773,0,LOG,0,"connection authenticated:
identity=""b2bc_owner"" method=md5
(/rdsdbdata/config/pg_hba.conf:14)","","client backend",,0
2023-10-07 22:47:09.993 UTC,"b2bc_owner","b2bcreditonline",23010,"
10.117.100.26:55306",6521dfec.59e2,3,"authentication",2023-10-07 22:47:08
UTC,29/840773,0,LOG,0,"connection authorized: user=b2bc_owner
database=b2bcreditonline application_name=psql SSL enabled
(protocol=TLSv1.2, cipher=ECDHE-RSA-AES256-GCM-SHA384,
bits=256)","","client backend",,0

Compared to a connection made from our application:

2023-10-07 22:28:26.565
UTC,,,22121,"10.120.166.33:47748",6521db8a.5669,1,"",2023-10-07
22:28:26 UTC,,0,LOG,0,"connection received: host=10.120.166.33
port=47748","","not initialized",,0
2023-10-07 22:28:26.583 UTC,"b2bc_api","b2bcreditonline",22121,"
10.120.166.33:47748",6521db8a.5669,2,"authentication",2023-10-07 22:28:26
UTC,18/1263698,0,LOG,0,"connection authenticated: identity=""b2bc_api""
method=md5 (/rdsdbdata/config/pg_hba.conf:14)","","client
backend",,0
2023-10-07 22:28:26.583 UTC,"b2bc_api","b2bcreditonline",22121,"
10.120.166.33:47748",6521db8a.5669,3,"authentication",2023-10-07 22:28:26
UTC,18/1263698,0,LOG,0,"connection authorized: user=b2bc_api
database=b2bcreditonline SSL enabled (protocol=TLSv1.2,
cipher=ECDHE-RSA-AES256-GCM-SHA384, bits=256)","","client
backend",,0

It seems psql 16.0 (homebrew) behaves the same.

Steve

On Sun, Oct 8, 2023 at 9:44 AM Steve Baldwin 
wrote:

> I have a local DB (15.2) running in a docker container.
>
> If I make a connection to that DB from most clients I see log entries like
> this:
>
> 2023-10-07 22:32:26.518 
> UTC,,,16278,"172.21.0.1:33192",6521dc7a.3f96,1,"",2023-10-07
> 22:32:26 UTC,,0,LOG,0,"connection received: host=172.21.0.1
> port=33192","","not initialized",,0
> 2023-10-07 22:32:26.529 UTC,"b2bc_owner","b2bcreditonline",16278,"
> 172.21.0.1:33192",6521dc7a.3f96,2,"authentication",2023-10-07 22:32:26
> UTC,4/340,0,LOG,0,"connection authenticated: identity=""b2bc_owner""
> method=scram-sha-256
> (/var/lib/postgresql/data/pg_hba.conf:100)","","client backend",,0
> 2023-10-07 22:32:26.529 UTC,"b2bc_owner","b2bcreditonline",16278,"
> 172.21.0.1:33192",6521dc7a.3f96,3,"authentication",2023-10-07 22:32:26
> UTC,4/340,0,LOG,0,"connection authorized: user=b2bc_owner
> database=b2bcreditonline application_name=crystal","","client
> backend",,0
>
> If however I use psql I see two connection log entries - sometimes almost
> a second apart. For example:
>
> 2023-10-07 22:33:16.214 
> UTC,,,16431,"172.21.0.1:48550",6521dcac.402f,1,"",2023-10-07
> 22:33:16 UTC,,0,LOG,0,"connection received: host=172.21.0.1
> port=48550","","not initialized",,0
> 2023-10-07 22:33:17.130 
> UTC,,,16432,"172.21.0.1:48558",6521dcad.4030,1,"",2023-10-07
> 22:33:17 UTC,,0,LOG,0,"connection received: host=172.21.0.1
> port=48558","","not initialized",,0
> 2023-10-07 22:33:17.151 UTC,"b2bc_owner","b2bcreditonline",16432,"
> 172.21.0.1:48558",6521dcad.4030,2,"authentication",2023-10-07 22:33:17
> UTC,4/354,0,LOG,0,"connection authenticated: identity=""b2bc_owner""
> method=scram-sha-256
> (/var/lib/postgresql/data/pg_hba.conf:100)","","client backend",,0
> 2023-10-07 22:33:17.152 UTC,"b2bc_owner","b2bcreditonline",16432,"
> 172.21.0.1:48558",6521dcad.4030,3,"authentication",2023-10-07 22:33:17
> UTC,4/354,0,LOG,0,"connection authorized: user=b2bc_owner
> database=b2bcreditonline application_name=psql","","client
> backend",,0
>
> The version of psql is 15.4.
>
> Obviously this isn't a major problem, but I'm curious if I've got
> something configured incorrectly, or just something I missed from the docs.
>
> Thanks,
>
> Steve
>


Re: psql trying twice to connect to local DB

2023-10-07 Thread Tom Lane
Steve Baldwin  writes:
> If I make a connection to that DB from most clients I see log entries like
> this:
> ...
> If however I use psql I see two connection log entries - sometimes almost a
> second apart. For example:

These log entries show that you're using password-based authentication.
I believe what is happening is that psql tries to connect, discovers
that it needs a password, and then tries again after it's got a
password.  The delay would correspond to the time for you to type
your password (I take it you're a fast typist).

If you don't like the extra log entries, there's a command-line
switch that forces psql to prompt for password before connecting
the first time.  Of course, that might result in prompting for
a password that's not really needed.

regards, tom lane




Re: psql trying twice to connect to local DB

2023-10-07 Thread Ron

On 10/7/23 17:44, Steve Baldwin wrote:

I have a local DB (15.2) running in a docker container.

If I make a connection to that DB from most clients I see log entries like 
this:


2023-10-07 22:32:26.518 UTC,,,16278,"172.21.0.1:33192 
",6521dc7a.3f96,1,"",2023-10-07 22:32:26 
UTC,,0,LOG,0,"connection received: host=172.21.0.1 
port=33192","","not initialized",,0
2023-10-07 22:32:26.529 
UTC,"b2bc_owner","b2bcreditonline",16278,"172.21.0.1:33192 
",6521dc7a.3f96,2,"authentication",2023-10-07 
22:32:26 UTC,4/340,0,LOG,0,"connection authenticated: 
identity=""b2bc_owner"" method=scram-sha-256 
(/var/lib/postgresql/data/pg_hba.conf:100)","","client backend",,0
2023-10-07 22:32:26.529 
UTC,"b2bc_owner","b2bcreditonline",16278,"172.21.0.1:33192 
",6521dc7a.3f96,3,"authentication",2023-10-07 
22:32:26 UTC,4/340,0,LOG,0,"connection authorized: user=b2bc_owner 
database=b2bcreditonline application_name=crystal","","client 
backend",,0


If however I use psql I see two connection log entries - sometimes almost 
a second apart. For example:


2023-10-07 22:33:16.214 UTC,,,16431,"172.21.0.1:48550 
",6521dcac.402f,1,"",2023-10-07 22:33:16 
UTC,,0,LOG,0,"connection received: host=172.21.0.1 
port=48550","","not initialized",,0
2023-10-07 22:33:17.130 UTC,,,16432,"172.21.0.1:48558 
",6521dcad.4030,1,"",2023-10-07 22:33:17 
UTC,,0,LOG,0,"connection received: host=172.21.0.1 
port=48558","","not initialized",,0
2023-10-07 22:33:17.151 
UTC,"b2bc_owner","b2bcreditonline",16432,"172.21.0.1:48558 
",6521dcad.4030,2,"authentication",2023-10-07 
22:33:17 UTC,4/354,0,LOG,0,"connection authenticated: 
identity=""b2bc_owner"" method=scram-sha-256 
(/var/lib/postgresql/data/pg_hba.conf:100)","","client backend",,0
2023-10-07 22:33:17.152 
UTC,"b2bc_owner","b2bcreditonline",16432,"172.21.0.1:48558 
",6521dcad.4030,3,"authentication",2023-10-07 
22:33:17 UTC,4/354,0,LOG,0,"connection authorized: user=b2bc_owner 
database=b2bcreditonline application_name=psql","","client backend",,0


The version of psql is 15.4.

Obviously this isn't a major problem, but I'm curious if I've got 
something configured incorrectly, or just something I missed from the docs.


As follow-up to Tom's response: create a .pgpass file with an entry for 
b2bc_owner.  There might still be two entries, but the time between the two 
of them should be instantaneous.


--
Born in Arizona, moved to Babylonia.