Re: Intermittent Issue with WAL Segment Removal in Logical Replication

2024-01-01 Thread Kaushik Iska
I am unfortunately not really familiar with Google Cloud SQL internals as
well. But we have seen this happen on Amazon RDS as well.

Could it be possible that we are requesting a future WAL segment, say WAL
upto X is written and we are asking for X + 1? It could be that the error
message is misleading.

I do not have the information from pg_replication_slots as I have
terminated the test. I am fairly certain that I can reproduce this again. I
will gather both the restart_lsn and contents of pg_wal for the failed
segment. Is there any other information that would help debug this further?

On Fri, Dec 29, 2023 at 4:16 PM Tomas Vondra 
wrote:

> On 12/27/23 16:31, Kaushik Iska wrote:
> > Hi all,
> >
> > I'm including additional details, as I am able to reproduce this issue a
> > little more reliably.
> >
> > Postgres Version: POSTGRES_14_9.R20230830.01_07
> > Vendor: Google Cloud SQL
> > Logical Replication Protocol version 1
> >
>
> I don't know much about Google Cloud SQL internals. Is it relatively
> close to Postgres (as e.g. RDS) or are the internals very different /
> modified for cloud environments?
>
> > Here are the logs of attempt succeeding right after it fails:
> >
> > 2023-12-27 01:12:40.581 UTC [59790]: [6-1] db=postgres,user=postgres
> > STATEMENT:  START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL
> > 6/5AE67D79 (proto_version '1', publication_names
> > 'peerflow_pub_wal_testing_2') <- FAILS
> > 2023-12-27 01:12:41.087 UTC [59790]: [7-1] db=postgres,user=postgres
> > ERROR:  requested WAL segment 000100060059 has already been
> > removed
> > 2023-12-27 01:12:44.581 UTC [59794]: [3-1] db=postgres,user=postgres
> > STATEMENT:  START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL
> > 6/5AE67D79 (proto_version '1', publication_names
> > 'peerflow_pub_wal_testing_2')  <- SUCCEEDS
> > 2023-12-27 01:12:44.582 UTC [59794]: [4-1] db=postgres,user=postgres
> > LOG:  logical decoding found consistent point at 6/5A31F050
> >
> > Happy to include any additional details of my setup.
> >
>
> I personally don't see how could this fail and then succeed, unless
> Google does something smart with the WAL segments under the hood. Surely
> we try to open the same WAL segment (given the LSN is the same), so how
> could it not exist and then exist?
>
> As Ron already suggested, it might be useful to see information for the
> replication slot peerflow_slot_wal_testing_2 (especially the restart_lsn
> value). Also, maybe show the contents of pg_wal (especially for the
> segment referenced in the error message).
>
> Can you reproduce this outside Google cloud environment?
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: Read write performance check

2024-01-01 Thread Hao Zhang
Veem

You should also be familiar with Aurora Postgres's storage
architecture, which is very different from regular Postgres (see
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.html
)
Aurora is remote storage, which means if your read workload can't fit into
the PG's shared buffers, it will have a very different performance than if
the storage is a local SSD.
On write, it writes six copies to three different availability zones for
high durability and availablity. So having enough network bandwidth is a
factor as well.

Ken

On Tue, Dec 26, 2023 at 11:30 PM Kirk Wolak  wrote:

> On Thu, Dec 21, 2023 at 8:31 AM veem v  wrote:
>
>> Can someone please guide me, if any standard scripting is available for
>> doing such read/write performance test? Or point me to any available docs?
>>
>>
>> ...
>
>
> Veem, first things first... "Top Posting" is when you reply at the top of
> the email...  Notice how I replied at the bottom (and I deleted context,
> clearly).
> This is the style we prefer here.
>
> Second, since you are new to postgreSQL... Let me recommend some reading.
> Cybertec has articles on performance (Tom Kyte style).
> Also, read the "Don't Do That" wiki, and finally, have a look at pgbench
> and psql documentation.  And specifically look at GENERATE_SERIES(),
> but the Cybertec articles will touch on that.  Regardless...  Reading the
> docs is insightful.
>
> Links:
> https://www.cybertec-postgresql.com/en/postgresql-hash-index-performance/
> https://wiki.postgresql.org/wiki/Don't_Do_This
> https://www.postgresql.org/docs/current/app-psql.html
> https://www.postgresql.org/docs/current/pgbench.html
> https://www.postgresql.org/docs/16/functions-srf.html
>
> HTH,
>
> Kirk Out!
>
>
>