Re: Intermittent Issue with WAL Segment Removal in Logical Replication
I am unfortunately not really familiar with Google Cloud SQL internals as well. But we have seen this happen on Amazon RDS as well. Could it be possible that we are requesting a future WAL segment, say WAL upto X is written and we are asking for X + 1? It could be that the error message is misleading. I do not have the information from pg_replication_slots as I have terminated the test. I am fairly certain that I can reproduce this again. I will gather both the restart_lsn and contents of pg_wal for the failed segment. Is there any other information that would help debug this further? On Fri, Dec 29, 2023 at 4:16 PM Tomas Vondra wrote: > On 12/27/23 16:31, Kaushik Iska wrote: > > Hi all, > > > > I'm including additional details, as I am able to reproduce this issue a > > little more reliably. > > > > Postgres Version: POSTGRES_14_9.R20230830.01_07 > > Vendor: Google Cloud SQL > > Logical Replication Protocol version 1 > > > > I don't know much about Google Cloud SQL internals. Is it relatively > close to Postgres (as e.g. RDS) or are the internals very different / > modified for cloud environments? > > > Here are the logs of attempt succeeding right after it fails: > > > > 2023-12-27 01:12:40.581 UTC [59790]: [6-1] db=postgres,user=postgres > > STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL > > 6/5AE67D79 (proto_version '1', publication_names > > 'peerflow_pub_wal_testing_2') <- FAILS > > 2023-12-27 01:12:41.087 UTC [59790]: [7-1] db=postgres,user=postgres > > ERROR: requested WAL segment 000100060059 has already been > > removed > > 2023-12-27 01:12:44.581 UTC [59794]: [3-1] db=postgres,user=postgres > > STATEMENT: START_REPLICATION SLOT peerflow_slot_wal_testing_2 LOGICAL > > 6/5AE67D79 (proto_version '1', publication_names > > 'peerflow_pub_wal_testing_2') <- SUCCEEDS > > 2023-12-27 01:12:44.582 UTC [59794]: [4-1] db=postgres,user=postgres > > LOG: logical decoding found consistent point at 6/5A31F050 > > > > Happy to include any additional details of my setup. > > > > I personally don't see how could this fail and then succeed, unless > Google does something smart with the WAL segments under the hood. Surely > we try to open the same WAL segment (given the LSN is the same), so how > could it not exist and then exist? > > As Ron already suggested, it might be useful to see information for the > replication slot peerflow_slot_wal_testing_2 (especially the restart_lsn > value). Also, maybe show the contents of pg_wal (especially for the > segment referenced in the error message). > > Can you reproduce this outside Google cloud environment? > > > regards > > -- > Tomas Vondra > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: Read write performance check
Veem You should also be familiar with Aurora Postgres's storage architecture, which is very different from regular Postgres (see https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Overview.html ) Aurora is remote storage, which means if your read workload can't fit into the PG's shared buffers, it will have a very different performance than if the storage is a local SSD. On write, it writes six copies to three different availability zones for high durability and availablity. So having enough network bandwidth is a factor as well. Ken On Tue, Dec 26, 2023 at 11:30 PM Kirk Wolak wrote: > On Thu, Dec 21, 2023 at 8:31 AM veem v wrote: > >> Can someone please guide me, if any standard scripting is available for >> doing such read/write performance test? Or point me to any available docs? >> >> >> ... > > > Veem, first things first... "Top Posting" is when you reply at the top of > the email... Notice how I replied at the bottom (and I deleted context, > clearly). > This is the style we prefer here. > > Second, since you are new to postgreSQL... Let me recommend some reading. > Cybertec has articles on performance (Tom Kyte style). > Also, read the "Don't Do That" wiki, and finally, have a look at pgbench > and psql documentation. And specifically look at GENERATE_SERIES(), > but the Cybertec articles will touch on that. Regardless... Reading the > docs is insightful. > > Links: > https://www.cybertec-postgresql.com/en/postgresql-hash-index-performance/ > https://wiki.postgresql.org/wiki/Don't_Do_This > https://www.postgresql.org/docs/current/app-psql.html > https://www.postgresql.org/docs/current/pgbench.html > https://www.postgresql.org/docs/16/functions-srf.html > > HTH, > > Kirk Out! > > >