Re: Soundness of strategy for detecting locks acquired by DDL statements
On Wed, May 7, 2025, 00:57 Laurenz Albe wrote:
> On Tue, 2025-05-06 at 12:06 +0300, Agis Anastasopoulos wrote:
> > I'd like to "preflight" a given schema migration (i.e. one or more DDL
> > statements) before applying it to the production database (e.g. for use
> > in a CI pipeline). I'm thinking of a strategy and would like to know
> > about its soundness.
> >
> > The general idea is:
> >
> > - you have a test database that's a clone of your production one (with
> >   or without data, but with an identical schema)
> > - given the DDL statements, you open a transaction, grab its pid, and
> >   for each statement:
> >   1. from a different "observer" connection, you read pg_locks,
> >      filtering locks for that pid. This is the "before" locks
> >   2. from the first tx, you execute the statement
> >   3. from the observer, you grab pg_locks again and compute the diff
> >      between this and the "before" view
> >   4. from the first tx, you roll back the transaction
> >
> > By diffing the after/before pg_locks views, my assumption is that you
> > know which locks will be acquired by the DDL statements (but not for how
> > long). The query I'm thinking of is:
> >
> >   SELECT locktype, database, relation, objid, mode
> >   FROM pg_catalog.pg_locks
> >   WHERE pid = $1 AND locktype IN ('relation', 'object') AND granted;
> >
> > The type of statements that would be fed as input would be `ALTER|CREATE
> > TABLE`, `CREATE|DROP INDEX` and perhaps DML statements (`UPDATE`,
> > `INSERT`, `DELETE`).
> >
> > Do you think this is a robust way to detect the locks that were
> > acquired? Are there any caveats/drawbacks/flaws in this strategy?
>
> I think that that is a good strategy, as long as you run all DDL
> statements in a single transaction.
>
> Yours,
> Laurenz Albe

Can you elaborate on that? I was thinking that we should mirror the way the
statements are going to be executed in production: if they're all going to be
executed inside a single tx, then we should do the same. But if not, then we
should follow suit and execute them in separate txs. Am I missing something?

Thanks
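For anyone wanting to try this, a minimal sketch of the observer side. The
SELECT list is the one from the quoted query; the temporary snapshot table and
the :pid psql variable are illustrative additions of mine, not part of the
original proposal:

    -- observer session; set :pid to the test transaction's backend pid
    -- (\set pid <result of SELECT pg_backend_pid() in that session>)

    -- snapshot taken before the DDL statement runs
    CREATE TEMP TABLE locks_before AS
    SELECT locktype, database, relation, objid, mode
    FROM pg_catalog.pg_locks
    WHERE pid = :pid AND locktype IN ('relation', 'object') AND granted;

    -- ... the test transaction now executes one DDL statement ...

    -- newly acquired locks are whatever shows up now but not before
    SELECT locktype, database, relation, objid, mode
    FROM pg_catalog.pg_locks
    WHERE pid = :pid AND locktype IN ('relation', 'object') AND granted
    EXCEPT
    SELECT locktype, database, relation, objid, mode
    FROM locks_before;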
Re: pg_rewind problem: cannot find WAL
On Wed, 2025-05-07 at 12:51 +0200, Luca Ferrari wrote:
> running 17.4 on ubuntu 24.04 machines. I've three hosts, pg-1 (primary)
> and two physical replicas.
> I then promote host pg-3 as a master (pg_promote()) and want to rewind
> the pg-1 to follow the new master, so:
>
> ssh pg-3 'sudo -u postgres /usr/lib/postgresql/17/bin/pg_rewind -D
>   /var/lib/postgresql/17/main --source-server="user=replica_fluca
>   host=pg-3 dbname=replica_fluca"'
> pg_rewind: servers diverged at WAL location 0/B8550F8 on timeline 1
> pg_rewind: error: could not open file
>   "/var/lib/postgresql/17/main/pg_wal/0001000A": No such
>   file or directory
> pg_rewind: error: could not find previous WAL record at 0/AFFF4E8
>
> But the file 0x01A is not there:
>
> % ssh pg-3 'sudo ls /var/lib/postgresql/17/main/pg_wal'
> 0001000B.partial
> 0002.history
> 0002000B
> 0002000C
> 0002000D
> 0002000E
> archive_status
> summaries
>
> % ssh pg-1 'sudo ls /var/lib/postgresql/17/main/pg_wal'
> 00010005.0028.backup
> 0001000B
> 0001000C
> 0001000D
> 0001000E
> archive_status
> summaries
>
> Do i have to ensure the old primary pg-1 does a wal switch before
> promoting the other one and try to rewind?

I don't think it is connected to a WAL switch.

I'd say that you should set "wal_keep_size" high enough that all the WAL
needed for pg_rewind is still present.

If you have a WAL archive, you could define a restore_command on the server
you want to rewind.

Yours,
Laurenz Albe
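To make the second suggestion concrete, a rough sketch of what that could look
like on the server to be rewound (the size and the archive path below are
placeholders of mine, not values from this thread):

    # postgresql.conf on pg-1 (illustrative values)
    wal_keep_size = '2GB'          # keep enough WAL to cover the divergence point

    # if a WAL archive exists, pg_rewind (v13 and later) can fetch missing WAL
    # itself when invoked with -c / --restore-target-wal, using this command:
    restore_command = 'cp /path/to/wal_archive/%f "%p"'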
Re: Pgbackrest failure for INCR and DIFF but not FULL backup
On Wed, May 7, 2025 at 7:15 AM KK CHN wrote:
> archive_command = 'pgbackrest --stanza=My_Repo archive-push %p && cp %p /data/archive/%f'

Don't do this. You are archiving twice, and worse, the first part is using
async archiving. Remove that whole "cp" part. Once that is fixed, run:

pgbackrest --stanza=My_Repo check

to verify that WAL is being archived quickly and properly. Then test your
backups, checking the file /var/log/pgbackrest/My_Repo-archive-push-async.log
and your postgres log if any problems arise.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
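Concretely, that means the archive_command shrinks to just the pgbackrest call
(same stanza as before):

    archive_command = 'pgbackrest --stanza=My_Repo archive-push %p'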
Re: huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Hi Bogdan,

The root cause here is that the number of huge pages you've configured
(vm.nr_hugepages = 980) is not sufficient.

Each huge page on your system is 2 MB in size, so 980 pages give you roughly
1.96 GB of memory (980 × 2 MB). However, PostgreSQL is clearly requesting
about 2.2 GB of shared memory (specifically, 2204106752 bytes as shown in the
error message you provided), which exceeds what's available through huge
pages.

That’s why PostgreSQL fails to start when huge_pages = on - it requires the
entire shared memory segment to come from huge pages and refuses to fall back
to regular ones.

Earlier, you had the huge_pages setting commented out, which means PostgreSQL
used the default value: huge_pages = try. In that mode, it first attempts to
use huge pages, but if that fails (like in your case, due to insufficient
allocation), it falls back to standard memory pages — which is why the
instance started without issues then.

To fix the issue, you should increase vm.nr_hugepages to at least 1100 to
fully cover the shared memory request (you can go a bit higher to be safe and
then reduce it as described in the article I'm pasting the link to).

Also, a side note: max_connections = 1000 is quite high for an instance with
8 GB of RAM and only 2 vCPUs. Even if huge pages are properly allocated, such
a high number of connections can lead to performance issues. You might want
to consider lowering it or using a connection pooler like PgBouncer.

If you’d like to understand how huge pages work in PostgreSQL, including how
to calculate memory needs and configure the OS properly, I wrote a detailed
article some time ago (still valid). It’s in Polish, which I assume is fine
for you:
https://linuxpolska.com/pl/baza-wiedzy/blog/postgres-pamieci-ram-tipstricks/

best regards,
Alicja Kucharczyk
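For the record, the arithmetic behind that recommendation: 2204106752 bytes /
2 MiB per page = 1051 huge pages, so vm.nr_hugepages = 980 falls short and
~1100 leaves a little headroom. On PostgreSQL 15 and later you can also ask
the server itself how many pages it would need while the instance is stopped
(the binary and data directory paths below are placeholders for your layout):

    # run as the postgres OS user, with the instance shut down
    /usr/lib/postgresql/17/bin/postgres -D /path/to/data -C shared_memory_size_in_huge_pages

    # then raise the allocation (and update vm.nr_hugepages in /etc/sysctl.conf accordingly)
    sysctl -w vm.nr_hugepages=1100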
Pgbackrest failure for INCR and DIFF but not FULL backup
Hi folks,

I am facing a strange issue: pgbackrest backup fails for DIFF or INCR backups
but not for a FULL backup, with the error that a WAL file could not be
archived before the 60 second timeout.

The pgbackrest "stanza check" command sometimes succeeds, but sometimes fails.
I don't know why PG is unable to copy WAL files from pg_wal to
/data/my_archive_dir in real time. I always observe a delay of a few minutes
before a WAL file from pg_wal appears in /data/my_archive_dir.

I've checked postgresql.conf (checkpoint_timeout = 5min, max_wal_size = 16GB,
wal_keep_size = 15GB, min_wal_size = 80MB etc.) and I found the pg_wal dir is
16 GB on disk (du -h -d 1) and /data/archive = 1.2 T

/dev/mapper/rhel_bcga68-data 5.0T 1.8T 3.3T 35% /data

Can we suspect the 5 min checkpoint_timeout is the reason for the WAL
archiving delay? Backup to a remote repo server for an INCR or DIFF backup
always fails, while a full backup always succeeds!!!

What is the ideal value to set for "checkpoint_timeout"? Or does it have no
impact on the pgbackrest failure?

archive_command = 'pgbackrest --stanza=My_Repo archive-push %p && cp %p /data/archive/%f'

From the postgresql logs I am seeing this:

ERROR: [082]: unable to push WAL file '000102630002' to the archive
asynchronously after 60 second(s)
HINT: check '/var/log/pgbackrest/My_Repo-archive-push-async.log' for errors.
INFO: archive-push command end: aborted with exception [082]
2025-05-02 12:15:17 IST LOG: archive command failed with exit code 82
2025-05-02 12:15:17 IST DETAIL: The failed archive command was: pgbackrest
--stanza=My_Repo archive-push pg_wal/000102630002 && cp
pg_wal/000102630002 /data/archive/000102630002
INFO: archive-push command begin 2.52.1: [pg_wal/000102630002]
--archive-async --compress-type=zst --exec-id=2848559-384cf49c
--log-level-console=info --log-level-file=debug --log-level-stderr=info
--pg1-path=/var/lib/postgres/16/data --pg-version-force=16 --process-max=6
--repo1-host=10.x.y.202 --repo1-host-user=pgbackrest
--spool-path=/var/spool/pgbackrest --stanza=My_Repo

top output on the DB cluster:

top - 12:37:00 up 66 days, 17:24, 2 users, load average: 4.04, 4.72, 4.56
Tasks: 902 total, 4 running, 897 sleeping, 0 stopped, 1 zombie
%Cpu(s): 7.4 us, 1.7 sy, 0.0 ni, 89.9 id, 0.4 wa, 0.2 hi, 0.4 si, 0.0 st
MiB Mem : 31837.6 total, 706.1 free, 15243.0 used, 24741.0 buff/cache
MiB Swap: 8060.0 total, 6634.0 free, 1426.0 used. 16608.9 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
2839363 postgre+  20   0 8965608   7.2g   7.1g S  70.2  23.0   2:02.61 postgres
2864108 postgre+  20   0 8967848   7.1g   7.1g S  64.9  22.8   0:30.04 postgres
2865547 postgre+  20   0 8965432   7.1g   7.1g S  39.1  22.8   0:32.30 postgres
2865752 postgre+  20   0 8964352   6.9g   6.9g S  16.6  22.3   0:32.94 postgres

Model name: Intel(R) Xeon(R) Gold 6430
BIOS Model name: Intel(R) Xeon(R) Gold 6430
CPU family: 6
Model: 143
Thread(s) per core: 1
Core(s) per socket: 16

These are vCPUs (16 of them), OS RHEL 9, postgres 16, pgbackrest 2.52.1.

Any hints are most welcome to find the root cause / troubleshoot the
pgbackrest failures for DIFF/INCR backups.

Thank you
Krishane

For More Inputs:

You can see a full backup succeeding here, but a diff backup fails.

full backup: 20250505-070204F
    timestamp start/stop: 2025-05-05 07:02:04+05:30 / 2025-05-05 22:11:23+05:30
    wal start/stop: 0001026F0066 / 000102730045
    database size: 503.1GB, database backup size: 503.1GB
    repo1: backup size: 79GB

When I try a diff backup it always fails.
[root@Repo ~]# tail -f /var/log/pgbackrest/My_Repo-backup.log
stack trace:
command/archive/find.c:walSegmentFind:191:(this: {WalSegmentFind}, walSegment: {"0001027B006C"})
command/backup/backup.c:backupArchiveCheckCopy:(backupData: {BackupData}, manifest: {Manifest})
command/backup/backup.c:cmdBackup:(void)
main.c:main:(debug log level required for parameters)
2025-05-07 15:47:49.760 P00 INFO: backup command end: aborted with exception [082]
2025-05-07 15:47:49.760 P00 DEBUG: command/exit::exitSafe: => 82
2025-05-07 15:47:49.860 P00 DEBUG: main::main: => 82
^C
[root@Repo ~]# date
Wednesday 07 May 2025 04:06:37 PM IST

The postgres log says:

=316781-61d82f85 --log-level-console=info --log-level-file=debug
--log-level-stderr=info --pg1-path=/var/lib/postgres/16/data
--pg-version-force=16 --process-max=3 --repo1-host=10.x.y.202
--repo1-host-user=pgbackrest --spool-path=/var/spool/pgbackrest
--stanza=My_Repo
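One way to see the archiving backlog and lag from the database side (a generic
diagnostic, independent of pgbackrest) is to look at pg_stat_archiver and the
archive_status directory:

    -- last archived/failed WAL segment and when
    SELECT archived_count, last_archived_wal, last_archived_time,
           failed_count, last_failed_wal, last_failed_time
    FROM pg_stat_archiver;

    -- number of WAL segments still waiting to be archived
    SELECT count(*) FROM pg_ls_archive_statusdir() WHERE name LIKE '%.ready';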
huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Hi,

I have a problem running PostgreSQL on an Ubuntu 24.04 server with
huge_pages = on. My instance has 8GB RAM and 2 vCPUs (t3a.large).

My configuration is:

max_connections = 1000
superuser_reserved_connections = 3
shared_buffers = 1960MB
effective_cache_size = 5881MB
huge_pages = on
temp_buffers = 32MB
max_prepared_transactions = 100
work_mem = 1MB
maintenance_work_mem = 392MB
max_stack_depth = 4MB
dynamic_shared_memory_type = posix
archive_mode = on
max_wal_senders = 10
wal_sender_timeout = 0
min_wal_size = 4GB
max_wal_size = 16GB
wal_level = logical
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9

In my sysctl.conf I have:

vm.overcommit_memory=2
vm.overcommit_ratio=50
vm.vfs_cache_pressure=50
vm.dirty_background_ratio=10
vm.dirty_ratio=40
fs.nr_open=1000
fs.file-max=1000
vm.nr_hugepages=980
vm.hugetlb_shm_group=1010

My huge pages meminfo looks like this:

cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:     980
HugePages_Free:      980
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2007040 kB

When I start the postgres instance I get this error:

May 07 10:56:32 pg1 pg_ctl[16753]: server starting
May 07 10:56:32 pg1 systemd[1]: Started postgres-17.4-local.service - PostgreSQL 17.4 database local server.
May 07 10:56:35 pg1 pg_ctl[16756]: 2025-05-07 10:56:35.826 CEST [16756] - [ - - ] [] : XX000 FATAL: could not map anonymous shared memory: Cannot allocate memory
May 07 10:56:35 pg1 pg_ctl[16756]: 2025-05-07 10:56:35.826 CEST [16756] - [ - - ] [] : XX000 HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 2204106752 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing "shared_buffers" or "max_connections".
May 07 10:56:35 pg1 pg_ctl[16756]: 2025-05-07 10:56:35.826 CEST [16756] - [ - - ] [] : 00000 LOG: database system is shut down
May 07 10:56:35 pg1 systemd[1]: postgres-17.4-sdx.service: Main process exited, code=exited, status=1/FAILURE
May 07 10:56:35 pg1 systemd[1]: postgres-17.4-sdx.service: Failed with result 'exit-code'.
May 07 10:56:35 pg1 systemd[1]: postgres-17.4-sdx.service: Consumed 3.493s CPU time, 2.8M memory peak, 0B memory swap peak.

When I comment out huge_pages=on in the configuration, PostgreSQL starts
without problems.

Can someone tell me where the problem is?

Regards
Bogdan
pg_rewind problem: cannot find WAL
Hi all,
running 17.4 on ubuntu 24.04 machines. I've three hosts, pg-1 (primary) and
two physical replicas. I then promote host pg-3 as a master (pg_promote())
and want to rewind pg-1 to follow the new master, so:

ssh pg-3 'sudo -u postgres /usr/lib/postgresql/17/bin/pg_rewind -D
/var/lib/postgresql/17/main --source-server="user=replica_fluca host=pg-3
dbname=replica_fluca"'
pg_rewind: servers diverged at WAL location 0/B8550F8 on timeline 1
pg_rewind: error: could not open file
"/var/lib/postgresql/17/main/pg_wal/0001000A": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/AFFF4E8

But the file 0x01A is not there:

% ssh pg-3 'sudo ls /var/lib/postgresql/17/main/pg_wal'
0001000B.partial
0002.history
0002000B
0002000C
0002000D
0002000E
archive_status
summaries

% ssh pg-1 'sudo ls /var/lib/postgresql/17/main/pg_wal'
00010005.0028.backup
0001000B
0001000C
0001000D
0001000E
archive_status
summaries

Do I have to ensure the old primary pg-1 does a WAL switch before promoting
the other one and trying to rewind?

Thanks,
Luca
Re: huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Hi Alicja,
Thanks for your advice, now postgresql works fine with 'huge_pages=on'.

Regards
Bogdan

On Wed, 7 May 2025 at 14:17, Alicja Kucharczyk wrote:
> Hi Bogdan,
> The root cause here is that the number of huge pages you've configured
> (vm.nr_hugepages = 980) is not sufficient.
> Each huge page on your system is 2 MB in size, so 980 pages give you
> roughly 1.96 GB of memory (980 × 2 MB). However, PostgreSQL is clearly
> requesting about 2.2 GB of shared memory (specifically, 2204106752 bytes
> as shown in the error message you provided), which exceeds what's
> available through huge pages.
>
> That’s why PostgreSQL fails to start when huge_pages = on - it requires
> the entire shared memory segment to come from huge pages and refuses to
> fall back to regular ones.
>
> Earlier, you had the huge_pages setting commented out, which means
> PostgreSQL used the default value: huge_pages = try. In that mode, it
> first attempts to use huge pages, but if that fails (like in your case
> due to insufficient allocation), it falls back to standard memory pages —
> which is why the instance started without issues then.
>
> To fix the issue, you should increase vm.nr_hugepages to at least 1100 to
> fully cover the shared memory request (you can go a bit higher to be safe
> and then reduce it as described in the article I'm pasting the link to).
>
> Also, a side note: max_connections = 1000 is quite high for an instance
> with 8 GB of RAM and only 2 vCPUs. Even if huge pages are properly
> allocated, such a high number of connections can lead to performance
> issues. You might want to consider lowering it or using a connection
> pooler like PgBouncer.
>
> If you’d like to understand how huge pages work in PostgreSQL, including
> how to calculate memory needs and configure the OS properly, I wrote a
> detailed article some time ago (still valid). It’s in Polish, which I
> assume is fine for you:
> https://linuxpolska.com/pl/baza-wiedzy/blog/postgres-pamieci-ram-tipstricks/
>
> best regards,
> Alicja Kucharczyk
Re: huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Kudos to Alicja for that excellent answer. It would be nice if the Postgres log message was a little more descriptive. (throws it onto my huge pile of "future maybe enhancement ideas") Cheers, Greg -- Crunchy Data - https://www.crunchydata.com Enterprise Postgres Software Products & Tech Support
Re: pg_rewind problem: cannot find WAL
On Wed, May 7, 2025 at 3:55 PM Laurenz Albe wrote:
>
> I don't think it is connected to a WAL switch.

Thanks.

> I'd say that you should set "wal_keep_size" high enough that all the WAL
> needed for pg_rewind is still present.
>
> If you have a WAL archive, you could define a restore_command on the
> server you want to rewind.

I have pgbackrest making backups, so I have an archive_command. I'm going to
see if setting a restore_command can fix the problem.

Thanks for the suggestion.
Luca
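For the archives: with pgbackrest the restore_command typically looks like the
line below (the stanza name is a placeholder, since it wasn't mentioned in the
thread), and pg_rewind on v13 and later will use it when called with
-c / --restore-target-wal:

    restore_command = 'pgbackrest --stanza=<your_stanza> archive-get %f "%p"'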
Error in DROP TABLESPACE
Hi,
I've run into an odd problem: I have what seems to be a "zombie" tablespace
that PostgreSQL won't let me drop, even though nothing inside it is active.

drop tablespace ts_idx_wb2;
ERROR: tablespace "ts_idx_wb2" is not empty

postgres=# \db+ ts_idx_wb2
                                List of tablespaces
    Name    |  Owner   |              Location              |  Access privileges   | Options |  Size  | Description
------------+----------+------------------------------------+----------------------+---------+--------+-------------
 ts_idx_wb2 | postgres | /pgsql/postgres/tsdata1/ts_idx_wb2 | postgres=C/postgres +|         | 250 GB |
            |          |                                    | mydatabase=C/postgres|         |        |
(1 row)

Check on the filesystem location. On the filesystem there are files belonging
only to relfilenode 51188:

postgres@myserver:/pgsql/postgres/tsdata1/ts_idx_wb2/PG_12_201909212/16419> ls -la
total 261854080
drwx------ 2 postgres postgres      69632 Apr 22 17:45 .
drwx------ 3 postgres postgres       4096 Oct 31  2024 ..
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.1
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.10
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.100
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.101
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.102
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.103
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.104
...
...
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.97
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.98
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.99

SELECT pg_tablespace_databases((SELECT oid FROM pg_tablespace WHERE spcname = 'ts_idx_wb2'));
 pg_tablespace_databases
-------------------------
 16419

SELECT oid, datname FROM pg_database WHERE oid = 16419;
  oid  |  datname
-------+------------
 16419 | mydatabase   <--

Connected to database mydatabase:

select cl.relfilenode, nsp.nspname as schema_name, cl.relname, cl.relkind
from pg_class cl
join pg_namespace nsp on cl.relnamespace = nsp.oid and cl.relfilenode = 51188;
 relfilenode | schema_name | relname | relkind
-------------+-------------+---------+---------
(0 rows)

To be sure, I ran the same query on the other databases, with the same result
(0 rows).

So there are no actual references to these files, yet the relationship
persists sufficiently for the DROP TABLESPACE to error out. My guess is that
the files belong to an index that was being moved to the ts_idx_wb2 tablespace
on Oct 31, and during the move the postgres instance crashed due to space
shortage on the WAL filesystem; that crash left these "zombie" files on the
filesystem.

In order to drop the tablespace, is it safe to shut that postmaster down and
manually remove the contents of this directory?

Thanks
Gabriele Fagnani
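In case it helps anyone hitting the same thing, a small sketch of how the
per-database check could be automated before removing anything by hand (the
loop and the use of pg_relation_filenode() are additions of mine; the latter
also covers mapped relations whose pg_class.relfilenode is stored as 0):

    for db in $(psql -XAtc "SELECT datname FROM pg_database WHERE datallowconn"); do
        echo "== $db"
        psql -X -d "$db" -Atc "SELECT oid::regclass FROM pg_class
                               WHERE relfilenode = 51188
                                  OR pg_relation_filenode(oid) = 51188"
    done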
Re: Error in DROP TABLESPACE
On Wed, 2025-05-07 at 08:16 +0000, Fagnani Gabriele G (GDS I&TS) wrote:
> I've run into an odd problem: I have what seems to be a "zombie" tablespace
> that PostgreSQL won't let me drop, even though nothing inside it is active.
>
> drop tablespace ts_idx_wb2;
> ERROR: tablespace "ts_idx_wb2" is not empty
>
> postgres=# \db+ ts_idx_wb2
>                                 List of tablespaces
>     Name    |  Owner   |              Location              |  Access privileges   | Options |  Size  | Description
> ------------+----------+------------------------------------+----------------------+---------+--------+-------------
>  ts_idx_wb2 | postgres | /pgsql/postgres/tsdata1/ts_idx_wb2 | postgres=C/postgres +|         | 250 GB |
>             |          |                                    | mydatabase=C/postgres|         |        |
> (1 row)
>
> Check on the filesystem location. On the filesystem there are files
> belonging only to relfilenode 51188:
>
> postgres@myserver:/pgsql/postgres/tsdata1/ts_idx_wb2/PG_12_201909212/16419> ls -la
> total 261854080
> drwx------ 2 postgres postgres      69632 Apr 22 17:45 .
> drwx------ 3 postgres postgres       4096 Oct 31  2024 ..
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.1
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.10
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.100
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.101
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.102
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.103
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.104
> ...
> ...
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.97
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.98
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.99
>
> SELECT pg_tablespace_databases((SELECT oid FROM pg_tablespace WHERE spcname = 'ts_idx_wb2'));
>  pg_tablespace_databases
> -------------------------
>  16419
>
> SELECT oid, datname FROM pg_database WHERE oid = 16419;
>   oid  |  datname
> -------+------------
>  16419 | mydatabase   <--
>
> Connected to database mydatabase:
>
> select cl.relfilenode, nsp.nspname as schema_name, cl.relname, cl.relkind
> from pg_class cl
> join pg_namespace nsp on cl.relnamespace = nsp.oid and cl.relfilenode = 51188;
>  relfilenode | schema_name | relname | relkind
> -------------+-------------+---------+---------
> (0 rows)
>
> To be sure, I ran the same query on the other databases, with the same
> result (0 rows).
>
> So there are no actual references to these files, yet the relationship
> persists sufficiently for the DROP TABLESPACE to error out.
> My guess is that the files belong to an index that was being moved to the
> ts_idx_wb2 tablespace on Oct 31, and during the move the postgres instance
> crashed due to space shortage on the WAL filesystem; that crash left these
> "zombie" files on the filesystem.
>
> In order to drop the tablespace, is it safe to shut that postmaster down and
> manually remove the contents of this directory?

Your diagnosis seems to be right, and I agree with your proposed remedy.

Yours,
Laurenz Albe