Re: Soundness of strategy for detecting locks acquired by DDL statements
On Wed, May 7, 2025, 00:57 Laurenz Albe wrote:
> On Tue, 2025-05-06 at 12:06 +0300, Agis Anastasopoulos wrote:
> > I'd like to "preflight" a given schema migration (i.e. one or more DDL
> > statements) before applying it to the production database (e.g. for use
> > in a CI pipeline). I'm thinking of a strategy and would like to know
> > about its soundness.
> >
> > The general idea is:
> >
> > - you have a test database that's a clone of your production one (with
> >   or without data, but with an identical schema)
> > - given the DDL statements, you open a transaction, grab its pid, and
> >   for each statement:
> >   1. from a different "observer" connection, you read pg_locks,
> >      filtering locks for that pid. This is the "before" locks
> >   2. from the first tx, you execute the statement
> >   3. from the observer, you grab pg_locks again and compute the diff
> >      between this and the "before" view
> >   4. from the first tx, you roll back the transaction
> >
> > By diffing the after/before pg_locks views, my assumption is that you
> > know which locks will be acquired by the DDL statements (but not for how
> > long). The query I'm thinking of is:
> >
> >   SELECT locktype, database, relation, objid, mode
> >   FROM pg_catalog.pg_locks
> >   WHERE pid = $1 AND locktype IN ('relation', 'object') AND granted;
> >
> > The type of statements that would be fed as input would be `ALTER|CREATE
> > TABLE`, `CREATE|DROP INDEX` and perhaps DML statements (`UPDATE`,
> > `INSERT`, `DELETE`).
> >
> > Do you think this is a robust way to detect the locks that were
> > acquired? Are there any caveats/drawbacks/flaws in this strategy?
>
> I think that that is a good strategy, as long as you run all DDL
> statements in a single transaction.
>
> Yours,
> Laurenz Albe

Can you elaborate on that? I was thinking that we should mirror the way the
statements are going to be executed in production: if they're all going to be
executed inside a single tx, then we should do the same. But if not, then we
should follow suit and execute them in separate txs. Am I missing something?

Thanks
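For anyone wanting to try this, a minimal sketch of the observer side. The
SELECT list is the one from the quoted query; the temporary snapshot table and
the :pid psql variable are illustrative additions of mine, not part of the
original proposal:

    -- observer session; set :pid to the test transaction's backend pid
    -- (\set pid <result of SELECT pg_backend_pid() in that session>)

    -- snapshot taken before the DDL statement runs
    CREATE TEMP TABLE locks_before AS
    SELECT locktype, database, relation, objid, mode
    FROM pg_catalog.pg_locks
    WHERE pid = :pid AND locktype IN ('relation', 'object') AND granted;

    -- ... the test transaction now executes one DDL statement ...

    -- newly acquired locks are whatever shows up now but not before
    SELECT locktype, database, relation, objid, mode
    FROM pg_catalog.pg_locks
    WHERE pid = :pid AND locktype IN ('relation', 'object') AND granted
    EXCEPT
    SELECT locktype, database, relation, objid, mode
    FROM locks_before;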
Re: pg_rewind problem: cannot find WAL
On Wed, 2025-05-07 at 12:51 +0200, Luca Ferrari wrote:
> running 17.4 on ubuntu 24.04 machines. I've three hosts, pg-1 (primary)
> and two physical replicas.
> I then promote host pg-3 as a master (pg_promote()) and want to rewind
> the pg-1 to follow the new master, so:
>
> ssh pg-3 'sudo -u postgres /usr/lib/postgresql/17/bin/pg_rewind -D
>   /var/lib/postgresql/17/main --source-server="user=replica_fluca
>   host=pg-3 dbname=replica_fluca"'
> pg_rewind: servers diverged at WAL location 0/B8550F8 on timeline 1
> pg_rewind: error: could not open file
>   "/var/lib/postgresql/17/main/pg_wal/0001000A": No such
>   file or directory
> pg_rewind: error: could not find previous WAL record at 0/AFFF4E8
>
> But the file 0x01A is not there:
>
> % ssh pg-3 'sudo ls /var/lib/postgresql/17/main/pg_wal'
> 0001000B.partial
> 0002.history
> 0002000B
> 0002000C
> 0002000D
> 0002000E
> archive_status
> summaries
>
> % ssh pg-1 'sudo ls /var/lib/postgresql/17/main/pg_wal'
> 00010005.0028.backup
> 0001000B
> 0001000C
> 0001000D
> 0001000E
> archive_status
> summaries
>
> Do i have to ensure the old primary pg-1 does a wal switch before
> promoting the other one and try to rewind?

I don't think it is connected to a WAL switch.

I'd say that you should set "wal_keep_size" high enough that all the WAL
needed for pg_rewind is still present.

If you have a WAL archive, you could define a restore_command on the server
you want to rewind.

Yours,
Laurenz Albe
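To make the second suggestion concrete, a rough sketch of what that could look
like on the server to be rewound (the size and the archive path below are
placeholders of mine, not values from this thread):

    # postgresql.conf on pg-1 (illustrative values)
    wal_keep_size = '2GB'          # keep enough WAL to cover the divergence point

    # if a WAL archive exists, pg_rewind (v13 and later) can fetch missing WAL
    # itself when invoked with -c / --restore-target-wal, using this command:
    restore_command = 'cp /path/to/wal_archive/%f "%p"'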
Re: Pgbackrest failure for INCR and DIFF but not FULL backup
On Wed, May 7, 2025 at 7:15 AM KK CHN wrote:
> archive_command = 'pgbackrest --stanza=My_Repo archive-push %p && cp %p /data/archive/%f'

Don't do this. You are archiving twice, and worse, the first part is using
async archiving. Remove that whole "cp" part. Once that is fixed, run:

pgbackrest --stanza=My_Repo check

to verify that WAL is being archived quickly and properly. Then test your
backups, checking the file /var/log/pgbackrest/My_Repo-archive-push-async.log
and your postgres log if any problems arise.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support
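Concretely, that means the archive_command shrinks to just the pgbackrest call
(same stanza as before):

    archive_command = 'pgbackrest --stanza=My_Repo archive-push %p'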
Re: huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Hi Bogdan,

The root cause here is that the number of huge pages you've configured
(vm.nr_hugepages = 980) is not sufficient.

Each huge page on your system is 2 MB in size, so 980 pages give you roughly
1.96 GB of memory (980 × 2 MB). However, PostgreSQL is clearly requesting
about 2.2 GB of shared memory (specifically, 2204106752 bytes as shown in the
error message you provided), which exceeds what's available through huge
pages.

That’s why PostgreSQL fails to start when huge_pages = on - it requires the
entire shared memory segment to come from huge pages and refuses to fall back
to regular ones.

Earlier, you had the huge_pages setting commented out, which means PostgreSQL
used the default value: huge_pages = try. In that mode, it first attempts to
use huge pages, but if that fails (like in your case, due to insufficient
allocation), it falls back to standard memory pages — which is why the
instance started without issues then.

To fix the issue, you should increase vm.nr_hugepages to at least 1100 to
fully cover the shared memory request (you can go a bit higher to be safe and
then reduce it as described in the article I'm pasting the link to).

Also, a side note: max_connections = 1000 is quite high for an instance with
8 GB of RAM and only 2 vCPUs. Even if huge pages are properly allocated, such
a high number of connections can lead to performance issues. You might want
to consider lowering it or using a connection pooler like PgBouncer.

If you’d like to understand how huge pages work in PostgreSQL, including how
to calculate memory needs and configure the OS properly, I wrote a detailed
article some time ago (still valid). It’s in Polish, which I assume is fine
for you:
https://linuxpolska.com/pl/baza-wiedzy/blog/postgres-pamieci-ram-tipstricks/

best regards,
Alicja Kucharczyk
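For the record, the arithmetic behind that recommendation: 2204106752 bytes /
2 MiB per page = 1051 huge pages, so vm.nr_hugepages = 980 falls short and
~1100 leaves a little headroom. On PostgreSQL 15 and later you can also ask
the server itself how many pages it would need while the instance is stopped
(the binary and data directory paths below are placeholders for your layout):

    # run as the postgres OS user, with the instance shut down
    /usr/lib/postgresql/17/bin/postgres -D /path/to/data -C shared_memory_size_in_huge_pages

    # then raise the allocation (and update vm.nr_hugepages in /etc/sysctl.conf accordingly)
    sysctl -w vm.nr_hugepages=1100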
Pgbackrest failure for INCR and DIFF but not FULL backup
Hi folks,

I am facing a strange issue: pgbackrest backup fails for DIFF or INCR backups
but not for a FULL backup, with the error that a WAL file could not be
archived before the 60 second timeout.

The pgbackrest "stanza check" command sometimes succeeds, but sometimes fails.
I don't know why PG is unable to copy WAL files from pg_wal to
/data/my_archive_dir in real time. I always observe a delay of a few minutes
before a WAL file from pg_wal appears in /data/my_archive_dir.

I've checked postgresql.conf (checkpoint_timeout = 5min, max_wal_size = 16GB,
wal_keep_size = 15GB, min_wal_size = 80MB etc.) and I found the pg_wal dir is
16 GB on disk (du -h -d 1) and /data/archive = 1.2 T

/dev/mapper/rhel_bcga68-data 5.0T 1.8T 3.3T 35% /data

Can we suspect the 5 min checkpoint_timeout is the reason for the WAL
archiving delay? Backup to a remote repo server for an INCR or DIFF backup
always fails, while a full backup always succeeds!!!

What is the ideal value to set for "checkpoint_timeout"? Or does it have no
impact on the pgbackrest failure?

archive_command = 'pgbackrest --stanza=My_Repo archive-push %p && cp %p /data/archive/%f'

From the postgresql logs I am seeing this:

ERROR: [082]: unable to push WAL file '000102630002' to the archive
asynchronously after 60 second(s)
HINT: check '/var/log/pgbackrest/My_Repo-archive-push-async.log' for errors.
INFO: archive-push command end: aborted with exception [082]
2025-05-02 12:15:17 IST LOG: archive command failed with exit code 82
2025-05-02 12:15:17 IST DETAIL: The failed archive command was: pgbackrest
--stanza=My_Repo archive-push pg_wal/000102630002 && cp
pg_wal/000102630002 /data/archive/000102630002
INFO: archive-push command begin 2.52.1: [pg_wal/000102630002]
--archive-async --compress-type=zst --exec-id=2848559-384cf49c
--log-level-console=info --log-level-file=debug --log-level-stderr=info
--pg1-path=/var/lib/postgres/16/data --pg-version-force=16 --process-max=6
--repo1-host=10.x.y.202 --repo1-host-user=pgbackrest
--spool-path=/var/spool/pgbackrest --stanza=My_Repo

top output on the DB cluster:

top - 12:37:00 up 66 days, 17:24, 2 users, load average: 4.04, 4.72, 4.56
Tasks: 902 total, 4 running, 897 sleeping, 0 stopped, 1 zombie
%Cpu(s): 7.4 us, 1.7 sy, 0.0 ni, 89.9 id, 0.4 wa, 0.2 hi, 0.4 si, 0.0 st
MiB Mem : 31837.6 total, 706.1 free, 15243.0 used, 24741.0 buff/cache
MiB Swap: 8060.0 total, 6634.0 free, 1426.0 used. 16608.9 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
2839363 postgre+  20   0 8965608   7.2g   7.1g S  70.2  23.0   2:02.61 postgres
2864108 postgre+  20   0 8967848   7.1g   7.1g S  64.9  22.8   0:30.04 postgres
2865547 postgre+  20   0 8965432   7.1g   7.1g S  39.1  22.8   0:32.30 postgres
2865752 postgre+  20   0 8964352   6.9g   6.9g S  16.6  22.3   0:32.94 postgres

Model name: Intel(R) Xeon(R) Gold 6430
BIOS Model name: Intel(R) Xeon(R) Gold 6430
CPU family: 6
Model: 143
Thread(s) per core: 1
Core(s) per socket: 16

These are vCPUs (16 of them), OS RHEL 9, postgres 16, pgbackrest 2.52.1.

Any hints are most welcome to find the root cause / troubleshoot the
pgbackrest failures for DIFF/INCR backups.

Thank you
Krishane

For More Inputs:

You can see a full backup succeeding here, but a diff backup fails.

full backup: 20250505-070204F
    timestamp start/stop: 2025-05-05 07:02:04+05:30 / 2025-05-05 22:11:23+05:30
    wal start/stop: 0001026F0066 / 000102730045
    database size: 503.1GB, database backup size: 503.1GB
    repo1: backup size: 79GB

When I try a diff backup it always fails.
[root@Repo ~]# tail -f /var/log/pgbackrest/My_Repo-backup.log
stack trace:
command/archive/find.c:walSegmentFind:191:(this: {WalSegmentFind}, walSegment: {"0001027B006C"})
command/backup/backup.c:backupArchiveCheckCopy:(backupData: {BackupData}, manifest: {Manifest})
command/backup/backup.c:cmdBackup:(void)
main.c:main:(debug log level required for parameters)
2025-05-07 15:47:49.760 P00 INFO: backup command end: aborted with exception [082]
2025-05-07 15:47:49.760 P00 DEBUG: command/exit::exitSafe: => 82
2025-05-07 15:47:49.860 P00 DEBUG: main::main: => 82
^C
[root@Repo ~]# date
Wednesday 07 May 2025 04:06:37 PM IST

The postgres log says:

=316781-61d82f85 --log-level-console=info --log-level-file=debug
--log-level-stderr=info --pg1-path=/var/lib/postgres/16/data
--pg-version-force=16 --process-max=3 --repo1-host=10.x.y.202
--repo1-host-user=pgbackrest --spool-path=/var/spool/pgbackrest
--stanza=My_Repo
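One way to see the archiving backlog and lag from the database side (a generic
diagnostic, independent of pgbackrest) is to look at pg_stat_archiver and the
archive_status directory:

    -- last archived/failed WAL segment and when
    SELECT archived_count, last_archived_wal, last_archived_time,
           failed_count, last_failed_wal, last_failed_time
    FROM pg_stat_archiver;

    -- number of WAL segments still waiting to be archived
    SELECT count(*) FROM pg_ls_archive_statusdir() WHERE name LIKE '%.ready';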
huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Hi,

I have a problem running PostgreSQL on an Ubuntu 24.04 server with
huge_pages = on. My instance has 8GB RAM and 2 vCPUs (t3a.large).

My configuration is:

max_connections = 1000
superuser_reserved_connections = 3
shared_buffers = 1960MB
effective_cache_size = 5881MB
huge_pages = on
temp_buffers = 32MB
max_prepared_transactions = 100
work_mem = 1MB
maintenance_work_mem = 392MB
max_stack_depth = 4MB
dynamic_shared_memory_type = posix
archive_mode = on
max_wal_senders = 10
wal_sender_timeout = 0
min_wal_size = 4GB
max_wal_size = 16GB
wal_level = logical
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9

In my sysctl.conf I have:

vm.overcommit_memory=2
vm.overcommit_ratio=50
vm.vfs_cache_pressure=50
vm.dirty_background_ratio=10
vm.dirty_ratio=40
fs.nr_open=1000
fs.file-max=1000
vm.nr_hugepages=980
vm.hugetlb_shm_group=1010

My huge pages meminfo looks like this:

cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:     980
HugePages_Free:      980
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2007040 kB

When I start the postgres instance I get this error:

May 07 10:56:32 pg1 pg_ctl[16753]: server starting
May 07 10:56:32 pg1 systemd[1]: Started postgres-17.4-local.service - PostgreSQL 17.4 database local server.
May 07 10:56:35 pg1 pg_ctl[16756]: 2025-05-07 10:56:35.826 CEST [16756] - [ - - ] [] : XX000 FATAL: could not map anonymous shared memory: Cannot allocate memory
May 07 10:56:35 pg1 pg_ctl[16756]: 2025-05-07 10:56:35.826 CEST [16756] - [ - - ] [] : XX000 HINT: This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 2204106752 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing "shared_buffers" or "max_connections".
May 07 10:56:35 pg1 pg_ctl[16756]: 2025-05-07 10:56:35.826 CEST [16756] - [ - - ] [] : 00000 LOG: database system is shut down
May 07 10:56:35 pg1 systemd[1]: postgres-17.4-sdx.service: Main process exited, code=exited, status=1/FAILURE
May 07 10:56:35 pg1 systemd[1]: postgres-17.4-sdx.service: Failed with result 'exit-code'.
May 07 10:56:35 pg1 systemd[1]: postgres-17.4-sdx.service: Consumed 3.493s CPU time, 2.8M memory peak, 0B memory swap peak.

When I comment out huge_pages=on in the configuration, PostgreSQL starts
without problems.

Can someone tell me where the problem is?

Regards
Bogdan
pg_rewind problem: cannot find WAL
Hi all,
running 17.4 on ubuntu 24.04 machines. I've three hosts, pg-1 (primary) and
two physical replicas. I then promote host pg-3 as a master (pg_promote())
and want to rewind pg-1 to follow the new master, so:

ssh pg-3 'sudo -u postgres /usr/lib/postgresql/17/bin/pg_rewind -D
/var/lib/postgresql/17/main --source-server="user=replica_fluca host=pg-3
dbname=replica_fluca"'
pg_rewind: servers diverged at WAL location 0/B8550F8 on timeline 1
pg_rewind: error: could not open file
"/var/lib/postgresql/17/main/pg_wal/0001000A": No such file or directory
pg_rewind: error: could not find previous WAL record at 0/AFFF4E8

But the file 0x01A is not there:

% ssh pg-3 'sudo ls /var/lib/postgresql/17/main/pg_wal'
0001000B.partial
0002.history
0002000B
0002000C
0002000D
0002000E
archive_status
summaries

% ssh pg-1 'sudo ls /var/lib/postgresql/17/main/pg_wal'
00010005.0028.backup
0001000B
0001000C
0001000D
0001000E
archive_status
summaries

Do I have to ensure the old primary pg-1 does a WAL switch before promoting
the other one and trying to rewind?

Thanks,
Luca
Re: huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Hi Alicja,
Thanks for your advice, now postgresql works fine with 'huge_pages=on'.

Regards
Bogdan

On Wed, 7 May 2025 at 14:17, Alicja Kucharczyk wrote:
> Hi Bogdan,
> The root cause here is that the number of huge pages you've configured
> (vm.nr_hugepages = 980) is not sufficient.
> Each huge page on your system is 2 MB in size, so 980 pages give you
> roughly 1.96 GB of memory (980 × 2 MB). However, PostgreSQL is clearly
> requesting about 2.2 GB of shared memory (specifically, 2204106752 bytes
> as shown in the error message you provided), which exceeds what's
> available through huge pages.
>
> That’s why PostgreSQL fails to start when huge_pages = on - it requires
> the entire shared memory segment to come from huge pages and refuses to
> fall back to regular ones.
>
> Earlier, you had the huge_pages setting commented out, which means
> PostgreSQL used the default value: huge_pages = try. In that mode, it
> first attempts to use huge pages, but if that fails (like in your case
> due to insufficient allocation), it falls back to standard memory pages —
> which is why the instance started without issues then.
>
> To fix the issue, you should increase vm.nr_hugepages to at least 1100 to
> fully cover the shared memory request (you can go a bit higher to be safe
> and then reduce it as described in the article I'm pasting the link to).
>
> Also, a side note: max_connections = 1000 is quite high for an instance
> with 8 GB of RAM and only 2 vCPUs. Even if huge pages are properly
> allocated, such a high number of connections can lead to performance
> issues. You might want to consider lowering it or using a connection
> pooler like PgBouncer.
>
> If you’d like to understand how huge pages work in PostgreSQL, including
> how to calculate memory needs and configure the OS properly, I wrote a
> detailed article some time ago (still valid). It’s in Polish, which I
> assume is fine for you:
> https://linuxpolska.com/pl/baza-wiedzy/blog/postgres-pamieci-ram-tipstricks/
>
> best regards,
> Alicja Kucharczyk
Re: huge_pages=on cause could not map anonymous shared memory: Cannot allocate memory
Kudos to Alicja for that excellent answer. It would be nice if the Postgres log message was a little more descriptive. (throws it onto my huge pile of "future maybe enhancement ideas") Cheers, Greg -- Crunchy Data - https://www.crunchydata.com Enterprise Postgres Software Products & Tech Support
Re: pg_rewind problem: cannot find WAL
On Wed, May 7, 2025 at 3:55 PM Laurenz Albe wrote:
>
> I don't think it is connected to a WAL switch.

Thanks.

> I'd say that you should set "wal_keep_size" high enough that all the WAL
> needed for pg_rewind is still present.
>
> If you have a WAL archive, you could define a restore_command on the
> server you want to rewind.

I have pgbackrest making backups, so I have an archive_command. I'm going to
see if setting a restore_command can fix the problem.

Thanks for the suggestion.
Luca
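For the archives: with pgbackrest the restore_command typically looks like the
line below (the stanza name is a placeholder, since it wasn't mentioned in the
thread), and pg_rewind on v13 and later will use it when called with
-c / --restore-target-wal:

    restore_command = 'pgbackrest --stanza=<your_stanza> archive-get %f "%p"'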
Error in DROP TABLESPACE
Hi,
I've run into an odd problem: I have what seems to be a "zombie" tablespace
that PostgreSQL won't let me drop, even though nothing inside it is active.

drop tablespace ts_idx_wb2;
ERROR: tablespace "ts_idx_wb2" is not empty

postgres=# \db+ ts_idx_wb2
                                List of tablespaces
    Name    |  Owner   |              Location              |  Access privileges   | Options |  Size  | Description
------------+----------+------------------------------------+----------------------+---------+--------+-------------
 ts_idx_wb2 | postgres | /pgsql/postgres/tsdata1/ts_idx_wb2 | postgres=C/postgres +|         | 250 GB |
            |          |                                    | mydatabase=C/postgres|         |        |
(1 row)

Check on the filesystem location. On the filesystem there are files belonging
only to relfilenode 51188:

postgres@myserver:/pgsql/postgres/tsdata1/ts_idx_wb2/PG_12_201909212/16419> ls -la
total 261854080
drwx------ 2 postgres postgres      69632 Apr 22 17:45 .
drwx------ 3 postgres postgres       4096 Oct 31  2024 ..
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.1
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.10
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.100
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.101
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.102
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.103
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.104
...
...
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.97
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.98
-rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.99

SELECT pg_tablespace_databases((SELECT oid FROM pg_tablespace WHERE spcname = 'ts_idx_wb2'));
 pg_tablespace_databases
-------------------------
 16419

SELECT oid, datname FROM pg_database WHERE oid = 16419;
  oid  |  datname
-------+------------
 16419 | mydatabase   <--

Connected to database mydatabase:

select cl.relfilenode, nsp.nspname as schema_name, cl.relname, cl.relkind
from pg_class cl
join pg_namespace nsp on cl.relnamespace = nsp.oid and cl.relfilenode = 51188;
 relfilenode | schema_name | relname | relkind
-------------+-------------+---------+---------
(0 rows)

To be sure, I ran the same query on the other databases, with the same result
(0 rows).

So there are no actual references to these files, yet the relationship
persists sufficiently for the DROP TABLESPACE to error out. My guess is that
the files belong to an index that was being moved to the ts_idx_wb2 tablespace
on Oct 31, and during the move the postgres instance crashed due to space
shortage on the WAL filesystem; that crash left these "zombie" files on the
filesystem.

In order to drop the tablespace, is it safe to shut that postmaster down and
manually remove the contents of this directory?

Thanks
Gabriele Fagnani
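In case it helps anyone hitting the same thing, a small sketch of how the
per-database check could be automated before removing anything by hand (the
loop and the use of pg_relation_filenode() are additions of mine; the latter
also covers mapped relations whose pg_class.relfilenode is stored as 0):

    for db in $(psql -XAtc "SELECT datname FROM pg_database WHERE datallowconn"); do
        echo "== $db"
        psql -X -d "$db" -Atc "SELECT oid::regclass FROM pg_class
                               WHERE relfilenode = 51188
                                  OR pg_relation_filenode(oid) = 51188"
    done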
Re: Error in DROP TABLESPACE
On Wed, 2025-05-07 at 08:16 +0000, Fagnani Gabriele G (GDS I&TS) wrote:
> I've run into an odd problem: I have what seems to be a "zombie" tablespace
> that PostgreSQL won't let me drop, even though nothing inside it is active.
>
> drop tablespace ts_idx_wb2;
> ERROR: tablespace "ts_idx_wb2" is not empty
>
> postgres=# \db+ ts_idx_wb2
>                                 List of tablespaces
>     Name    |  Owner   |              Location              |  Access privileges   | Options |  Size  | Description
> ------------+----------+------------------------------------+----------------------+---------+--------+-------------
>  ts_idx_wb2 | postgres | /pgsql/postgres/tsdata1/ts_idx_wb2 | postgres=C/postgres +|         | 250 GB |
>             |          |                                    | mydatabase=C/postgres|         |        |
> (1 row)
>
> Check on the filesystem location. On the filesystem there are files
> belonging only to relfilenode 51188:
>
> postgres@myserver:/pgsql/postgres/tsdata1/ts_idx_wb2/PG_12_201909212/16419> ls -la
> total 261854080
> drwx------ 2 postgres postgres      69632 Apr 22 17:45 .
> drwx------ 3 postgres postgres       4096 Oct 31  2024 ..
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.1
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.10
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.100
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.101
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.102
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.103
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.104
> ...
> ...
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.97
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.98
> -rw------- 1 postgres postgres 1073741824 Oct 31  2024 51188.99
>
> SELECT pg_tablespace_databases((SELECT oid FROM pg_tablespace WHERE spcname = 'ts_idx_wb2'));
>  pg_tablespace_databases
> -------------------------
>  16419
>
> SELECT oid, datname FROM pg_database WHERE oid = 16419;
>   oid  |  datname
> -------+------------
>  16419 | mydatabase   <--
>
> Connected to database mydatabase:
>
> select cl.relfilenode, nsp.nspname as schema_name, cl.relname, cl.relkind
> from pg_class cl
> join pg_namespace nsp on cl.relnamespace = nsp.oid and cl.relfilenode = 51188;
>  relfilenode | schema_name | relname | relkind
> -------------+-------------+---------+---------
> (0 rows)
>
> To be sure, I ran the same query on the other databases, with the same
> result (0 rows).
>
> So there are no actual references to these files, yet the relationship
> persists sufficiently for the DROP TABLESPACE to error out.
> My guess is that the files belong to an index that was being moved to the
> ts_idx_wb2 tablespace on Oct 31, and during the move the postgres instance
> crashed due to space shortage on the WAL filesystem; that crash left these
> "zombie" files on the filesystem.
>
> In order to drop the tablespace, is it safe to shut that postmaster down and
> manually remove the contents of this directory?

Your diagnosis seems to be right, and I agree with your proposed remedy.

Yours,
Laurenz Albe