-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi!
Running postgresql-9.2.2 on FreeBSD 9.1 using vanilla ufs file system. I have the postgresql base/ on the /usr disk, and a separate volume /opt where the default tablespace resides. I found many databases that had not used the default tablespace, an ran many hundreds of ALTER TABLE foobar SET TABLESPACE foo; ALTER INDEX foobar_idx SET TABLESPACE foo; a while back to free up disk space on /usr. I got lots of space freed up, but it seems that after that the disk usage grows linearly (it seems to leave many inodes unreferenced). The the discrepancy between df and du is now huge: # du -sxh /usr; df -h /usr 4,6G /usr Filesystem Size Used Avail Capacity Mounted on /dev/da0s1f 104G 88G 8.0G 92% /usr 4,6G vs 88GB, that must be more than a rounding error? Strange thing is I cannot find any open files. # lsof /usr| awk '{print $9}'|xargs ls -l > /dev/null returns no errors (a missing file would render an error with ls). If there where open files not referenced in any directory, they should be found. Next thing is fsck, and yes, there are plenty of unreferenced files. I ran fsck while system is running (i.e. read only) to get a grip oif the amount of lost inodes: fsck /usr | awk '{print $1}'|cut -f 2 -d=| perl -e '$i = 0; while (<>) { $i += $_;}; print $i / 1024 / 1024; print "\n";' 85223.3530330658 ~85 GB gone, that's 80% of the disk, and it accounts fo all the missing space. MTIME for the inodes are pretty evenly spread over time since the machine was updated to FreeBSD 9.1, rebooted, and PostgreSQL was updated to 9.2. All was done at the same time, so I can't really tell who's to blaim, but this is the only server out of a dozen where we did all these ALTER table/index SET TABLEPSPACE... and it is the only server, out of a dozen that where updated to exactly the same versions, that has this problem. All other servers have their base/ disk as horisontal the horizone (since all data resides on a separate tablespace). All servers where updated from 8.4 or 9.0 using pg_update. Only this server has problems. This is the only server with a multitude (~30) active databases, and the only one where we ran the SET TABLESPACE stuff. The unreferenced inodes are almost exclusively around 16 MB in size, so i.e. they would most probably all be pg_xlog files. config in postgresql.conf is checkpoint_segments = 40 Could there be anything fishy in the code that moves the indexes or tables between different tablespaces? How could it possibly be able to leave unreferenced inodes around like this? Is the culprit a combination of postgresql and file system code? checkpoints seems to happen approximately every three minutes: Mar 13 00:39:08 dbserver postgres[5298]: [48-1] db=,user= LOG: checkpoint starting: time Mar 13 00:41:38 dbserver postgres[5298]: [49-1] db=,user= LOG: checkpoint complete: wrote 2542 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=149.667 s, sync=0.101 s, total=149.770 s; sync files=628, longest=0.021 s, average=0.000 s Mar 13 00:44:08 dbserver postgres[5298]: [50-1] db=,user= LOG: checkpoint starting: time Mar 13 00:46:38 dbserver postgres[5298]: [51-1] db=,user= LOG: checkpoint complete: wrote 3996 buffers (0.4%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=149.438 s, sync=0.111 s, total=149.551 s; sync files=823, longest=0.006 s, average=0.000 s Mar 13 00:49:08 dbserver postgres[5298]: [52-1] db=,user= LOG: checkpoint starting: time Mar 13 00:51:38 dbserver postgres[5298]: [53-1] db=,user= LOG: checkpoint complete: wrote 13736 buffers (1.4%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=149.958 s, sync=0.311 s, total=150.271 s; sync files=1335, longest=0.079 s, average=0.000 s Mar 13 00:54:08 dbserver postgres[5298]: [54-1] db=,user= LOG: checkpoint starting: time Mar 13 00:56:38 dbserver postgres[5298]: [55-1] db=,user= LOG: checkpoint complete: wrote 14638 buffers (1.5%); 0 transaction log file(s) added, 0 removed, 17 recycled; write=149.330 s, sync=0.271 s, total=149.603 s; sync files=1363, longest=0.017 s, average=0.000 s Mar 13 00:59:08 dbserver postgres[5298]: [56-1] db=,user= LOG: checkpoint starting: time Mar 13 01:01:38 dbserver postgres[5298]: [57-1] db=,user= LOG: checkpoint complete: wrote 8035 buffers (0.8%); 0 transaction log file(s) added, 0 removed, 21 recycled; write=149.285 s, sync=0.146 s, total=149.433 s; sync files=1160, longest=0.003 s, average=0.000 s Mar 13 01:04:08 dbserver postgres[5298]: [58-1] db=,user= LOG: checkpoint starting: time Mar 13 01:06:37 dbserver postgres[5298]: [59-1] db=,user= LOG: checkpoint complete: wrote 2156 buffers (0.2%); 0 transaction log file(s) added, 0 removed, 9 recycled; write=149.402 s, sync=0.057 s, total=149.461 s; sync files=610, longest=0.000 s, average=0.000 s Mar 13 01:09:08 dbserver postgres[5298]: [60-1] db=,user= LOG: checkpoint starting: time I'm pretty certain that unmounting the file system and running fsck will regain the lost space, but will it stop there? Has it got to do with the ALTER table/index SET TABLESPACE, or is that a smoke screen, and it is really something else? Stopping postgresql briefly did not help, I tried that. The server has about 30 databases and ~127 concurrent connections (not all beeing active simultaneously, though), so it is fair to say it is pretty active, but nothing extreme. Hardware is HP DL360, using their HT Smart Array P410i. Any ideas how to debug this? Or shall I just reboot, fsck, hope the problem will go away, and when it does, forget about it? Thanks, Palle -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.17 (Darwin) Comment: GPGTools - http://gpgtools.org Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJRP8o5AAoJEIhV+7FrxBJDTl0H/2JWb3lFuaL+QYDe9+Le8zhu 7Q1KEYeY91Noq88rLa+eM3ENPibrsJYGYWYaQb9U2PSFXvud5/+czWMZrcVx+3H4 +tFME+FO+f9kasNrGA++LmRwlXlSWYZj9oZXumdIbmqX0Nvne0nvMc34Qyg8yqB5 l3pMFRdMEO4kM9n/pJkQf/7nuWZDEvmNTlkL1wARjGyTa8GYQa1a4pBDq/ovXuE8 +6NF64X8Mbm55geEHIybop72BTcfQmGMpYaeInjH2V2zbaqC6HV3eqfTv+CGT8op tAxRYKa/9brYj9IIcaLn+zQvBTxbklZVON6BrlKI6+z144nXOdH+Ny4Zs4upP+Y= =J6T7 -----END PGP SIGNATURE----- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers