The following bug has been logged online: Bug reference: 4801 Logged by: Peter Much Email address: p...@citylink.dinoex.sub.org PostgreSQL version: 8.2.13 Operating system: FreeBSD 7.2 Description: Performance failure: 600 MB written to each WAL log Details:
Server becomes very slow while tremendous amounts of data are written to WAL logs - disk activity log shows 600 MB of effective write requests before each 16 MB WAL log is archived and recycled. Problem appears occasionally all of a sudden with no apparent reason. I perveived it with 8.2.5(?), 8.2.11 and now 8.2.13, with FreeBSD Rel. 6.3 and 7.2. I perceived it only during "vaccum full" yet, but then a long-running "vacuum full" on a large table is the only significant activity on the server. (I know that "vacuum full" may not be really necessary, but I think it should practically function.) When starting the "vacuum full", the activity looks what I think normal: there is heavy activity on the database table files, accompanied by a certain amount of accesses to the WAL logs. The amount of disk writes to WAL then figures (more or less) to the WAL switching frequency. And it may stay so and the vacuum may complete successful. But sometimes (same "vacuum full", same table) the server suddenly changes into a very slow state of proceeding: then there is only few activity on the database table disk, while the WAL disk is busy writing at 100%. But the WAL logs are filled very slow, too. I can watch the proceeding of the vacuum by listing the open file handle offsets on the database table files (with lsof). And the ratio between these offsets proceeding and the number of archived WAL logs is the same in both states. Also, the server functions correctly by all means when in this slow state, and all requests are processed. And there are no error messages. The only difference is that in normal state of operation the amount of data written to one WAL log is just some more than the 16 MB it contains, while in the slow state it is about 600 MB - and this slows down operation. (The vacuum job must be killed then, because it would take days instead of hours to complete.) I have not really figured out the conditions that do bring the server into this slow mode of operation. It seems to happen when a certain amount of other activity (on other databases within the same server) occurs. That is, as long as there is not much other activity on the system, the vacuum will usually complete with normal speed. But after the server has fallen into the slow state, it will not get out of it again. It has to be stopped and restarted to get back to normal speed; killing and restarting the vacuum process will not help. The WAL handling configuration in the config file is as default, only I do WAL archiving. I have tried to increae wal_buffers, this had no effect. I have moved the whole installation from FreeBSD-UFS diskspace to ZFS, and have switched off full_page_writes. This does significantly reduce the absolute amount of produced WALs, and therefore the slow state of operation becomes faster than before, but besides this the behaviour did not change. The command "show wal_sync_method" yields "fsync". Over all, I do not think that the problem relates to WAL storage. It seems to be mainly the question: what is this tremendous amount of -obviousely redundant- data that is written to WAL, and how do I get the server out of this mode again without disrupting operation? One other thing should be remarked here: The server is a very small system. It has not much memory and will do a real lot of paging; the disks are also not fast. I occasionally experience that under such conditions software shows bugs that would never become visible on the usual oversized systems. Configuration of the server is as follows (as changed from defaults): > max_connections = 40 > shared_buffers = 28MB > max_fsm_pages = 179200 > effective_cache_size = 12MB > autovacuum = off (autovacuum is off to reduce noise - disks will spindown when databases are not used.) -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs