Some years ago we did a pass through the various worker processes to add hibernation as a mechanism to reduce power consumption on an idle server. Replication never got the memo, so an idle standby or logical subscriber does not reduce its power consumption very effectively. The code and timing for hibernation is also different for each worker, which is confusing.
The proposal is to improve the situation and reduce power consumption on an idle server, with a series of fairly light changes that give a positive green/environmental benefit.

CURRENT STATE

These processes naturally sleep for long periods when inactive:
* postmaster - 60s
* checkpointer - checkpoint_timeout
* syslogger - log_rotation_age
* pgarch - 60s
* autovac launcher - autovacuum_naptime
* walsender - wal_sender_timeout/2
* contrib/prewarm - checkpoint_timeout or shutdown
* pgstat - except on Windows, see later

The following processes all have some kind of hibernation mode:
* bgwriter - 50 * bgwriter_delay
* walwriter - 25 * wal_writer_delay

These processes don't try to hibernate at all:
* logical worker - 1s
* logical launcher - wal_retrieve_retry_interval (undocumented)
* startup - hardcoded 5s when streaming, wal_retrieve_retry_interval for WAL files
* wal_receiver - 100ms, currently gets woken when WAL arrives
* pgstat - 2s (on Windows only)

PROPOSED CHANGES

1. Standardize the hibernation time at 60s, using a
   #define HIBERNATE_DELAY_SEC 60

2. Refactor postmaster and pgarch to use that setting, rather than a hardcoded 60.

3. Standardize the hibernation design pattern through a set of macros in latch.h, based on the coding of walwriter.c, since that was the best example of hibernation code. Hibernation is independent for each process. This is explained in the header for latch.h, with an example in src/test/modules/worker_spi/worker_spi.c
   The intention here is to provide simple facilities so that authors of bg workers can add hibernation code as well.
   Summary: after 50 idle loops, hibernate for 60s each time through the loop. In all cases, switch immediately back into action when needed.

4. Change these processes to hibernate using the standard design pattern:
   * bgwriter - currently gets woken when a user allocates a buffer, no change proposed
   * walwriter - currently gets woken by XLogFlush, no change proposed

5.
Startup process has a hardcoded 5s loop because it checks for a trigger file to promote it. Hibernating would mean that it promotes more slowly, and/or restarts a failing walreceiver more slowly, so this requires user approval; hence the patch adds new GUCs to approve that choice. This is a valid choice because a long-term idle server is obviously not in current use, so waiting 60s for failover or restart is very unlikely to cause a significant issue. Startup process is woken by WALReceiver when WAL arrives, so if all is well, Startup will swing back into action quickly when needed.

If standby_can_hibernate = true, then these processes can hibernate:
* startup
* walreceiver - hibernate, but only for wal_receiver_timeout/2, not 60s, so other existing behavior is preserved.

If subscription_can_hibernate = true, then these processes can hibernate:
* logical launcher
* logical worker - hibernate, but only for wal_receiver_timeout/2, not 60s, so other existing behavior is preserved.

Patch to do all of the above attached.

Comments please.

-- 
Simon Riggs                http://www.EnterpriseDB.com/
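
For readers who want the pattern from item 3 in concrete form, here is a minimal standalone sketch of "after 50 idle loops, hibernate for 60s, and wake immediately when there is work". It is not the code from the patch and does not use the latch.h macros. Apart from HIBERNATE_DELAY_SEC, every name here (LOOPS_UNTIL_HIBERNATE, HibernateState, hibernate_note_loop, hibernate_sleep_ms) is hypothetical and chosen only for illustration. The optional cap models the walreceiver and logical worker case, where the sleep is limited to wal_receiver_timeout/2 rather than the full 60s.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* HIBERNATE_DELAY_SEC comes from the proposal; all other names are hypothetical. */
#define HIBERNATE_DELAY_SEC   60    /* proposed standard hibernation time */
#define LOOPS_UNTIL_HIBERNATE 50    /* idle loops before a process hibernates */

/* Per-process state: hibernation is independent for each process. */
typedef struct HibernateState
{
    int idle_loops;             /* consecutive loops that found no work */
} HibernateState;

/*
 * Record the outcome of one pass through the main loop.  Any work resets
 * the counter, so the process drops straight back to its normal delay.
 */
void
hibernate_note_loop(HibernateState *state, bool did_work)
{
    assert(state != NULL);

    if (did_work)
        state->idle_loops = 0;
    else if (state->idle_loops < LOOPS_UNTIL_HIBERNATE)
        state->idle_loops++;
}

/*
 * Return the timeout, in milliseconds, for the next wait.
 *
 * normal_delay_ms  - the process's usual nap time while active
 * max_hibernate_ms - upper bound on the hibernation sleep, or 0 for none
 *                    (e.g. wal_receiver_timeout/2 for walreceiver)
 */
long
hibernate_sleep_ms(const HibernateState *state,
                   long normal_delay_ms,
                   long max_hibernate_ms)
{
    long hibernate_ms = HIBERNATE_DELAY_SEC * 1000L;

    assert(state != NULL);

    /* Not idle for long enough yet: keep the normal cadence. */
    if (state->idle_loops < LOOPS_UNTIL_HIBERNATE)
        return normal_delay_ms;

    /* Hibernating, but some processes must not sleep the full 60s. */
    if (max_hibernate_ms > 0 && max_hibernate_ms < hibernate_ms)
        return max_hibernate_ms;

    return hibernate_ms;
}
```

In a real worker the returned value would be passed as the timeout of the latch wait, and setting the latch would end the sleep early, which is what gives the "switch immediately back into action" behavior. Whether the GUCs standby_can_hibernate and subscription_can_hibernate are checked before or inside such a helper is a design detail of the patch that this sketch does not try to reproduce.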
hibernate_to_reduce_power_consumption.v1.patch