On Wed, Jun 29, 2011 at 1:50 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> As implemented, the feature will work with either streaming
>> replication or with file-based replication.
>
> That sounds like the exact opposite of yours and Fujii's comments
> above. Please explain.

I think our comments above were addressing the issue of whether it's
feasible to correct for time skew between the master and the slave.
Tom was arguing that we should try, but I was arguing that any system
we put together is likely to be pretty unreliable (since good time
synchronization algorithms are quite complex, and to my knowledge no
one here is an expert on implementing them, nor do I think we want
that much complexity in the backend) and Fujii was pointing out that
it won't work at all if the WAL files are going through the archive
rather than through streaming replication, which (if I understand you
correctly) will be a more common case than I had assumed.

>> I don't see any value in
>> restricting to work ONLY with file-based replication.
>
> As explained above, it won't work in practice because of the amount of
> file space required.

I guess it depends on how busy your system is and how much disk space
you have. If using streaming replication causes pg_xlog to fill up on
your standby, then you can either (1) put pg_xlog on a larger file
system or (2) configure only restore_command and not primary_conninfo,
so that only the archive is used.

> Or, an alternative question: what will you do when it waits so long
> that the standby runs out of disk space?

I don't really see how that's any different from what happens now. If
(for whatever reason) the master is generating WAL faster than a
streaming standby can replay it, then the excess WAL is going to pile
up someplace, and you might run out of disk space. Time-delaying the
standby creates an additional way for that to happen, but I don't
think it's an entirely new problem.

I am not sure exactly how walreceiver handles it if the disk is full.
I assume it craps out and eventually retries, so probably what will
happen is that, after the standby's pg_xlog directory fills up,
walreceiver will sit there and error out until replay advances enough
to remove a WAL file and thus permit some more data to be streamed.
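
To make option (2) above a bit more concrete, an archive-only standby
would have a recovery.conf roughly like the sketch below. The archive
path is a made-up placeholder, and the restore_command has to match
however your archive is actually populated, so treat this as an
illustration rather than something to copy verbatim.

```conf
# recovery.conf on the standby (sketch; the archive path is hypothetical)
standby_mode = 'on'

# Fetch WAL segments from the archive only.
restore_command = 'cp /path/to/archive/%f %p'

# primary_conninfo is deliberately left unset, so walreceiver is never
# started and no streamed WAL accumulates in the standby's pg_xlog.
```

With this setup the standby only pulls a segment from the archive when
it is ready for it, so the backlog sits in the archive rather than on
the standby.
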
If the standby gets far enough behind the master that the required
files are no longer there, then it will switch to the archive, if
available.

It might be nice to have a mode that only allows streaming replication
when the amount of disk space on the standby is greater than or equal
to some threshold, but that seems like a topic for another patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers