On Tue, 2009-02-03 at 14:28 +0000, Simon Riggs wrote: > On Tue, 2009-02-03 at 08:40 -0500, Andrew Dunstan wrote: > > > > Hannu Krosing wrote: > > > Actually we came up with a solution to this - use filesystem level > > > snapshots (like LVM2+XFS or ZFS), and redirect backends with > > > long-running queries to use fs snapshot mounted to a different > > > mountpoint. > > > > > > I don't think Simon has yet put full support for it in code, but it is > > > clearly _the_ solution for those who want to eat the cake and have it > > > too. > > > How does that work if you're using mutiple file systems via tablespaces > > (e.g. indexes in a different TS)? > > It's a great idea and easy to do, but I can't do everything in one go. > > The main reasons not to are multiple file system difficulties and lack > of a mainstream Linux solution, and more simply lack of time and test > resources.
More general, but also lot harder, solution would be going back to roots and implement what original postgres 4.2 and earlier versions were meant to do - namely VACUUM was not meant to just discard older versions , but rather move it to WORM storage (write once read many was all the rage back then :) . If we did that in a way that each relation, at least on warm standby , has its own "archive" fork, possibly in a separate tablespace for cheaper storage, then we could basically apply WAL's as fast we want and just move the old versions to "archive". It will be slower(ish), especially for HOT updates, but may be a good solution for lots of usecases. And the decision to do the archiving on master and WAL-copy to slave, or just do it on slave only could probably be left to user. Reintroducing keeping old tuples "forever" would also allow us to bring back time travel feature, that is SELECT .... AS OF 'yesterday afternoon'::timestamp; Which was thrown out at the times we got WAL-logging. To be really useful we should also have some way to know trx timestamps, but that can be easily done using ticker feature from Slony - SkyTools/pgQ, which could be run a a separate server thread similar to what we do with background writer, autovacuum etc. now. > > So not now, maybe later. > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers