On Thursday, June 21, 2012 01:41:25 PM Andres Freund wrote:
> Below are two possible implementation strategies for that concept
>
> Advantages:
> * Decoding is done on the master in an asynchronous fashion
> * low overhead during normal DML execution, not much additional code in
>   that path
> * can be very efficient if architecture/version are the same
> * version/architecture compatibility can be done transparently by falling
>   back to textual versions on mismatch
>
> Disadvantages:
> * decoding probably has to happen on the master which might not be what
>   people want performancewise
> 3b)
> Ensure that enough information in the catalog remains by fudging the xmin
> horizon. Then reassemble an appropriate snapshot to read the catalog as
> the tuple in question has seen it.
>
> Advantages:
> * should be implementable with low impact to general code
>
> Disadvantages:
> * requires some complex code for assembling snapshots
> * it might be hard to guarantee that we always have enough information to
>   reassemble a snapshot (subxid overflows ...)
> * impacts vacuum if replication to some site is slow

There are some interesting problems related to locking and snapshots here. Not
sure if they are resolvable:

We need to restrict SnapshotNow to represent the view it had back when the WAL
record we're currently decoding was created. Otherwise we would possibly get
wrong column types and similar. As we're working in the past, locking doesn't
protect us against much here. I have that implemented (mostly, and
inefficiently).

One interesting problem is table rewrites (TRUNCATE, CLUSTER, some ALTER
TABLEs) and dropping tables. Because we nudge SnapshotNow to the past view it
had back when the WAL record was created, we get the old relfilenode, which
might have been dropped as part of the transaction cleanup...

With most types that's not a problem. Even things like records and arrays
aren't problematic. More interesting cases include VACUUM FULL $systable (e.g.
pg_enum) and VACUUM FULL on a table which is used in the *_out function of a
type (like a user-level pg_enum implementation).

The only theoretical way I see around that problem would be to postpone all
relation unlinks until everything that could possibly read them has finished.
That doesn't seem too alluring, although it would be needed if we ever move
more things off SnapshotNow.

Input/Ideas/Opinions?
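
To make the postponed-unlink idea a bit more concrete, here is a toy model in
Python. It is only a sketch under stated assumptions, not PostgreSQL code: the
class DeferredUnlinkQueue, its method names, and the use of plain integers as
WAL positions are all made up for illustration. A dropped relfilenode stays
"on disk" until every registered reader has moved past the position at which
it was dropped.

```python
from dataclasses import dataclass, field


@dataclass
class DeferredUnlinkQueue:
    """Toy model: keep dropped relfilenodes until no reader can need them.

    Hypothetical illustration only. WAL positions are plain integers, and a
    reader at position p may still decode records at positions >= p.
    """

    # reader name -> oldest position that reader may still decode
    reader_positions: dict = field(default_factory=dict)
    # (drop_position, relfilenode) pairs waiting for physical removal
    pending: list = field(default_factory=list)
    # relfilenodes that have actually been removed, in removal order
    unlinked: list = field(default_factory=list)

    def register_reader(self, name, position):
        self.reader_positions[name] = position

    def advance_reader(self, name, position):
        self.reader_positions[name] = position
        self.process()

    def schedule_unlink(self, relfilenode, drop_position):
        # Called where the real system would unlink the file right away.
        self.pending.append((drop_position, relfilenode))
        self.process()

    def process(self):
        # The oldest reader position is the horizon; None means no readers.
        horizon = min(self.reader_positions.values(), default=None)
        still_pending = []
        for drop_position, relfilenode in self.pending:
            if horizon is None or horizon > drop_position:
                # Every reader is past the drop, so removal is safe.
                self.unlinked.append(relfilenode)
            else:
                still_pending.append((drop_position, relfilenode))
        self.pending = still_pending


queue = DeferredUnlinkQueue()
queue.register_reader("decoder", 100)
queue.schedule_unlink(16384, drop_position=150)  # kept: decoder is at 100
queue.advance_reader("decoder", 151)             # now 16384 can go
```

The cost is the same one mentioned above for vacuum: a slow reader holds back
the horizon, so dropped files pile up in the pending list until it catches up.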
Greetings,

Andres

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services