čt 13. 12. 2018 v 10:23 odesílatel Simon Riggs <si...@2ndquadrant.com> napsal:
> Currently, tables provide MVCC access semantics as the only option. > > A more complete list of desirable semantics in applications are > > * MVCC - provide latest consistent view > * Historical - keep all old row versions so that queries can access data > as it used to be > * TTL=Duration - keep committed rows around for at most Duration seconds > * Latest - show only the most current committed row version, at the cost > of inconsistency > There might be others > > Since I see these options as semantics rather than physical, I think we > should separate these operations away from Table Access Methods. This > allows those semantics to be implemented in different ways for different > storage types. > > "Historical" access has been discussed many times, so no need to revisit > here. Clearly, it is a very poopular idea, just not easily feasible with > the current heap access layer. We might want an option for row_visibility > retention. For tables with this option set, we would in later releases > allow historical queries according to the SQL Standard. > > "TTL" or "Time To Live" - time-limited access to data is available in > many other databases. It is simple to implement and we could easily have > this in PG12. Implementation is 1) adding the option, 2) adding a > time-component into the visibility check for scan and vacuum. This option > implies an option exists to specify row_visibility retention. > > "Latest" is similar to the way EvalPlanQual works, allowing UPDATE to see > the latest version of a row before update, and similar to the way catalog > scans work in that any access to a catalog entry sees the latest row based > upon an immediate snapshot, not the one taken at the start of a query. It > makes sense to allow this as an explicit table-level option, so any data > viewed can see the latest version, just as UPDATEs already do. This also > means that juggling bloat and old row versions becomes much less of an > issue for very high volume update applications such as booking systems or > stock tickers. (Clearly, better table access methods would improve on this > further and even they would benefit from this simplification of the main > issue around MVCC). > Logic for this type of visibility already exists in PG > via HeapTupleSatisfiesSelf(), so we are just exposing what is already there > to the user; no need to discuss semantics at length. > Again, patches to implement this option are simple and viable for PG12. > > User interface are 2 new table-level options > * row_visibility = MVCC (default), TTL, Latest, Historical > * row_visibility_retention_interval = 'system' (default) > For MVCC, the only valid setting would be system, i.e. current MVCC > behavior (but this might be altered by specific storage plugin parameters) > For Latest, the only valid setting would be system > For TTL, the interval to retain data for, setting of 0 is not valid > For Historical, the interval to retain old row versions for, 0 means > forever > > Implementation summary > 1. Add new table-level option for row_visibility and > row_visibility_retention_interval > 2. Add option to heap_beginscan > 3. Add option handling in heap prune > 4. Add option to tqual > > Thoughts? > looks great Pavel > -- > Simon Riggs http://www.2ndQuadrant.com/ > <http://www.2ndquadrant.com/> > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services >