On Sat, 11 Sep 2004, Tiago Wright wrote:
IMHO, it is worth duplicating the mvcc data to all index entries. To summarize what I understand from this discussion, with the current method:
a1 - Index seeks must return invisible tuples because mvcc data is not found in the index. These tuples are eliminated once the data is read from the actual data pages.
a2 - Covered queries are not possible since the data page must be visited to determine visibility
a4 - Indexes must be fully vacuumed before vacuuming the corresponding heap entries
If mvcc data is replicated to the index entries:
b1 - Index seeks will never return invisible tuples, possibly eliminating some page reads
b2 - Covered queries are possible
b3 - Inserts are not affected performancewise. Deletes must now visit every index entry, which is a larger cost. Updates must visit every index entry too. It may be possible to reduce the cost of update if the indexed data is not affected, since the new index entry will likely end up in the same page as the index entry that must be deleted, so no extra page reads would be necessary in this scenario.
b4 - Heap and index pages can be vacuumed independently.
Since the great majority of performance issues are related to select queries, the benefit of eliminating invisible tuple page fetches and supporting covered queries probably outweight the extra cost of updating index entries. And once covered queries are supported, it would be possible to build indexed views or multi-table indexes that can address some of the most performance demanding queries out there.
I am wondering whether it would be possible to measure the costs of a1 and a2 above and compare with the probable costs for b3. It seems to me that applications for which b3 are most expensive are also those for which a1 would be most expensive, and since selects are much more common than updates, could one offset the other in the long run? Can anyone shed some light on these?
If it seems that there are some cases where it's better to have the visibility information in the index and some cases where not, I think we could support both kinds of indexes and let the DBA choose.
- Heikki
---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster