Zeugswetter Andreas ADI SD wrote:
A few assumptions:
no back pointers
indexes only point at slots marked as roots (and at non-HOT tuples)
During vacuum, you swap the tuples and keep a stub at the slot that the
user's ctid might be pointing at. You mark the stub to detect this
situation.
When a select/update by ctid comes along, it needs to do one step to the
root and use that tuple instead.
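To make the swap-and-stub idea concrete, here is a minimal sketch in C. It
is not PostgreSQL code: the Slot and Page structs, the SlotKind values, and
the functions swap_into_root() and fetch_by_slot() are all hypothetical
names invented for illustration, and the tuple payload is reduced to a
single int. It only shows that a fetch through a stale ctid needs one
extra step to reach the root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLOTS_PER_PAGE 8

/* Hypothetical slot states; not PostgreSQL's actual line-pointer flags. */
typedef enum { SLOT_UNUSED, SLOT_TUPLE, SLOT_STUB } SlotKind;

typedef struct {
    SlotKind kind;
    bool     is_root;   /* indexes may only point at root slots */
    int      redirect;  /* for SLOT_STUB: slot number of the root */
    int      value;     /* stand-in for the tuple payload */
} Slot;

typedef struct {
    Slot slots[SLOTS_PER_PAGE];
} Page;

/*
 * Vacuum step: move the live tuple from slot `live` into the root slot
 * and leave a marked stub behind at `live`, pointing at the root.
 */
static void swap_into_root(Page *page, int root, int live)
{
    assert(root >= 0 && root < SLOTS_PER_PAGE);
    assert(live >= 0 && live < SLOTS_PER_PAGE);

    page->slots[root].value = page->slots[live].value;
    page->slots[root].kind = SLOT_TUPLE;
    page->slots[root].is_root = true;

    page->slots[live].kind = SLOT_STUB;
    page->slots[live].is_root = false;
    page->slots[live].redirect = root;
}

/*
 * Fetch by ctid: returns the slot that actually holds the tuple,
 * taking one step to the root if `slot` is a stub, or -1 if unused.
 */
static int fetch_by_slot(const Page *page, int slot)
{
    assert(slot >= 0 && slot < SLOTS_PER_PAGE);

    const Slot *s = &page->slots[slot];
    if (s->kind == SLOT_STUB)
        return s->redirect;
    if (s->kind == SLOT_TUPLE)
        return slot;
    return -1;
}
```

Since there are no back pointers, the stub only ever points forward to the
root, so the lookup never has to follow more than one redirect.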
As Pavan pointed out, that's more or less what he ended up doing
originally. You need to mark the stub with the current most recent xid,
and wait until that's no longer running. Only after that you can remove
the stub.
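A rough sketch of that removal rule, in C. The names here (StubSlot,
mark_stub, stub_removable, xid_precedes) are made up for illustration and
are not existing backend functions. As a simplifying assumption, the check
compares the stub's xid against the oldest xid still running, which is a
conservative way of saying "that xid is no longer running".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;

/* Hypothetical stub bookkeeping; not an actual PostgreSQL structure. */
typedef struct {
    bool is_stub;
    Xid  stub_xid;  /* most recent xid at the time the stub was created */
} StubSlot;

/* Wraparound-aware "a is older than b", using modular arithmetic. */
static bool xid_precedes(Xid a, Xid b)
{
    return (int32_t)(a - b) < 0;
}

/* Vacuum leaves a stub behind and stamps it with the latest xid. */
static void mark_stub(StubSlot *slot, Xid latest_xid)
{
    assert(slot != NULL);
    slot->is_stub = true;
    slot->stub_xid = latest_xid;
}

/*
 * The stub can go away only once no transaction that might still hold
 * the old ctid is running, i.e. stub_xid is older than the oldest
 * running xid (assumed to be supplied by the caller).
 */
static bool stub_removable(const StubSlot *slot, Xid oldest_running_xid)
{
    assert(slot != NULL);
    return slot->is_stub && xid_precedes(slot->stub_xid, oldest_running_xid);
}
```

With a stub stamped at xid 100, the check stays false while the oldest
running xid is 100 or lower, and becomes true once it has advanced to 101.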
It needs a second vacuum (or a per-page vacuum during update) to remove
the extra stub when it is dead and not recently dead.
Requiring two vacuums to remove the tuple sounds bad at first, but it's
actually not so bad, since both steps could be done by retail vacuum, or
even by normal scans.
I fail to see the hole.
The only potential problem I can see is how to make sure that a heap
scan or a bitmap heap scan doesn't visit the tuple twice. If we make
sure that the page is scanned in one go while keeping the buffer pinned,
we're good. We already do that except for system catalogs, so I believe
we'd have to forbid hot updates on system tables, like we forbid bitmap
scans.
To me this sounds like the best idea so far.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com