On 2017-12-19 15:35:12 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
>
> > I think the bugfix is going to have to essentially be something similar
> > to FreezeMultiXactId(). I.e. when reusing an old tuple's xmax for a new
> > tuple version, we need to prune dead multixact members. I think we can
> > do so unconditionally and rely on multixact id caching layer to avoid
> > unnecessarily creating multis when all members are the same.
>
> Actually, isn't the cache subject to the very same problem? If you use
> a value from the cache, it could very well be below whatever the cutoff
> multi was chosen in the other process ...

That's a potential issue somewhat independent of this bug though (IIRC I
also mentioned it in the other thread). I hit that problem a bunch in
manual testing, but I didn't manage to create an actual testcase for it
without pre-existing corruption.

I don't think this specific instance would be more vulnerable than
FreezeMultiXactId() itself - we'd just use alive multis and pass them to
MultiXactIdCreateFromMembers(), which is exactly what
FreezeMultiXactId() does.

I think the cache issue ends up not quite being a live bug, because
every transaction that's a member of a multixact has also done
MultiXactIdSetOldestMember(). That in turn probably, though I'm not
sure, prevents the existence in the cache of multis with only alive
members that are below the multi horizon. That relies on the fact that
we only create multis with alive members though, which seems fragile.
It'd be good if we added some assurances to
MultiXactIdCreateFromMembers() that it actually can't happen.

Hope I didn't miss a live version of the above problem?

Greetings,

Andres Freund
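
PS: To illustrate the two ideas above (prune dead members before reusing
an xmax, and have the creation path check that it only ever sees alive
members), here is a rough standalone sketch. It is a toy model, not
PostgreSQL code: ToyXid, ToyMember, toy_prune_dead_members() and
toy_all_members_alive() are made-up names, and the "alive" flag is
assumed to have been computed by the caller already. How liveness is
actually determined is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only, NOT the PostgreSQL definitions. */
typedef uint32_t ToyXid;

typedef struct ToyMember
{
    ToyXid      xid;
    bool        alive;      /* assumption: liveness already known to caller */
} ToyMember;

/*
 * Compact members[] in place so that only alive members remain, keeping
 * their relative order.  Returns the number of members kept.
 */
size_t
toy_prune_dead_members(ToyMember *members, size_t nmembers)
{
    size_t      nkept = 0;

    for (size_t i = 0; i < nmembers; i++)
    {
        if (members[i].alive)
            members[nkept++] = members[i];
    }
    return nkept;
}

/*
 * Hypothetical guard of the kind a create-from-members routine could
 * assert on: true only if every member passed in is alive.  An empty
 * member list trivially passes.
 */
bool
toy_all_members_alive(const ToyMember *members, size_t nmembers)
{
    for (size_t i = 0; i < nmembers; i++)
    {
        if (!members[i].alive)
            return false;
    }
    return true;
}
```

A caller would prune first and then assert toy_all_members_alive() on
the result before handing the members on, so that a violation of the
"only alive members" assumption fails loudly instead of silently
producing a multi below the horizon.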