Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable

Amit Kapila Sun, 23 Aug 2020 23:14:42 -0700

On Sat, Aug 22, 2020 at 8:33 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Amit Kapila <amit.kapil...@gmail.com> writes:
> > On Thu, Aug 20, 2020 at 10:28 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> 2. On the other hand, the code is *releasing* the
> >> ReplicationSlotControlLock before it calls
> >> ProcArraySetReplicationSlotXmin, and that seems like a flat out
> >> concurrency bug.
>
> > It is not clear to me how those values can go backward.
>
> After releasing ReplicationSlotControlLock, that code is holding no
> lock at all (in the already_locked=false case I'm concerned about).
> Thus the scenario to consider is:
>
> 1. Walsender A runs ReplicationSlotsComputeRequiredXmin, computes
> some perfectly-valid xmins, releases ReplicationSlotControlLock,
> amd then gets swapped out to Timbuktu.
>
> 2. Time passes and the "true values" of those xmins advance thanks
> to other walsender activity.
>
> 3. Walsender B runs ReplicationSlotsComputeRequiredXmin, computes
> some perfectly-valid xmins, and successfully stores them in the
> procarray.
>
> 4. Walsender A returns from never-never land, and stores its now
> quite stale results in the procarray, causing the globally visible
> xmins to go backwards from where they were after step 3.
>
> I see no mechanism in the code that prevents this scenario.
> On reflection I'm not even very sure that the code change
> I'm suggesting would prevent it.  It'd prevent walsenders
> from entering or exiting until we've updated the procarray,
> but there's nothing to stop the furthest-back walsender from
> advancing its values.
>


I think we can prevent that if we allow
ProcArraySetReplicationSlotXmin to update the shared values only when
new xmin values follows the shared values. I am not very sure if it is
safe but I am not able to think of a problem with it. The other way
could be to always acquire ProcArrayLock before calling
ReplicationSlotsComputeRequiredXmin or before acquiring
ReplicationSlotControlLock but that seems too restrictive.

-- 
With Regards,
Amit Kapila.

Re: ReplicationSlotsComputeRequiredXmin seems pretty questionable

Reply via email to