On Fri, 1 Feb 2019 at 10:32, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Sun, Jan 27, 2019 at 8:26 PM David Rowley
> <david.row...@2ndquadrant.com> wrote:
> > One way around this would be to always perform an invalidation on the
> > partition's parent when performing a relcache invalidation on the
> > partition.  We could perhaps invalidate all the way up to the top
> > level partitioned table, that way we could just obtain a lock on the
> > target partitioned table during AcquireExecutorLocks().  I'm currently
> > only setting the delaylock flag to false for leaf partitions.
>
> Would this problem go away if we adopted the proposal discussed in
> http://postgr.es/m/24823.1544220...@sss.pgh.pa.us and, if so, is that
> a good fix?
>
> I don't quite understand why this is happening.  It seems like as long
> as you take at least one new lock, you'll process *every* pending
> invalidation message, and that should trigger replanning as long as
> the dependencies are correct.  But maybe the issue is that you hold
> all the other locks you need already, and the lock on the partition at
> issue is only acquired during execution, at which point it's too late
> to replan.  If so, then I think something along the lines of the above
> might make a lot of sense.
I think perhaps accepting invalidations at the start of the statement might appear to fix the problem in master, but I think there's still a race condition around CheckCachedPlan(), since we'll ignore invalidation messages when we perform LockRelationOid() in AcquireExecutorLocks() due to the lock already being held. So there's likely still a window where an invalidation message could come in and we'd miss it.

That said, if the lock is already held in one session, and some other session alters a relation we have a lock on, then that alteration can't have been done with an AEL, as that would have blocked on our lock. It must have been some ShareUpdateExclusiveLock change, and if so, that change must not be important enough to block concurrently executing queries, so it likely shouldn't actually break anything.

So in summary, I think that checking for invalidation messages at the start of the statement allows us to replan within a transaction, but does not guarantee we get the exact correct plan for the current settings.

I also don't have a full grasp on why we must ignore invalidation messages when we already have a lock on the relation. I assume it's not a huge issue, since obtaining a new lock on any other relation will process the invalidation queue anyway. I know that f868a8143a9 recently made some changes to fix some recursive issues around these invalidations.

(I admit to not having the best grasp on how all this works, so feel free to shoot this down.)

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services