Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Andres Freund
On 2018-09-11 12:50:06 -0400, Tom Lane wrote:
> I am not sure which part of "I will not fix this" you didn't understand.

Maybe the "this is an open list, and we can discuss things" bit?

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Andres Freund writes:
> On 2018-09-11 12:26:44 -0400, Tom Lane wrote:
>> Well, there remains the fact that we've seen no field reports that seem
>> to trace to failure-to-acquire-AEL since 9.6 came out. So arguing that
>> this *could* be a probable scenario fails to comport with the available
>>

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Andres Freund
On 2018-09-11 12:26:44 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2018-09-11 12:18:59 -0400, Tom Lane wrote:
> >> Doesn't matter: startup would hit a lock conflict and cancel the pg_dump
> >> to get out of it, long before approaching locktable full.
>
> > Only if all that's happening i

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Andres Freund writes:
> On 2018-09-11 12:18:59 -0400, Tom Lane wrote:
>> Doesn't matter: startup would hit a lock conflict and cancel the pg_dump
>> to get out of it, long before approaching locktable full.
> Only if all that's happening in the same database, which is far from a
> given.

Well, t

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Andres Freund
On 2018-09-11 12:18:59 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2018-09-11 12:03:44 -0400, Tom Lane wrote:
> >> If the startup process has acquired enough AELs to approach locktable
> >> full, any concurrent pg_dump has probably failed already, because it'd
> >> be trying to share-loc

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Andres Freund writes:
> On 2018-09-11 12:03:44 -0400, Tom Lane wrote:
>> If the startup process has acquired enough AELs to approach locktable
>> full, any concurrent pg_dump has probably failed already, because it'd
>> be trying to share-lock every table and so would have a huge conflict
>> cross

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Andres Freund
Hi,

On 2018-09-11 12:03:44 -0400, Tom Lane wrote:
> Andres Freund writes:
> > Isn't one of the most common ways to run into "out of shared memory"
> > "You might need to increase max_locks_per_transaction." to run pg_dump?
> > And that's pretty commonly done against standbys?
>
> If the startup

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Andres Freund writes:
> On 2018-09-11 16:23:44 +0100, Simon Riggs wrote:
>> It's hard to see how any reasonable workload would affect the standby. And
>> if it did, you'd change the parameter and restart, just like you already
>> have to do if someone changes max_connections on master first.
> Is

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Robert Haas writes:
> On Tue, Sep 11, 2018 at 10:25 AM, Tom Lane wrote:
>> The point of the previous coding here was that perhaps there's some
>> range of number-of-locks-needed where kicking hot-standby queries
>> off of locks would allow recovery to proceed. However, it is (as
>> far as I know

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Andres Freund
Hi,

On 2018-09-11 16:23:44 +0100, Simon Riggs wrote:
> It's hard to see how any reasonable workload would affect the standby. And
> if it did, you'd change the parameter and restart, just like you already
> have to do if someone changes max_connections on master first.

Isn't one of the most commo

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Simon Riggs
On 11 September 2018 at 16:11, Robert Haas wrote:
> On Tue, Sep 11, 2018 at 10:25 AM, Tom Lane wrote:
> > The point of the previous coding here was that perhaps there's some
> > range of number-of-locks-needed where kicking hot-standby queries
> > off of locks would allow recovery to proceed. H

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Robert Haas writes:
> On Tue, Sep 11, 2018 at 5:54 AM, Simon Riggs wrote:
>> Please explain why you think that would be with no restart.
> Because the startup process will die, and if that happens, IIRC,
> there's no crash-and-restart loop. You're just done.

Unless we think that the startup pr

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Robert Haas
On Tue, Sep 11, 2018 at 10:25 AM, Tom Lane wrote:
> The point of the previous coding here was that perhaps there's some
> range of number-of-locks-needed where kicking hot-standby queries
> off of locks would allow recovery to proceed. However, it is (as
> far as I know) unproven that that actual

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Robert Haas
On Tue, Sep 11, 2018 at 5:54 AM, Simon Riggs wrote:
> Please explain why you think that would be with no restart.

Because the startup process will die, and if that happens, IIRC,
there's no crash-and-restart loop. You're just done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The E

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Tom Lane
Simon Riggs writes:
> On 10 September 2018 at 19:16, Robert Haas wrote:
>> On Fri, Sep 7, 2018 at 6:37 PM, Tom Lane wrote:
>>> So my inclination is to remove the reportMemoryError = false parameter,
>>> and just let an error happen in the unlikely situation that we hit OOM
>>> for the lock table

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-11 Thread Simon Riggs
On 10 September 2018 at 19:16, Robert Haas wrote:
> On Fri, Sep 7, 2018 at 6:37 PM, Tom Lane wrote:
> > So my inclination is to remove the reportMemoryError = false parameter,
> > and just let an error happen in the unlikely situation that we hit OOM
> > for the lock table.
>
> Wouldn't that tak

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-10 Thread Robert Haas
On Fri, Sep 7, 2018 at 6:37 PM, Tom Lane wrote:
> So my inclination is to remove the reportMemoryError = false parameter,
> and just let an error happen in the unlikely situation that we hit OOM
> for the lock table.

Wouldn't that take down the entire cluster with no restart?

--
Robert Haas
Ent

Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-08 Thread Simon Riggs
On 8 September 2018 at 00:37, Tom Lane wrote:
> Commit 37c54863c removed the code in StandbyAcquireAccessExclusiveLock
> that checked the return value of LockAcquireExtended. AFAICS this was
> flat out wrong, because it's still passing reportMemoryError = false
> to LockAcquireExtended, meaning

StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-07 Thread Tom Lane
Commit 37c54863c removed the code in StandbyAcquireAccessExclusiveLock that checked the return value of LockAcquireExtended. AFAICS this was flat out wrong, because it's still passing reportMemoryError = false to LockAcquireExtended, meaning there are still cases where LOCKACQUIRE_NOT_AVAIL will b
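
The concern raised in this first message can be illustrated with a small toy model. This is a sketch in Python, not PostgreSQL code: ToyLockTable, LockResult, replay_ignoring_result, and replay_checking_result are hypothetical names made up for the example, and the fixed-capacity set is only an assumed stand-in for a lock table that can run out of room. The one idea borrowed from the message is that with error reporting turned off, a failed acquisition comes back as a "not available" status that the caller has to check.

```python
from enum import Enum


class LockResult(Enum):
    OK = "ok"
    NOT_AVAIL = "not_avail"  # toy analogue of LOCKACQUIRE_NOT_AVAIL


class ToyLockTable:
    """Hypothetical fixed-capacity lock table (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.held = set()

    def acquire(self, relation_id, report_memory_error=True):
        if relation_id in self.held:
            return LockResult.OK
        if len(self.held) >= self.capacity:
            # Table is full: either raise, or report failure via the result.
            if report_memory_error:
                raise MemoryError("out of lock table space")
            return LockResult.NOT_AVAIL
        self.held.add(relation_id)
        return LockResult.OK


def replay_ignoring_result(table, relation_id):
    """Drops the return value, so a failed acquisition goes unnoticed."""
    table.acquire(relation_id, report_memory_error=False)
    # The caller now assumes the lock is held; report whether it really is.
    return relation_id in table.held


def replay_checking_result(table, relation_id):
    """Checks the return value and fails loudly if the lock was not taken."""
    result = table.acquire(relation_id, report_memory_error=False)
    if result is not LockResult.OK:
        raise RuntimeError(f"could not lock relation {relation_id}")
    return True
```

With a table of capacity 1 that already holds one lock, replay_ignoring_result returns False without any error, while replay_checking_result raises. Calling acquire with report_memory_error left at its default raises directly, which is the toy counterpart of dropping the "= false" argument.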