Re: [sqlsmith] Unpinning error in parallel worker

2018-05-22 Thread Thomas Munro
On Wed, Apr 25, 2018 at 8:15 AM, Jonathan Rudenberg wrote:
> On Tue, Apr 24, 2018, at 16:06, Thomas Munro wrote:
>> I'll write a patch to fix that unpleasant symptom. While holding
>> DynamicSharedMemoryControlLock we shouldn't raise any errors without
>> releasing it first, because the error han

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-24 Thread Jonathan Rudenberg
On Tue, Apr 24, 2018, at 16:06, Thomas Munro wrote:
> On Wed, Apr 25, 2018 at 2:21 AM, Jonathan Rudenberg
> wrote:
> > This issue happened again in production, here are the stack traces from
> > three we grabbed before nuking the >400 hanging backends.
> >
> > [...]
> > #4 0x55fccb93b21c i

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-24 Thread Thomas Munro
On Wed, Apr 25, 2018 at 2:21 AM, Jonathan Rudenberg wrote:
> This issue happened again in production, here are the stack traces from three
> we grabbed before nuking the >400 hanging backends.
>
> [...]
> #4 0x55fccb93b21c in LWLockAcquire+188() at
> /usr/lib/postgresql/10/bin/postgres at l

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-24 Thread Jonathan Rudenberg
On Fri, Apr 20, 2018, at 00:42, Thomas Munro wrote:
> On Wed, Apr 18, 2018 at 11:43 AM, Jonathan Rudenberg
> wrote:
> > On Tue, Apr 17, 2018, at 19:31, Thomas Munro wrote:
> >> On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg
> >> wrote:
> >> > Yep, I think I know approximately what it look

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-20 Thread Jonathan Rudenberg
On Fri, Apr 20, 2018, at 00:42, Thomas Munro wrote:
> On Wed, Apr 18, 2018 at 11:43 AM, Jonathan Rudenberg
> wrote:
> > On Tue, Apr 17, 2018, at 19:31, Thomas Munro wrote:
> >> On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg
> >> wrote:
> >> > Yep, I think I know approximately what it look

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-19 Thread Thomas Munro
On Wed, Apr 18, 2018 at 11:43 AM, Jonathan Rudenberg wrote:
> On Tue, Apr 17, 2018, at 19:31, Thomas Munro wrote:
>> On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg
>> wrote:
>> > Yep, I think I know approximately what it looked like, I've attached a
>> > lightly redacted plan. All of the h

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Jonathan Rudenberg
On Tue, Apr 17, 2018, at 19:31, Thomas Munro wrote:
> On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg
> wrote:
> > On Tue, Apr 17, 2018, at 18:38, Thomas Munro wrote:
> >> Thanks, that would be much appreciated, as would any clues about what
> >> workload you're running. Do you know what the

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Thomas Munro
On Wed, Apr 18, 2018 at 11:01 AM, Jonathan Rudenberg wrote:
> On Tue, Apr 17, 2018, at 18:38, Thomas Munro wrote:
>> Thanks, that would be much appreciated, as would any clues about what
>> workload you're running. Do you know what the query plan looks like
>> for the queries that crashed?
>
> Ye

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Jonathan Rudenberg
On Tue, Apr 17, 2018, at 18:38, Thomas Munro wrote:
> I don't have any theories about how that could be going wrong right
> now, but I'm looking into it.

Thank you!

> > I don't have a backtrace yet, but I will provide them if/when the issue
> > happens again.
>
> Thanks, that would be much

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Thomas Munro
On Wed, Apr 18, 2018 at 8:52 AM, Jonathan Rudenberg wrote:
> Hundreds of queries stuck with a wait_event of DynamicSharedMemoryControlLock
> and pg_terminate_backend did not terminate the queries.
>
> In the log:
>
>> FATAL: cannot unpin a segment that is not pinned

Thanks for the report. That

Re: [sqlsmith] Unpinning error in parallel worker

2018-04-17 Thread Jonathan Rudenberg
On Wed, Mar 29, 2017, at 10:50, Robert Haas wrote:
> On Wed, Mar 29, 2017 at 1:31 AM, Thomas Munro
> wrote:
> > I considered whether the error message could be improved but it
> > matches the message for an existing similar case (where you try to
> > attach to an unknown handle).
>
> Ugh, OK. I