Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-18 Thread Michael Paquier
On Fri, Oct 18, 2019 at 07:30:37AM -0300, Alvaro Herrera wrote: > Sure thing, thanks, done :-) Thanks, Alvaro. -- Michael

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-18 Thread Alvaro Herrera
On 2019-Oct-18, Michael Paquier wrote: > What you are proposing here sounds fine to me. Perhaps you would > prefer to adjust the code yourself? Sure thing, thanks, done :-) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-17 Thread Michael Paquier
On Thu, Oct 17, 2019 at 06:56:48AM -0300, Alvaro Herrera wrote: > On 2019-Oct-17, Michael Paquier wrote: >> pgstat_progress_end_command() is done for REINDEX CONCURRENTLY after >> the concurrent drop, so it made sense to me to still report any PID >> REINDEX CONC is waiting for at this stage. > >

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-17 Thread Alvaro Herrera
On 2019-Oct-17, Michael Paquier wrote: > On Thu, Oct 17, 2019 at 05:33:22AM -0300, Alvaro Herrera wrote: > > Hmm, I wonder if it isn't the right solution to set 'progress' to false > > in that spot, instead. index_drop says it must only be called by the > > dependency machinery; are we depending

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-17 Thread Michael Paquier
On Thu, Oct 17, 2019 at 05:33:22AM -0300, Alvaro Herrera wrote: > Hmm, I wonder if it isn't the right solution to set 'progress' to false > in that spot, instead. index_drop says it must only be called by the > dependency machinery; are we depending on that to pass-through the need > to update pro

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-17 Thread Alvaro Herrera
On 2019-Oct-17, Michael Paquier wrote: > You may not have a backtrace, but I think that you are right: > WaitForLockers() gets called in index_drop() with progress reporting > enabled. index_drop() would also be taken by REINDEX CONCURRENTLY > through performMultipleDeletions() but we cannot know

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-16 Thread Michael Paquier
On Wed, Oct 16, 2019 at 04:11:46PM -0500, Justin Pryzby wrote: > On Sun, Oct 13, 2019 at 04:18:34PM -0300, Alvaro Herrera wrote: >> (FWIW I expect the crash is possible not just in reindex but also in >> CREATE INDEX CONCURRENTLY.) > > FWIW, for sake of list archives, and for anyone running v12 ho

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-16 Thread Michael Paquier
On Wed, Oct 16, 2019 at 09:53:56AM -0300, Alvaro Herrera wrote: > Thanks, pushed. Thanks, Alvaro. -- Michael

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-16 Thread Justin Pryzby
On Sun, Oct 13, 2019 at 04:18:34PM -0300, Alvaro Herrera wrote: > (FWIW I expect the crash is possible not just in reindex but also in > CREATE INDEX CONCURRENTLY.) FWIW, for sake of list archives, and for anyone running v12 hoping to avoid crashing, I believe we hit this for DROP INDEX CONCURRENT

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-16 Thread Alvaro Herrera
On 2019-Oct-15, Michael Paquier wrote: > So, Alvaro, your patch looks good to me. Could you apply it? Thanks, pushed. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-14 Thread Michael Paquier
On Mon, Oct 14, 2019 at 08:57:16AM +0900, Michael Paquier wrote: > I need to think about that, but shouldn't we have a way to reproduce > that case rather reliably with an isolation test? The patch looks > good to me, these are also the two places I spotted yesterday after a > quick lookup. Th

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-14 Thread Justin Pryzby
On Sun, Oct 13, 2019 at 03:10:21PM -0300, Alvaro Herrera wrote: > On 2019-Oct-13, Justin Pryzby wrote: > > > Looks like it's a race condition and dereferencing *holder=NULL. The first > > crash was probably the same bug, due to report query running during "reindex > > CONCURRENTLY", and probably

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-14 Thread Alvaro Herrera
On 2019-Oct-13, Justin Pryzby wrote: > Looks like it's a race condition and dereferencing *holder=NULL. The first > crash was probably the same bug, due to report query running during "reindex > CONCURRENTLY", and probably finished at nearly the same time as another > locker. Ooh, right, makes

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-14 Thread Justin Pryzby
On Sun, Oct 13, 2019 at 06:06:43PM +0900, Michael Paquier wrote: > On Fri, Oct 11, 2019 at 07:44:46PM -0500, Justin Pryzby wrote: > > Unfortunately, there was no core file, and I'm still trying to reproduce it. > > Forgot to set ulimit -c? Having a backtrace would surely help. Fortunately (?) an

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-13 Thread Michael Paquier
On Sun, Oct 13, 2019 at 04:18:34PM -0300, Alvaro Herrera wrote: > True. And we can copy the resulting comment to the other spot. > > (FWIW I expect the crash is possible not just in reindex but also in > CREATE INDEX CONCURRENTLY.) I need to think about that, but shouldn't we have a way to repro

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-13 Thread Alvaro Herrera
On 2019-Oct-13, Justin Pryzby wrote: > On Sun, Oct 13, 2019 at 03:10:21PM -0300, Alvaro Herrera wrote: > > On 2019-Oct-13, Justin Pryzby wrote: > > > > > Looks like it's a race condition and dereferencing *holder=NULL. The > > > first > > > crash was probably the same bug, due to report query r

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-13 Thread Justin Pryzby
Resending this message, which didn't make it to the list when I sent it earlier. (And, notified -www). On Sun, Oct 13, 2019 at 06:06:43PM +0900, Michael Paquier wrote: > On Fri, Oct 11, 2019 at 07:44:46PM -0500, Justin Pryzby wrote: > > Unfortunately, there was no core file, and I'm still trying

Re: v12.0: segfault in reindex CONCURRENTLY

2019-10-13 Thread Michael Paquier
On Fri, Oct 11, 2019 at 07:44:46PM -0500, Justin Pryzby wrote: > That's an index on a table partition, but not itself a child of a relkind=I > index. Interesting. Testing with a partition tree, and indexes on leaves which do not have dependencies with a parent I cannot reproduce anything. Perhap