Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-27 Thread Amit Kapila
On Thu, Oct 18, 2018 at 2:33 PM Thomas Munro wrote: > > On Thu, Oct 18, 2018 at 5:00 PM Amit Kapila wrote: > > The below code seems to be problematic: > > dsm_cleanup_using_control_segment() > > { > > .. > > if (!dsm_control_segment_sane(old_control, mapped_size)) > > { > > dsm_impl_op(DSM_OP_DET

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-18 Thread Thomas Munro
On Thu, Oct 18, 2018 at 5:00 PM Amit Kapila wrote: > The below code seems to be problematic: > dsm_cleanup_using_control_segment() > { > .. > if (!dsm_control_segment_sane(old_control, mapped_size)) > { > dsm_impl_op(DSM_OP_DETACH, old_control_handle, 0, &impl_private, > &mapped_address, &mapped_s

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-18 Thread Thomas Munro
On Thu, Oct 18, 2018 at 6:02 PM Larry Rosenman wrote: > Let me know soon(ish) if any of you want to poke at this machine, as I'm > likely to forget and reboot it. Hi Larry, Thanks for the offer but it looks like there is no way to get our hands on that memory for forensics now. I'll see if

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Larry Rosenman
On Wed, Oct 17, 2018 at 08:19:52PM -0500, Larry Rosenman wrote: > On Thu, Oct 18, 2018 at 02:17:14PM +1300, Thomas Munro wrote: > > On Thu, Oct 18, 2018 at 1:10 PM Tom Lane wrote: > > > ... However, I'm still slightly interested in how it > > > was that that broke DSM so thoroughly ... > > > > Me

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Amit Kapila
On Thu, Oct 18, 2018 at 6:30 AM Thomas Munro wrote: > > On Thu, Oct 18, 2018 at 11:08 AM Thomas Munro > wrote: > > On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro > > wrote: > > > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane wrote: > > > > I would argue that both dsm_postmaster_shutdown and > > > > d

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Thomas Munro
On Thu, Oct 18, 2018 at 2:36 PM Tom Lane wrote: > Larry's REL_10_STABLE failure logs are interesting: > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=peripatus&dt=2018-10-17%2020%3A42%3A17 > > 2018-10-17 15:48:08.849 CDT [55240:7] LOG: dynamic shared memory control > segment is corru

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Tom Lane
Thomas Munro writes: > On Thu, Oct 18, 2018 at 1:10 PM Tom Lane wrote: >> ... However, I'm still slightly interested in how it >> was that that broke DSM so thoroughly ... > Me too. Frustratingly, that vm object might still exist on Larry's > machine if it hasn't been rebooted (since we failed

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Larry Rosenman
On Thu, Oct 18, 2018 at 02:17:14PM +1300, Thomas Munro wrote: > On Thu, Oct 18, 2018 at 1:10 PM Tom Lane wrote: > > ... However, I'm still slightly interested in how it > > was that that broke DSM so thoroughly ... > > Me too. Frustratingly, that vm object might still exist on Larry's > machine

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Thomas Munro
On Thu, Oct 18, 2018 at 1:10 PM Tom Lane wrote: > ... However, I'm still slightly interested in how it > was that that broke DSM so thoroughly ... Me too. Frustratingly, that vm object might still exist on Larry's machine if it hasn't been rebooted (since we failed to shm_unlink() it), so if we

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Larry Rosenman
On Wed, Oct 17, 2018 at 08:55:09PM -0400, Tom Lane wrote: > Larry Rosenman writes: > > On Wed, Oct 17, 2018 at 08:10:28PM -0400, Tom Lane wrote: > >> However, I'm still slightly interested in how it > >> was that that broke DSM so thoroughly ... I pulled down your version of > >> python2.7 and wil

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Thomas Munro
On Thu, Oct 18, 2018 at 11:08 AM Thomas Munro wrote: > On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro > wrote: > > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane wrote: > > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup > > > are broken here; the former because it makes no

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Tom Lane
Larry Rosenman writes: > On Wed, Oct 17, 2018 at 08:10:28PM -0400, Tom Lane wrote: >> However, I'm still slightly interested in how it >> was that that broke DSM so thoroughly ... I pulled down your version of >> python2.7 and will see if that reproduces it. > It was built on a previous alpha, so

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Larry Rosenman
On Wed, Oct 17, 2018 at 08:10:28PM -0400, Tom Lane wrote: > Larry Rosenman writes: > > On Wed, Oct 17, 2018 at 07:07:09PM -0400, Tom Lane wrote: > >> ... Was your Python install built > >> with any special switches? I just used what came from "pkg install". > > > It had been built on a previous

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Tom Lane
Larry Rosenman writes: > On Wed, Oct 17, 2018 at 07:07:09PM -0400, Tom Lane wrote: >> ... Was your Python install built >> with any special switches? I just used what came from "pkg install". > It had been built on a previous FreeBSD build, I have my own poudriere > infrastructure. I can probab

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Larry Rosenman
On Wed, Oct 17, 2018 at 07:07:09PM -0400, Tom Lane wrote: > Larry Rosenman writes: > > On the original failure, I recompiled and reinstalled the 2 Pythons I > > have on this box, and at least 9.3 went back to OK. > > Hmm. I'd just finished pulling down FreeBSD-12.0-ALPHA9 and failing > to repr

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Tom Lane
Larry Rosenman writes: > On the original failure, I recompiled and reinstalled the 2 Pythons I > have on this box, and at least 9.3 went back to OK. Hmm. I'd just finished pulling down FreeBSD-12.0-ALPHA9 and failing to reproduce any problem with that ... and then I noticed your box said it wa

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Larry Rosenman
On Thu, Oct 18, 2018 at 11:08:33AM +1300, Thomas Munro wrote: > On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro > wrote: > > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane wrote: > > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup > > > are broken here; the former because it m

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Thomas Munro
On Thu, Oct 18, 2018 at 9:43 AM Thomas Munro wrote: > On Thu, Oct 18, 2018 at 9:00 AM Tom Lane wrote: > > I would argue that both dsm_postmaster_shutdown and dsm_postmaster_startup > > are broken here; the former because it makes no attempt to unmap > > the old control segment (which it oughta be
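
The point quoted above, that shutdown ought to unmap the old control segment even when it looks corrupt, can be illustrated with a minimal sketch. This is not PostgreSQL's dsm.c. The names control_startup, control_shutdown, control_detach and control_segment_sane, and the CONTROL_MAGIC value, are hypothetical, and plain heap memory stands in for a real shared memory mapping. It only shows the shape of the invariant: startup asserts that nothing is mapped, so shutdown has to clear the mapping state on every path, including the "segment is corrupt" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Arbitrary illustrative magic number, not PostgreSQL's. */
#define CONTROL_MAGIC 0x44534D31u

typedef struct
{
    uint32_t magic;
    uint32_t nitems;
} control_header;

/* Process-local record of the current mapping (hypothetical names). */
static void  *control_address = NULL;
static size_t control_mapped_size = 0;

/* Cheap plausibility check on the mapped control segment. */
static bool
control_segment_sane(const control_header *c, size_t mapped_size)
{
    return mapped_size >= sizeof(control_header) && c->magic == CONTROL_MAGIC;
}

/* Release the mapping and reset the bookkeeping in one place. */
static void
control_detach(void)
{
    free(control_address);      /* stand-in for unmapping the segment */
    control_address = NULL;
    control_mapped_size = 0;
}

/* Startup expects a clean slate, like the assertion in the logs. */
static void
control_startup(void)
{
    control_header *c;

    assert(control_mapped_size == 0);
    c = calloc(1, sizeof(*c));
    if (c == NULL)
        abort();
    c->magic = CONTROL_MAGIC;
    control_address = c;
    control_mapped_size = sizeof(*c);
}

/* Shutdown detaches even when the contents cannot be trusted. */
static void
control_shutdown(void)
{
    if (control_address == NULL)
        return;
    if (!control_segment_sane(control_address, control_mapped_size))
        fprintf(stderr, "LOG: control segment is corrupt\n");
    /* Cleanup that walks the contents would be skipped when corrupt,
     * but the mapping itself is always released. */
    control_detach();
}
```

With this shape, calling control_startup(), corrupting the magic field, calling control_shutdown() and then control_startup() again does not trip the startup assertion, because the corrupt path still resets control_mapped_size to zero.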

Re: DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Thomas Munro
On Thu, Oct 18, 2018 at 9:00 AM Tom Lane wrote: > 2018-10-17 13:43:24.235 CDT [46467:6] LOG: dynamic shared memory control > segment is corrupt > TRAP: FailedAssertion("!(dsm_control_mapped_size == 0)", File: "dsm.c", Line: > 181) > > It looks to me like what's happening is > > (1) crashing pro

DSM robustness failure (was Re: Peripatus/failures)

2018-10-17 Thread Tom Lane
Larry Rosenman writes: > That got it further, but still fails at PLCheck-C (at least on 9.3). > It's still running the other branches. Hmm. I'm not sure why plpython is crashing for you, but this is exposing a robustness problem in the DSM logic: https://buildfarm.postgresql.org/cgi-bin/show_lo

Re: Peripatus/failures

2018-10-17 Thread Larry Rosenman
On Wed, Oct 17, 2018 at 01:41:59PM -0400, Tom Lane wrote: > Larry Rosenman writes: > > It looks like my upgrade to the current head of FreeBSD 12-to-be, which > > includes OpenSSL 1.1.1 broke a bunch of our stuff. > > In at least the 9.x branches.  Just a heads up. > > It looks like configure is

Re: Peripatus/failures

2018-10-17 Thread Tom Lane
Larry Rosenman writes: > It looks like my upgrade to the current head of FreeBSD 12-to-be, which > includes OpenSSL 1.1.1 broke a bunch of our stuff. > In at least the 9.x branches.  Just a heads up. It looks like configure is drawing the wrong conclusions about OpenSSL's API options. Since the

Peripatus/failures

2018-10-17 Thread Larry Rosenman
It looks like my upgrade to the current head of FreeBSD 12-to-be, which includes OpenSSL 1.1.1, broke a bunch of our stuff, at least in the 9.x branches. Just a heads up.