Re: [HACKERS] shm_mq_set_sender() crash

Noah Misch Sun, 01 Oct 2017 14:27:37 -0700

On Thu, Sep 15, 2016 at 06:21:30PM -0400, Robert Haas wrote:
> On Thu, Sep 15, 2016 at 5:22 PM, Tom Lane <[email protected]> wrote:
> > Robert Haas <[email protected]> writes:
> >> Of course, it's also possible that the ParallelWorkerNumber code is
> >> entirely correct and something overwrote the null bytes that were
> >> supposed to be found at that location.  It would be very useful to see
> >> (a) the value of ParallelWorkerNumber and (b) the contents of
> >> vmq->mq_sender, and in particular whether that's actually a valid
> >> pointer to a PGPROC in the ProcArray.  But unless we can reproduce
> >> this I don't see how to manage that.
> >
> > Is it worth replacing that Assert with a test-and-elog that would
> > print those values?
> >
> > Given that we've seen only one such instance in the buildfarm, this
> > might've been just a cosmic ray bit-flip.  So one part of me says
> > not to worry too much until we see it again.  OTOH, if it is real
> > but rare, missing an opportunity to diagnose would be bad.
> 
> I wonder if we could persuade somebody to run pgbench on a Windows
> machine with a similar environment, at least some concurrency, and
> force_parallel_mode=on.  Assuming this is a generic
> initialize-the-parallel-stuff bug and not something specific to a
> particular query, that might well trigger it a lot quicker than
> waiting for it to recur in the BF.


For the sake of this thread in the archives, the cause was almost surely a
Cygwin signal handling bug:

  https://postgr.es/m/[email protected]
  https://marc.info/?t=150183296400001
  https://marc.info/?t=150291861700011


