On Thu, Sep 15, 2016 at 06:21:30PM -0400, Robert Haas wrote:
> On Thu, Sep 15, 2016 at 5:22 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmh...@gmail.com> writes:
> >> Of course, it's also possible that the ParallelWorkerNumber code is
> >> entirely correct and something overwrote the null bytes that were
> >> supposed to be found at that location. It would be very useful to see
> >> (a) the value of ParallelWorkerNumber and (b) the contents of
> >> vmq->mq_sender, and in particular whether that's actually a valid
> >> pointer to a PGPROC in the ProcArray. But unless we can reproduce
> >> this I don't see how to manage that.
> >
> > Is it worth replacing that Assert with a test-and-elog that would
> > print those values?
> >
> > Given that we've seen only one such instance in the buildfarm, this
> > might've been just a cosmic ray bit-flip. So one part of me says
> > not to worry too much until we see it again. OTOH, if it is real
> > but rare, missing an opportunity to diagnose would be bad.
>
> I wonder if we could persuade somebody to run pgbench on a Windows
> machine with a similar environment, at least some concurrency, and
> force_parallel_mode=on. Assuming this is a generic
> initialize-the-parallel-stuff bug and not something specific to a
> particular query, that might well trigger it a lot quicker than
> waiting for it to recur in the BF.

For the sake of this thread in the archives, the cause was almost surely a
Cygwin signal handling bug:

https://postgr.es/m/20170803034740.ga2641...@rfd.leadboat.com
https://marc.info/?t=150183296400001
https://marc.info/?t=150291861700011