On 12/05/2014 12:54 PM, Josh Berkus wrote:
> Hackers,
>
> This is not a complete enough report for a diagnosis.  I'm posting it
> here just in case someone else sees something like it, and having an
> additional report will help figure out the underlying issue.
>
> * 700GB database with around 5,000 writes per second
> * 8 replicas handling around 10,000 read queries per second each
> * replicas are slammed (40-70% utilization)
> * replication produces lots of replication query cancels
>
> In this scenario, a specific query against some of the less busy and
> fairly small tables would produce a segfault (signal 11) once every 1-4
> days randomly.  This query could have 100's of successful runs for every
> segfault.  This was not reproducible manually, and the segfaults never
> happened on the master.  Nor did we ever see a segfault based on any
> other query, including against the tables which were generally the
> source of the query cancels.
>
> In case it's relevant, the query included use of regexp_split_to_array()
> and ORDER BY random(), neither of which are generally used in the user's
> other queries.
>
> We made some changes which decreased query cancel (optimizing queries,
> turning on hot_standby_feedback) and we haven't seen a segfault since
> then.  As far as the user is concerned, this solves the problem, so I'm
> never going to get a trace or a core dump file.

Forgot a major piece of evidence as to why I think this is related to
query cancel: in each case, the segfault was preceded by a
multi-backend query cancel 3ms to 30ms beforehand.  It is possible that
the backend running the query which segfaulted was the only backend
*not* cancelled by that concurrent query conflict.

Contradicting this, there are other multi-backend query cancels in the
logs which do NOT produce a segfault.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com