Re: [GENERAL] seg fault crashed the postmaster

2011-01-04 Thread Gordon Shannon
I'm putting this on this thread, since it could be related to the issue. I'm now seeing this in the log on the HSB/SR server. It's happened about 4 times in the past 2 days.
23964 2011-01-04 05:23:00 EST [47]LOG: invalid record length at 6E53/46E8A010
23535 2011-01-04 05:23:00 EST [2]FATAL
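For readers unfamiliar with the notation: the location in the "invalid record length" message is a WAL position printed as two hexadecimal halves separated by a slash. The following is a minimal Python sketch for splitting such a value, assuming the common convention that the first half holds the high 32 bits and the second half the low 32 bits. The helper name `parse_lsn` is hypothetical and is not part of PostgreSQL.

```python
def parse_lsn(text):
    """Split an 'X/Y' WAL location into (high, low, combined) integers.

    Assumption: X is the high 32 bits and Y the low 32 bits. The combined
    value is only meant as a convenient key for ordering two locations.
    """
    high_hex, low_hex = text.strip().split("/")
    high = int(high_hex, 16)
    low = int(low_hex, 16)
    return high, low, (high << 32) | low


# The location reported in the log line above.
high, low, combined = parse_lsn("6E53/46E8A010")
print(hex(high), hex(low), hex(combined))
```

Comparing the combined values of two logged locations shows which one comes later in the WAL stream, which can help when lining up messages from the primary and the standby.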

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
Unfortunately it's now impossible to say how many were updated, as they get deleted by another process later. I may be able to restore part of a dump from 2 days ago on another machine, and get some counts from that, assuming I have the disk space. I'll work on that. I do not believe there could

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Tom Lane
Gordon Shannon writes:
> I assume you can now see the plan? I uploaded it twice, once via gmail and
> once on Nabble.
Yeah, the Nabble one works. Now I'm even more confused, because the whole-row var seems to be coming from the outside of the nestloop, which is about the simplest possible case.

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
The number of matching rows on these queries is anything from 0 to 1. I don't think I can tell how many would have matched on the ones that crashed. Although I suspect it would have been toward the 1 end. I've been trying to get a reproducible test case with no luck so far. I assume y

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Tom Lane
I wrote:
> The odds seem pretty good that the "corrupt compressed data" message
> has the same origin at bottom, although the lack of any obvious data
> to be compressed in this table is confusing. Maybe you could get that
> from trying to copy over a garbage value of that one varchar column,
> th

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
Maybe it doesn't work from gmail. I'll try uploading from here.
http://postgresql.1045698.n5.nabble.com/file/n3323933/plan.txt
--
View this message in context: http://postgresql.1045698.n5.nabble.com/seg-fault-crashed-the-postmaster-tp3323117p3323933.html
Sent from the PostgreSQL - g

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Tom Lane
Gordon Shannon writes:
> Enclosed is the query plan -- 21000 lines
Um ... nope?
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
Yes, that query does take 30 or 90 secs. I'm pretty sure it was blocking on its twin update running concurrently. However, I'm not really sure how to identify what "transaction 1283585646" was. Enclosed is the query plan -- 21000 lines
-gordon
I tried to replicate the problem here without succes

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Tom Lane
Gordon Shannon writes:
> Sorry, I left that out. Yeah, I wondered that too, since these tables do
> not use toast.
Hm. Well, given that the stack trace suggests we're trying to access a tuple value that's not there (bogus pointer, or data overwritten since the pointer was created), the "invalid

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
Sorry, I left that out. Yeah, I wondered that too, since these tables do not use toast.
CREATE TYPE message_status_enum AS ENUM ( 'V', 'X', 'S', 'R', 'U', 'D' );
On Fri, Dec 31, 2010 at 12:38 PM, Tom Lane-2 [via PostgreSQL] < ml-node+3323859-1425181809-56...@n5.nabble.com > wrote:
> Hmmm ...

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Tom Lane
Gordon Shannon writes:
> Here is the ddl for the tables in question. There are foreign keys to other
> tables that I omitted.
> http://postgresql.1045698.n5.nabble.com/file/n3323804/parts.sql
Hmmm ... what is "message_status_enum"? Is that an actual enum type, or some kind of domain

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
Here is the ddl for the tables in question. There are foreign keys to other tables that I omitted.
http://postgresql.1045698.n5.nabble.com/file/n3323804/parts.sql

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Gordon Shannon
Interesting. That's exactly what we have been doing -- trying to update the same rows in multiple txns. For us to proceed in production, I will take steps to ensure we stop doing that, as it's just an app bug really. The table in question -- v_messages -- is an empty base table with 76 partitions

Re: [GENERAL] seg fault crashed the postmaster

2010-12-31 Thread Tom Lane
Gordon Shannon writes:
> Stack trace:
> #0 0x0031a147c15c in memcpy () from /lib64/libc.so.6
> #1 0x00450cb8 in __memcpy_ichk (tuple=0x7fffb29ac900) at /usr/include/bits/string3.h:51
> #2 heap_copytuple (tuple=0x7fffb29ac900) at heaptuple.c:592
> #3 0x00543d4c in EvalPla

Re: [GENERAL] seg fault crashed the postmaster

2010-12-30 Thread Gordon Shannon
Stack trace:
#0 0x0031a147c15c in memcpy () from /lib64/libc.so.6
#1 0x00450cb8 in __memcpy_ichk (tuple=0x7fffb29ac900) at /usr/include/bits/string3.h:51
#2 heap_copytuple (tuple=0x7fffb29ac900) at heaptuple.c:592
#3 0x00543d4c in EvalPlanQualFetchRowMarks (epqstate=0x3cd8

Re: [GENERAL] seg fault crashed the postmaster

2010-12-30 Thread Tom Lane
Gordon Shannon writes:
> I'd love to send you a stack trace. Any suggestions on how to get one? It
> has since happened again, on the same update command, so I'm guessing I can
> repeat it.
http://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend

Re: [GENERAL] seg fault crashed the postmaster

2010-12-30 Thread Gordon Shannon
I'd love to send you a stack trace. Any suggestions on how to get one? It has since happened again, on the same update command, so I'm guessing I can repeat it.
On Thu, Dec 30, 2010 at 6:52 PM, Tom Lane-2 [via PostgreSQL] < ml-node+3323151-436577542-56...@n5.nabble.com > wrote:
> Gordon Shannon

Re: [GENERAL] seg fault crashed the postmaster

2010-12-30 Thread Tom Lane
Gordon Shannon writes:
> Running Centos, just upgraded our production db from 8.4.4 to 9.0.2 last
> night. About 20 hours later, an update statement seg faulted and crashed
> the server. This is a typical update that has worked fine for a long time.
Could we see a stack trace from that? Or at l