On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote:
> > Also notice the vacuum process was interrupted, same as yesterday (thank
> > goodness for full logs).  Our INSERT script is using python
> > multiprocessing.pool() with "maxtasksperchild=1", which I think means we load
> > one file and then exit the subprocess, and pool() creates a new subproc, which
> > starts a new PG session and transaction.  Which explains why autovacuum starts
> > processing the table only to be immediately interrupted.
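
For reference, the "one file per subprocess" behavior described in the quote above can be checked with a minimal sketch. It is not the actual loader: `load_one_file` and `run_loads` are hypothetical names, the file names are placeholders, and the "fork" start method is an assumption (POSIX only). The class in the Python standard library is spelled `multiprocessing.Pool`.

```python
import multiprocessing as mp
import os


def load_one_file(path):
    # Hypothetical stand-in for the real loader: a real worker would open a
    # new PostgreSQL session here, INSERT the file's rows, and commit.
    return path, os.getpid()


def run_loads(paths):
    # Assumption: "fork" start method, so this sketch is POSIX-only.
    ctx = mp.get_context("fork")
    # maxtasksperchild=1 makes each worker exit after a single task; the pool
    # then starts a replacement worker for the next task.
    with ctx.Pool(processes=1, maxtasksperchild=1) as pool:
        # chunksize=1 so that one file corresponds to exactly one task.
        return pool.map(load_one_file, paths, chunksize=1)


if __name__ == "__main__":
    for path, pid in run_loads(["a.csv", "b.csv", "c.csv"]):
        print(path, "handled by pid", pid)
```

Each file should report a different pid, which matches the reading that every load runs in a fresh process and therefore in a fresh session. It says nothing by itself about why autovacuum gets canceled.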
On Sun, Oct 15, 2017 at 01:57:14AM +0200, Tomas Vondra wrote:
> I don't follow. Why does it explain that autovacuum gets canceled? I
> mean, merely opening a new connection/session should not cancel
> autovacuum. That requires a command that requires table-level lock
> conflicting with autovacuum (so e.g. explicit LOCK command, DDL, ...).

I was thinking that INSERT would do it, but I gather you're right about
autovacuum.  Let me get back to you about this..

> > Due to a .."behavioral deficiency" in the loader for those tables, the crashed
> > backend causes the loader to get stuck, so the tables should be untouched since
> > the crash, should it be desirable to inspect them.
> >
>
> It's a bit difficult to guess what went wrong from this backtrace. For
> me gdb typically prints a bunch of lines immediately before the frames,
> explaining what went wrong - not sure why it's missing here.

Do you mean this ?

...
Loaded symbols for /lib64/libnss_files-2.12.so
Core was generated by `postgres: autovacuum worker process gtt '.
Program terminated with signal 11, Segmentation fault.
#0  pfree (pointer=0x298c740) at mcxt.c:954
954             (*context->methods->free_p) (context, pointer);

> Perhaps some of those pointers are bogus, the memory was already pfree-d
> or something like that. You'll have to poke around and try dereferencing
> the pointers to find what works and what does not.
>
> For example what do these gdb commands do in the #0 frame?
>
> (gdb) p *(MemoryContext)context

(gdb) p *(MemoryContext)context
Cannot access memory at address 0x7474617261763a20

> (gdb) p *GetMemoryChunkContext(pointer)

(gdb) p *GetMemoryChunkContext(pointer)
No symbol "GetMemoryChunkContext" in current context.

I had to do this since it's apparently inlined/macro:

(gdb) p *(MemoryContext *) (((char *) pointer) - sizeof(void *))
$8 = (MemoryContext) 0x7474617261763a20

I uploaded the corefile:
http://telsasoft.com/tmp/coredump-postgres-autovacuum-brin-summarize.gz

Justin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
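
A side note on the value 0x7474617261763a20 that gdb reports for the chunk's context pointer: every byte of it falls in the printable ASCII range, so it may be string data rather than an address. The sketch below reinterprets the value as the eight bytes it would occupy in memory. The helper name `pointer_as_text` is made up for illustration, and little-endian byte order is an assumption (the thread does not state the architecture).

```python
# Value printed by gdb in this thread for the chunk's context pointer.
BAD_POINTER = 0x7474617261763A20


def pointer_as_text(value, byteorder="little"):
    # Lay the 64-bit value out as 8 bytes in the given byte order and show
    # printable ASCII bytes as characters, anything else as ".".
    raw = value.to_bytes(8, byteorder)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    print(repr(pointer_as_text(BAD_POINTER)))  # prints ' :varatt'
```

Under that assumption the bytes read " :varatt", which resembles the start of a field label such as ":varattno" in PostgreSQL's text representation of node trees. That would be consistent with the chunk header having been overwritten by string data, or with the pointer not referring to the start of a palloc'd chunk, but it is only an observation and not a diagnosis.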