Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-07 Thread Tom Lane
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > The WIP patch looks good to me. I haven't yet tested it (will wait for the > final version). The following pointer arithmetic caught my eye though. > ! nunused = (end - nowunused); > Shouldn't we typecast them to (char *) first ? No ... we want the

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-07 Thread Pavan Deolasee
On Thu, Mar 6, 2008 at 11:30 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > > I think that just makes things more complex and fragile. I like > Heikki's idea, in part because it makes the normal path and the WAL > recovery path guaranteed to work alike. I'll attach my work-in-progress > patch for

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-06 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > I'm glad we got away with a single "marked" array. I was afraid we would > need to consult the unused/redirected/dead arrays separately. Yeah, I was worried about that too. The fundamental reason why it's okay seems to be that redirects can only

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-06 Thread Heikki Linnakangas
Tom Lane wrote: "Pavan Deolasee" <[EMAIL PROTECTED]> writes: On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <[EMAIL PROTECTED]> wrote: [ thinks some more... ] I guess we could use a flag array dimensioned MaxHeapTuplesPerPage to mark already-processed tuples, so that you wouldn't need to search the

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-06 Thread Tom Lane
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >> [ thinks some more... ] I guess we could use a flag array dimensioned >> MaxHeapTuplesPerPage to mark already-processed tuples, so that you >> wouldn't need to search the existing

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-06 Thread Pavan Deolasee
On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > > > [ thinks some more... ] I guess we could use a flag array dimensioned > MaxHeapTuplesPerPage to mark already-processed tuples, so that you > wouldn't need to search the existing arrays but just index into the flag > arra
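The flag-array idea quoted above can be sketched roughly as follows: instead of searching several result arrays to see whether a tuple was already handled, keep one boolean per line pointer and index into it directly. This is only an illustration; the struct and function names are hypothetical, and the bound of 291 is an assumed value for MaxHeapTuplesPerPage, not taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: illustrative bound standing in for MaxHeapTuplesPerPage. */
#define MAX_TUPLES_PER_PAGE 291

typedef struct PruneMarks
{
    /* marked[off] is true once offset `off` has been processed; slot 0 is unused. */
    bool marked[MAX_TUPLES_PER_PAGE + 1];
} PruneMarks;

/* Clear all flags before processing a page. */
static void
marks_init(PruneMarks *m)
{
    memset(m->marked, 0, sizeof(m->marked));
}

/* Mark an offset; returns true if it was newly marked, false if already processed. */
static bool
marks_try_mark(PruneMarks *m, unsigned off)
{
    assert(off >= 1 && off <= MAX_TUPLES_PER_PAGE);
    if (m->marked[off])
        return false;
    m->marked[off] = true;
    return true;
}
```

The lookup is a single array access per tuple, at the cost of a few hundred bytes of local storage per page processed.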

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
"Gavin M. Roy" <[EMAIL PROTECTED]> writes: > On Wed, Mar 5, 2008 at 10:13 AM, Tom Lane <[EMAIL PROTECTED]> wrote: >> Actually, maybe it *has* been seen before. Gavin, are you in the habit >> of running concurrent VACUUM FULLs on system catalogs, and if so have >> you noted that they occasionally g

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
"Gavin M. Roy" <[EMAIL PROTECTED]> writes: > 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected > 2008-03-04 05:45:20 EST [6742]: [8-1] DETAIL: Process 6742 waits for > AccessShareLock on relation 2619 of database 16385; blocked by process 6740. > Process 6740 waits for AccessShareLoc

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> I think we really are at too much risk of PANIC the way it's being done >> now. Has anyone got a better idea? > We could do the pruning in two phases: first figure out what to do > without modifyng anything, outside critical-s

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Gavin M. Roy
2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected
2008-03-04 05:45:20 EST [6742]: [8-1] DETAIL: Process 6742 waits for AccessShareLock on relation 2619 of database 16385; blocked by process 6740.
Process 6740 waits for AccessShareLock on relation 1247 of database 16385; blocked by pr

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
"Gavin M. Roy" <[EMAIL PROTECTED]> writes: > The panic may have made it if this is what you were looking for: > 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected > 2008-03-04 05:58:33 EST [8751]: [3-1] PANIC: deadlock detected That's what I expected to find, but where's the DETAIL

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Heikki Linnakangas
Tom Lane wrote: The reason the critical section is so large is that we're manipulating the contents of a shared buffer, and we don't want a failure to leave a partially-modified page in the buffer. We could fix that if we were to memcpy the page into local storage and do all the pruning work the
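The copy-then-modify alternative mentioned above can be sketched as follows, under the assumption of a fixed 8192-byte page and a caller-supplied worker function. Everything named here (the size macro, the callback type, the two example workers) is hypothetical and exists only to show the shape of the technique: all fallible work happens on a private copy, and the shared buffer is overwritten in one step only after the work has succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: illustrative page size. */
#define TOY_BLCKSZ 8192

/* Hypothetical worker: edits a private page copy, returns false on failure. */
typedef bool (*PageWorker)(unsigned char *page);

/* Run `work` on a local copy; publish to the shared page only on success. */
static bool
modify_via_copy(unsigned char *shared, PageWorker work)
{
    unsigned char local[TOY_BLCKSZ];

    assert(shared != NULL && work != NULL);
    memcpy(local, shared, TOY_BLCKSZ);
    if (!work(local))
        return false;               /* shared page was never touched */
    memcpy(shared, local, TOY_BLCKSZ);  /* the only step that writes shared data */
    return true;
}

/* Example worker that fails after a partial modification. */
static bool
worker_fails_midway(unsigned char *page)
{
    page[0] = 0x01;
    return false;
}

/* Example worker that completes its modification. */
static bool
worker_succeeds(unsigned char *page)
{
    page[0] = 0x01;
    return true;
}
```

The trade-off in a design like this is two full-page copies per operation in exchange for never exposing a partially modified page.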

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Gavin M. Roy
On Wed, Mar 5, 2008 at 10:31 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Gavin M. Roy" <[EMAIL PROTECTED]> writes:
> > 2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for
> > AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> > 2008-03-04 05:45:47 EST [6

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Gavin M. Roy
On Wed, Mar 5, 2008 at 10:13 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > I wrote: > > In particular, if that's the problem, why has this not been seen before? > > The fact that it's going through heap_page_prune doesn't seem very > > relevant --- VACUUM FULL has certainly always had to invoke > > Ca

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
"Gavin M. Roy" <[EMAIL PROTECTED]> writes: > 2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for > AccessShareLock on relation 1247 of database 16385 after 1001.519 ms > 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT: VACUUM FULL > autograph.autograph_creators > 2008-03-04

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Gavin M. Roy
2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT: VACUUM FULL autograph.autograph_creators
2008-03-04 05:46:28 EST [6730]: [1-1] LOG: process 6730 still w

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
I wrote: > In particular, if that's the problem, why has this not been seen before? > The fact that it's going through heap_page_prune doesn't seem very > relevant --- VACUUM FULL has certainly always had to invoke > CacheInvalidateHeapTuple someplace or other. So I still want to see > the deadloc

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Tom Lane
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > Why not just unconditionally finish the phase 2 as part of InitPostgres ? You're jumping to a patch before we even understand what's happening. In particular, if that's the problem, why has this not been seen before? The fact that it's going through h

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Pavan Deolasee
On Wed, Mar 5, 2008 at 3:41 PM, Pavan Deolasee <[EMAIL PROTECTED]> wrote:
> Two backends try to vacuum full two different catalog tables. Each acquires an
> exclusive lock on the respective catalog relation. Then each tries to initialize its
> own catalog cache. But to do that they nee
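The scenario described above is a lock-ordering deadlock. As a hedged illustration, assume each backend ends up waiting for a lock the other one holds; the situation can then be modeled as a wait-for graph in which every process waits on at most one other process, and a deadlock is a cycle in that graph. The process numbering, the array layout, and the function name below are all assumptions made for this sketch and do not reflect how the server's deadlock detector is written.

```c
#include <assert.h>
#include <stdbool.h>

#define NPROCS 4
#define NOT_WAITING (-1)

/*
 * waits_for[p] is the index of the process that p is blocked by,
 * or NOT_WAITING. Returns true if `start` is part of a wait cycle.
 */
static bool
in_deadlock(const int waits_for[NPROCS], int start)
{
    int cur = start;

    assert(start >= 0 && start < NPROCS);
    for (int steps = 0; steps < NPROCS; steps++)
    {
        cur = waits_for[cur];
        if (cur == NOT_WAITING)
            return false;           /* chain ends: someone can make progress */
        assert(cur >= 0 && cur < NPROCS);
        if (cur == start)
            return true;            /* followed the chain back to ourselves */
    }
    return false;                   /* blocked behind a cycle, but not part of it */
}
```

With two processes that each wait on the other, both are reported as deadlocked, while a third process queued behind one of them is blocked but not itself on the cycle.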

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-05 Thread Pavan Deolasee
On Wed, Mar 5, 2008 at 8:26 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Gavin M. Roy" <[EMAIL PROTECTED]> writes:
> > (gdb) where
> > #0 0x003fe362e21d in raise () from /lib64/tls/libc.so.6
> > #1 0x003fe362fa1e in abort () from /lib64/tls/libc.so.6
> > #2 0x0063a2e3 in errfin

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-04 Thread Tom Lane
"Gavin M. Roy" <[EMAIL PROTECTED]> writes: > (gdb) where > #0 0x003fe362e21d in raise () from /lib64/tls/libc.so.6 > #1 0x003fe362fa1e in abort () from /lib64/tls/libc.so.6 > #2 0x0063a2e3 in errfinish () > #3 0x005974c4 in DeadLockReport () > #4 0x0059381f in L

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-04 Thread Gavin M. Roy
[EMAIL PROTECTED] backup]$ cat /etc/redhat-release
CentOS release 4.4 (Final)
BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib

Re: [HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-04 Thread Alvaro Herrera
Gavin M. Roy wrote:
> This morning I had a postgres 8.3 install dump core while multiple
> vacuum fulls were taking place. I saved the core file; would anyone be
> interested in dissecting it? I've otherwise had no issues with this machine
> or pgsql install.
Of course. Please post the b

[HACKERS] 8.3.0 Core with concurrent vacuum fulls

2008-03-04 Thread Gavin M. Roy
This morning I had a postgres 8.3 install dump core while multiple vacuum fulls were taking place. I saved the core file; would anyone be interested in dissecting it? I've otherwise had no issues with this machine or pgsql install.
Gavin