Hi,

Tom Lane wrote:
> Boszormenyi Zoltan <z...@cybertec.at> writes:
>> You expressed stability concerns coming from this patch.
>> Were these concerns because of locks timing out making
>> things fragile, or because of general feelings about introducing
>> such a patch at the end of the release cycle? I was thinking
>> about the former, hence this modification.
>
> Indeed, I am *very* concerned about the stability implications of this
> patch.  I just don't believe that arbitrarily restricting which
> processes the GUC applies to will make it any safer.
>
> 			regards, tom lane
Okay, here is the rewritten lock_timeout GUC patch, which uses setitimer()
to arm the lock timeout. I removed the GUC assignment/validation function.

I left the current statement timeout vs. deadlock timeout logic mostly
intact in enable_sig_alarm(), because it's used by a few places. The only
change is that statement_fin_time is always computed there, because the
newly introduced function (enable_sig_alarm_for_lock_timeout()) checks it
to see whether the lock timeout triggers earlier than the deadlock timeout.

As discussed before, this is 9.1 material.

Best regards,
Zoltán Böszörményi

-- 
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay: for whatsoever is more
than these cometh of evil." (Matthew 5:37) - basics of digital technology.
"May your kingdom come" - superficial description of plate tectonics

----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
diff -dcrpN pgsql.orig/doc/src/sgml/config.sgml pgsql/doc/src/sgml/config.sgml
*** pgsql.orig/doc/src/sgml/config.sgml	2010-02-17 10:05:40.000000000 +0100
--- pgsql/doc/src/sgml/config.sgml	2010-02-19 11:29:18.000000000 +0100
*************** COPY postgres_log FROM '/full/path/to/lo
*** 4240,4245 ****
--- 4240,4269 ----
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-lock-timeout" xreflabel="lock_timeout">
+      <term><varname>lock_timeout</varname> (<type>integer</type>)</term>
+      <indexterm>
+       <primary><varname>lock_timeout</> configuration parameter</primary>
+      </indexterm>
+      <listitem>
+       <para>
+        Abort any statement that tries to acquire a heavy-weight lock (e.g. on rows,
+        pages, tables, indexes, or other objects) and has to wait longer
+        than the specified number of milliseconds, counted from the time the
+        command arrives at the server from the client.
+        If <varname>log_min_error_statement</> is set to <literal>ERROR</> or lower,
+        the statement that timed out will also be logged. A value of zero
+        (the default) turns off the limitation.
+       </para>
+ 
+       <para>
+        Setting <varname>lock_timeout</> in
+        <filename>postgresql.conf</> is not recommended because it
+        affects all sessions.
+       </para>
+      </listitem>
+     </varlistentry>
+ 
      <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age">
       <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>)</term>
       <indexterm>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/lock.sgml pgsql/doc/src/sgml/ref/lock.sgml
*** pgsql.orig/doc/src/sgml/ref/lock.sgml	2009-09-18 08:26:40.000000000 +0200
--- pgsql/doc/src/sgml/ref/lock.sgml	2010-02-19 11:29:18.000000000 +0100
*************** LOCK [ TABLE ] [ ONLY ] <replaceable cla
*** 39,46 ****
   <literal>NOWAIT</literal> is specified, <command>LOCK
   TABLE</command> does not wait to acquire the desired lock: if it
   cannot be acquired immediately, the command is aborted and an
!  error is emitted. Once obtained, the lock is held for the
!  remainder of the current transaction. (There is no <command>UNLOCK
   TABLE</command> command; locks are always released at transaction
   end.)
  </para>
--- 39,49 ----
   <literal>NOWAIT</literal> is specified, <command>LOCK
   TABLE</command> does not wait to acquire the desired lock: if it
   cannot be acquired immediately, the command is aborted and an
!  error is emitted. If <varname>lock_timeout</varname> is set to a value
!  higher than 0, and the lock cannot be acquired within the specified
!  number of milliseconds, the command is likewise aborted and an error
!  is emitted. Once obtained, the lock is held for the remainder of
!  the current transaction. (There is no <command>UNLOCK
   TABLE</command> command; locks are always released at transaction
   end.)
  </para>
diff -dcrpN pgsql.orig/doc/src/sgml/ref/select.sgml pgsql/doc/src/sgml/ref/select.sgml
*** pgsql.orig/doc/src/sgml/ref/select.sgml	2010-02-13 19:44:33.000000000 +0100
--- pgsql/doc/src/sgml/ref/select.sgml	2010-02-19 11:29:18.000000000 +0100
*************** FOR SHARE [ OF <replaceable class="param
*** 1160,1165 ****
--- 1160,1173 ----
    </para>
 
    <para>
+    If the <literal>NOWAIT</> option is not specified and <varname>lock_timeout</varname>
+    is set to a value higher than 0, and the lock needs to wait more than
+    the specified number of milliseconds, the command reports an error after
+    timing out, rather than waiting indefinitely. The note in the previous
+    paragraph applies to <varname>lock_timeout</varname>, too.
+    </para>
+ 
+    <para>
    If specific tables are named in <literal>FOR UPDATE</literal>
    or <literal>FOR SHARE</literal>,
    then only rows coming from those tables are locked; any other
diff -dcrpN pgsql.orig/src/backend/port/posix_sema.c pgsql/src/backend/port/posix_sema.c
*** pgsql.orig/src/backend/port/posix_sema.c	2010-01-03 12:54:22.000000000 +0100
--- pgsql/src/backend/port/posix_sema.c	2010-02-19 21:21:21.000000000 +0100
***************
*** 24,29 ****
--- 24,30 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
 
  #ifdef USE_NAMED_POSIX_SEMAPHORES
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 313,315 ****
--- 314,341 ----
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = sem_wait(PG_SEM_REF(sema));
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "sem_wait failed: %m");
+ }
diff -dcrpN pgsql.orig/src/backend/port/sysv_sema.c pgsql/src/backend/port/sysv_sema.c
*** pgsql.orig/src/backend/port/sysv_sema.c	2010-01-03 12:54:22.000000000 +0100
--- pgsql/src/backend/port/sysv_sema.c	2010-02-19 21:20:17.000000000 +0100
***************
*** 30,35 ****
--- 30,36 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
 
  #ifndef HAVE_UNION_SEMUN
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 497,499 ****
--- 498,530 ----
  	return true;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0
+  * Return if lock_timeout expired
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	int			errStatus;
+ 	struct sembuf sops;
+ 
+ 	sops.sem_op = -1;			/* decrement */
+ 	sops.sem_flg = 0;
+ 	sops.sem_num = sema->semNum;
+ 
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 		errStatus = semop(sema->semId, &sops, 1);
+ 		ImmediateInterruptOK = false;
+ 	} while (errStatus < 0 && errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errStatus < 0)
+ 		elog(FATAL, "semop(id=%d) failed: %m", sema->semId);
+ }
diff -dcrpN pgsql.orig/src/backend/port/win32_sema.c pgsql/src/backend/port/win32_sema.c
*** pgsql.orig/src/backend/port/win32_sema.c	2010-01-03 12:54:22.000000000 +0100
--- pgsql/src/backend/port/win32_sema.c	2010-02-19 21:18:52.000000000 +0100
***************
*** 16,21 ****
--- 16,22 ----
  #include "miscadmin.h"
  #include "storage/ipc.h"
  #include "storage/pg_sema.h"
+ #include "storage/proc.h"
 
  static HANDLE *mySemSet;		/* IDs of sema sets acquired so far */
  static int	numSems;			/* number of sema sets acquired so far */
*************** PGSemaphoreTryLock(PGSemaphore sema)
*** 205,207 ****
--- 206,263 ----
  	/* keep compiler quiet */
  	return false;
  }
+ 
+ /*
+  * PGSemaphoreTimedLock
+  *
+  * Lock a semaphore (decrement count), blocking if count would be < 0.
+  * Serve the interrupt if interruptOK is true.
+  * Return if lock_timeout expired.
+  */
+ void
+ PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK)
+ {
+ 	DWORD		ret;
+ 	HANDLE		wh[2];
+ 
+ 	wh[0] = *sema;
+ 	wh[1] = pgwin32_signal_event;
+ 
+ 	/*
+ 	 * As in other implementations of PGSemaphoreLock, we need to check for
+ 	 * cancel/die interrupts each time through the loop. But here, there is
+ 	 * no hidden magic about whether the syscall will internally service a
+ 	 * signal --- we do that ourselves.
+ 	 */
+ 	do
+ 	{
+ 		ImmediateInterruptOK = interruptOK;
+ 		CHECK_FOR_INTERRUPTS();
+ 
+ 		errno = 0;
+ 		ret = WaitForMultipleObjectsEx(2, wh, FALSE, INFINITE, TRUE);
+ 
+ 		if (ret == WAIT_OBJECT_0)
+ 		{
+ 			/* We got it! */
+ 			return;
+ 		}
+ 		else if (ret == WAIT_OBJECT_0 + 1)
+ 		{
+ 			/* Signal event is set - we have a signal to deliver */
+ 			pgwin32_dispatch_queued_signals();
+ 			errno = EINTR;
+ 		}
+ 		else
+ 			/* Otherwise we are in trouble */
+ 			errno = EIDRM;
+ 
+ 		ImmediateInterruptOK = false;
+ 	} while (errno == EINTR && !lock_timeout_detected);
+ 
+ 	if (lock_timeout_detected)
+ 		return;
+ 	if (errno != 0)
+ 		ereport(FATAL,
+ 				(errmsg("could not lock semaphore: error code %d", (int) GetLastError())));
+ }
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lmgr.c pgsql/src/backend/storage/lmgr/lmgr.c
*** pgsql.orig/src/backend/storage/lmgr/lmgr.c	2010-01-03 12:54:25.000000000 +0100
--- pgsql/src/backend/storage/lmgr/lmgr.c	2010-02-19 20:58:51.000000000 +0100
***************
*** 19,26 ****
--- 19,29 ----
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
+ #include "catalog/pg_database.h"
  #include "miscadmin.h"
  #include "storage/lmgr.h"
+ #include "utils/lsyscache.h"
+ #include "storage/proc.h"
  #include "storage/procarray.h"
  #include "utils/inval.h"
*************** LockRelationOid(Oid relid, LOCKMODE lock
*** 78,83 ****
--- 81,101 ----
  	res = LockAcquire(&tag, lockmode, false, false);
 
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 	{
+ 		char *relname = get_rel_name(relid);
+ 		if (relname)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 					 errmsg("could not obtain lock on relation \"%s\"",
+ 						relname)));
+ 		else
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 					 errmsg("could not obtain lock on relation with OID %u",
+ 						relid)));
+ 	}
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages, so that we
  	 * will update or flush any stale relcache entry before we try to use it.
*************** LockRelation(Relation relation, LOCKMODE
*** 173,178 ****
--- 191,202 ----
  	res = LockAcquire(&tag, lockmode, false, false);
 
+ 	if (res == LOCKACQUIRE_NOT_AVAIL)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
+ 				 errmsg("could not obtain lock on relation \"%s\"",
+ 					RelationGetRelationName(relation))));
+ 
  	/*
  	 * Now that we have the lock, check for invalidation messages; see notes
  	 * in LockRelationOid.
*************** LockRelationIdForSession(LockRelId *reli
*** 250,256 ****
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
 
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
 
  /*
--- 274,293 ----
  	SET_LOCKTAG_RELATION(tag, relid->dbId, relid->relId);
 
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 	{
! 		char *relname = get_rel_name(relid->relId);
! 		if (relname)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					 errmsg("could not obtain lock on relation \"%s\"",
! 						relname)));
! 		else
! 			ereport(ERROR,
! 					(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 					 errmsg("could not obtain lock on relation with OID %u",
! 						relid->relId)));
! 	}
  }
 
  /*
*************** LockRelationForExtension(Relation relati
*** 285,291 ****
  					relation->rd_lockInfo.lockRelId.dbId,
  					relation->rd_lockInfo.lockRelId.relId);
 
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
 
  /*
--- 322,332 ----
  					relation->rd_lockInfo.lockRelId.dbId,
  					relation->rd_lockInfo.lockRelId.relId);
 
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on index \"%s\"",
! 					RelationGetRelationName(relation))));
  }
 
  /*
*************** LockPage(Relation relation, BlockNumber
*** 319,325 ****
  					relation->rd_lockInfo.lockRelId.relId,
  					blkno);
 
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
 
  /*
--- 360,370 ----
  					relation->rd_lockInfo.lockRelId.relId,
  					blkno);
 
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on page %u of relation \"%s\"",
! 					blkno, RelationGetRelationName(relation))));
  }
 
  /*
*************** LockTuple(Relation relation, ItemPointer
*** 375,381 ****
  					ItemPointerGetBlockNumber(tid),
  					ItemPointerGetOffsetNumber(tid));
 
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
 
  /*
--- 420,430 ----
  					ItemPointerGetBlockNumber(tid),
  					ItemPointerGetOffsetNumber(tid));
 
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on row in relation \"%s\"",
! 					RelationGetRelationName(relation))));
  }
 
  /*
*************** XactLockTableInsert(TransactionId xid)
*** 429,435 ****
  	SET_LOCKTAG_TRANSACTION(tag, xid);
 
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
 
  /*
--- 478,487 ----
  	SET_LOCKTAG_TRANSACTION(tag, xid);
 
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on transaction with ID %u", xid)));
  }
 
  /*
*************** XactLockTableWait(TransactionId xid)
*** 473,479 ****
  	SET_LOCKTAG_TRANSACTION(tag, xid);
 
! 	(void) LockAcquire(&tag, ShareLock, false, false);
 
  	LockRelease(&tag, ShareLock, false);
 
--- 525,534 ----
  	SET_LOCKTAG_TRANSACTION(tag, xid);
 
! 	if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on transaction with ID %u", xid)));
 
  	LockRelease(&tag, ShareLock, false);
 
*************** VirtualXactLockTableInsert(VirtualTransa
*** 531,537 ****
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
 
! 	(void) LockAcquire(&tag, ExclusiveLock, false, false);
  }
 
  /*
--- 586,596 ----
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
 
! 	if (LockAcquire(&tag, ExclusiveLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on virtual transaction with ID %u",
! 					vxid.localTransactionId)));
  }
 
  /*
*************** VirtualXactLockTableWait(VirtualTransact
*** 549,555 ****
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
 
! 	(void) LockAcquire(&tag, ShareLock, false, false);
 
  	LockRelease(&tag, ShareLock, false);
  }
--- 608,618 ----
  	SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);
 
! 	if (LockAcquire(&tag, ShareLock, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on virtual transaction with ID %u",
! 					vxid.localTransactionId)));
 
  	LockRelease(&tag, ShareLock, false);
  }
*************** LockDatabaseObject(Oid classid, Oid obji
*** 598,604 ****
  					objid,
  					objsubid);
 
! 	(void) LockAcquire(&tag, lockmode, false, false);
  }
 
  /*
--- 661,671 ----
  					objid,
  					objsubid);
 
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on class:object: %u:%u",
! 					classid, objid)));
  }
 
  /*
*************** LockSharedObject(Oid classid, Oid objid,
*** 636,642 ****
  					objid,
  					objsubid);
 
! 	(void) LockAcquire(&tag, lockmode, false, false);
 
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
--- 703,713 ----
  					objid,
  					objsubid);
 
! 	if (LockAcquire(&tag, lockmode, false, false) == LOCKACQUIRE_NOT_AVAIL)
! 		ereport(ERROR,
! 				(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 				 errmsg("could not obtain lock on class:object: %u:%u",
! 					classid, objid)));
 
  	/* Make sure syscaches are up-to-date with any changes we waited for */
  	AcceptInvalidationMessages();
*************** LockSharedObjectForSession(Oid classid,
*** 678,684 ****
  					objid,
  					objsubid);
 
! 	(void) LockAcquire(&tag, lockmode, true, false);
  }
 
  /*
--- 749,770 ----
  					objid,
  					objsubid);
 
! 	if (LockAcquire(&tag, lockmode, true, false) == LOCKACQUIRE_NOT_AVAIL)
! 		switch (classid)
! 		{
! 			case DatabaseRelationId:
! 				ereport(ERROR,
! 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						 errmsg("could not obtain lock on database with ID %u",
! 							objid)));
! 				break;
! 			default:
! 				ereport(ERROR,
! 						(errcode(ERRCODE_LOCK_NOT_AVAILABLE),
! 						 errmsg("could not obtain lock on class:object: %u:%u",
! 							classid, objid)));
! 				break;
! 		}
  }
 
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/lock.c pgsql/src/backend/storage/lmgr/lock.c
*** pgsql.orig/src/backend/storage/lmgr/lock.c	2010-02-01 10:45:49.000000000 +0100
--- pgsql/src/backend/storage/lmgr/lock.c	2010-02-19 11:29:18.000000000 +0100
*************** PROCLOCK_PRINT(const char *where, const
*** 255,261 ****
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static void WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
--- 255,261 ----
  static uint32 proclock_hash(const void *key, Size keysize);
  static void RemoveLocalLock(LOCALLOCK *locallock);
  static void GrantLockLocal(LOCALLOCK *locallock, ResourceOwner owner);
! static int	WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner);
  static bool UnGrantLock(LOCK *lock, LOCKMODE lockmode,
  			PROCLOCK *proclock, LockMethod lockMethodTable);
  static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
*************** ProcLockHashCode(const PROCLOCKTAG *proc
*** 451,457 ****
   * dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, and dontWait=true
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
--- 451,457 ----
   * dontWait: if true, don't wait to acquire lock
   *
   * Returns one of:
!  *		LOCKACQUIRE_NOT_AVAIL		lock not available, either dontWait=true or timeout
   *		LOCKACQUIRE_OK				lock successfully acquired
   *		LOCKACQUIRE_ALREADY_HELD	incremented count for lock already held
   *
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 837,843 ****
  							   locktag->locktag_type,
  							   lockmode);
 
! 		WaitOnLock(locallock, owner);
 
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
--- 837,843 ----
  							   locktag->locktag_type,
  							   lockmode);
 
! 		status = WaitOnLock(locallock, owner);
 
  		TRACE_POSTGRESQL_LOCK_WAIT_DONE(locktag->locktag_field1,
  										locktag->locktag_field2,
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 852,871 ****
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
 
! 		/*
! 		 * Check the proclock entry status, in case something in the ipc
! 		 * communication doesn't work correctly.
! 		 */
! 		if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
  		{
! 			PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 			LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 			/* Should we retry ? */
! 			LWLockRelease(partitionLock);
! 			elog(ERROR, "LockAcquire failed");
  		}
- 		PROCLOCK_PRINT("LockAcquire: granted", proclock);
- 		LOCK_PRINT("LockAcquire: granted", lock, lockmode);
  	}
 
  	LWLockRelease(partitionLock);
--- 852,883 ----
  		 * done when the lock was granted to us --- see notes in WaitOnLock.
  		 */
 
! 		switch (status)
  		{
! 			case STATUS_OK:
! 				/*
! 				 * Check the proclock entry status, in case something in the
! 				 * ipc communication doesn't work correctly.
! 				 */
! 				if (!(proclock->holdMask & LOCKBIT_ON(lockmode)))
! 				{
! 					PROCLOCK_PRINT("LockAcquire: INCONSISTENT", proclock);
! 					LOCK_PRINT("LockAcquire: INCONSISTENT", lock, lockmode);
! 					/* Should we retry ? */
! 					LWLockRelease(partitionLock);
! 					elog(ERROR, "LockAcquire failed");
! 				}
! 				PROCLOCK_PRINT("LockAcquire: granted", proclock);
! 				LOCK_PRINT("LockAcquire: granted", lock, lockmode);
! 				break;
! 			case STATUS_WAITING:
! 				PROCLOCK_PRINT("LockAcquire: timed out", proclock);
! 				LOCK_PRINT("LockAcquire: timed out", lock, lockmode);
! 				break;
! 			default:
! 				elog(ERROR, "LockAcquire invalid status");
! 				break;
! 		}
  	}
 
  	LWLockRelease(partitionLock);
*************** LockAcquireExtended(const LOCKTAG *lockt
*** 891,897 ****
  						   locktag->locktag_field2);
  	}
 
! 	return LOCKACQUIRE_OK;
  }
 
  /*
--- 903,909 ----
  						   locktag->locktag_field2);
  	}
 
! 	return (status == STATUS_OK ? LOCKACQUIRE_OK : LOCKACQUIRE_NOT_AVAIL);
  }
 
  /*
*************** GrantAwaitedLock(void)
*** 1169,1182 ****
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
   * The appropriate partition lock must be held at entry.
   */
! static void
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
 
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
--- 1181,1200 ----
   * Caller must have set MyProc->heldLocks to reflect locks already held
   * on the lockable object by this process.
   *
+  * Result: returns value of ProcSleep()
+  *	STATUS_OK if we acquired the lock
+  *	STATUS_ERROR if not (deadlock)
+  *	STATUS_WAITING if not (timeout)
+  *
   * The appropriate partition lock must be held at entry.
   */
! static int
  WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner)
  {
  	LOCKMETHODID lockmethodid = LOCALLOCK_LOCKMETHOD(*locallock);
  	LockMethod	lockMethodTable = LockMethods[lockmethodid];
  	char	   *volatile new_status = NULL;
+ 	int			wait_status;
 
  	LOCK_PRINT("WaitOnLock: sleeping on lock",
  			   locallock->lock, locallock->tag.mode);
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1218,1225 ****
  	 */
  	PG_TRY();
  	{
! 		if (ProcSleep(locallock, lockMethodTable) != STATUS_OK)
  		{
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
--- 1236,1248 ----
  	 */
  	PG_TRY();
  	{
! 		wait_status = ProcSleep(locallock, lockMethodTable);
! 		switch (wait_status)
  		{
+ 			case STATUS_OK:
+ 			case STATUS_WAITING:
+ 				break;
+ 			default:
  			/*
  			 * We failed as a result of a deadlock, see CheckDeadLock(). Quit
  			 * now.
*************** WaitOnLock(LOCALLOCK *locallock, Resourc
*** 1264,1271 ****
  		pfree(new_status);
  	}
 
! 	LOCK_PRINT("WaitOnLock: wakeup on lock",
  			   locallock->lock, locallock->tag.mode);
  }
 
  /*
--- 1287,1300 ----
  		pfree(new_status);
  	}
 
! 	if (wait_status == STATUS_OK)
! 		LOCK_PRINT("WaitOnLock: wakeup on lock",
! 			   locallock->lock, locallock->tag.mode);
! 	else if (wait_status == STATUS_WAITING)
! 		LOCK_PRINT("WaitOnLock: timeout on lock",
  			   locallock->lock, locallock->tag.mode);
 
+ 	return wait_status;
  }
 
  /*
diff -dcrpN pgsql.orig/src/backend/storage/lmgr/proc.c pgsql/src/backend/storage/lmgr/proc.c
*** pgsql.orig/src/backend/storage/lmgr/proc.c	2010-02-13 19:44:38.000000000 +0100
--- pgsql/src/backend/storage/lmgr/proc.c	2010-02-19 22:38:13.000000000 +0100
***************
*** 52,57 ****
--- 52,59 ----
  /* GUC variables */
  int			DeadlockTimeout = 1000;
  int			StatementTimeout = 0;
+ int			LockTimeout = 0;
+ static int	CurrentLockTimeout = 0;
  bool		log_lock_waits = false;
 
  /* Pointer to this process's PGPROC struct, if any */
*************** static volatile bool statement_timeout_a
*** 79,87 ****
--- 81,92 ----
  static volatile bool deadlock_timeout_active = false;
  static volatile DeadLockState deadlock_state = DS_NOT_YET_CHECKED;
  volatile bool cancel_from_timeout = false;
+ static volatile bool lock_timeout_active = false;
+ volatile bool lock_timeout_detected = false;
 
  /* timeout_start_time is set when log_lock_waits is true */
  static TimestampTz timeout_start_time;
+ static TimestampTz timeout_fin_time;
 
  /* statement_fin_time is valid only if statement_timeout_active is true */
  static TimestampTz statement_fin_time;
*************** static void ProcKill(int code, Datum arg
*** 92,97 ****
--- 97,103 ----
  static void AuxiliaryProcKill(int code, Datum arg);
  static bool CheckStatementTimeout(void);
  static bool CheckStandbyTimeout(void);
+ static bool enable_sig_alarm_for_lock_timeout(int delayms);
 
  /*
*************** ProcQueueInit(PROC_QUEUE *queue)
*** 795,801 ****
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result: STATUS_OK if we acquired the lock, STATUS_ERROR if not (deadlock).
   *
   * ASSUME: that no one will fiddle with the queue until after
   * we release the partition lock.
--- 801,810 ----
   * The lock table's partition lock must be held at entry, and will be held
   * at exit.
   *
!  * Result:
!  *	STATUS_OK if we acquired the lock
!  *	STATUS_ERROR if not (deadlock)
!  *	STATUS_WAITING if not (timeout)
   *
   * ASSUME: that no one will fiddle with the queue until after
   * we release the partition lock.
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 949,955 ****
  		elog(FATAL, "could not set timer for process wakeup");
 
  	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreLock,
  	 * PGSemaphoreLock will not block. The wakeup is "saved" by the semaphore
  	 * implementation. While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
--- 958,973 ----
  		elog(FATAL, "could not set timer for process wakeup");
 
  	/*
! 	 * Reset the timer so we are awakened in case of lock timeout.
! 	 * This doesn't modify the timer for the deadlock check in case
! 	 * the deadlock check happens earlier.
! 	 */
! 	CurrentLockTimeout = LockTimeout;
! 	if (!enable_sig_alarm_for_lock_timeout(CurrentLockTimeout))
! 		elog(FATAL, "could not set timer for process wakeup");
! 
! 	/*
! 	 * If someone wakes us between LWLockRelease and PGSemaphoreTimedLock,
  	 * PGSemaphoreLock will not block. The wakeup is "saved" by the semaphore
  	 * implementation. While this is normally good, there are cases where a
  	 * saved wakeup might be leftover from a previous operation (for example,
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 967,973 ****
  	 */
  	do
  	{
! 		PGSemaphoreLock(&MyProc->sem, true);
 
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
--- 985,994 ----
  	 */
  	do
  	{
! 		PGSemaphoreTimedLock(&MyProc->sem, true);
! 
! 		if (lock_timeout_detected)
! 			break;
 
  		/*
  		 * waitStatus could change from STATUS_WAITING to something else
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1107,1112 ****
--- 1128,1141 ----
  	LWLockAcquire(partitionLock, LW_EXCLUSIVE);
 
  	/*
+ 	 * If we timed out, we're not waiting anymore and we're not the one
+ 	 * that the lock will be granted to, so remove ourselves from the
+ 	 * wait queue.
+ 	 */
+ 	if (lock_timeout_detected)
+ 		RemoveFromWaitQueue(MyProc, hashcode);
+ 
+ 	/*
  	 * We no longer want LockWaitCancel to do anything.
  	 */
  	lockAwaited = NULL;
*************** ProcSleep(LOCALLOCK *locallock, LockMeth
*** 1120,1127 ****
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
  	 */
! 	return MyProc->waitStatus;
  }
 
--- 1149,1158 ----
  	/*
  	 * We don't have to do anything else, because the awaker did all the
  	 * necessary update of the lock table and MyProc.
+ 	 * RemoveFromWaitQueue() has set MyProc->waitStatus = STATUS_ERROR,
+ 	 * so we need to distinguish this case.
  	 */
! 	return (lock_timeout_detected ? STATUS_WAITING : MyProc->waitStatus);
  }
 
*************** enable_sig_alarm(int delayms, bool is_st
*** 1460,1477 ****
  		 * than normal, but that does no harm.
  		 */
  		timeout_start_time = GetCurrentTimestamp();
! 		fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
  		deadlock_timeout_active = true;
! 		if (fin_time >= statement_fin_time)
  			return true;
  	}
  	else
  	{
  		/* Begin deadlock timeout with no statement-level timeout */
  		deadlock_timeout_active = true;
! 		/* GetCurrentTimestamp can be expensive, so only do it if we must */
! 		if (log_lock_waits)
! 			timeout_start_time = GetCurrentTimestamp();
  	}
 
  	/* If we reach here, okay to set the timer interrupt */
--- 1491,1553 ----
  		 * than normal, but that does no harm.
  		 */
  		timeout_start_time = GetCurrentTimestamp();
! 		timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
  		deadlock_timeout_active = true;
! 		if (timeout_fin_time >= statement_fin_time)
  			return true;
  	}
  	else
  	{
  		/* Begin deadlock timeout with no statement-level timeout */
  		deadlock_timeout_active = true;
! 		/*
! 		 * Computing the timeout_fin_time is needed because
! 		 * the lock timeout logic checks for it.
! 		 */
! 		timeout_start_time = GetCurrentTimestamp();
! 		timeout_fin_time = TimestampTzPlusMilliseconds(timeout_start_time, delayms);
! 	}
! 
! 	/* If we reach here, okay to set the timer interrupt */
! 	MemSet(&timeval, 0, sizeof(struct itimerval));
! 	timeval.it_value.tv_sec = delayms / 1000;
! 	timeval.it_value.tv_usec = (delayms % 1000) * 1000;
! 	if (setitimer(ITIMER_REAL, &timeval, NULL))
! 		return false;
! 	return true;
! }
! 
! /*
!  * Enable the SIGALRM interrupt to fire after the specified delay
!  * in case LockTimeout is set.
!  *
!  * This code properly handles nesting of the lock_timeout alarm
!  * within the deadlock timeout and statement timeout alarms.
!  *
!  * Returns TRUE if okay, FALSE on failure.
!  */
! static bool
! enable_sig_alarm_for_lock_timeout(int delayms)
! {
! 	struct itimerval timeval;
! 	TimestampTz fin_time;
! 
! 	lock_timeout_detected = false;
! 	if (LockTimeout == 0)
! 		return true;
! 
! 	fin_time = GetCurrentTimestamp();
! 	fin_time = TimestampTzPlusMilliseconds(fin_time, delayms);
! 
! 	if (statement_timeout_active)
! 	{
! 		if (fin_time > statement_fin_time)
! 			return true;
! 	}
! 	else if (deadlock_timeout_active)
! 	{
! 		if (fin_time > timeout_fin_time)
! 			return true;
  	}
 
  	/* If we reach here, okay to set the timer interrupt */
*************** enable_sig_alarm(int delayms, bool is_st
*** 1480,1485 ****
--- 1556,1563 ----
  	timeval.it_value.tv_usec = (delayms % 1000) * 1000;
  	if (setitimer(ITIMER_REAL, &timeval, NULL))
  		return false;
+ 
+ 	lock_timeout_active = true;
  	return true;
  }
*************** disable_sig_alarm(bool is_statement_time
*** 1500,1506 ****
  	 *
  	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
! 	if (statement_timeout_active || deadlock_timeout_active)
  	{
  		struct itimerval timeval;
 
--- 1578,1584 ----
  	 *
  	 * We will re-enable the interrupt if necessary in CheckStatementTimeout.
  	 */
! 	if (statement_timeout_active || deadlock_timeout_active || lock_timeout_active)
  	{
  		struct itimerval timeval;
 
*************** handle_sig_alarm(SIGNAL_ARGS)
*** 1600,1605 ****
--- 1678,1689 ----
  {
  	int			save_errno = errno;
 
+ 	if (lock_timeout_active)
+ 	{
+ 		lock_timeout_active = false;
+ 		lock_timeout_detected = true;
+ 	}
+ 
  	if (deadlock_timeout_active)
  	{
  		deadlock_timeout_active = false;
*************** handle_sig_alarm(SIGNAL_ARGS)
*** 1609,1614 ****
--- 1693,1710 ----
  	if (statement_timeout_active)
  		(void) CheckStatementTimeout();
 
+ 	/*
+ 	 * If we got here, the DeadlockTimeout already triggered and found no
+ 	 * deadlock. If LockTimeout is set, and it's greater than DeadlockTimeout,
+ 	 * then retry with a (LockTimeout - DeadlockTimeout) timeout value.
+ 	 */
+ 	if (!lock_timeout_detected && CurrentLockTimeout && CurrentLockTimeout > DeadlockTimeout)
+ 	{
+ 		CurrentLockTimeout -= DeadlockTimeout;
+ 		if (!enable_sig_alarm_for_lock_timeout(CurrentLockTimeout))
+ 			elog(FATAL, "could not set timer for process wakeup");
+ 	}
+ 
  	errno = save_errno;
  }
diff -dcrpN pgsql.orig/src/backend/utils/misc/guc.c pgsql/src/backend/utils/misc/guc.c
*** pgsql.orig/src/backend/utils/misc/guc.c	2010-02-17 10:05:47.000000000 +0100
--- pgsql/src/backend/utils/misc/guc.c	2010-02-19 13:32:28.000000000 +0100
*************** static struct config_int ConfigureNamesI
*** 1585,1590 ****
--- 1585,1600 ----
  	},
 
  	{
+ 		{"lock_timeout", PGC_USERSET, CLIENT_CONN_STATEMENT,
+ 			gettext_noop("Sets the maximum allowed duration of any wait for a lock taken by a statement."),
+ 			gettext_noop("A value of 0 turns off the timeout."),
+ 			GUC_UNIT_MS
+ 		},
+ 		&LockTimeout,
+ 		0, 0, INT_MAX, NULL, NULL
+ 	},
+ 
+ 	{
  		{"vacuum_freeze_min_age", PGC_USERSET, CLIENT_CONN_STATEMENT,
  			gettext_noop("Minimum age at which VACUUM should freeze a table row."),
  			NULL
diff -dcrpN pgsql.orig/src/backend/utils/misc/postgresql.conf.sample pgsql/src/backend/utils/misc/postgresql.conf.sample
*** pgsql.orig/src/backend/utils/misc/postgresql.conf.sample	2010-02-17 10:05:47.000000000 +0100
--- pgsql/src/backend/utils/misc/postgresql.conf.sample	2010-02-19 11:29:19.000000000 +0100
***************
*** 481,486 ****
--- 481,489 ----
  #------------------------------------------------------------------------------
 
  #deadlock_timeout = 1s
+ #lock_timeout = 0			# timeout value for heavy-weight locks
+ 					# taken by statements. 0 disables the timeout
+ 					# unit is milliseconds, default is 0
  #max_locks_per_transaction = 64		# min 10
  					# (change requires restart)
 
  # Note:  Each lock table slot uses ~270 bytes of shared memory, and there are
diff -dcrpN pgsql.orig/src/include/storage/pg_sema.h pgsql/src/include/storage/pg_sema.h
*** pgsql.orig/src/include/storage/pg_sema.h	2010-01-03 12:54:39.000000000 +0100
--- pgsql/src/include/storage/pg_sema.h	2010-02-19 21:17:27.000000000 +0100
*************** extern void PGSemaphoreUnlock(PGSemaphor
*** 80,83 ****
--- 80,86 ----
  /* Lock a semaphore only if able to do so without blocking */
  extern bool PGSemaphoreTryLock(PGSemaphore sema);
 
+ /* Lock a semaphore (decrement count), blocking if count would be < 0 */
+ extern void PGSemaphoreTimedLock(PGSemaphore sema, bool interruptOK);
+ 
  #endif   /* PG_SEMA_H */
diff -dcrpN pgsql.orig/src/include/storage/proc.h pgsql/src/include/storage/proc.h
*** pgsql.orig/src/include/storage/proc.h	2010-02-13 19:44:42.000000000 +0100
--- pgsql/src/include/storage/proc.h	2010-02-19 21:16:32.000000000 +0100
*************** typedef struct PROC_HDR
*** 163,170 ****
--- 163,172 ----
  /* configurable options */
  extern int	DeadlockTimeout;
  extern int	StatementTimeout;
+ extern int	LockTimeout;
  extern bool log_lock_waits;
 
+ extern volatile bool lock_timeout_detected;
  extern volatile bool cancel_from_timeout;
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers