Re: Hung kernel from sysv semaphore semu_list corruption

2007-03-08 Thread Ed Maste
On Thu, Mar 08, 2007 at 10:55:22AM +0100, Divacky Roman wrote:
> this is wrong.. you cannot remove element from a *LIST when its iterated
> using *LIST_FOREACH.
> Use *LIST_FOREACH_SAFE instead...

We're not freeing the item in the loop so it would work unless QUEUE_MACRO_DEBUG is turned on to i
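The LIST_FOREACH versus LIST_FOREACH_SAFE point discussed in this exchange can be illustrated with a small userland sketch. This is not the kernel semaphore code: the `struct undo` type and the `add_pid`, `remove_pid`, and `undo_count` helpers are hypothetical names invented for the example. The fallback macro is only there because some libcs ship a `<sys/queue.h>` without the `_SAFE` variants; it is meant to mirror the usual BSD definition. The idea is that the `_SAFE` form saves the next pointer before the loop body runs, so removing (and even freeing) the current element cannot disturb the traversal. My understanding is that FreeBSD's debug build of the queue macros overwrites the link pointers of a removed element, which is why the plain form stops being usable there even when nothing is freed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Fallback for libcs whose <sys/queue.h> lacks the _SAFE variants.
 * Assumed to be equivalent to the usual BSD definition. */
#ifndef LIST_FOREACH_SAFE
#define LIST_FOREACH_SAFE(var, head, field, tvar)                 \
    for ((var) = (head)->lh_first;                                \
         (var) != NULL && ((tvar) = (var)->field.le_next, 1);     \
         (var) = (tvar))
#endif

/* Hypothetical element type, loosely standing in for a per-process
 * undo record; it is not the kernel's structure. */
struct undo {
    int pid;
    LIST_ENTRY(undo) link;
};

LIST_HEAD(undo_list, undo);

/* Insert a new record at the head; returns 0 on success, -1 on failure. */
static int
add_pid(struct undo_list *head, int pid)
{
    struct undo *u = malloc(sizeof(*u));

    if (u == NULL)
        return -1;
    u->pid = pid;
    LIST_INSERT_HEAD(head, u, link);
    return 0;
}

/* Remove and free every record for `pid`; returns how many were removed.
 * The next pointer is captured in `next` before the body runs, so the
 * removal and free of `u` do not affect the iteration. */
static int
remove_pid(struct undo_list *head, int pid)
{
    struct undo *u, *next;
    int removed = 0;

    LIST_FOREACH_SAFE(u, head, link, next) {
        if (u->pid == pid) {
            LIST_REMOVE(u, link);
            free(u);
            removed++;
        }
    }
    return removed;
}

/* Read-only traversal: the plain LIST_FOREACH is fine here because
 * nothing is removed while iterating. */
static int
undo_count(struct undo_list *head)
{
    struct undo *u;
    int n = 0;

    LIST_FOREACH(u, head, link)
        n++;
    return n;
}
```

A caller would `LIST_INIT` a `struct undo_list`, add a few records with `add_pid`, and then call `remove_pid`. None of this addresses locking: if another thread can modify the list concurrently, neither macro is safe without holding the appropriate lock.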

Re: Hung kernel from sysv semaphore semu_list corruption

2007-03-08 Thread Divacky Roman
On Wed, Mar 07, 2007 at 06:07:31PM -0500, Ed Maste wrote:
> Nightly tests on our 6.1-based installation using pgsql have resulted in
> a number of kernel hangs, due to a corrupt semu_list (the list ended up
> with a loop).
>
> It seems there are a few holes in the locking in the semaphore code. T

Re: Hung kernel from sysv semaphore semu_list corruption

2007-03-07 Thread Edwin Groothuis
On Wed, Mar 07, 2007 at 06:07:31PM -0500, Ed Maste wrote:
> Nightly tests on our 6.1-based installation using pgsql have resulted in
> a number of kernel hangs, due to a corrupt semu_list (the list ended up
> with a loop).

Sounds like an issue we have for a long time. Unfortunately it only happens

Hung kernel from sysv semaphore semu_list corruption

2007-03-07 Thread Ed Maste
Nightly tests on our 6.1-based installation using pgsql have resulted in a number of kernel hangs, due to a corrupt semu_list (the list ended up with a loop).

It seems there are a few holes in the locking in the semaphore code. The issue we've encountered comes from semexit_myhook. It obtains a