On Wed, Mar 07, 2007 at 06:07:31PM -0500, Ed Maste wrote:
> Nightly tests on our 6.1-based installation using pgsql have resulted in
> a number of kernel hangs, due to a corrupt semu_list (the list ended up
> with a loop).
> 
> It seems there are a few holes in the locking in the semaphore code.  The
> issue we've encountered comes from semexit_myhook.  It obtains a pointer
> to a list element after acquiring SEMUNDO_LOCK, and after dropping the
> lock manipulates the next pointer to remove the element from the list.
> 
> The fix below solves our current problem.  Any comments?
> 
> --- RELENG_6/src/sys/kern/sysv_sem.c    Tue Jun  7 01:03:27 2005
> +++ swbuild_plt_boson/src/sys/kern/sysv_sem.c   Tue Mar  6 16:13:45 2007
> @@ -1259,16 +1259,17 @@
>         struct proc *p;
>  {
>         struct sem_undo *suptr;
> -       struct sem_undo **supptr;
> 
>         /*
>          * Go through the chain of undo vectors looking for one
>          * associated with this process.
>          */
>         SEMUNDO_LOCK();
> -       SLIST_FOREACH_PREVPTR(suptr, supptr, &semu_list, un_next) {
> -               if (suptr->un_proc == p)
> +       SLIST_FOREACH(suptr, &semu_list, un_next) {
> +               if (suptr->un_proc == p) {
> +                       SLIST_REMOVE(&semu_list, suptr, sem_undo, un_next);

This is wrong: you cannot remove an element from a *LIST while it is being
iterated with *LIST_FOREACH, because the loop advances through the element
you just removed. Use *LIST_FOREACH_SAFE instead.
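
To show what I mean, here is a rough userland sketch of the pattern, not the
kernel code. struct undo, its owner field, and undo_add / undo_remove_owner /
undo_count are all made up for illustration (owner stands in for un_proc), and
there is no locking here at all. The #ifndef fallback is only there because
some copies of sys/queue.h (glibc's, for one) do not ship the _SAFE macros; it
follows the FreeBSD definition.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/*
 * Fallback for systems whose sys/queue.h lacks the _SAFE variants.
 * The next pointer is saved in tvar before the loop body runs.
 */
#ifndef SLIST_FOREACH_SAFE
#define SLIST_FOREACH_SAFE(var, head, field, tvar)			\
	for ((var) = SLIST_FIRST((head));				\
	    (var) && ((tvar) = SLIST_NEXT((var), field), 1);		\
	    (var) = (tvar))
#endif

/* Hypothetical stand-in for struct sem_undo. */
struct undo {
	SLIST_ENTRY(undo) un_next;
	int owner;		/* stands in for un_proc */
};
SLIST_HEAD(undo_head, undo);

/* Push a new entry for `owner` onto the head of the list. */
static void
undo_add(struct undo_head *head, int owner)
{
	struct undo *su = malloc(sizeof(*su));

	assert(su != NULL);
	su->owner = owner;
	SLIST_INSERT_HEAD(head, su, un_next);
}

/* Unlink and free every entry owned by `owner`; return how many went away. */
static int
undo_remove_owner(struct undo_head *head, int owner)
{
	struct undo *su, *tmp;
	int removed = 0;

	SLIST_FOREACH_SAFE(su, head, un_next, tmp) {
		if (su->owner == owner) {
			/* Safe: the loop continues from tmp, not from su. */
			SLIST_REMOVE(head, su, undo, un_next);
			free(su);
			removed++;
		}
	}
	return (removed);
}

/* Count the entries; no removal here, so plain SLIST_FOREACH is fine. */
static int
undo_count(struct undo_head *head)
{
	struct undo *su;
	int n = 0;

	SLIST_FOREACH(su, head, un_next)
		n++;
	return (n);
}
```

In the kernel the whole walk and the removal would of course have to happen
with SEMUNDO_LOCK held.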

Thanks for the patch!

roman
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
