On Thu, May 21, 2015 at 11:33:02AM -0400, John Baldwin wrote: > On Wednesday, May 20, 2015 12:00:46 PM Alexander Kabaev wrote: > > On Fri, 15 May 2015 13:50:38 +0000 (UTC) > > John Baldwin <j...@freebsd.org> wrote: > > > > > Author: jhb > > > Date: Fri May 15 13:50:37 2015 > > > New Revision: 282971 > > > URL: https://svnweb.freebsd.org/changeset/base/282971 > > > > > > Log: > > > Previously, cv_waiters was only updated by cv_signal or cv_wait. If > > > a thread awakened due to a time out, then cv_waiters was not > > > decremented. If INT_MAX threads timed out on a cv without an > > > intervening cv_broadcast, then cv_waiters could overflow. To fix > > > this, have each sleeping thread decrement cv_waiters when it resumes. > > > > > > Note that previously cv_waiters was protected by the sleepq chain > > > lock. However, that lock is not held when threads resume from sleep. > > > In addition, the interlock is also not always reacquired after > > > resuming (cv_wait_unlock), nor is it always held by callers of > > > cv_signal() or cv_broadcast(). Instead, use atomic ops to update > > > cv_waiters. Since the sleepq chain lock is still held on every > > > increment, it should still be safe to compare cv_waiters against zero > > > while holding the lock in the wakeup routines as the only way the > > > race should be lost would result in extra calls to sleepq_signal() or > > > sleepq_broadcast(). > > > Differential Revision: https://reviews.freebsd.org/D2427 > > > Reviewed by: benno > > > Reported by: benno (wrap of cv_waiters in the field) > > > MFC after: 2 weeks > > > > > > Modified: > > > head/sys/kern/kern_condvar.c > > > head/sys/sys/condvar.h > > > > > > > This breaks ZFS range locking code, which expects to be able to wakeup > > everyone on the condition variable and then free the structure that > > contains it. Having woken up threads modify cv_waiters results in a > > race that leads to already freed memory to be accessed. > > > > It is debatable just how correct ZFS code in its expectations, but I > > think this commit should probably be reverted until either ZFS is > > changed not to expect cv modifiable by waking threads or until > > alternative solution is found to the cv_waiters overflow issue fixed by > > this commit. > > Yes, this is fine to revert for now. I'll have to think about which way > to fix the bug in question. The simplest route is probably to remove > cv_waiters entirely, though hacking up sleepq_timeout to recognize cv's > and decrement the count is another option (but hackier). >
The code was buggy even prior to this. _cv_wait does: #ifdef KTRACE if (KTRPOINT(td, KTR_CSW)) ktrcsw(0, 0, cv_wmesg(cvp)); #endif So with ktrace enabled cvp is possibly accessed after it gets freed. I checked "solaris compatiblity layer" from "zfs on linux" project and it looks like they are screwed by this as well. In other words, I would argue modifying native privmitives to accomodate for zfs use may not be the best course of action. Hacking up zfs or cv_ primitives seems better and opens up posibilities for minor optimisations. I'm not up to it though. -- Mateusz Guzik <mjguzik gmail.com> _______________________________________________ svn-src-head@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-head To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"